Market Experiments: What Lab Auctions Taught About Design

How Vernon Smith brought markets into the laboratory, why double auctions converge to equilibrium with startling speed, and how experimental economics shaped the design of real-world auctions and matching markets.

Reckonomics Editorial

A Market in a Classroom

In January 1956, a young economist named Vernon Smith walked into a classroom at Purdue University and ran an experiment that should not have worked. He gave half the students cards with private “values” — the maximum they would pay for a hypothetical good — and the other half cards with private “costs” — the minimum they would accept to sell. Then he opened a market. Buyers called out bids; sellers called out asking prices. When a buyer and seller agreed, they traded. Smith recorded every price.

He expected chaos. The students had no experience, no market information beyond their own card, and no training in economics. What he got instead was convergence. Within a few rounds of trading, the transaction prices clustered tightly around the competitive equilibrium predicted by standard supply-and-demand theory — the price at which the quantity demanded equals the quantity supplied, given the distribution of values and costs on the cards.

This was, in one sense, a vindication of economic theory. The invisible hand worked, even in a room of undergraduates with index cards. But in another sense, it was deeply puzzling. The competitive equilibrium is derived under assumptions of perfect information, infinite agents, and price-taking behavior. Smith’s markets had none of these. There were perhaps twenty students. Each knew only their own value or cost. They were actively setting prices, not passively accepting them. And yet the outcome was competitive equilibrium, achieved in minutes.

Smith spent the next four decades figuring out why, and in the process, he built an entirely new field: experimental economics.

The Double Auction and Its Surprising Power

The institution Smith used in that first experiment — and in hundreds of variations afterward — was the continuous double auction. In a double auction, both buyers and sellers can announce prices at any time. A buyer can call out a bid; a seller can call out an ask. A trade occurs when someone accepts the other side’s offer. The New York Stock Exchange, for much of its history, operated as a version of this institution.

What makes the double auction remarkable is its informational efficiency. No participant needs to know anyone else’s values or costs. No one needs to know the shape of the supply or demand curve. No one needs to calculate the equilibrium. All they need to do is pursue their own interest — buy low, sell high — within the rules of the institution. The institution itself aggregates the dispersed information and drives prices toward the equilibrium.
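The mechanics are simple enough to sketch in code. The toy below (all names and numbers are illustrative) uses "zero-intelligence" traders in the spirit of Gode and Sunder's later follow-up studies: buyers bid randomly below their private value, sellers ask randomly above their private cost, and a trade clears whenever a bid meets an ask.

```python
import random

def toy_double_auction(values, costs, steps=5000, seed=1):
    """Toy continuous double auction with zero-intelligence traders:
    each step, a random buyer bids below their value and a random
    seller asks above their cost; if bid >= ask, they trade at the
    midpoint and leave the market."""
    rng = random.Random(seed)
    buyers, sellers = sorted(values), sorted(costs)
    prices = []
    for _ in range(steps):
        if not buyers or not sellers:
            break
        b = rng.randrange(len(buyers))
        s = rng.randrange(len(sellers))
        bid = rng.uniform(0, buyers[b])       # budget constraint: never bid above value
        ask = rng.uniform(sellers[s], 200)    # no-loss constraint: never ask below cost
        if bid >= ask:
            prices.append((bid + ask) / 2)
            buyers.pop(b)
            sellers.pop(s)
    return prices

# Overlapping toy curves with a competitive equilibrium price near 100
buyer_values = [60 + 10 * i for i in range(10)]    # 60..150
seller_costs = [50 + 10 * i for i in range(10)]    # 50..140
prices = toy_double_auction(buyer_values, seller_costs)
```

Even these purposeless traders, constrained only by the no-loss rules of the institution, produce prices inside the feasible range, and as the high-value buyers and low-cost sellers trade away, the band of possible prices narrows toward the equilibrium — a hint of why the institution, rather than trader rationality, does so much of the work.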

Smith’s early experiments, published in a landmark 1962 paper in the Journal of Political Economy, demonstrated this convergence across a range of supply and demand configurations. Prices converged whether the curves were symmetric or asymmetric, whether the equilibrium was at a high or low quantity, and whether the number of traders was large or small. Convergence was fastest when there were many traders but still occurred with as few as six or eight.

The speed of convergence was the real surprise. In many experiments, prices were within a few percentage points of the competitive equilibrium by the third or fourth trading period, each period lasting only a few minutes. This was far faster than any theoretical model predicted. The Walrasian auctioneer — the imaginary figure in general equilibrium theory who adjusts prices until markets clear — was supposed to be a metaphor. In the double auction, it was approximately real, except that no one was playing the role. The institution was the auctioneer.

Institutions as Variables

Smith’s most important methodological contribution was the insight that the rules of the market — what he called the “institution” — are not background assumptions to be held constant. They are independent variables that can be manipulated, and they profoundly affect outcomes.

Consider the difference between a double auction and a posted-offer market. In a posted-offer market, sellers set prices and buyers take them or leave them — think of a retail store. Smith and his colleagues showed that posted-offer markets converge to equilibrium more slowly, often settle at higher prices (favoring sellers), and are more susceptible to tacit collusion than double auctions. The same supply and demand curves, the same number of traders, the same information structure — but a different institution produced a different outcome.

This finding had deep theoretical implications. In standard price theory, the market institution is invisible — a given supply and demand should produce a given equilibrium regardless of how the market is organized. Smith’s experiments showed that this is wrong. Institutions matter, and they matter in systematic, predictable ways. A market with one set of rules may be efficient; the same market with different rules may not be. This opened the door to mechanism design — the engineering of institutions to achieve specific goals — which would become one of the most practically important branches of economics.

Smith also explored markets with features that violate the assumptions of the first welfare theorem: market power, externalities, asymmetric information, common-value uncertainty. In each case, he tested whether the theoretical predictions held in practice, and the results were a mix of confirmations and surprises. Markets with a single seller (monopoly) did produce higher prices and lower quantities than competitive markets, but the markup was often less than theory predicted — the monopolist left money on the table, perhaps because of fairness norms or because the institution constrained exploitative behavior. Markets with externalities did fail to reach efficient outcomes, but the degree of failure depended on the institution: some trading rules mitigated the externality better than others.

The 2002 Nobel: Smith and Kahneman

In 2002, the Nobel Committee made a striking choice: it awarded the prize jointly to Vernon Smith and Daniel Kahneman. The pairing was deliberate. Smith had shown that market institutions can produce rational outcomes even when individual participants are boundedly rational. Kahneman had shown that individual judgment systematically departs from the predictions of rational choice theory. The two bodies of work were not contradictory but complementary — they mapped the terrain between individual cognition and market performance, showing that the gap between the two is mediated by institutions.

The Nobel citation for Smith emphasized that he had “established laboratory experiments as a tool in empirical economic analysis.” Before Smith, economics was considered an observational science, like astronomy — you could watch economies but not manipulate them. After Smith, economists had a laboratory. They could control supply, demand, information, market rules, and trader characteristics, and they could isolate the causal effect of each variable in a way that field data almost never permits.

Kahneman’s citation emphasized the opposite: the systematic irrationality of individual judgment. Together, the two prizes said something important about human economic behavior: individuals are biased, but markets can still work — and whether they do depends on how they are designed.

Auction Theory: First-Price, Second-Price, and Beyond

The experimental approach proved especially valuable in the study of auctions, one of the oldest and most ubiquitous market institutions. Auction theory had been developed mathematically by William Vickrey in 1961, but his predictions were difficult to test with field data because bidders’ true valuations are never observed. In the lab, the experimenter assigns the valuations and can compare actual bids to theoretical predictions.

First-price sealed-bid auctions work as follows: each bidder submits a single bid without knowing the others’ bids. The highest bidder wins and pays their bid. Theory predicts that rational bidders will shade their bids below their true values — bidding your true value guarantees zero profit if you win. The optimal amount of shading depends on the number of bidders and the distribution of values. Experimental evidence confirms that bidders do shade, but they typically shade less than the theory predicts. This “overbidding” in first-price auctions is one of the most robust findings in experimental economics, and it has been attributed to risk aversion, the “joy of winning,” and competitive arousal.
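The risk-neutral benchmark is easy to state for the textbook case: with n bidders whose values are drawn independently and uniformly from [0, 1], the equilibrium bid is (n−1)/n of your value. A small Monte Carlo sketch (parameters illustrative) shows why the overbidding seen in the lab is costly: bidding closer to your value wins more often but earns less.

```python
import random

def expected_profit(shade, n=4, trials=200_000, seed=0):
    """Monte Carlo expected profit for one bidder who bids shade * value
    against n-1 rivals playing the risk-neutral equilibrium
    (bid = (n-1)/n * value), all values i.i.d. uniform[0, 1]."""
    rng = random.Random(seed)
    eq = (n - 1) / n
    total = 0.0
    for _ in range(trials):
        v = rng.random()
        my_bid = shade * v
        rival_best = max(eq * rng.random() for _ in range(n - 1))
        if my_bid > rival_best:
            total += v - my_bid    # winner earns value minus own bid
    return total / trials

profit_eq = expected_profit(0.75)    # equilibrium shading for n = 4
profit_over = expected_profit(0.95)  # lab-style overbidding
```

With four bidders, equilibrium shading earns roughly three times the expected profit of bidding 95 percent of value — the overbidder wins far more auctions, but at margins too thin to pay for them.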

Second-price sealed-bid auctions (Vickrey auctions) have an elegant property: the dominant strategy is to bid your true value. The winner pays the second-highest bid, so you can never benefit from bidding below your value (you might lose an auction you should have won) or above it (you might win and pay more than the item is worth). This “strategy-proofness” makes the second-price auction theoretically attractive, and Vickrey won the 1996 Nobel partly for this insight. But experiments show that many bidders do not follow the dominant strategy. They overbid, underbid, and display a general failure to grasp the logic of second-price auctions, at least initially. With experience and feedback, behavior moves closer to the theoretical prediction, but the convergence is slower than in double auctions.
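The dominance argument can be checked mechanically: against any best rival bid, truthful bidding never does worse than any deviation. A brute-force check in Python (illustrative, with values normalized to [0, 1]):

```python
import random

def spa_profit(bid, value, rival_best):
    """Second-price auction payoff: win iff your bid beats the best
    rival bid, and pay that rival bid (not your own)."""
    return value - rival_best if bid > rival_best else 0.0

# Weak dominance, checked pointwise: for every sampled rival bid and
# every sampled deviation, bidding your true value is at least as good.
rng = random.Random(42)
truthful_never_worse = all(
    spa_profit(v, v, r) >= spa_profit(d, v, r)
    for v, r, d in ((rng.random(), rng.random(), rng.random())
                    for _ in range(100_000))
)
```

The check passes on every profile, and the reason is the asymmetry Vickrey identified: a deviation can only lose you an auction worth winning (underbidding) or win you one worth losing (overbidding), never improve the price you pay.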

Ascending (English) auctions — the familiar format in which the auctioneer calls out rising prices and bidders drop out when the price exceeds their value — are strategically equivalent to second-price auctions for private-value goods (where each bidder knows exactly how much the item is worth to them). But ascending auctions produce closer-to-optimal behavior in experiments, presumably because the strategic logic is transparent: stay in while the price is below your value, drop out when it exceeds it. The ascending auction makes the dominant strategy obvious; the sealed-bid second-price auction hides it.

Common-value auctions — where the item has the same value to all bidders, but each bidder has only a noisy estimate of that value (think of bidding for oil drilling rights) — introduce a famous trap: the winner’s curse. The winner is likely the bidder with the most optimistic estimate, which means they probably overpaid. Theory predicts that sophisticated bidders will adjust for the winner’s curse by bidding more conservatively. Experiments show that inexperienced bidders fall prey to the winner’s curse repeatedly, suffering consistent losses, but that experienced bidders learn to adjust — slowly and incompletely.
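A quick simulation makes the trap concrete. Suppose six bidders each receive an unbiased noisy signal of the common value and naively bid their own signal (all numbers illustrative):

```python
import random
import statistics

def naive_auction_profit(n=6, trials=50_000, seed=7):
    """Common-value auction where each naive bidder bids their own
    unbiased signal. The winner is the bidder with the most optimistic
    signal, so the average winner's profit is negative: the winner's
    curse."""
    rng = random.Random(seed)
    profits = []
    for _ in range(trials):
        v = rng.uniform(50, 150)                           # true common value
        signals = [v + rng.uniform(-20, 20) for _ in range(n)]
        winning_bid = max(signals)                         # naive: bid = signal
        profits.append(v - winning_bid)
    return statistics.mean(profits)

avg_winner_profit = naive_auction_profit()
```

Each signal is correct on average, yet the winner loses money on average — conditioning on winning selects the most optimistic estimate. Rational bidders must shade for this selection effect; the lab shows how slowly real bidders learn to.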

The FCC Spectrum Auctions: From Lab to Field

The most celebrated application of auction theory and experimental economics was the design of the Federal Communications Commission’s spectrum auctions, which began in 1994. Before the auctions, the FCC allocated radio spectrum — the electromagnetic frequencies used for broadcasting, mobile phones, and wireless data — through administrative hearings or lotteries. Both methods were widely regarded as inefficient and corrupt.

Congress authorized the FCC to use auctions, but no standard auction format was adequate for the task. Spectrum licenses are interdependent: a license for the New York metropolitan area is worth more if you also hold licenses for adjacent areas, because you can offer seamless regional coverage. A simple sequential auction would not work because bidders could not anticipate the prices of complementary licenses. A simultaneous sealed-bid auction would not allow bidders to express their preferences over combinations.

The FCC turned to a group of economists — Paul Milgrom, Robert Wilson, Preston McAfee, and John McMillan, among others — who designed a novel format: the simultaneous ascending auction (later called the simultaneous multi-round auction, or SMRA). All licenses were auctioned simultaneously, with multiple rounds of ascending bids. Bidders could see the standing high bid on every license after each round and shift their bids across licenses in response. The auction continued until no new bids were submitted on any license.

The design drew directly on experimental findings. Smith and others had shown that ascending auctions produced better outcomes than sealed-bid auctions for complex goods, because bidders could learn and adjust during the auction. The simultaneous structure allowed bidders to pursue complementary packages. Activity rules — requirements that bidders remain active on a minimum number of licenses each round — prevented “snake in the grass” strategies where bidders waited until the last moment to bid.
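A drastically simplified version of the format can be sketched in code. The toy below is entirely illustrative — unit demand, additive values, no activity rules, no complementarities, all of which the real design had to handle — but it shows the core loop: simultaneous licenses, ascending prices, and "straightforward" bidders who each round bid wherever their surplus is largest.

```python
def toy_smra(values, increment=1.0):
    """Toy simultaneous multi-round auction. values[b][l] is bidder b's
    value for license l; each bidder wants at most one license. Each
    round, every bidder not holding a standing high bid bids on the
    license with the largest surplus (value minus the price they would
    pay). The auction ends when a round produces no new bids."""
    n = len(next(iter(values.values())))
    price = [0.0] * n
    holder = [None] * n
    while True:
        new_bid = False
        for b, vals in values.items():
            if b in holder:
                continue                 # already standing high somewhere
            # price to beat: current price, plus an increment if held
            cost = [price[l] + (increment if holder[l] is not None else 0.0)
                    for l in range(n)]
            surplus, l = max((vals[l] - cost[l], l) for l in range(n))
            if surplus <= 0:
                continue                 # no license worth bidding on
            price[l] = cost[l]
            holder[l] = b
            new_bid = True
        if not new_bid:
            return holder, price

# Two licenses, three bidders (values invented for illustration)
bids = {"a": [10.0, 4.0], "b": [9.0, 6.0], "c": [3.0, 8.0]}
winners, prices = toy_smra(bids)
```

Because bidders can watch prices and shift across licenses round by round, even these myopic strategies steer each license toward the bidder who values it most, at roughly the runner-up's value — the learning-during-the-auction property the FCC design borrowed from the experimental evidence on ascending formats.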

The first major FCC auction, in 1994, raised $617 million — far more than the government had expected. Subsequent auctions raised tens of billions. The auctions were widely regarded as a triumph of economic design, and they accelerated the development of mechanism design as a practical discipline. Milgrom and Wilson shared the 2020 Nobel Prize in part for this work.

Roth and Matching Markets

While Smith and the auction theorists focused on markets where prices do the work of allocation, Alvin Roth pursued a different question: how do you design a good market when prices are not permitted, not desirable, or not sufficient?

The motivating example was the market for medical residencies. In the early twentieth century, hospitals competed for graduating medical students by making offers earlier and earlier — a classic unraveling problem. By the 1940s, offers were being made two years before graduation, before students had completed most of their clinical training. The market was chaotic and inefficient.

In 1952, a centralized clearinghouse was established: the National Resident Matching Program (NRMP). Students and hospitals submitted ranked preference lists, and an algorithm matched them. Roth showed in the 1980s that the algorithm used by the NRMP was a variant of the Gale-Shapley deferred acceptance algorithm — a theoretical construct from 1962 that Gale and Shapley had proved produces stable matchings, meaning no student-hospital pair would prefer each other to their assigned match.
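The deferred acceptance algorithm itself is short enough to write down. The sketch below is a one-to-one toy with invented names (the NRMP handles hospital capacities, couples, and vastly larger scale): students propose down their rank lists, and each hospital tentatively holds the best proposal it has seen, releasing the rest.

```python
def deferred_acceptance(proposer_prefs, receiver_prefs):
    """Proposer-optimal stable matching via Gale-Shapley deferred
    acceptance. Each prefs dict maps an agent to an ordered list of
    acceptable partners, most preferred first."""
    # rank[r][p]: position of proposer p on receiver r's list (lower = better)
    rank = {r: {p: i for i, p in enumerate(lst)}
            for r, lst in receiver_prefs.items()}
    free = list(proposer_prefs)               # proposers with no tentative match
    next_choice = {p: 0 for p in proposer_prefs}
    held = {}                                 # receiver -> tentatively held proposer
    while free:
        p = free.pop()
        if next_choice[p] >= len(proposer_prefs[p]):
            continue                          # exhausted list; stays unmatched
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        if r not in held:
            held[r] = p                       # first proposal is held
        elif rank[r][p] < rank[r][held[r]]:
            free.append(held[r])              # displaced proposer re-enters
            held[r] = p
        else:
            free.append(p)                    # rejected; tries next choice
    return {p: r for r, p in held.items()}

students = {"ann": ["city", "mercy", "state"],
            "bob": ["city", "state", "mercy"],
            "cal": ["mercy", "city", "state"]}
hospitals = {"city": ["bob", "ann", "cal"],
             "mercy": ["ann", "cal", "bob"],
             "state": ["ann", "bob", "cal"]}
match = deferred_acceptance(students, hospitals)
```

The "deferred" in the name is the key design idea: because no acceptance is final until the algorithm terminates, no student-hospital pair can end up preferring each other to their assigned match — the stability property Gale and Shapley proved in 1962.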

Roth’s contribution was to treat matching markets as design problems, analogous to the FCC’s spectrum auctions but in domains where money could not be used. He redesigned the NRMP algorithm to handle couples (two doctors who need to be placed in the same city), and he applied matching theory to school choice in Boston and New York City, where families are assigned to public schools through a centralized process.

His most dramatic application was the design of kidney exchange programs. Kidney transplants from living donors often fail because the donor’s kidney is incompatible with the intended recipient. Roth and colleagues devised algorithms for matching incompatible donor-recipient pairs with each other: if your donor is compatible with my recipient and my donor is compatible with yours, we can trade. The resulting exchanges — sometimes involving chains of a dozen or more pairs — have saved thousands of lives.
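In the simplest case — pairwise exchange — the design problem reduces to finding mutually compatible pairs in a directed compatibility graph. A toy greedy sketch (fielded programs instead maximize matches with integer programming and also use longer cycles and nonsimultaneous donor chains):

```python
def two_way_exchanges(compatible):
    """Greedy pairwise kidney exchange. compatible[i][j] is True when
    the donor in pair i can give to the patient in pair j; pairs i and
    j can swap only when compatibility runs both ways."""
    n = len(compatible)
    matched, swaps = set(), []
    for i in range(n):
        for j in range(i + 1, n):
            if i in matched or j in matched:
                continue
            if compatible[i][j] and compatible[j][i]:
                swaps.append((i, j))
                matched.update((i, j))
    return swaps

# Four incompatible donor-patient pairs; entry [i][j] is donor i -> patient j
compatible = [
    [False, True,  False, False],   # pair 0's donor can give to patient 1
    [True,  False, True,  False],   # pair 1's donor: patients 0 and 2
    [False, False, False, True],    # pair 2's donor: patient 3
    [False, True,  True,  False],   # pair 3's donor: patients 1 and 2
]
swaps = two_way_exchanges(compatible)
```

Here pairs 0 and 1 swap, and pairs 2 and 3 swap — four transplants from four donor-recipient pairs that were each individually incompatible.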

Roth shared the 2012 Nobel Prize with Lloyd Shapley for this work. Like the spectrum auctions, it demonstrated that economic theory is not only descriptive but prescriptive — it can design institutions, not just analyze them.

What Lab Experiments Can and Cannot Do

Experimental economics, now more than six decades old, has established itself as an indispensable tool. But every tool has limitations, and the honest practitioners of the field have been explicit about what lab experiments can and cannot tell us.

What they can do. Lab experiments can test the internal logic of a theory: does the theory’s prediction hold when its assumptions are exactly satisfied? They can identify the causal effect of a single variable — a change in the auction format, a shift in information structure, a modification of the rules — by holding everything else constant. They can discover phenomena that theory did not predict, such as the robustness of double-auction convergence or the persistent overbidding in first-price auctions. And they can compare the performance of alternative institutional designs in controlled conditions before deploying them in the field, as the FCC experience demonstrated.

What they cannot do. Lab experiments use student subjects who may differ from real-world decision-makers in experience, motivation, and stakes. The stakes in lab experiments are typically small — a few dollars — and behavior may change when thousands or millions are involved. The time horizons are compressed: a “market” that runs for thirty minutes cannot capture the dynamics of a market that operates over years. And the artificial clarity of the lab — where values are assigned, rules are stated, and information is controlled — may not translate to the messy, ambiguous conditions of the field.

The response to these limitations has been threefold. First, replication: the key findings of experimental economics have been replicated hundreds of times, across different subject pools, countries, and stake levels. Second, the development of field experiments — randomized interventions conducted in real markets, with real stakes and real participants — as a complement to lab experiments. Third, a growing body of work that bridges the two, using lab findings to generate hypotheses that are then tested in the field.

Smith himself was always clear-eyed about the scope of his method. “The experiments tell you what can happen under controlled conditions,” he said. “They don’t tell you what will happen in the wild. But knowing what can happen is enormously valuable, because it tells you what your theory should be able to explain.”

The Legacy: Economics as a Design Science

The deepest legacy of Smith’s experimental revolution, and of the auction and matching-market design work that followed, is a shift in what economics aspires to be. For most of its history, economics was an analytical discipline: it explained why markets work (or don’t), described equilibrium conditions, and predicted the effects of policy changes. It was, in the broadest sense, a science of observation and explanation.

Experimental economics, mechanism design, and market design turned economics into something more: a design science. Economists do not just analyze markets; they build them. They do not just predict auction outcomes; they engineer auctions to achieve specific objectives — revenue maximization, efficient allocation, fairness. They do not just study matching; they create matching algorithms that are deployed in school districts, hospitals, and organ transplant networks.

This shift has practical consequences. The FCC spectrum auctions have generated over $200 billion in revenue for the U.S. government. Kidney exchange programs have facilitated thousands of transplants that would not otherwise have occurred. School choice algorithms have improved the assignment of students to schools in cities around the world.

It also has intellectual consequences. When you design an institution, you learn what matters in ways that pure analysis cannot teach. You discover that theoretical equivalences (first-price and Dutch auctions, second-price and English auctions) break down in practice because real people are not the frictionless agents of theory. You learn that small details of institutional design — the timing of information disclosure, the rules governing bid increments, the structure of tiebreaking — can have large effects on outcomes. You learn that the gap between theory and practice is not noise to be averaged away but a rich source of insight about how humans actually behave in economic environments.

Vernon Smith’s 1956 classroom experiment launched this enterprise. What began as a test of competitive equilibrium theory became a methodology, a field, and ultimately a new way of doing economics — one that takes the design of institutions as seriously as the analysis of their outcomes. The students with the index cards could not have known it, but they were participants in a revolution.