When does competition lead to recognisable values?
Beren Millidge gave an interesting talk at the Post-AGI Workshop (NeurIPS 2025) asking what happens to human values in a world of many powerful AIs. Many have worried that this would produce ruthless competition that erodes all value (see e.g. Meditations on Moloch).
But Millidge claims that human values emerged from competition. This is a story familiar from cultural evolution—more cooperative groups tend to outcompete less cooperative groups—though Millidge frames it in ML terms:
- Reinforcement learning - We have innate drives from evolution: avoid hunger, avoid pain, seek reproduction. These are sparse reward signals.
- Unsupervised learning - We absorb massive amounts of data from culture and society. In terms of pure bits, this vastly outweighs the RL signals.
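A rough back-of-envelope comparison makes the point (the numbers here are my own illustrative guesses, not from the talk): even counting only linguistic input and ignoring the much larger visual stream, the unsupervised channel carries far more bits than plausible reward-like signals.

```python
# Rough back-of-envelope comparison (illustrative numbers, not from the talk):
# information reaching a human via sparse reward-like signals versus via
# unsupervised absorption of language from culture and society.

SECONDS_PER_YEAR = 365 * 24 * 3600
YEARS = 20  # assumed development window

# Sparse RL-like signal: assume roughly one salient reward event per minute,
# carrying a few bits each (hunger, pain, social approval, ...).
reward_events_per_sec = 1 / 60
bits_per_reward_event = 3
rl_bits = YEARS * SECONDS_PER_YEAR * reward_events_per_sec * bits_per_reward_event

# Unsupervised signal: assume ~2 words/second of linguistic input at ~10 bits
# per word, ignoring vision entirely.
words_per_sec = 2
bits_per_word = 10
unsup_bits = YEARS * SECONDS_PER_YEAR * words_per_sec * bits_per_word

print(f"RL-like bits:       {rl_bits:.2e}")
print(f"Unsupervised bits:  {unsup_bits:.2e}")
print(f"Ratio (unsup / RL): {unsup_bits / rl_bits:.0f}x")
```

With these assumptions the ratio comes out in the hundreds; the exact figure depends entirely on the guesses, but the asymmetry is robust to them.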
Once cooperative values have evolved (for competitive reasons), we need mechanisms to maintain and transmit them to the next generation; culture is that mechanism, carrying the values over the high-bandwidth unsupervised channel rather than the sparse innate rewards. Each generation further distills and rationalizes these values, until we end up with very abstract concepts like utilitarianism or deontology. So on this picture, values are cultural/linguistic constructs built on top of basic drives, refined over generations. This is good news for alignment: LLMs are extremely good at understanding these values, because they exist in the cultural corpora we create.
But these values emerged under particular conditions, which may not hold for advanced AIs:
- Roughly equal power. Coalition politics works: if someone gets too powerful, others band together.
- Positive-sum interactions. People have somewhat overlapping goals, allowing for gains from trade.
- Prevention of defection and deception. Cooperation requires enforcement mechanisms.
- Memory and reputation. Tracking past interactions helps identify and punish defectors (see the sketch after this list).
- Communication bandwidth. Coordination and negotiation require communication. The more agents can share, the more sophisticated their cooperation can be.
- Computational limitations. Values are heuristics. Agents with unlimited compute could just figure out when defection is safe.
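To make the memory-and-reputation condition concrete, here is a minimal iterated prisoner's dilemma sketch (my own illustration, not from the talk). Agents that remember an opponent's past moves, like tit-for-tat, reap mutual cooperation with each other while limiting what an unconditional defector can extract.

```python
import itertools

# Standard prisoner's dilemma payoffs (T=5 > R=3 > P=1 > S=0).
PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
          ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

def tit_for_tat(my_hist, their_hist):
    # Requires memory of the opponent's past moves: cooperate first, then mirror.
    return their_hist[-1] if their_hist else 'C'

def always_defect(my_hist, their_hist):
    return 'D'

def always_cooperate(my_hist, their_hist):
    return 'C'

def play(strat_a, strat_b, rounds=100):
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        a, b = strat_a(hist_a, hist_b), strat_b(hist_b, hist_a)
        pa, pb = PAYOFF[(a, b)]
        hist_a.append(a); hist_b.append(b)
        score_a += pa; score_b += pb
    return score_a, score_b

# A small population: mostly agents that remember (tit-for-tat), plus one
# unconditional defector and one unconditional cooperator.
population = [('tft_1', tit_for_tat), ('tft_2', tit_for_tat),
              ('tft_3', tit_for_tat), ('defector', always_defect),
              ('naive', always_cooperate)]

totals = {name: 0 for name, _ in population}
for (name_a, strat_a), (name_b, strat_b) in itertools.combinations(population, 2):
    score_a, score_b = play(strat_a, strat_b)
    totals[name_a] += score_a
    totals[name_b] += score_b

# The tit-for-tat agents outscore the defector: memory lets them reap mutual
# cooperation with each other while limiting how much the defector extracts.
for name, total in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{name:>10}: {total}")
```

In this toy population the remembering agents come out ahead of the defector precisely because they can condition on history; strip the memory away and the game collapses to the one-shot dilemma, where defection dominates.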
So: will a world of many competing AIs satisfy these conditions?
- Power gaps - If one AI becomes vastly more powerful than all others combined, cooperation loses value for it. But there’s a big gap between “more powerful than anybody” and “more powerful than everybody.” As long as coalitions can form, the game theory still favors cooperation.
- Positive-sum interactions - AIs might be more monomaniacal: if one cares only about paperclips and another only about staplers, there’s less room for positive-sum exchange.
- Perfect monitoring - AIs might be able to read each other’s thoughts. If deception becomes impossible, we get a hyper-cooperative world, but human values around defection-prevention (anger, envy, suspicion) would disappear.
- Fluid agency - AIs can merge, fork, share utility functions. If agents can just combine rather than negotiate, the game theory changes fundamentally. “Agent” stops being a fixed unit.
- Long time horizons - AIs are effectively immortal with perfect memory. Longer games favor cooperation (a worked example follows this list). This one probably helps.
- Computational power - If AIs can do the full game-theoretic calculation every time, they don’t need heuristic values, though there would still be cooperation.
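As a toy version of the calculation such an agent could simply perform (again my own worked example, not Millidge's): in an infinitely repeated prisoner's dilemma with grim-trigger punishment, cooperation pays exactly when the continuation probability delta is high enough, which is also why longer expected games favor cooperation.

```python
# Worked version of the calculation a sufficiently powerful agent could do
# directly (illustrative): in an infinitely repeated prisoner's dilemma with
# grim-trigger punishment, when is defection "safe"?

T, R, P = 5, 3, 1  # temptation, mutual-cooperation reward, mutual-defection punishment

def cooperation_pays(delta):
    # Compare cooperating forever against defecting once and then being
    # punished with mutual defection forever after.
    cooperate_forever = R / (1 - delta)
    defect_once = T + delta * P / (1 - delta)
    return cooperate_forever >= defect_once

# Solving the comparison algebraically gives the familiar threshold.
threshold = (T - R) / (T - P)
print(f"cooperate iff delta >= {threshold}")  # 0.5 with these payoffs

for delta in (0.3, 0.5, 0.9, 0.99):
    verdict = "cooperate" if cooperation_pays(delta) else "defect"
    print(f"delta={delta:.2f} -> {verdict}")
```

With these payoffs the threshold is delta = 0.5. An agent that can run this comparison on the fly needs no internalized norm against defection; it just checks whether the inequality holds each time.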
So this multipolar outcome might produce cooperation without producing values as we know them.
Millidge speculates further. If AIs can communicate with high bandwidth and merge fluidly, coordination costs disappear. Empires throughout history have been limited by principal-agent problems: rulers cannot fully monitor or trust the distant agents acting on their behalf. AI copies with telepathic links don't have this problem. This could enable “super-minds” - a transition analogous to single cells becoming multicellular organisms. Individual minds might combine into larger structures, with extreme specialization (like cells in a liver versus free-living bacteria).
This might just recreate the singleton. But if so, it’s a singleton whose values are some combination of all the minds that joined it - not some arbitrary utility function that emerged from unilateral optimization. Or we might get slime mold dynamics: minds combine for specific purposes, then disperse. Fluid coalitions rather than permanent merger.