Just how does cooperation evolve? Under Darwin's survival of the fittest, cooperation doesn't seem to make sense. How do you benefit from sacrificing for someone else? That's the problem game theory sets out to solve, and along the way, game theorists found that cooperation emerges as a normal way of evolving to win in a competitive environment. The Evolution of Cooperation walks through the models and competitions that led to a better understanding of how we evolved to cooperate.
Scientific Computer Programming
It was my junior year of high school, and I got into a class named Scientific Computer Programming. It was so named, I was told, because the science department wanted to teach it, and there was some conversation about whether the math department should be allowed to teach it. The man who taught it was also my physics teacher. He was a tall man and a gentle but imposing force to be reckoned with. Somewhere along the way, he prepared us to write a competitive game.
I didn't know it then, but I do know now, that it was a variant of the game that Robert Axelrod ran: The Prisoner's Dilemma. The basic construct is that two prisoners are caught and separated. Each is given a deal. They can rat out their co-conspirator – defect – for a lesser sentence. If both defect, they both end up with long sentences. If both cooperate (don't defect), both end up with shorter sentences. If one defects and the other doesn't, the defector gets the best possible deal, while the one who didn't defect gets the worst possible result – even worse than if both had defected.
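That payoff structure can be written down directly. Axelrod's tournaments used specific values – 3 points each for mutual cooperation, 1 each for mutual defection, and 5 for a lone defector against 0 for the lone cooperator – and a sketch in Python might look like this:

```python
# Axelrod's tournament payoffs: (my points, opponent's points),
# keyed by (my move, opponent's move); "C" = cooperate, "D" = defect.
PAYOFFS = {
    ("C", "C"): (3, 3),  # reward for mutual cooperation
    ("C", "D"): (0, 5),  # sucker's payoff vs. temptation
    ("D", "C"): (5, 0),  # temptation vs. sucker's payoff
    ("D", "D"): (1, 1),  # punishment for mutual defection
}

def score(my_move, their_move):
    """Return (my points, their points) for one round."""
    return PAYOFFS[(my_move, their_move)]
```

Note that defecting is always individually better (5 beats 3, and 1 beats 0), which is exactly what makes mutual cooperation so hard to reach.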
There are some more rules, like non-communication between the two prisoners (competing programs) and so forth, but the one interesting thing is that you can record what happened in prior moves. In our class, I remember I took the average of every prior move that the opponent had made and used that to predict what they would do next.
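I no longer have that code, but a reconstruction of the idea – treating each prior move as a data point and mirroring the opponent's majority behavior – might look something like this (the 0.5 threshold and the exact logic are my guesses, not the original program):

```python
def averaging_strategy(opponent_history):
    """Predict the opponent's next move from the average of their past
    moves ("C" = cooperate, "D" = defect) and play the predicted move
    back at them. Cooperates when there is no history yet."""
    if not opponent_history:
        return "C"
    cooperation_rate = sum(
        1 for move in opponent_history if move == "C"
    ) / len(opponent_history)
    # Mirror the opponent's majority behavior.
    return "C" if cooperation_rate >= 0.5 else "D"
```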
In our case, as in the second edition of Axelrod's competition, there was no way for a program to know when the last round would be played – each round had a small probability of being the last. This prevents the strategy of always defecting on the last move: with no retaliation to fear, defecting at a known endpoint would always be the better play.
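One way to model the "no known endpoint" rule is to end each match with a small fixed probability after every round; the probability value below is illustrative, not Axelrod's actual parameter:

```python
import random

def play_match(strategy_a, strategy_b, end_probability=0.01, rng=None):
    """Play an iterated Prisoner's Dilemma match that ends with a small
    fixed probability after every round, so neither side can know which
    round is last. Each strategy is a function that receives the
    opponent's move history and returns "C" or "D"."""
    rng = rng or random.Random()
    payoffs = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
               ("D", "C"): (5, 0), ("D", "D"): (1, 1)}
    history_a, history_b = [], []
    score_a = score_b = 0
    while True:
        move_a = strategy_a(history_b)  # each sees the *other's* past moves
        move_b = strategy_b(history_a)
        points_a, points_b = payoffs[(move_a, move_b)]
        score_a += points_a
        score_b += points_b
        history_a.append(move_a)
        history_b.append(move_b)
        if rng.random() < end_probability:
            return score_a, score_b
```

Because the expected number of remaining rounds is the same after every move, there is never a "last move" to exploit.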
I didn't realize back then that I was learning about game theory. It was just another assignment in the class – which, though I liked it, was still a class. I got a reintroduction to it in Gottman's The Science of Trust. In a book on relationships, it seemed like a stretch. That being said, it carried an important lesson, one that Axelrod's competitions played out. Gottman contrasts two possible equilibria. The first is the von Neumann-Morgenstern equilibrium, where everyone looks out only for their own best interests. The second is the Nash equilibrium, where people look out for the overall good – not just their own.
Axelrod showed through the simulations how independent programs – or organisms – could collectively develop towards the Nash equilibrium, even using something as simple as an eye for an eye.
Tit for Tat
As it turns out, my program was beaten rather handily by some others in the competition, but it was fun anyway. What I didn't expect was that Axelrod's competition, which drew entries from scholars in many different disciplines, was won by a very simple program: Tit for Tat. It cooperates on the first move, and on every move after that, it simply does what the other program did last. Thus, if the other program defected, Tit for Tat would defect. It's very simple, but its simplicity got great results.
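The entire strategy fits in a few lines. A sketch, using "C" for cooperate and "D" for defect:

```python
def tit_for_tat(opponent_history):
    """Cooperate on the first move; afterwards, simply repeat
    whatever the opponent did on their last move."""
    if not opponent_history:
        return "C"
    return opponent_history[-1]
```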
While Tit for Tat didn’t ever get the highest score, it always got a good score. Whether it was competing with itself or other strategies, overall, Tit for Tat was the winner.
Characteristics for Cooperation
Axelrod took his findings from running these competitions with many different programs and generalized a set of principles that defined the winners. He asserts that, to win this game, a program needed to be:
- Nice – The nice programs won over not-so-nice programs
- Provokable – The program needed to respond quickly when the opponent defected.
- Forgiving – Once the other program started to cooperate, the program should start to cooperate too.
- Clear – The program should make its behavior clear enough that the opponent would be able to understand the behavior and learn to work towards mutual benefit.
Tit for Tat was an ideal approach based on these rules. It started with cooperation, and after a single defection, it would defect to penalize the opposing program. Once the opposing program started responding with cooperation, it would respond in kind. Its logic and approach were neither complex nor cloaked. The other program could easily anticipate how Tit for Tat would respond after only a few rounds.
Barriers to Cooperation
Tit for Tat, you may recall, never got the highest score – it couldn’t. However, it did consistently get good scores. Tit for Tat avoided some of the barriers that other entrants had, like:
- Being Envious – Programs that worried about beating their opponent in the current match were less effective.
- First to Defect – Programs that were the first to defect tended to do less well.
- Failure to Reciprocate – Whether it’s niceness or not-so-niceness, programs that gained long-term cooperation tended to reciprocate.
- Being Clever – Programs that were too clever never created the conditions where the other program could predict their responses.
All in all, these barriers to cooperation are largely opposites of the kinds of characteristics that drive cooperation.
Some interesting learnings show up if you randomize the different programs and run a sort of evolutionary game with them. The programs that collectively get the most points every few rounds get to replicate, and those that consistently lose die out. The result is that the programs that weren't nice may have won for a while when there were "sucker" programs, but when the suckers died out, the not-nice programs eventually became extinct as well.
There were some challenges, however. Once an environment became All D (short for "all defect"), it was impossible for any strategy – including Tit for Tat – to gain a foothold. However, if you introduced new programs in clusters, so that at least some of their interactions would be with like programs, it was possible for programs like Tit for Tat not only to get a foothold but also to start to eradicate All D. This works because the benefits of two programs collaborating far outweigh the benefits of two programs defecting. Tit for Tat does so much better against itself that it ends up with a point surplus, even though it gives up one move to All D. (In the first round, All D will defect and Tit for Tat will cooperate, putting Tit for Tat at a disadvantage – however, a round with two Tit for Tat programs handily makes up for this small difference.)
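With Axelrod's payoff values (3 each for mutual cooperation, 1 each for mutual defection, 5 and 0 for lone defector and lone cooperator), the cluster arithmetic can be made concrete. The round count and cluster fraction below are illustrative, and the sketch assumes the invading cluster is small enough that All D's average score is essentially its mutual-defection score:

```python
def tft_vs_tft(rounds):
    """Two Tit for Tat programs cooperate every round: 3 points each."""
    return 3 * rounds

def tft_vs_alld(rounds):
    """Tit for Tat is suckered once (0 points), then both defect (1 each).
    Returns (Tit for Tat's score, All D's score)."""
    tft = 0 + 1 * (rounds - 1)
    alld = 5 + 1 * (rounds - 1)
    return tft, alld

def cluster_average(p_same, rounds):
    """Average per-match scores when a fraction p_same of a Tit for Tat
    program's matches are against its own cluster, in a world that is
    otherwise All D. Returns (Tit for Tat average, All D average)."""
    tft_score, _ = tft_vs_alld(rounds)
    tft_avg = p_same * tft_vs_tft(rounds) + (1 - p_same) * tft_score
    alld_avg = 1 * rounds  # All D vs. All D: mutual defection throughout
    return tft_avg, alld_avg
```

Even when only 10% of a Tit for Tat program's matches are against its own kind, ten-round matches give it an average of 11.1 points against All D's 10 – enough for the cluster to grow.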
Applicability to Life
Perhaps my greatest concern with the exercise and its learnings is their applicability to our lives as humans. I don't intend to discount what we've learned; rather, I want to make clear the narrow space in which it works.
One of the key driving factors for this game is that the collective payoff for mutual cooperation is larger than the collective payoff when one side defects and the other cooperates, or when both sides defect. Effectively, this is a built-in bias that cooperation is the winning move – when you can get both parties to agree. The good news is that, on this front, I expect we're on relatively safe footing. In most cases in life, we're better off cooperating than competing or attempting to take advantage of one another.
Will We Meet Again
The second concern is that the game presumes the participants will meet again. In fact, retaining the probability that the participants will meet again is absolutely key to the system working. When you remove the chance that you'll meet again, the best strategy is to defect. As mentioned earlier, this is why Axelrod modified the game to have no fixed endpoint, since a known endpoint invited defections on the final moves.
In our world today, it's unclear to me how much we expect to meet others again. It's unclear how much our reputations precede us and how much our behaviors impact our future interactions. I know that they should. I know that, for the system to remain stable, we must believe we'll meet again, because otherwise there's no point in working with someone – the payoffs are better if you take advantage of them.
As we move from smaller communities to larger towns, and we have higher mobility, I’m concerned that this critical condition may be lost.
Value of the Future
Another inherent requirement is that the future not be discounted too much. It's important that we believe that giving up some benefit today is worth the future benefits of cooperation. This belief obviously weakens when we don't think we'll meet again – and then we can't assume cooperation. More than that, though, we learned in Thinking, Fast and Slow that we discount the future more than we should. Whatever true value we get in the future has to outweigh our cognitive bias against it.
Lack of Escalation
In most things in life, there's a compounding that happens. Compounding of interest at a modest 12% causes money to double in six years. The Rise of Superman, Flow, and Finding Flow spoke about how a 4% improvement each year in skill can lead to performances that seem impossible today. Fundamental to The Prisoner's Dilemma game is a lack of escalation: the payoffs stay the same from round to round, so the future is never seen as more valuable than the present – and that can be problematic.
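The six-year figure is the rule of 72 (72 / 12 = 6); the exact compound-growth calculation lands close by:

```python
import math

def years_to_double(annual_rate):
    """Return (exact doubling time, rule-of-72 estimate) for a compound
    annual growth rate expressed as a decimal (0.12 = 12%)."""
    exact = math.log(2) / math.log(1 + annual_rate)
    rule_of_72 = 72 / (annual_rate * 100)
    return exact, rule_of_72
```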
Variation and Non-Zero Sum
Most of life isn’t zero sum – and The Prisoner’s Dilemma illustrates this to some degree. However, the stability of the outcomes for each round isn’t what we see in life. Some interactions are more important than others. They’re worth more. In effect, there’s a relatively large degree of variation in real life in terms of the rewards (or punishments), but these aren’t captured. In real life, someone can die – or exit the game – if they’re hurt too badly. However, this is explicitly prevented by the rules in The Prisoner’s Dilemma.
There's a human behavior that has fascinated economists. As I mentioned in my review of Drive, there's a strange human behavior in The Ultimatum Game. In short, two people play with a fixed amount of money (say $10) that the first person decides how to split. The second person gets to decide whether both get the split – or neither gets anything. From an economist's point of view, the second person should always accept the split, because any amount leaves them better off than nothing. However, that's not what happens.
If the split gets too unbalanced – say $7-$3 or $8-$2 – the second person starts to prevent either person from getting money. Seen in the context of The Evolution of Cooperation, this makes sense. It’s necessary for someone to punish the other when their behavior exceeds acceptable boundaries.
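A minimal model of this behavior adds a fairness threshold to the responder. The $3 threshold below is an illustrative stand-in for observed behavior, not a measured constant:

```python
def ultimatum_payoffs(offer_to_responder, total=10, rejection_threshold=3):
    """One round of the Ultimatum Game: the proposer offers a split of
    `total`; the responder rejects (both get 0) if the offer falls below
    a fairness threshold. Returns (proposer's payoff, responder's payoff)."""
    if offer_to_responder < rejection_threshold:
        return 0, 0  # punished: neither side gets anything
    return total - offer_to_responder, offer_to_responder
```

With this threshold, a $7-$3 split is grudgingly accepted, while an $8-$2 split costs both sides everything – the responder pays $2 to punish the proposer.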
An area of concern that Collaboration raised was the issue of "social loafing": people not pulling their own weight and relying on others to do all the work. Accountability is the proposed solution. We see this in The Prisoner's Dilemma, where it's important for misbehaving programs – or people – to be punished. We also see that when all the "suckers" are weeded out, those living off of them die out as well. So evolution has primed us to weed out social loafers.
For each of us, there’s a line between “social loafing” and being able to contribute our fair share to the community. Finding that line seems to be one of the things that happens in The Evolution of Cooperation.