A group of researchers set out to answer this question, and published their results in Science last week. To tackle the issue, they set up a computer tournament modeled on Robert Axelrod’s ‘Prisoner’s Dilemma’ competitions of the late 1970s. In this type of tournament, entrants submit computerized strategies that compete against each other in a virtual world. Individuals, or “agents,” with the most successful strategies survive and reproduce, while less successful strategies die out.
In each round of the social learning tournament, automated agents could choose from 100 behaviors, each of which returned a certain payoff. The payoffs changed over the course of the tournament, simulating changing environmental conditions that might render a behavior more or less useful. In any round, agents could make one of three moves: use a behavior they already knew (Exploit), use asocial learning to test a new behavior by trial-and-error (Innovate), or learn socially by watching a behavior that another agent was performing in that round (Observe). Out of the three possible moves, only Exploit resulted in a payoff; the two learning moves would only return information about how profitable the behavior was in the current environmental conditions. Social learning was especially costly; if Observe was played when no other agent was performing a novel behavior, the agent learned nothing.
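To make the three moves concrete, here is a minimal Python sketch of a single agent's turn. It is an illustration of the rules as described above, not the tournament's actual code: the function name `play_move`, the dictionary-based repertoire, and the random choice among candidate behaviors are all assumptions made for the example.

```python
import random

N_BEHAVIORS = 100  # the article states that agents chose from 100 behaviors


def play_move(move, repertoire, payoffs, exploited_by_others, rng, choice=None):
    """Apply one move for one agent and return the payoff earned this round.

    repertoire: dict mapping behavior index -> payoff seen when it was learned
                (hypothetical representation, assumed for this sketch).
    payoffs: list with the current payoff of every behavior.
    exploited_by_others: behaviors other agents are performing this round.
    """
    if move == "exploit":
        # Only Exploit earns a payoff, and only for a behavior already known.
        if choice not in repertoire:
            raise ValueError("can only exploit a known behavior")
        return payoffs[choice]
    if move == "innovate":
        # Asocial learning: try out a behavior the agent does not know yet.
        unknown = [b for b in range(len(payoffs)) if b not in repertoire]
        if unknown:
            b = rng.choice(unknown)
            repertoire[b] = payoffs[b]
        return 0.0  # learning moves return information, not payoff
    if move == "observe":
        # Social learning: nothing is learned unless another agent is
        # performing a behavior that is new to this agent.
        novel = [b for b in exploited_by_others if b not in repertoire]
        if novel:
            b = rng.choice(novel)
            repertoire[b] = payoffs[b]
        return 0.0
    raise ValueError(f"unknown move: {move}")
```

In this sketch, an Observe move played while every visible behavior is already in the repertoire leaves the agent with nothing new and no payoff, which is the cost described above.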
Over 10,000 rounds, agents had a constant probability of dying, but their ability to reproduce was based on their “success,” or the total of the payoffs they had received. Each strategy’s final score was determined by its average frequency in the population during the final 2,500 rounds.
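The scoring and reproduction rules can be sketched in a few lines of Python. The helper names below are hypothetical, and the exact reproduction rule is an assumption: the sketch picks a parent with probability proportional to its total payoff, which is one simple way to make reproduction depend on "success."

```python
import random


def final_score(frequency_by_round, window=2500):
    """Mean population frequency of a strategy over the last `window` rounds."""
    tail = frequency_by_round[-window:]
    return sum(tail) / len(tail)


def pick_parent(total_payoffs, rng):
    """Pick an agent index with probability proportional to its total payoff.

    Assumption for this sketch: if nobody has earned anything yet,
    every agent is equally likely to reproduce.
    """
    if sum(total_payoffs) <= 0:
        return rng.randrange(len(total_payoffs))
    indices = range(len(total_payoffs))
    return rng.choices(indices, weights=total_payoffs, k=1)[0]
```

For example, a strategy that is absent for the first 7,500 rounds but makes up half the population for the last 2,500 would score 0.5 under `final_score`.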
The researchers received submissions of agents from academics, graduate students, and high-schoolers from 16 different countries. A wide range of disciplines was represented, including computer science, philosophy, neuroscience, and primatology. Entries could be submitted as MATLAB functions or as pseudocode, a series of verbal, mathematical, and logical instructions describing how decisions should be made.
Out of 104 submitted strategies, one called discountmachine was the runaway winner. The researchers expected that the best strategies would balance Observe and Innovate moves, in order to limit the costs associated with social learning. Surprisingly, discountmachine (as well as the second-place strategy, intergeneration) used almost exclusively social, rather than asocial, learning. The results suggest that social learning was successful because agents playing Observe were learning behaviors that other agents had chosen to play based on their high payoffs; in other words, despite the potential cost, they were consistently learning behaviors with high returns.
The most successful strategies relied more heavily on information that was recently acquired, since knowledge of the payoffs was up-to-date. However, discountmachine went one step further than other strategies, varying the use of outdated information based on how quickly the environmental conditions were changing. When the environment was changing rapidly, old information was discounted much more heavily than when conditions were relatively stable.
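One way to picture this kind of discounting is the small Python sketch below. It is not the discountmachine algorithm, whose details are not given here; the exponential form, the function name `discounted_value`, and the `baseline` parameter (a fallback estimate such as an average payoff) are all assumptions chosen for illustration.

```python
def discounted_value(observed_payoff, age, change_rate, baseline):
    """Estimate what an old observation is worth now.

    observed_payoff: payoff recorded when the behavior was last learned.
    age: number of rounds since that observation.
    change_rate: estimated per-round chance that a payoff has changed (0 to 1).
    baseline: fallback estimate used once the observation is stale (assumed).
    """
    # Chance that the payoff has NOT changed in `age` rounds.
    still_valid = (1.0 - change_rate) ** age
    # Blend the old observation with the baseline according to that chance.
    return still_valid * observed_payoff + (1.0 - still_valid) * baseline
```

With a high `change_rate`, an observation only a few rounds old is already pulled most of the way toward the baseline; with a low one, the same observation keeps nearly all of its weight.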
Even when the researchers ran repeated simulations slowing the rate of environmental change, increasing the probability of social learning errors, and increasing the cost of social learning, the top-ranked strategies still dominated, suggesting that highly social strategies are adaptive across a wide range of conditions. There were, however, a few situations in which social learning didn’t pay. Playing Observe was only beneficial when there were other agents around to imitate. Social learning was also less successful when the researchers eliminated the possibility of incorrectly imitating a behavior, which suggests that this kind of copying error may be a source of behavioral diversity in populations.
Thanks to this tournament, winners Daniel Cownden and Timothy Lillicrap, the graduate students who created discountmachine, are £10,000 richer, and scientists have a much better grasp of when social learning pays, when it doesn’t, and why it is such a successful strategy.
Source: Arstechnica.com