The prisoners’ dilemma

 

             the old story about two friends…

 

Two suspects, Harold and William, are taken into custody and separated. The policeman accuses them of a crime but lacks sufficient evidence to convict them, unless at least one of them confesses. He explains the consequences following the two actions they could take: namely confessing the crime or not confessing it.

 

what are the consequences? …

 

The policeman says:

”If neither of you confess, then both will be convicted of a minor offence and sentenced to 1 month in jail.

If you both confess, then I will sentence you to jail for 6 months.

Finally, if only one of you confesses then he will be treated leniently and will be freed; while his confession is used as a witness against the other, who will be sentenced to 9 months in jail; 6 for the crime and 3 for obstructing the justice.”

 

the consequences written in a simple form …

 

The bi-matrix form below presents the four possible outcomes, or payoff pairs, depending on the actions chosen by Harold and William. They can be presented as pairs of prison years. The first number in each cell is Harold's payoff and the latter is that for William:

 

 

The bi-matrix is read as follows: if, for example, Harold confesses but William does not, then Harold’s payoff is 0, representing immediate release, and William’s payoff is -9 representing 9 months in jail.

 

representation as a game …

 

Text Box: By definition, representation as a normal form game specifies:

•	The players in the game
•	The action alternatives, called strategies for each player
•	The payoffs received by the players for each possible combination of strategies.

 

In prisoners’ dilemma game, the players are Harold and William. Each player has two strategies available: confess, denoted by c, and not confess, nc. The payoffs to the players depending on the chosen pair of strategies are given in the corresponding cell of bi-matrix.

 

… Nash equilibrium to the game …

 

We can easily deduce the solution to the prisoners’ dilemma game. After having learnt the consequences of their crime, William thinks in his cell what is his best choice, or best response strategy, if Harold chooses to play c or nc. His best response in both cases is c. Similarly, Harold thinks in his cell to play c because it is always his best response to whatever William might choose. Hence both players want to confess and the strategy pair (c,c) is a self evident solution to the prisoners’ dilemma game and both players will be 6 months in jail. This solution is called the Nash equilibrium solution of the game.

 

Text Box: By definition, a strategy pair is Nash equilibrium solution if each player’s strategy is the best response to the other player’s strategy.

 

In Nash equilibrium neither of the players wants to deviate from his strategy alone.

 

… Pareto optimal outcomes …

 

Note, however, that there is an outcome in this game, see the bi-matrix, which would give both players a better payoff, namely -1,-1, meaning one month in jail for both of them. Because the self-evident Nash equilibrium giving worse payoffs is chosen, there is a dilemma hidden in this game. The outcome -1, -1 is called a Pareto optimal outcome.

 

Text Box: By definition, an outcome is Pareto optimal if any other outcome gives a worse outcome for at least one of the players.

 

Hence, the outcomes 0, -9 and -9, 0 are Pareto optimal as well.

 

    the game in mathematical form …

 

Let si denote an arbitrary strategy for player i; i refers either to player 1 or 2. The set of all strategies available to player i is denoted by Si and called i’s strategy set. We denote siÎSi to indicate that si belongs to the strategy set Si. Let ui denote player i’s payoff function. It specifies all possible payoffs for player i for each combination (s1, s2) that might be chosen by the players.

 

In prisoners’ dilemma game we let player 1 be Harold and player 2 William. Hence, if s1ÎS1, it is either c or nc and similarly for s2ÎS2. We may also write S1=S2={c,nc}. From the bi-matrix we can read the values ui for all combinations (s1,s2), e.g.,  u1(nc,nc)=-1, u1(c,nc)=0, u2(nc,c)=-9, u2(c,c)=-6.

 

    the solution to the game …

 

The solution to the game above is a strategy pair  with the following property: The strategy  is the best strategy for player 1, provided that player 2 chooses to play  and vice versa. Mathematically the solution satisfies

 

Hence, each player is willing to choose the strategy indicated by the solution provided the other player also does so.

 

This solution to the game is called Nash equilibrium.

 

tree representation of the game …

 

The game can be represented as a tree form, too. It is called an extensive form game:

 

              

Text Box: By definition, in the game tree, there are: 

•	nodes at which the players make their decisions 
•	arcs connecting them. 

 

 

On the top of the game tree, there is first Harold’s decision node, called initial node, but since here the timing of the decisions is irrelevant it could be William’s decision node, as well.

 

how to read the tree? …

 

For example, let Harold play c. Then William’s decision node on the left is reached and William makes his decision. William’s decision is then followed by a terminal node and the game ends. The payoffs for the players are shown below the terminal nodes.

 

Note that when William is making his decision, he does not know Harold’s decision, i.e., in which of his decision nodes he actually is in the game tree. This ignorance is denoted by connecting the corresponding nodes with the red dotted line.