[Computer-go] MCTS and Point Differential

Brian Sheppard Mon, 04 Jul 2011 20:01:41 -0700

Related to the "perfect endgame" thread, but different...

Fuego claims that adding a few percent of point differential to the result
of a trial results in a stronger player. Pachi later confirmed that result,
and I have confirmed it in Pebbles as well.
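
To make "adding a few percent of point differential" concrete, here is a
minimal sketch in Python. It is not the formula used by Fuego, Pachi, or
Pebbles; the function name trial_value, the linear blend, and the clamp at
361 points are all assumptions made for illustration.

```python
def trial_value(score_diff, bonus_weight=0.02, max_diff=361.0):
    """Blend a win/loss result with a small score term (hypothetical form).

    score_diff   : final point differential from the mover's point of view
    bonus_weight : share of the result taken from the score (assumed 2%)
    max_diff     : clamp for the differential (assumed: 19x19 board area)
    """
    win = 1.0 if score_diff > 0 else 0.0
    # Clamp the differential, then map it to [0, 1] so that it lives on the
    # same scale as the win/loss result.
    clamped = max(-max_diff, min(max_diff, score_diff))
    normalized = (clamped + max_diff) / (2.0 * max_diff)
    return (1.0 - bonus_weight) * win + bonus_weight * normalized


for diff in (-20, -1, 1, 20):
    print(diff, trial_value(diff))
```

With a small weight, every win still scores above every loss, so the score
term only separates results that share the same win/loss outcome.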


The standard explanation for this is that the small bias (just 2% in Fuego,
4% in Pachi) helps to avoid losses by endgame blunders. Well, that might be,
but I see something more fundamental.

When you score a game in a Win/Loss dimension, there is only one player who
can make an error: the side that is winning. For the loser, all moves are
losing. So a playout stumbles to the right conclusion if it contains an even
number of errors, and if it contains an odd number of errors then it reaches
the wrong conclusion. If you take a probability P of making an error on each
move and model the probability of making an even number of errors, then you
will find that this is a daunting model: that probability drifts toward 1/2
as the playout gets longer. You might doubt that MCTS could ever work.
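
The even-number-of-errors model can be worked out directly. The sketch below
assumes that errors are independent and that each of n opportunities has the
same probability p, which is a simplification of a real playout. The
function names are made up for this example. The closed form is
(1 + (1 - 2p)^n) / 2, and the second function checks it against the binomial
sum.

```python
from math import comb


def prob_even_errors(p, n):
    """Closed form for P(even number of errors) over n independent chances."""
    return 0.5 * (1.0 + (1.0 - 2.0 * p) ** n)


def prob_even_errors_by_sum(p, n):
    """Same quantity, summing the binomial terms with an even error count."""
    return sum(
        comb(n, k) * p**k * (1.0 - p) ** (n - k) for k in range(0, n + 1, 2)
    )


for n in (1, 10, 50, 100):
    print(n, prob_even_errors(0.1, n), prob_even_errors_by_sum(0.1, n))
```

Because (1 - 2p)^n shrinks geometrically, the playout result under this
model approaches a coin flip once n is large, for any p strictly between 0
and 1.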

But MCTS works quite well, and I think that it is because of point
differential.

In a point differential model, *both* players can make errors. So the point
differential takes a random walk from the leaf of the tree to the terminal
position. The trial reaches the right conclusion if the random walk crosses
zero an even number of times.

In a random walk, the probability of crossing zero depends on how far from
zero you start. So if one tree node is better (that is, higher point
differential) than another, a simulation trial from that node is more
likely to result in a win.
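
A toy version of this random walk shows the effect. The sketch assumes a
symmetric walk in which every playout move shifts the differential by +1 or
-1 with equal probability; real playouts are certainly not like that, and
the function name win_probability is hypothetical. Ending above zero is the
same as having crossed zero an even number of times when the start is above
zero.

```python
from math import comb


def win_probability(start_diff, steps):
    """P(final differential > 0) for a symmetric +1/-1 walk of `steps` moves.

    start_diff : point differential at the leaf (assumed to be an integer)
    steps      : number of playout moves from the leaf to the terminal position
    """
    # After k up-steps and (steps - k) down-steps the walk ends at
    # start_diff + 2k - steps. Count the paths that end above zero.
    winning_paths = sum(
        comb(steps, k)
        for k in range(steps + 1)
        if start_diff + 2 * k - steps > 0
    )
    return winning_paths / 2**steps


for start in (1, 5, 10, 20):
    print(start, win_probability(start, 100))
```

The printed probabilities rise with the starting differential, which is the
sense in which a better leaf wins more of its trials.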

To get back to Fuego's finding: why does adding in some point differential
help? Because the larger the point differential of the terminal position,
the higher (on average) was the point differential of the leaf node.

The random walk that takes a leaf node to a terminal node is invertible, so
the same probability distribution relates the leaf and terminal positions.
Accordingly, we can use the terminal point differential to compute a
probability distribution of the leaf node, and that distribution implies a
probability that the leaf node is winning.
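
The inversion can be sketched with Bayes' rule. This example reuses the toy
symmetric +1/-1 walk and additionally assumes a flat prior over leaf
differentials within a bounded range; both are assumptions for illustration,
and the names step_pmf and prob_leaf_winning are hypothetical.

```python
from math import comb


def step_pmf(delta, steps):
    """P(a symmetric +1/-1 walk of `steps` moves shifts the score by delta)."""
    if abs(delta) > steps or (delta + steps) % 2:
        return 0.0
    return comb(steps, (delta + steps) // 2) / 2**steps


def prob_leaf_winning(terminal_diff, steps, max_abs_diff=60):
    """Posterior P(leaf differential > 0 | terminal differential).

    Uses a flat prior over leaf differentials in
    [-max_abs_diff, max_abs_diff] (an assumption).
    """
    # Likelihood of reaching terminal_diff from each candidate leaf value.
    weights = {
        leaf: step_pmf(terminal_diff - leaf, steps)
        for leaf in range(-max_abs_diff, max_abs_diff + 1)
    }
    total = sum(weights.values())
    return sum(w for leaf, w in weights.items() if leaf > 0) / total


for terminal in (-10, -2, 2, 10):
    print(terminal, prob_leaf_winning(terminal, 20))
```

Under these assumptions a terminal position that is further ahead implies a
higher probability that the leaf was winning, so the size of the final
margin carries information that the bare win/loss result discards.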

So, without doubting the standard theory about how point differential could
reduce yose errors, I see point differential as a factor in opening and
middle play, too.

Brian

_______________________________________________
Computer-go mailing list
Computer-go@dvandva.org
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
