Hi Antti, I had a quick look at your numbers. Maybe I misunderstood something, but at first glance there appears to be a parity effect (an even number of 100% blunder moves always get it right).
How do the statistics look if the game length is odd? If it matters, maybe you should sample over a reasonable distribution of game lengths, or otherwise just average odd and even.

Erik

On 7/28/07, Antti Huima <[EMAIL PROTECTED]> wrote:
>
> Hi,
>
> there was a discussion some time ago about whether it pays off to improve the
> quality of an MC play-out agent or not, and how important it is to keep it
> "balanced", so I performed the following abstract experiment:
>
> Assume that we start from a position that is a game-theoretic win for Black.
> If we play out moves from this position--say for instance 100--then every
> move can either switch the game-theoretic value of the present position
> (blunder) or not (in general a correct move). Of course the only way for a
> player to switch the game-theoretic value is by blundering in a position
> that is won for the player and ending up in a position that is lost for the
> same player: there is no way to "blunder" a lost position into a won one.
>
> I implemented a simple C program that calculates the probability of ending
> up with the correct game-theoretic value at the end of the simulation when the
> probability of blundering, when possible, is given as a function of the move
> number.
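(The original C program is not part of this thread. As a rough illustration only, here is a minimal sketch of how such a play-out experiment could be coded. The names correct_prob_fn, flat_half, always_blunder and estimate_correct are made up for this sketch, and it assumes that Black moves first and that plies are numbered from 0.)

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical interface: probability that the move at ply k (0-based)
 * of an n-ply game is correct, given that a blunder is possible. */
typedef double (*correct_prob_fn)(int k, int n);

/* "50% flat": every move is correct with probability 1/2. */
double flat_half(int k, int n) { (void)k; (void)n; return 0.5; }

/* "0% flat": every move that can be a blunder is one. */
double always_blunder(int k, int n) { (void)k; (void)n; return 0.0; }

/* Uniform random number in [0, 1). */
static double uniform01(void) { return rand() / (RAND_MAX + 1.0); }

/* One play-out; returns 1 if the final position is still won for Black. */
static int playout(int n, correct_prob_fn correct)
{
    int black_winning = 1;              /* the start is a win for Black */
    for (int k = 0; k < n; k++) {
        int black_to_move = (k % 2 == 0);
        /* Only the side that currently holds the win can blunder. */
        if (black_to_move == black_winning &&
            uniform01() >= correct(k, n))
            black_winning = !black_winning;
    }
    return black_winning;
}

/* Fraction of play-outs that end with the correct (Black wins) value. */
double estimate_correct(int n, long sims, correct_prob_fn correct)
{
    assert(n >= 0 && sims > 0);
    long hits = 0;
    for (long s = 0; s < sims; s++)
        hits += playout(n, correct);
    return (double)hits / (double)sims;
}
```

After seeding with srand(), a call such as estimate_correct(100, 1000000, flat_half) should land near the value reported for the 50% flat row; other schedules can be plugged in as further correct_prob_fn functions.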
> Here are some results (explanations below):
>
> Game length 100, simulations 1000000
>
> 0% flat                     | 100.00%
> 1% flat                     |  99.01%
> 2% flat                     |  98.04%
> 5% flat                     |  95.27%
> 10% flat                    |  90.97%
> 20% flat                    |  83.28%
> 50% flat                    |  66.74%
> 80% flat                    |  55.59%
> 90% flat                    |  52.65%
> 95% flat                    |  51.58%
> 98% flat                    |  57.02%
> 99% flat                    |  68.53%
> 99.5% flat                  |  80.37%
> 99.8% flat                  |  90.97%
> 99.9% flat                  |  95.21%
> 100% flat                   | 100.00%
> Linear ramp up              |  50.17%
> Linear ramp down            |  99.03%
> Squared ramp up             |  50.17%
> Squared ramp down           |  99.99%
> Squared ramp up, inverted   |  98.09%
> Squared ramp down, inverted |  49.97%
> Spike                       |   0.00%
> Spike with 10%/10% noise    |  52.30%
> Spike with 10%/0% noise     |   9.95%
> Spike with 0%/10% noise     |  52.34%
>
> Each row represents one million play-outs. The left column is the
> probability function (how probable it is to blunder) and the right column is
> the probability that we get the "correct" result at the end of a play-out.
> Here are the descriptions of the functions:
>
> - N% flat means that the move is correct with probability N% and a blunder
> with probability (1-N%), when possible (you can't blunder if you are in a
> lost position)
> - Linear ramp up means that the probability is 100% * (k/N) where k is the
> move number, i.e. moves tend to get better and better by the end of the game
> - Linear ramp down is 100% * (1-k/N), i.e. inverted
> - Squared ramp up is 100% * (k/N)^2
> - Squared ramp down is 100% * (1 - (k/N)^2)
> - Squared ramp up and down inverted are obtained by 100% - X where X is the
> squared ramp
> - Spike means that Black makes one blunder in the middle but all other
> moves are correct
> - Spike 10%/10% noise is 10% correct move in the middle move and 90%
> elsewhere
> - Spike 10%/0% noise is 10% correct move in the middle and 100% elsewhere
> - Spike 0%/10% noise is 0% correct move in the middle and 90% elsewhere
>
> And here is some analysis:
>
> - Obviously a move generator that always blunders with probability 1/2 when
> possible is a great basis for MC analysis because it ends up with the correct
> game-theoretic value with 67% probability
>
> - It is also obvious that of the ones sampled above, the worst probability
> patterns are rising ramps, i.e. playout agents that play badly in the
> beginning but get better and better towards the end of the game. For these
> agents the end result is basically just random noise. The reason is, I
> believe, that first both players blunder all the time and the game-theoretic
> value remains always won for Black (two blunders --> Black still winning),
> but when the blunder probability starts to drop, first the result becomes
> more or less random, and then the dropping probability "locks" the
> game-theoretic value to the random value.
>
> Finally, to those who question these numbers, here is an intuitive
> explanation of the mechanics behind them:
>
> Suppose you play correctly with probability 50% and you start with Black's
> move from a position that is a win for Black.
>
> With probability 50% you play correctly, White answers whatever, but you
> still have a won position (White cannot turn a lost position into a won one
> by playing a move.)
>
> With probability 50% you play incorrectly, and the position is now won for
> White.
> But White also blunders now with probability 50%, so you get another
> 25% probability to have a won position after the two plies.
>
> So even though the playout agent has only a 50% probability of playing
> correctly, the probability that after 2 plies the position is still won is
> 75%!
>
> All the best,
>
> --
> Antti Huima
>
> _______________________________________________
> computer-go mailing list
> computer-go@computer-go.org
> http://www.computer-go.org/mailman/listinfo/computer-go/
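The two-ply argument in the quoted message extends to any game length by updating, one ply at a time, the probability that the position is won for Black. The following is a minimal sketch under one reading of the model (a flat probability c of a correct move, Black to move first); exact_correct is a name invented here, not something taken from the original program.

```c
#include <assert.h>

/* Exact probability that the position is still won for Black after
 * n plies, when every move that could be a blunder is correct with
 * probability c. Assumes Black moves first from a won position. */
double exact_correct(double c, int n)
{
    assert(n >= 0 && c >= 0.0 && c <= 1.0);
    double p = 1.0;                 /* P(position is won for Black) */
    for (int k = 0; k < n; k++) {
        if (k % 2 == 0) {
            /* Black to move: a won position stays won only if Black
             * plays correctly; a lost position stays lost. */
            p = p * c;
        } else {
            /* White to move: Black keeps a won position for free and
             * regains a lost one when White blunders. */
            p = p + (1.0 - p) * (1.0 - c);
        }
    }
    return p;
}
```

With c = 0.5 and n = 2 this gives 0.75, the figure from the two-ply argument. With c = 0 it gives 1 for n = 100 and 0 for n = 101, which is the parity effect in its purest form; for c = 0.5 the even-length value tends to 2/3 and the odd-length value to 1/3.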