Eric,

Yes, as Magnus also stated, MC play-outs don't accurately
estimate the "real" winning probability, but they still get the move order
right most of the time.

The situation is that if the position is really a win, it doesn't mean
that MC is able to find the proof tree. But it does mean that it's
easier to find wins than losses, and so the score, as expressed by a
winning percentage, goes up, but not to 1.0.

I have also found that if Lazarus says 60%, it is going to win much
more than 60% against equal opposition. Of course it's no surprise if
it beats weaker opposition from this point.

I thought of mapping these percentages to actual percentages by playing
a few thousand self-play games, but this is pretty much futile. If
the score is, for instance, 65%, it will tend to grow higher and higher
depending on how long I let it think. Unless, that is, it discovers a
clever defense, in which case the score may start declining at a
deeper level.
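
A minimal sketch of what such a mapping would look like, assuming you had
logged (reported score, game result) pairs from self-play. The function
name, the ten-bucket split and the sample data are made up for
illustration; this is not how Lazarus works, and the resulting table
would only hold for the thinking time used to collect it.

```python
from collections import defaultdict

def calibration_table(samples, buckets=10):
    """Group (reported_win_rate, won) pairs into equal-width buckets.

    Returns {bucket_lower_bound: (games, actual_win_rate)}.
    """
    tally = defaultdict(lambda: [0, 0])  # bucket index -> [games, wins]
    for reported, won in samples:
        # A reported score of exactly 1.0 goes into the top bucket.
        idx = min(int(reported * buckets), buckets - 1)
        tally[idx][0] += 1
        tally[idx][1] += 1 if won else 0
    return {idx / buckets: (games, wins / games)
            for idx, (games, wins) in sorted(tally.items())}

# Made-up data: the score the program reported and whether it went on to win.
samples = [(0.62, True), (0.65, True), (0.68, False), (0.91, True)]
table = calibration_table(samples)
print(table)
```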

So these numbers are really meaningless as absolute figures and have
everything to do with the current context of the search.

- Don




Eric Boesch wrote:
> On 12/11/07, Mark Boon <[EMAIL PROTECTED]> wrote:
>   
>> Question: how do MC programs perform with a long ladder on the board?
>>
>> My understanding of MC is limited but thinking about it, a crucial
>> long ladder would automatically make the chances of any playout
>> winning 50-50, regardless of the actual outcome of the ladder.
>>     
>
> No, 50/50 would make too much sense. It might be higher or it might be
> lower, depending on whose move it is in the ladder and what style of
> playouts you use, but exactly 50/50 would be like flipping a coin and
> having it land on its edge. In test cases, MC-UCT evaluations tend to
> cluster near 50/50 in any case, because MC-UCT, especially dumb
> uniform MC-UCT, tends to be conservative about predicting the winner,
> especially in 19x19 where the opportunities for luck to overwhelm the
> actual advantage on the board are greater. But if you accept this as
> just a moot scaling issue -- that a clearly lopsided exchange can mean
> just a 2% increment in winning percentage even if read more or less
> correctly -- then the numbers may not look so even after all. It's
> certainly possible for MC-UCT to climb a broken ladder in a winning
> position (and climbing a broken ladder in an even position is at least
> half as bad as that anyhow).
>
> I tried testing this on 19x19 using libego at 1 million playouts per
> move. The behavior was not consistent, but the numbers trended in the
> defender's favor as the sides played out the ladder. In one bizarre
> case, the attacker played out the ladder until there were just 17
> plies left, and then backed off.
>
> Why would the attacker give up a winning ladder? It appears the MC-UCT
> was never actually reading the ladder to begin with; just four or five
> plies in, sometimes just a few thousand simulations were still
> following the key line. 1 million playouts were not nearly enough for
> that in this case; maybe 100 million would be enough, but I couldn't
> test that. Also, after enough simulations, decisively inferior moves
> lead to fewer losses than slightly inferior ones. Suppose you have
> three moves available: one wins 75% of the time, one 50%, and one 25%.
> In the long run, the 75% move will be simulated almost all the time,
> but the middle move will be simulated roughly four times as often as
> the 25% one that, compared to the best move available, is twice as
> bad, and four times the simulations with half the loss per simulation
> adds up to twice the excess losses compared to the 25% move. That is
> apropos here, because giving up on an open-field ladder once it has
> been played out for a dozen moves is much more painful for the
> defender than for the attacker. The longer the ladder got, the more
> the evaluations trended in the defender's favor, and my best
> explanation would be the fact that -- until you actually read the
> ladder all the way out and find that the defender is dead -- every
> move except pulling out of atari is so obviously bad that even uniform
> MC-UCT did a better job of focusing on that one good move.
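
The three-move arithmetic above can be checked with a small sketch.
Assumptions, none of them taken from libego: plain UCB1 (the bandit rule
underlying UCT) with exploration constant sqrt(2), a single node with
three moves, and noise-free rewards, i.e. each visit returns the move's
win rate exactly instead of a random win or loss. The function name is
hypothetical.

```python
import math

def ucb1_pull_counts(win_rates, total_pulls, c=math.sqrt(2)):
    """Visit counts per move under UCB1 with noise-free rewards.

    Each visit to move i is assumed to return exactly win_rates[i], so the
    running average of a move always equals its true win rate.
    """
    k = len(win_rates)
    counts = [1] * k  # visit every move once to initialise
    for t in range(k + 1, total_pulls + 1):
        log_t = math.log(t)
        # Pick the move with the highest upper confidence bound.
        best = max(range(k),
                   key=lambda i: win_rates[i] + c * math.sqrt(log_t / counts[i]))
        counts[best] += 1
    return counts

win_rates = [0.75, 0.50, 0.25]
counts = ucb1_pull_counts(win_rates, 200_000)

# Excess losses: visits times the gap to the best move's win rate.
top = max(win_rates)
excess = [n * (top - p) for n, p in zip(counts, win_rates)]

print("visits:       ", counts)
print("excess losses:", excess)
```

Under these assumptions the 50% move ends up with roughly four times the
visits of the 25% move, and so roughly twice its total excess losses,
which agrees with the estimate above. Real playouts are noisy, so actual
counts will scatter around these values.
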
>
> (Incidentally, the conservative nature of MC-UCT ratings largely
> explains why maximizing winning probabilities alone is not a bad
> strategy, at least in even games. The classic beginner mistake, when
> you already have a clear lead in theory, is to fail to fight hard to
> grab still more points as blunder insurance. But an MC-UCT evaluation
> of 90% typically means a >90% probability of actually winning against
> even opposition, not just a 90% likelihood of a theoretical win.
> Assigning a 65% evaluation to an obvious THEORETICAL win allows plenty
> of room to assign higher evaluations to even more lopsided advantages.
> As Don said, when MC-UCT starts blatantly throwing away points for no
> obvious reason, it's almost certainly because the game is REALLY over,
> because MC-UCT's errors tend to be probabilistic instead of absolute
> -- it may in effect evaluate a dead group as 75% alive, but it won't
> call it 100% alive except in the rare cases when the underlying random
> playout rules forbid the correct line of play.)
> _______________________________________________
> computer-go mailing list
> computer-go@computer-go.org
> http://www.computer-go.org/mailman/listinfo/computer-go/
>
>   
