Quoting Christoph Birk <[EMAIL PROTECTED]>:
On Tue, 4 Mar 2008, Magnus Persson wrote:
But here you are missing the point that close to 0% winning
probability means that it cannot win against random play. The
opponent could lose only by killing his own groups.
I don't know why you (and Don) keep bringing up the 0% against random
play ...
Maybe we should all stop discussing hypothetical positions that have
properties that fit our own arguments.
I mean, if there is a typical situation, then please post it so we can
see what different programs do in that situation and why.
I am talking about a (typical) situation in the endgame
where best play (as seen from the program) leads to a sure 0.5 pt loss.
Many MC programs will make unreasonable attempts at winning by choosing
a line that shows a possible win (10 pt) if the opponent makes a
(stupid) mistake. Instead they should go for the (supposedly sure)
0.5 pt loss, because the opponent is much more likely to make
the 1 pt mistake than the 10 pt mistake.
I do not see why MC programs in general would be biased towards
winning by 10 points rather than through a single 1-point mistake.
As we have repeatedly discussed here all strong programs go for
winrate and ignore the size of the win.
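To make the distinction concrete, here is a minimal Python sketch. The
playout scores are made up and the names (playout_scores, win_rate,
best_move) are hypothetical; it is not taken from any actual program. It
only shows that ranking moves by win rate and ranking them by average
score can disagree.

```python
from statistics import mean

# Hypothetical final scores (in points, from the program's point of view)
# of a handful of playouts for two candidate moves. Made-up numbers.
playout_scores = {
    "safe": [0.5, 0.5, 0.5, -1.5],
    "greedy": [10.5, 10.5, -5.5, -5.5],
}

def win_rate(scores):
    """Fraction of playouts won; the size of the win is ignored."""
    return sum(s > 0 for s in scores) / len(scores)

def best_move(scores_by_move, evaluate):
    """Return the move whose playouts maximize the given evaluation."""
    return max(scores_by_move, key=lambda m: evaluate(scores_by_move[m]))

best_by_winrate = best_move(playout_scores, win_rate)
best_by_mean = best_move(playout_scores, mean)
print(best_by_winrate, best_by_mean)  # prints "safe greedy"
```

With these numbers the win-rate criterion prefers the move that wins
three playouts out of four by half a point, while the average-score
criterion prefers the move that wins big only half of the time.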
What is happening here is that when an MC program knows that a simple
endgame is lost, it will play a sequence that makes the game as
long and complicated as possible. I believe this is a perfectly
reasonable strategy. If this is wrong, someone needs to provide a
solution and show that it really makes a difference against, for
example, gnugo, which makes humanlike endgame mistakes. Testing against
humans is too noisy unless there is an astronomical improvement in
playing strength.
The problem is that the likelihood of your opponent making a mistake
is hard to determine by the UCT (MC) playouts. I guess one needs
to use the meta-information that it is more likely to make a small
mistake than to make a big one.
Random playouts make small and big endgame mistakes on almost
every move played. The likelihood is measured all the time, and is the
reason UCT (MC) is successful.
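A toy sketch of this point, in Python. The model is an assumption made
purely for illustration: each playout move gives away a random number of
points drawn from a hypothetical list of mistake sizes, and the names
playout_margin and estimated_winrate are invented here. It is not how any
real playout policy works, but it shows how the fraction of won playouts
reflects how likely a deficit of a given size is to be overturned.

```python
import random

def playout_margin(deficit, moves_left, rng):
    """Final margin for our side after one random playout (toy model)."""
    margin = -deficit
    for i in range(moves_left):
        # Hypothetical mistake sizes: most are small, a few are large.
        loss = rng.choice([0, 0, 1, 1, 2, 10])
        # Even-numbered moves are ours, odd-numbered ones the opponent's.
        margin += -loss if i % 2 == 0 else loss
    return margin

def estimated_winrate(deficit, moves_left=10, playouts=20000, seed=0):
    """Fraction of random playouts that end with a positive margin."""
    rng = random.Random(seed)
    wins = sum(
        playout_margin(deficit, moves_left, rng) > 0 for _ in range(playouts)
    )
    return wins / playouts

small = estimated_winrate(0.5)   # position behind by half a point
large = estimated_winrate(10.5)  # position behind by ten and a half points
print(small > large)
```

In this model the position that is half a point behind gets the higher
estimated win rate, because many more random playouts contain a net swing
of one point than a net swing of eleven.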
The argument I do not like here is, in short, something like this:
1) UCT (MC) programs are so strong that they freak out when they are
behind in a game.
2) Solution: Make them believe they can win by playing losing moves.
I have been thinking like this. I have tried it and it failed. So did
Don, and this is why we are a little stubborn in arguing that it is not
possible to improve playing strength this way.
It is much better to make it even stronger so it takes the lead in
more games and refuses to lose those games. It will still freak out, but
in fewer games and later in the game.
But as always I am willing to admit that I am wrong. I am happy to see
1) real positions to discuss, 2) solutions that are backed up with 3)
solid empirical data. (That I can easily incorporate in my own
program... ;-) )
-Magnus
_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/