Quoting Christoph Birk <[EMAIL PROTECTED]>:

On Tue, 4 Mar 2008, Magnus Persson wrote:
But here you are missing the point that close to 0% winning probability means that it cannot win against random play. The opponent could lose only by killing his own groups.

I don't know why you (and Don) keep bringing up the 0% against random
play ...

Maybe we should all stop discussing hypothetical positions that have properties that fit our own arguments.

I mean, if there is a typical situation, then please post it so we can see what different programs do in that situation and why.

I am talking about a (typical) situation in the endgame
where best play (as seen from the program) leads to a sure 0.5 pt loss.
Many MC programs will make unreasonable attempts at winning by choosing
a line that shows a possible win (10 pt) if the opponent makes a
(stupid) mistake. Instead they should go for the (supposedly sure)
0.5 pt loss, because the opponent will much more likely make
the 1pt mistake, and not the 10 pt mistake.
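
To make the quoted argument concrete, here is a minimal sketch in Python. The two probabilities are hypothetical numbers chosen only for illustration (nothing in this thread measures them), and `win_probability` is a made-up helper, not part of any Go program.

```python
# Hypothetical numbers for illustration only; they are NOT measured data.
P_SMALL_MISTAKE = 0.10  # assumed chance the opponent gives away at least 1 point
P_BIG_MISTAKE = 0.01    # assumed chance the opponent gives away at least 10 points


def win_probability(points_needed, p_small, p_big):
    """Toy model: chance of winning a line that needs an opponent
    mistake worth at least `points_needed` points."""
    # A deficit of up to 1 point is covered by a small mistake;
    # anything larger needs a big mistake.
    return p_small if points_needed <= 1.0 else p_big


# Line A: accept the 0.5 pt loss and hope for a 1 pt mistake.
safe_line = win_probability(0.5, P_SMALL_MISTAKE, P_BIG_MISTAKE)
# Line B: play the line that only wins after a 10 pt mistake.
desperate_line = win_probability(10.0, P_SMALL_MISTAKE, P_BIG_MISTAKE)

print(f"safe line:      {safe_line:.2f}")
print(f"desperate line: {desperate_line:.2f}")
```

Under these assumed numbers the safe line is the better winning attempt. Whether real opponents behave this way is exactly what is being disputed.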

I do not see why MC programs in general would be biased towards winning by 10 points rather than through a single 1-point mistake.

As we have repeatedly discussed here, all strong programs go for win rate and ignore the size of the win.
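
A minimal sketch of that difference, in Python. The playout margins below are invented for illustration, and `win_rate` and `mean_margin` are hypothetical helper names, not taken from any actual program.

```python
def win_rate(margins):
    """Fraction of playouts won; the size of each win is ignored."""
    return sum(1 for m in margins if m > 0) / len(margins)


def mean_margin(margins):
    """Average final score margin over all playouts."""
    return sum(margins) / len(margins)


# Hypothetical playout results: final score margin, seen from our side,
# for two candidate moves. Positive means we won that playout.
playouts = {
    "solid": [0.5, 0.5, 1.5, -0.5, 0.5, 0.5, -0.5, 0.5, 1.5, 0.5],
    "greedy": [10.5, -5.5, 10.5, -5.5, -5.5, 10.5, -5.5, 10.5, -5.5, 10.5],
}

best_by_win_rate = max(playouts, key=lambda mv: win_rate(playouts[mv]))
best_by_margin = max(playouts, key=lambda mv: mean_margin(playouts[mv]))

print("picked by win rate:   ", best_by_win_rate)
print("picked by mean margin:", best_by_margin)
```

With these made-up numbers the win-rate criterion picks the solid move (8 wins out of 10, all small), while the mean-margin criterion picks the greedy move (5 wins out of 10, all large).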

What is happening here is that when an MC program knows that a simple endgame is lost, it will play a sequence that makes the game as long and complicated as possible. I believe this is a perfectly reasonable strategy. If this is wrong, someone needs to provide a solution and show that it really makes a difference against, for example, gnugo, which makes humanlike endgame mistakes. Testing against humans is too noisy unless there is an astronomical improvement in playing strength.



The problem is that the likelihood of your opponent making a mistake
is hard to determine by the UCT (MC) playouts. I guess one needs
to use the meta information that it is more likely to make a small
mistake than to make a big one.

Random playouts make small and big endgame mistakes on almost every move played. The likelihood is measured all the time, and this is the reason UCT (MC) is successful.

The argument I do not like here is, in short, something like this:

1) A UCT (MC) program is so strong that it freaks out when it is behind in a game.
2) Solution: Make it believe it can win by playing losing moves

I have been thinking like this. I have tried it and it failed. So did Don, and this is why we are a little stubborn in arguing that it is not possible to improve playing strength this way.

It is much better to make it even stronger so that it takes the lead in more games and refuses to lose those games. It will still freak out, but in fewer games and later in the game.


But as always I am willing to admit that I am wrong. I am happy to see
1) real positions to discuss, 2) solutions that are backed up with 3) solid empirical data. (That I can easily incorporate in my own program... ;-) )

-Magnus


_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/