Re: [Computer-go] replacing dynamic komi with a scoring function

Vlad Dumitrescu Mon, 09 Jan 2012 07:27:27 -0800

Hi,

On Mon, Jan 9, 2012 at 13:17, Don Dailey <dailey....@gmail.com> wrote:
> Summary:
>
> I believe a more correct scoring function won't be based on how much you win
> by OR how often you win but will incorporate some other more relevant
> concept and it will be dynamic.    And it will not matter if the game is
> a handicap game or otherwise because the scoring function will always be
> relevant.   The goal will be to maximize your winning chances but it
> will incorporate something more sophisticated than just counting how often
> you win or how much you win by.


I hope I may interject with something that Don's nice description
made clear to me. It feels rather obvious, but since nobody has stated
it explicitly, maybe it's news to at least some people here.

MCTS maximizes the chances of winning. These chances are largest
for a minimal score difference, because aiming for a narrow win leaves
room for some errors. Winning by the largest possible score is rather
unlikely, because every move has to be perfect.

The curve describing the probability of ending the game with a certain
score is bell-shaped, and MCTS explores the area beneath it, looking
for winning moves. With handicap, the disadvantaged side gets fewer
samples explored, making it less likely to discover the really good
moves. Dynamic komi shifts the bell left or right in order to equalize
the sampling on both sides, but, as mentioned, it isn't dynamic enough
(the curve changes after each move), and it also assumes a different
shape for the curve than the real "handicap curve".
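
To make the picture concrete, here is a rough Python sketch. It assumes
(my simplification, not something measured) that the final score margin
is normally distributed; the names win_fraction, mean_margin and
std_margin are made up for the illustration.

```python
import math

def win_fraction(mean_margin, std_margin, komi=0.0):
    """Fraction of playouts won by a side whose final score margin is
    assumed to follow Normal(mean_margin, std_margin).
    A playout counts as a win when margin > komi."""
    z = (mean_margin - komi) / std_margin
    # Standard normal CDF evaluated at z, written with math.erf.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Even game: the bell is centred on zero, half the samples are wins.
even = win_fraction(0.0, 10.0)

# Handicap: the bell is centred 20 points on the losing side,
# so only about 0.023 of the samples are wins.
behind = win_fraction(-20.0, 10.0)

# Dynamic komi: shifting the win threshold onto the centre of the bell
# restores the 50/50 split, under this assumed shape.
shifted = win_fraction(-20.0, 10.0, komi=-20.0)
```

The shift only equalizes the sampling if the assumed bell matches the
real one, which is exactly the weak point described above.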

In theory, I think that the solution for keeping the same level of
play with handicap as without would be to make sure that the
disadvantaged side gets just as many samples with handicap as without.
That is, use more playouts when playing with handicap. In practice,
this is probably prohibitive...
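
A back-of-the-envelope sketch in Python of why this gets expensive. I
read "samples" here as playouts that the disadvantaged side wins; the
function name and parameters are hypothetical, and the win fractions
are inputs you would have to estimate somehow.

```python
import math

def playouts_for_equal_samples(base_playouts, p_even, p_handicap):
    """Playouts needed under handicap so that the disadvantaged side
    sees as many winning samples as in an even game.

    base_playouts: playouts used in an even game
    p_even:        fraction of playouts that side wins in an even game
    p_handicap:    fraction of playouts that side wins under handicap
    """
    if not 0.0 < p_handicap <= 1.0:
        raise ValueError("p_handicap must be in (0, 1]")
    # Expected winning samples in the even game: base_playouts * p_even.
    # Solve n * p_handicap = base_playouts * p_even for n.
    return math.ceil(base_playouts * p_even / p_handicap)

# If the win fraction drops from 0.5 to 0.25, the budget doubles.
doubled = playouts_for_equal_samples(10000, 0.5, 0.25)

# If it drops to about 2%, the budget grows more than twentyfold.
large = playouts_for_equal_samples(10000, 0.5, 0.02)
```

The required budget scales with 1 / p_handicap, so it blows up quickly
as the handicap grows.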

I wonder if it might be possible to estimate the shape of this curve
after each move and use that estimate to dynamically adjust the number
of playouts. One might have to use higher precision calculations, too,
so that the noise doesn't get too loud.
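
To show what I mean, a minimal Python sketch, again assuming a normal
curve fitted to the score margins of the playouts seen so far. The
function adjusted_playouts, its parameters and the cap max_playouts are
all hypothetical, and it does nothing about the precision issue.

```python
import math
from statistics import NormalDist

def adjusted_playouts(margins, base_playouts, max_playouts,
                      target_fraction=0.5):
    """Pick a playout budget from observed final score margins
    (positive = the side to move wins). Needs at least two margins."""
    # Estimate the curve: fit mean and standard deviation.
    dist = NormalDist.from_samples(margins)
    if dist.stdev == 0:
        # Degenerate case: every playout ended with the same margin.
        win_fraction = 1.0 if dist.mean > 0 else 0.0
    else:
        # Area of the fitted bell on the winning side of zero.
        win_fraction = 1.0 - dist.cdf(0.0)
    if win_fraction >= target_fraction:
        return base_playouts      # enough winning samples already
    if win_fraction == 0.0:
        return max_playouts       # no wins seen at all: use the cap
    needed = math.ceil(base_playouts * target_fraction / win_fraction)
    return min(needed, max_playouts)

# Side to move is clearly behind: the budget hits the cap.
behind = adjusted_playouts([-25, -15, -20, -10, -30], 10000, 100000)

# Side to move is ahead: the base budget is kept.
ahead = adjusted_playouts([5, 15, 10, 20], 10000, 100000)
```

One would call this after each move with the margins from the latest
search, since the curve changes from move to move.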

Does this make any sense? Has anyone tried something like this?

best regards,
Vlad
_______________________________________________
Computer-go mailing list
Computer-go@dvandva.org
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
