Re: [computer-go] AMAF Scalability study + Responses to previous

Claus Reinke Thu, 09 Oct 2008 09:34:17 -0700

>>What about that claim that "the program could figure out the rule by itself"?
>I made experiments with no-go-knowledge. (no eye knowledge).


Any random player that accurately avoids suicide seems to need an awful lot
of Go knowledge implicitly:

- liberties (or at least pseudo-liberties, since we only care about the
  step to zero)
- (immediately) connected strings of stones of the same colour
- strings with zero liberties are dead, and are removed
- first remove dead opponent strings, then check for suicide, which is ruled out
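To make the list above concrete, here is a minimal sketch in Python of a move function that refuses suicide. The board representation (a dict from points to 'b'/'w') and the names neighbours, string_and_liberties and play are my own choices for illustration, not taken from any particular bot, and ko is ignored entirely.

```python
from typing import Dict, Iterator, Optional, Set, Tuple

Point = Tuple[int, int]
Board = Dict[Point, str]  # occupied points only, mapped to 'b' or 'w'


def neighbours(p: Point, size: int) -> Iterator[Point]:
    """Yield the on-board points directly adjacent to p."""
    x, y = p
    for q in ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)):
        if 0 <= q[0] < size and 0 <= q[1] < size:
            yield q


def string_and_liberties(board: Board, p: Point, size: int) -> Tuple[Set[Point], Set[Point]]:
    """Flood-fill the connected string containing p and collect its liberties."""
    colour = board[p]
    string, libs, todo = {p}, set(), [p]
    while todo:
        for q in neighbours(todo.pop(), size):
            if q not in board:
                libs.add(q)
            elif board[q] == colour and q not in string:
                string.add(q)
                todo.append(q)
    return string, libs


def play(board: Board, p: Point, colour: str, size: int) -> Optional[Board]:
    """Return the board after the move, or None if p is occupied or suicide."""
    if p in board:
        return None
    new = dict(board)
    new[p] = colour
    opponent = 'w' if colour == 'b' else 'b'
    # First remove adjacent opponent strings that now have zero liberties.
    for q in neighbours(p, size):
        if new.get(q) == opponent:
            string, libs = string_and_liberties(new, q, size)
            if not libs:
                for s in string:
                    del new[s]
    # Only then check the played stone's own string: no liberties means suicide.
    _, libs = string_and_liberties(new, p, size)
    return new if libs else None
```

Even this small function already encodes strings, liberties, capture and the capture-before-suicide ordering, which is the point: a "knowledge-free" random player built on it cannot play into a true opponent eye.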

In particular, moves that can't be played because they would be suicide
are in opponent eye positions, so the bot "knows" about these in some form,
and that knowledge influences possible moves, even if the bot might not be
aware of that knowledge, nor of its own eyes;-)

I've sometimes wondered whether that biases light playouts for white.
Since light playouts do no planning, black has little advantage from playing
first, and white does get slightly better use out of the eye knowledge implicit
in the no-suicide rule:

For any pair of consecutive black-then-white moves, if the white move
forms an eye, the black move may well be caught in it, whereas if the
black move forms an eye, the white move will never be caught in it. For
an equal number of moves, white strings seem safer than black ones.

>I think what I did, was just to play a random number of moves, and scoring like
>if the game had terminated. Using that I remember that I could get a bot that
>was able not to fill its own eyes :) That's about as strong as it got though.

So you interpreted "passing" as not continuing the game, introducing an
arbitrary termination criterion, and found that games without eye-filling
resulted in better scores. That makes sense.

But it raises another question, about evaluating a playout with an arbitrary
length limit: doesn't the value fluctuate wildly with the length?

Consider a playout that stops just before black fills one of its two eyes
vs the same playout but stopped just after black filled one and white filled
the other.

And if the dead string was big enough (random moves cluster together
without any notion of bad shape), the same game can happen inside the
newly cleared area if enough further moves are played.

Or consider a playout where black randomly throws stones into a white
area (surrounded by white stones, but without explicit two-eye separation):
the black score for that is going down with every black move, until that
final move that kills the surrounding white string, after which the black
score suddenly goes up because those black invaders are no longer dead.

One could perhaps record upper and lower bounds on such fluctuating
evaluations, and stop the playout if its evaluation converges on a narrow
enough range? And one would like to avoid wins that rely on the
evaluation going downhill for a long time before suddenly going up.
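
A rough sketch of the bounds idea in Python, under assumptions of my own: the external scoring function is taken to produce one score estimate per playout move, the bounds are kept over a sliding window of recent moves rather than the whole playout, and the names run_until_stable, window and tolerance are hypothetical.

```python
from collections import deque
from typing import Iterable, Optional, Tuple


def run_until_stable(scores: Iterable[float],
                     window: int = 10,
                     tolerance: float = 2.0) -> Optional[Tuple[int, float, float]]:
    """Consume per-move score estimates until they settle.

    Returns (moves_consumed, lower_bound, upper_bound) as soon as the last
    `window` estimates all lie within `tolerance` of each other, or None if
    the estimates never settle before the input runs out.
    """
    recent = deque(maxlen=window)  # only the last `window` estimates are kept
    for n, score in enumerate(scores, start=1):
        recent.append(score)
        if len(recent) == window:
            lower, upper = min(recent), max(recent)
            if upper - lower <= tolerance:
                return n, lower, upper
    return None
```

For example, run_until_stable([0, 10, -5, 3, 3, 4, 3, 3], window=3, tolerance=1) stops after the sixth estimate with bounds 3 and 4. A window that is too short would still be fooled by the invasion example above, where the score drifts slowly before jumping, so this only narrows the problem rather than solving it.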

All of which of course assumes an external scoring function that doesn't
need the playouts to result in simple positions for scoring to work. But
isn't such a standalone scoring function exactly what the random playouts
are meant to replace?

Claus



_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
