>
>
> But is it shown that "the score is well done" for these properties to
> hold in case of RAVE-guided exploration? Since it massively perpetuates
> any kind of MC bias...
>

This only matters because we do not visit the whole tree. For consistency
(the fact that, asymptotically, we find the best possible decision), there
is no problem. If "score ~ success rate" as n -> infinity (which holds for
most usual rules, including RAVE rules), then for binary games we also have
some good properties on the part of the tree which is visited.
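
To make the "score ~ success rate as n -> infinity" condition concrete, here
is a minimal sketch of a RAVE-style score whose RAVE weight vanishes as the
number of simulations n grows. The schedule sqrt(k / (3n + k)), the constant
k and the function names are assumptions chosen for illustration, not
necessarily the rule used in our code.

```python
import math


def rave_beta(n, k=1000.0):
    """Weight given to the RAVE estimate after n simulations.

    k is a hypothetical tuning constant; the weight tends to 0 as n grows.
    """
    return math.sqrt(k / (3.0 * n + k))


def blended_score(wins, n, rave_wins, rave_n, k=1000.0):
    """Mix the plain success rate with the (possibly biased) RAVE rate."""
    success_rate = wins / n if n > 0 else 0.5
    rave_rate = rave_wins / rave_n if rave_n > 0 else 0.5
    beta = rave_beta(n, k)
    return (1.0 - beta) * success_rate + beta * rave_rate
```

With few simulations the score follows the RAVE estimate, bias included;
with many simulations it is dominated by the plain success rate, which is
the property needed for the consistency argument.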

Please note that I do not claim that major improvements are possible in
computer-go thanks to this. We only observed a very small improvement in
success rates, and good behavior in the situation which appeared during the
game against Fan Hui. It might be interesting for people who have similar
problems in their bot (a situation in which, even with huge computation
time, the good estimate is not found) to know whether they can solve it
with this.

>
> > We use a stupid method, i.e. the success rate. The patterns are bigger
> > than 3x3, with jokers in them. Bandits (Bernstein races, slightly
> > modified) are used for distributing the computational effort among the
> > tested patterns.
>
> Thank you for pointing me to more study material. :-)
>
>
The following paper is great for Bernstein races:

http://icml2008.cs.helsinki.fi/papers/523.pdf

Please note, however, that we had only very small improvements with races.
Maybe our code has gone through too many tuning steps in the past to be
strongly improved by random generation of patterns and Bernstein races for
validating them.
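
For readers who have not seen races before, here is a rough sketch of the
idea: every surviving candidate gets one more evaluation per round, and a
candidate is dropped as soon as its upper confidence bound falls below the
best lower bound. It uses the standard empirical Bernstein radius for
rewards in [0, 1]. It is not our modified version, and it assumes a fixed
delta per test instead of the union bound over rounds and candidates that a
proper race needs. The names bernstein_radius, bernstein_race and
max_rounds are made up for this example.

```python
import math


def bernstein_radius(var, t, delta, value_range=1.0):
    """Empirical Bernstein confidence radius after t samples."""
    log_term = math.log(3.0 / delta)
    return math.sqrt(2.0 * var * log_term / t) + 3.0 * value_range * log_term / t


def bernstein_race(candidates, delta=0.05, max_rounds=2000):
    """Race candidates against each other and return the survivors.

    candidates maps a name to a zero-argument callable returning a reward
    in [0, 1] (for instance 1.0 for a won game, 0.0 for a lost one).
    """
    # Per candidate: [sample count, sum of rewards, sum of squared rewards].
    stats = {name: [0, 0.0, 0.0] for name in candidates}
    alive = set(candidates)
    for _ in range(max_rounds):
        if len(alive) <= 1:
            break
        bounds = {}
        for name in alive:
            x = candidates[name]()
            s = stats[name]
            s[0] += 1
            s[1] += x
            s[2] += x * x
            mean = s[1] / s[0]
            var = max(s[2] / s[0] - mean * mean, 0.0)
            radius = bernstein_radius(var, s[0], delta)
            bounds[name] = (mean - radius, mean + radius)
        # Drop every candidate that is confidently worse than the leader.
        best_lower = max(lower for lower, _ in bounds.values())
        alive = {name for name in alive if bounds[name][1] >= best_lower}
    return alive
```

The variance term is what makes this attractive for validating patterns:
candidates with low-variance results are accepted or rejected after far
fewer games than a Hoeffding-style bound would require.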

Best regards,
Olivier
_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
