Re: [computer-go] How to "properly" implement RAVE?

Sylvain Gelly Tue, 10 Feb 2009 01:01:19 -0800

On Sun, Feb 8, 2009 at 1:42 PM, Petr Baudis <pa...@ucw.cz> wrote:

> On Sat, Jan 17, 2009 at 08:29:32PM +0100, Sylvain Gelly wrote:
> > A small point: in "PlayoutOutTree", just after "if
> > (!played.AlreadyPlayed(move)) {", there should be a
> > "played.Play(move)".
> > I believe it does not change the final result (as the check is also done
> > in the backup, and the move played in the backup), but I simply forgot
> > that line (that should make moves_played_out_tree smaller).
> >
> > To avoid confusion, I repost the pseudo code with that correction (and
> > hoping the indentation is not broken by the email editor once again).
>
> Thank you so much for this! I have switched my RAVE implementation to
> this formula and the bot has gotten noticeably stronger, though I
> apparently still have some bugs to chase, since it seems to have trouble
> considering the opponent's strongest responses and frequently focuses on
> unreasonable opponent replies instead of the obvious ones (e.g. keeping a
> group of stones in atari). Maybe I need better prior hinting...
>
> I have a few questions. Of course, please feel free to skip questions
> about particular constants if you feel that's giving away too much. :-)
>
> > ChooseMove(node, board) {
> >   bias = 0.015  // I put a random number here, to be tuned
> >   b = bias * bias / 0.25
>
> Maybe it would be cleaner to define b = 1 / rave_equiv, where rave_equiv
> is the number of playouts RAVE is thought to be equivalent to? Or is the
> meaning of this constant actually different?
>


The meaning is supposed to be the difference between the expected value of
AMAF and the expected value of the tree. But at that point it is just a
constant, so I am not sure there is a "good" interpretation.
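
As a quick check of the constants discussed here, the following minimal
Python sketch computes b from bias exactly as in the pseudocode and shows how
the weight on the tree value grows with the number of real playouts c. The
function names rave_b and tree_weight are illustrative only, not taken from
any existing program.

```python
def rave_b(bias: float) -> float:
    """b = bias^2 / 0.25, as in the quoted pseudocode."""
    return bias * bias / 0.25


def tree_weight(c: float, rc: float, b: float) -> float:
    """Weight given to the tree value w/c; the RAVE value rw/rc gets 1 - this."""
    if rc == 0:
        return 1.0  # no RAVE data at all: rely on the tree value only
    return 1.0 - rc / (rc + c + rc * c * b)


b = rave_b(0.015)
print(round(1 / b))  # 1111, matching the "b=1/1111" remark in the thread

# Tree weight for a move with many RAVE samples as real playouts accumulate.
for c in (0, 10, 100, 1000, 10000):
    print(c, round(tree_weight(c, rc=1000, b=b), 3))
```

For very large rc the RAVE weight rc / (rc + c + rc * c * b) approaches
1 / (1 + c * b), which equals 1/2 at c = 1/b. In that limit, reading 1/b as a
"RAVE equivalence" count seems reasonable, even if b is in practice just a
constant to tune.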


>
> What value works best for people? I did not do much tuning yet, but I
> use b=1/3000. I see Fuego uses b=1/5000. (This example b=1/1111.)


I believe the actual value depends on the other parts of your
implementation, namely the playouts, the prior you use in the tree and the
exploration term (there was none in the pseudo code I posted, but you might
use one).


>
>
> >   best_value = -1
> >   best_move = PASSMOVE
> >   for (move in board.allmoves) {
> >     c = node.child(move).counts
> >     w = node.child(move).wins
> >     rc = node.rave_counts[move]
> >     rw = node.rave_wins[move]
> >     coef = 1 - rc / (rc + c + rc * c * b)
> >     // please take care of the c == 0 and rc == 0 cases here
> >     value = w / c * coef + rw / rc * (1 - coef)
> >     if (value > best_value) {
> >       best_value = value
> >       best_move = move
> >     }
> >   }
> >   return best_move
> > }
>
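
The selection rule quoted above can be sketched in runnable Python as
follows. This is only one possible reading: the Child and Node classes and
PASS_MOVE are hypothetical stand-ins for whatever the tree really stores, and
the handling of the c == 0 and rc == 0 cases (in particular the 0.5 fallback
when both are zero) is an assumption, since the pseudocode leaves it open.

```python
from dataclasses import dataclass, field

PASS_MOVE = None  # hypothetical stand-in for the pseudocode's PASSMOVE


@dataclass
class Child:
    counts: float = 0.0  # real playouts through this child
    wins: float = 0.0


@dataclass
class Node:
    children: dict = field(default_factory=dict)     # move -> Child
    rave_counts: dict = field(default_factory=dict)  # move -> AMAF playouts
    rave_wins: dict = field(default_factory=dict)    # move -> AMAF wins


def choose_move(node, moves, bias=0.015):
    b = bias * bias / 0.25
    best_value, best_move = -1.0, PASS_MOVE
    for move in moves:
        child = node.children.get(move, Child())
        c, w = child.counts, child.wins
        rc = node.rave_counts.get(move, 0.0)
        rw = node.rave_wins.get(move, 0.0)
        if c == 0 and rc == 0:
            value = 0.5           # assumption: neutral value, nothing known
        elif rc == 0:
            value = w / c         # formula limit: coef == 1
        elif c == 0:
            value = rw / rc       # formula limit: coef == 0
        else:
            coef = 1.0 - rc / (rc + c + rc * c * b)
            value = w / c * coef + rw / rc * (1.0 - coef)
        if value > best_value:
            best_value, best_move = value, move
    return best_move
```

The two single-zero branches are what the formula itself gives in those
limits, so only the both-zero case needs a genuinely arbitrary choice. If
priors are written into the RAVE counters at node creation, that case should
not occur in practice.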
> I have two questions here:
>
> * Is the FPU concept abandoned? Or what values are reasonable? It seems
>  to me 1.0, which is usually recommended, is obviously too big here
>  since that's the upper bound of the value already. So far I have tried
>  0.6 and 0.7 but both just make my bot slightly weaker.



>
> * How to accommodate prior knowledge? (I'm using grand-parent heuristics,
>  atari liberties, and a few patterns.) Do you use it to fill normal
>  counts, RAVE values or both? What count values work best for you?
>  I have settled on 50 playouts.


The FPU concept is replaced here by the prior value. I put the prior into
node.rave_wins[move] and node.rave_counts[move] when the node is
initialized. If you don't want to use Go knowledge, you can initialize those
values to something like 7 and 14 respectively. Grand-parent heuristics did
not work at all in my experiments (they were even very bad). The heuristics
I was using were:
  - save a chain (positive prior)
  - self atari (negative prior)
  - match a simple 3x3 pattern (hane, cut, and a few others I don't
    remember) (positive prior).

For node.rave_counts[move], 50 seems a little high. I think I was using
values around 15, but it was not very sensitive (50 was working almost as
well as 15). I give those numbers from memory; they may be quite off.
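
To make the prior initialization concrete, here is a minimal Python sketch of
filling the RAVE counters when a node is created. Only the "7 and 14" default
and the prior count of about 15 come from the numbers above. The function
name, the hints mapping, and the positive_rate / negative_rate win rates are
placeholders invented for illustration and would need tuning. Using 7/14 for
moves that match no heuristic is also an assumption.

```python
def init_rave_priors(moves, hints=None, prior_count=15.0,
                     positive_rate=0.7, negative_rate=0.3):
    """Return (rave_wins, rave_counts) dicts for a freshly created node.

    hints maps move -> +1 for a positive prior (e.g. saves a chain, matches
    a 3x3 pattern) or -1 for a negative prior (e.g. self-atari).
    positive_rate and negative_rate are placeholder values.
    """
    hints = hints or {}
    rave_wins, rave_counts = {}, {}
    for move in moves:
        sign = hints.get(move, 0)
        if sign > 0:
            rate, count = positive_rate, prior_count
        elif sign < 0:
            rate, count = negative_rate, prior_count
        else:
            rate, count = 0.5, 14.0  # the knowledge-free "7 and 14" default
        rave_counts[move] = count
        rave_wins[move] = rate * count
    return rave_wins, rave_counts


# Example: "a" saves a chain, "b" is a self-atari, "c" matches nothing.
wins, counts = init_rave_priors(["a", "b", "c"], {"a": +1, "b": -1})
```

Because the priors live in the RAVE counters, their influence fades with the
same weighting as ordinary AMAF data once real playouts accumulate.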

Sylvain

>
>
> --
>                                Petr "Pasky" Baudis
> The average, healthy, well-adjusted adult gets up at seven-thirty
> in the morning feeling just terrible. -- Jean Kerr
> _______________________________________________
> computer-go mailing list
> computer-go@computer-go.org
> http://www.computer-go.org/mailman/listinfo/computer-go/
>
