Hi Erik,

> (1) They compared Rave to plain UCT. If they would have compared it to
> a more sophisticated implementation (like the best Mogo before Rave)
> they probably could not have shown a spectacular improvement.

The best MoGo before Rave was very close to plain UCT with the
sequence-like simulations, and we did in fact compare the best MoGo
before and after Rave. There is a table (I don't remember which
number) that shows the incremental improvements from plain UCT,
through plain UCT with sequence-like simulations, to Rave. All
experiments were done with MoGo's code, with all other parts of the
code held constant. No "secret part" of MoGo was disabled to make
the improvement from Rave look more interesting.

One discrepancy between our results and the ones some of you observe,
as Gian-Carlo stated, is likely to come from the parameters and
implementation details. We heavily tuned those parameters and details
against gnugo, and that makes quite a big difference. I discussed the
details more closely with some of you, and it did make a difference.
Maybe some of you can share what made the change, if you want.

Note as well that the current implementation of MoGo (not the one at
the time of the ICML paper) uses a different tradeoff between the UCT
and Rave values, thanks to an idea of David Silver. It brought
improvements in 19x19 (where the Rave values are the most useful),
while the gain in 9x9 was marginal (though still positive). But anyway
we are talking about 9x9 here, so it can't explain what you are
describing.
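
For readers who have not implemented this, here is a minimal sketch of
what "a tradeoff between UCT and Rave value" can look like. The weight
schedule sqrt(k / (3n + k)) and the value of k are assumptions chosen
for illustration only; this is not the newer tradeoff mentioned above,
and the function names are hypothetical, not taken from MoGo's code.

```python
import math

def rave_weight(n_visits, k=1000.0):
    """Weight on the Rave estimate (assumed schedule, for illustration).

    Equals 1.0 with no real visits and decays toward 0 as visits grow;
    k is a hypothetical tuning parameter.
    """
    return math.sqrt(k / (3.0 * n_visits + k))

def blended_value(uct_mean, rave_mean, n_visits, k=1000.0):
    """Mix the UCT mean and the Rave mean for one move."""
    beta = rave_weight(n_visits, k)
    return beta * rave_mean + (1.0 - beta) * uct_mean

# With no real visits the Rave estimate is used alone; with many
# visits the UCT mean dominates.
fresh = blended_value(uct_mean=0.2, rave_mean=0.8, n_visits=0)
mature = blended_value(uct_mean=0.2, rave_mean=0.8, n_visits=100000)
```

The point of any such schedule is that k (and the exact form of the
weight) is one of the parameters that has to be tuned, which is
consistent with the remark above that tuning makes a big difference.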

> (2) (....) Depending on the playout
> policy, adding an upper confidence bound to the rave values can push
> some terrible bad moves up (like playing on 1-1). The reason seems to
> be that such moves are normally sampled very infrequently (so the UCB
> will be higher), and when they are selected (...)

That could be an explanation, but there are two points:
- the prior you put on top of Rave often avoids sampling 1-1 first,
and even when you do sample it, you very often lose just 1 playout
because of the UCT value you get right away.
- I never observed a big discrepancy between the number of Rave
samples for each move.
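
To make the mechanism under discussion concrete, here is a small
sketch of an upper confidence bound applied to Rave values, with an
optional prior. Everything in it is an assumption for illustration:
the bonus form c * sqrt(log(total) / n), treating the prior as virtual
samples, and all the numbers. It does not describe how MoGo computes
either quantity.

```python
import math

def rave_ucb(mean, n, total, c=1.0, prior_mean=0.0, prior_n=0):
    """Rave mean plus an exploration bonus (hypothetical form).

    The prior is modeled as prior_n virtual samples with value
    prior_mean, merged into the Rave statistics before the bonus
    is computed.
    """
    n_eff = n + prior_n
    mean_eff = (mean * n + prior_mean * prior_n) / n_eff
    return mean_eff + c * math.sqrt(math.log(total) / n_eff)

# A rarely sampled poor move (think 1-1) versus a well sampled decent move.
rare = rave_ucb(mean=0.3, n=5, total=1000)
common = rave_ucb(mean=0.5, n=500, total=1000)

# The same rare move with a pessimistic prior worth 50 virtual samples.
rare_with_prior = rave_ucb(mean=0.3, n=5, total=1000,
                           prior_mean=0.1, prior_n=50)
```

Without a prior, the rare move scores higher than the common one
purely because of its bonus, which is the effect described in the
quoted text. With the assumed prior, its score falls below the common
move's, which is the first point in the list above. If the Rave sample
counts are similar across moves, as in the second point, the bonuses
are similar too and the effect mostly disappears.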

computer-go mailing list