Re: [computer-go] More UCT / Monte-Carlo questions (Effect of rave)

Hideki Kato Wed, 06 Feb 2008 06:42:45 -0800

I found some data.  GGMC Go v2r6 against GNU Go 3.7.10 (level 10), 9x9, 
komi 7.5, 3000 playouts/move, 2000-game match:


Without RAVE:   winning rate was 23.1 +- 0.9% (-209 +- 9 Elo)
With RAVE:      winning rate was 65.3 +- 1.1% (+110 +- 8 Elo)
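
The Elo figures above are consistent with the usual logistic conversion 
from winning rate, with the +- taken as one binomial standard error over 
2000 games.  A minimal sketch, assuming that conversion (the function 
names are illustrative, not taken from GGMC):

```python
import math

def elo_from_win_rate(p):
    """Elo difference implied by winning rate p under the logistic model."""
    return -400.0 * math.log10(1.0 / p - 1.0)

def win_rate_std_error(p, games):
    """One binomial standard error of an observed winning rate."""
    return math.sqrt(p * (1.0 - p) / games)

for p in (0.231, 0.653):
    # Prints the rounded Elo difference and the standard error in percent.
    print(round(elo_from_win_rate(p)), round(100 * win_rate_std_error(p, 2000), 1))
```

Rounded, this gives -209 and +110 Elo with standard errors of 0.9% and 
1.1%, matching the two lines above.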

Though this includes some other improvements, most of the gain comes 
from RAVE.  Unlike MoGo, my best 'K' was 1000.
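
For context, 'K' is the constant that sets how quickly the RAVE estimate 
gives way to the ordinary UCT estimate as a node gathers visits.  A 
sketch of the mixing schedule, assuming the formula from Gelly and 
Silver's paper, beta = sqrt(K / (3n + K)) with n the visit count of the 
node.  The function names are illustrative, and GGMC's exact form may 
differ:

```python
import math

def rave_weight(n, k=1000):
    """Weight beta given to the RAVE value after n visits (assumed schedule)."""
    return math.sqrt(k / (3.0 * n + k))

def blended_value(uct_mean, rave_mean, n, k=1000):
    """Mix the RAVE and UCT means; RAVE dominates while n is small."""
    beta = rave_weight(n, k)
    return beta * rave_mean + (1.0 - beta) * uct_mean
```

Under this schedule the two estimates get equal weight when n equals K, 
so K = 1000 means RAVE still carries half the weight after 1000 visits.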

Following is my implementation of RAVE for GGMC v2r6.
1) Each playout returns the score and all moves played, with their 
colors.
2) While back-propagating the value (the digitized score), compute the 
mean and the variance as for UCB1, and do the same for RAVE 
separately.  For RAVE, the values of all (legal) moves in a node, 
except the one actually played, are updated.
3) In computing the values for RAVE, the point is that three colors 
appear (as someone, GCP if I remember correctly, mentioned before).  
If the players' colors aren't the same, skip the move.  Otherwise 
count the value as is or negated (1 - score, for me), depending on the 
color of the player at the position and the color the score is for.
4) In fact, for speed, before back-propagating the value of each 
playout I set up a color table covering all intersections of the board 
(initialized with EMPTY).  That is, fill the table (table[move] = 
color) by tracing the moves and colors returned by the playout 
forward (from the leaf node to the end of the game).  Then, tracing the 
path from the root to the leaf node, clear the entries (table[move] = 
EMPTY) in order to avoid counting moves twice, once here and once in 
UCB1.
5) While descending the tree, merge the values that come from UCB1 and 
RAVE, weighted by 'K', according to the formula in the paper.
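
Steps 2) to 4) can be sketched roughly as follows.  This is one reading 
of the description, not GGMC's source: Node, build_color_table and 
update_rave are hypothetical names, the score is assumed to be 1.0 for a 
black win and 0.0 for a white win, and only the count and sum needed for 
the mean are kept (the variance is omitted).

```python
from dataclasses import dataclass, field

EMPTY, BLACK, WHITE = 0, 1, 2

@dataclass
class Node:
    to_move: int                       # color of the player to move here
    legal_moves: list                  # intersections playable from here
    rave_n: dict = field(default_factory=dict)    # move -> RAVE count
    rave_sum: dict = field(default_factory=dict)  # move -> sum of values

def build_color_table(num_points, tree_moves, playout_moves):
    """Step 4: color table for one playout, minus the in-tree moves."""
    table = [EMPTY] * num_points
    for move, color in playout_moves:   # leaf -> end of game
        table[move] = color             # assumption: a later play overwrites
    for move in tree_moves:             # root -> leaf
        table[move] = EMPTY             # already counted by UCB1
    return table

def update_rave(path, tree_moves, table, black_score):
    """Steps 2-3: update RAVE stats of every node on the root-to-leaf path."""
    for node, played in zip(path, tree_moves):
        # Express the score from the point of view of the player to move.
        value = black_score if node.to_move == BLACK else 1.0 - black_score
        for move in node.legal_moves:
            if move == played or table[move] != node.to_move:
                continue                # played move, empty, or other color
            node.rave_n[move] = node.rave_n.get(move, 0) + 1
            node.rave_sum[move] = node.rave_sum.get(move, 0.0) + value
```

Here path holds the nodes from the root down to the parent of the leaf, 
and tree_moves the move chosen at each of them.  Building the table once 
per playout makes the per-move check in update_rave a single lookup.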

# Though I'm writing this while reading my source code, this 
description may contain some errors.

Hope this helps,
Hideki

Gian-Carlo Pascutto: <[EMAIL PROTECTED]>:
>> I also implemented RAVE in Mango. There were a few points of improvement
>> (around 60 Elo points with gnugo as reference), but not as much as in the
>> paper of Gelly and Silver :( (around 250 Elo points if I remember well)
>>
>> It might be that the effect of RAVE depends a lot on the simulation
>> strategy. Indeed, sometimes my RAVE was playing very good moves but also
>> very bad ones.
>
>I don't think the simulation strategy is the key.
>
>I suspect the improvement is largest when you don't do progressive widening.
>
>Nevertheless it would be quite interesting to see the implementation
>details of ggmc's RAVE. RAVE performance is quite dependent on exact
>implementation and parameters.
--
[EMAIL PROTECTED] (Kato)
_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
