On Wed, Jan 28, 2009 at 10:01 PM, Isaac Deutsch <i...@gmx.ch> wrote:

> Hi again ;)
>
> I found some time to actually implement this stuff. And, this has raised
> some small questions. In this part of the code:
>
>  for (j = index; j < moves_played_in_tree.size(); j += 2) {
>     //stuff....
>   }
>  for (j = 0; j < moves_played_out_tree.size(); ++j) {
>     //more stuff
>
>    // If it is either not the right color or the intersection is
>    // already played we ignore that move for that node
>    if (move < 0 || already_played.AlreadyPlayed(move)) continue;
>
>    already_played.Play(move);
>     //stuff
>  }
>
> 1. Shouldn't the first loop start at j=index+1? Starting at j=index would
> mean that the RAVE value of the node is updated with the move of the node
> itself, wouldn't it? It makes more sense to me to actually start at the
> first child of the node that is being backed up. Correct me if I'm wrong.

No, you need to update the RAVE value of the node for its first move (the
move played in the position of the node itself). So the loop starts at
j=index, and that is important for the algorithm to work.
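
To make the j=index point concrete, here is a minimal sketch of which moves
get credited to a node at depth `index`. It is an illustration under
assumptions, not the original implementation: `AlreadyPlayedSet` and
`CollectRaveMoves` are hypothetical names, moves are assumed to be encoded as
integers in [0, num_points), wrong-color playout moves are assumed to be stored
as negative values (matching the `move < 0` test in the quoted code), and
marking in-tree moves as already played is also an assumption, since the
quoted code elides that part.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: remembers which intersections were already played.
class AlreadyPlayedSet {
 public:
  explicit AlreadyPlayedSet(std::size_t num_points)
      : seen_(num_points, false) {}
  bool AlreadyPlayed(int move) const {
    return seen_[static_cast<std::size_t>(move)];
  }
  void Play(int move) { seen_[static_cast<std::size_t>(move)] = true; }

 private:
  std::vector<bool> seen_;
};

// Returns the moves whose RAVE statistics would be updated in the node
// sitting at depth `index` of the simulated path (hypothetical function).
std::vector<int> CollectRaveMoves(const std::vector<int>& moves_played_in_tree,
                                  const std::vector<int>& moves_played_out_tree,
                                  std::size_t index, std::size_t num_points) {
  AlreadyPlayedSet already_played(num_points);
  std::vector<int> credited;

  // Start at j = index, so the node's own move is credited too; the stride
  // of 2 keeps only moves of the player to move in this node.
  for (std::size_t j = index; j < moves_played_in_tree.size(); j += 2) {
    const int move = moves_played_in_tree[j];
    if (already_played.AlreadyPlayed(move)) continue;  // assumption, see above
    already_played.Play(move);
    credited.push_back(move);
  }

  // Playout part: skip wrong-color moves (negative) and repeated points.
  for (std::size_t j = 0; j < moves_played_out_tree.size(); ++j) {
    const int move = moves_played_out_tree[j];
    if (move < 0 || already_played.AlreadyPlayed(move)) continue;
    already_played.Play(move);
    credited.push_back(move);
  }
  return credited;
}
```

With `index = 0`, the first credited move is `moves_played_in_tree[0]`, that
is, the move chosen in the node itself. Starting at `index + 1` would skip it
and credit the opponent's replies instead.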


>
> 2. Shouldn't the order in the second loop be:
> -if (already played): continue;
> -update already played;
> -if (wrong color): continue;
> Otherwise, moves that are the wrong color don't get counted as already
> played (because they never get updated). I'm not sure if it makes a
> difference in this case because you check in the playouts, too, but maybe
> it does.


I think it is OK as it is, because the check is indeed already done in the
playout. The "already_played.Play(move)" in the second loop is actually also
unnecessary (I have not rechecked the code as I write this, but I think so as
far as I remember).


> And a final question: You calculate the (beta) coefficient as
> c = rc / (rc+c+rc*c*BIAS);
> which looks similar to the formula proposed by David Silver (If I recall
> his name correctly). However, in his formula, the last term looks like
> rc*c*BIAS/(q_ur*(1-q_ur))
> Is it correct that we could get q_ur from the current UCT-RAVE mean value,
> and that it is used like that?


Yes, the formula looks very similar (David proposed that formula to me at the
beginning of 2007). However, my implementation did not contain the
q_ur*(1-q_ur) factor; I approximated it by a constant, taking q_ur=0.5, so
the factor is 0.25.
I did not try the other formula. Maybe it works better, though I would expect
the two to behave similarly in practice.
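
For comparison, the two variants discussed above can be sketched side by side.
This is only an illustration under assumptions: `BetaConstant` and
`BetaWithVariance` are hypothetical names, `rc` is assumed to be the RAVE
simulation count and `c` the regular simulation count, and the clamp on `q_ur`
and the zero-count fallback are arbitrary guards added for the sketch.

```cpp
#include <algorithm>
#include <cassert>

// Constant form from this thread: beta = rc / (rc + c + rc*c*BIAS).
double BetaConstant(double rc, double c, double bias) {
  assert(rc >= 0.0 && c >= 0.0 && bias >= 0.0);
  const double denom = rc + c + rc * c * bias;
  // Arbitrary fallback when there are no samples at all.
  return denom > 0.0 ? rc / denom : 0.0;
}

// Variant whose last term is rc*c*BIAS / (q_ur*(1-q_ur)).
double BetaWithVariance(double rc, double c, double bias, double q_ur) {
  assert(rc >= 0.0 && c >= 0.0 && bias >= 0.0);
  // Assumption: keep q away from 0 and 1 so the division stays finite.
  const double kEps = 1e-3;
  const double q = std::min(std::max(q_ur, kEps), 1.0 - kEps);
  const double denom = rc + c + rc * c * bias / (q * (1.0 - q));
  return denom > 0.0 ? rc / denom : 0.0;
}
```

With `q_ur = 0.5` the factor is 0.25, so `BetaWithVariance(rc, c, b, 0.5)`
matches `BetaConstant(rc, c, 4 * b)`: the constant form simply folds the 1/0.25
into BIAS. The variants differ only when the mean value moves away from 0.5.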

Sylvain

>
>
> Regards,
> Isaac Deutsch
> _______________________________________________
> computer-go mailing list
> computer-go@computer-go.org
> http://www.computer-go.org/mailman/listinfo/computer-go/
>
