On Fri, Mar 28, 2008 at 11:20 AM, Jaonary Rabarisoa <[EMAIL PROTECTED]> wrote:
> Hi all,
>
> After a long search of the computer-go mailing list archive and several
> readings of the paper by Gelly and Silver (ICML 2007), I didn't find
> an answer to my question.
> In this paper they introduce a way to select the next move, at a given
> state, using the RAVE and UCT values of its children. They do this by comparing
>
> (1-beta)*Q_uct + beta*Q_rave
>
>
> But, by the definition of the RAVE and UCT values, for each child of a given
> node we may have one of the following situations:
>
> - its RAVE and UCT values are both defined (in this case we can compute the
> above score)
> - only the RAVE value is defined (in this situation n(s,a) = 0 and the
> UCT value is not defined)

Plain UCT always selects unvisited nodes first (n = 0 -> Q = infinity).

> - neither the RAVE nor the UCT value is defined

This cannot happen if you always select unvisited nodes first (because
if you select a move it leads to an update for both Q_rave and Q_uct).
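
A minimal sketch of that selection rule, to make the three cases concrete.
Everything named here (Child, beta, score, select, the constant k) is
hypothetical and not taken from any particular program. The beta schedule
is only one possible choice and is an assumption, and the UCT exploration
bonus is left out to keep the example short.

```python
import math
from dataclasses import dataclass


@dataclass
class Child:
    move: str
    n: int = 0              # n(s,a): number of real visits
    wins: float = 0.0       # wins recorded through real visits
    n_rave: int = 0         # number of RAVE (all-moves-as-first) updates
    wins_rave: float = 0.0  # wins recorded through RAVE updates


def beta(n, k=1000):
    # Assumed weighting schedule: close to 1 for few visits, decays to 0.
    return math.sqrt(k / (3 * n + k))


def score(child):
    # Unvisited child: plain-UCT convention, Q is treated as infinite,
    # so the undefined Q_uct is never needed.
    if child.n == 0:
        return math.inf
    q_uct = child.wins / child.n
    # After a real visit both statistics have been updated, so n_rave > 0
    # is expected here; the fallback is purely defensive.
    q_rave = child.wins_rave / child.n_rave if child.n_rave > 0 else q_uct
    b = beta(child.n)
    return (1 - b) * q_uct + b * q_rave


def select(children):
    # Picks the child with the highest combined score; ties go to the first.
    return max(children, key=score)


children = [
    Child("a", n=10, wins=6, n_rave=20, wins_rave=10),
    Child("b", n=0, n_rave=5, wins_rave=5),  # only the RAVE value is defined
]
print(select(children).move)
```

With this convention child "b" is chosen before "a" regardless of its RAVE
value, which is why the "only RAVE defined" case never needs a Q_uct.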

Erik
_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
