On Fri, Mar 28, 2008 at 11:20 AM, Jaonary Rabarisoa <[EMAIL PROTECTED]> wrote:
> Hi all,
>
> After a long search through the computer-go mailing list archive, and after
> reading and re-reading the paper by Gelly and Silver (ICML 2007), I didn't
> find an answer to my question.
> In this paper they introduce a way to select the next move, at a given
> state, using the RAVE and UCT values of its children. They do this by
> comparing
>
> (1-beta)*Q_uct + beta*Q_rave
>
> But, by the definition of the RAVE and UCT values, for each child of a given
> node we may have the following situations:
>
> - its RAVE and UCT values are both defined (in this case we can compute the
> above score)
> - only the RAVE value is defined (in this situation n(s,a) = 0 and the
> UCT value is not defined)
Plain UCT always selects unvisited nodes first (n = 0 -> Q = infinity).

> - neither the RAVE nor the UCT value is defined

This cannot happen if you always select unvisited nodes first, because
selecting a move leads to an update of both Q_rave and Q_uct.

Erik
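
For illustration only, the selection rule discussed above could be sketched in Python roughly as follows. This is not code from the paper or from any particular program: the names Child, score and select_move are hypothetical, and beta is simply passed in as a fixed number (how it should be scheduled is left open here). A child with n = 0 is given an infinite score, so it is always tried before the combined value is used.

```python
import math
from dataclasses import dataclass


@dataclass
class Child:
    """Statistics kept for one child move (hypothetical layout)."""
    move: str
    n: int = 0            # visit count n(s, a)
    q_uct: float = 0.0    # mean result of simulations through this child
    q_rave: float = 0.0   # RAVE estimate for this move


def score(child: Child, beta: float) -> float:
    """Combined value; unvisited children get +infinity (n = 0 -> Q = infinity)."""
    if child.n == 0:
        return math.inf
    return (1.0 - beta) * child.q_uct + beta * child.q_rave


def select_move(children, beta: float) -> Child:
    """Pick the child with the highest score; ties go to the first child."""
    return max(children, key=lambda c: score(c, beta))
```

With children a (n=10, Q_uct=0.6, Q_rave=0.4), b (n=0, Q_rave=0.9) and c (n=5, Q_uct=0.5, Q_rave=0.8), select_move picks b first because it is unvisited. The combined score only decides between children that have all been visited at least once.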