On Fri, Mar 28, 2008 at 2:36 PM, Jaonary Rabarisoa <[EMAIL PROTECTED]> wrote:
> So if I understand correctly, at each node we first need to play every
> possible action once, even though many of these actions are surely
> non-optimal. And this may be slow if the number of possible actions at
> this node is huge.

Well, as discussed in their ICML paper you could also initialize nodes
with prior knowledge.
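
A minimal sketch of one way such an initialization could look, assuming
the prior is expressed as a value estimate plus a number of "equivalent"
simulations that it is worth. The names Node, node_with_prior and
equivalent_visits are hypothetical and are not taken from the paper:

```python
from dataclasses import dataclass


@dataclass
class Node:
    visits: float  # real simulations plus any virtual ones from the prior
    wins: float    # sum of results, including virtual results

    @property
    def mean(self) -> float:
        # Current value estimate of the node.
        return self.wins / self.visits

    def update(self, result: float) -> None:
        # Fold one real simulation result (e.g. 1 = win, 0 = loss) into the node.
        self.visits += 1
        self.wins += result


def node_with_prior(prior_value: float, equivalent_visits: float) -> Node:
    # Assumption: the prior acts as if the node had already been simulated
    # equivalent_visits times with an average result of prior_value.
    return Node(visits=equivalent_visits, wins=prior_value * equivalent_visits)
```

A node created this way already has a nonzero visit count, so it does
not need to be played once before normal selection can compare it with
its siblings, and real simulations gradually outweigh the prior.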

> When you talk about FPU, does it mean that you give a kind of default value
> to unvisited nodes and compare this value with (1-beta)*Q_uct + beta*Q_rave,
> if we can compute it?

Yes, you do the normal UCT-RAVE selection among the moves that have
already been explored at least once. Then, if the highest upper
confidence bound (that of the move you would normally select if there
were no unexplored nodes) does not exceed the FPU value, you select an
unexplored node instead (FPU=infinity gives standard UCT).
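
The rule above can be sketched roughly as follows. This is only an
illustration under some assumptions: beta is treated as a fixed
constant (in practice it usually depends on the visit counts), the
exploration term is the plain UCB1 bonus with a hypothetical constant
c, and MoveStats and select_move are made-up names:

```python
import math
from dataclasses import dataclass


@dataclass
class MoveStats:
    visits: int = 0      # number of simulations that went through this move
    q_uct: float = 0.0   # mean result of those simulations
    q_rave: float = 0.0  # RAVE (all-moves-as-first) mean result


def select_move(children, parent_visits, fpu, beta=0.5, c=1.0):
    """Pick a move from children (dict: move -> MoveStats) using FPU."""
    explored = {m: s for m, s in children.items() if s.visits > 0}
    unexplored = [m for m, s in children.items() if s.visits == 0]

    def upper_bound(s):
        # Mixed value (1-beta)*Q_uct + beta*Q_rave plus an exploration bonus.
        value = (1 - beta) * s.q_uct + beta * s.q_rave
        return value + c * math.sqrt(math.log(parent_visits) / s.visits)

    best = None
    if explored:
        best = max(explored, key=lambda m: upper_bound(explored[m]))

    # If the best explored move does not exceed the FPU value,
    # try a move that has never been played instead.
    if unexplored and (best is None or upper_bound(explored[best]) <= fpu):
        return unexplored[0]
    return best


children = {"a": MoveStats(visits=10, q_uct=0.6, q_rave=0.5), "b": MoveStats()}
print(select_move(children, parent_visits=10, fpu=math.inf))  # unexplored first
print(select_move(children, parent_visits=10, fpu=0.9))       # keeps exploiting
```

With fpu=math.inf no explored move can exceed the threshold, so every
unexplored move is tried first, which is the standard UCT behaviour. A
lower FPU lets the search keep following a promising explored move
without first expanding all of its siblings.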

Erik