Hi all,
After a long search of the computer-go mailing list archive, and after reading
and rereading the paper by Gelly and Silver (ICML 2007), I still haven't found
an answer to my question.
In this paper they introduce a way to select the next move, at a given
state, using the RAVE and UCT value of its
On Fri, Mar 28, 2008 at 11:20 AM, Jaonary Rabarisoa [EMAIL PROTECTED] wrote:
On Fri, 2008-03-28 at 11:20 +0100, Jaonary Rabarisoa wrote:
- its RAVE and UCT values are both defined (in this case we can
compute the above score)
- only the RAVE value is defined (in this situation n(s,a) = 0
and the UCT value is not defined)
-
So if I understand correctly, at each node we first need to play every possible
action once, even though many of these actions are surely non-optimal. This may
be slow if the number of possible actions at the node is huge.
When you talk about FPU, does it mean that you give a kind of default value
for
I use FPU for both values for precisely the reasons you describe.
On Mar 28, 2008, at 9:36 AM, Jaonary Rabarisoa [EMAIL PROTECTED]
wrote:
So if I understand correctly, at each node we first need to play every
possible action once, even though many of these actions are surely non
On Fri, Mar 28, 2008 at 2:36 PM, Jaonary Rabarisoa [EMAIL PROTECTED] wrote:
So if I understand correctly, at each node we first need to play every possible
action once, even though many of these actions are surely non-optimal. This may
be slow if the number of possible actions at the node is huge.
So to sum up, we have the following pseudocode:
at a given node:
- find the child (among the visited children only) that maximizes the UCT-RAVE
value
- if this maximum UCT-RAVE value is less than the FPU value and there still
exist unvisited nodes:
  choose one unvisited node
- continue
Is
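The selection rule summarized in the pseudocode above could be sketched roughly as follows. This is only an illustration of the steps as stated in this thread, not code from the Gelly and Silver paper or from any particular program. The `Child` class, the `select_child` function, the `score` field (standing in for the combined UCT-RAVE value) and the FPU constant of 1.1 are all hypothetical names and values chosen for the example.

```python
import random
from dataclasses import dataclass

# Hypothetical first-play urgency (FPU) value; an assumption, to be tuned.
FPU = 1.1


@dataclass
class Child:
    move: str
    visits: int = 0     # n(s, a); 0 means the UCT value is undefined
    score: float = 0.0  # combined UCT-RAVE value, meaningful only if visits > 0


def select_child(children, fpu=FPU, rng=random):
    """Pick the child to descend into, following the pseudocode in the thread."""
    visited = [c for c in children if c.visits > 0]
    unvisited = [c for c in children if c.visits == 0]

    # Step 1: best child among the visited ones only (None if there are none).
    best = max(visited, key=lambda c: c.score, default=None)

    # Step 2: if the best visited score is below FPU and unvisited children
    # remain, try one of the unvisited children instead.
    if unvisited and (best is None or best.score < fpu):
        return rng.choice(unvisited)

    # Step 3: otherwise continue with the best visited child.
    return best
```

With children `a` (3 visits, score 0.4), `b` (unvisited) and `c` (5 visits, score 0.7), calling `select_child(children, fpu=0.5)` returns `c`, while `fpu=1.0` returns the unvisited `b`. Picking the unvisited child at random is one possible choice; ordering unvisited children by their RAVE value would be another.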
On Fri, Mar 28, 2008 at 3:10 PM, Jaonary Rabarisoa [EMAIL PROTECTED] wrote:
So to sum up, we have the following pseudocode:
at a given node:
- find the child (among the visited children only) that maximizes the UCT-RAVE
value
- if this maximum UCT-RAVE value is less than the FPU value and if there