Thank you for your answer. However, I am even more confused now. I understand that "-" is for negamax, but I don't understand why it became "1-". I am trying to implement your algorithm and I just want to know what lines 7, 16 and 26 should be?
It became a "1-" because I made a mistake while answering. The "1" would only have been there to keep values always between 0 and 1 (instead of [0,1] if black or [-1,0] if white), IF "value" were the average win and not the total win. So my fault, sorry :-/. Does that make things clearer?
Sylvain
-----Original Message-----
From: "Sylvain Gelly" <[EMAIL PROTECTED]>
To: "Dmitry Kamenetsky" <[EMAIL PROTECTED]>
Date: Wed, 21 Feb 2007 11:03:08 +0100
Subject: Re: [computer-go] UCT vs MC

> > Hello Dmitry,
> >
> > >> Your code says that the value is backed up by sum and negation (line 26,
> > >> value := -value). But I don't see any negative values in your sample tree,
> > >> or values greater than one. How do you actually back up values to the
> > >> root?
> > >Sorry, it is value := 1-value. Thank you for pointing out the mistake.
> >
> > I am confused about value. What is it actually storing? I thought
> > node[i].value stores the number of wins (for Black) for node i. Then why
> > are some of the values in Figure 1 not integers?
> >
> > If line 26 is now value := 1-value, then should some of the other lines
> > also change? For example should line 7 be updateValue(node,
> > 1-node[i].value), and line 16 be else v[i]:= (1-node.childNode[i].value)/node.childNode[i].nb+sqrt(...)?
>
> You're right, there was some confusion :-).
> In fact it is very simple. The "-" is here because it is negamax and not
> minimax, so that you can always take the max of the value (but the value is
> negated every 2 levels). The value stored then corresponds to the value of
> "the player to play" in the node.
> It seems that node[i].value indeed keeps the number of wins for the player
> to play in node i. The "1-" does not exist.
> In Figure 1, it is an example of UCT in the general case, where the reward is
> not always in [0,1]. And the values displayed in the nodes are the averages.
> So that explains the non-integers and the values not in [0,1].
>
> > Can you also update all the changes in your report? Thank you.
>
> I'll try to find some time to do that. Can't tell if it will be soon though.
>
> Regards,
> Sylvain
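
For anyone following the thread, here is a minimal Python sketch of the negamax-style backup discussed above. It is not the pseudocode from the report: the names Node, backup and path are made up for illustration, and it assumes the reward is given from the viewpoint of the player to play at the leaf. It only shows how value := -value flips the viewpoint at each level while the totals are summed.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    # Hypothetical node layout, chosen for this sketch only.
    value: float = 0.0  # total reward, seen by the player to play at this node
    nb: int = 0         # number of simulations that went through this node


def backup(path: List[Node], reward: float) -> None:
    """Back a simulation result up a root-to-leaf path, negamax style.

    Assumption: `reward` is expressed from the viewpoint of the player
    to play at the leaf (the last node of `path`).
    """
    value = reward
    for node in reversed(path):
        node.value += value  # accumulate the total, not the average
        node.nb += 1
        value = -value       # the parent sees the result from the other side


if __name__ == "__main__":
    root, mid, leaf = Node(), Node(), Node()
    backup([root, mid, leaf], 1.0)  # a win for the player to play at the leaf
    backup([root, mid, leaf], 0.0)  # a loss for that player
    for name, n in (("root", root), ("mid", mid), ("leaf", leaf)):
        print(name, n.value, n.nb, n.value / n.nb)
```

With 0/1 rewards, the averages value/nb stay in [0,1] on every other level and in [-1,0] on the levels in between, which is the sign pattern mentioned in the reply above.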
_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/