On Jan 21, 2009, at 11:53 AM, Olivier Teytaud wrote:

Here, we have a non-zero initialization of the number of wins, of the number of simulations, of the number of Rave-wins, and of the number of Rave-losses. We then have a constant of 0 for exploration, but also an exploratory term which is very different, and for which I am not the main author; therefore I will let the main author give an explanation if he wants to :-)

I point out that even before this exploratory term was added, the best UCB-like exploration constant was 0, as soon as the initial numbers of wins, of losses, of Rave-wins and of Rave-losses are heuristic values.
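
To make the idea concrete, here is a minimal Python sketch of heuristic initialization combined with a UCB-like exploration constant of 0. It is only an illustration under stated assumptions, not the actual program's code: the names (MoveStats, init_stats, score, select_move), the prior weights, and the Rave blending weight rave_k / (rave_k + sims) are all hypothetical, and the separate exploratory term mentioned above is not modelled.

```python
import math
from dataclasses import dataclass


@dataclass
class MoveStats:
    wins: float         # heuristic prior wins plus real wins
    sims: float         # heuristic prior simulations plus real simulations
    rave_wins: float    # heuristic prior Rave-wins plus real Rave-wins
    rave_losses: float  # heuristic prior Rave-losses plus real Rave-losses


def init_stats(prior_winrate, prior_weight, rave_winrate, rave_weight):
    """Start a move with non-zero counts derived from a heuristic estimate.

    Assumption: the heuristic yields a win rate in [0, 1] and a positive
    weight saying how many virtual simulations it is worth.
    """
    return MoveStats(
        wins=prior_winrate * prior_weight,
        sims=prior_weight,
        rave_wins=rave_winrate * rave_weight,
        rave_losses=(1.0 - rave_winrate) * rave_weight,
    )


def score(stats, parent_sims, exploration_c=0.0, rave_k=50.0):
    """Blend the plain and Rave win rates, plus an optional UCB-like bonus.

    With exploration_c = 0 the bonus vanishes and the choice is driven
    entirely by the (heuristically initialized) statistics.
    """
    mean = stats.wins / stats.sims
    rave_mean = stats.rave_wins / (stats.rave_wins + stats.rave_losses)
    beta = rave_k / (rave_k + stats.sims)  # hypothetical blending weight
    bonus = exploration_c * math.sqrt(math.log(max(parent_sims, 1)) / stats.sims)
    return (1.0 - beta) * mean + beta * rave_mean + bonus


def select_move(children, parent_sims):
    """Pick the move name with the highest score."""
    return max(children, key=lambda m: score(children[m], parent_sims))
```

Because every move starts with positive counts, the divisions are well defined from the first visit on, and a move with a poor prior can still be selected once the real results of the other moves pull their scores below it.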

I'd like to make sure I understand exactly what you mean. You use some heuristics to initialize all the moves (or maybe some of the moves) with certain win-loss and rave-win-loss ratios?

To a certain extent I suppose these could come from the reading of the previous move? I think I slowly start to make sense of things...

Mark



_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
