On 27-nov-08, at 19:50, Denis fidaali wrote:
So, you use AMAF for "simulating" the first UCT evaluations? I thought the classical way to use AMAF was to affect only the win/loss ratio portion of the UCT equation. Obviously it should be allowed to use an arbitrarily large number of AMAF simulations, accumulating them for longer than it takes to expand a node. I think a classical way to affect the win/loss ratio is to decrease the effect of the AMAF correction as the number of simulations grows. If you test with a very low number of simulations (in the 1000 - 3000 range), I think you should be able to get a very nice improvement out of the AMAF version. If you don't, I would think that something is wrong somewhere. What test process do you use for this version?
I tested it mostly doing 2,000 playouts. When AMAF is true I create a map of virtual-win values of all the moves played during a playout. These values get accumulated over all the playouts (not just the first ones). The 'virtual-value' of a move is calculated as follows:
exploration-factor * UCT + ( (nr-wins*2 + nr-virtual-wins) / (nr-playouts*2 + nr-virtual-playouts) )
where the exploration-factor is currently sqrt(0.2) and UCT is sqrt ( log( nr-parent-playouts ) / ( nr-playouts+1) )
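For concreteness, the formula above could be written roughly as follows. This is only a sketch in Python, not the bot's actual code: the function and parameter names are made up to mirror the terms in the formula, and returning 0.0 for the ratio when there are no playouts of either kind is an assumption, since the formula itself does not say what happens then.

```python
import math

# Exploration factor as stated above: sqrt(0.2).
EXPLORATION_FACTOR = math.sqrt(0.2)


def virtual_value(nr_wins, nr_playouts,
                  nr_virtual_wins, nr_virtual_playouts,
                  nr_parent_playouts):
    """Virtual value of a move: exploration term plus blended win ratio.

    All names are hypothetical; they mirror the terms of the formula.
    nr_parent_playouts is expected to be at least 1 so the log is defined.
    """
    # UCT term: sqrt( log(nr-parent-playouts) / (nr-playouts + 1) )
    uct = math.sqrt(math.log(nr_parent_playouts) / (nr_playouts + 1))

    # Real results count double compared to the virtual (AMAF) results.
    denominator = nr_playouts * 2 + nr_virtual_playouts
    if denominator == 0:
        ratio = 0.0  # assumption: the formula leaves this case undefined
    else:
        ratio = (nr_wins * 2 + nr_virtual_wins) / denominator

    return EXPLORATION_FACTOR * uct + ratio


if __name__ == "__main__":
    # 5 wins in 10 real playouts, 6 virtual wins in 20 virtual playouts,
    # parent visited 100 times.
    print(virtual_value(5, 10, 6, 20, 100))
```

The child with the highest value would then be the one selected for the next playout.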
Like I said, I haven't had time to experiment much so this formula may not be any good. I had also expected to see some positive effect of the virtual-win / virtual-playout ratio from AMAF, but I see none. Of course it's also possible I have a different kind of bug still.
What happens in my 'formula' is that when it never expands beyond the first level, which is what happens if the number of simulations is equal to the number of simulations before expansion, the virtual-value becomes completely determined by nr-virtual-wins / nr-virtual-playouts, making it equivalent to the original ref-bot. In case it does expand further and creates a tree, the actual win-loss ratio is weighted twice as heavily as the virtual win-loss ratio. That seemed like a reasonable first try. I have tried a few others, but usually got either much the same results or much worse ones.
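The two cases can be checked with a few lines of Python. Again this is only a sketch with made-up names, looking at the win-ratio part of the formula alone. Note that the "twice as heavily" weighting is exact only when a move has the same number of real and virtual playouts; in general the two ratios are weighted 2*nr-playouts against nr-virtual-playouts.

```python
def blended_ratio(nr_wins, nr_playouts, nr_virtual_wins, nr_virtual_playouts):
    """Win-ratio part of the formula (hypothetical helper name)."""
    return ((nr_wins * 2 + nr_virtual_wins)
            / (nr_playouts * 2 + nr_virtual_playouts))


# Case 1: no real playouts for the move, so only the AMAF statistics remain.
amaf_only = blended_ratio(0, 0, 30, 50)
assert abs(amaf_only - 30 / 50) < 1e-12

# Case 2: equal numbers of real and virtual playouts.
# Real ratio 8/10, virtual ratio 2/10: the result is (2 * 0.8 + 0.2) / 3.
blended = blended_ratio(8, 10, 2, 10)
assert abs(blended - (2 * 0.8 + 0.2) / 3) < 1e-12

# Case 3: unequal counts, the weights become 2*10 versus 40.
unequal = blended_ratio(8, 10, 8, 40)
assert abs(unequal - (20 * 0.8 + 40 * 0.2) / 60) < 1e-12
```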
Mark
_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/