Quoting Hideki Kato <[EMAIL PROTECTED]>:
Yes, UCT does. From my recent experiments with a delay
line (a fixed size FIFO queue) between a UCTsearcher and an MC
simulator with RAVE against GNU Go 3.7.11 level 0 on 9x9 (single
thread):
delay #po wins games winning rate ELO 1 sigma of wr
0 1,000 721 2,000 36.05% -99.6 1.07%
1 1,000 721 2,000 36.05% -99.6 1.07%
2 1,000 690 2,000 34.50% -111.4 1.06%
3 1,000 663 2,000 33.15% -121.8 1.05%
5 1,000 642 2,000 32.10% -130.1 1.04%
10 1,000 522 2,000 26.10% -180.8 0.98%
20 1,000 412 2,000 20.60% -234.4 0.90%
50 1,000 82 2,000 4.10% -547.6 0.44%
If I understand this correctly this simulation for delay 50 computes
50 playouts and then updates the tree.
In Valkyria I do the following. Every simulation from the root with
their own thread updates all nodes as visited down the tree before
entering the heavy playout. This means that all moves made in the tree
are temporarily updated as losses. When a playout has finished, half
of the moves were winners and updated accordingly.
The idea behind this is that this hopefully avoids searching the same
path over and over again. Have tried anything like this?
Also your results shows clearly that there is inefficency. But do you
also have results where for example delay 50 also computes 50x1000
simulations so that we can see if what it means in practise?
Magnus
_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/