Can you explain a bit more about how 1plyShuffle works?
On 9/14/07, Jason House <[EMAIL PROTECTED]> wrote:
> Time management is identical. Based on quick profiling, the MCTR version
> does about 1/6 of the simulations. Actually, since the MCTR does extra
> tracking of info, it can reuse some old simulations, so it may be more like
> 1/4 or 1/5 of the simulations. It's just using the results more
> efficiently.
>
> On 9/14/07, Chris Fant <[EMAIL PROTECTED]> wrote:
> > Does it defeat it based on number of samples taken or time allotted per
> > turn?
> >
> > On 9/14/07, Jason House <[EMAIL PROTECTED]> wrote:
> > > I know I'm only wading in the kiddie pool of computer go with my 1-ply
> > > bots, but I think I may have found a useful enhancement to monte carlo.
> > >
> > > HouseBot supports three 1-ply search modes:
> > >   1plyMC - Uniform sampling
> > >   1plyShuffle - Uniform sampling with monte carlo transposition reuse
> > >   1plyUCT - Non-uniform sampling based on the UCT algorithm (AKA UCB)
> > >
> > > Obviously, 1plyMC is far inferior to 1plyUCT as everyone probably
> > > expects. What may surprise many is that 1plyShuffle defeats 1plyUCT
> > > nearly every time. I'm basing this on self-play data from CGOS.
> > > Currently,
> > > http://cgos.boardspace.net/9x9/cross/housebot-617-UCB.html
> > > shows 10 matches between housebot-617-UCB and housebot-618-shuff.
> > > housebot-617-UCB (1plyUCT) lost every time.
> > >
> > > While tricky, it should be possible to combine UCT and MCTR for an even
> > > stronger bot. MCTR can be thought of as a low-bias alternative to the
> > > AMAF heuristic. Rather than using all moves, MCTR takes only the top N
> > > moves, where N is computed based on which moves were played in the
> > > random game. From an open board position MCTR uses about 1/3 of the
> > > moves that AMAF would. Computation of the resulting winning percentage
> > > must also be weighted based on the probabilities of duplicating results
> > > (roughly speaking, it's 1/N).
> > >
> > > As a result of using MCTR, win and sample counts are no longer integers
> > > as one would expect. Here are the estimated winning rates for all three
> > > algorithms when asked for a white response to black G3:
> > >
> > >   1plyMC:      781 / 1272
> > >   1plyShuffle: 140.15 / 231.75
> > >   1plyUCT:     936 / 1515
> > >
> > > 1plyShuffle is slower because of the extra work of tracking information,
> > > but the variance in estimates should be far lower than the numbers would
> > > indicate. I have yet to do the computations, but a sample size of
> > > 231.75 has an estimation error comparable to around 6000 normal MC runs
> > > for that position. That is why my implementation of MCTR is defeating
> > > my (1ply) implementation of UCT.

_______________________________________________
computer-go mailing list
[email protected]
http://www.computer-go.org/mailman/listinfo/computer-go/
