Re: [computer-go] ReadyFreddy on CGOS
On Fri, 2007-09-14 at 15:15 -0400, Jason House wrote:
> I'd expect your enhancements are worth significantly more than 100
> ELO.

I can't really remember, except that it was a non-trivial amount that pleasantly surprised me. Sorry I can't be more accurate. Also, it may not scale the same at every level. Perhaps my hacks are more helpful at weak levels? I think I implemented them in such a way that the bonuses were less aggressive at higher levels - but they were still very useful at high levels.

What's interesting is that AnchorMan hits a wall beyond 4-5k nodes. I doubt you could squeeze more than another 20 or 30 ELO out of it no matter how long you run the simulations.

- Don

___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] ReadyFreddy on CGOS
On Fri, 2007-09-14 at 15:15 -0400, Jason House wrote:
> It's interesting that you say that it's just 512 all as first
> simulations... especially when I look at the performance of
> ego110_allfirst.

Anyone know how ego110_allfirst was implemented? Does the 110 mean 110 simulations?

Here are some more reference points from AnchorMan at various levels:

  ELO
  1434 - ControlBoy (on the old CGOS this was identical to AnchorMan at 5000 sims)
  1398 - SuperDog, running 2000 simulations
  1059 - ReadyFreddy, running 256 simulations
   763 - Dodo, running 64 simulations

On the old CGOS, ControlBoy should be 1500. This pool may have deflated, or perhaps it's not calibrated accurately - I wanted the ratings to be compatible. Someone complained recently that the pool had deflated when FatMan didn't run for a few days. I put these bots up because someone complained about gaps in the pool and I wanted some good competition at every strength level.

- Don

___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] ReadyFreddy on CGOS
It's interesting that you say that it's just 512 all as first simulations... especially when I look at the performance of ego110_allfirst. Anyone know how ego110_allfirst was implemented? I'd be really curious to see how 512 AMAF simulations would do without those extra enhancements. If ego110_allfirst really is a good benchmark for AMAF performance, I'd expect your enhancements are worth significantly more than 100 ELO. On 9/14/07, Don Dailey <[EMAIL PROTECTED]> wrote: > > I'm not sure how many simulations ReadyToGo does but if it's on Senesis > that is probably correct. > > ReadyFreddy always plays on CGOS and runs from the server. It DOES DO > 512 simulations but they are all-as-first simulations. There is also > a strong fixed discouragement for playing self-atari in losing > territory. > Also a weaker discouragement for playing ANY move in territory that > appears > statistically won by either side. These little hacks boost the > strength a good little bit - around 100 ELO I'm guessing - I don't > remember. > > It DOES NOT do 512 simulation per candidate move, just 512 total. > > ReadyFreddy and ReadyToGo are just AnchorMan running at a very fast > level. > > - Don > > > > > On Fri, 2007-09-14 at 14:30 -0400, Jason House wrote: > > I'm curious - What is ReadyFreddy? > > http://senseis.xmp.net/?ComputerGoServer does not list it. I do see > > ReadyToGo. If it is, the description says 512 simulations. Over 81 > > moves, that seems like almost nothing... 6 sims per move. Does it > > instead mean 512 sims per candidate move? That seems to make sense to > > me. > > ___ > > computer-go mailing list > > computer-go@computer-go.org > > http://www.computer-go.org/mailman/listinfo/computer-go/ > > ___ > computer-go mailing list > computer-go@computer-go.org > http://www.computer-go.org/mailman/listinfo/computer-go/ > ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] ReadyFreddy on CGOS
I'm not sure how many simulations ReadyToGo does, but if that's what it says on Sensei's Library it is probably correct.

ReadyFreddy always plays on CGOS and runs from the server. It DOES DO 512 simulations, but they are all-as-first simulations. There is also a strong fixed discouragement for playing self-atari in losing territory, and a weaker discouragement for playing ANY move in territory that appears statistically won by either side. These little hacks boost the strength a good little bit - around 100 ELO I'm guessing - I don't remember exactly.

It DOES NOT do 512 simulations per candidate move, just 512 total.

ReadyFreddy and ReadyToGo are just AnchorMan running at a very fast level.

- Don

On Fri, 2007-09-14 at 14:30 -0400, Jason House wrote:
> I'm curious - What is ReadyFreddy?
> http://senseis.xmp.net/?ComputerGoServer does not list it. I do see
> ReadyToGo. If it is, the description says 512 simulations. Over 81
> moves, that seems like almost nothing... 6 sims per move. Does it
> instead mean 512 sims per candidate move? That seems to make sense to
> me.
> ___
> computer-go mailing list
> computer-go@computer-go.org
> http://www.computer-go.org/mailman/listinfo/computer-go/

___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
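For readers who haven't seen all-as-first scoring before, here is a minimal Python sketch of the idea Don describes: a fixed total budget of playouts (512 here) is run from the current position, and each playout's result is credited to every point the side to move played in it, as if that point had been the first move. This is only an illustration under assumptions - the helper "run_playout" and the move representation are hypothetical, and AnchorMan's self-atari and territory discouragements are omitted - not AnchorMan's actual code.

    from collections import defaultdict

    def amaf_choose_move(legal_moves, run_playout, num_playouts=512):
        # All-as-first scoring sketch.  run_playout() is an assumed helper that
        # plays one random game from the current position and returns
        # (points_played_by_side_to_move, side_to_move_won).
        wins = defaultdict(float)
        sims = defaultdict(float)
        for _ in range(num_playouts):
            our_points, we_won = run_playout()
            for point in set(our_points):      # count each point once per playout
                if point in legal_moves:
                    sims[point] += 1.0
                    if we_won:
                        wins[point] += 1.0
        # Pick the legal move with the best all-as-first winning rate.
        return max(legal_moves, key=lambda p: (wins[p] / sims[p]) if sims[p] else 0.0)

Because every legal move picks up statistics from most of the 512 playouts, even that tiny total budget can produce a usable ranking, which is presumably why so few simulations still yield reasonable play.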
[computer-go] ReadyFreddy on CGOS
I'm curious - What is ReadyFreddy? http://senseis.xmp.net/?ComputerGoServer does not list it. I do see ReadyToGo. If that's what it is, the description says 512 simulations. Over 81 moves, that seems like almost nothing... 6 sims per move. Does it instead mean 512 sims per candidate move? That seems to make sense to me. ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] monte carlo transposition reuse (MCTR)
On 9/14/07, Jason House <[EMAIL PROTECTED]> wrote:
> // Note that variance of the estimate increases by (0.5 * fractional
> game)^2

I should say that the variance of "wins" increases by that amount. The estimate of the winning percentage will be computed as wins / sims. The variance of that estimate is (variance of the wins variable) / sims^2. In normal monte carlo, each sim causes the variance of wins to increase by 0.5^2, for a total of 0.5^2 * sims. The final result is a variance of 0.5^2 / sims. Of course, final computations typically use the standard deviation (the square root of the variance). This gives the classical result of 0.5 / sqrt(sims)... For MCTR, the result is smaller for the same value of sims... ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
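As a quick numeric illustration of the bookkeeping described above (my own sketch, not HouseBot code; the fractional weights in the example are made up), plain MC and MCTR-style weighted counting can be compared like this:

    import math

    def mc_std_error(sims):
        # Plain Monte Carlo: variance of "wins" grows by 0.5^2 per playout,
        # so the winning-rate estimate wins/sims has std dev 0.5/sqrt(sims).
        return math.sqrt(0.25 * sims) / sims

    def mctr_std_error(fractional_games):
        # MCTR-style: each playout adds (0.5 * f)^2 to the variance of "wins",
        # and the estimate divides by the fractional "sims" total.
        sims = sum(fractional_games)
        var_wins = sum((0.5 * f) ** 2 for f in fractional_games)
        return math.sqrt(var_wins) / sims

    print(mc_std_error(1000))               # 0.0158..., i.e. 0.5/sqrt(1000)
    print(mctr_std_error([0.25] * 4000))    # 0.0079..., smaller for the same "sims" total of 1000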
Re: [computer-go] monte carlo transposition reuse (MCTR)
Sure. I'm not sure how best to answer your question, so I'll respond with mostly pseudo code. I can answer questions on theory, but I expect that to go back and forth a bit since it's typically tough to explain. Where there are grey areas in implementation choice, I have text describing them (including basic theory). For the real source code, the logic is in search.d of HouseBot's (open) source. HouseBot's home page is http://housebot.sourceforge.net

Slowdown over basic MC occurs in two ways:
1. Maintaining a list of moves whose order can be shuffled around.
2. Applying the results to all affected leaves in the search tree.
(3. In my current implementation, I may save too much info, and memory management may be another problem.)

Here's some pseudo code.

Main thinking loop:

    while ( can think ){
        reset board
        reset shuffle list
        shuffleMode = true
        foreach(legal move){
            play(move)
            if ( shuffleMode ){
                shuffleMode = transposition still true  // see below
                if (shuffleMode || first move)
                    insert move into shuffle list
            }
        }
        apply result  // see below
    }

"Transposition still true"

By transposition or shuffle, I mean that I can reorder the moves by each color and arrive at the same board position without violating any rules along the way... still representing a mostly random game. Obviously, any move that's illegal in the starting board position will violate this and breaks the move shuffling property. Illegal would be stuff like suicide plays, taking a ko, or playing on top of another stone.

If you assume all captures break the move shuffling property, life is simple, and strange suicides that occur as combinations of stones won't occur. If you're smart enough to find a way to handle captures efficiently (or ignore the cost like I did), then you must check that captured stones are not in the shuffle list. If they are, order matters and the latest move breaks the move shuffling property. My code is far different, but when I think about it, the simplest form would be: "return not (capture on current board)". I should probably try that one out... Allowing captures does have the benefit that the shuffle list gets longer, but the control logic gets more complex (simple kos, suicide in the original position or in a complex reordered combination of moves, etc...).

How to apply the result:

Essentially, everything in the shuffle list can consider this as a game that began with that move first. At first, I had fractional game = 1 in a naive attempt to test out my theory, and it failed horribly. The problem is that positions with long shuffle lists can be reached in many ways by many starting moves. Because of this, sequences with long shuffle lists got counted way too much. I had to make it so that, regardless of shuffle list length, a position would not be over-counted in the final average. To do this, you should compute the probability of the position being arrived at by all candidate methods and then do a weighted sum: prob(this method) / [ prob(this method) + prob(alternate method 1) + prob(alternate method 2) + ... ]. I made the naive assumption that prob(method a) == prob(method b). I think that's reasonable for uniform sampling.

That value is used in the pseudo code below:

    foreach(random game){
        float fractional game = 1 / (number of moves in shuffle list with the same color to move)
        foreach(move in shuffle list with same color to move){
            num sims (move of interest) += fractional game
            if (random game is a win)
                num wins (move of interest) += fractional game
            // Note that variance of the estimate increases by (0.5 * fractional game)^2
        }
    }

On 9/14/07, Brian Slesinsky <[EMAIL PROTECTED]> wrote: > > Can you explain a bit more about how 1plyShuffle works? > > On 9/14/07, Jason House <[EMAIL PROTECTED]> wrote: > > Time management is identical. Based on quick profiling, the MCTR > version > > does about 1/6 of the simulations. Actually, since the MCTR does extra > > tracking of info, it can reuse some old simulations, so it may be more > like > > 1/4 or 1/5 of the simulations. It's just using the results more > > efficiently. > > > > > > On 9/14/07, Chris Fant <[EMAIL PROTECTED]> wrote: > > > Does it defeat it based on number of samples taken or time allotted > per > > turn? > > > > > > On 9/14/07, Jason House <[EMAIL PROTECTED]> wrote: > > > > I know I'm only wading in the kiddie pool of computer go with my > 1-ply > > bots, > > > > but I think I may have found a useful enhancement to monte carlo. > > > > > > > > HouseBot supports three 1-ply search modes: > > > > 1plyMC - Uniform sampling > > > > 1plyShuffle - Uniform sampling with monte carlo transposition > reuse > > > > 1plyUCT - Non-uniform sampling based on the UCT algorithm (AKA > UCB) > > > > > > > > Obviously, 1plyMC is far inferior to 1plyUCT as everyone probably > > expects. > > > > What may surprise many is that 1
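A rough Python rendering of the apply-result step above may help make the weighting concrete. This is my own sketch under assumptions (the shuffle-list representation, the color labels, and the example data are made up), not the logic from HouseBot's search.d:

    from collections import defaultdict

    def apply_result(shuffle_list, winner, side_to_move, num_wins, num_sims):
        # shuffle_list: (color, move) pairs whose order could be swapped without
        # changing the final position; winner: the color that won the random game.
        our_moves = [move for color, move in shuffle_list if color == side_to_move]
        if not our_moves:
            return
        # Weight so that a playout reachable through N different first moves is
        # not over-counted: each candidate first move gets 1/N of a game.
        fractional_game = 1.0 / len(our_moves)
        for move in our_moves:
            num_sims[move] += fractional_game
            if winner == side_to_move:
                num_wins[move] += fractional_game
            # Variance of "wins" grows by (0.5 * fractional_game)^2 per playout.

    # Usage with made-up data: two playouts with different shuffle lists.
    num_wins, num_sims = defaultdict(float), defaultdict(float)
    apply_result([("B", "C3"), ("W", "E5"), ("B", "G3")], "B", "B", num_wins, num_sims)
    apply_result([("B", "C3"), ("W", "D4")], "W", "B", num_wins, num_sims)
    print({m: num_wins[m] / num_sims[m] for m in num_sims})   # fractional win rates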
Re: [computer-go] monte carlo transposition reuse (MCTR)
Can you explain a bit more about how 1plyShuffle works? On 9/14/07, Jason House <[EMAIL PROTECTED]> wrote: > Time management is identical. Based on quick profiling, the MCTR version > does about 1/6 of the simulations. Actually, since the MCTR does extra > tracking of info, it can reuse some old simulations, so it may be more like > 1/4 or 1/5 of the simulations. It's just using the results more > efficiently. > > > On 9/14/07, Chris Fant <[EMAIL PROTECTED]> wrote: > > Does it defeat it based on number of samples taken or time allotted per > turn? > > > > On 9/14/07, Jason House <[EMAIL PROTECTED]> wrote: > > > I know I'm only wading in the kiddie pool of computer go with my 1-ply > bots, > > > but I think I may have found a useful enhancement to monte carlo. > > > > > > HouseBot supports three 1-ply search modes: > > > 1plyMC - Uniform sampling > > > 1plyShuffle - Uniform sampling with monte carlo transposition reuse > > > 1plyUCT - Non-uniform sampling based on the UCT algorithm (AKA UCB) > > > > > > Obviously, 1plyMC is far inferior to 1plyUCT as everyone probably > expects. > > > What may surprise many is that 1plyShuffle defeats 1plyUCT nearly every > > > time. I'm basic this on self-play data from CGOS. Currently, > > > > http://cgos.boardspace.net/9x9/cross/housebot-617-UCB.html > > > shows 10 matches between housebot-617-UCB has played housebot-618-shuff. > > > housebot-617-UCB (1plyUCT) lost every time. > > > > > > While tricky, it should be possible to combine UCT and MCTR for an even > > > stronger bot. MCTR can be thought of as a low bias alternative to the > AMAF > > > heuristic. Rather than using all moves, MCTR takes only the top N > moves, > > > where N is computed based on which moves were played in the random game. > > > From an open board position MCTR uses about 1/3 of the moves that AMAF > > > would. Computation of the resulting winning percentage must also be > > > weighted based on the probabilities of duplicating results (roughly > > > speaking, it's 1/N). > > > > > > As a result of using MCTR, winning rates are no longer integers as one > would > > > expect. Here's the estimated winning rates for all three algorithms > when > > > asked for a white response to black G3: > > > > > > 1plyMC: 781 / 1272 > > > 1plyShuffle: 140.15 / 231.75 > > > 1plyUCT: 936 / 1515 > > > > > > 1plyShuffle is slower because of the extra work information tracking, > but > > > the variance in estimates should be far lower than the numbers would > > > indicate. I have yet to do the computations, but a sample size of > 231.75 > > > has an estimation error of around 6000 normal MC runs for that position. > > > That is why my implementation of MCTR is defeating my (1ply) > implementation > > > of UCT. > > > > > ___ > computer-go mailing list > computer-go@computer-go.org > http://www.computer-go.org/mailman/listinfo/computer-go/ > ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] monte carlo transposition reuse (MCTR)
Time management is identical. Based on quick profiling, the MCTR version does about 1/6 of the simulations. Actually, since the MCTR does extra tracking of info, it can reuse some old simulations, so it may be more like 1/4 or 1/5 of the simulations. It's just using the results more efficiently. On 9/14/07, Chris Fant <[EMAIL PROTECTED]> wrote: > > Does it defeat it based on number of samples taken or time allotted per > turn? > > On 9/14/07, Jason House <[EMAIL PROTECTED]> wrote: > > I know I'm only wading in the kiddie pool of computer go with my 1-ply > bots, > > but I think I may have found a useful enhancement to monte carlo. > > > > HouseBot supports three 1-ply search modes: > > 1plyMC - Uniform sampling > > 1plyShuffle - Uniform sampling with monte carlo transposition reuse > > 1plyUCT - Non-uniform sampling based on the UCT algorithm (AKA UCB) > > > > Obviously, 1plyMC is far inferior to 1plyUCT as everyone probably > expects. > > What may surprise many is that 1plyShuffle defeats 1plyUCT nearly every > > time. I'm basic this on self-play data from CGOS. Currently, > > http://cgos.boardspace.net/9x9/cross/housebot-617-UCB.html > > shows 10 matches between housebot-617-UCB has played housebot-618-shuff. > > housebot-617-UCB (1plyUCT) lost every time. > > > > While tricky, it should be possible to combine UCT and MCTR for an even > > stronger bot. MCTR can be thought of as a low bias alternative to the > AMAF > > heuristic. Rather than using all moves, MCTR takes only the top N > moves, > > where N is computed based on which moves were played in the random game. > > From an open board position MCTR uses about 1/3 of the moves that AMAF > > would. Computation of the resulting winning percentage must also be > > weighted based on the probabilities of duplicating results (roughly > > speaking, it's 1/N). > > > > As a result of using MCTR, winning rates are no longer integers as one > would > > expect. Here's the estimated winning rates for all three algorithms > when > > asked for a white response to black G3: > > > > 1plyMC: 781 / 1272 > > 1plyShuffle: 140.15 / 231.75 > > 1plyUCT: 936 / 1515 > > > > 1plyShuffle is slower because of the extra work information tracking, > but > > the variance in estimates should be far lower than the numbers would > > indicate. I have yet to do the computations, but a sample size of > 231.75 > > has an estimation error of around 6000 normal MC runs for that position. > > That is why my implementation of MCTR is defeating my (1ply) > implementation > > of UCT. > ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] monte carlo transposition reuse (MCTR)
Does it defeat it based on number of samples taken or time allotted per turn? On 9/14/07, Jason House <[EMAIL PROTECTED]> wrote: > I know I'm only wading in the kiddie pool of computer go with my 1-ply bots, > but I think I may have found a useful enhancement to monte carlo. > > HouseBot supports three 1-ply search modes: > 1plyMC - Uniform sampling > 1plyShuffle - Uniform sampling with monte carlo transposition reuse > 1plyUCT - Non-uniform sampling based on the UCT algorithm (AKA UCB) > > Obviously, 1plyMC is far inferior to 1plyUCT as everyone probably expects. > What may surprise many is that 1plyShuffle defeats 1plyUCT nearly every > time. I'm basic this on self-play data from CGOS. Currently, > http://cgos.boardspace.net/9x9/cross/housebot-617-UCB.html > shows 10 matches between housebot-617-UCB has played housebot-618-shuff. > housebot-617-UCB (1plyUCT) lost every time. > > While tricky, it should be possible to combine UCT and MCTR for an even > stronger bot. MCTR can be thought of as a low bias alternative to the AMAF > heuristic. Rather than using all moves, MCTR takes only the top N moves, > where N is computed based on which moves were played in the random game. > From an open board position MCTR uses about 1/3 of the moves that AMAF > would. Computation of the resulting winning percentage must also be > weighted based on the probabilities of duplicating results (roughly > speaking, it's 1/N). > > As a result of using MCTR, winning rates are no longer integers as one would > expect. Here's the estimated winning rates for all three algorithms when > asked for a white response to black G3: > > 1plyMC: 781 / 1272 > 1plyShuffle: 140.15 / 231.75 > 1plyUCT: 936 / 1515 > > 1plyShuffle is slower because of the extra work information tracking, but > the variance in estimates should be far lower than the numbers would > indicate. I have yet to do the computations, but a sample size of 231.75 > has an estimation error of around 6000 normal MC runs for that position. > That is why my implementation of MCTR is defeating my (1ply) implementation > of UCT. > > ___ > computer-go mailing list > computer-go@computer-go.org > http://www.computer-go.org/mailman/listinfo/computer-go/ > ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
[computer-go] monte carlo transposition reuse (MCTR)
I know I'm only wading in the kiddie pool of computer go with my 1-ply bots, but I think I may have found a useful enhancement to monte carlo.

HouseBot supports three 1-ply search modes:

  1plyMC - Uniform sampling
  1plyShuffle - Uniform sampling with monte carlo transposition reuse
  1plyUCT - Non-uniform sampling based on the UCT algorithm (AKA UCB)

Obviously, 1plyMC is far inferior to 1plyUCT, as everyone probably expects. What may surprise many is that 1plyShuffle defeats 1plyUCT nearly every time. I'm basing this on self-play data from CGOS. Currently, http://cgos.boardspace.net/9x9/cross/housebot-617-UCB.html shows 10 matches that housebot-617-UCB has played against housebot-618-shuff. housebot-617-UCB (1plyUCT) lost every time.

While tricky, it should be possible to combine UCT and MCTR for an even stronger bot. MCTR can be thought of as a low-bias alternative to the AMAF heuristic. Rather than using all moves, MCTR takes only the top N moves, where N is computed based on which moves were played in the random game. From an open board position MCTR uses about 1/3 of the moves that AMAF would. Computation of the resulting winning percentage must also be weighted based on the probabilities of duplicating results (roughly speaking, it's 1/N).

As a result of using MCTR, winning rates are no longer integers as one would expect. Here are the estimated winning rates for all three algorithms when asked for a white response to black G3:

  1plyMC:      781 / 1272
  1plyShuffle: 140.15 / 231.75
  1plyUCT:     936 / 1515

1plyShuffle is slower because of the extra work of information tracking, but the variance in the estimates should be far lower than the raw numbers would indicate. I have yet to do the computations, but a sample size of 231.75 has an estimation error comparable to that of around 6000 normal MC runs for that position. That is why my implementation of MCTR is defeating my (1ply) implementation of UCT.

___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
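One way to make that kind of comparison concrete is to convert a set of fractional-game weights into an equivalent number of plain MC playouts by matching the variance of the winning-rate estimate. This is my own sketch, not part of HouseBot: it only uses the variance rule Jason gives in his follow-up message, and the example weights are hypothetical, chosen so the totals line up with the figures in the post rather than taken from an actual run.

    def equivalent_mc_playouts(fractional_games):
        # Match variances: plain MC with n playouts has estimate variance 0.25/n;
        # the weighted estimate has variance sum((0.5*f)^2) / (sum f)^2.
        total = sum(fractional_games)
        var_estimate = sum((0.5 * f) ** 2 for f in fractional_games) / total ** 2
        return 0.25 / var_estimate

    # Made-up weights: 6000 playouts each credited at the same small fraction.
    weights = [231.75 / 6000] * 6000
    print(sum(weights), equivalent_mc_playouts(weights))   # 231.75 "sims", ~6000 MC-equivalent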
Re: [computer-go] 2007 Cotsen Open wants your program to enter
I am more than willing to provide a binary version of my program if someone else provides the operator and (refundable) entrance fee. While I doubt my weak program would get best of the year, if it did win, I'd be fine with giving the extra $$ to whoever paid the entrance fee. On 9/13/07, Ray Tayek <[EMAIL PROTECTED]> wrote: > > At 12:49 PM 9/13/2007, you wrote: > ... > >I would also like to list the results from past years, as far as > >computer participation goes. But I can't find any Cotsen results > >listed on the web. Do you happen to know what program won in the > >last two years? > > we've only done this twice. slug beat smart go the first time and > slugo was the sole program entered last year. > > i know the time is short, but i am hoping that we have more programs > in the mix this year. > > thanks > > --- > vice-chair http://ocjug.org/ > > > ___ > computer-go mailing list > computer-go@computer-go.org > http://www.computer-go.org/mailman/listinfo/computer-go/ > ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/