Re: [computer-go] ReadyFreddy on CGOS

2007-09-14 Thread Don Dailey
On Fri, 2007-09-14 at 15:15 -0400, Jason House wrote:
> I'd expect your enhancements are worth significantly more than 100
> ELO. 

I can't really remember, except that it was a non-trivial, pleasing amount
that surprised me.  Sorry I can't be more accurate.

Also, it may not scale the same at every level.  Perhaps my hacks are more
helpful at weak levels?  I think I implemented these hacks in such a way
that the bonuses were less aggressive at higher levels - but it was still
very useful at high levels.

What's interesting is that AnchorMan hits the wall beyond 4-5 k nodes.  I
doubt you could squeeze more than another 20 or 30 ELO no matter how long
you run the simulations.


- Don



___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] ReadyFreddy on CGOS

2007-09-14 Thread Don Dailey
On Fri, 2007-09-14 at 15:15 -0400, Jason House wrote:
> It's interesting that you say that it's just 512 all as first
> simulations... especially when I look at the performance of
> ego110_allfirst.  Anyone know how ego110_allfirst was implemented?

Does 110 mean 110 simulations?  

Here are some more reference points from AnchorMan at various levels:

  ELO
 1434  - ControlBoy (on the old CGOS this was identical to AnchorMan at 5000 sims)
 1398  - SuperDog, running 2000 simulations
 1059  - ReadyFreddy, running 256 simulations
  763  - Dodo, running 64 simulations

On the old CGOS, ControlBoy should be 1500.  This pool may have deflated,
or perhaps it's not calibrated accurately - I wanted the ratings to be
compatible.  Someone complained recently that the pool deflated when FatMan
didn't run for a few days.

I put those up because someone complained about gaps in the pool and I
wanted some good competition at every strength level.

- Don




Re: [computer-go] ReadyFreddy on CGOS

2007-09-14 Thread Jason House
It's interesting that you say that it's just 512 all-as-first simulations...
especially when I look at the performance of ego110_allfirst.  Anyone know
how ego110_allfirst was implemented?

I'd be really curious to see how 512 AMAF simulations would do without those
extra enhancements.  If ego110_allfirst really is a good benchmark for AMAF
performance, I'd expect your enhancements are worth significantly more than
100 ELO.


On 9/14/07, Don Dailey <[EMAIL PROTECTED]> wrote:
>
> I'm not sure how many simulations ReadyToGo does, but if it's on Sensei's
> that is probably correct.
>
> ReadyFreddy always plays on CGOS and runs from the server.  It DOES DO
> 512 simulations but they are all-as-first simulations.  There is also
> a strong fixed discouragement for playing self-atari in losing
> territory.
> Also a weaker discouragement for playing ANY move in territory that
> appears
> statistically won  by either side.   These little hacks boost the
> strength a good little bit - around 100 ELO I'm guessing - I don't
> remember.
>
> It DOES NOT do 512 simulations per candidate move, just 512 total.
>
> ReadyFreddy and ReadyToGo are just AnchorMan running at a very fast
> level.
>
> - Don
>
>
>
>
> On Fri, 2007-09-14 at 14:30 -0400, Jason House wrote:
> > I'm curious - What is ReadyFreddy?
> > http://senseis.xmp.net/?ComputerGoServer does not list it.  I do see
> > ReadyToGo.  If it is, the description says 512 simulations.  Over 81
> > moves, that seems like almost nothing... 6 sims per move.  Does it
> > instead mean 512 sims per candidate move?  That seems to make sense to
> > me.

Re: [computer-go] ReadyFreddy on CGOS

2007-09-14 Thread Don Dailey
I'm not sure how many simulations ReadyToGo does, but if it's on Sensei's
that is probably correct.

ReadyFreddy always plays on CGOS and runs from the server.  It DOES DO
512 simulations, but they are all-as-first simulations.  There is also a
strong fixed discouragement against playing self-atari in losing territory,
and a weaker discouragement against playing ANY move in territory that
appears statistically won by either side.  These little hacks boost the
strength a good little bit - around 100 ELO, I'm guessing - I don't
remember.

It DOES NOT do 512 simulations per candidate move, just 512 total.
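For readers who haven't seen all-as-first scoring, here is a minimal Python sketch of the general AMAF idea being described - every point a color played anywhere in a playout is credited with the game's result as if it had been the first move. The data layout and names are my own assumptions, not AnchorMan's code:

```python
def amaf_scores(playouts):
    """Compute AMAF (all-as-first) win rates for Black's candidate points.

    playouts: list of (moves, black_won) pairs, where moves is the move
    sequence [('B', point), ('W', point), ...] of one random game played
    from the root position.  Every point Black played anywhere in a game
    is credited with that game's result, as if it had been played first.
    """
    wins, sims = {}, {}
    for moves, black_won in playouts:
        seen = set()
        for color, point in moves:
            if color == 'B' and point not in seen:
                seen.add(point)          # credit each point once per game
                sims[point] = sims.get(point, 0) + 1
                if black_won:
                    wins[point] = wins.get(point, 0) + 1
    return {p: wins.get(p, 0) / sims[p] for p in sims}
```

With 512 such playouts, every root candidate gets a usable estimate from the same games, which is why so few total simulations are enough for a move decision.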

ReadyFreddy and ReadyToGo are just AnchorMan running at a very fast
level.

- Don




On Fri, 2007-09-14 at 14:30 -0400, Jason House wrote:
> I'm curious - What is ReadyFreddy?
> http://senseis.xmp.net/?ComputerGoServer does not list it.  I do see
> ReadyToGo.  If it is, the description says 512 simulations.  Over 81
> moves, that seems like almost nothing... 6 sims per move.  Does it
> instead mean 512 sims per candidate move?  That seems to make sense to
> me. 



[computer-go] ReadyFreddy on CGOS

2007-09-14 Thread Jason House
I'm curious - What is ReadyFreddy?
http://senseis.xmp.net/?ComputerGoServer does not list it.  I do see
ReadyToGo.  If it is the same bot, the description says 512 simulations.
Over 81 moves, that seems like almost nothing... 6 sims per move.  Does it
instead mean 512 sims per candidate move?  That seems to make sense to me.

Re: [computer-go] monte carlo transposition reuse (MCTR)

2007-09-14 Thread Jason House
On 9/14/07, Jason House <[EMAIL PROTECTED]> wrote:

>   // Note that variance of the estimate increases by (0.5 * fractional
> game)^2


I should say that the variance of "wins" increases by that amount.  The
estimate of winning percentage is computed as wins / sims.  The variance
for that is (variance of the wins variable) / sims^2.  In normal Monte
Carlo, each sim increases the variance of wins by 0.5^2, for a total of
0.5^2 * sims.  The final result is a variance of 0.5^2 / sims.  Of course,
final computations typically use the standard deviation (the square root of
the variance).  This gives the classical result of 0.5 / sqrt(sims)...  With
MCTR, the result is smaller for the same value of sims...
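That classical 0.5 / sqrt(sims) standard error is easy to verify numerically; here is a quick sketch (my own check, not part of the thread):

```python
import math
import random

def mc_std(sims, trials=2000, p=0.5):
    """Empirical standard deviation of a Monte Carlo win-rate estimate
    built from `sims` playouts of a p-probability win, measured over
    many independent trials."""
    estimates = []
    for _ in range(trials):
        wins = sum(random.random() < p for _ in range(sims))
        estimates.append(wins / sims)
    mean = sum(estimates) / trials
    var = sum((e - mean) ** 2 for e in estimates) / trials
    return math.sqrt(var)

# Classical result for p = 0.5: std = 0.5 / sqrt(sims).
print(mc_std(256), 0.5 / math.sqrt(256))
```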

Re: [computer-go] monte carlo transposition reuse (MCTR)

2007-09-14 Thread Jason House
Sure.

I'm not sure how best to answer your question, so I'll respond with mostly
pseudo code. I can answer questions on theory, but I expect that to go back
and forth a bit since it's typically tough to explain.

Where there are grey areas in implementation choice, I have text describing
them (including basic theory).  For the real source code, the logic is in
search.d of HouseBot's (open) source. HouseBot's home page is
http://housebot.sourceforge.net

Slowdown over basic MC occurs in two ways:
1. Maintaining a list of moves whose order can be shuffled around.
2. Applying the results to all affected leaves in the search tree.
(3. In my current implementation, I may save too much info, so memory
management may be another problem.)

Here's some pseudo code:
Main thinking loop:
  while ( can think ){
    reset board
    reset shuffle list
    shuffleMode = true
    foreach(legal move){
      play(move)
      if ( shuffleMode ){
        shuffleMode = transposition still true  //see below
        if (shuffleMode || first move)
          insert move into shuffle list
      }
    }
    apply result   //see below
  }

"Transposition still true"
  By transposition or shuffle, I mean that I can reorder the moves by each
color and arrive at the same board position without violating any rules
along the way... Still representing a mostly random game.
  Obviously, any move that's illegal in the starting board position will
violate this and breaks the move shuffling property.  Illegal would be
stuff like suicide plays, taking a ko, or playing on top of another stone.
If you assume all captures break the move shuffling property, life is
simple, and strange suicides that occur as combinations of stones won't
occur.  If you're smart enough to find a way to handle captures efficiently
(or ignore them like I did), then you must check that the captured stones
are not in the shuffle list.  If they are, order matters and the latest move
breaks the move shuffling property.
  My code is far different, but when I think about it, the simplest form
would be: "return not (capture on current board)".  I should probably try
that one out...  Allowing captures does have the benefit that the shuffle
list gets longer, but the control logic gets more complex (simple kos,
suicide in the original position, or a complex reordered combination of
moves that creates suicides, etc...).
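The simple, conservative version of that test - stop extending the shuffle list at the first capture or the first move that was illegal in the root position - can be sketched abstractly. `legal_at_root` and `caused_capture` are assumed predicates supplied by the board code, not HouseBot identifiers:

```python
def shuffle_prefix(moves, legal_at_root, caused_capture):
    """Return the prefix of `moves` whose order can safely be shuffled.

    Stop as soon as a move was illegal in the root position or any move
    captured stones - the conservative rule from the post.
    legal_at_root(move) -> bool; caused_capture(index) -> bool.
    """
    prefix = []
    for i, move in enumerate(moves):
        if not legal_at_root(move) or caused_capture(i):
            break                # order now matters; stop extending
        prefix.append(move)
    return prefix
```

Allowing captures past this point lengthens the list, at the cost of the extra bookkeeping the post describes.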


How to apply the result:
  Essentially, everything in the shuffle list can consider this as a game
that began with that move first.  At first, I had fractional game = 1 in a
naive attempt to test out my theory, and it failed horribly.  The problem
is that positions with long shuffle lists can be reached in many ways by
many starting moves.  Because of this, sequences with long shuffle lists
got counted way too much.  I had to make it so that regardless of the
shuffle list length, a position would not be over-counted in the final
average.  To do this, you should compute the probability of the position
being arrived at by each candidate method and then do a weighted sum:
prob(this method) / [ prob(this method) + prob(alternate method 1) +
prob(alternate method 2) + ... ].  I made the naive assumption that
prob(method a) == prob(method b).  I think that's reasonable for uniform
sampling.  That value is used in the pseudo code below.

  foreach(random game){
    float fractional game = 1 / number of moves in shuffle list with the same color to move
    foreach(move in shuffle list with same color to move){
      num sims (move of interest) += fractional game
      if (random game is a win)
        num wins (move of interest) += fractional game
      // Note that the variance of the estimate increases by (0.5 * fractional game)^2
    }
  }
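That pseudo code can be written as a small Python helper. This is a sketch of the 1/N fractional-game weighting only; the names and the (sims, wins) layout are mine, not HouseBot's search.d:

```python
def apply_result(shuffle_list, mover_color, game_won, stats):
    """Credit one random game to every shuffle-list move of `mover_color`.

    shuffle_list: [(color, point), ...] moves whose order can be swapped.
    Each eligible point receives 1/N of a game, so one playout always
    contributes a total weight of exactly 1 to the statistics no matter
    how long the shuffle list is - this is what prevents over-counting.
    stats maps point -> (num_sims, num_wins), both floats.
    """
    eligible = [pt for color, pt in shuffle_list if color == mover_color]
    if not eligible:
        return
    frac = 1.0 / len(eligible)          # the fractional game
    for pt in eligible:
        sims, wins = stats.get(pt, (0.0, 0.0))
        stats[pt] = (sims + frac, wins + (frac if game_won else 0.0))
```

Accumulating fractional games like this is also what produces the non-integer totals (like 140.15 / 231.75) quoted elsewhere in the thread.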

On 9/14/07, Brian Slesinsky <[EMAIL PROTECTED]> wrote:
>
> Can you explain a bit more about how 1plyShuffle works?
>
> On 9/14/07, Jason House <[EMAIL PROTECTED]> wrote:
> > Time management is identical.  Based on quick profiling, the MCTR
> version
> > does about 1/6 of the simulations.  Actually, since the MCTR does extra
> > tracking of info, it can reuse some old simulations, so it may be more
> like
> > 1/4 or 1/5 of the simulations.  It's just using the results more
> > efficiently.
> >
> >
> > On 9/14/07, Chris Fant <[EMAIL PROTECTED]> wrote:
> > > Does it defeat it based on number of samples taken or time allotted
> per
> > turn?
> > >
> > > On 9/14/07, Jason House <[EMAIL PROTECTED]> wrote:
> > > > I know I'm only wading in the kiddie pool of computer go with my
> 1-ply
> > bots,
> > > > but I think I may have found a useful enhancement to monte carlo.
> > > >
> > > > HouseBot supports three 1-ply search modes:
> > > >   1plyMC - Uniform sampling
> > > >   1plyShuffle - Uniform sampling with monte carlo transposition
> reuse
> > > >   1plyUCT - Non-uniform sampling based on the UCT algorithm (AKA
> UCB)
> > > >
> > > > Obviously, 1plyMC is far inferior to 1plyUCT as everyone probably
> > expects.
> > > > What may surprise many is that 1plyShuffle defeats 1plyUCT nearly
> > > > every time.

Re: [computer-go] monte carlo transposition reuse (MCTR)

2007-09-14 Thread Brian Slesinsky
Can you explain a bit more about how 1plyShuffle works?

On 9/14/07, Jason House <[EMAIL PROTECTED]> wrote:
> Time management is identical.  Based on quick profiling, the MCTR version
> does about 1/6 of the simulations.  Actually, since the MCTR does extra
> tracking of info, it can reuse some old simulations, so it may be more like
> 1/4 or 1/5 of the simulations.  It's just using the results more
> efficiently.
>
>
> On 9/14/07, Chris Fant <[EMAIL PROTECTED]> wrote:
> > Does it defeat it based on number of samples taken or time allotted per
> turn?
> >
> > On 9/14/07, Jason House <[EMAIL PROTECTED]> wrote:
> > > I know I'm only wading in the kiddie pool of computer go with my 1-ply
> bots,
> > > but I think I may have found a useful enhancement to monte carlo.
> > >
> > > HouseBot supports three 1-ply search modes:
> > >   1plyMC - Uniform sampling
> > >   1plyShuffle - Uniform sampling with monte carlo transposition reuse
> > >   1plyUCT - Non-uniform sampling based on the UCT algorithm (AKA UCB)
> > >
> > > Obviously, 1plyMC is far inferior to 1plyUCT as everyone probably
> expects.
> > > What may surprise many is that 1plyShuffle defeats 1plyUCT nearly every
> > > time.  I'm basing this on self-play data from CGOS.  Currently,
> > >
> http://cgos.boardspace.net/9x9/cross/housebot-617-UCB.html
> > > shows 10 matches that housebot-617-UCB has played against housebot-618-shuff.
> > > housebot-617-UCB (1plyUCT) lost every time.
> > >
> > > While tricky, it should be possible to combine UCT and MCTR for an even
> > > stronger bot.  MCTR can be thought of as a low bias alternative to the
> AMAF
> > > heuristic.  Rather than using all moves, MCTR takes only the top N
> moves,
> > > where N is computed based on which moves were played in the random game.
> > > From an open board position MCTR uses about 1/3 of the moves that AMAF
> > > would.  Computation of the resulting winning percentage must also be
> > > weighted based on the probabilities of duplicating results (roughly
> > > speaking, it's 1/N).
> > >
> > > As a result of using MCTR, winning rates are no longer integers as one
> would
> > > expect.  Here's the estimated winning rates for all three algorithms
> when
> > > asked for a white response to black G3:
> > >
> > > 1plyMC:   781 / 1272
> > > 1plyShuffle:  140.15 /  231.75
> > > 1plyUCT: 936 / 1515
> > >
> > > 1plyShuffle is slower because of the extra information tracking,
> but
> > > the variance in estimates should be far lower than the numbers would
> > > indicate.  I have yet to do the computations, but a sample size of
> 231.75
> > > has an estimation error of around 6000 normal MC runs for that position.
> > > That is why my implementation of MCTR is defeating my (1ply)
> implementation
> > > of UCT.
> >
>
>


Re: [computer-go] monte carlo transposition reuse (MCTR)

2007-09-14 Thread Jason House
Time management is identical.  Based on quick profiling, the MCTR version
does about 1/6 of the simulations.  Actually, since the MCTR does extra
tracking of info, it can reuse some old simulations, so it may be more like
1/4 or 1/5 of the simulations.  It's just using the results more
efficiently.

On 9/14/07, Chris Fant <[EMAIL PROTECTED]> wrote:
>
> Does it defeat it based on number of samples taken or time allotted per
> turn?
>
> On 9/14/07, Jason House <[EMAIL PROTECTED]> wrote:
> > I know I'm only wading in the kiddie pool of computer go with my 1-ply
> bots,
> > but I think I may have found a useful enhancement to monte carlo.
> >
> > HouseBot supports three 1-ply search modes:
> >   1plyMC - Uniform sampling
> >   1plyShuffle - Uniform sampling with monte carlo transposition reuse
> >   1plyUCT - Non-uniform sampling based on the UCT algorithm (AKA UCB)
> >
> > Obviously, 1plyMC is far inferior to 1plyUCT as everyone probably
> expects.
> > What may surprise many is that 1plyShuffle defeats 1plyUCT nearly every
> > time.  I'm basing this on self-play data from CGOS.  Currently,
> > http://cgos.boardspace.net/9x9/cross/housebot-617-UCB.html
> > shows 10 matches that housebot-617-UCB has played against housebot-618-shuff.
> > housebot-617-UCB (1plyUCT) lost every time.
> >
> > While tricky, it should be possible to combine UCT and MCTR for an even
> > stronger bot.  MCTR can be thought of as a low bias alternative to the
> AMAF
> > heuristic.  Rather than using all moves, MCTR takes only the top N
> moves,
> > where N is computed based on which moves were played in the random game.
> > From an open board position MCTR uses about 1/3 of the moves that AMAF
> > would.  Computation of the resulting winning percentage must also be
> > weighted based on the probabilities of duplicating results (roughly
> > speaking, it's 1/N).
> >
> > As a result of using MCTR, winning rates are no longer integers as one
> would
> > expect.  Here's the estimated winning rates for all three algorithms
> when
> > asked for a white response to black G3:
> >
> > 1plyMC:   781 / 1272
> > 1plyShuffle:  140.15 /  231.75
> > 1plyUCT: 936 / 1515
> >
> > 1plyShuffle is slower because of the extra information tracking,
> but
> > the variance in estimates should be far lower than the numbers would
> > indicate.  I have yet to do the computations, but a sample size of
> 231.75
> > has an estimation error of around 6000 normal MC runs for that position.
> > That is why my implementation of MCTR is defeating my (1ply)
> implementation
> > of UCT.
>

Re: [computer-go] monte carlo transposition reuse (MCTR)

2007-09-14 Thread Chris Fant
Does it defeat it based on number of samples taken or time allotted per turn?

On 9/14/07, Jason House <[EMAIL PROTECTED]> wrote:
> I know I'm only wading in the kiddie pool of computer go with my 1-ply bots,
> but I think I may have found a useful enhancement to monte carlo.
>
> HouseBot supports three 1-ply search modes:
>   1plyMC - Uniform sampling
>   1plyShuffle - Uniform sampling with monte carlo transposition reuse
>   1plyUCT - Non-uniform sampling based on the UCT algorithm (AKA UCB)
>
> Obviously, 1plyMC is far inferior to 1plyUCT as everyone probably expects.
> What may surprise many is that 1plyShuffle defeats 1plyUCT nearly every
> time.  I'm basing this on self-play data from CGOS.  Currently,
> http://cgos.boardspace.net/9x9/cross/housebot-617-UCB.html
> shows 10 matches that housebot-617-UCB has played against housebot-618-shuff.
> housebot-617-UCB (1plyUCT) lost every time.
>
> While tricky, it should be possible to combine UCT and MCTR for an even
> stronger bot.  MCTR can be thought of as a low bias alternative to the AMAF
> heuristic.  Rather than using all moves, MCTR takes only the top N moves,
> where N is computed based on which moves were played in the random game.
> From an open board position MCTR uses about 1/3 of the moves that AMAF
> would.  Computation of the resulting winning percentage must also be
> weighted based on the probabilities of duplicating results (roughly
> speaking, it's 1/N).
>
> As a result of using MCTR, winning rates are no longer integers as one would
> expect.  Here's the estimated winning rates for all three algorithms when
> asked for a white response to black G3:
>
> 1plyMC:   781 / 1272
> 1plyShuffle:  140.15 /  231.75
> 1plyUCT: 936 / 1515
>
> 1plyShuffle is slower because of the extra information tracking, but
> the variance in estimates should be far lower than the numbers would
> indicate.  I have yet to do the computations, but a sample size of 231.75
> has an estimation error of around 6000 normal MC runs for that position.
> That is why my implementation of MCTR is defeating my (1ply) implementation
> of UCT.
>


[computer-go] monte carlo transposition reuse (MCTR)

2007-09-14 Thread Jason House
I know I'm only wading in the kiddie pool of computer go with my 1-ply bots,
but I think I may have found a useful enhancement to monte carlo.

HouseBot supports three 1-ply search modes:
  1plyMC - Uniform sampling
  1plyShuffle - Uniform sampling with monte carlo transposition reuse
  1plyUCT - Non-uniform sampling based on the UCT algorithm (AKA UCB)

Obviously, 1plyMC is far inferior to 1plyUCT as everyone probably expects.
What may surprise many is that 1plyShuffle defeats 1plyUCT nearly every
time.  I'm basing this on self-play data from CGOS.  Currently,
http://cgos.boardspace.net/9x9/cross/housebot-617-UCB.html shows 10 matches
that housebot-617-UCB has played against housebot-618-shuff.  housebot-617-UCB
(1plyUCT) lost every time.

While tricky, it should be possible to combine UCT and MCTR for an even
stronger bot.  MCTR can be thought of as a low-bias alternative to the AMAF
heuristic.  Rather than using all moves, MCTR takes only the top N moves,
where N is computed based on which moves were played in the random game.
From an open board position, MCTR uses about 1/3 of the moves that AMAF
would.  Computation of the resulting winning percentage must also be
weighted based on the probabilities of duplicating results (roughly
speaking, it's 1/N).

As a result of using MCTR, winning rates are no longer integers as one would
expect.  Here's the estimated winning rates for all three algorithms when
asked for a white response to black G3:

1plyMC:   781 / 1272
1plyShuffle:  140.15 /  231.75
1plyUCT: 936 / 1515

1plyShuffle is slower because of the extra information tracking, but the
variance in its estimates should be far lower than the raw numbers would
indicate.  I have yet to do the computations, but a sample size of 231.75
has an estimation error comparable to around 6000 normal MC runs for that
position.  That is why my implementation of MCTR is defeating my (1ply)
implementation of UCT.

Re: [computer-go] 2007 Cotsen Open wants your program to enter

2007-09-14 Thread Jason House
I am more than willing to provide a binary version of my program if someone
else provides the operator and (refundable) entrance fee.  While I doubt my
weak program would get best of the year, if it did win, I'd be fine with
giving the extra $$ to whoever paid the entrance fee.

On 9/13/07, Ray Tayek <[EMAIL PROTECTED]> wrote:
>
> At 12:49 PM 9/13/2007, you wrote:
> ...
> >I would also like to list the results from past years, as far as
> >computer participation goes.  But I can't find any Cotsen results
> >listed on the web.  Do you happen to know what program won in the
> >last two years?
>
> we've only done this twice. slug beat smart go the first time and
> slugo was the sole program entered last year.
>
> i know the time is short, but i am hoping that we have more programs
> in the mix this year.
>
> thanks
>
> ---
> vice-chair http://ocjug.org/
>
>