On Sun, 2007-04-08 at 18:11 +0200, Edward de Grijs wrote:
> Hello Don,
> 
> >A few weeks ago I announced that I was doing a long term
> >scalability study with computer go on 9x9 boards.
> >I have constructed a graph of the results so far:
> 
> Your work and insight keep on amazing me.
> 
> If I understand it correctly, the playouts are all made from the
> root position?
> I am very interested in the results when the larger number of
> playouts is made not from the root position, but from the
> point where the UCT part is over and the playouts begin,
> especially with larger numbers of playouts.
> I remember a statement of yours that it made no
> significant difference if the playouts are multiplied from
> that position, instead of the root position, even with
> hundreds of playouts...
> Does that statement still hold?
> 
> Edward de Grijs.

Hi Edward,

Thank you for the kind remarks.

I count the total play-outs: version 0 does 1024 play-outs,
period, and then the search is stopped cold and a move is returned.

It makes the program weaker if you just do N play-outs from the
leaf position of UCT.   But it does not seem to hurt the program
at all to not expand the existing CHILDREN of a node, until the
parent has been visited, say 100 times.   I have not tried anything
larger than 100, but if it's weaker with 100, I haven't been able
to measure it.    This really is a huge benefit in memory usage,
so I like 100.   Of course this means some children will get 2 or 
more visits before getting expanded.    There is no need for 
this if you have memory to burn.  
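
To make the delayed-expansion idea concrete, here is a minimal sketch in
Python. It is not Lazarus code. To stay small and runnable it uses a toy
take-away game (remove 1 to 3 stones, whoever takes the last stone wins)
instead of Go, and every name in it (Node, EXPAND_THRESHOLD, search, the
UCB constant c=1.4) is my own assumption for illustration. The only part
taken from the text above is the rule that a node's children are not
created until the node itself has been visited a set number of times.

```python
import math
import random

EXPAND_THRESHOLD = 100  # assumed name; visits a node needs before its children are allocated


class Node:
    def __init__(self, stones, to_move):
        self.stones = stones      # stones left in the toy game
        self.to_move = to_move    # player (0 or 1) to move at this node
        self.visits = 0
        self.wins = 0.0           # wins for the player who moved INTO this node
        self.children = None      # None means "not expanded yet"


def legal_moves(stones):
    return [m for m in (1, 2, 3) if m <= stones]


def expand(node):
    node.children = {m: Node(node.stones - m, 1 - node.to_move)
                     for m in legal_moves(node.stones)}


def playout(stones, to_move, rng):
    """Uniformly random ("light") playout; returns the winning player."""
    while True:
        stones -= rng.choice(legal_moves(stones))
        if stones == 0:
            return to_move        # this player took the last stone
        to_move = 1 - to_move


def select_child(node, c=1.4):
    """Plain UCB1 selection; unvisited children are tried first."""
    def ucb(child):
        if child.visits == 0:
            return float("inf")
        mean = child.wins / child.visits
        return mean + c * math.sqrt(math.log(node.visits) / child.visits)
    return max(node.children.values(), key=ucb)


def simulate(node, rng, threshold):
    """One descent from `node`, costing at most one playout; returns the winner."""
    if node.stones == 0:
        winner = 1 - node.to_move             # previous player took the last stone
    else:
        # Children are only created once the node has collected enough visits.
        if node.children is None and node.visits >= threshold:
            expand(node)
        if node.children is None:
            winner = playout(node.stones, node.to_move, rng)
        else:
            winner = simulate(select_child(node), rng, threshold)
    node.visits += 1
    if winner != node.to_move:
        node.wins += 1
    return winner


def count_nodes(node):
    if node.children is None:
        return 1
    return 1 + sum(count_nodes(ch) for ch in node.children.values())


def search(stones, playouts, threshold=EXPAND_THRESHOLD, seed=0):
    """Run a fixed number of descents; return (most visited move, nodes allocated)."""
    rng = random.Random(seed)
    root = Node(stones, 0)
    expand(root)                              # the root is always expanded
    for _ in range(playouts):
        simulate(root, rng, threshold)
    best = max(root.children, key=lambda m: root.children[m].visits)
    return best, count_nodes(root)


if __name__ == "__main__":
    for threshold in (1, 100):
        move, nodes = search(15, 4000, threshold=threshold)
        print("threshold", threshold, "move", move, "nodes", nodes)
```

Running it with thresholds 1 and 100 at the same play-out count shows the
memory side of the trade-off in the node counts. It says nothing about
playing strength in Go; that has to be measured with games.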

I'm not sure what experiment you are proposing, but at some
point it might be interesting to see how other values work.
If you have a computer with very little memory to spare, you
could probably use much larger numbers although at some point
you would surely experience a noticeable weakening.

If you are proposing doing 2, 3, 4 or more play-outs at the
point where you normally do one,  I think it strengthens the
program - but not enough to justify the extra work.   In other
words doing 2 play-outs doubles the amount of time spent on
a move and it does not increase the strength enough to 
justify this.
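
The cost side of that trade-off can be sketched in a few lines of Python.
This is only an illustration under my own assumptions: toy_playout and its
win probability are hypothetical stand-ins for a real Go play-out, and the
helper names are made up. It shows that averaging k play-outs at a leaf
costs k play-outs per descent, so if the total play-out count were held
fixed instead (for example the 1024 of version 0), the number of tree
descents would drop by a factor of k.

```python
import random


def toy_playout(rng, p_win=0.55):
    """Hypothetical stand-in for a Go play-out: 1.0 for a win, 0.0 for a loss."""
    return 1.0 if rng.random() < p_win else 0.0


def evaluate_leaf(playout, k, rng):
    """Average k independent play-out results at one leaf."""
    return sum(playout(rng) for _ in range(k)) / k


def tree_descents(total_playouts, k):
    """Tree descents affordable when every leaf evaluation costs k play-outs."""
    return total_playouts // k


if __name__ == "__main__":
    rng = random.Random(0)
    for k in (1, 2, 4):
        print("k", k,
              "descents for 1024 play-outs", tree_descents(1024, k),
              "sample leaf value", evaluate_leaf(toy_playout, k, rng))
```

Whether the less noisy leaf value is worth the lost descents is exactly
the empirical question; the sketch only makes the bookkeeping explicit.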


- Don

> 
> 
> 
> >From: Don Dailey <[EMAIL PROTECTED]>
> >Reply-To: [EMAIL PROTECTED], computer-go <computer-go@computer-go.org>
> >To: computer-go <computer-go@computer-go.org>
> >Subject: [computer-go] The physics of Go playing strength.
> >Date: Sat, 07 Apr 2007 21:05:19 -0400
> >
> >A few weeks ago I announced that I was doing a long term
> >scalability study with computer go on 9x9 boards.
> >
> >I have constructed a graph of the results so far:
> >
> >   http://greencheeks.homelinux.org:8015/~drd/public/study.jpg
> >
> >Although I am still collecting data, I feel that I have
> >enough samples to report some results - although I will
> >continue to collect samples for a while.
> >
> >This study is designed to measure the improvement in
> >strength that can be expected with each doubling of computer
> >resources.
> >
> >I'm actually testing 2 programs - both of them UCT style go
> >programs, but one of those programs does uniformly random
> >play-outs and the other much stronger one is similar to
> >Mogo, as documented in one of their papers.
> >
> >Dave Hillis coined the terminology I will be using, light
> >play-outs vs heavy play-outs.
> >
> >For the study I'm using 12 versions of each program.  The
> >weakest version starts with 1024 play-outs in order to
> >produce a move.  The next version doubles this to 2048
> >play-outs, and so on until the 12th version which does 2
> >million (2,097,152) playouts.  This is a substantial study
> >which has taken weeks so far to get to this point.
> >
> >Many of the faster programs have played close to 250 games,
> >but the highest levels have only played about 80 games so
> >far.
> >
> >The scheduling algorithm is very similar to the one used by
> >CGOS.  An attempt is made not to waste a lot of time playing
> >seriously mis-matched opponents.
> >
> >The games were rated and the results graphed.  You can see
> >the result of the graph here (which I also included near the
> >top of this message):
> >
> >   http://greencheeks.homelinux.org:8015/~drd/public/study.jpg
> >
> >The x-axis is the number of doublings starting with 1024
> >play-outs and the y-axis is the ELO rating.
> >
> >The public domain program GnuGo version 3.7.9 was assigned
> >the rating 2000 as a reference point.  On CGOS, this program
> >has achieved 1801, so in CGOS terms all the ratings are
> >about 200 points optimistic.
> >
> >Feel free to interpret the data any way you please, but here
> >are my own observations:
> >
> >   1.  Scalability is almost linear with each doubling.
> >
> >   2.  But there appears to be a very gradual fall-off with
> >       time - which is what one would expect (ELO
> >       improvements cannot be infinite so they must be
> >       approaching some limit.)
> >
> >   3.  The heavy-playout version scales at least as well,
> >       if not better, than the light play-out version.
> >
> >       (You can see the rating gap between them gradually
> >       increase with the number of play-outs.)
> >
> >   4.  The curve is still steep at 2 million play-outs; this
> >       is convincing empirical evidence that there are a few
> >       hundred ELO points worth of improvement possible
> >       beyond this.
> >
> >   5.  GnuGo 3.7.9 is not competitive with the higher levels of
> >       Lazarus.  However, what the study doesn't show is that
> >       Lazarus needs 2X more thinking time to play equal to
> >       GnuGo 3.7.9.
> >
> >
> >This graph explains why I feel that absolute playing
> >strength is a poor conceptual model of how humans or
> >computers play go.  If Lazarus was running on the old Z-80
> >processors of a few decades ago, it would be viewed as an
> >incredibly weak program, but running on a supercomputer it's
> >a very strong program.  But in either case it's the SAME
> >program.  The difference is NOT the amount of work each
> >system is capable of, it's just that one takes longer to
> >accomplish a given amount of work.  It's much like the
> >relationships between power, work, force, time etc.  in
> >physics.
> >
> >Based on this type of analysis and the physics analogy,
> >GnuGo 3.7.9 is a stronger program than Lazarus (even at 9x9
> >go).  Lazarus requires about 2X more time to equalize.  So
> >Lazarus plays with less "force" (if you use the physics
> >analogy) and needs more TIME to get the same amount of work
> >done.
> >
> >ELO is treated numerically as if it were "work" in physics
> >because when it's measured by playing games, both players
> >get the same amount of time.  The time factor cancels out
> >but it causes us to ignore that it's part of the equation.
> >
> >On CGOS, Lazarus and FatMan are the same program, but one
> >does much more work and they have ELO ratings that differ by
> >almost 300 ELO points.   Even though they are the same
> >program you will look on CGOS and believe Lazarus is much
> >stronger because you have not considered the physics of Go
> >playing strength.
> >
> >- Don
> >
> >
> >_______________________________________________
> >computer-go mailing list
> >computer-go@computer-go.org
> >http://www.computer-go.org/mailman/listinfo/computer-go/
> 
