To: computer-go computer-go@computer-go.org
Sent: Wed, 9 Jan 2008 3:46 pm
Subject: Re: [computer-go] How to design the stronger playout policy?
Mark Boon wrote:
On 8-jan-08, at 17:04, Don Dailey wrote:
And yes, it slows down the play-outs. Still, the play-outs seem to
require a good bit of randomness - certainly they cannot be
deterministic and it seems difficult to find the general principles that
are important to the
On 5-jan-08, at 11:48, Gian-Carlo Pascutto wrote:
Would you explain the details of the playout policy?
(1) Captures of groups that could not save themselves last move.
(2) Save groups in atari due to last move by capturing or extending.
(3) Patterns next to last move.
(4) Global moves.
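One way to read the four priorities above is as an ordered list of move generators: try each in turn and play from the first one that yields candidates. The sketch below is only an illustration of that reading, not the author's code. The name `choose_playout_move`, the use of callables as generators, and the uniform choice within a tier are all assumptions.

```python
import random


def choose_playout_move(generators, rng=random):
    """Return a move from the first generator that yields any candidates.

    `generators` is a priority-ordered list of zero-argument callables,
    each returning a list of candidate moves (hypothetical interface).
    Choosing uniformly within a tier is an assumption.
    """
    for generate in generators:
        candidates = generate()
        if candidates:
            return rng.choice(candidates)
    return None  # nothing to play: treat as a pass


# Usage: tiers (1) and (2) find nothing, so tier (3) supplies the move.
move = choose_playout_move([
    lambda: [],            # (1) captures
    lambda: [],            # (2) save groups in atari
    lambda: ["C3"],        # (3) patterns next to last move
    lambda: ["A1", "B2"],  # (4) global moves
])
```

In a real engine each generator would inspect the board around the last move; here that work is left to the caller.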
I think Dave Hillis coined the term "heavy playouts".
In the first programs, the play-outs were uniformly random. Any move
would get played with equal likelihood, with the exception of eye-filling
moves, which of course don't get played at all.
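A uniformly random play-out move with the eye-filling exception might look roughly like the following. This is a simplified sketch under stated assumptions: the board is a square grid of characters, an "eye" is any empty point whose orthogonal neighbours are all the mover's stones (diagonals are ignored), and suicide and ko are not checked. All names are hypothetical.

```python
import random

EMPTY, BLACK, WHITE = ".", "X", "O"


def neighbors(board, r, c):
    """Yield the on-board orthogonal neighbours of (r, c)."""
    size = len(board)
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < size and 0 <= nc < size:
            yield nr, nc


def fills_own_eye(board, r, c, color):
    """Simplified eye test: every orthogonal neighbour is `color`."""
    return all(board[nr][nc] == color for nr, nc in neighbors(board, r, c))


def uniform_playout_move(board, color, rng=random):
    """Pick uniformly among empty points that do not fill an own eye."""
    size = len(board)
    candidates = [
        (r, c)
        for r in range(size)
        for c in range(size)
        if board[r][c] == EMPTY and not fills_own_eye(board, r, c, color)
    ]
    return rng.choice(candidates) if candidates else None
```

Real programs usually also check the diagonal points before calling something an eye; that refinement is omitted here.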
But it was found that the program improves if
It would be exciting to see your team get involved with this Monte
Carlo stuff, especially since you have some previous experience with
this.
- Don
David Doshay wrote:
I have been interested in Monte Carlo approaches to Go since running
my first MC simulations in magnetic phase
Quoting Don Dailey [EMAIL PROTECTED]:
And yes, it slows down the play-outs. Still, the play-outs seem to
require a good bit of randomness - certainly they cannot be
deterministic and it seems difficult to find the general principles that
are important to the play-out policy.
Not all changes
I have been interested in Monte Carlo approaches to Go since running
my first MC simulations in magnetic phase transitions when I was in
graduate school in the 1980s. What held me back, even when the latest
crop of MC programs started winning against older stronger programs
and my program
Yamato wrote:
I guess the current top programs have much better playout policy than
the classical MoGo-style one.
The original policy of MoGo was:
(1) If the last move is an atari, play one saving move at random.
(2) If there are interesting moves in the 8 positions around the
last move,
Gian-Carlo Pascutto wrote:
What improvements did you try? The obvious ones I know are prioritizing
saving and capturing moves by the size of the string.
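"Prioritizing by the size of the string" could be done in several ways. One possibility, shown here purely as an assumption, is to pick among the saving or capturing moves with probability proportional to the number of stones in the string involved. The function name and the `(move, string_size)` pair format are hypothetical.

```python
import random


def pick_by_string_size(moves, rng=random):
    """Choose a move with probability proportional to its string size.

    `moves` is a list of (move, string_size) pairs with positive sizes.
    Proportional weighting is an assumption; always taking the largest
    string would be another reading of "prioritizing".
    """
    if not moves:
        return None
    points = [move for move, _ in moves]
    weights = [size for _, size in moves]
    return rng.choices(points, weights=weights, k=1)[0]
```

With candidates `[("big", 9), ("small", 1)]`, the move tied to the nine-stone string is chosen about nine times out of ten.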
Zen appears quite strong on CGOS. Leela using the above system was
certainly weaker.
I use the static ladder search in playouts. For example,
Lazarus uses a system very similar to the original MoGo policy as
documented in the paper. However I did find one significant
improvement. I used Rémi's ELO system to rate patterns and I simply
throw out moves which match the weakest patterns in the play-outs. In
the tree, I also throw out
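Throwing out moves that match the weakest patterns can be sketched as a simple filter over rated candidates. The following is a hypothetical illustration, not Lazarus code: the rating table, the pattern names, the cutoff value, and the fallback that keeps the original list when every move would be pruned are all assumptions.

```python
def prune_weak_moves(candidates, pattern_rating, cutoff):
    """Drop candidate moves whose matched pattern is rated below `cutoff`.

    `candidates` is a list of (move, pattern_id) pairs and
    `pattern_rating` maps pattern_id to a learned rating.
    Patterns with no rating are kept (assumption).
    """
    kept = [
        (move, pattern)
        for move, pattern in candidates
        if pattern_rating.get(pattern, cutoff) >= cutoff
    ]
    # Assumption: never prune everything, so the play-out can continue.
    return kept or list(candidates)
```

The play-out then selects among the surviving moves exactly as it did before pruning.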
Yamato wrote:
I finally improved my playouts by using Remi's ELO system to learn a set
of interesting patterns, and just randomly fiddling with the
probabilities (compressing/expanding) until something improved my
program in self-play with about +25%. Not a very satisfying method or an
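The message does not say how the probabilities were compressed or expanded. One common way to do it, offered here only as an assumed example, is to raise each probability to a power and renormalize: an exponent below 1 flattens the distribution, an exponent above 1 sharpens it. The function name and parameter are hypothetical.

```python
def reshape_probabilities(probs, exponent):
    """Raise each probability to `exponent`, then renormalize to sum to 1.

    exponent < 1 compresses (flattens) the distribution and
    exponent > 1 expands (sharpens) it. Assumes at least one
    probability is positive.
    """
    powered = [p ** exponent for p in probs]
    total = sum(powered)
    if total <= 0:
        raise ValueError("at least one probability must be positive")
    return [p / total for p in powered]
```

Tuning then amounts to trying a few exponents and keeping whichever does best in self-play.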
Don Dailey wrote:
Lazarus uses a system very similar to the original MoGo policy as
documented in the paper. However I did find one significant
improvement. I used Rémi's ELO system to rate patterns and I simply
throw out moves which match the weakest patterns in the play-outs. In
the tree,
I guess the current top programs have much better playout policy than
the classical MoGo-style one.
The original policy of MoGo was:
(1) If the last move is an atari, play one saving move at random.
(2) If there are interesting moves in the 8 positions around the
last move, play one