Some months ago I did several experiments with using tactics and
patterns in playouts. Generally I found a big boost in strength from
tactics. I also found a boost from patterns, but with severely
diminishing returns after a certain number, and it even became
detrimental with a large number of patterns (thousands to tens of
thousands). Since I was using a generalized pattern-matcher, it slowed
things down considerably. Although the program played a lot better
with the same number of playouts, when I compared MC playouts with
patterns to MC playouts without patterns using the same amount of CPU
time, the gain was not so obvious. Since most of the strength gain
came from just a few patterns, I concluded, as David did, that it was
probably better to use just a handful of hard-coded patterns during
playouts.
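To make the idea concrete, here is a minimal sketch of what "a handful of hard-coded patterns" in a light playout policy might look like. The board encoding, the pattern set, and all function names are my own illustrative assumptions, not taken from any actual program discussed here:

```python
# Sketch: a handful of hard-coded 3x3 patterns in a light playout policy.
# All names, patterns, and the neighborhood encoding are illustrative
# assumptions, not an actual engine's implementation.

import random

# Hard-coded 3x3 patterns around a candidate point, written as
# 9-character strings (rows top to bottom). 'X' = our stone,
# 'O' = opponent, '.' = empty, '?' = don't care. These particular
# patterns are placeholders, not a tuned set.
HARD_CODED_PATTERNS = [
    "XO?"
    "..."
    "???",
    "?O?"
    "X.X"
    "???",
]

def matches(pattern: str, neighborhood: str) -> bool:
    """Match a concrete 9-char neighborhood against one pattern ('?' matches anything)."""
    return all(p == '?' or p == n for p, n in zip(pattern, neighborhood))

def pattern_moves(candidates):
    """Return candidate moves whose 3x3 neighborhood matches any pattern.

    `candidates` is a list of (move, neighborhood) pairs; in a real
    program the neighborhood would be read off the board around the move.
    """
    return [m for m, nb in candidates
            if any(matches(p, nb) for p in HARD_CODED_PATTERNS)]

def select_playout_move(candidates, rng=random):
    """Prefer a pattern-matching move; otherwise fall back to a uniformly random one."""
    hits = pattern_moves(candidates)
    pool = hits if hits else [m for m, _ in candidates]
    return rng.choice(pool)
```

Because the set is tiny and fixed, the matching cost per move stays near-constant, which is exactly the trade-off versus a generalized matcher with thousands of patterns.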
I only recently started to do real experiments with hard-coded
patterns, and so far my results are rather inconclusive. I found that
when mixing different things it's not always clear what contributes
to any observed increase in strength, so I'm still in the process of
trying to dissect what is actually contributing where. For example, I
found that a lot of the increased level of play from patterns does
not come from using them during playouts, but from the effect they
have on move-exploration. I don't know if this is due to my
particular way of implementing MC playouts in combination with UCT
search, but moves matching a pattern are (usually) automatically
tried first during tree-expansion as well. So far I'm observing that
most of the increase in level comes from the selection during
exploration, and only a small part from the selection during simulation.
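One plausible mechanism for this effect is giving pattern-matching children prior (virtual) win/visit statistics at expansion time, so UCT tries them first in the tree. The sketch below shows that idea; the prior sizes, the first-play-urgency constant, and the node fields are all my assumptions for illustration, not a description of anyone's actual implementation:

```python
# Sketch: biasing UCT *tree* selection with pattern priors, as opposed
# to biasing the playouts themselves. Prior sizes, the FPU constant,
# and field names are illustrative assumptions.

import math

class Node:
    def __init__(self, move, matches_pattern=False,
                 prior_wins=5, prior_visits=10):
        self.move = move
        # Pattern-matching children start with optimistic virtual
        # statistics (here 5 wins in 10 visits), so UCT explores them
        # first; other children start unvisited.
        self.wins = prior_wins if matches_pattern else 0
        self.visits = prior_visits if matches_pattern else 0

def uct_value(child, parent_visits, c=1.0, fpu=0.5):
    """UCT score for one child. Unvisited children get a first-play-
    urgency constant instead of +infinity, so the pattern priors
    actually change the exploration order."""
    if child.visits == 0:
        return fpu
    return (child.wins / child.visits
            + c * math.sqrt(math.log(parent_visits) / child.visits))

def select_child(children, parent_visits):
    """Pick the child maximizing the UCT score."""
    return max(children, key=lambda ch: uct_value(ch, parent_visits))
```

Under this scheme the patterns shape which branches get visits early in the search, which would show up as a gain in "exploration" even if the playout policy itself were unchanged.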
For example, in one particular experiment using just 5 patterns, I
saw a win-rate of 65% against the same program not using patterns
(with the same number of playouts). But when I did not use the
patterns during exploration, the win-rate dropped to just 55%.
I still have a lot of testing to do, and it's too early to draw any
hard conclusions. But I think it's worthwhile trying to distinguish
where the strength is actually gained. Better yet would be finding
out exactly 'why' it gained strength, because with MC playouts I
often find test results highly counter-intuitive, occasionally to the
point of being (seemingly) nonsensical.
I also think what Don was proposing with his reference bot could be
interesting: trying to make it play at around 1700 Elo on CGOS using
just 5,000 (light) playouts. I don't know if it's possible, but I
think it's a fruitful exercise. At a time when most people are
looking at using more and more hardware to increase playing strength,
knowing what plays best at the other end of the spectrum is valuable
as well. By that I mean finding what plays best under severely
constrained resources.
Mark
_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/