Some months ago I did several experiments with using tactics and patterns in playouts. Generally I found a big boost in strength using tactics. I also found a boost in strength using patterns, but with severely diminishing returns after a certain number, even becoming detrimental with a large number of patterns (thousands to tens of thousands). Since I was using a generalized pattern-matcher, it slowed things down considerably. Although the program played a lot better with the same number of playouts, when I compared MC playouts with patterns to MC playouts without patterns using the same amount of CPU time, the gain was not so obvious. Since most of the gain in strength came from just a few patterns, I concluded, just as David did, that it was probably better to use just a handful of hard-coded patterns during playouts.
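
To make concrete what I mean by "a handful of hard-coded patterns during playouts", here is a rough sketch (not my actual code; the board representation, the pattern table and all helper names are made up for illustration): a few fixed 3x3 shapes around the last move are checked, and if one matches, the suggested reply is played, otherwise the playout falls back to a random legal move.

    import random

    EMPTY, BLACK, WHITE = 0, 1, 2

    # Each hard-coded pattern is a 3x3 grid of expected colours (None = don't
    # care) plus the offset of the suggested reply relative to the last move.
    # The single entry below is just a placeholder shape.
    HARD_CODED_PATTERNS = [
        (((None, WHITE, None),
          (BLACK, WHITE, None),
          (None, None, None)), (1, 1)),
    ]

    def neighbourhood(board, x, y):
        """3x3 colours centred on (x, y); board is a dict {(x, y): colour}."""
        return tuple(tuple(board.get((x + dx, y + dy), EMPTY)
                           for dx in (-1, 0, 1))
                     for dy in (-1, 0, 1))

    def matches(pattern, area):
        return all(p is None or p == a
                   for prow, arow in zip(pattern, area)
                   for p, a in zip(prow, arow))

    def playout_move(board, last_move, legal_moves):
        # Try the hard-coded patterns around the last move first; if none of
        # them match (or the reply is illegal), pick uniformly at random.
        if last_move is not None:
            area = neighbourhood(board, *last_move)
            for pattern, (dx, dy) in HARD_CODED_PATTERNS:
                reply = (last_move[0] + dx, last_move[1] + dy)
                if matches(pattern, area) and reply in legal_moves:
                    return reply
        return random.choice(legal_moves) if legal_moves else None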

I only recently started to do real experiments with hard-coded patterns, and so far my results are rather inconclusive. I found that when mixing different things it's not always clear what contributes to the increased strength observed, so I'm still in the process of trying to dissect what is actually contributing where. I found, for example, that a lot of the increased level of play using patterns does not come from using them during playouts but from the effect they have on move exploration. I don't know if this is due to my particular way of implementing MC playouts in combination with UCT search, but moves matching a pattern (usually) automatically get tried first in the tree expansion as well. So far I'm observing that most of the increase in level comes from the move selection during exploration and only in small part from the selection during simulation.
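
To illustrate the distinction (exploration vs. simulation), here is a hypothetical sketch of the exploration side, with all names made up: when a node is expanded, the children whose moves match a pattern are simply ordered first, so plain UCT selection visits them first, entirely independently of whatever the playouts do.

    import math

    class Node:
        def __init__(self, move):
            self.move = move
            self.wins = 0
            self.visits = 0
            self.children = []

    def expand(node, candidate_moves, matches_pattern):
        # Moves that match a pattern are placed first, so they are the first
        # children to be visited after expansion.  This is the "exploration"
        # effect, separate from using patterns inside the playouts.
        ordered = sorted(candidate_moves, key=lambda m: not matches_pattern(m))
        node.children = [Node(m) for m in ordered]

    def select_child(node, c=1.0):
        # Plain UCT: unvisited children are taken in stored order, which is
        # exactly where the pattern-based ordering above has its effect.
        for child in node.children:
            if child.visits == 0:
                return child
        return max(node.children,
                   key=lambda ch: ch.wins / ch.visits
                       + c * math.sqrt(math.log(node.visits) / ch.visits))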

For example, in one particular experiment using just 5 patterns I saw a win-rate of 65% against the same program not using patterns (with the same number of playouts). But when I didn't use the patterns during exploration, the win-rate dropped to just 55%.

I still have a lot of testing to do and it's too early to draw any hard conclusions. But I think it's worthwhile trying to distinguish where the strength is actually gained. Better yet would be finding out exactly 'why' it gained strength, because with MC playouts I often find testing results highly counter-intuitive, occasionally to the point of being (seemingly) nonsensical.

I also think what Don was proposing with his reference bot could be interesting: trying to make it play at around 1700 ELO on CGOS using just 5,000 (light) playouts. I don't know if it's possible, but I think it's a fruitful exercise. At a time when most people are looking at using more and more hardware to increase playing strength, knowing what plays best at the other end of the spectrum is valuable as well. By that I mean finding out what plays best with severely constrained resources.
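
Just to pin down the term, a 'light' playout here would be something like the following sketch (helper names assumed): moves are picked uniformly at random among the legal moves, except that a player never fills one of its own single-point eyes; no patterns, no tactics.

    import random

    def light_playout_move(legal_moves, is_own_eye):
        # Uniform random choice over legal moves, excluding own one-point eyes.
        candidates = [m for m in legal_moves if not is_own_eye(m)]
        if not candidates:
            return None  # pass
        return random.choice(candidates)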

Mark
