Weston Markham wrote:

> But of course, it's not the size of the win that counts, it is rather
> the confidence that it really is a win.

Yes, and my reasoning was that a larger average win implied a higher confidence since there is more room for error. That intuition may not hold, though.
> In random playouts that
> continue from a position from a close game, the ones that result in a
> large victory are generally only ones where the opponent made a severe
> blunder.  (Put another way, the score of the game is affected more by
> how bad the bad moves are, rather than how good the good ones are, or
> even how good most of the moves are.  Others have commented on this
> effect in this list, in other contexts.)  Since you can't count on
> that happening in the real game, these simulations have a lower value
> in the context of ensuring a win.

That is the first argument I've heard that makes some sense of why this effect may be real. The opposite may of course be true as well: games that look close may not really be close, due to the same blunder effect. Perhaps it is just another symptom of the fact that most playouts are nonsense games.

> (snip)

> Given that people have reported such a strong effect, I am actually
> wondering if these simulations (those that result in a large score
> difference) should be _penalized_, for not being properly
> representative of the likely outcome of the game.  In other words:
>
> value = 1000 * win - score
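
For concreteness, that would rank a half-point win above a forty-point one, since the score term only separates results with the same winner. A minimal sketch in Python, assuming win is 1 for a win and 0 for a loss, and score is the final margin in points:

# Value function quoted above: a win is worth 1000 minus the margin,
# so a blowout win is valued slightly below a narrow win.
def playout_value(win, score):
    return 1000 * win - score

print(playout_value(1, 0.5))    # narrow win  -> 999.5
print(playout_value(1, 40.5))   # blowout win -> 959.5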

Instead of penalizing these simulations, what about keeping a frequency distribution of the simulation scores, throwing out the lower and upper extremes, and then using the average of what remains? With the extremes trimmed, it might be safer to use the scoring information in the evaluation. The same frequency distribution might also give a kind of quiescence or trust measure over a set of simulations: a single tight cluster of scores coming back from a position would suggest the evaluation can be trusted.
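
Here is a minimal sketch of what I mean, assuming we just have the raw playout scores for a node; the trim fraction is an arbitrary choice:

# Trimmed mean of playout scores: drop the most extreme results on each
# side before averaging, so blunder-driven blowouts carry less weight.
def trimmed_mean(scores, trim_fraction=0.2):
    ordered = sorted(scores)
    k = int(len(ordered) * trim_fraction)         # how many to drop per tail
    kept = ordered[k:len(ordered) - k] or ordered
    return sum(kept) / len(kept)

scores = [2.5, 3.5, 0.5, 1.5, 60.5, -70.5, 4.5]   # margins from playouts
print(trimmed_mean(scores))                       # 2.5: the two blowouts are dropped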

The real problem is with the poor simulations. Is there a way to measure the quality of a simulation somehow? If this were feasible, having the scores and confidence factors for each simulation would be pretty powerful for UCT evaluation, wouldn't it? Could the number of captured stones during a random simulation be an indicator? Are there other (cheap) heuristics that could be used to recognize nonsensical patterns during the playout?
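
For example, something like a quality-weighted win rate; this is only a sketch, and the capture-count heuristic and its scaling constant are just the guess from the question above, not something tested:

# Weight each playout by a crude quality estimate. Here the estimate is
# based on how many stones were captured during the playout, on the
# (untested) guess that heavy capturing in a random game signals blunders.
def playout_quality(captures, max_captures=30):
    return max(0.0, 1.0 - captures / max_captures)   # 1.0 = trusted, 0.0 = nonsense

def weighted_win_rate(results):
    # results: list of (win, captures) pairs, win being 1 or 0
    total = sum(playout_quality(c) for _, c in results)
    if total == 0.0:
        return 0.5   # nothing trustworthy; treat the node as unknown
    return sum(w * playout_quality(c) for w, c in results) / total

results = [(1, 4), (0, 25), (1, 8), (1, 40)]   # (win, captured stones)
print(weighted_win_rate(results))              # ~0.91 with these numbers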

A while back I suggested a "stratified sampling" method, where several different simulation distributions would each contribute playouts and their results would be combined, to combat the weaknesses of any single simulation method. Does anyone have any thoughts about this?
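
Roughly what I have in mind, as a sketch; the two policies here are just stand-ins, and the equal weighting and per-policy quota are arbitrary choices:

import random

# Stratified sampling over several playout policies: run a fixed quota of
# playouts under each policy, then combine the per-policy win rates so that
# no single policy's bias dominates the evaluation of the node.
def evaluate(position, policies, playouts_per_policy=100):
    means = []
    for policy in policies:
        wins = sum(policy(position) for _ in range(playouts_per_policy))
        means.append(wins / playouts_per_policy)
    return sum(means) / len(means)   # equal-weight combination of strata

# Toy stand-ins for real playout policies; each "playout" just returns 0 or 1.
uniform_random = lambda pos: random.randint(0, 1)
capture_biased = lambda pos: random.randint(0, 1)

print(evaluate(None, [uniform_random, capture_biased]))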

-Matt


_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
