The second explanation was no clearer to me.  I'll try to criticize in
more detail:

1.  Uniform playouts, as used in practice, are not really uniform over
all legal go moves.  Generally, pass moves are excluded until
necessary, and moves that fill "eyelike" points are excluded.  So, I
assume that when you use the word "legal", you mean admissible within
this sort of playout.

2.  "That variance depends on the length of the playout."  It is
difficult for me to make sense of this statement, simply because not
all playouts from a given position have the same length.  My best
guess is that you are claiming that the longer a playout is, the more
likely it is that its result differs from the result under correct
play.  However, I strongly doubt that this is true for all starting
positions.  (Imagine that the first player needs to prevent the second
player from forming two eyes in a large group.  After doing this, that
group will eventually be captured, allowing playouts to continue
longer by filling the intersections that it once occupied.  Failing to
kill this group may allow the playouts to complete much more quickly,
but gives inaccurate results.)

2.5.  "The variance of the stochastic process is not to be mixed up
with the distribution of the error of a repeated Bernoulli
experiment!"  Perhaps I have mixed
them up.  Can you explain more clearly or precisely what "the variance
of the stochastic process" is?  Do you perhaps mean some measurement
of variation across different starting points, rather than across
different Bernoulli trials from the same starting position?  Or, do
you mean to distinguish the probability that a playout's outcome
differs from the outcome under correct play, from the probability that
a playout results in a win?  (Although those are just two different
Bernoulli experiments, right?)  Or is there some subtlety that I have
missed?

3.  'p is a biased towards 1/2 "estimator" of W'.  Consider the game:

    o
   / \
  o   o
 / \  |
1   0 0

(1 is a win by the first player, and 0 is a loss.)  There is a move
that could allow the first player to win, if the second player does
not respond to it correctly.  This sounds like a realistic scenario
for go.

W = 1/3
p = 1/4

p is further from 1/2 than W.  Does this game violate the condition
that "the number of legal moves for each side is balanced"?  (It is
still not clear to me what this condition is that you are attempting
to impose.)  Or, was I supposed to calculate a statistic across
multiple game trees where W=1/3, in order to interpret p as an
"estimator" of W?

4.  Even if we can compute W exactly, do we have any reason to think
that its value is a good estimate of the minimax value of the game?
Is it even a better estimate than p, which we can already estimate
accurately?  (Note that in the game tree above, it is not.)  My
offhand guess is that it would not be as good.
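
For anyone who wants to check the numbers in points 3 and 4, here is a
minimal Python sketch of the three-leaf game tree above.  It rests on
my reading of the terms, which may not be what was intended: W is
taken to be the fraction of terminal positions won by the first
player, p is the first player's win probability when every move is
chosen uniformly at random, and the first player moves at the root.
The nested-list encoding and the function names are made up for this
illustration.

```python
from fractions import Fraction

# Hypothetical encoding of the tree above: a leaf is 1 (first player
# wins) or 0 (first player loses); an internal node is a list of children.
TREE = [[1, 0], [0]]

def leaves(node):
    """Collect the outcomes of all terminal positions."""
    if isinstance(node, int):
        return [node]
    return [leaf for child in node for leaf in leaves(child)]

def playout_win_probability(node):
    """Win probability when each move is chosen uniformly at random."""
    if isinstance(node, int):
        return Fraction(node)
    return sum(playout_win_probability(c) for c in node) / len(node)

def minimax(node, first_player_to_move=True):
    """Game value under correct play; the first player maximizes."""
    if isinstance(node, int):
        return node
    values = [minimax(c, not first_player_to_move) for c in node]
    return max(values) if first_player_to_move else min(values)

outcomes = leaves(TREE)
W = Fraction(sum(outcomes), len(outcomes))  # fraction of winning leaves
p = playout_win_probability(TREE)           # uniform-playout win rate
value = minimax(TREE)                       # result under correct play

print("W =", W, " p =", p, " minimax =", value)
```

Under these assumptions the script gives W = 1/3, p = 1/4, and a
minimax value of 0, so in this tree p is both further from 1/2 than W
and closer to the minimax value than W.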

Weston