On Tue, Oct 5, 2010 at 18:40, <valky...@phmp.se> wrote:
No information is thrown away with maximizing win rate.
Let's take a VERY idealised example.
Play your playouts until 50 moves before the end. The last fifty moves not being perfect can be seen as adding noise to the score: say a centred binomial with mean 0 and width 10, on n = 200.
Imagine that all the added noise terms are independent.
Make the huge (and wrong) hypothesis that all scores are equally likely
a priori.
Then if you end with a win of 1, there was in fact a chance of almost one half that the real result was a loss; it would be better to add 1/2 + epsilon (which can be computed) to the win rate. When you get a win of more than 10 points, you are almost sure you were already winning 50 moves before.
Essentially, as soon as the win is big, it counts for one; when it is smaller, it counts for less.
This was a very simplified example; there are many reasons why we cannot use it directly.
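For what it is worth, here is a quick numpy sketch of that toy model. The reading of "width 10" as a standard deviation of about 10, built from a centred binomial on n = 200, is my own interpretation, and the flat prior is the wrong-but-convenient hypothesis above.

import numpy as np

# Toy model: observed score = true score 50 moves before the end + noise
# from the imperfect last moves.  "Width 10" is read here (my assumption)
# as a standard deviation of about 10, from a rescaled centred binomial.
rng = np.random.default_rng(0)
n_samples = 1_000_000

n, p = 200, 0.5
scale = 10.0 / np.sqrt(n * p * (1 - p))      # rescale so std dev ~ 10
noise = (rng.binomial(n, p, n_samples) - n * p) * scale

# With a flat prior on the true score S and observed X = S + noise, the
# posterior of S given X = x is distributed like x - noise, so
# P(real result was a loss | observed win of x) = P(noise >= x).
for x in (1, 5, 10, 20):
    p_loss = np.mean(noise >= x)
    print(f"observed win of {x:2d} -> P(it was really a loss) ~ {p_loss:.2f}")

The value to count for a playout with observed margin x is then just one minus that probability, which is the 1/2 + epsilon above for x = 1.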
That is not true. :)
If you look for robustness, median and quantile statistics are a good choice, or a trimmed mean to keep efficiency. But it seems obvious that the average score would be awfully unstable.
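As a toy illustration (a made-up heavy-tailed score distribution, not real playout data), here is how much more the plain average wobbles than the median or a 10% trimmed mean when a few blowout games fatten the tails:

import numpy as np

# Made-up score distribution: modest margins most of the time, plus a few
# huge blowout games in the tails.
rng = np.random.default_rng(1)

def fake_scores(size):
    scores = rng.normal(5.0, 10.0, size)
    blowout = rng.random(size) < 0.05
    scores[blowout] += rng.choice([-150.0, 150.0], blowout.sum())
    return scores

def trimmed_mean(x, cut=0.1):
    x = np.sort(x)
    k = int(len(x) * cut)
    return x[k:len(x) - k].mean()

# Spread of each estimator over many samples of 200 playouts.
estimators = {"mean": np.mean, "median": np.median, "10% trimmed mean": trimmed_mean}
for name, f in estimators.items():
    spread = np.std([f(fake_scores(200)) for _ in range(2000)])
    print(f"{name:18s} std over repeated samples: {spread:5.2f}")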
But that is not necessary, because playout results almost follow a Bernoulli distribution.
Just look at the histogram.
That's really interesting. Do you have one to show us?
I say almost because it is not a sum of iid variables. The result is a sum of random components of unequal sizes.
Yes, I intended to model it like this.
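I don't have a histogram from real playouts, but roughly the kind of model I mean looks like this (the component sizes and probabilities are invented, just to have something to histogram):

import numpy as np

# Sketch of the "sum of random components of unequal sizes" idea: each
# component (a group of stones, a chunk of territory) ends up on one side
# or the other.  Sizes and win probability below are invented.
rng = np.random.default_rng(2)

sizes = np.array([40.0, 25.0, 15.0, 10.0, 6.0, 3.0, 1.0])
p_ours = 0.55                                 # chance each component falls our way

n_playouts = 100_000
signs = np.where(rng.random((n_playouts, len(sizes))) < p_ours, 1.0, -1.0)
scores = (signs * sizes).sum(axis=1)          # final margin of each playout

# Crude text histogram of the score distribution.
counts, edges = np.histogram(scores, bins=20)
for c, lo, hi in zip(counts, edges[:-1], edges[1:]):
    print(f"[{lo:7.1f}, {hi:7.1f})  {'#' * int(60 * c / counts.max())}")

Whether it looks two-peaked or not depends a lot on how big the biggest components are relative to the rest.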
Jonas
_______________________________________________
Computer-go mailing list
Computer-go@dvandva.org
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go