> The statement "will never give a strong computer go program."  is rather
> devoid of meaning.  You either should define "strong" ...

OK, I'll add something. By strong I mean dan level.

> I definitely agree that once you've played a few thousand uniformly
> random games, there is little to be gained by doing a few thousand more.
> And as an evaluation function this is a relatively weak one - although
> surprisingly good in some ways it has definite limitations.    AnchorMan
> hits the wall at about 5,000 simulations and it is uniformly random with
> no other search involved.   It would not be much stronger even with
> an infinite number of simulations.

5,000 is a fascinating number. You cannot be talking about UCT
playouts, as I know you know that strength always increases with more
playouts. But if you are talking about playouts as an evaluation
function, in my experiments there was practically no gain in accuracy
beyond 60 playouts, and even 30 was enough to get a good approximation.
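
To make "playouts as an evaluation function" concrete, here is a
minimal sketch. The Position interface (copy, game_over, legal_moves,
play, black_won) is a hypothetical stand-in for whatever your engine
provides, not any particular program's API:

    import random

    def evaluate(position, n_playouts=30):
        # Estimate the position by the Black win rate over uniformly
        # random playouts; in my endgame tests 30-60 was enough.
        black_wins = 0
        for _ in range(n_playouts):
            p = position.copy()
            while not p.game_over():
                # Uniform random policy: every legal move is equally
                # likely, with no Go knowledge at all.
                p.play(random.choice(p.legal_moves()))
            if p.black_won():
                black_wins += 1
        return black_wins / n_playouts   # 1.0 = certain Black win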

I guess our results are so different because I concentrated on the
endgame?

> The way to think about a play-out policy is to ask, "how good would it
> be given an infinite number of simulations?"   The answer for uniform
> random is, "not very."   

I did not mention it in the article, as it wasn't related to my main
point, but when testing playout algorithms I measure the result as
five sets of 20 playouts, then keep the worst score of the five sets.
I call the difference in accuracy between the worst set of 20 and all
100 playouts the stability: a small difference means a stable
algorithm, which is highly desirable, since then I know I can get a
reliable estimate with fewer playouts.
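
In code the bookkeeping is roughly this (a sketch: run_playouts stands
for whichever playout algorithm is under test, returning the mean score
of n playouts, true_score is the known correct result of the test
position, and I take "worst" to mean the set furthest from that result):

    def stability(run_playouts, true_score, sets=5, per_set=20):
        # Run 5 sets of 20 playouts, then compare the accuracy of
        # the worst single set against all 100 playouts pooled.
        set_means = [run_playouts(per_set) for _ in range(sets)]
        worst_err = max(abs(m - true_score) for m in set_means)
        # With equal set sizes, the mean of the 5 set means equals
        # the mean over all 100 playouts.
        pooled_err = abs(sum(set_means) / sets - true_score)
        # Small difference = stable algorithm, i.e. fewer playouts
        # already give a reliable estimate.
        return worst_err - pooled_err

With the evaluate() sketch above, this would be called as, e.g.,
stability(lambda n: evaluate(test_position, n), known_result).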

Darren