Re: [computer-go] How to design the stronger playout policy?

Gian-Carlo Pascutto Sat, 05 Jan 2008 05:48:48 -0800

Yamato wrote:

>> I finally improved my playouts by using Remi's ELO system to learn a set of "interesting" patterns, and just randomly fiddling with the probabilities (compressing/expanding) until something improved my program in self-play with about +25%. Not a very satisfying method or an exceptional result. There could be some other magic combination that is even better, or maybe not.
>
> I also have implemented Remi's Minorization-Maximization algorithm.
> But I could not find how to use the result of it to improve the strength.
>
> Would you explain the details of the playout policy?


(1) Captures of groups that could not save themselves last move.
(2) Save groups in atari due to last move by capturing or extending.
(3) Patterns next to last move.
(4) Global moves.
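
A rough sketch of how such a prioritized list might be wired up, assuming the phases are tried in order and the first one that proposes a move wins (the ordering is my reading of the list, and the phase functions here are hypothetical stand-ins, not code from any real program):

```python
def choose_playout_move(position, phases):
    """Try each phase in priority order; return the first move proposed.

    `phases` is a list of callables taking a position and returning a move,
    or None when the phase has nothing to suggest. Returns None (pass) if
    every phase comes up empty.
    """
    for phase in phases:
        move = phase(position)
        if move is not None:
            return move
    return None


# Hypothetical stand-in phases for illustration only.
def capture_phase(position):
    return position.get("capture")

def save_phase(position):
    return position.get("save")

def pattern_phase(position):
    return position.get("pattern")

def global_phase(position):
    return position.get("global")

PHASES = [capture_phase, save_phase, pattern_phase, global_phase]
```

With this structure a saving move from phase (2) always preempts any pattern move from phase (3), however good the pattern is.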

I quantize the MM pattern scores to 0..255 by multiplying them with a large constant and clipping. This causes the "very good" patterns to have close scores. I then use a threshold so I do not play the very bad patterns at all. The remaining moves are played with the probabilities indicated by the quantized values.
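
To make the quantize, threshold and sample steps concrete, here is a minimal sketch. The scale constant, the threshold value and the function names are assumptions chosen for illustration; the actual values are not given above.

```python
import random

def quantize(gammas, scale, max_q=255):
    """Multiply each MM score by a constant and clip the result to 0..max_q."""
    return [min(max_q, max(0, int(g * scale))) for g in gammas]

def pick_pattern_move(moves, weights, threshold, rng=random):
    """Pick a move with probability proportional to its quantized weight.

    Moves whose weight is below `threshold` are never played. Returns None
    when no move survives the threshold.
    """
    kept = [(m, w) for m, w in zip(moves, weights) if w >= threshold and w > 0]
    if not kept:
        return None
    total = sum(w for _, w in kept)
    r = rng.randrange(total)  # integer in [0, total)
    for move, w in kept:
        if r < w:
            return move
        r -= w

# With a large scale, strong patterns saturate at 255 and end up equally likely.
print(quantize([0.001, 0.5, 3.0, 40.0], scale=100))  # -> [0, 50, 255, 255]
```

Note how the scores 3.0 and 40.0 both clip to 255: this is the compression of the "very good" patterns described above.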

I also throw away very bad moves in phase (4) unless there are no alternatives. This gives a small but measurable improvement.

But now I believe all the above is actually flawed. With this system I will play bad saving moves even if there are great pattern moves. It might be that your ladder detection avoids these problems somewhat.

Considering the probabilities of all moves as Crazy Stone does avoids this problem.
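
As a toy illustration of the difference (not Crazy Stone's actual implementation, and with made-up weights), sampling from one distribution over all candidate moves lets a strong pattern move dominate a weak saving move instead of being preempted by it:

```python
import random

def sample_global(candidates, rng=random):
    """Sample one move with probability proportional to its weight.

    `candidates` maps every legal candidate move to a positive weight,
    regardless of which heuristic (capture, save, pattern) suggested it.
    """
    moves = list(candidates)
    weights = [candidates[m] for m in moves]
    return rng.choices(moves, weights=weights, k=1)[0]

# Hypothetical weights: a poor saving move versus a very good pattern move.
candidates = {"bad_save": 1, "great_pattern": 99}
move = sample_global(candidates)
```

Here the bad saving move is still possible, but only about 1% of the time.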

I am now trying to get a similar effect without incrementally updating all urgencies.

> Do you use only 3x3 patterns?


Yes.

I have not tried bigger ones. For size = 4 the tables would become 2 x 16M. Might be worth a try.


--
GCP
_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
