Hi, I have also had some Zero-style experiments brewing for almost two
days, although I depend on heavy-playout MC evaluation for self play.
I am using my Odin MC engine as a base, as it is. It can use a small
AG-style policy network running on CPU, implemented with Eigen (C++). It
does not have any value network, which I guess will prevent super strong
play. My first goal is at least to reach the same standard for komi 5.5
on 9x9 as I did with the help of CGOS games for komi 7.0.
The first two generations are special. Every generation generates 50000
games (I might need a compromise here, but since I am training the
network on CPU there is no hurry). For 9x9 I expect a generation to
complete in about 10 days.
Generations:
1. This is ODIN playing with a 90% prior for a random move, but still
using the default pruning of ODIN, so it is not pure random.
2. Here ODIN plays itself using MCTS only, with 5000 simulations,
selecting moves with probability proportional to the visit counts at the
root rather than always playing the most visited move. Also, 15% is
added to the prior of one random move. This is roughly the same as AGZ
at the root (the tree search is ODIN-specific, inspired by AG).
3+. The latest saved neural network is loaded for self play; otherwise
the parameters are the same as for 2.
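
The root behaviour in generation 2 can be sketched roughly as below. This
is only an illustration in Python, not ODIN's C++ code: the function
names are hypothetical, and renormalising the priors after adding the 15%
bonus is an assumption that the description above does not confirm.

```python
import random

def add_root_noise(priors, bonus=0.15, rng=random):
    """Add `bonus` to the prior of one randomly chosen move.

    Renormalising afterwards is an assumption made for this sketch.
    """
    noisy = list(priors)
    noisy[rng.randrange(len(noisy))] += bonus
    total = sum(noisy)
    return [p / total for p in noisy]

def sample_move(visits, rng=random):
    """Pick a move index with probability proportional to its root visits,
    instead of always taking the most visited move."""
    return rng.choices(range(len(visits)), weights=visits, k=1)[0]

if __name__ == "__main__":
    rng = random.Random(1)
    print(add_root_noise([0.25, 0.25, 0.25, 0.25], rng=rng))
    # With visits 4000/900/100 the third move is still played now and then.
    print(sample_move([4000, 900, 100], rng=rng))
```

Sampling this way keeps the self-play games varied, at the cost of
sometimes playing a move the search itself rated poorly.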
The neural network training is running in parallel. It simply makes a
list of all sgf files generated so far and creates a list of batches
picked at random. When it gets through the list, it repeats it. The
network is currently trained on a single best move. The plan is to train
on the visit distribution for each position, which has the advantage
that more is learned from each position and that random bad moves in the
sgf are not trained on.
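
The difference between the two training targets can be sketched as
follows. This is an illustration only, not the actual training code. It
assumes that the "single best move" target is a one-hot vector on the
move recorded in the sgf, and that the loss is cross-entropy; all
function names are hypothetical.

```python
import math

def one_hot_target(played, num_moves):
    """Single-move target: all probability mass on the move in the sgf."""
    return [1.0 if i == played else 0.0 for i in range(num_moves)]

def visit_target(visits):
    """Planned target: root visit counts normalised to a distribution."""
    total = sum(visits)
    return [v / total for v in visits]

def cross_entropy(target, predicted, eps=1e-12):
    """Cross-entropy loss between a target and a predicted distribution."""
    return -sum(t * math.log(p + eps)
                for t, p in zip(target, predicted) if t > 0)

if __name__ == "__main__":
    visits = [4000, 900, 100]   # root visit counts for three moves
    played = 2                  # an unlucky sample: the weakest move
    print(one_hot_target(played, len(visits)))  # trains fully on move 2
    print(visit_target(visits))                 # move 2 gets little weight
```

With the visit distribution as target, a rarely visited move that happened
to be sampled contributes only its small share of the visits, while the
one-hot target would push the network fully towards it.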
I am running 4 experiments in parallel: 19x19, 13x13, and 9x9 with komi
7.0 and with komi 5.5.
An interesting question: is it really necessary to start with random
games? My hypothesis is that it is a good way to prevent overfitting,
but maybe it can be skipped.
Also maybe I could start with my best networks and see if this training
procedure can improve them.
Best
Magnus Persson
_______________________________________________
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go