Hi, I have also had some Zero-style experiments brewing for almost two days, although I depend on heavy-playout MC evaluation for self-play.

I am using my Odin MC engine as the base, as it is. It can use a small AG-style policy network running on CPU, implemented with Eigen (C++). It does not have a value network, which I guess will prevent super strong play. My first goal is at least to reach the same standard for komi 5.5 on 9x9 as I did with the help of CGOS games for komi 7.0.

The first two generations are special. Every generation generates 50000 games (I might need a compromise here, but since I am training the network on CPU there is no hurry). For 9x9 I expect a generation to complete in about 10 days.

Generations:
1. This is ODIN playing with a 90% prior for a random move, but using the default pruning of ODIN, so it is not pure random.
2. Here ODIN plays itself using MCTS only with 5000 simulations, selecting moves in proportion to the visits at the root rather than picking the most visited move. Also, 15% is added to the prior of one random move. This is roughly the same as AGZ at the root (the tree search is ODIN-specific, inspired by AG).
3+. The latest saved neural network is loaded for self-play; otherwise the parameters are the same as for 2.
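
The root behaviour described for generation 2 can be sketched roughly as follows. This is only an illustration in Python, not ODIN's actual code: the function names are made up, and renormalizing the priors after adding the 15% bonus is an assumption on my part.

```python
import random


def add_root_noise(priors, bonus=0.15, rng=random):
    """Add `bonus` to the prior of one randomly chosen move.

    Renormalizing afterwards is an assumption; the post only says
    that 15% is added to the prior of one random move.
    """
    noisy = list(priors)
    idx = rng.randrange(len(noisy))
    noisy[idx] += bonus
    total = sum(noisy)
    return [p / total for p in noisy]


def sample_move_by_visits(visits, rng=random):
    """Pick a move index with probability proportional to its root visits,
    instead of always taking the most visited move."""
    moves = list(range(len(visits)))
    return rng.choices(moves, weights=visits, k=1)[0]
```

With root visit counts of, say, [100, 900], the second move would be played about 90% of the time, so weaker moves still show up in the self-play games now and then.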

The neural network training runs in parallel. It simply makes a list of all SGF files generated so far and creates a list of batches picked at random. When it has gone through the list, it repeats it. The network currently trains on a single best move. The plan is to train on the visit distribution for each position, which has the advantage that more is learned from each position and that random bad moves in the SGF files are not trained on.
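
To make the difference between the two training targets concrete, here is a small Python sketch. All names are hypothetical. It assumes that the "single best move" is the move recorded in the game file, that batches are formed over files rather than individual positions, and that root visit counts would be stored alongside each position; none of this is confirmed to match the actual trainer.

```python
import random


def batch_schedule(sgf_files, batch_size, rng=random):
    """Shuffle the game files known so far and split them into batches.

    The trainer would walk this list and start over when it reaches the end.
    """
    files = list(sgf_files)
    rng.shuffle(files)
    return [files[i:i + batch_size] for i in range(0, len(files), batch_size)]


def played_move_target(move_index, n_moves):
    """Current scheme: all target weight on the single recorded move."""
    return [1.0 if i == move_index else 0.0 for i in range(n_moves)]


def visit_distribution_target(visits):
    """Planned scheme: target proportional to the root visit counts."""
    total = sum(visits)
    return [v / total for v in visits]
```

If a position had root visits [10, 30, 60] and the sampled move happened to be the first one, the one-hot target would put all weight on a move the search visited only 10% of the time, while the visit distribution target would still favour the third move.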

I am running 4 experiments in parallel: 19x19, 13x13, and 9x9 with komi 7.0 and 5.5.

Interesting question: is it really necessary to start with random games? My hypothesis is that it is a good way to prevent overfitting, but maybe it can be skipped.

Also, maybe I could start with my best networks and see whether this training procedure can improve them.

Best
Magnus Persson



_______________________________________________
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go
