On 06-12-17 21:19, Petr Baudis wrote:
> Yes, that also struck me. I think it's good news for the community
> to see it reported that this works, as it makes the training process
> much more straightforward. They also use just 800 simulations,
> another good news. (Both were one of the first tradeoffs I made in
> Nochi.)
The 800 simulations are a setting that works over all 3 games. It's
not necessarily as good for 19x19 Go (more legal moves than in the
other games, so shallower trees). As for both the lack of testing and
this parameter, someone remarked on GitHub that the DeepMind hardware
is fixed, so this also represents a tuning between the speed of the
learning machine and the speed of the self-play machines.

In my experience, just continuing to train the network further (when
no new data is batched in) often regresses the performance by 200 or
more Elo. So it's not clear this step is *entirely* ignorable unless
you have already tuned the speed of the other two aspects.

> Another interesting tidbit: they use the TPUs to also generate the
> selfplay games.

I think this was already known.

--
GCP
_______________________________________________
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go