Hi,
I found that Table S3 of the AlphaZero paper differs between the 2017 and 2018 versions.
                 2017              2018
Mini-batches     700k              700k
Training Time    34h               13d
Training Games   21 million        140 million
Thinking Time    800 sims, 200ms   800 sims, 200ms
Training Time: 34h -> 13d (312h), about 9.2 times longer
Training Games: 21 million -> 140 million, about 6.7 times more
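For anyone who wants to check the ratios, here is a small sketch in Python. It uses only the figures quoted above; the variable names are mine, and I assume 13d means exactly 13 x 24 hours. Note that 140/21 is 6.67, so it rounds to about 6.7.

```python
# Figures as quoted from Table S3 (2017 preprint vs 2018 version).
# Assumption: "13d" means exactly 13 * 24 hours.
HOURS_PER_DAY = 24

time_2017_h = 34
time_2018_h = 13 * HOURS_PER_DAY  # 312 hours

games_2017_million = 21
games_2018_million = 140

# How many times larger the 2018 figure is than the 2017 one.
time_ratio = time_2018_h / time_2017_h
games_ratio = games_2018_million / games_2017_million

print(f"training time:  {time_ratio:.1f} times")
print(f"training games: {games_ratio:.1f} times")
```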
The Chess and Shogi numbers are the same in both versions.
Figure 1 is also a bit different for Shogi and Go. Chess looks the same.
Why are these numbers so different? Is it a typo?
AlphaZero(2017/12/05)
Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning
Algorithm
https://arxiv.org/abs/1712.01815
AlphaZero(2018/12/07)
A general reinforcement learning algorithm that masters chess, shogi, and Go
through self-play
https://deepmind.com/documents/260/alphazero_preprint.pdf
Thanks,
Hiroshi Yamashita
_______________________________________________
Computer-go mailing list
http://computer-go.org/mailman/listinfo/computer-go