Nice job! And the graph makes it super clear how the edge effects work.
On Sat, May 9, 2020, 2:19 PM Rémi Coulom <remi.cou...@gmail.com> wrote:

> Hi,
>
> I am probably not the only one who made this mistake: it is usually very
> bad to use a power of 2 for the batch size!
>
> Relevant documentation by NVIDIA:
> https://docs.nvidia.com/deeplearning/performance/dl-performance-convolutional/index.html#quant-effects
>
> The documentation is not extremely clear, so I figured out the formula:
>
> N = int((n * (1 << 14) * SM) / (H * W * C))
>
> SM is the number of multiprocessors (80 for a V100 or Titan V, 68 for an
> RTX 2080 Ti).
> n is an integer (usually n=1 is slightly worse than n>1).
>
> So the efficient batch size is 63 for 9x9 Go on a V100 with 256-channel
> layers, and 53 on the RTX 2080 Ti.
>
> Here is my tweet with an empirical plot:
> https://twitter.com/Remi_Coulom/status/1259188988646129665
>
> I created a new CGOS account to play with this improvement. Probably not a
> huge difference in strength, but it is good to get such an improvement so
> easily.
>
> Rémi
> _______________________________________________
> Computer-go mailing list
> Computer-go@computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go
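For reference, the formula from the quoted post can be sketched as a small Python helper (the function and parameter names are my own, not from the original message):

```python
def efficient_batch_size(sm, h, w, c, n=1):
    """Batch size per the formula in the quoted post:
    N = int((n * (1 << 14) * SM) / (H * W * C)).

    sm -- number of streaming multiprocessors (80 for a V100 or
          Titan V, 68 for an RTX 2080 Ti)
    h, w -- spatial size of the feature maps (9x9 for 9x9 Go)
    c -- number of channels per layer
    n -- integer multiplier (per the post, n=1 is usually
         slightly worse than n>1)
    """
    # Integer division matches int() truncation for positive values.
    return (n * (1 << 14) * sm) // (h * w * c)

# The two examples from the email: 9x9 Go with 256-channel layers.
print(efficient_batch_size(sm=80, h=9, w=9, c=256))  # V100 -> 63
print(efficient_batch_size(sm=68, h=9, w=9, c=256))  # RTX 2080 Ti -> 53
```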