On 11-11-17 00:58, Petr Baudis wrote:
>>> * The neural network is updated after _every_ game, _twice_, on _all_
>>> positions plus 64 randomly sampled positions from the entire history,
>>> this all done four times - on original position and the three
>>> symmetry flips (but I was too lazy to implement 90\deg rotation).
>>
>> The reasoning being to give a stronger and faster reinforcement with the
>> latest data?
>
> Yes.

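To make sure the schedule quoted above is read correctly, here is a
minimal sketch of it in plain Python. It is only an interpretation of
the description, not the actual code. The names symmetry_flips,
build_update_batch, train_after_game and update_fn are hypothetical.
It assumes that each sample is a (board, target) pair with a scalar
target such as the game result (a per-move policy target would have to
be flipped along with the board), that the 64 replay positions are
redrawn for each of the two passes, and that the finished game joins
the history only after its own update.

```python
import random


def symmetry_flips(board):
    """Return the board and its three flips; no 90-degree rotations."""
    original = [row[:] for row in board]
    horizontal = [row[::-1] for row in board]       # mirror left-right
    vertical = [row[:] for row in board[::-1]]      # mirror top-bottom
    both = [row[::-1] for row in board[::-1]]       # both flips combined
    return [original, horizontal, vertical, both]


def build_update_batch(game_positions, history, n_replay=64, rng=random):
    """Samples for one update pass after a finished game.

    game_positions: (board, target) pairs from the game just played
    history:        (board, target) pairs from all earlier games
    """
    # Assumption: replay positions are drawn without replacement.
    replay = rng.sample(history, min(n_replay, len(history)))
    batch = []
    for board, target in list(game_positions) + replay:
        for flipped in symmetry_flips(board):
            batch.append((flipped, target))
    return batch


def train_after_game(game_positions, history, update_fn, passes=2):
    """Run the update `passes` times, then archive the game."""
    for _ in range(passes):
        # update_fn stands in for one training step of the network.
        update_fn(build_update_batch(game_positions, history))
    history.extend(game_positions)
```

With these assumptions, a game of N positions gives two passes of
(N + 64) * 4 samples each, so the latest game dominates every update
as soon as N is much larger than 64.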
One thing I wonder about: given the huge size of the network and the
strong reinforcement, don't you get total overfitting? I guess the next
few games will quickly "point out" the overfit, but I still wonder
whether keeping the overfit under control wouldn't be better than the
see-sawing this would seem to cause.

-- 
GCP
_______________________________________________
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go