On 7/11/2017 19:08, Petr Baudis wrote:
> Hi!
>
> Does anyone know why the AlphaGo team uses MSE on [-1,1] as the
> value output loss rather than binary crossentropy on [0,1]? I'd say
> the latter is way more usual when training networks, as typically
> binary crossentropy yields better results, so that's what I'm using
> in https://github.com/pasky/michi/tree/nnet for the time being, but
> maybe I'm missing some good reason to use MSE instead?

Not that I know of. You can certainly get some networks to converge
better by using cross-entropy over MSE.

Maybe it's related to the nature of the errors? More avoidance of the
output being entirely wrong? Or habit?

MSE is generally preferred for regression-like problems, but you can
argue whether a go position is being regressed to some winrate%, or to
win/loss...

--
GCP
_______________________________________________
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go
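
To make the "nature of the errors" question concrete, here is a minimal sketch comparing the two losses for a single scalar value output. It assumes a tanh output with a target in {-1, +1} for MSE, and a sigmoid output with a target in {0, 1} for binary cross-entropy. The function names are hypothetical and this is neither the AlphaGo nor the michi implementation. It only shows how the gradient with respect to the pre-activation behaves when the network is confidently wrong.

```python
import math


def mse_tanh(x, z):
    """MSE on a tanh output. x: pre-activation, z: target in {-1, +1}.

    Returns (loss, dloss/dx).
    """
    v = math.tanh(x)
    loss = (v - z) ** 2
    # Chain rule: d/dx (tanh(x) - z)^2 = 2 (v - z) (1 - v^2)
    grad = 2.0 * (v - z) * (1.0 - v * v)
    return loss, grad


def bce_sigmoid(x, y):
    """Binary cross-entropy on a sigmoid output. x: pre-activation, y: target in {0, 1}.

    Returns (loss, dloss/dx).
    """
    # Numerically stable form: loss = softplus(x) - y * x
    softplus = max(x, 0.0) + math.log1p(math.exp(-abs(x)))
    loss = softplus - y * x
    # The sigmoid derivative cancels, leaving grad = sigmoid(x) - y
    p = 1.0 / (1.0 + math.exp(-x))
    grad = p - y
    return loss, grad


if __name__ == "__main__":
    # The true result is a win (z = +1, y = 1); sweep the pre-activation from
    # "confidently wrong" (x = -4) to "confidently right" (x = +4).
    for x in (-4.0, -2.0, 0.0, 2.0, 4.0):
        mse_loss, mse_grad = mse_tanh(x, 1.0)
        bce_loss, bce_grad = bce_sigmoid(x, 1.0)
        print(f"x={x:+.1f}  MSE loss={mse_loss:.4f} grad={mse_grad:+.4f}  "
              f"BCE loss={bce_loss:.4f} grad={bce_grad:+.4f}")
```

Under these assumptions the MSE gradient shrinks towards zero when the tanh output saturates on the wrong side, while the cross-entropy gradient stays close to -1 there. That is one common argument for cross-entropy converging better. It does not settle which loss gives the stronger value network in practice.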