I can only speculate, but I see two advantages to using MSE:
* MSE accommodates games that have more than just win/loss. One of
AlphaGo Zero's goals (I'm extrapolating from the paper) was to develop
a system that was easy to apply to domains other than go.
* It can be used with TD(lambda)-like updates, where the value target
is a bootstrapped real number in [-1, 1] rather than a binary outcome.
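For concreteness, here is a minimal sketch of the two losses in plain
Python (the function names and the p = (v + 1) / 2 mapping are my own
illustration, not anything from the paper):

```python
import math

def mse_loss(v, z):
    # MSE on [-1, 1]: v is the value-head output, z the game outcome
    # (or a bootstrapped TD target, which need not be exactly -1 or +1).
    return (v - z) ** 2

def bce_loss(p, y):
    # Binary cross-entropy on [0, 1]: p is a win probability, y in {0, 1}.
    eps = 1e-12  # guard against log(0)
    return -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))

# The two parametrizations map onto each other via p = (v + 1) / 2:
v = 0.4            # value-head output in [-1, 1]
p = (v + 1) / 2    # equivalent win probability, here 0.7
print(mse_loss(v, 1.0))   # squared error against a win coded as +1
print(bce_loss(p, 1.0))   # cross-entropy against a win coded as 1
```

Note that bce_loss only makes sense when the target is a probability,
while mse_loss accepts any real-valued target, which is what makes it
convenient for bootstrapped or non-binary outcomes.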
On 7/11/2017 19:08, Petr Baudis wrote:
> Hi!
>
> Does anyone know why the AlphaGo team uses MSE on [-1,1] as the
> value output loss rather than binary crossentropy on [0,1]? I'd say
> the latter is way more usual when training networks, as typically
> binary crossentropy yields better results.
Hi!
Does anyone know why the AlphaGo team uses MSE on [-1,1] as the value
output loss rather than binary crossentropy on [0,1]? I'd say the
latter is way more usual when training networks, as typically binary
crossentropy yields better results, so that's what I'm using in
https://github.com/pa