On 6/12/2017 18:57, Darren Cook wrote:
>> Mastering Chess and Shogi by Self-Play with a General Reinforcement
>> Learning Algorithm
>> https://arxiv.org/pdf/1712.01815.pdf
> 
> One of the changes they made (bottom of p.3) was to continuously update
> the neural net, rather than require a new network to beat it 55% of the
> time to be used. (That struck me as strange at the time, when reading
> the AlphaGoZero paper - why not just >50%?)

I read that as a simple way of establishing confidence that the new
network's edge was statistically significant, i.e. really > 0. (55% is
about +35 Elo, measured over 400 games - I don't know by heart how large
the typical error margin of 400 games is, but I think it won't be far
off that!)
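
For a rough check of that error margin, here is a minimal sketch. It
assumes the usual logistic Elo model, ignores draws, treats the games as
independent, and uses a normal approximation for the 95% interval. The
function and variable names are made up for illustration and are not
taken from either paper.

```python
import math


def elo_from_score(score):
    """Elo difference implied by an expected score (logistic Elo model)."""
    return 400.0 * math.log10(score / (1.0 - score))


def score_margin(games, score=0.5, z=1.96):
    """Half-width of the normal-approximation interval on the win rate.

    Assumes independent games with no draws (an illustrative simplification).
    """
    return z * math.sqrt(score * (1.0 - score) / games)


games = 400        # evaluation match length discussed above
threshold = 0.55   # gating threshold discussed above

margin = score_margin(games)                      # margin around an even match
elo_at_threshold = elo_from_score(threshold)
elo_low = elo_from_score(threshold - margin)
elo_high = elo_from_score(threshold + margin)

print(f"95% margin on win rate: +/- {margin * 100:.1f} points")
print(f"Elo at {threshold:.0%}: {elo_at_threshold:+.1f}")
print(f"Elo interval: {elo_low:+.1f} .. {elo_high:+.1f}")
```

Under these assumptions the 95% margin for 400 games is a bit under 5
percentage points, so a 55% result sits roughly two standard errors
above 50%, and the lower end of the Elo interval lands just above zero.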

-- 
GCP
_______________________________________________
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go
