Re: [Computer-go] Nochi: Slightly successful AlphaGo Zero replication

Gian-Carlo Pascutto Wed, 15 Nov 2017 02:10:36 -0800

On 11-11-17 00:58, Petr Baudis wrote:
>>>   * The neural network is updated after _every_ game, _twice_, on _all_
>>>     positions plus 64 randomly sampled positions from the entire history,
>>>     this all done four times - on original position and the three
>>>     symmetry flips (but I was too lazy to implement 90-degree rotation).
>>
>> The reasoning being to give a stronger and faster reinforcement with the
>> latest data?
> 
> Yes.


One thing I wonder about: given the huge size of the network and the
strong reinforcement, don't you get total overfitting?

I guess the next few games will quickly "point out" the overfit, but I
still wonder whether keeping the overfit under control would be better
than the see-sawing this would seem to cause.
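
To make the amount of reinforcement concrete, here is a rough sketch of
the update schedule as described in the quoted text. It is not Nochi's
actual code. The names update_batches and flips are made up for
illustration, boards are assumed to be 2-D NumPy arrays, and the nesting
order of the two passes and the four symmetries is a guess. Only the
inputs are built here; in real training the policy targets would have to
be flipped the same way.

```python
import random
import numpy as np

REPLAY_SAMPLES = 64   # "64 randomly sampled positions from the entire history"
PASSES_PER_GAME = 2   # "updated after every game, twice"

def flips(board):
    """Return the board and its three flips (no 90-degree rotations)."""
    return [
        board,
        np.fliplr(board),             # mirror left-right
        np.flipud(board),             # mirror up-down
        np.flipud(np.fliplr(board)),  # both flips at once
    ]

def update_batches(game_positions, history, rng=random):
    """Yield the training batches produced by one finished game.

    Hypothetical helper: each batch holds every position of the latest
    game plus up to REPLAY_SAMPLES positions drawn from the history.
    """
    sampled = rng.sample(history, min(REPLAY_SAMPLES, len(history)))
    positions = list(game_positions) + sampled
    for _ in range(PASSES_PER_GAME):
        for variant in range(4):  # original + three flips
            yield [flips(p)[variant] for p in positions]
```

Under these assumptions every game triggers 2 * 4 = 8 gradient steps, and
each position of that game appears in all eight of them, while any given
history position is seen only if it happens to be among the 64 sampled.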

-- 
GCP
_______________________________________________
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go
