[Computer-go] Some experiences with CNN trained on moves by the winning player

Detlef Schmicker Sun, 11 Dec 2016 02:38:28 -0800

I want to share some experience from training my policy CNN:

Because I wondered why reinforcement learning was so helpful, I
trained on the GoGoD database using only the moves played by the
winner of each game.
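
A minimal sketch of such a winner-only filter, assuming the games are
SGF records with a result property like RE[B+3.5] or RE[W+R]. The
function name winner_moves and the regex-based parsing are
illustrative assumptions, not the pipeline actually used here; a real
SGF parser would be needed for comments, variations and handicap
setup.

```python
import re

# Assumption: the result is stored as RE[B+...] or RE[W+...].
RESULT_RE = re.compile(r"RE\[([BW])\+")
# Assumption: main line only, moves written as ;B[pd] or ;W[dp];
# an empty bracket pair is a pass.
MOVE_RE = re.compile(r";\s*([BW])\[([a-s]{2})?\]")


def winner_moves(sgf_text):
    """Return (winner, [(move_index, coord), ...]) or None.

    None means the record has no clear winner (draw, void, missing
    result), so the whole game would be skipped.
    """
    result = RESULT_RE.search(sgf_text)
    if result is None:
        return None
    winner = result.group(1)
    moves = MOVE_RE.findall(sgf_text)
    # Keep the move index so the position before the move can be
    # rebuilt by replaying all earlier moves of both players.
    picked = [
        (i, coord)
        for i, (color, coord) in enumerate(moves)
        if color == winner and coord  # drop passes
    ]
    return winner, picked


if __name__ == "__main__":
    game = "(;GM[1]SZ[19]RE[W+2.5];B[pd];W[dp];B[pp];W[dd])"
    print(winner_moves(game))
```

Note that the loser's moves are still replayed to build the board
positions; they are only dropped as training targets.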


Interestingly, the prediction rate on these moves was slightly higher
(without retraining, just using the previously trained network) than
on the moves by both players (53% against 52%).
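
The comparison above can be sketched as a top-1 prediction rate
computed once over all positions and once over the winner's positions
only. The function name top1_rate and the toy data are assumptions for
illustration; the coordinates are made up and are not the data behind
the 53% and 52% figures.

```python
def top1_rate(predicted, played, mask=None):
    """Fraction of positions where the top predicted move was played.

    mask selects which positions count; None means all of them.
    """
    if mask is None:
        mask = [True] * len(played)
    hits = sum(1 for p, y, m in zip(predicted, played, mask) if m and p == y)
    total = sum(1 for m in mask if m)
    return hits / total if total else 0.0


if __name__ == "__main__":
    # Toy example: one entry per position in a game.
    predicted = ["pd", "dp", "pp", "dd"]   # network's top move
    played = ["pd", "dd", "pp", "dd"]      # move actually played
    by_winner = [False, True, False, True]  # True if the mover won the game
    print("all moves:   ", top1_rate(predicted, played))
    print("winner moves:", top1_rate(predicted, played, by_winner))
```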

Training on the winning player's moves did not help a lot: I got a
statistically significant improvement of about 20-30 Elo.

So I still don't understand why reinforcement learning should give
around 100-200 Elo :)
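
To put the Elo figures above into win rates, here is a small sketch
using the standard logistic Elo formula, E = 1 / (1 + 10^(-d/400)).
The function name expected_score is just an illustrative choice.

```python
def expected_score(elo_diff):
    """Expected score of the stronger side under the standard Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


if __name__ == "__main__":
    for diff in (20, 30, 100, 200):
        print(f"{diff:>3} Elo -> {expected_score(diff):.3f}")
```

Under this model 20-30 Elo is roughly a 53-54% win rate against the
old network, while 100-200 Elo would be roughly 64-76%.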

Detlef
_______________________________________________
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go
