Hi Detlef,

> Interestingly, the prediction rate on these moves was slightly higher
> (without training, just taking the previously trained network) than
> when taking into account the moves by both players (53% vs. 52%).

I have also started training using only the moves by the winner of
each game, from GoGoD with 8 symmetries.
I also got accuracy up from 49.1% (without training, i.e., at
iteration 0) to 50.1%, fine-tuning from a big learning rate. My
training is not finished, so I can't say how strong it is yet.
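
For concreteness, here is a minimal sketch of the two data-preparation
steps involved (winner-only filtering and the 8 board symmetries). It
assumes NumPy arrays for boards and (row, col) tuples for moves; all
names are illustrative, not anyone's actual pipeline:

    import numpy as np

    def move_rot90(move, n):
        # move transform matching np.rot90 (90 deg counterclockwise):
        # a stone at (r, c) ends up at (n - 1 - c, r)
        r, c = move
        return (n - 1 - c, r)

    def move_fliplr(move, n):
        # move transform matching np.fliplr (left-right mirror)
        r, c = move
        return (r, n - 1 - c)

    def eight_symmetries(board, move):
        # yield all 8 dihedral symmetries of a (position, move) pair:
        # 4 rotations, each with and without a left-right mirror
        n = board.shape[0]
        b, m = board, move
        for _ in range(4):
            yield b, m
            yield np.fliplr(b), move_fliplr(m, n)
            b, m = np.rot90(b), move_rot90(m, n)

    def winner_only(moves, winner):
        # moves: list of (position, move, color); winner: 'B' or 'W'
        # keep only examples where the player to move won the game
        return [(p, m) for (p, m, color) in moves if color == winner]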

This result is interesting.
Are the loser's moves difficult for a DCNN to predict? Or does a DCNN
tend to learn only good moves?
I had thought that if a DCNN learns from KGS 6k moves, it could
reproduce 6k moves. But maybe this is not correct?

Thanks,
Hiroshi Yamashita

----- Original Message ----- From: "Detlef Schmicker" <d...@physik.de>
To: <computer-go@computer-go.org>
Sent: Sunday, December 11, 2016 7:38 PM
Subject: [Computer-go] Some experiences with CNN trained on moves by the
winning player


I want to share some experience training my policy CNN:

Since I wondered why reinforcement learning was so helpful, I trained
on the GoGoD database using only the moves by the winner of each
game.

Interestingly, the prediction rate on these moves was slightly higher
(without training, just taking the previously trained network) than
when taking into account the moves by both players (53% vs. 52%).
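
(Read as a comparison of top-1 prediction accuracy on two move sets
with the same fixed network; a hedged sketch, where net.predict and
the example lists are placeholders:

    def accuracy(net, examples):
        # fraction of positions where the network's top-scoring move
        # equals the move actually played in the game record
        hits = sum(1 for pos, mv in examples if net.predict(pos) == mv)
        return hits / len(examples)

    acc_winner = accuracy(net, winner_move_examples)  # ~53% above
    acc_both   = accuracy(net, all_move_examples)     # ~52% above
)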

Training on the winning player's moves did not help a lot: I got a
statistically significant improvement of about 20-30 Elo.

So I still don't understand why reinforcement learning should gain
around 100-200 Elo :)
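
(For scale: under the standard Elo logistic model, the expected win
rate at a rating difference d is 1 / (1 + 10^(-d/400)), so the two
figures above translate roughly as follows; plain Python, just for
illustration:

    def win_rate(d):
        # expected score against an otherwise equal opponent d Elo below
        return 1.0 / (1.0 + 10.0 ** (-d / 400.0))

    print(win_rate(25))   # ~0.536: 20-30 Elo is only a few percent
    print(win_rate(150))  # ~0.703: 100-200 Elo is a much larger gap
)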

Detlef

