On 12/02/2017 5:44, Álvaro Begué wrote:
> I thought about this for about an hour this morning, and this is what I
> came up with. You could make a database of positions with a label
> indicating the result (perhaps from real games, perhaps similarly to how
> AlphaGo trained their value network).
Sent: Saturday, February 11, 2017 11:44 PM
To: computer-go
Subject: [Computer-go] Playout policy optimization
Hi,
I remember an old paper by Rémi Coulom ("Computing Elo Ratings of Move
Patterns in the Game of Go") where he computed "gammas" (exponentials of
scores that you could feed to a softmax) for different move features, which
he fit to best explain the move probabilities from real games.
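
To make the "gammas" idea concrete, here is a minimal sketch of how they turn into move probabilities. In Coulom's model a move's strength is the product of the gammas of its features, and its probability is that strength divided by the sum of strengths over all legal moves, which is the same as a softmax over summed scores. The feature names, gamma values, and function names below are made up for illustration; they are not taken from the paper.

```python
import math

def move_probabilities(move_features, gammas):
    """Probability of each move from per-feature gammas.

    move_features: dict mapping move -> list of feature names (hypothetical).
    gammas: dict mapping feature name -> positive gamma.
    """
    # Strength of a move = product of the gammas of its features.
    strengths = {m: math.prod(gammas[f] for f in feats)
                 for m, feats in move_features.items()}
    total = sum(strengths.values())
    return {m: s / total for m, s in strengths.items()}

def softmax_probabilities(move_features, scores):
    """Same distribution, written as a softmax over summed scores,
    where score = log(gamma)."""
    totals = {m: sum(scores[f] for f in feats)
              for m, feats in move_features.items()}
    top = max(totals.values())  # subtract the max for numerical stability
    exps = {m: math.exp(t - top) for m, t in totals.items()}
    norm = sum(exps.values())
    return {m: e / norm for m, e in exps.items()}

# Illustrative values only.
example_gammas = {"atari_escape": 4.0, "hane": 2.0, "far_from_last": 0.5}
example_moves = {
    "A": ["atari_escape"],
    "B": ["hane", "far_from_last"],
    "C": ["far_from_last"],
}

if __name__ == "__main__":
    print(move_probabilities(example_moves, example_gammas))
```

Fitting the gammas to game records then amounts to choosing the values that maximize the likelihood of the moves actually played; the sketch above only covers the forward direction.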
Similarly,