subject:"[Computer-go] Playout policy optimization"

Re: [Computer-go] Playout policy optimization

2017-02-13 Thread Gian-Carlo Pascutto

On 12/02/2017 5:44, Álvaro Begué wrote:
> I thought about this for about an hour this morning, and this is what I
> came up with. You could make a database of positions with a label
> indicating the result (perhaps from real games, perhaps similarly to how
> AlphaGo trained their value network).
L

Re: [Computer-go] Playout policy optimization

2017-02-12 Thread Brian Sheppard via Computer-go

Sent: Saturday, February 11, 2017 11:44 PM
To: computer-go
Subject: [Computer-go] Playout policy optimization

Hi, I remember an old paper by Rémi Coulom ("Computing Elo Ratings of Move Patterns in the Game of Go") where he computed "gammas" (exponentials of scores that

[Computer-go] Playout policy optimization

2017-02-11 Thread Álvaro Begué

Hi, I remember an old paper by Rémi Coulom ("Computing Elo Ratings of Move Patterns in the Game of Go") where he computed "gammas" (exponentials of scores that you could feed to a softmax) for different move features, which he fit to best explain the move probabilities from real games. Similarly,
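To make the "gammas fed to a softmax" idea concrete, here is a minimal sketch of how per-feature scores could turn into move probabilities. The feature names, their scores, and the candidate moves are all made up for illustration; they are not values from Coulom's paper, and the sketch does not cover how the scores are fit to game records.

```python
import math

# Hypothetical per-feature scores (log-gammas). The names and values are
# invented for illustration and are not taken from any fitted model.
FEATURE_SCORES = {
    "capture": 2.0,
    "atari_escape": 1.5,
    "pattern_3x3_hane": 0.7,
    "self_atari": -1.8,
}


def move_gamma(features):
    """Gamma of a move: exp of the summed scores of its features.

    This equals the product of the individual feature gammas,
    since exp(a + b) == exp(a) * exp(b).
    """
    return math.exp(sum(FEATURE_SCORES[f] for f in features))


def move_probabilities(candidates):
    """Softmax over candidate moves: p(m) = gamma(m) / sum of all gammas."""
    gammas = {move: move_gamma(feats) for move, feats in candidates.items()}
    total = sum(gammas.values())
    return {move: g / total for move, g in gammas.items()}


# Hypothetical legal moves, each tagged with the features it matches.
candidates = {
    "D4": ["capture"],
    "Q16": ["pattern_3x3_hane"],
    "A1": ["self_atari"],
    "K10": [],  # no features: gamma is exp(0) == 1
}
probs = move_probabilities(candidates)
```

A move matching no features keeps gamma 1, so the scores only express how much more or less likely a feature makes a move relative to that baseline.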


3 matches
