Re: [Computer-go] difficult things for alphazero

2017-12-07 Thread Dave Dyer
Without reference to your specific ideas for games that might be difficult to solve, I wonder where these games fit on the human playability scale. The things we find acceptable as games are in a pretty small domain, which lies between the things that are trivial and the things that are too hard

Re: [Computer-go] Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm

2017-12-07 Thread Brian Sheppard via Computer-go
AZ scalability looks good in that diagram, and it is certainly a good start, but it only goes out through 10 sec/move. Also, if the hardware is 7x better for AZ than SF, then should we elongate the curve for AZ by 7x? Or compress the curve for SF by 7x? Or some combination? Or take the data at f

Re: [Computer-go] Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm

2017-12-07 Thread Ingo Althöfer
Hi Jim,
> In 2002/Nov, I created a Go adaptation, Abchij, which I think might not be easily conquered by these algorithms. It's funny, I did so in anticipation of thwarting any sort of brute force algorithms that might emerge to "solve" Go as I hated how those were the solution to

Re: [Computer-go] Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm

2017-12-07 Thread Jim O'Flaherty
In 2002/Nov, I created a Go adaptation, Abchij, which I think might not be easily conquered by these algorithms. It's funny, I did so in anticipation of thwarting any sort of brute force algorithms that might emerge to "solve" Go as I hated how those were the solution to Chess. If you are intereste

Re: [Computer-go] Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm

2017-12-07 Thread Rémi Coulom
> My concern about many of these points of comparison is that they presume how AZ scales. In the absence of data, I would guess that AZ gains much less from hardware than SF. I am basing this guess on two known facts. First is that AZ did not lose a game, so the upper bound on its strength is

Re: [Computer-go] Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm

2017-12-07 Thread Brian Sheppard via Computer-go
> Which is IMHO missing the point a bit ;-)
I saw it the same way, while conceding that the facts are accurate. It makes sense for SF to internalize the details before making decisions. At some point there will be a realization that AZ is a fundamental change.
> What about the data point that A

Re: [Computer-go] Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm

2017-12-07 Thread Gian-Carlo Pascutto
On 7/12/2017 13:20, Brian Sheppard via Computer-go wrote:
> The conversation on Stockfish's mailing list focused on how the match was imbalanced.
Which is IMHO missing the point a bit ;-)
> My concern about many of these points of comparison is that they presume how AZ scales. In the absence

Re: [Computer-go] Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm

2017-12-07 Thread Brian Sheppard via Computer-go
The conversation on Stockfish's mailing list focused on how the match was imbalanced.
- AZ's TPU hardware was estimated at several times (7 times?) the computational power of Stockfish's.
- Stockfish's transposition table size (1 GB) was considered much too small for a 64 core machine.
- Stockf

Re: [Computer-go] action-value Q for unexpanded nodes

2017-12-07 Thread Gian-Carlo Pascutto
On 03-12-17 21:39, Brian Lee wrote:
> It should default to the Q of the parent node. Otherwise, let's say that the root node is a losing position. Upon choosing a followup move, the Q will be updated to a very negative value, and that node won't get explored again - at least until all 362 top
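The effect described in the quote can be illustrated with a small PUCT-style selection routine. This is only a sketch under stated assumptions, not the code of any particular engine: the `Node` class, `select_child`, the `fpu` switch, and the `c_puct` value are all hypothetical names chosen for illustration, and every value is assumed to be stored from the point of view of the player choosing a move at the parent.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical search-tree node; values are from the parent's player's view."""
    prior: float                      # policy prior P(s, a) for the move leading here
    visits: int = 0                   # visit count N
    value_sum: float = 0.0            # sum of backed-up values W
    children: dict = field(default_factory=dict)

    def q(self, default: float) -> float:
        # Mean action value Q = W / N, or `default` when the node is unvisited.
        return self.value_sum / self.visits if self.visits else default


def select_child(parent: Node, c_puct: float = 1.0, fpu: str = "parent"):
    """Pick the move maximising Q + U, with a configurable Q for unvisited children.

    fpu="parent": unvisited children inherit the parent's mean value.
    fpu="zero":   unvisited children start at Q = 0.
    """
    default_q = parent.q(0.0) if fpu == "parent" else 0.0
    sqrt_n = math.sqrt(max(parent.visits, 1))

    def score(item):
        _, child = item
        u = c_puct * child.prior * sqrt_n / (1 + child.visits)
        return child.q(default_q) + u

    return max(parent.children.items(), key=score)[0]


# A losing root: the strongly preferred move "a" has been searched and looks bad.
root = Node(prior=1.0, visits=10, value_sum=-8.0)
root.children["a"] = Node(prior=0.95, visits=9, value_sum=-7.2)
root.children["b"] = Node(prior=0.05)  # never visited

choice_parent_q = select_child(root, fpu="parent")
choice_zero_q = select_child(root, fpu="zero")
```

In this toy position, defaulting to zero makes the unvisited low-prior move look better than the searched one, while defaulting to the parent's Q keeps the search on the high-prior move.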

Re: [Computer-go] Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm

2017-12-07 Thread Gian-Carlo Pascutto
On 06-12-17 22:29, Brian Sheppard via Computer-go wrote:
> The chess result is 64-36: a 100 rating point edge! I think the Stockfish open source project improved Stockfish by ~20 rating points in the last year.
It's about 40-45 Elo FWIW.
> AZ would dominate the current TCEC.
I don't think y
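The "64-36 is about 100 rating points" arithmetic can be checked with the usual logistic Elo model. A minimal sketch, assuming the match result is treated as a single expected-score fraction of 0.64; the function names are illustrative only.

```python
import math


def elo_diff_from_score(score: float) -> float:
    """Elo difference implied by an expected score in (0, 1), logistic model."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10(score / (1.0 - score))


def score_from_elo_diff(diff: float) -> float:
    """Expected score for a player rated `diff` Elo above the opponent."""
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))


# 64 points out of 100 games, as quoted in the message above.
edge = elo_diff_from_score(64 / 100)
print(f"Implied Elo edge: {edge:.1f}")
```

The two functions are inverses of each other, so `score_from_elo_diff(edge)` recovers 0.64.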

Re: [Computer-go] Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm

2017-12-07 Thread Gian-Carlo Pascutto
On 06-12-17 21:19, Petr Baudis wrote:
> Yes, that also struck me. I think it's good news for the community to see it reported that this works, as it makes the training process much more straightforward. They also use just 800 simulations, another good news. (Both were one of the first trad