Summarizing the objections to my (non-evidence-based, but hand-wavy observationally-based) assertion that 9x9 is going down anytime someone really wants it to go down, I get the following:
* value networks can't hack it (okay, maybe? does this make it less likely? -- we shouldn't expect to cut-and-paste.)
* double ko is some kind of super special problem on 9x9
* the margins are way narrower

The second issue (ko in general, but multi-ko in particular) is exactly why Go is PSPACE-complete (I made a rough argument that under slight relaxation it isn't, but didn't flesh it out or publish it). I am not seeing strong arguments about the overall picture (i.e., that 9x9 is much harder than 19x19 for beating humans) other than that the margins are narrower.

Does anyone else have a different synopsis?

Thanks,
Steve

On Tue, Mar 6, 2018 at 12:17 PM, Brian Sheppard via Computer-go
<computer-go@computer-go.org> wrote:

> Well, AlphaZero did fine at chess tactics, and the papers are clear on the
> details. There must be an error in your deductions somewhere.
>
> From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of Dan
> Sent: Tuesday, March 6, 2018 1:46 PM
> To: computer-go@computer-go.org
> Subject: Re: [Computer-go] 9x9 is last frontier?
>
> I am pretty sure it is an MCTS problem, and I suspect not something that
> could be easily solved with a policy network (I could be wrong here). My
> opinion is that a DCNN is not a miracle worker (as somebody already
> mentioned here), and it is going to fail at resolving tactics. I would be
> more than happy if it had the same power as a qsearch, to be honest.
>
> Search traps are the major problem in games like chess, and they are what
> makes transferring the success of DCNNs from Go to chess non-trivial.
> The following paper discusses shallow traps that are prevalent in chess:
> https://www.aaai.org/ocs/index.php/ICAPS/ICAPS10/paper/download/1458/1571
>
> They mention that traps make MCTS very inefficient. Even when MCTS is given
> 50x more time than an exhaustive minimax tree needs, it can fail to find a
> level-5 or level-7 trap.
> It will spend, for instance, 95% of its time searching an asymmetric tree of
> depth > 7 while a shallow trap of depth 7 exists, thus failing to find the
> level-7 trap. This is very hard to solve even if you have unlimited power.
>
> The plain MCTS as used by AlphaZero is the most ill-suited MCTS version in
> my opinion, and I have a hard time seeing how it can be competitive with
> Stockfish tactically.
>
> My MCTS chess engine with AlphaZero-like MCTS was missing a lot of tactics.
> I don't use policy or eval networks but qsearch() for eval, and the policy
> is basically choosing whichever move leads to a higher eval.
>
> a) My first improvement to the MCTS is to use minimax backups instead of
> averaging. This was an improvement, but not something that would solve the
> traps.
>
> b) My second improvement is to use alpha-beta rollouts. This is a rollouts
> version that can do null-move and LMR etc. This is a huge improvement and
> none of the MCTS versions can match it. More on alpha-beta rollouts here:
> https://www.microsoft.com/en-us/research/wp-content/uploads/2014/11/huang_rollout.pdf
>
> So AlphaZero used none of the above improvements and yet it seems to be
> tactically strong. Leela Zero suffered tactical falls left and right too,
> as I expected.
>
> So the only explanation left is that the policy network is able to avoid
> traps, which I find hard to believe: I doubt it can identify more than
> qsearch-level tactics.
>
> All I am saying is that my experience (as well as many others') with MCTS
> for tactics-dominated games is bad, and there must be some breakthrough in
> that regard in AlphaZero for it to be able to compete with Stockfish on a
> tactical level.
>
> I am curious how Remi's attempt at Shogi using AlphaZero's method will
> turn out.
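[Editorially, the backup-rule difference Daniel describes in (a) can be illustrated with a toy sketch. The node structure and values below are invented for illustration, not taken from any engine: a "trap" move whose replies mostly look good but which has one refutation.]

```python
# Toy illustration of averaging vs. minimax backups in MCTS.
# A trap move looks good on average but contains a forced refutation.

class Node:
    def __init__(self, value=None, children=None):
        self.value = value          # leaf eval in [-1, 1], from our side's view
        self.children = children or []

def backup_average(node):
    """MCTS-style backup: a node's value is the mean of its children's values."""
    if not node.children:
        return node.value
    vals = [backup_average(c) for c in node.children]
    return sum(vals) / len(vals)

def backup_minimax(node, maximizing):
    """Minimax backup: assume the side to move picks its best reply."""
    if not node.children:
        return node.value
    vals = [backup_minimax(c, not maximizing) for c in node.children]
    return max(vals) if maximizing else min(vals)

# After our candidate move, the opponent has three replies;
# two are harmless, one (-1.0) refutes the move outright.
trap = Node(children=[Node(0.8), Node(0.9), Node(-1.0)])

print(backup_average(trap))          # ~0.23: averaging makes the trap look playable
print(backup_minimax(trap, False))   # -1.0: the opponent takes the refutation
```

The averaged value only drifts toward -1.0 as visits concentrate on the refutation, which is exactly why a trap can consume most of the search budget before being recognized.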
> regards,
> Daniel
>
> On Tue, Mar 6, 2018 at 9:41 AM, Brian Sheppard via Computer-go
> <computer-go@computer-go.org> wrote:
>
> Training on Stockfish games is guaranteed to produce a blunder-fest,
> because there are no blunders in the training set and therefore the policy
> network never learns how to refute blunders.
>
> This is not a flaw in MCTS, but rather in the policy network. MCTS will
> eventually search every move infinitely often, producing asymptotically
> optimal play. But if the policy network does not provide the guidance
> necessary to rapidly refute the blunders that occur in the search, then
> convergence of MCTS to optimal play will be very slow.
>
> It is necessary for the network to train on self-play games using MCTS.
> For instance, the AGZ approach samples next states during training games
> by sampling from the distribution of visits in the search. Specifically:
> not by choosing the most-visited play!
>
> You see how this policy trains both search and evaluation to be internally
> consistent? The policy head is trained to refute the bad moves that will
> come up in search, and the value head is trained to the value observed by
> the full tree.
>
> From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of Dan
> Sent: Monday, March 5, 2018 4:55 AM
> To: computer-go@computer-go.org
> Subject: Re: [Computer-go] 9x9 is last frontier?
>
> Actually, prior to this it was trained with hundreds of thousands of
> Stockfish games and didn't do well on tactics (the games were actually a
> blunder-fest). I believe this is a problem of the MCTS used and not due to
> lack of training.
>
> Go is a strategic game, so it is different from chess, which is full of
> traps. I'm not surprised Leela Zero did well in Go.
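[Editorially, the AGZ sampling rule Brian describes -- choosing the training move in proportion to root visit counts rather than always taking the most-visited move -- can be sketched as below. The move names and visit counts are made up for illustration; the optional temperature parameter is an assumption, mirroring the exploration schedule described in the AlphaGo Zero paper.]

```python
import random

def sample_move(visits, temperature=1.0):
    """Pick a move with probability proportional to visit_count ** (1/temperature).

    temperature=1.0 gives plain proportional-to-visits sampling;
    as temperature -> 0 this approaches always playing the most-visited move.
    """
    moves = list(visits)
    weights = [visits[m] ** (1.0 / temperature) for m in moves]
    total = sum(weights)
    r = random.random() * total
    for m, w in zip(moves, weights):
        r -= w
        if r <= 0:
            return m
    return moves[-1]  # guard against floating-point rounding

# Hypothetical root visit counts after a search.
visits = {"e4": 620, "d4": 300, "c4": 70, "a3": 10}
print(sample_move(visits))  # usually "e4", but weaker moves are sometimes chosen
```

Sampling (rather than taking the argmax) is what puts occasional bad moves into the training games, so the policy and value heads get to see those moves refuted.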
> On Mon, Mar 5, 2018 at 2:16 AM Gian-Carlo Pascutto <g...@sjeng.org> wrote:
>
> On 02-03-18 17:07, Dan wrote:
> > Leela-chess is not performing well enough
>
> I don't understand how one can say that, given that they started with the
> random network only last week and with a few clients. Of course it's bad!
> That doesn't say anything about the approach.
>
> Leela Zero has gotten strong, but it has been learning for *months* with
> ~400 people. It also took a while to get to 30 kyu.
>
> --
> GCP
_______________________________________________
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go