In particular, they had no way to train a value net, so they went back to the AlphaGo v1 style of training just a policy net and reusing it as the rollout policy.

On Fri, Apr 6, 2018 at 6:31 AM Fidel Santiago <pperez...@gmail.com> wrote:
> Hello,
>
> Apparently the lessons of Alphago (and many others) are being applied to
> other fields:
>
> https://www.nature.com/articles/d41586-018-03774-5
>
> "The authors devised a computational process that starts by automatically
> extracting chemical transformations from a large commercial database, being
> careful to include only reactions that have been reported several times.
> Their system accepts these well-precedented reactions as ‘allowed moves’ in
> organic synthesis. When the system is asked to devise a synthetic route to
> a target molecule, it works backwards from the target as would a human,
> picking out the most promising precursor molecules according to the design
> rules that it has learnt, and then seeing how feasible it is to synthesize
> those. The authors combined three artificial neural networks with a random
> Monte Carlo tree search — a type of search algorithm used by computers in
> certain decision-making processes — to narrow down the most promising
> synthetic routes, without getting stuck too quickly on a particular path."
>
> Ciao!
>
> Fidel Santiago.
> _______________________________________________
> Computer-go mailing list
> Computer-go@computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go