In particular, they had no way to train a value net, so it was back to
the AlphaGo v1 style of training just a policy net and reusing it as the
rollout policy.
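
To make that concrete, here is a minimal Python sketch of tree search with
no value net: the same policy supplies the priors for move selection and
drives the rollouts that score a leaf. Everything in it is an assumption
for illustration, not taken from the paper or from AlphaGo: the toy task
(count a number down to exactly 0 within a step budget), the hand-written
policy() standing in for a trained net, and all names and constants.

```python
import math
import random

MOVES = (1, 2, 3)  # toy "allowed moves": subtract 1, 2 or 3 from the state


def policy(state):
    """Hypothetical stand-in for a trained policy net: {move: probability}."""
    legal = [m for m in MOVES if m <= state]
    total = sum(legal)
    return {m: m / total for m in legal}  # hand-written prior, prefers big steps


class Node:
    def __init__(self, state, steps_left, prior):
        self.state = state
        self.steps_left = steps_left
        self.prior = prior
        self.visits = 0
        self.value_sum = 0.0
        self.children = {}  # move -> Node

    def terminal(self):
        return self.state == 0 or self.steps_left == 0


def rollout(state, steps_left, rng):
    """Play to the end by sampling from the policy; 1.0 if we hit exactly 0."""
    while state > 0 and steps_left > 0:
        probs = policy(state)
        move = rng.choices(list(probs), weights=list(probs.values()))[0]
        state -= move
        steps_left -= 1
    return 1.0 if state == 0 else 0.0


def select_child(node, c_puct=1.5):
    """PUCT-style selection: mean rollout result plus a prior-weighted bonus."""
    def score(child):
        q = child.value_sum / child.visits if child.visits else 0.0
        u = c_puct * child.prior * math.sqrt(node.visits) / (1 + child.visits)
        return q + u
    return max(node.children.values(), key=score)


def search(root_state, steps, simulations=500, seed=0):
    """Return the most-visited first move after a fixed number of simulations."""
    rng = random.Random(seed)
    root = Node(root_state, steps, 1.0)
    for _ in range(simulations):
        node, path = root, [root]
        while node.children:  # selection: walk down to a leaf
            node = select_child(node)
            path.append(node)
        if not node.terminal():  # expansion: priors come from the policy
            for move, p in policy(node.state).items():
                node.children[move] = Node(node.state - move,
                                           node.steps_left - 1, p)
        # evaluation: a rollout with the same policy, no value net involved
        reward = rollout(node.state, node.steps_left, rng)
        for visited in path:  # backup along the selected path
            visited.visits += 1
            visited.value_sum += reward
    return max(root.children, key=lambda m: root.children[m].visits)
```

For example, search(9, 3) picks the move 3: from 9 with three steps left,
only 3+3+3 reaches 0, so the other first moves never earn a reward.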



On Fri, Apr 6, 2018 at 6:31 AM Fidel Santiago <pperez...@gmail.com> wrote:

> Hello,
>
> Apparently the lessons of Alphago (and many others) are being applied to
> other fields:
>
> https://www.nature.com/articles/d41586-018-03774-5
>
> "The authors devised a computational process that starts by automatically
> extracting chemical transformations from a large commercial database, being
> careful to include only reactions that have been reported several times.
> Their system accepts these well-precedented reactions as ‘allowed moves’ in
> organic synthesis. When the system is asked to devise a synthetic route to
> a target molecule, it works backwards from the target as would a human,
> picking out the most promising precursor molecules according to the design
> rules that it has learnt, and then seeing how feasible it is to synthesize
> those. The authors combined three artificial neural networks with a random
> Monte Carlo tree search — a type of search algorithm used by computers in
> certain decision-making processes — to narrow down the most promising
> synthetic routes, without getting stuck too quickly on a particular path."
>
> Ciao!
>
> Fidel Santiago.
> _______________________________________________
> Computer-go mailing list
> Computer-go@computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go
