Hi,
Thank you for the info and an interesting idea. I wonder,
however, if a DCNN can replace handcrafted rollouts...
In the case of Zen, there are so many lines for special
cases such as snapbacks, approach moves, and nakades to
play "correct" reply moves at 100% probability. ReLU is not so
Stephan K:
Hi Hideki,
I think they could have used a rollout policy network (RPN), as described in
"Convolutional Monte Carlo Rollouts in Go": https://arxiv.org/abs/1512.03375
and have it trained based on the MCTS outcome, at the same time and in the same
way as the policy head is trained. This RPN would
Hideki,
This is a very nice observation.
s.
On Nov 16, 2017 12:37 PM, "Hideki Kato" wrote:
> Hi,
> I strongly believe adding rollout makes Zero stronger.
> They removed rollout just to say "no human knowledge".
> #Though the number of past moves (16) has been tuned by
> human :).
While MCTS works "better" in games with a forward direction, it eventually
converges to the same answer as alpha-beta in any game. The general
architecture is to set a maximum depth, and use a suitable evaluator
at the leaf nodes.
I haven't done detailed studies, but there is definitely a
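The general architecture described above (maximum depth, then a suitable evaluator at the leaves) can be sketched roughly like this. A toy Nim-like game stands in for the real game; the interface and the trivial evaluator are illustrative assumptions, not any particular engine's code:

```python
import random

class NimState:
    """Toy stand-in for a game: take 1 or 2 stones; the player
    who cannot move loses. (Illustrative only, not Go.)"""
    def __init__(self, stones):
        self.stones = stones
    def legal_moves(self):
        return [m for m in (1, 2) if m <= self.stones]
    def play(self, m):
        return NimState(self.stones - m)

def evaluate(state):
    # Stand-in leaf evaluator; a real engine would call a
    # heuristic or a value network here.
    return -1.0 if not state.legal_moves() else 0.0

def playout_value(state, depth, max_depth, rng):
    """One depth-capped playout: random moves until max_depth
    or a terminal state, then score with the leaf evaluator.
    Values are negated so each is from the mover's viewpoint."""
    moves = state.legal_moves()
    if not moves or depth >= max_depth:
        return evaluate(state)
    return -playout_value(state.play(rng.choice(moves)),
                          depth + 1, max_depth, rng)

rng = random.Random(0)
# Average many capped playouts to estimate the root value.
est = sum(playout_value(NimState(3), 0, 10, rng)
          for _ in range(200)) / 200
```

Averaging many such capped playouts is what lets the estimate converge toward the same answer a fixed-depth alpha-beta search would give.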
On 16-11-17 18:15, "Ingo Althöfer" wrote:
> Something like MCTS would not work in chess, because in
> contrast to Go (and Hex and Amazons and ...) Chess is
> not a "game with forward direction".
Ingo, I think the reason Petr brought the whole thing up is that AlphaGo
Zero uses "MCTS" but it does
State of the art in computer chess is alpha-beta search, but note that the
search is very selective because of "late move reductions."
A late move reduction is to reduce depth for moves after the first move
generated in a node. For example, a simple implementation would be "search the
first
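A minimal sketch of that reduction idea inside negamax alpha-beta. The `Node` game-tree interface, the threshold of three moves, and the one-ply reduction are illustrative assumptions; real engines such as Stockfish use far more elaborate conditions:

```python
class Node:
    """Minimal explicit game tree for the sketch: leaves carry a
    static evaluation from the side to move's viewpoint."""
    def __init__(self, value=0, children=()):
        self.value = value
        self.children = list(children)
    def ordered_moves(self):
        return list(range(len(self.children)))
    def play(self, i):
        return self.children[i]
    def evaluate(self):
        return self.value

def alphabeta(state, depth, alpha, beta):
    moves = state.ordered_moves()
    if depth == 0 or not moves:
        return state.evaluate()
    for i, move in enumerate(moves):
        child = state.play(move)
        if i >= 3 and depth >= 2:
            # Late move reduction: moves ordered late are searched
            # one ply shallower; re-search at full depth only if
            # the reduced search unexpectedly beats alpha.
            score = -alphabeta(child, depth - 2, -beta, -alpha)
            if score > alpha:
                score = -alphabeta(child, depth - 1, -beta, -alpha)
        else:
            score = -alphabeta(child, depth - 1, -beta, -alpha)
        if score > alpha:
            alpha = score
        if alpha >= beta:
            break  # beta cutoff
    return alpha
```

The re-search on a fail-high is what keeps the reduction from silently discarding a good late move.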
Hi,
I strongly believe adding rollout makes Zero stronger.
They removed rollout just to say "no human knowledge".
#Though the number of past moves (16) has been tuned by
human :).
Hideki
Petr Baudis: <20171116154309.tfq5ix2hzwzci...@machine.or.cz>:
> Hi,
>
> when explaining AlphaGo Zero to
2017-11-16 17:37 UTC+01:00, Gian-Carlo Pascutto:
> Third, evaluating with a different rotation effectively forms an
> ensemble that improves the estimate.
Could you expand on that? I understand that rotating the board has an
impact on a neural network's output, but how does that change
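One way to picture the ensemble effect: the true value of a position is the same under all 8 symmetric presentations of the board, but the network's output is not, so averaging over the symmetries keeps the mean while reducing the variance of the estimate. A minimal sketch (`net` here is a hypothetical evaluation function, not AlphaGo's network):

```python
def symmetries(board):
    """All 8 dihedral transforms (rotations and reflections) of a
    square board given as a tuple of row-tuples."""
    views = []
    b = [list(row) for row in board]
    for _ in range(4):
        b = [list(row) for row in zip(*b[::-1])]  # rotate 90 deg
        views.append(tuple(map(tuple, b)))
        views.append(tuple(map(tuple, b[::-1])))  # plus a flip
    return views

def ensemble_value(board, net):
    """Average a (noisy) evaluation over all 8 symmetric
    presentations of the position."""
    views = symmetries(board)
    return sum(net(v) for v in views) / len(views)
```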
Hi Petr,
> What would you say is the current state-of-art game tree search for
> chess? That's a very unfamiliar world for me, to be honest all I really
> know is MCTS...
Stockfish is one of the top-three chess programs, and
it is open source. It is mainly iterative deepening
alpha-beta, but
On 16/11/2017 16:43, Petr Baudis wrote:
> But now, we expand the nodes literally all the time, breaking the
> stationarity possibly in drastic ways. There are no reevaluations
> that would improve your estimate.
First of all, you don't expect the network evaluations to drastically
vary between
As far as I know, the state of the art in chess is some flavor of alpha-beta
(as long as I read the Stockfish source correctly),
so basically they prove their current estimation is the best one up to a
certain depth.
MCTS has the benefit of enabling variable-depth search depending on how
good the evaluation
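That variable depth comes from the selection rule: children whose evaluations look good get visited, and hence expanded, more often than weak ones. A minimal sketch of UCT selection (the statistics format and the exploration constant are illustrative assumptions):

```python
import math

def uct_select(children, c=1.4):
    """Index of the child maximizing the UCT score. `children` is
    a list of (visits, total_value) pairs. Strong children keep
    accumulating visits, so their subtrees grow deeper than the
    subtrees of weak ones."""
    parent_n = sum(n for n, _ in children)
    def score(stats):
        n, w = stats
        if n == 0:
            return float('inf')  # always try unvisited children first
        # Exploitation (mean value) plus exploration bonus.
        return w / n + c * math.sqrt(math.log(parent_n) / n)
    return max(range(len(children)), key=lambda i: score(children[i]))
```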
Hi,
when explaining AlphaGo Zero to a machine learning audience yesterday
(https://docs.google.com/presentation/d/1VIueYgFciGr9pxiGmoQyUQ088Ca4ouvEFDPoWpRO4oQ/view)
it occurred to me that using MCTS in this setup is actually such
a kludge!
Originally, we used MCTS because with