My reaction was "well, if you are using alpha-beta, then at least use LMR 
rather than hard pruning." Your reaction is "don't use alpha-beta", and you 
would know better than anyone!

Yes, there is one big difference between LMR in Go and LMR in chess: Go tactics 
take many moves to play out, whereas chess tactics are often pretty immediate. 
So LMR could hurt Go tactics much more than it hurts chess tactics. Compare 
that cost with the benefit of forcing the playout to the end of the game.
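
For anyone following along who doesn't know the chess technique: LMR searches 
late, unlikely-looking moves to a reduced depth first, and re-searches them at 
full depth only if they score surprisingly well. A minimal Python-style sketch, 
where ordered_moves/make/unmake/evaluate are placeholders for whatever the 
engine actually uses, not any real API:

REDUCTION = 2  # plies to shave off the first search of a late move

def search(pos, depth, alpha, beta):
    # Plain negamax alpha-beta; evaluate() must score from the side to move.
    if depth <= 0:
        return evaluate(pos)
    for i, move in enumerate(ordered_moves(pos)):   # best guesses first
        make(pos, move)
        if i >= 2 and depth >= 3:
            # Late move: search it shallower instead of pruning it outright.
            score = -search(pos, depth - 1 - REDUCTION, -beta, -alpha)
            if score > alpha:
                # Looked better than expected, so re-search at full depth.
                score = -search(pos, depth - 1, -beta, -alpha)
        else:
            score = -search(pos, depth - 1, -beta, -alpha)
        unmake(pos, move)
        if score >= beta:
            return beta          # fail-high cutoff
        alpha = max(alpha, score)
    return alpha

The key property is that a reduced move can still refute the line; the worry is 
that in Go the refutation may only appear many plies deeper than the reduced 
search ever reaches.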

Best,
Brian

-----Original Message-----
From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of 
Gian-Carlo Pascutto
Sent: Monday, May 22, 2017 4:08 AM
To: computer-go@computer-go.org
Subject: Re: [Computer-go] mini-max with Policy and Value network

On 20/05/2017 22:26, Brian Sheppard via Computer-go wrote:
> Could use late-move reductions to eliminate the hard pruning. Given 
> the accuracy rate of the policy network, I would guess that even move
> 2 should be reduced.
> 

The question I always ask is: what's the real difference between MCTS with a 
small UCT constant and an alpha-beta search with heavy Late Move Reductions? 
Are the explored trees really so different?
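
As a rough illustration of what a small UCT constant does, here is a 
Python-style sketch of UCT child selection (the node fields are invented for 
the example, not taken from any particular program):

import math

def select_child(node, c=0.1):
    # Standard UCT: exploitation term plus an exploration bonus scaled by c.
    # With c this small the search concentrates almost entirely on the
    # children that already look best.
    best, best_score = None, float("-inf")
    for child in node.children:
        exploit = child.total_value / (child.visits + 1e-9)
        explore = c * math.sqrt(math.log(node.visits + 1) / (child.visits + 1e-9))
        if exploit + explore > best_score:
            best, best_score = child, exploit + explore
    return best

With c near zero the tree collapses onto a few principal lines, which is why it 
starts to resemble what heavy reductions do to an alpha-beta tree.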

In any case, in my experiments Monte Carlo still gives a strong benefit, even 
with a not-so-strong Monte Carlo part. IIRC it was the case for AlphaGo too, 
even though they used more training data for the value network than is publicly 
available, and Zen reported the same: Monte Carlo is important.

The main problem is the "only top x moves" part. Late Move Reductions are very 
nice because nothing is ever fully pruned. This heavy pruning by the policy 
network, OTOH, seems to be an issue for me: my program has big tactical holes.
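
To make the contrast concrete, a small illustrative sketch (policy is assumed 
to be a dict of prior probabilities; both helpers are hypothetical):

def hard_prune(moves, policy, x=10):
    # Only the top-x policy moves are ever searched; everything else is
    # invisible to the search, which is where the tactical holes come from.
    return sorted(moves, key=lambda m: policy.get(m, 0.0), reverse=True)[:x]

def lmr_reduction(rank, depth):
    # Reduction-style alternative: every move stays searchable, but
    # low-prior moves get a shallower first search and can be re-searched
    # if they come back with a surprisingly good score.
    return min(depth - 1, rank // 5)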

--
GCP
_______________________________________________
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go
