One could use late-move reductions to eliminate the hard pruning. Given the 
accuracy rate of the policy network, I would guess that even move 2 could be 
reduced.
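
For the sake of illustration, here is a minimal sketch of that idea: instead of hard-pruning all but the top-k moves, later moves (ranked by some best-first ordering, e.g. policy probability) are searched at reduced depth and only re-searched at full depth if they beat alpha. The toy game, the `reduction` schedule, and all names here are hypothetical stand-ins, not anyone's actual engine code.

```python
INF = 10**9

class ToyState:
    """Tiny illustrative game: players alternately take a number from a
    shared pool until it is empty; evaluation is the side-to-move's
    material lead. A stand-in for a real position, just to make the
    search runnable."""
    def __init__(self, pool, lead=0):
        self.pool = pool          # numbers still available to take
        self.lead = lead          # material lead of the side to move
    def is_terminal(self):
        return not self.pool
    def evaluate(self):
        return self.lead          # static eval, side-to-move's view
    def ordered_moves(self):
        # best-first ordering stands in for policy-network probabilities
        return sorted(self.pool, reverse=True)
    def apply(self, move):
        rest = list(self.pool)
        rest.remove(move)
        return ToyState(rest, -(self.lead + move))  # flip perspective

def negamax_lmr(state, depth, alpha=-INF, beta=INF):
    """Negamax with late-move reductions: later moves are searched at
    reduced depth instead of being pruned outright, and re-searched at
    full depth only if the reduced search beats alpha."""
    if depth == 0 or state.is_terminal():
        return state.evaluate()
    best = -INF
    for i, move in enumerate(state.ordered_moves()):
        child = state.apply(move)
        reduction = 1 if (i >= 2 and depth >= 3) else 0
        score = -negamax_lmr(child, depth - 1 - reduction, -beta, -alpha)
        if reduction and score > alpha:
            # reduced search was surprisingly good: verify at full depth
            score = -negamax_lmr(child, depth - 1, -beta, -alpha)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break                 # beta cutoff
    return best
```

With perfect move ordering the reduced lines never need the re-search, which is why LMR tends to be cheap when the policy network ranks moves well.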

-----Original Message-----
From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of 
Hiroshi Yamashita
Sent: Saturday, May 20, 2017 3:42 PM
To: computer-go@computer-go.org
Subject: [Computer-go] mini-max with Policy and Value network

Hi,

HiraBot's author reported on a mini-max search with Policy and Value networks. 
It does not use Monte Carlo.
Only the top 8 moves from the Policy network are searched at the root node; at 
other depths, the top 4 moves are searched.

Game results against the Policy network's best move (without search):

             Win Loss winrate 
MaxDepth=1, (558-442) 0.558   +40 Elo
MaxDepth=2, (351-150) 0.701  +148 Elo
MaxDepth=3, (406-116) 0.778  +218 Elo
MaxDepth=4, (670- 78) 0.896  +374 Elo
MaxDepth=5, (490- 57) 0.896  +374 Elo
MaxDepth=6, (520- 20) 0.963  +556 Elo
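
As an aside, the Elo column is consistent with the standard logistic winrate-to-Elo conversion (a quick sketch; by this formula the depth-6 row works out closer to +566):

```python
import math

def elo_diff(wins, losses):
    """Elo difference implied by a win/loss record, using the standard
    logistic model: diff = 400 * log10(p / (1 - p)) for winrate p."""
    p = wins / (wins + losses)
    return 400 * math.log10(p / (1 - p))

# e.g. the MaxDepth=1 row: elo_diff(558, 442) is about +40
```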

The search is simple alpha-beta.
There is a modification so that high-probability moves from the Policy network 
tend to be selected first.
MaxDepth=6 takes one second per move on an i7-4790K + GTX 1060.
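
A minimal sketch of what such a search might look like (the exact move ordering and value scaling in HiraBot are not published; the `policy` and `value` functions and the toy game tree below are hypothetical stand-ins for the two networks):

```python
ROOT_WIDTH, INNER_WIDTH = 8, 4    # top-k widths from the post

# Toy stand-ins for the networks: a hand-made game tree with
# policy probabilities on edges and value-network scores at leaves.
TREE = {
    'root': [('a', 0.6), ('b', 0.4)],
    'a':    [('a1', 0.7), ('a2', 0.3)],
    'b':    [('b1', 0.9), ('b2', 0.1)],
}
LEAF_VALUE = {'a1': -0.2, 'a2': 0.5, 'b1': 0.1, 'b2': -0.8}

def policy(node):
    return TREE.get(node, [])          # (move, probability) pairs

def value(node):
    return LEAF_VALUE.get(node, 0.0)   # score, side-to-move's view

def search(node, depth, alpha=-float('inf'), beta=float('inf'),
           is_root=True):
    """Plain alpha-beta (negamax form) restricted to the top-k moves
    ranked by the policy network: 8 at the root, 4 elsewhere."""
    children = policy(node)
    if depth == 0 or not children:
        return value(node)             # value network at the leaf
    width = ROOT_WIDTH if is_root else INNER_WIDTH
    top = sorted(children, key=lambda cp: cp[1], reverse=True)[:width]
    best = -float('inf')
    for child, _prob in top:
        score = -search(child, depth - 1, -beta, -alpha, is_root=False)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break                      # beta cutoff
    return best
```

The hard width cap is what the reply above suggests replacing with late-move reductions.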

His nega-max code:
http://kiyoshifk.dip.jp/kiyoshifk/apk/negamax.zip
CGOS result for MaxDepth=6:
http://www.yss-aya.com/cgos/19x19/cross/minimax-depth6.html
His Policy network (without search) is likely:
http://www.yss-aya.com/cgos/19x19/cross/DCNN-No336-tygem.html
His Policy and Value network (MCTS) is likely:
http://www.yss-aya.com/cgos/19x19/cross/Hiratuka10_38B100.html

Thanks,
Hiroshi Yamashita

_______________________________________________
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go
