Re: [computer-go] Monte-Carlo Tree Search reference bot

dhillismail Tue, 02 Dec 2008 21:10:47 -0800

There is another speedup trick that might interest you. IIRC Lukasz Lew came up 
with it, but I forget what he called it. After calculating the selected move 
for an internal node (going through UCT and RAVE and whatnot) store that move. 
Then, for the next N visits to that node (where N is 5 or 10 or so), just 
repeat that move without having to calculate what the move would be.

- Dave Hillis

-----Original Message-----
From: Mark Boon <[EMAIL PROTECTED]>
To: computer-go <computer-go@computer-go.org>
Sent: Tue, 2 Dec 2008 11:17 pm
Subject: Re: [computer-go] Monte-Carlo Tree Search reference bot

I have made some minor performance improvements and this is as far as I intend 
to take this particular project. I might make some small changes if necessary, 
but most likely I'll leave this largely unchanged from now.?
?
I had set myself as an arbitrary goal that it should do at least 20K playouts. 
But with real liberties, AMAF and a RAVE formula I got stuck in the 16K-17K 
range. According to my profiler that is mostly due to the expensive formula 
used to compare nodes, where it says it spends 25% of total time. The formula 
I'm using is:?
?
? beta * (virtual-win-ratio + RAVE) + (1-beta) * (win-ratio + UCT)?
?
? beta = 1 - log(parent-visits) / 20?
? UCT = exploration-factor *sqrt( log(parent-visits) / (nr-visits+1) )?
? RAVE = exploration-factor *sqrt( log(parent-visits) / (nr-virtual-visits+1) )?
?
There are probably quite a few possibilities still to tune this program with 
regards to playing strength and performance. But I felt it doesn't help to 
obscure the implementation by too many details.?
?
The implementation of the search algorithm was entirely game-independent, until 
I introduced AMAF. I didn't see how to fix that, as there's no way getting 
around that it's based on the fact that a move is represented by a single 
coordinate, which is mapped to an array to store the statistical values. But 
strip the AMAF part, which is a block of 30 odd lines of code, and I think it 
can be used for other games basically as-is. I did this not because I ever see 
myself program another game, but because in my experience in doing so I get a 
cleaner separation between modules.?
?
At 2,000 playouts, it's still quite a bit weaker than the plain MC-AMAF refbot. 
It wins only about 33%. But that's probably because the 1,000-2,000 playouts 
range is the sweet-spot for that particular type of playing engine. It doesn't 
scale from there, whereas the MCTS ref-bot only just gets warmed up with 2,000 
playouts.?
?
This leads me to a question. I suppose it might be of some interest to put this 
bot up on CGOS. But what parameters to use? The main one being the number of 
playouts, naturally.?
?
Mark?
_______________________________________________?
computer-go mailing list?
[EMAIL PROTECTED]
http://www.computer-go.org/mailman/listinfo/computer-go/?

_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] Monte-Carlo Tree Search reference bot

Reply via email to