Re: [Computer-go] UCT parameters and application to other games

Rémi Coulom Sat, 26 Mar 2011 07:37:04 -0700

I'd recommend more than 110. Maybe 200 is better. In Crazy Stone, I use no 
limit, and test for superko.


Rémi
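
A minimal sketch (in Python, not taken from any program mentioned in this thread) of the playout cap under discussion: random moves are played until the game ends or `max_ply` moves have been made, and a capped playout is scored by a fallback evaluation. The function name, its callback parameters, and the toy random-walk game are all hypothetical stand-ins. With `max_ply=None` there is no cap, so the move generator itself has to guarantee termination (in Go, for instance, by rejecting superko repetitions).

```python
import random


def capped_playout(state, legal_moves, play, is_terminal, result,
                   fallback_eval, max_ply=200, rng=random):
    """Play random moves until the game ends or max_ply moves were made.

    Returns (outcome, plies_played, hit_cap). With max_ply=None there is
    no cap, so the caller's move generator must guarantee termination.
    """
    ply = 0
    while not is_terminal(state):
        if max_ply is not None and ply >= max_ply:
            # Cap reached: score the unfinished position instead.
            return fallback_eval(state), ply, True
        state = play(state, rng.choice(legal_moves(state)))
        ply += 1
    return result(state), ply, False


# Toy stand-in game (hypothetical): a token walks on 0..10, starting at 5.
# Reaching 10 counts as a win (1.0), reaching 0 as a loss (0.0).
def toy_moves(pos):
    return [-1, +1]


def toy_play(pos, move):
    return pos + move


def toy_terminal(pos):
    return pos in (0, 10)


def toy_result(pos):
    return 1.0 if pos == 10 else 0.0


def toy_eval(pos):
    # Deterministic fallback: whoever is "ahead" at the cap gets the point.
    if pos == 5:
        return 0.5
    return 1.0 if pos > 5 else 0.0


if __name__ == "__main__":
    for cap in (3, 200, None):
        outcome, plies, hit_cap = capped_playout(
            5, toy_moves, toy_play, toy_terminal, toy_result, toy_eval,
            max_ply=cap, rng=random.Random(0))
        print(cap, outcome, plies, hit_cap)
```

Counting how often `hit_cap` is true over many playouts is one way to judge whether a given cap is too low for a particular game.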

On 26 March 2011, at 15:32, Daniel Shawul wrote:

> Sorry, 81 moves was a bad estimate on my part. I am actually using 96 moves.
> I will change that to 110 or more moves and see what effect it has. I will
> also take Rémi's suggestion, i.e. to bias the move selection process.
> For the alpha-beta program, I have a decent move-ordering algorithm and
> qsearch. I guess I can borrow some of that.
> 
> In the meantime, I found a paper using UCT for Chinese checkers and other
> games,
> http://web.cs.du.edu/~sturtevant/papers/mpuct_icga.pdf
> and also some fun Java programs using UCT for checkers. It seems UCT is
> indeed competitive in checkers.
> I must say I didn't expect that at all. I think the forced nature of captures
> helps to improve the tactical awareness of the MC simulations.
> Is that so?
> 
> 
> On Sat, Mar 26, 2011 at 8:52 AM, Erik van der Werf <erikvanderw...@gmail.com> 
> wrote:
> Ah ok, I misunderstood.
> 
> Still something seems to be wrong. On the empty 9x9 board I think most
> programs with random/light playouts play in the order of 110 moves.
> ~81 moves seems quite low; in my experience you can only get such low
> numbers to work well if you have a lot of knowledge in your playouts.
> Did you check the quality of the evaluations/playouts?
> 
> If you want UCT to search deeper you need good priors and perhaps
> something like rave/amaf.
> 
> Best,
> Erik
> 
> 
> On Sat, Mar 26, 2011 at 1:13 PM, Daniel Shawul <dsha...@gmail.com> wrote:
> > Hello,
> > I am using Monte Carlo playouts for the UCT method. It can do about 10k/sec.
> > The UCT tree is expanded to a depth of d = 3 in a 5 sec search; from there
> > onwards a random playout (with no bias) is carried out. Actually it is a
> > 'partial playout', which doesn't go to the end of the game but only up to
> > a depth of MAX_PLY = 96.
> > If the game has ended earlier, then a win/draw/loss is returned. Otherwise
> > I forcibly end the game by using a deterministic eval
> > and assign a WDL. For 9x9 Go, most of the random playouts actually end
> > before move 81.
> > For the alpha-beta searcher, I do classical evaluation. With heavy use of
> > reductions I can get a depth of 14 half plies, which seems to give it quite
> > an edge against the UCT version.
> > Is the depth of expansion for the UCT tree too low (d = 3 in a 5 sec
> > search)? Should I lower the UCTK parameter
> > to 0.1 or so, which seems to give me a depth of 7 at the start position of
> > 9x9 Go? I am confident my implementation is
> > correct because it is working quite well in my checkers program, contrary
> > to my expectation.
> > Thanks,
> > Daniel
> >
> > On Sat, Mar 26, 2011 at 7:54 AM, Erik van der Werf
> > <erikvanderw...@gmail.com> wrote:
> >>
> >> It sounds like you're using a classical (deterministic) evaluation
> >> function.
> >> Try combining UCT with Monte Carlo evaluation.
> >>
> >> Erik
> >>
> >>
> >> On Sat, Mar 26, 2011 at 12:43 PM, Daniel Shawul <dsha...@gmail.com> wrote:
> >> > Hello,
> >> > I am very new to UCT; I just implemented basic UCT for Go yesterday,
> >> > but with no success so far, I think mostly because it does not search
> >> > very deep (depth = 3 on a 5 sec search with the values below).
> >> > I am using the following values as UCT parameters:
> >> > UCTK = sqrt(1/5) ≈ 0.447     UCTN = 10 (visits after which the best
> >> > move is expanded)
> >> > Even if I lower UCTK down to 7 I get a maximum depth of d=7 at the start
> >> > position for a 5 sec search.
> >> > For how deep a search should I tune these parameters?
> >> > Before UCT, I have an alpha-beta searcher which sometimes plays on
> >> > CGOS.
> >> > It reached a level of ~1500, and this engine seems to be too strong for
> >> > the UCT version.
> >> > The UCT version just gets outsearched in some tactical positions, and
> >> > also in evaluation, I think.
> >> > For example, I have an evaluation term which gives big bonuses for
> >> > connected strings, which seems to give an edge in a lot of games. How do
> >> > you introduce such eval terms in UCT?
> >> > But for my checkers program, to my big surprise, UCT made a significant
> >> > impact. The regular alpha-beta searcher averages a depth of 25, but the
> >> > UCT version is, I think, equally strong judging from the games I saw.
> >> > That was a surprise for me because I thought UCT would work better for
> >> > bushy trees and when the eval has a lot of strategy. It also reached
> >> > good depths, averaging 16 plies.
> >> > My checkers eval has only material in it, so I don't know if UCT is
> >> > bringing strategy (distant information) to the game which the other one
> >> > doesn't have. The games are not really played out to the end but rather
> >> > to MAX_PLY = 96, after which the material is counted and a WDL score is
> >> > assigned (I call it a partial playout).
> >> > Also, the fact that captures are forced seems to help a lot because it
> >> > doesn't make too many mistakes.
> >> > I also found some positions where it encounters problems similar to
> >> > ladders in Go. But in the checkers case, these problems are still solved
> >> > correctly. The only problem is that it doesn't report correct-looking
> >> > winning rates.
> >> > For example, take a position with two kings where one of the kings is
> >> > chasing the other to the sides to mate it, but the losing king can draw
> >> > by making a series of correct moves to get itself to one of the safe
> >> > corners. The program displays winning rates of 0.01 (when it should have
> >> > been more like 0.5), but it still manages the draw!
> >> > Thanks, and apologies for the verbose email,
> >> > Daniel
> >> > _______________________________________________
> >> > Computer-go mailing list
> >> > Computer-go@dvandva.org
> >> > http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
> >> >

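For readers following the UCTK question in the quoted messages, here is a small sketch of how an exploration constant typically enters UCT child selection. It assumes the standard UCB1 form (mean win rate plus `uctk * sqrt(ln(parent_visits) / child_visits)`); the thread does not show the exact formula or scaling used in Daniel's program, and the function names and statistics below are hypothetical.

```python
import math


def uct_score(child_wins, child_visits, parent_visits, uctk):
    """UCB1-style score: mean win rate plus an exploration bonus."""
    if child_visits == 0:
        # Unvisited children are tried first.
        return math.inf
    mean = child_wins / child_visits
    bonus = uctk * math.sqrt(math.log(parent_visits) / child_visits)
    return mean + bonus


def select_child(children, parent_visits, uctk):
    """children is a list of (wins, visits); returns the index to descend."""
    return max(
        range(len(children)),
        key=lambda i: uct_score(children[i][0], children[i][1],
                                parent_visits, uctk),
    )


if __name__ == "__main__":
    # Made-up statistics: a well-explored move with a 60% win rate and a
    # barely explored move with a 50% win rate.
    children = [(60, 100), (5, 10)]
    for uctk in (0.1, 0.44, 1.0):
        print(uctk, select_child(children, parent_visits=110, uctk=uctk))
```

With a small constant the search keeps descending into the move with the best win rate, which concentrates visits and deepens the tree; a larger constant spreads visits across siblings and keeps the tree shallow.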
