Re: [computer-go] Rapid action value estimation
I store it in the normal UCT tree, so that each node has variables raveVisits and raveWins besides uctVisits and uctWins. So a node in the UCT-DAG can either represent a position or a move.

On 11/2/07, Christoph Birk <[EMAIL PROTECTED]> wrote:
> On Fri, 2 Nov 2007, Benjamin Teuber wrote:
> > I don't think there's something different at different depths in the tree..
> > To update RAVE after a simulation, for each child of a node you visited
> > during that simulation, you update if the move leading to the child was
> > played later (until the end of the playout).
> > Then, always when you calculate the UCT value, you combine that with the
> > RAVE value with that weighted average formula to give the final score.
> > Of course, you need to be careful with signs :-)
>
> That means you have one global 'RAVE' table?
> Or one at each node in the UCT tree?
>
> Christoph

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Rapid action value estimation
On Fri, 2 Nov 2007, Benjamin Teuber wrote:
> I don't think there's something different at different depths in the tree..
> To update RAVE after a simulation, for each child of a node you visited
> during that simulation, you update if the move leading to the child was
> played later (until the end of the playout).
> Then, always when you calculate the UCT value, you combine that with the
> RAVE value with that weighted average formula to give the final score.
> Of course, you need to be careful with signs :-)

That means you have one global 'RAVE' table?
Or one at each node in the UCT tree?

Christoph
Re: [computer-go] Rapid action value estimation
I don't think there's something different at different depths in the tree..

To update RAVE after a simulation, for each child of a node you visited during that simulation, you update if the move leading to the child was played later (until the end of the playout).
Then, always when you calculate the UCT value, you combine that with the RAVE value with that weighted average formula to give the final score.
Of course, you need to be careful with signs :-)

Btw, I don't really see a point in calculating and adding the confidence bound for RAVE as well, as all moves will have been played almost equally often - thus I dropped the term.. Maybe Sylvain or someone else can comment on this..

Another thing - I didn't believe that you need to do RAVE separately for both colors (i.e. you should only consider later moves on the point by the same color), as e.g. Peter Drake mentioned in a paper of his. But after some experiments I changed my mind and think he is right =)

Cheers,
Benjamin

On 11/2/07, Jason House <[EMAIL PROTECTED]> wrote:
> I'd like to implement RAVE as described in [1]. I believe I have a very
> clear understanding of how to do this at the leaves of the UCT search tree.
> What I'm not sure about is how to apply RAVE results higher in the UCT
> tree. Does anyone have any experience with this that they're willing to
> share?
>
> [1] http://www.machinelearning.org/proceedings/icml2007/papers/387.pdf
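For anyone who wants something concrete to poke at, here is a minimal Python sketch of the per-node bookkeeping and mixing described in this thread. It is only an illustration under stated assumptions: the names (Node, update_rave, combined_score), the (color, point) move encoding, the constants k and c, and the weighting schedule beta = sqrt(k / (3N + k)) are all hypothetical choices, since the posts only refer to "that weighted average formula". Following Benjamin's notes, RAVE counts are kept per color and no confidence term is added for the RAVE part.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """Tree node; the counters describe the move leading into this node."""
    uct_visits: int = 0
    uct_wins: float = 0.0
    rave_visits: int = 0
    rave_wins: float = 0.0
    children: dict = field(default_factory=dict)  # (color, point) -> Node


def update_rave(path, moves, winner):
    """Update RAVE counters after one simulation.

    path[i] is the node reached after moves[:i]; moves holds every
    (color, point) pair of the simulation, tree part and playout alike.
    """
    for depth, node in enumerate(path):
        # Everything played from this node until the end of the playout.
        later = set(moves[depth:])
        for move, child in node.children.items():
            # Same point AND same color, as suggested in the thread.
            if move in later:
                child.rave_visits += 1
                # "Careful with signs": a win is counted from the point of
                # view of the color that plays the move.
                if move[0] == winner:
                    child.rave_wins += 1.0


def combined_score(child, parent_visits, k=1000.0, c=0.5):
    """Weighted average of RAVE and UCT means plus a UCT-only bonus.

    The schedule for beta and the +1 smoothing terms are assumptions.
    """
    beta = math.sqrt(k / (3.0 * parent_visits + k))
    uct_mean = child.uct_wins / child.uct_visits if child.uct_visits else 0.5
    rave_mean = child.rave_wins / child.rave_visits if child.rave_visits else 0.5
    explore = c * math.sqrt(math.log(parent_visits + 1) / (child.uct_visits + 1))
    return beta * rave_mean + (1.0 - beta) * uct_mean + explore
```

With few real visits beta is close to 1 and the RAVE mean dominates; as the parent is visited more often, beta shrinks and the ordinary UCT statistics take over.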
[computer-go] icml2007: Learning to solve game trees
Hi,

Interesting paper, in case you did not notice:
http://www.machinelearning.org/proceedings/icml2007/papers/394.pdf

Title: Learning to solve game trees
Authors: David Stern, Ralf Herbrich, Thore Graepel
Abstract: We apply probability theory to the task of proving whether a goal can be achieved by a player in an adversarial game. Such problems are solved by searching the game tree. We view this tree as a graphical model which yields a distribution over the (Boolean) outcome of the search before it terminates. Experiments show that a best-first search algorithm guided by this distribution explores a similar number of nodes as Proof-Number Search to solve Go problems. Knowledge is incorporated into search by using domain-specific models to provide prior distributions over the values of leaf nodes of the game tree. These are surrogate for the unexplored parts of the tree. The parameters of these models can be learned from previous search trees. Experiments on Go show that the speed of problem solving can be increased by orders of magnitude by this technique but care must be taken to avoid over-fitting.

Rémi
[computer-go] Connecting a gtp engine to CGOS 19
I am trying to write down all the steps required, but it still doesn't work. So please, tell me what I am doing wrong.

1. Get TCL. There are many "flavors" with GUI, debugger etc. The simplest, when you just need to run a Tk application, is something like:

   http://www.equi4.com/pub/tk/tclkit-win32.upx.exe

   This .exe is all you need, about 1M and without any annoyances, installation, registry etc. For the following, I rename it as tcl.

2. Get the .tcl client from CGOS.

   http://cgos.boardspace.net/public/cgos3.zip

3. Modify the .tcl script to use the 19x19 server

   # ... set server cgos.boardspace.net
   set server cgos.lri.fr
   # ... set port 6867
   set port 6919

   Use # to change the original lines into remarks and add new lines. AFAIK no other modification is necessary. Please, confirm.

4. Create an account in CGOS. I remember having read that when you use one for the first time, any name and password are valid and then you have to use the same password to continue using it. Now I can't find where I read that. Please, confirm. So, for testing purposes, I use the account:

   name: testingTCL
   pass: password

5. Using gnugo for the sake of simplicity (just for the test),

   gnugo37 --mode gtp --chinese-rules --capture-all-dead

   should be a valid setting to play on CGOS.

6. I create a .bat file with:

   tcl cgos3.tcl testingTCL password E:\\GO\\PROGRAMS\\GnuGo\\gnugo37 --mode gtp --chinese-rules --capture-all-dead

7. I run the .bat file and that launches both tcl.exe and gnugo37.exe. I don't get any info from either. I can't see my program when I use:

   E:\TEMP\cgosview.exe cgos.lri.fr 6919

   to see what is happening on the server. I have waited for many rounds to finish, but my program never plays, if it is connected at all, which I have no way of knowing.

8. Shouldn't there be log files somewhere? Where? I don't mean .log files with gtp info. I can do that myself. I mean log files giving some info about the communication and the state of the server.

Jacques.
[computer-go] 9x9 CGOS
It appears that CGOS (9x9) is down.

Christoph
Re: [computer-go] CGOS on sourceforge
On Thu, Nov 01, 2007 at 08:57:38PM -0400, Joshua Shriver wrote:
> In that case I stand happily corrected. I once was going to release
> and one of the stipulations was that it had to be reassigned to the
> FSF. Couldn't remember if it was sourceforge, gnu, or what...

GNU Go has this requirement.

-H

--
Heikki Levanto   "In Murphy We Turst"   heikki (at) lsd (dot) dk
[computer-go] Rapid action value estimation
I'd like to implement RAVE as described in [1]. I believe I have a very clear understanding of how to do this at the leaves of the UCT search tree. What I'm not sure about is how to apply RAVE results higher in the UCT tree. Does anyone have any experience with this that they're willing to share?

[1] http://www.machinelearning.org/proceedings/icml2007/papers/387.pdf
Re: [computer-go] Standard references on CGOS
On 10/29/07, Christoph Birk <[EMAIL PROTECTED]> wrote:
> On Oct 29, 2007, at 8:39 AM, Jason House wrote:
> > For all of us in the bot-making kiddie pool, it's exceptionally
> > helpful to have reference implementations of basic algorithms
> > running on the server. When playing with AMAF, I found the
> > reference AMAF bots very helpful. Now that I'm playing with UCT,
> > references for UCT would be helpful.
>
> 'myCtest-V-0003' is running 50k UCT. Pure random playouts guided
> by a UCT search with these parameters:
>   # playouts before expanding = 50
>   node-score = win_ratio + 0.5 * sqrt(log(N)/n);
>
> I will start it under the name 'myCtest-50k-UCT' later today running
> 24/7.

I think I've gotten my big UCT bugs worked out. Thanks a lot for the reference. For any who are interested, hb-672-UCT has the following configuration:

  # playouts per move = variable (should be in the ballpark of 10k)
  # playouts before expanding = 10
  node-score = win_ratio + tuned_standard_deviation * sqrt(0.8*log(N)/n);
  tuned_standard_deviation = sqrt(min(0.25, win_ratio*(1-win_ratio)+sqrt(2*ln(N)/n)))

The 0.8 factor is a carry-over from initially following http://senseis.xmp.net/?UCT
The tuning was based on http://hal.inria.fr/inria-00117266 and is supposedly superior to a flat 0.5

I'll likely try variants to better match Ctest:
* No 0.8 factor
* 50 playouts before expansion
* No tuning
* True 10k simulations per move
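To compare the two node-score formulas quoted above side by side, here is a small Python sketch. The function names (ctest_score, tuned_score) and the argument layout (wins, n, N) are made up for illustration; only the formulas themselves come from the posts. Both functions assume n > 0 and N >= 1, and log is the natural logarithm in both cases.

```python
import math


def ctest_score(wins, n, N, c=0.5):
    """Flat variant: win_ratio + 0.5 * sqrt(log(N)/n)."""
    return wins / n + c * math.sqrt(math.log(N) / n)


def tuned_score(wins, n, N, factor=0.8):
    """Tuned variant: the flat 0.5 is replaced by a capped deviation estimate."""
    win_ratio = wins / n
    # Variance estimate plus its own exploration term, capped at 0.25.
    variance_bound = min(
        0.25,
        win_ratio * (1.0 - win_ratio) + math.sqrt(2.0 * math.log(N) / n),
    )
    tuned_standard_deviation = math.sqrt(variance_bound)
    return win_ratio + tuned_standard_deviation * math.sqrt(factor * math.log(N) / n)


if __name__ == "__main__":
    # Hypothetical counts: 5 wins in 10 visits, parent visited 100 times.
    print(ctest_score(5, 10, 100), tuned_score(5, 10, 100))
```

Because the variance term is capped at 0.25, tuned_standard_deviation never exceeds 0.5, so with the extra 0.8 factor the tuned bonus is always at most the flat one for the same counts.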
Re: [computer-go] cgos viewer feature request
Hi Don,

Apologizing too much is a (well known :-) Japanese custom. Please never mind.

I agree that the remaining time is more useful than the elapsed time of a move and, also, some kind of explanation would be very helpful.

Anyway, thank you for such a nice improvement. It's very useful for me now.

Hideki

Don Dailey: <[EMAIL PROTECTED]>:
>Nothing to apologize over!
>
>I probably should have put some kind of column headings or something.
>
>I thought it more useful to be a count-down timer - it's much easier to
>calculate how much time you took by subtraction with the previous entry
>than it is to see how much time is left by adding up every entry.
>
>- Don
>
>Hideki Kato wrote:
>> Hi Don,
>>
>> Now I understand the time is the time left! So your code is not
>> wrong (_ _).
>> # A Japanese facemark that means I'm sorry. Don't rotate your head.
>>
>> Hideki
Re: [computer-go] cgos viewer feature request
Nothing to apologize over!

I probably should have put some kind of column headings or something.

I thought it more useful to be a count-down timer - it's much easier to calculate how much time you took by subtraction with the previous entry than it is to see how much time is left by adding up every entry.

- Don

Hideki Kato wrote:
> Hi Don,
>
> Now I understand the time is the time left! So your code is not
> wrong (_ _).
> # A Japanese facemark that means I'm sorry. Don't rotate your head.
>
> Hideki
>
> Hideki Kato: <[EMAIL PROTECTED]>:
>> Hi Don,
>>
>> Don Dailey: <[EMAIL PROTECTED]>:
>>> Hideki Kato wrote:
>>>> Hi Don,
>>>>
>>>> Thank you for the additional feature.
>>>>
>>>> It seems, however, strange. Does 04:19 mean 41.9 seconds?
>>>>
>>>> Hideki
>>>
>>> Hi Hideki,
>>>
>>> 04:19 means 4 minutes and 19 seconds. I don't understand how it could
>>> mean 41.9 seconds.
>>>
>>> Is this an international thing? Is there a better way that's more
>>> understood than what I am doing?
>>
>> If so, all clients on 9x9 should lose by time. But now
>> I'm sure that the time is completely wrong. Please check your code.
>>
>> Hideki
>>
>>> - Don
>>>
>>>> Don Dailey: <[EMAIL PROTECTED]>:
>>>>> I just updated the current viewer to version 0.35.
>>>>>
>>>>> I added the remaining time display.
>>>>>
>>>>> The default server is the 19x19 server, so if you use it for 9x9 you must
>>>>> specify the server and port. You must use the options like this:
>>>>>
>>>>>    cgosview -server server_name -port portnum -games 1,2,3,4,5
>>>>>
>>>>> Games is optional, but it will pop up all the specified games.
>>>>>
>>>>> - Don
>>>>>
>>>>> Jason House wrote:
>>>>>> On Thu, 2007-11-01 at 17:05 -0400, Don Dailey wrote:
>>>>>>> The source code is included - even though you probably don't realize
>>>>>>> it. There is a utility that will unpack the kit and reveal the source
>>>>>>> code. Then you can fix it, pack it back up and run it. Google for
>>>>>>> sdx.kit and tclkit and equi4 and you will find the details. It's a
>>>>>>> tcl/tk program.
>>>>>>
>>>>>> For those considering doing this, here's some help... (I hope, I'm just
>>>>>> figuring this out now)
>>>>>>
>>>>>> You'll need to download tclkit from [1] and sdx from [2]. sdx assumes
>>>>>> it can find "tclkit", so after unpacking, you must rename it.
>>>>>> Similarly, sdx will download as sdx.kit and the instructions I found say
>>>>>> to rename it to sdx.
>>>>>>
>>>>>> The sdx unwrap (and wrap) should then be all that's needed to access the
>>>>>> source.
>>>>>>
>>>>>> [1] http://www.equi4.com/pub/tk/8.4.16/
>>>>>> [2] http://www.equi4.com/wikis/equi4/206
>>>>>>
>>>>>>> I think this would be relatively easy to do and a good feature. While
>>>>>>> you are at it, add the feature to display the time used for each move
>>>>>>> :-)
>>>>>>>
>>>>>>> - Don
>>>>>>>
>>>>>>> Chris Fant wrote:
>>>>>>>> It would be nice to be able to automatically follow all the games for
>>>>>>>> a certain bot.
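As a small illustration of the count-down convention discussed in this thread (mm:ss of time remaining, with time used obtained by subtracting consecutive entries), here is a short Python sketch. The helper names (parse_clock, time_used) and the sample clock values are hypothetical; the viewer itself is a Tcl/Tk program and this is not taken from its source.

```python
def parse_clock(text):
    """Convert a 'mm:ss' remaining-time string into seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + int(seconds)


def time_used(remaining):
    """Seconds spent on each move, given one player's successive clock readings."""
    secs = [parse_clock(entry) for entry in remaining]
    return [prev - cur for prev, cur in zip(secs, secs[1:])]


if __name__ == "__main__":
    # Made-up readings for one player, starting from a 5 minute clock.
    print(parse_clock("04:19"))                    # 259 seconds left
    print(time_used(["05:00", "04:19", "04:02"]))  # [41, 17]
```

The readings passed to time_used should all belong to the same player, since each side has its own clock.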