Re: [Computer-go] Machine for Deep Neural Net training
Looks like you're making good progress. Apart from the time gained in training, you'll probably get a similar speed-up when using the DNN during play? I'm curious when you'll see the improvement in play outweigh the extra computational cost.

Mark

> On Apr 26, 2016, at 9:55 PM, David Fotland wrote:
>
> I have my deep neural net training setup working, and it's working so well I want to share. I already had Caffe running on my desktop machine (4-core i7) without a GPU, with inputs similar to AlphaGo generated by Many Faces into an LMDB database. I trained a few small nets for a day each to get some feel for it.
>
> I bought an Alienware Area 51 from Dell, with two GTX 980 Ti GPUs, 16 GB of memory, and 2 TB of disk. I set it up to dual-boot Ubuntu 14.04, which made it trivial to get the latest Caffe up and running with cuDNN. 2 TB of disk is not enough. I'll have to add another drive.
>
> I expected something like a 20x speedup on training, but I was shocked by what I actually got.
>
> On my desktop, the Caffe MNIST sample took 27 minutes to complete. On the new machine it was 22 seconds. 73x faster.
>
> My simple network has 42 input planes, and 4 layers of 48 filters each. Training runs about 100x faster on the Alienware. Training 100k Caffe iterations (batches) of 50 positions takes 13 minutes, rather than almost a full day on my desktop.
>
> David

___ Computer-go mailing list Computer-go@computer-go.org http://computer-go.org/mailman/listinfo/computer-go
Re: [Computer-go] Numenta
Somehow reading this list has fallen off my radar... I looked at the Numenta stuff some years ago. Jeff Hawkins' book On Intelligence is an inspiring read and has some really good insights. So when he started Numenta I had considerable expectations. However, what I thought were the most interesting insights, mostly related to feedback loops, were completely eliminated from the actual implementation. On top of that, the fact that the pattern recognizer only recognizes a pattern universally after it has been shown in every possible location and orientation is a big drawback. I don't believe that's how the brain does it at all. Initially I thought it was simply because they were just starting out. But a couple of years later it was still the same. Numenta may of course have made advances in the years since, but none that caught my attention.

From the book I got some ideas on how to use massive feedback loops to do natural language understanding, but unfortunately I haven't had time to explore them. It may just be a red herring as well.

Mark

On Feb 18, 2015, at 2:34 AM, Petr Baudis pa...@ucw.cz wrote:

> Hi!
>
> On Tue, Feb 17, 2015 at 12:26:50PM -0800, Michael Alford wrote:
>> I thought some of you might be interested in this: http://numenta.com/#hero
>
> Did you mean to link to just the home page? Here's a short teaser :) https://www.youtube.com/watch?v=RojcnwnEzSQ
>
> I guess it's not strictly on-topic, but I think I should add that Numenta is actually a bit controversial in machine learning circles, at least it was when I last checked it out, since they have a lot of theory and marketing but not so many results to motivate it... See e.g.
> http://www.reddit.com/r/MachineLearning/comments/25lnbt/ama_yann_lecun/chisjsc
> http://www.reddit.com/r/MachineLearning/comments/25lnbt/ama_yann_lecun/chj9pwm
>
> -- Petr Baudis
> If you do not work on an important problem, it's unlikely you'll do important work. -- R. Hamming http://www.cs.virginia.edu/~robins/YouAndYourResearch.html
Re: [computer-go] Re: Open source real time Go server
2010/1/19 terry mcintyre terrymcint...@yahoo.com:
> ( I recall a pro making such an observation; I was willing to accept his expertise on the matter. )

Any pro making such a comment at move 10 is just grand-standing. I have experienced pros making such comments too. You can let such a remark pass out of politeness of course, but accepting it because of his presumed expertise is extremely naive IMO. I would even be suspicious of pros making such comments at move 150.

Mark
Re: [computer-go] Reference Montecarlo TreeDecision Bot.
I took AMAF to be the process of considering all the moves regardless of when they were played in the sequence (although a slight discount for later in the sequence seems to help a little), whereas RAVE uses an undefined method to favour some nodes over others prior to expanding them. The reason (as far as I understand so far) they get confused is that a popular method to use in RAVE is in fact using AMAF values.

Mark

On Tue, Dec 15, 2009 at 2:12 AM, Magnus Persson magnus.pers...@phmp.se wrote:

Quoting Petr Baudis pa...@ucw.cz:

On Mon, Dec 14, 2009 at 10:37:24PM -0800, Peter Drake wrote: It's easy to get confused -- different researchers use the terms slightly differently. They both gather data on moves other than a move made from the current board configuration. I would say that AMAF stores statistics on every move played from any position, while RAVE only stores info on moves played from descendants of the current position. Consequently, AMAF uses a global table, whereas RAVE data must be stored at every node.

I guess that is a good definition; I assume this difference arises from whether you use tree or flat MC, so for me, AMAF in a tree always means from descendants of the current position. Instead, to me AMAF is the data collected, while RAVE is the way to apply the data in the node urgency computation (which furthermore splits into what I call for myself Sylvain Gelly's RAVE vs David Silver's RAVE, of course...).

This is also how I have been interpreting AMAF and RAVE, after initially being confused into thinking they were just two names for one thing. I think it's because I haven't seen this approach evolve and I'm not too familiar with the pre-RAVE AMAF, so perhaps I underestimate how revolutionary the descendants-only idea was.

AMAF was first used with programs that did not build a tree. Perhaps this is why Peter Drake makes this interpretation.
When I implemented AMAF in Valkyria it always seemed self-evident that descendants-only is the only good way of making use of it, since the search was so deep that the positions cannot be compared.

Best
Magnus
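For readers new to the thread, the "global table" flavour of AMAF that Peter Drake describes above can be sketched roughly as follows. This is a hypothetical minimal illustration in Java, not code from any of the refbots mentioned; the class and method names are made up, and real programs weight or discount these statistics in various ways.

```java
// Minimal all-moves-as-first (AMAF) bookkeeping: after each playout,
// every move is credited regardless of when it occurred in the
// sequence. This is the "global table" variant; RAVE would keep the
// same kind of statistics per tree node instead.
public class AmafTable {
    private final int[] wins;
    private final int[] plays;

    public AmafTable(int boardPoints) {
        wins = new int[boardPoints];
        plays = new int[boardPoints];
    }

    /** Credit all moves of one finished playout.
     *  moves[i] = point played, colors[i] = color that played it;
     *  each point is counted at most once per playout. */
    public void update(int[] moves, int[] colors, int winningColor) {
        boolean[] seen = new boolean[plays.length];
        for (int i = 0; i < moves.length; i++) {
            int p = moves[i];
            if (seen[p]) continue;
            seen[p] = true;
            plays[p]++;
            if (colors[i] == winningColor) wins[p]++;
        }
    }

    /** AMAF win rate of a point; 0.5 when there is no data yet. */
    public double winRate(int point) {
        return plays[point] == 0 ? 0.5 : (double) wins[point] / plays[point];
    }
}
```

The "slight discount for later in the sequence" mentioned above would replace the fixed increment with a weight that decays with the move index i.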
Re: [computer-go] Gongo: Go in Go
The relative values look about right. But I remember getting much higher numbers. Did you run the Java versions with or without the -server parameter?

Mark

On Mon, Dec 14, 2009 at 11:00 PM, Brian Slesinsky br...@slesinsky.org wrote:

> Okay, I added a few more timings (playouts / second, very rough):
>
> Plug-and-Go refbot:        14700
> CRef bot (-O3):            12500
> Gongo 1 Java bot:           6500
> CRef bot (no optimization): 5882
>
> Note that Gongo and Plug-and-Go are using different board data structures than the others.
Re: [computer-go] Reference Montecarlo TreeDecision Bot.
> I've found that AMAF gives very little boost with light playouts, you really need something fairly meaningful already to get any kind of real boost. To have at least 10% chance of beating GNUGo with reasonable time per game (to be able to play-test your bot), I think you can't avoid doing a lot more than plain UCT + AMAF + light playouts.

I suppose it depends on what exactly you try to boost. As others have already stated, with just light playouts and no tree-search, AMAF gives a huge boost. With light playouts and UCT tree-search, AMAF still gives a considerable improvement. Maybe you can try to search the archives, I think I posted about it. My MCTS ref-bot has a parameter called 'useAMAF'. You can experiment with switching it on and off with different numbers of playouts if you want to. If I remember correctly, the difference was very obvious, even with a high number of playouts.

Also, I find the latter statement an exaggeration. Plain UCT + AMAF + semi-light playouts beats GNUGo without too much work. All I needed was to include some simple tactics and a few tricks to prune the tree early in the game (i.e. not playing on the 1st and 2nd line).

Mark
Re: [computer-go] Reference Montecarlo TreeDecision Bot.
On Dec 13, 2009, at 8:08 AM, Denis fidaali wrote:

> So it would certainly be useful if people could agree on a reference Monte Carlo tree bot (and provide some reference implementations in popular languages). It would obviously be based on the reference light-bot.

This is what I attempted in the 'plug-and-go' project. Apart from making it easy to get started, I included a Java implementation of the AMAF refbot and of what could serve as a reference MCTS bot with light playouts. But I don't think it got much attention, and now I don't have time for it. But it's there for anyone to take a look at. At best I can spend a few hours here and there if people want it.

Mark
Re: [computer-go] Kinds of Zobrist hashes
On Tue, Dec 8, 2009 at 8:37 PM, David Fotland fotl...@smart-games.com wrote:

> I use two values. It never even occurred to me to use three.

I use one, it never occurred to me to use two :) At the time I'd never heard of Zobrist, but I used to use a single value for each point to look up joseki. By using white-value = -black-value I could recognise the same joseki with colors inverted simply by using -value.

Mark
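The single-value trick Mark describes (white-value = -black-value, so a color-swapped position hashes to the negated key) might look roughly like this. This is a hedged sketch in Java, not anyone's actual code: note that the per-point values are summed rather than XOR-ed, since XOR has no notion of negation, and all names here are illustrative.

```java
import java.util.Random;

// One signed value per point, with whiteValue(p) == -blackValue(p).
// Summing the values of all occupied points gives a position key whose
// negation is the key of the same position with colors swapped —
// so one lookup matches a joseki in both colorings.
public class SignedZobrist {
    private final long[] black; // black[p]; white's value is -black[p]

    public SignedZobrist(int points, long seed) {
        Random rnd = new Random(seed); // fixed seed: reproducible table
        black = new long[points];
        for (int p = 0; p < points; p++) black[p] = rnd.nextLong();
    }

    /** stones[p]: 0 empty, 1 black, 2 white. */
    public long key(int[] stones) {
        long k = 0;
        for (int p = 0; p < stones.length; p++) {
            if (stones[p] == 1) k += black[p];
            else if (stones[p] == 2) k -= black[p];
        }
        return k;
    }
}
```

The price of using addition is that incremental updates must subtract exactly what was added (rather than XOR-ing again), but the color symmetry comes for free.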
Re: [computer-go] KCC won the 3rd UEC Cup
> Well, I suppose this is a lesson that every computer-Go programmer learns one day or another: always have a way to accept any move, no matter whether your bot thinks it's legal or not.

I don't see how this is an indictment; the rules are what they are, for every player. It's not as if this were a little-known issue with Japanese rules.

Mark

On Sun, Nov 29, 2009 at 10:57 PM, Stefan Kaitschick stefan.kaitsch...@hamburg.de wrote:

>> Crazy Stone (CS) lost the first game due to a wrong ko setting. The opponent of CS played a superko violation which was legal under Japanese rules, and CS lost the game by a foul.
>
> The most devastating indictment against Japanese rules I've seen so far.
>
> Stefan
Re: [computer-go] Rated bots on KGS
On Nov 19, 2009, at 1:24 AM, Nick Wedd wrote:

> So a bot which appears on KGS, gets rated as 25k, plays hundreds of games, and then improves to 15k, is going to do a lot of damage to the rating system.

What happens when all those beginners that start at 25k improve, many of them well beyond 15k? I often wonder if rating is actually an exact science ;-)

Mark
Re: [SPAM] Re: [computer-go] First ever win of a computer against a pro 9P as black (game of Go, 9x9).
Roger Penrose thinks the human brain can do things a Turing machine cannot. (Note: I don't say 'computer'.) He claims it's due to some quantum-physical effects used by the brain. I doubt his ideas are correct, but he did have a few interesting chess positions to support his theory. Typically they would contain a completely locked position, say a V-shaped pawn formation and bishops on the wrong color to pass the pawn ranks. These types of positions are very easily analyzed by even mediocre players, yet a computer never gets the right answer.

Basically what it shows is that the human brain is able to conceptualize certain things that enable it to reason about situations that cannot be calculated by brute force. I don't claim that a Turing machine cannot do such things as well if programmed well, but it's very easy to see that there could be barriers to computers, no matter how much computing power you give them, if they rely solely on a simple method with brute force.

Mark
Re: [computer-go] First ever win of a computer against a pro 9P as black (game of Go, 9x9).
On Oct 27, 2009, at 7:41 PM, Hideki Kato wrote:

> IMHO, Jeff's idea is still very interesting, while the implementation by the staff at Numenta has not been going in the right direction.

That was also my opinion. What I thought was strange is that Numenta's implementation doesn't have feed-back connections, which is a cornerstone of the ideas in the book.

> Those playouts are done in the cerebellum using some associative memory, I believe. Then the mechanism, how to communicate with the cerebrum, is a mystery, assuming some kind of tree search is done in the cerebrum.

It's not so clear to me that there's a sharp boundary between the activity of the two. It seems the tree search is done in the cerebral cortex. But that may simply be because we're conscious of it. It's unclear what exactly happens during the unconscious processes. It may also be a form of tree search that blends in with the conscious process. Knowledge about how the brain works is growing, but I believe it's mostly still a mystery. The way it's being observed currently is mostly like trying to figure out a computer program by observing a piece of computer memory on the screen. You see bits flashing on and off, but you have to guess what instructed it to do so.

> The games in the last Meijin-sen in Japan, Iyama vs Cho, may support your thought.

I'm rather out of touch with what happens in tournaments. I've never heard of Iyama, and even Cho could be a different one than I know. What happened in that match that is relevant to this discussion?

Mark
Re: [computer-go] First ever win of a computer against a pro 9P as black (game of Go, 9x9).
On Oct 27, 2009, at 3:39 AM, Hideki Kato wrote:

> I strongly believe that such patterns must not be only spatial (static) but also temporal, i.e., dynamic, or sequences of patterns which allow the player to quickly remember the results of local fights or LD.

I think that's exactly right. At least for humans. Maybe for computers there's another way.

After reading On Intelligence (anyone follow my advice and read it?) I got to thinking the human brain possibly does a lot of little playouts in parallel. Not random whole-board playouts from beginning to end, but short, local playouts, following strong patterns at each choice. Each time the result is fed back into the first layer so that the result of this playout gets used to guide the next playout. And the variance of the outcome of each of these playouts gets fed into the next layer to recognise higher-level concepts. Maybe for a few levels until it reaches a conscious level.

The reason why thinking longer only helps marginally is that these small playouts follow a limited set of patterns. It takes time and practice to add these patterns; you can't easily and consciously add a pattern in there during the game. So 'thinking' is restricted to a higher level, trying to think steps ahead in the game. Obviously this helps a lot for strength too, and pros are very good at that too. But with each stone you read ahead it becomes harder for your brain to do the pattern-matching, because it doesn't have the complete (visual) input. So humans tend to think ahead in rather fixed sequences along the lines of play in the patterns that are followed sub-consciously.

So when Sakata claimed he could read ahead 30 moves in a blink, he wasn't doing a search of lots of possibilities. Instead, his brain was able to do these little playouts a lot deeper than mere mortals can. Most likely the main candidates all come up in the first (split) second. The rest of the time he spends verifying their results.

This is all rather speculative of course.
Christian Nentwich is right that it would be nice if we could measure this somehow. That's going to be difficult. But it shows a bit of why I hold the opinion that I do.

Mark
Re: [computer-go] OT: AI article I found interesting
2009/10/24 Dave Dyer dd...@real-me.net:
> At 10:12 AM 10/24/2009, Joshua Shriver wrote:
>> Came across this today, and since this is also an AI oriented list thought some of you might enjoy it too. http://www.techradar.com/news/world-of-tech/future-tech/the-past-present-and-future-of-ai-643838
>
> I won't believe it even if I see it. Google Mechanical Turk

I heard about this a few months ago. My first thought was similar: seeing is believing. From what I read about it, it's a system built only to play Jeopardy and specifically tailored to produce answers in this fashion. So that narrows the scope quite a bit from actual natural language understanding. And this article's author either didn't understand it or is being disingenuous: because it was built to play Jeopardy from the start, it's not at all the case that its language understanding turned out so good they decided to let it play the game to see how well it would do. This kind of twisting of the truth just raises more doubts with me. Having said all that, if the program can do even remotely what they claim it can do, it would already be a big advancement. It just so happens I'm allergic to hype.

The article also mentions Jabberwacky. For my project I have looked at quite a few chat-bots, including Jabberwacky. I didn't feel it stood out from all the other main ones. And all have a very artificial feel a few sentences into a conversation. We have a custom-made chat-bot that was made by Bruce Wilcox (yes, the world is small) and I think it's on par with the most famous ones. It suffers from the same shortcomings in that it doesn't really understand what it's talking about. Humans can be fooled by them for a little bit, but it soon becomes very artificial because of the lack of understanding and lack of the most basic logic. Progress is being made. But very, very slowly.

Where the article says "The Watson project isn't a million miles from the fictional HAL project", I'm afraid it is indeed not a million miles, but a billion miles.
Mark
Re: [computer-go] First ever win of a computer against a pro 9P as black (game of Go, 9x9).
2009/10/26 Don Dailey dailey@gmail.com:
2009/10/26 Richard J. Lorentz lore...@csun.edu

> Yes, this group does not have a consensus at all on this. On the one hand we hear that MCTS has reached a dead end and there is no benefit from extra CPU power, and on the other hand we have these developers hustling around for the biggest machines they can muster in order to play matches with humans! And Olivier claims that computers benefit more from additional thinking time than humans!

Well, we had this discussion a while back on this list. I (and some others) argued that humans play fast extremely well and that more time provides a rapidly decreasing benefit. If I remember correctly, it was you who argued that this was not the case and that pros benefit greatly from more time. So it seems we're starting to see some support for the argument that, at least in Go, professional players don't benefit as much from more time as computers currently do.

Mark
Re: [computer-go] First ever win of a computer against a pro 9P as black (game of Go, 9x9).
2009/10/26 Don Dailey dailey@gmail.com:
> Yes, you understood me right. I disagree with Olivier on this one. To me it is self-evident that humans are more scalable than computers because we have better heuristics. When that is not true it is usually because the task is trivial, not because it is hard.

Personally I rather think that what makes a human good at certain tasks is not necessarily a conscious effort, and that doesn't have to be a trivial task. So actively thinking longer doesn't help as much, because you lack control over the thought-process. I very much believe that Go falls into that category, where Chess does not.

Mark
Re: [computer-go] Great Wall Opening by Bruce Wilcox
On Oct 19, 2009, at 7:04 PM, Petri Pitkanen wrote:

> Not really a computer-Go issue, but I do not think that the great wall is superior even when completed. It is not too bad, but it needs a definite strategy from the wall's owner, i.e. building side moyos using the wall as a roof and hoping that the other guy gets nervous and jumps in. So being patient is a pretty good defence against it.

Even when completed I think it's inferior. But that doesn't mean you can take it lightly. There's a psychological component to it that makes it easy for the opponent to make a mistake. Also, White may have some advantage by having more experience with the strategy. But I agree with the pro that if you disrupt it by preventing completion with the last move, it really turns disadvantageous for Black.

Mark
Re: [computer-go] Neural networks - numenta
Also not Remi, but...

Numenta is a startup funded by Palm founder Jeff Hawkins. He started it as a follow-up to his book 'On Intelligence', which I think is a very interesting read. I'd suggest it as reading to anyone considering applying some form of neural simulation to Go or any other problem.

At some point I did try to figure out what Numenta was trying to do. To some extent it sounds interesting, but of limited use. IMO it strayed quite a bit from the original ideas in Hawkins' book. At least that is my impression. And what I don't like is that the solution doesn't translate. For a pattern-matcher for Go, for example, it has to learn the patterns at all possible board locations in all orientations. The abstraction part, where you see a pattern on one part of the board and are then able to recognise it on another part or in another rotation of the board, is missing. But that doesn't necessarily mean there aren't any interesting ideas in there. I just think it's not nearly far enough along to be generally useful.

Mark

On Wed, Oct 14, 2009 at 6:03 AM, David Fotland fotl...@smart-games.com wrote:

> Remi, what do you think of Numenta http://www.numenta.com/, a startup that is using feedforward/feedback networks to model learning and pattern recognition in the neocortex. Does this approach make sense or is it just startup hype? http://www.numenta.com/for-developers/education/biological-background-htm.php
>
> David
>
> -Original Message-
> From: computer-go-boun...@computer-go.org [mailto:computer-go-boun...@computer-go.org] On Behalf Of Rémi Coulom
> Sent: Wednesday, October 14, 2009 6:07 AM
> To: computer-go
> Subject: Re: [computer-go] Neural networks
>
> Petr Baudis wrote:
>> Hi! Is there some high-level reason hypothesised about why there are no successful programs using neural networks in Go? I'd also like to ask if someone has a research tip for some interesting Go sub-problem that could make for a nice beginner neural networks project.
Thanks,

At the time when it was fashionable, I would have sold my pattern-Elo stuff as a neural network because, in neural-network jargon, it is in fact a one-layer network with a softmax output.

Since the development of support-vector machines, neural networks have been considered completely obsolete in the machine-learning community. From a marketing point of view, it is not a good idea to do research on neural networks nowadays. You must give your system another name.

Rémi
Re: [computer-go] Pattern matching
On Sat, Oct 10, 2009 at 5:32 PM, Álvaro Begué alvaro.be...@gmail.com wrote:

> Are you not going to tell us what this new job is about?

I almost forgot to answer this; I had no intention to sound mysterious. My job is to make autonomous avatars (also called NPCs or 'bots') for a new MMO platform called Blue Mars. The immediate goal is to make them more interesting, intelligent and interactive than the zombie bots currently populating such worlds. That is a pretty easy target. The ultimate goal is to make them actually seem intelligent and indistinguishable from real players, or even replace a real player when he/she is not online. I've been given a practically free hand in how I think this is best accomplished. I've only just started laying the groundwork, but already there are some results that people are excited about. So far so good.

Mark
Re: [computer-go] Pattern matching
Someone commented on it, so it must have come across OK. Is it possible that your e-mail provider or e-mail reader decided not to include the attachment? At least that's what 'skipped content' seems to indicate. If I need to repost or publish it in another way I'd be happy to do so.

Mark

On Tue, Oct 13, 2009 at 10:54 AM, Brian Sheppard sheppar...@aol.com wrote:

>> The following article (attached)
>
> For me, I see no attachment. Only:
>
> -- next part --
> Skipped content of type multipart/mixed
>
> Does anyone know what I have to do to see the attachment?
Re: [computer-go] string criticality?
Maybe from a different angle, but perhaps you remember me writing about 'stone-age'. Basically what it did was assume that strings created during the playout are less critical than existing strings. I used this to limit my tactical search a great deal, by not doing any search on 'new' strings. This didn't affect strength in any way I could measure, but it reduced the overhead of tactics (I don't remember exactly, but I think it cut it in half).

I think the single most useful piece of information about strings or groups would be whether they have two eyes. Unfortunately I haven't been able to think of a cost-effective way to compute that yet :)

Mark
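The 'stone-age' idea above boils down to tagging each string with the move number at which it was created, and only spending tactical search on strings that predate the playout. A hypothetical sketch in Java; the names and the exact cut-off are illustrative, not from Mark's program:

```java
// "Stone-age" sketch: every string remembers its birth move. Strings
// born after the playout started (i.e. that exist only inside the
// simulation) are assumed non-critical and skipped by tactical search.
public class StoneAge {
    public static final class GoString {
        final int birthMove; // move number at which this string was created
        GoString(int birthMove) { this.birthMove = birthMove; }
    }

    /** Only strings that already existed when the playout began
     *  (part of the "real" position) get the expensive tactics. */
    public static boolean deservesTactics(GoString s, int playoutStartMove) {
        return s.birthMove < playoutStartMove;
    }
}
```

With this cut-off, merging two old strings during the playout should keep the older birth move, so that established strings never lose their tactical treatment.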
Re: [computer-go] Stone-Age
On Mon, Sep 14, 2009 at 10:55 AM, Stefan Kaitschick stefan.kaitsch...@hamburg.de wrote:

> Stone-Age - spooky concept :-) I suppose it has some relationship to generally lighter playouts deeper in the tree. Have you experimented some more with this?

No, I didn't have time to explore this further.

> Perhaps the cutoff point should be somewhere in the future though, moving towards the present as the game progresses. Otherwise you completely disable patterns for the first move, which doesn't seem right.

I'm not sure. I used as cut-off the start of the playout, which, being partly down the tree by a few moves (at least), seems to work fine. My main 'requirement' was that it didn't lose strength with an equal number of playouts. Since that requirement was met, I didn't look any further. I'd be surprised if pushing it further out into the future would actually *gain* strength, whereas it would certainly lose speed.

> Stefan
>
> I feel ungrateful for saying this, but a search in the archive would be great.

You mean you don't keep your mail? My post about this was on Feb 2nd. GMail is your friend ;-)

Mark
Re: [computer-go] any mac programmers out there?
Just being curious, I decided to give it a try and see if Fuego would compile on a Mac. The configure script stops, saying 'boost' is not installed. So I downloaded the latest version of that (it's huge!) and set a BOOST_ROOT environment variable, but it still says it can't find it. Does anyone know what's the matter there? I must admit I didn't spend a whole lot of time trying to figure it out.

Mark
Re: [computer-go] any mac programmers out there?
On Sep 6, 2009, at 4:20 AM, Don Dailey wrote:

> I tried both llvm-gcc and clang. I did not have any trouble getting them to work for my 64-bit chess program. I didn't try too hard, but neither is producing executables as fast as gcc: llvm-gcc is the slowest, about 20% slower than gcc, and clang is only a little slower than gcc. Since I developed with gcc, it is very likely that the program and the way I write code are tuned to work well with gcc. Perhaps I will try this with the Go program, which is not heavily optimized. I grabbed and compiled the latest llvm and clang -- so I cannot be accused of using outdated versions. And I didn't use the debug versions either. But I will keep my eye on llvm and clang.

From what I've seen, LLVM should be comparable to gcc or faster. Of course, whenever anyone publishes this kind of comparison you have to wonder how biased they are. And supposedly compile times are several times faster than gcc's, which doesn't matter for the final product of course, but is nice during development. Maybe it would be interesting to compile a ref-bot on the Mac and see how it compares. Would Fuego compile on a Mac with Xcode? That might provide an even more real-world comparison.

From what I've read so far, it sounds like Objective-C 2.0 offers many of the things I like about Java. And then it offers a few niceties Java doesn't offer (yet). It also claims a seamless connection to C code. Java and C# can call into C code, but doing it right is so much work you'd think twice before doing it unless you have a substantial library that stands on its own. If it's really seamless, there's little that stops you from sticking in a few small routines in plain C that are optimized to the bone.

Mark
Re: [computer-go] any mac programmers out there?
On Sep 5, 2009, at 4:41 AM, terry mcintyre wrote:

> Found an interesting article on Snow Leopard at Ars Technica ... 20-some pages. http://arstechnica.com/apple/reviews/2009/08/mac-os-x-10-6.ars Of interest to computer-Go programmers: the addition of blocks to C, which allow closures and other fun stuff, much like Lisp; LLVM, which allows JIT compilation to multiple architectures, including GPUs; Grand Central Dispatch, which provides very light-weight concurrency; CLANG, a new compiler which is said to be quite an improvement over GCC; and OpenCL, which leverages LLVM to program GPUs.

Seems interesting indeed. Does anyone know how Objective-C 2.0 compares in speed to C? I like the promise of abstracting the CPU to the point where you can execute on either the CPU or the GPU, depending on which is available and which is suitable. I also like the blocks; they seem a little more elegant and more flexible than anonymous functions in Java. Combined with light-weight concurrency, that makes for an interesting combination.

Mark
Re: [computer-go] MoGo policy: capture stones anywhere?
2009/9/1 Peter Drake dr...@lclark.edu:

> On Sep 1, 2009, at 8:11 AM, David Fotland wrote:
>> I don't think any of the strong programs use pseudo-liberties.
>
> Interesting! Can those involved with other strong programs verify this? My board code is descended from my Java re-implementation of libEGO. I tried writing one using real liberties earlier, and it was considerably slower in random playouts.

I started out by looking at Orego's code when I first tried MC ;-). Since then I found that even with very light playouts, pseudo-liberties are only marginally (like a few percent) faster than keeping actual liberty counts. This small performance hit is easily recovered because other parts of the program then have the real number of liberties available at all times. The coding is just a bit more work to make it efficient. But you can check the plug-and-go project for a Java implementation.

Mark
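The difference under discussion can be illustrated with a toy sketch (not anyone's actual engine code; all names are invented). With real liberties, a chain's liberty points are kept as a set, so merging two chains is a set union and the exact count is always on hand:

```java
import java.util.HashSet;
import java.util.Set;

// Toy chain record keeping *real* liberties as a set of board points.
// Unlike pseudo-liberties, the same empty point is never counted twice,
// so the exact liberty count is always available.
class Chain {
    final Set<Integer> liberties = new HashSet<>();

    // Merging two chains is a set union; duplicate liberty points
    // disappear automatically, which is exactly what makes the count real.
    void merge(Chain other) {
        liberties.addAll(other.liberties);
    }

    int libertyCount() {
        return liberties.size();
    }
}
```

The extra bookkeeping over a pseudo-liberty counter is the "bit more work" mentioned above; the payoff is that atari and two-liberty tests become trivial reads of `libertyCount()`.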
Re: [computer-go] Mercy rule position
On Tue, Aug 18, 2009 at 10:24 AM, Brian Sheppard sheppar...@aol.com wrote:

> What do you do in your program?

I'm not using the mercy rule. I believe you can gain 10%-20% performance on 9x9 using a mercy rule, but in its most simple form I don't see how it can be used reliably. I don't know if the gain in performance offsets the small number of problems you'll be getting, but if you add in the amount of time you'll be chasing red herrings like these, I don't think it's worth it.

I have thought of keeping track of the number of stones that are adjacent to two 'eyes' and aborting if that number, including the eyes, exceeds half the board. But it's not really the same thing. And it was never high enough on my to-do list to make it into an implementation.

Mark
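For reference, the "most simple form" of the mercy rule being debated is usually something like the following sketch (the threshold is a made-up tuning parameter, not from any particular program): abort the playout early and score it for the leading side once the stone difference on the board becomes hopeless.

```java
// Simple mercy rule sketch: end the playout early when one side's
// stone count on the board exceeds the other's by a large margin.
// THRESHOLD is an invented tuning value, not anyone's actual setting.
class MercyRule {
    static final int THRESHOLD = 30;

    // Returns +1 (black wins), -1 (white wins), or 0 (keep playing).
    static int check(int blackStones, int whiteStones) {
        if (blackStones - whiteStones > THRESHOLD) return +1;
        if (whiteStones - blackStones > THRESHOLD) return -1;
        return 0;
    }
}
```

The unreliability complained about above comes precisely from the cases where a huge stone difference does not decide the game (large groups that are in fact dead, seki, and so on).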
Re: [computer-go] representing liberties
On Aug 15, 2009, at 6:24 AM, Heikki Levanto wrote: You can also use board-sized bitmaps. Merging is a trivial OR operation. I've seen bit-maps mentioned many times, but is there any evidence it's faster than a 'traditional' implementation? Mark ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
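For concreteness, the bitmap representation Heikki mentions would look something like this sketch (illustrative only; whether it actually beats a plain liberty array is exactly the open question in this exchange): one bit per board point, merging two chains' liberties is a word-wise OR, and counting is a popcount.

```java
// Board-sized liberty bitmap for a 19x19 board: one bit per point,
// packed into longs. Merging is a trivial OR; counting uses popcount.
class LibertyBitmap {
    static final int WORDS = (19 * 19 + 63) / 64;  // 6 longs cover 361 points
    final long[] bits = new long[WORDS];

    void set(int point) {
        bits[point >> 6] |= 1L << (point & 63);
    }

    // Merging two chains' liberty sets: word-wise OR, duplicates vanish.
    void merge(LibertyBitmap other) {
        for (int i = 0; i < WORDS; i++) bits[i] |= other.bits[i];
    }

    int count() {
        int n = 0;
        for (long w : bits) n += Long.bitCount(w);
        return n;
    }
}
```

The merge is O(words) rather than O(liberties), and shared liberties are deduplicated for free; the question raised above is whether the constant factors beat a simple linear array in practice.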
Re: [computer-go] representing liberties
On Aug 15, 2009, at 8:52 AM, w...@swcp.com w...@swcp.com wrote: You will just have to jump in and read some code or write your own to fully understand. I recommend reading the gnugo source, which is pretty darn good. But that's exactly the kind of work you'd want to avoid if there's no reasonable grounds for believing it will gain any performance. Especially if it will clearly be very memory-inefficient. Mark ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Dynamic komi at high handicaps
I started to write something on this subject a while ago but it got caught up in other things I had to do. When humans play a (high) handicap game, they don't estimate a high winning percentage for the weaker player. They'll consider it to be more or less 50-50. So to adjust the komi at the beginning of the game such that the winning percentage becomes 50% seems a very reasonable idea to me. This is what humans do too, they'll assume the stronger player will be able to catch up a certain number of points to overcome the handicap. What seems difficult to me however is to devise a reasonable way to decrease this komi as the game progresses. In an actual game the stronger player catches up in leaps and bounds, not smoothly. In MC things are not always intuitive though. Mark ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Dynamic komi at high handicaps
2009/8/12 Don Dailey dailey@gmail.com: I disagree about this being what humans do. They do not set a fake komi and then try to win only by that much. I didn't say that humans do that. I said they consider their chance 50-50. For an MC program to consider its chances to be 50-50 you'd have to up the komi. There's a difference. I think their model is somewhat incremental, trying to win a bit at a time but I'm quite convinced that they won't just let the opponent consolidate like MCTS does. With dynamic komi the program will STILL just try to consolidate and not care about what his opponent does. But strong players will know that letting your opponent consolidate is not going to work. So they will keep things complicated and challenge their weaker opponents everywhere that is important. It's difficult to make hard claims about this. I don't agree at all that the stronger player constantly needs to keep things complicated. Personally I tend to play solidly when giving a handicap. Because most damage is self-inflicted. You can either make a guess what the weaker player doesn't know, or you can give him the initiative and he'll show you. I prefer the latter approach. When done properly, I don't see how an MCTS program would consolidate all the time. Doing so would keep the position stable while the komi declines. As soon as he gets behind the komi degradation curve play will automatically get more dynamic in an attempt to catch up. The problem is: we're speculating. The proof is in the pudding. Mark ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Dynamic komi at high handicaps
2009/8/12 Don Dailey dailey@gmail.com:

> If the program makes decisions about the best way to win N points, there is no guarantee that this is ALSO the best way to win N+1 points.

Although this is obviously true, that doesn't automatically mean it's not the best approach, because there's a hidden assumption in there: that it's not the best way to win by N+1 given proper play by the opponent thereafter, if not perfect, then at least as strong as the stronger player. Whatever your strategy, even when you catch up a lot there's no guarantee the opponent will keep making enough mistakes for you to win. Human players generally keep track of whether they are catching up 'enough' and will take more risk when their progress is not in line with the progress of the game.

I don't think anyone is trying to argue that adjusting komi is the perfect answer. But what apparently is observed (I never tried it myself) is that currently MCTS does poorly in handicap games. So the question is whether adjusting the komi would improve performance. The positions seem to be entrenched. But I have yet to see conclusive evidence or persuasive arguments one way or the other.

Maybe I should ask first, for clarity's sake: is MCTS performance in handicap games currently a problem?

Mark
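One concrete form of the komi-degradation idea discussed in this thread could look like the following sketch (purely speculative; the linear decay and the per-stone value are invented parameters, not anything tested here): start with a komi bonus that makes the handicap position look roughly 50-50 to the program, then shrink it toward zero as the game progresses.

```java
// Speculative dynamic-komi schedule for handicap games.
// initialKomi might be set to roughly make the starting position 50-50
// (e.g. some number of points per handicap stone, a guess); it then
// decays linearly to zero over the expected game length.
class DynamicKomi {
    static double komi(double initialKomi, int moveNumber, int expectedGameLength) {
        double progress = Math.min(1.0, (double) moveNumber / expectedGameLength);
        return initialKomi * (1.0 - progress);  // linear decay; other curves possible
    }
}
```

As argued above, the in-game behaviour would hinge on this schedule: while the program stays ahead of the decay curve it can play solidly, and once it falls behind the shrinking komi its play should automatically become more dynamic.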
Re: [computer-go] memory management for search trees(basic question)
There are many ways to skin a cat. I allocate them dynamically, but recycle nodes that are no longer needed, for performance. And I use 'aspect programming' that optionally (after a recompile) checks for dangling pointers.

Mark

On Jul 14, 2009, at 5:06 AM, Carter Cheng wrote:

> This is something I have been curious about, since I am somewhat new to writing code in languages which require explicit memory management (as opposed to having some sort of garbage collector do it for you). The question is how do most programs manage memory w.r.t. the search nodes? Is the memory preallocated in advance, or is there some strategy to grow the space required as the nodes accumulate during a search?
>
> Thanks in advance,
> Carter.
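The "allocate dynamically but recycle" scheme mentioned above is typically a free-list pool, along these lines (names and fields invented for illustration): nodes pruned from the tree go onto a free list and are handed out again before any fresh allocation happens.

```java
import java.util.ArrayDeque;

// Free-list node pool: dynamically allocated search-tree nodes are
// recycled instead of being left to the allocator/garbage collector.
class NodePool {
    static class Node {
        int move, wins, visits;  // illustrative payload
    }

    private final ArrayDeque<Node> freeList = new ArrayDeque<>();

    Node allocate() {
        Node n = freeList.poll();        // reuse a recycled node if available
        if (n == null) n = new Node();   // otherwise allocate a fresh one
        n.move = 0; n.wins = 0; n.visits = 0;  // always hand out a clean node
        return n;
    }

    void recycle(Node n) {
        freeList.push(n);
    }
}
```

In a garbage-collected language like Java this mainly avoids GC pressure during the search; in C/C++ the same pattern avoids malloc/free churn, and the dangling-pointer checks mentioned above guard against recycling a node that is still referenced.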
Re: [computer-go] Basic question concerning the edges of the board
On Mon, Jul 13, 2009 at 7:54 AM, David Doshay ddos...@mac.com wrote:

> My personal opinion is that way too much effort is put into optimizations that used to be very important when memory was small, but now are nice but not really needed. My bias is that efficiency is a good thing as long as it does not get in the way of easily understandable code, particularly for a new engine. I am not debating (and do not want to start a flame war on the subject) that good data structures lead to good programs, but I think that trying to wring every last bit out of the stored data is silly when machines these days have up to 8 GB of RAM.

I'm not sure I was the first to come up with the 20*21+1 idea. I suppose it's possible. But the reason for me was actually more a practical one than an optimization, even in the early days. When viewing the values in a debugger, multiples of 19 or 21 are mentally a lot more work than multiples of 20. For example, 356 makes x=16, y=17, which is much easier (for math-challenged people like me) to see than when the coordinate is represented by 339 or 373, because I only learned the multiplication tables up to 10 in elementary school.

Mark
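The 20*21+1 layout being discussed can be sketched as follows (an illustrative reconstruction; field and method names are invented). A 19x19 board lives in a 1-D array with a one-point border of EDGE cells, so index = x + 20*y, the four neighbours are always idx-1, idx+1, idx-20, idx+20, and no bounds checks are needed:

```java
// Padded 1-D board: 19x19 points inside a 20*21+1 array with EDGE
// guard cells, so neighbour offsets are fixed and bounds checks vanish.
// Matches the example in the text: index 356 means x=16, y=17.
class PaddedBoard {
    static final int WIDTH = 20;
    static final int SIZE = 20 * 21 + 1;
    static final byte EMPTY = 0, EDGE = 3;

    final byte[] board = new byte[SIZE];

    PaddedBoard() {
        java.util.Arrays.fill(board, EDGE);
        for (int y = 1; y <= 19; y++)
            for (int x = 1; x <= 19; x++)
                board[index(x, y)] = EMPTY;
    }

    static int index(int x, int y) { return x + WIDTH * y; }
    static int x(int idx) { return idx % WIDTH; }  // easy mental math in a debugger
    static int y(int idx) { return idx / WIDTH; }
}
```

With a 19-wide or 21-wide stride the same conversions would be 339 = 16 + 17*19 or 373 = 16 + 17*21, which is exactly the debugger-unfriendliness complained about above.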
Re: [computer-go] Another odd seki
I have thought about this kind of thing a bit, but it's hard to formulate a general solution. What happens when you prohibit certain 'bad' moves is that you slant the result in favour of the side with more 'bad' moves. This gets to be a problem in situations where there are few moves left to play, either globally or locally.

Mark

On Jul 11, 2009, at 12:09 AM, Magnus Persson wrote:

> Valkyria correctly behaves as if this is seki. It actually does not know it is seki explicitly, but it prunes all moves that would break the seki. The principle to do this as simply as possible is thus not actually to identify seki in general but to find simple rules that prune bad moves. This means one has to have many rules, but each rule becomes simple.
>
> Pruning J3 for X is a very neat example which is easy to do with the rich board representation of Valkyria. First the program at some point identifies J3 as a false eye. Some false eyes must be played and some must not be played. In this case XJ2 and XH3 both have two liberties; then V. looks at the corners of J3 and finds OH2, which also has two liberties. Now the liberties of XJ2 and XH3 that are not the false eye are matched to the liberties of OH2. Also, these liberties are at the moment suicidal for both players. This is enough to safely conclude that filling in the false eye cannot be a good move because it just connects two liberties that cannot be played anyway.
>
> But! There are always exceptions. In this case one exception is when the false eye is on the 1-1 point, only two stones are connected, and the resulting shape is a precursor to bent four in the corner. In that case filling in the false eye is essential. I cannot exclude that there are more exceptions to this rule, but so far it works nicely.
>
> Oh, by the way, I find it on the face of it harder to prune O J1. I know Valkyria prunes this but I do not remember exactly how it is done; I guess the code that handles sacrifices has to check for false eyes becoming real eyes because of the sacrifice, or something like that.
>
> -Magnus
>
> Quoting Brian Sheppard sheppar...@aol.com:
>
>> There is a seki in the lower left of the position below:
>>
>>   1 2 3 4 5 6 7 8 9
>> A - O X X - X - - -
>> B X O X X X X X - X
>> C - O O X X - X X -
>> D O O O O X X X X X
>> E X X O - O X - X O
>> F - X X O O O X X O
>> G O X X X O X X O -
>> H O O X X O X O - O
>> J - X - X O O O O -
>>
>> It is obvious that X cannot play F1. O cannot play F1 because that would sacrifice a straight four. Pebbles has followed Magnus's advice and added a rule that prevents the two-for-one trade that would occur when X plays J1, so that is also not a problem. And for O, playing J1 is ruled out because X has another eye on J3.
>>
>> What isn't obvious is that X cannot play J3. If X plays J3 then O follows with J1 atari, and X loses because of the nakade shape of O's stones.
>>
>> The bottom line is that Pebbles rates this position as hugely favorable for O, because X stumbles into J3 in the playouts. How does your program handle this situation?
>
> -- Magnus Persson Berlin, Germany
Re: [computer-go] 11'th Annual Malibu Go Party
On Fri, Jul 3, 2009 at 7:28 PM, Ray Tayekrta...@ca.rr.com wrote: please join us for an afternoon of surf, sand, and go. saturday, august 22'nd 2009, from 11 a.m. to 6 p.m or so, at 26918 malibu cove colony drive Sounds good. If the earth wasn't curved maybe we could wave at each other :) ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Complicated seki with Ko
You always need to be extremely careful about these kinds of heuristics. Especially in MC programs they can be detrimental very easily. But I believe you can come up with a reasonable rule to prevent some self-ataris. I have one which is something along the following lines; prune a move that:

- puts itself into atari
- has four occupied neighbours, one of which is its own color
- does not put the opponent in atari

Probably better rules can be invented. I would be surprised if any such rule is always 100% safe, but these at least cover your case. Another rule I just thought of that may work very well: don't put yourself into atari with more than one stone if you can play at the other liberty without putting yourself into atari.

Mark

On Wed, Jul 1, 2009 at 7:06 AM, Christian Nentwich christ...@modeltwozero.com wrote:

> |- - - - - - -
> |. * * o . . .
> |* . * o * . *
> |o . o o * . .
> |o * o . * . .
> |o o o * . . .
> |. * * * . . .
> |. . . . . . .
> |. * . . . . .
>
> Black to play and kill :)
>
> Christian
>
> On 01/07/2009 17:41, Magnus Persson wrote:
>
>> In this case one needs to check that after the two stones are captured, the capturing single stone can be recaptured, bringing us back to where we started. If it is a big eye there is no recapture.
>>
>> -Magnus
>>
>> Quoting Brian Sheppard sheppar...@aol.com:
>>
>>> For black I think I prune this kind of two stone suicide always no matter what the situation is (exception is ko). These prunings are probably wrong in some extremely rare cases.
>>
>> How can you tell the difference between this kind of two-stone self-atari and a self-atari of two stones within an opponent's big eye, which could be necessary for life/death?
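The three conditions above transcribe directly into a pruning predicate. This sketch assumes the surrounding engine already computes the move features (liberty counts, neighbour occupancy); only the combination logic is shown, and all names are invented:

```java
// Self-atari pruning rule from the text: prune a move that puts
// itself into atari, has four occupied neighbours (one of them its
// own color), and does not put the opponent in atari. The feature
// extraction is assumed to exist elsewhere in the engine.
class SelfAtariFilter {
    static boolean prune(boolean putsSelfInAtari,
                         int occupiedNeighbours,
                         boolean hasOwnColorNeighbour,
                         boolean putsOpponentInAtari) {
        return putsSelfInAtari
            && occupiedNeighbours == 4
            && hasOwnColorNeighbour
            && !putsOpponentInAtari;
    }
}
```

Note how the last condition deliberately keeps self-ataris that are also ataris on the opponent, which covers the nakade/big-eye sacrifices raised by Brian and Magnus.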
Re: [computer-go] Negative result on using MC as a predictor
I've also tried a variety of ways to use point-ownership in combination with RAVE. By no means was it an exhaustive study, but I failed to find an intuitive way to improve play this way. I didn't try enough to be able to come to hard conclusions, but at the very least it didn't turn out to be obvious.

Mark

On Fri, Jun 5, 2009 at 3:39 AM, Brian Sheppard sheppar...@aol.com wrote:

> In a paper published a while ago, Remi Coulom showed that 64 MC trials (i.e., just random, no tree) was a useful predictor of move quality. In particular, Remi counted how often each point ended up in possession of the side to move. He then measured the probability of being the best move as a function of the frequency of possession. Remi found that if the possession frequency was around 1/3 then the move was most likely to be best, with decreasing probabilities elsewhere.
>
> I have been trying to extract more information from each trial, since it seems to me that we are discarding useful information when we use only the result of a trial. So I tried to implement Remi's idea in a UCT program. This is very different from Remi's situation, in which the MC trials are done before the predictor is used in a tree search. Here, we will have a tree search going on concurrently with collecting data about point ownership.
>
> My implementation used the first N trials of each UCT node to collect point ownership information. After the first M trials, it would use that information to bias the RAVE statistics. That is, in the selectBest routine I had an expression like this:
>
>     for all moves {
>         // Get the observed RAVE values:
>         nRAVE = RAVETrials[move];
>         wRAVE = RAVEWins[move];
>
>         // Dynamically adjust according to point ownership:
>         if (trialCount < M) {
>             ; // Do nothing.
>         } else if (Ownership[move] < 0.125) {
>             nRAVE += ownershipTrialsParams[0];
>             wRAVE += ownershipWinsParams[0];
>         } else if (Ownership[move] < 0.250) {
>             nRAVE += ownershipTrialsParams[1];
>             wRAVE += ownershipWinsParams[1];
>         } else if (Ownership[move] < 0.375) {
>             nRAVE += ownershipTrialsParams[2];
>             wRAVE += ownershipWinsParams[2];
>         } else if (Ownership[move] < 0.500) {
>             nRAVE += ownershipTrialsParams[3];
>             wRAVE += ownershipWinsParams[3];
>         } else if (Ownership[move] < 0.625) {
>             nRAVE += ownershipTrialsParams[4];
>             wRAVE += ownershipWinsParams[4];
>         } else if (Ownership[move] < 0.750) {
>             nRAVE += ownershipTrialsParams[5];
>             wRAVE += ownershipWinsParams[5];
>         } else if (Ownership[move] < 0.875) {
>             nRAVE += ownershipTrialsParams[6];
>             wRAVE += ownershipWinsParams[6];
>         } else {
>             nRAVE += ownershipTrialsParams[7];
>             wRAVE += ownershipWinsParams[7];
>         }
>
>         // Now use nRAVE and wRAVE to order the moves for expansion
>     }
>
> The bottom line is that the result was negative. In the test period, Pebbles won 69% (724 out of 1039) of CGOS games when not using this feature, and less than 59% when using this feature. I tried a few parameter settings. Far from exhaustive, but mostly in line with Remi's paper. The best parameter settings showed 59% (110 out of 184, which is 2.4 standard deviations lower). But maybe you can learn from my mistakes and figure out how to make it work.
>
> I have no idea why this implementation doesn't work. Maybe RAVE does a good job already of determining where to play, so ownership information is redundant. Maybe different parameter settings would work. Maybe it's just overhead (but I doubt that; the overhead wouldn't account for such a significant drop).
>
> Anyway, if you try something like this, please let me know how it works out. Or if you have other ideas about how to extract more information from trials.
>
> Best,
> Brian
Re: [computer-go] Go + code + environment
On Wed, May 27, 2009 at 7:33 PM, David Fotland fotl...@smart-games.com wrote:

> GPL is not infectious through looking at source code, but I didn't want any appearance of wrongdoing. And I was put off a little by Stallman's rhetoric.
>
> David

I have mostly stayed away from GPL projects for the same reasons. Instead I preferred discussing things on the list here, occasionally asking how others do things instead of looking at source code that is under license. Looking is not infectious. But taking code and rewriting it is, even if there's little to no resemblance with the original. Where the line between the two lies is a very slippery slope and very hard to prove.

I don't know Stallman myself, but I have heard from several people who had beef with him. It's not something I'd want to get into. There are probably good cases where GPL is appropriate, but most of the time it has seemed a bit childish to me.

Mark
Re: [computer-go] Re: Implications of a CPU vs Memory trend on MCTS
I'm a bit late reading this thread, and the discussion seems to have veered from the original topic a bit.

As to the CPU vs. memory discussion, it's not so clear-cut to me that CPU speeds are improving faster than memory. Twenty years ago you had CPUs in the 4-8 MHz range and around 1 MB of memory. Today both are about 1000 times higher. CPU speed is not only represented by hertz of course, but both CPU and memory seem to have progressed at roughly the same rate. The thing is, they don't progress evenly. So maybe at the moment CPUs are gaining a bit faster than memory. But this could be temporary, not necessarily a sustainable trend. Also, the speed of a single processor has stalled for a few years now. Using multiple CPUs or cores you get easy doublings by going to two and four, but it also gets more and more expensive, and doubling the CPUs doesn't really double the power.

A bit over ten years ago we made a tsume-go program using PN search. There we also had problems keeping the tree in memory, so we designed a tree structure that would automatically swap part of the tree out to disk. But after that there was a period when this code seemed to have become superfluous, as memory suddenly became abundant. If we now have a memory shortage with respect to CPU power, it's quite possible this is something cyclical.

Mark
Re: [computer-go] Value of capture (atari?) heuristic.
On Sun, Apr 26, 2009 at 11:54 PM, Łukasz Lew lukasz@gmail.com wrote: Hi, Have any one of you tried uct + capture heuristic? Is it stronger? How much? From memory I'd say it was winning about 80% against plain UCT. But only capture if it can escape (which means it can make three liberties.) Mark ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Value of capture (atari?) heuristic.
I ran some tests, but I don't have access to the specifics right now (my computer is still in the box from moving; I only got internet at home yesterday).

Simply always capturing didn't gain any strength. Capturing with a low probability (like 5%-10%) depending on the number of stones seemed to gain a little bit, but it was never enough for me to be sure I wasn't looking at noise. Capturing / defending a neighbour of the last move gained a little bit, I believe. But nothing came close to using the ladder heuristic, which is: defend a stone in atari if it can escape. Capturing the last move gains little for me, as it is covered in the first case. (I.e. the ladder heuristic will escape a stone in atari by capturing the last move.)

David Fotland said he has a low probability on capture, but I don't think he ever gave specific numbers.

Mark

2009/4/27 Łukasz Lew lukasz@gmail.com:

> 2009/4/28 Mark Boon tesujisoftw...@gmail.com:
>> On Sun, Apr 26, 2009 at 11:54 PM, Łukasz Lew lukasz@gmail.com wrote:
>>> Hi, has any one of you tried uct + capture heuristic? Is it stronger? How much?
>>
>> From memory I'd say it was winning about 80% against plain UCT. But only capture if it can escape (which means it can make three liberties.)
>
> Thanks. Any additional rules in your experiment? For instance no capturing stones that can't escape by this definition. Do you know what happens if there's no such rule (always capture)?
>
> Lukasz
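The "capture with a low probability depending on the number of stones" policy being compared can be sketched as follows (the 5% per-stone figure is only the rough range mentioned above, not anyone's actual tuning):

```java
import java.util.Random;

// Playout-policy sketch: prefer a capture move only with a small
// probability that grows with the number of stones it would capture.
// The 5%-per-stone slope is an illustrative guess from the range
// discussed, not a tuned value.
class CaptureHeuristic {
    static boolean preferCapture(Random rng, int capturedStones) {
        double p = Math.min(1.0, 0.05 * capturedStones);  // capped at certainty
        return rng.nextDouble() < p;
    }
}
```

When this returns false, the playout falls through to its default (e.g. uniform random) move selection, so the heuristic only gently biases the playout distribution rather than forcing every capture.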
Re: [computer-go] Pseudo liberties: Detect 2 unique liberties?
On Mon, Apr 6, 2009 at 10:54 AM, Isaac Deutsch i...@gmx.ch wrote:

> This leads us to the question if groups on average have >= 10 or < 10 liberties... :)

Without actually having done any tests or measurements, I'd guess it's much less than 10. More like 4.

Mark
Re: RE: [computer-go] Pseudo liberties: Detect 2 unique liberties?
On Thu, Apr 2, 2009 at 5:14 AM, w...@swcp.com wrote:

> Isaac, I implemented about 6 ways to track liberties a couple of years ago and measured the running performance, and by far the best is also the simplest: keep an array of liberties for each chain, and do a simple linear search. This may seem slow, but it has a couple of real advantages. First, it works with the cache to maximize locality. Second, it is very simple.

This is what I do too. I never bothered with bit-arrays, but possibly that's simply because I fail to see how you can merge two chains and count liberties in O(1). You can find my implementation in the reference-bots I made. The file that counts liberties (and does a bit of other house-keeping) is here:

https://plug-and-go.dev.java.net/source/browse/plug-and-go/TesujiRefBot/source/tesuji/games/go/reference/monte_carlo/MCLibertyAdministration.java?rev=267&view=markup

Mark
Re: [computer-go] Pseudo liberties: Detect 2 unique liberties?
I can confirm: with a bit of optimization, counting real liberties is only marginally slower than counting pseudo-liberties. So there's really no benefit that I can see from using pseudo-liberties.

Mark

On Wed, Apr 1, 2009 at 8:49 AM, Álvaro Begué alvaro.be...@gmail.com wrote:

> When John Tromp and I were thinking about these things in 2007 we decided to switch to counting real liberties instead of pseudo-liberties. Someone (Rémi?) told us that in the end the performance difference wasn't very large, and we verified this.
>
> Álvaro.
>
> On Wed, Apr 1, 2009 at 2:08 PM, Isaac Deutsch i...@gmx.ch wrote:
>> Hi, I'm currently using the method described here to detect if a group is in atari (1 real liberty): http://computer-go.org/pipermail/computer-go/2007-November/012350.html
>>
>> Thus I store the number of pseudo libs, the sum, and the sum of squares for each group. Now for heavy playouts, it would be useful if I could somehow modify this so I can easily detect if a group can be put into atari (meaning it has exactly 2 real liberties). My intuition tells me it should be possible by also storing the sum of positions^3. However, I can't quite wrap my head around how to do the check. Has anyone looked into this before, and found an answer? I like this approach because it's so easy and fast.
>>
>> Regards, Isaac
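For reference, the pseudo-liberty atari test Isaac describes (count, sum, and sum of squares per chain) works like this sketch. A chain is in atari exactly when all its pseudo-liberties fall on the same point, which holds iff n*q == s*s (the equality case of Cauchy-Schwarz). Whether a positions^3 term extends this cleanly to detecting exactly two real liberties is the open question above; this sketch covers only the one-liberty case:

```java
// Pseudo-liberty bookkeeping per chain: count n, sum s, and sum of
// squares q of the (possibly repeated) liberty positions. All updates
// and merges are O(1). In atari iff every pseudo-liberty is the same
// point, i.e. n*q == s*s.
class PseudoLiberties {
    long n, s, q;

    void add(int point)    { n++; s += point; q += (long) point * point; }
    void remove(int point) { n--; s -= point; q -= (long) point * point; }

    void merge(PseudoLiberties other) {  // joining two chains is O(1)
        n += other.n; s += other.s; q += other.q;
    }

    boolean inAtari() {
        return n > 0 && n * q == s * s;
    }
}
```

If all pseudo-liberties sit on point v, then s = n*v and q = n*v*v, so n*q = n*n*v*v = s*s; any second distinct point breaks the equality.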
Re: [computer-go] MC and Japanese rules
2009/3/23 Yamato yamato...@yahoo.co.jp:

> Sorry for responding to the old topic.
>
> Mark Boon wrote:
>> Other than that, I'd take a different approach:
>> - play out as usual. Instead of counting stones + eyes on the board, you count eyes + prisoners + the number of the opponent's passes during the playout.
>> - don't count passes outside of the playout.
>> I think this avoids having to take a security margin or require passing as soon as the opponent does (although in practice that may happen almost all the time). The seki matter is the same.
>
> I think this scheme works for the playout itself, but I have a problem with the UCT tree. Look at the attached file. This position is a win for black under Japanese rules, but the only correct move is pass. If black plays anywhere other than pass, he loses. This time white's correct move is pass, otherwise he loses. Such a condition breaks the winning-rate values in the tree. How should I handle this?

It seems pretty straightforward to me. The UCT exploration needs to choose pass for Black, and he'll win. You say white's correct move is pass, otherwise he loses. That's not entirely correct: White loses no matter what he does, whether he passes or plays anything else.

Mark
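The territory-style playout scoring proposed in the quoted text boils down to a small formula; this sketch just makes the arithmetic explicit (the playout is assumed to tally eyes, prisoners, and opponent passes itself; all names are invented):

```java
// Japanese-rules playout scoring as described: instead of area
// counting (stones + eyes), each side scores eyes + prisoners +
// the number of passes the opponent made during the playout.
class JapaneseScore {
    static int score(int eyes, int prisoners, int opponentPasses) {
        return eyes + prisoners + opponentPasses;
    }

    // Black wins the playout when his score exceeds white's plus komi.
    static boolean blackWins(int blackScore, int whiteScore, double komi) {
        return blackScore - whiteScore - komi > 0;
    }
}
```

Crediting the opponent's in-playout passes is what replaces the security margin: a player who passes while the opponent keeps filling dame gives away exactly the points the area count would have awarded.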
Re: [computer-go] Re: static evaluators for tree search
On Tue, Feb 17, 2009 at 8:35 PM, George Dahl george.d...@gmail.com wrote:

> Really? You think that doing 20-50 uniform random playouts and estimating the win probability, when used as a leaf node evaluator in tree search, will outperform anything else that uses the same amount of time?

You'll probably find a variety of opinions on that. I think you can make a static evaluator that will give you a better estimate of the win probability than 50 uniform random playouts. But (and there are a few big 'buts') implementing uniform random playouts is a lot less work. On top of that, with some prudent exploration, you rarely spend 50 playouts all in one place. This is definitely something powerful in MCTS programs: they can reject unpromising lines of play at relatively little cost.

Mark
Re: [computer-go] Presentation of my personnal project : evolution of an artificial go player through random mutation and natural selection
Just curious, did you ever read 'On Intelligence' by Jeff Hawkins? After reading that I got rather sold on the idea that if you're ever going to attempt making a program with neural nets that behaves intelligently, then it needs to have a lot of feedback links, not just the standard feed-forward type of networks. Some other good ideas in that book too, IMO.

Mark

On Feb 13, 2009, at 5:42 PM, Ernest Galbrun wrote:

> Hello, I would like to share my project with you: I have developed a program trying to mimic evolution through the competition of artificial go players. The players are made of totally mutable artificial neural networks, and they compete against each other in a never-ending tournament, randomly mutating and reproducing when they are successful. I have also implemented a way to share innovation among every program.
>
> I am currently looking for additional volunteers (we are 4 at the moment) to try this out. If you are interested, please feel free to answer here, or directly email me. I have just created a blog whose purpose will be to explain how my program works and to tell how it is going. (As for now, it has been running consistently for about a month; the players are still rather passive, trying to play patterns assuring them the greatest territory possible.) Here is the url of my blog: http://goia-hephaestos.blogspot.com/
>
> Ernest Galbrun
Re: [computer-go] GPGPU
I don't know if that's what you're already looking at, but recently Apple announced their new version of OS X, called 'Snow Leopard', which supposedly focuses mostly on improvements in the use of multiple processors. And that includes the GPU. The module that binds it all together is called 'Grand Central'. I don't know much of the details; I just picked up some buzzwords from articles like this one: http://gizmodo.com/5017615/giz-explains-mac-os-106-snow-leopard-parallel-processing-and-gpu-computing

If you google around a bit you may easily find more information about it.

Mark
Re: [computer-go] How to properly implement RAVE?
I had a little spare time to look at it. It seems indeed I forgot to update the GoEngineGTP.jar file last time I made some changes. This was easy to fix even from here and I think it should work now. Just as a note, if you want to change the number of playouts to 50K, you need to change the minimumNrNodes from 4,000 to 50,000 in the ReferenceMonteCarloTreeSearch definition in the TesujiRefBot.xml. But you may also want to increase the 'nrSimulationsBeforeExpansion' value to something higher like 8 or even 32. It depends on how much memory you have available. You may want to set it to the same value you use for your own program to get a good comparison anyway. Use the MCTSRefBot to play, which means you'll be passing the following command to two-gtp: java -Xmx512M -jar GoEngineGTP.jar MCTSRefBot Adjust the -Xmx???M to whatever you think is suitable on your computer. I think if you set nrSimulationsBeforeExpansion=8 then 512M should be enough, but you may have to experiment a little to find the optimal settings. Mark On Feb 8, 2009, at 7:06 AM, Isaac Deutsch wrote: No hurry, I will be away for a week next week. :-) Isaac, I haven't looked at this stuff for a while. I'm not at home so I can't look at it now. From the error I understand that tesuji/games/general/MoveIterator is missing. It is there in the Subversion source-tree however. So I don't know what could be wrong. Maybe it's somehow missing in the GoEngineGTP.jar If you can wait until Wednesday I'll fix it for you then. Mark
Re: [computer-go] How to properly implement RAVE?
On Sun, Feb 8, 2009 at 3:02 PM, Isaac Deutsch i...@gmx.ch wrote: Hi, Can you explain what minimumNrNodes and nrSimulations do? In my program I play 50k games regardless of the number of nodes, so I would like to adjust this accordingly. minimumNrNodes is the number of games played out. Originally I used to always create a new node when a playout happened. Maybe this should be renamed to minimumNrPlayouts. nrSimulationsBeforeExpansion is the minimum number of visits that have to be made to a node before the tree gets expanded any deeper. As an example, when the search begins, the root-node is expanded with all possible legal moves. But those child nodes are not expanded themselves until a certain number of simulations (playouts) have been done starting from the root-node. Until that time the child nodes are only used to store the AMAF values. I think this may be slightly different from how most people do it, but I figured it would be a waste to throw away the AMAF values for the first 'n' simulations in a node. Otherwise, it works now. :) Good. Mark
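The delayed-expansion scheme described above (children accumulate visit and AMAF statistics, but the tree is only deepened after a minimum number of simulations through a node) can be sketched roughly as follows. This is an illustrative reconstruction, not the actual plug-and-go source; the class and the names `maybeExpand` and `expansionThreshold` are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch of delayed node expansion in MCTS. Only the expansion
// bookkeeping is shown; move selection and AMAF updates are omitted.
class Node {
    int visits = 0;
    boolean expanded = false;
    List<Node> children = new ArrayList<>();

    // Called once per simulation passing through this node. The node is
    // only expanded (one child per legal move) after it has been visited
    // expansionThreshold times, which bounds memory use.
    void maybeExpand(int expansionThreshold, int nrLegalMoves) {
        visits++;
        if (!expanded && visits >= expansionThreshold) {
            for (int i = 0; i < nrLegalMoves; i++) {
                children.add(new Node());
            }
            expanded = true;
        }
    }

    public static void main(String[] args) {
        Node root = new Node();
        for (int i = 0; i < 8; i++) root.maybeExpand(8, 5);
        System.out.println(root.expanded + " " + root.children.size());
    }
}
```

A higher threshold (8 or 32, as suggested in the post) trades a shallower tree for a smaller memory footprint.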
Re: [computer-go] How to properly implement RAVE?
Isaac, I haven't looked at this stuff for a while. I'm not at home so I can't look at it now. From the error I understand that tesuji/games/general/MoveIterator is missing. It is there in the Subversion source-tree however. So I don't know what could be wrong. Maybe it's somehow missing in the GoEngineGTP.jar If you can wait until Wednesday I'll fix it for you then. Mark On Sat, Feb 7, 2009 at 4:29 PM, Isaac Deutsch i...@gmx.ch wrote: I put all files in a folder like so: http://img21.imageshack.us/img21/5272/bild4ya1.png But I get various errors when I try to run, for example, GoEngineGTP.jar with java -jar ..., and I'm also not able to select the RefBot as player. I'm not sure what the others do, but one seems to connect to the 19x19 CGOS server. I don't have any experience with Java. Any ideas? Isaac The beginning of the error: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'TesujiRefBot' defined in file [/Users/ibd/Desktop/computer go/Rango/CGOS/TesujiRefBot.xml]: Cannot resolve reference to bean 'referenceLibertyAdministration' while setting bean property 'monteCarloAdministration'; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'referenceLibertyAdministration' defined in file [/Users/ibd/Desktop/computer go/Rango/CGOS/TesujiRefBot.xml]: Instantiation of bean failed; nested exception is java.lang.NoClassDefFoundError: tesuji/games/general/MoveIterator Original message Date: Sat, 7 Feb 2009 09:30:53 -0200 From: Mark Boon tesujisoftw...@gmail.com To: computer-go computer-go@computer-go.org Subject: Re: [computer-go] How to properly implement RAVE? You can get everything here: http://plug-and-go.dev.java.net The MCTS program description is under 'Derived Projects'. You don't really need the source-code.
You can just get the 'binaries scripts' and then copy the files 'TesujiRefBot.jar' and 'TesujiRefBot.xml' to the directory where you put the binaries and everything works automagically. It's all plug and go. You may want to change the number of nodes to read in the XML file to 50K (it's called minimumNrNodes). Of course if you prefer to get the source you are free to do so. I do remember making a significant fix a little while ago that I don't think I submitted yet. But I won't be able to commit that for a few more days as I'm not at home. But if you implemented RAVE correctly your bot should do at least as well as this one. Hopefully it's of some help. Mark On Sat, Feb 7, 2009 at 7:18 AM, Isaac Deutsch i...@gmx.ch wrote: I'm currently tied up but you can get my MCTS implementation, which includes RAVE, and set it up to play 50K playouts. It's only a matter of setting the right number in the configuration file. You can also use it to play through two-gtp, that way you can test an awful lot faster. Mark Hi Mark, This would be awesome. Can you send the source code to this eMail address? Thanks, ibd
Re: [computer-go] How to properly implement RAVE?
I'm currently tied up but you can get my MCTS implementation, which includes RAVE, and set it up to play 50K playouts. It's only a matter of setting the right number in the configuration file. You can also use it to play through two-gtp, that way you can test an awful lot faster. Mark On Fri, Feb 6, 2009 at 3:55 PM, Isaac Deutsch i...@gmx.ch wrote: The rating of the bot still seems to be drifting upwards, but I think I can conclude my UCT implementation is OK after all. Many thanks for the bots provided. Does someone have a bot that does 50k light playouts + RAVE? I would be most grateful if you could put them online for a few days of testing. :-) By the way, I've seen 2 games when checking my bot's status where one of the myCtest bots lost because of an illegal ko move. Maybe there's a bug in handling superko? Regards, Isaac
Re: [computer-go] MC and Japanese rules
I think as long as you don't count passes during exploration (or game-play) but only count passes during playout as points for the opponent, I don't see why you would need any adjustment. As to unsettled groups, that's what the second phase is for. Playout acts as the second phase in this case. Mark On Feb 4, 2009, at 3:59 PM, Rémi Coulom wrote: David Fotland wrote: I only pass in the playouts when the game is over. There is a possible one point adjustment depending on who passes first. So I can't see how you can avoid taking a one-point security margin with respect to komi. Who passes first in the playout is meaningless. A clever Japanese player would pass earlier. Rémi
Re: [computer-go] time measurement
On Feb 3, 2009, at 2:34 PM, Heikki Levanto wrote: All in all, I think this is a messy and unreliable solution to a problem I have not seen happening. For what it is worth I vote against client-side time controls. Maybe you haven't seen it. That doesn't mean it doesn't exist. I've lost games on KGS because it took too long for my opponent's move to arrive. It made me lose the game without even being given a split-second to respond. Client-side time-control would have prevented that. If the problem exists for human play it exists for computer play. I always thought that security-certificates, signed applications and public-key encryption were well equipped to tackle a problem like this. But I'm certainly no security expert. Mark
[computer-go] MC and Japanese rules
I think I've seen people post about playing with Japanese rules in relation to MC programs. Correct me if I'm wrong, but I think I saw people do some adjustment in that case. Does that mean they actually use Chinese scoring internally? Mark
[computer-go] stone-age and patterns
I haven't gotten very far yet in incorporating many of the suggestions published on this mailing-list into the MCTS ref-bot. As such I feel I still have a lot of catching up to do when it comes to MC programs, mostly due to lack of time. But I had an idea I wanted to share as I haven't seen anything like it described here. It stems from an observation I made when looking at playouts and what effects some of the patterns had on them. So far it's my opinion that guiding playouts is mostly useful in order to maintain certain features of the original position and prevent the random walk from stumbling into an unreasonable result. As an example I'm going to use the simple case of a stone in atari that cannot escape. When random play tries an escaping move, I make the program automatically play the capturing move to maintain the status of the stone(s) (now more than one) in atari. When implementing something like that in the playouts however, more often than not this 'pattern' arises not based on the original position but purely from the random play. I figured it doesn't help the program at all to try to maintain the captured status of a stone or stones that weren't even on the board at the start of the playout. So I tried a simple experiment: whenever a single stone is placed on the board I record the time (move-number really) it was created in an array I call stoneAge. When more stones are connected to the original they get the same age. When two chains merge I pick an arbitrary age of the two (I could have picked the smallest number, but it doesn't really matter). So for each chain of stones the array marks the earliest time of creation. Next, when a playout starts, I mark the starting time in a variable I call 'playoutStart' and there's a simple function: boolean isPrehistoric(int chain) { return stoneAge[chain] <= playoutStart; } During playout, I only apply the tactical reader to chains for which the isPrehistoric() function returns true.
Tests show that using this method doesn't affect the strength of the program at all, but the amount of time spent in the tactical reader is cut by more than half. I suspect the same holds true to a large degree for other patterns, but I haven't had the time yet to test that. Other cases may not provide as much gain because they are cheaper to compute. But I think in general it's better to let the random play run its course as much as possible and restrict moves guided by patterns as much as possible to situations relevant to the original position. The stone-age information is very cheap to maintain so it's hardly a burden to use. Hope this helps someone, especially those with slow tactical readers :) If anyone manages to use this successfully in situations other than tactical readers I'd be interested to hear it, as so far it's only a hunch that this has wider applicability than just tactics. I was going to wait to post this until I had time to try it out for myself, but lately I didn't have the time. Mark
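The stone-age bookkeeping described above can be sketched as follows. The names `stoneAge` and `isPrehistoric` follow the post; everything else (the class, chain ids as plain array indices, the merge rule) is an assumption made for the example, with capture handling and the rest of the board logic omitted.

```java
// Illustrative sketch of the stone-age heuristic: each chain records the
// move number at which it was created, and during a playout tactical
// reading is restricted to chains that already existed when the playout
// started ("prehistoric" chains).
class StoneAge {
    private final int[] stoneAge;  // indexed by chain id
    private int moveNumber = 0;
    private int playoutStart = 0;

    StoneAge(int maxChains) { stoneAge = new int[maxChains]; }

    // A new single stone starts a chain with the current move number.
    void playStone(int chain) { stoneAge[chain] = ++moveNumber; }

    // When two chains merge, either age works; the post picks arbitrarily.
    void mergeChains(int survivor, int absorbed) {
        stoneAge[survivor] = stoneAge[absorbed];
    }

    // Called at the start of each playout.
    void startPlayout() { playoutStart = moveNumber; }

    // Only chains created before the playout began get tactical reading.
    boolean isPrehistoric(int chain) { return stoneAge[chain] <= playoutStart; }
}
```

The check is a single array lookup per chain, which is why maintaining it is "hardly a burden" compared with the tactical reading it avoids.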
Re: [computer-go] How to properly implement RAVE?
On Feb 2, 2009, at 9:57 AM, Isaac Deutsch wrote: They are not pattern based playouts, but as I said uniformly random. I reckon the effect of RAVE is less with these? My MCTS implementation sees a gain of 200-400 ELO from RAVE using uniformly random moves: a 400 ELO gain (90% win-rate) at 2K playouts, becoming slowly less for larger numbers of playouts. Mark
Re: [computer-go] Rules for remote play at the Computer Olympiad
On Feb 1, 2009, at 11:29 AM, Erik van der Werf wrote: Something else for the discussion. I would like to have a rule about mandatorily displaying the thinking process of the program so that both operators have an idea of what is happening. Especially for remote play I think this is needed because now it is just too trivial to cheat. Do you want this just for 'remote' programs, or any program? What if the 'thinking process' is nothing intelligible for anyone else? Do we want to restrict programs to being made according to certain specifications, which include that the thinking process is understandable? I don't know what the situation currently is in computer-Go, but I don't think the stakes are high enough to go to the trouble of cheating through a remote program (it's quite a lot of work). I have been accused of cheating once, but it was a rare thing to happen. I think either you allow remote programs and trust them, or you don't allow them at all. Anywhere in the middle will only cause more trouble. Mark
Re: [computer-go] How to properly implement RAVE?
On Jan 21, 2009, at 10:23 AM, Magnus Persson wrote: Quoting Thomas Lavergne thomas.laver...@reveurs.org: - the best move is good only if played immediately and very bad if played later in the game: - the first playout for this move resulted in a loss. Its score and RAVE score will be very low and this move will not be considered again for a very long time. You raise an interesting concern. The simple solution to your question is to add an exploration term, using UCT for example. Then it becomes an empirical question what parameter for exploration gives the strongest play. My experience is that the best parameter is so small it can be set to zero. Well, empirically, when I set the exploration component to zero it starts to play a lot worse. Like I wrote: the winning percentage drops to 24% vs. the same program with the exploration component, which is a huge difference. So if you have a different experience, you must have something else that overcomes this hurdle that's not part of a simple MCTS-RAVE implementation. I'd be very interested to learn what that is. Sylvain didn't take the bait ;-) Mark
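The "exploration component" being debated here is typically the UCB1 bonus term added to a move's mean win rate during selection. The following is a generic sketch of that formula, not any particular program's code; the name `explorationConstant` is an assumption for the example. Setting the constant to zero, as Magnus suggests, removes exploration entirely, which is exactly the choice Mark found to cost a lot of strength.

```java
// Generic UCB1-style selection value: mean win rate plus an exploration
// bonus that grows for rarely-visited moves and shrinks as they are tried.
class Ucb {
    static double ucbValue(double wins, int visits, int parentVisits,
                           double explorationConstant) {
        if (visits == 0) return Double.MAX_VALUE; // unvisited moves go first
        double mean = wins / visits;
        double exploration = explorationConstant
                * Math.sqrt(Math.log(parentVisits) / visits);
        return mean + exploration;
    }
}
```

With `explorationConstant = 0` the value degenerates to the raw win rate, so a move whose first playout lost may never be revisited unless something else (such as RAVE statistics or heuristic priors) keeps it alive.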
Re: [computer-go] How to properly implement RAVE?
On Jan 21, 2009, at 11:53 AM, Olivier Teytaud wrote: Here we have a non-zero initialization of the number of wins, of the number of simulations, of the number of Rave-wins, and of the number of Rave-losses. We then have a 0 constant for exploration, but also an exploratory term which is very different, and for which I am not the main author - therefore I let the main author give an explanation if he wants to :-) I point out that even before this exploratory term, the best UCB-like exploration-constant was 0 - as soon as the initializations of numbers of wins, of losses, of Rave-wins, of Rave-losses are heuristic values. I'd like to make sure I understand what you mean exactly. You use some heuristics to initialize all the moves (or maybe some of the moves) with certain win-loss and rave-win-loss ratios? To a certain extent I suppose these could come from the reading of the previous move? I think I slowly start to make sense of things... Mark
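What Olivier appears to describe is seeding a new node's regular and RAVE statistics with heuristic "virtual" results instead of zeros, so that exploration emerges from the priors rather than from a UCB constant. A minimal sketch under that reading; the class, the names, and the numbers are all illustrative assumptions, not MoGo's actual values.

```java
// Sketch of heuristic initialization of MCTS/RAVE statistics: a new node
// starts with virtual wins and visits supplied by a heuristic (e.g. a
// pattern score), instead of 0/0. Real playout results are then added on
// top and gradually dominate the prior.
class RaveNode {
    double wins, visits;
    double raveWins, raveVisits;

    // heuristicValue in [0,1]; equivalentGames controls how strongly the
    // prior resists being washed out by real playout results.
    void initializeWithPrior(double heuristicValue, double equivalentGames) {
        visits = equivalentGames;
        wins = heuristicValue * equivalentGames;
        raveVisits = equivalentGames;
        raveWins = heuristicValue * equivalentGames;
    }
}
```

A move seeded with a poor heuristic value starts with a low win rate but a nonzero visit count, so it is deprioritized rather than permanently starved, which is one way the exploration constant can safely go to zero.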
Re: [computer-go] How to properly implement RAVE?
On Jan 18, 2009, at 4:11 PM, David Fotland wrote: I think it is too expensive to read ladders during playouts. I remember that you have faster ladder search code so it might not cost you as much. My playout code has no ability to undo a move or do any kind of lookahead. Yes, my ladder code is fast. But it has been public for many years; I had figured others would have caught up on that. My playout code also has no ability to undo. If I remember well, adding a few simple tactical searches during playouts made the performance go from 20Kps to 15Kps. But it moved to a winning-rate of about 90%. Reading tactics has to be really slow for that not to be worth it. I did find a nice heuristic to cut the amount of tactical reading necessary by more than half without any noticeable loss in playing strength. I'll write it down one day; I think it may have applicability beyond tactical reading and be a general concept to be used during playouts. Mark
Re: [computer-go] How to properly implement RAVE?
On Jan 18, 2009, at 5:38 PM, Magnus Persson wrote: In Valkyria I solved this by playing out the ladder on the playout board, and storing all changes on a stack. When the ladder undoes moves it just pops changes from the stack. In this way I can also use the rich board representation of Valkyria to set up the ladder fast. The drawback is that the code is ugly and slightly buggy when stones are sacrificed during the search. A todo is to write a more correct ladder reader that still uses the idea of sharing the board array with the playout, so that no copying has to be done. My ladder code re-uses the board array of the playout module. But copying this array before reading a ladder doesn't slow things down all that much. As has been discussed elsewhere on this list, copying an array is fast nowadays. Re-using the array is actually a way to make sure your ladder-reading is without bugs when doing undo. If something goes wrong you notice immediately... Maybe what David alluded to is that his playout has no undo so he can't use it to play and undo moves during ladder search. In my case the ladder reader only needs an array with the position. For the rest it's completely self-contained. That adds maybe a little bit when setting up the ladder reading, but it's still fast. Mark -Magnus Quoting David Fotland fotl...@smart-games.com: I think it is too expensive to read ladders during playouts. I remember that you have faster ladder search code so it might not cost you as much. My playout code has no ability to undo a move or do any kind of lookahead. David -- Magnus Persson Berlin, Germany
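The stack-based undo Magnus describes can be sketched as follows: ladder moves are played directly on the shared board array while every point change is pushed onto a stack, so undoing is just popping the recorded changes back. This is a generic illustration, not Valkyria's code; the board here is a bare color array, with captures and liberty bookkeeping omitted.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Minimal sketch of undo-by-change-stack for in-place ladder reading on a
// shared playout board. Each change records {point, previousColor} so any
// number of speculative moves can be rolled back exactly.
class UndoBoard {
    final int[] board;                                  // 0 = empty
    private final Deque<int[]> undoStack = new ArrayDeque<>();

    UndoBoard(int size) { board = new int[size]; }

    // Set a point, remembering its previous contents for undo.
    void setPoint(int point, int color) {
        undoStack.push(new int[]{point, board[point]});
        board[point] = color;
    }

    // Roll back the last n point changes, most recent first.
    void undo(int n) {
        for (int i = 0; i < n; i++) {
            int[] change = undoStack.pop();
            board[change[0]] = change[1];
        }
    }
}
```

The alternative Mark mentions, copying the whole board array before reading the ladder, avoids the stack entirely at the cost of one array copy per ladder, which is cheap on modern hardware.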
Re: [computer-go] How to properly implement RAVE?
Thanks for taking the time Sylvain. I took a quick look to see how your pseudo-code compared to the reference implementation I made. There are a few small differences, and I think the place(s) where I deviated is exactly what confused Daniel Waeber. The first difference is that when I check whether a point has been played, I always start from the position of the root-node, whereas you start from the position of the node where the moves_played_in_tree is played. The second difference is I also don't count a move if the opposite color played on that point first. The result is I only need to compute the alreadyPlayed map once (in my code these are called weightMap and colorMap, I need two maps because I use decreasing weights) instead of computing it at each step back up in the tree. The only time these 'deviations' make a difference is in case of ko and ishi-no-shita. When I have a little spare time I'll check to see if this actually makes a difference in playing strength. If there's any noticeable difference I'll try to modify the code in my reference implementation to reflect the 'correct' method. Mark On Jan 17, 2009, at 11:48 AM, Sylvain Gelly wrote: Hi, Sorry for the slow reply. After those discussions, I figured out that pseudo code was the fastest clear and not ambiguous way to describe how to precisely implement the algorithm. I needed to find some time to write it. Note that I did not write only the backup phase because to clearly describe the backup you have to see the playouts as well. I also describe the choice of the best move. Note also the fact that the backup from the moves in the tree and from the moves out of the tree is different. That is the somehow more tricky part to check that a move has not been already played (that is different for each node in the tree and we obviously don't want to keep a vector of already played moves for each node). 
Please forgive the mistakes that are in that code, of course I did not make any test, and it has been quite a long time I am not in the computer go trip anymore :). Please correct any mistake, I hope it makes things clearer. Best, Sylvain

  class AmafBoard {
    AmafBoard() {
      offset = 0
      fill(already_played, 0)
    }
    Clear() { offset++; }
    Play(move) { already_played[move] = offset }
    AlreadyPlayed(move) { return already_played[move] == offset }
    vector already_played
    int offset
  }

  ChooseMove(node, board) {
    bias = 0.015 // I put a random number here, to be tuned
    b = bias * bias / 0.25
    best_value = -1
    best_move = PASSMOVE
    for (move in board.allmoves) {
      c = node.child(move).counts
      w = node.child(move).wins
      rc = node.rave_counts[move]
      rw = node.rave_wins[move]
      coef = 1 - rc / (rc + c + rc * c * b)
      value = w / c * coef + rw / rc * (1 - coef)
      // please here take care of the c == 0 and rc == 0 cases
      if (value > best_value) {
        best_value = value
        best_move = move
      }
    }
    return best_move
  }

  PlayoutInTree(board, moves_played_in_tree, nodes_seen_in_tree) {
    node = tree.root
    while (node) {
      move = ChooseMove(node, board)
      moves_played_in_tree.append(move)
      nodes_seen_in_tree.append(node)
      board.PlayMove(move)
      node = node.child(move)
    }
  }

  PlayoutOutTree(board, AmafBoard played, moves) {
    int pass = 0
    while (pass < 2) {
      move = board.ChooseMoveAndPlay()
      if (move == PASSMOVE) {
        pass++
        continue
      } else {
        pass = 0
      }
      if (!played.AlreadyPlayed(move)) {
        if (!board.last_move_was_black()) {
          move = -move
        }
        moves.append(move)
      }
    }
    return board.black_wins()
  }

  BackupNode(node, index, black_wins, moves_played_in_tree, moves_played_out_tree, already_played) {
    already_played.Clear()
    win = node.is_black_to_play ? black_wins : !black_wins
    // normal backup
    node.wins += win
    node.counts++
    // rave backup
    for (j = index; j < moves_played_in_tree.size(); j += 2) {
      move = moves_played_in_tree[j]
      if (already_played.AlreadyPlayed(move)) continue
      already_played.Play(move)
      node.rave_wins[move] += win
      node.rave_counts[move]++
    }
    for (j = 0; j < moves_played_out_tree.size(); ++j) {
      move = moves_played_out_tree[j]
      if (!node.is_black_to_play) move = -move
      // If it is either not the right color or the intersection is already played
      // we ignore that move for that node
      if (move < 0 || already_played.AlreadyPlayed(move)) continue
      already_played.Play(move)
      node.rave_wins[move] += win
      node.rave_counts[move]++
    }
  }

  Backup(black_wins, nodes_seen_in_tree, moves_played_in_tree, moves_played_out_tree, already_played) {
    for (i = 0; i < nodes_seen_in_tree.size(); ++i) {
      node = nodes_seen_in_tree[i]
      BackupNode(node, i, black_wins, moves_played_in_tree, moves_played_out_tree, already_played)
    }
  }

  OneIteration(board_position, AmafBoard already_played) {
    board = board_position.Copy()
    already_played.Clear()
    vector moves_played_in_tree
    vector nodes_seen_in_tree
    PlayoutInTree(board,
Re: [computer-go] How to properly implement RAVE?
On Jan 17, 2009, at 5:41 PM, Sylvain Gelly wrote: For the first difference you mention, as far as I remember it makes a small but significant difference and is one of the main reasons I was talking about tricky details. OK, I ran a test and after 1,000 games with 2K semi-light playouts I get a winning percentage of 50.6% for your method vs. mine. Of course it's possible I made some mistake, but my first impression is that it makes no difference which way you do this particular detail. Your ChooseNode is also quite different from mine, mostly because I still have a UCT component in there. I'll give your method a go one day, just to see if it changes anything. I've come to understand what you mean by tricky details; sometimes I see a big difference in playing strength that I find hard to explain given the change(s) I made. Conversely, I've seen quite a few cases where I thought something would make a difference, only to find out it didn't matter one bit. It's also possible that some deficiencies that would be apparent in one implementation get compensated for in another. Some examples: David Fotland wrote that he does light playouts with just a few patterns but no tactics. I find that using a moderate amount of tactics is actually the biggest contributor to playing strength (save one or more stones if they can't be caught in a ladder). However, I found that augmenting patterns with tactical information doesn't help at all, even when disregarding the performance cost. Maybe David uses some patterns to compensate for part of the tactics and relies on the faster playouts to compensate for poorer playouts. I'm guessing here, but otherwise I can't imagine why he would forego what otherwise seems to be a big gain in strength. I also tried to use ownership maps to modify the RAVE value. Rémi Coulom wrote in a paper that he used ownership information of up to 63 playouts. When I tried something similar it always made play weaker.
Maybe I should use it in a different way, but I haven't stumbled on the solution yet. When I think of it, AMAF information is already something very similar to ownership information, so maybe combining the two doesn't make much sense. Lastly, in an earlier UCT bot that I made I gained a lot by initially reducing the number of moves and slowly expanding it. After adding AMAF it turns out this method hardly gains anything at all anymore. So the devil is not only in the details, it's also in the combination of the details. Mark
Re: [computer-go] How to properly implement RAVE?
On Jan 15, 2009, at 10:47 AM, Daniel Waeber wrote: Thank you. I think that I understand it now :) On 23:21 Wed 14 Jan, Mark Boon wrote: You have to understand that the 'start' variable really starts at the root from the position for which we do the search. So all the moves 'below' the playoutNode are also taken into account. The check in the earlier part if (_colorMap[moveXY]==0) makes sure there's no move between the playoutNode and the n.move as you call it. Ah, yes, I did not get the meaning of start right. But still I think you have to incrementally add values to the maps as you go down the tree. But it does that. This happens when playoutNode is set to its parent and the AMAF values are added again (now for the other color) until the root is reached. I think there's a problem with captures/ko-fights inside the tree. All nodes after the capture should get the amaf color/value for the stones put there after the capture and not the value of the stones that were put there and captured. Most likely I missed a little piece of code again, but hey, perhaps it's a real bug. I did stop to think about ko and 'ishi no shita' (something like 'playing under the stones') a little bit but couldn't come to an immediate conclusion what would be the best thing to do. So I decided to leave it as it is. You didn't miss any little piece of code there. Maybe there's room for improvement in case of ko, but my guess is the difference will be minimal at best. If you find it does make a big difference everyone here will be delighted to hear it. Given how it's performing I doubt there are major problems with my AMAF-RAVE implementation. Mark
Re: [computer-go] How to properly implement RAVE?
On Jan 15, 2009, at 12:33 PM, Daniel Waeber wrote: Yes, but the weight/color maps stay the same for all updated nodes. I think the first playoutNode (the one deepest inside the tree) should only get amaf values for the random playout, the next one from the random playout + from the first playoutNode... and the root node amaf values from all the nodes. OK, I think I see now what you're trying to say. This is something I did think about. I hope my memory serves me well enough to say why I didn't do it that way. - What you propose adds complexity and possibly computation (if it means recalculating or adjusting the weight map). - I don't think it makes all that much difference. The reason I don't think it makes much difference is that for the moves above (closer to the root than) the playoutNode, most likely those points are now occupied. Since the AMAF values are used to compute which empty points are the best next candidate, the AMAF values at occupied points are immaterial. They are not even added. So it only makes a difference in cases where played stones get captured and those points then are occupied again. This brings us back to the issue discussed earlier about ko and ishi-no-shita. Mark P.S. what do I need to open that file? Is it an SVN patch?
Re: [computer-go] How to properly implement RAVE?
On Jan 14, 2009, at 9:36 AM, Daniel Waeber wrote: I have a question about the part where the stats are updated (l.15-25). Having an array of amaf-values in every node seems very memory intensive and I don't get how you would access these values. You are right, it is memory intensive. I believe this is one of the reasons that most implementations wait a certain number of playouts before creating the next level of nodes. Accessing the AMAF values depends on your implementation. The following is a code-snippet from my MCTS reference implementation that updates the AMAF values in the tree:

  if (_useAMAF) {
      IntStack playoutMoves = _searchAdministration.getMoveStack();
      byte color = _monteCarloAdministration.getColorToMove();
      int start = _monteCarloAdministration.getMoveStack().getSize();
      int end = playoutMoves.getSize();
      double weight = 1.0;
      double weightDelta = 1.0 / (end - start + 1); // Michael Williams' idea to use decreasing weights
      GoArray.clear(_weightMap);
      GoArray.clear(_colorMap);
      for (int i = start; i < end; i++) {
          int moveXY = playoutMoves.get(i);
          if (_colorMap[moveXY] == 0) {
              _colorMap[moveXY] = color;
              _weightMap[moveXY] = weight;
          }
          weight -= weightDelta;
          color = opposite(color);
      }
      while (playoutNode != null) {
          color = opposite(playoutNode.getContent().getMove().getColor());
          boolean playerWins = (blackWins && color == BLACK) || (!blackWins && color == WHITE);
          double score = playerWins ? MonteCarloTreeSearchResult.MAX_SCORE : MonteCarloTreeSearchResult.MIN_SCORE;
          for (int i = 0; i < playoutNode.getChildCount(); i++) {
              TreeNode<MonteCarloTreeSearchResult<MoveType>> nextNode = playoutNode.getChildAt(i);
              MonteCarloTreeSearchResult<MoveType> result = nextNode.getContent();
              GoMove move = (GoMove) result.getMove();
              int xy = move.getXY();
              if (_colorMap[xy] == color)
                  result.increaseVirtualPlayouts(_weightMap[xy] * score, _weightMap[xy]);
          }
          playoutNode = playoutNode.getParent();
      }
  }

playoutNode is the move-node from which the playout was done.
The amaf values are stored in its children by the increaseVirtualPlayouts() method. Note that it goes up the tree by assigning the parent to playoutNode until it gets to the root. For more context it would be better to look up the whole source at http://plug-and-go.dev.java.net If you think some more comments in the code could clarify things better I'm open to suggestions. Good luck. Mark
Re: [computer-go] Re: GCP on ICGA Events 2009 in Pamplona
It's difficult to get hard data about this. Korea is the only country where Go is the most popular game. In other countries like Japan and China it comes a distant second to a local chess variant. Possibly Chess is more ingrained in Western culture than Go is in Asia, I don't know really. But Chess has the population numbers of West vs. East against it. If there are more chess players than Go players in the world, it won't be by much. But the Go market is probably a lot bigger. Look only at the money in professional Go tournaments; it's probably an order of magnitude more than the money in professional Chess. But I must admit this is just a guess of mine. Mark On Jan 12, 2009, at 9:22 AM, steve uurtamo wrote: i think you might be estimating this incorrectly. s. On Sat, Jan 10, 2009 at 9:00 AM, Gian-Carlo Pascutto g...@sjeng.org wrote: Ingo Althöfer wrote: What prevents you from freezing your chess activities for the next few months and hobbying full (free) time on computer go? The amount of chess players compared to the amount of go players. -- GCP
Re: [computer-go] Re: GCP on ICGA Events 2009 in Pamplona
On Jan 14, 2009, at 12:43 PM, Thomas Lavergne wrote: Counting xiangqi and shogi players as chess players is a bit unfair... Sorry if I caused confusion, I didn't mean to count those as Chess-players. I just stated that to show that despite large population numbers in, say, China, most of those people play Xiang-Qi rather than Wei-Qi. This in contrast to a large country like Russia, where I believe Chess is by far the most popular. In Holland however, Chess comes only in third place (or maybe even lower) after Bridge and Draughts. Mark
[computer-go] CGOS ELO questions
I'm not so knowledgeable about the ELO system and had a few questions about how it's used by the CGOS server. Firstly, on the CGOS server page there's an explanation of how it works, with a table of expected winning percentages vs. difference in ELO ratings: http://cgos.boardspace.net/ (towards the bottom) This table is rather different from another table I found on the GoBase page about the ELO rating system: http://gobase.org/studying/articles/elo/ Is there a reason why they are so different? Secondly, I have let my MCTS reference implementation (with a few modifications) play on CGOS with different numbers of playouts. What I'm noticing is that initially the rating changes rapidly, slowing down over time. This is explained on the CGOS page: it uses a K-factor that slowly goes down to 3.0 over 200-300 games. What I tend to see however is that the rating after some 200 games is very much determined by how well it did early on. After these 200-300 games I still see a drift towards its actual rating (I assume) for a very long time afterwards. Since one of the main purposes of CGOS is to let people verify the strength of their program against others, it's desirable to converge on the actual rating as fast as possible. What I'm seeing suggests that maybe the K-factor reaches 3.0 too fast. Is that a reasonable conclusion? I'd be interested to hear opinions about that. Mark
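As an aside for readers unfamiliar with the mechanics: an incremental scheme like the one described on the CGOS page boils down to a logistic expected-score formula plus a K-factor update. A minimal sketch in Java; the class and method names are mine, and only the final K value of 3.0 comes from the CGOS page:

```java
// Sketch of an incremental Elo update with the logistic expectation,
// as used (in some form) by rating servers like CGOS. Names are mine.
public class EloUpdate {
    // Expected score for a player who is 'diff' Elo points stronger.
    public static double expectedScore(double diff) {
        return 1.0 / (1.0 + Math.pow(10.0, -diff / 400.0));
    }

    // One rating update: result is 1.0 for a win, 0.0 for a loss.
    // A large K moves the rating quickly; once K has decayed to a
    // small value (CGOS quotes 3.0), convergence becomes very slow,
    // which matches the 'drift' described above.
    public static double update(double rating, double opponent, double result, double k) {
        return rating + k * (result - expectedScore(rating - opponent));
    }

    public static void main(String[] args) {
        double r = 1600.0;
        // A win against an equal opponent with K=32 gains 16 points...
        System.out.println(update(r, 1600.0, 1.0, 32.0)); // 1616.0
        // ...but with K=3 it gains only 1.5 points.
        System.out.println(update(r, 1600.0, 1.0, 3.0));  // 1601.5
    }
}
```

With K frozen at 3.0, undoing a 150-point error from the early games takes on the order of a hundred net wins, which is consistent with the slow drift described above.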
Re: [computer-go] CGOS ELO questions
On Jan 14, 2009, at 2:15 PM, Rémi Coulom wrote: Mark Boon wrote: I'm not so knowledgeable about the ELO system and had a few questions about how it's used by the CGOS server. Firstly, on the CGOS server page there's an explanation of how it works with a table of expected winning percentages vs. difference in ELO ratings: http://cgos.boardspace.net/ (towards the bottom) This table is rather different from another table I found on the GoBase page about the ELO rating system: http://gobase.org/studying/articles/elo/ Is there a reason why they are so different? CGOS probably uses the usual logistic formula: http://en.wikipedia.org/wiki/Elo_rating_system#Mathematical_details Arpad Elo, in his book, used the cumulative distribution of a Gaussian instead. It has a similar shape, but it is different. FIDE numbers are based on that older Gaussian formula. Thanks for that link, Rémi. That same page states however that FIDE also switched to the method of using the logistic distribution, but it doesn't say when that happened. Mark
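The gap between the two tables can be reproduced numerically. A sketch comparing the two curves, assuming Elo's original per-player standard deviation of 200 (so the difference of two performances has sigma 200*sqrt(2)); the erf approximation is Abramowitz and Stegun formula 7.1.26, and all names here are mine:

```java
// Compare the logistic and Gaussian expectation curves behind the two
// different tables. Class/method names are mine; the Gaussian variant
// assumes Elo's original per-player sigma of 200.
public class EloCurves {
    public static double logistic(double diff) {
        return 1.0 / (1.0 + Math.pow(10.0, -diff / 400.0));
    }

    // Abramowitz & Stegun formula 7.1.26, max error about 1.5e-7.
    public static double erf(double x) {
        double sign = x < 0 ? -1.0 : 1.0;
        x = Math.abs(x);
        double t = 1.0 / (1.0 + 0.3275911 * x);
        double y = 1.0 - (((((1.061405429 * t - 1.453152027) * t) + 1.421413741) * t
                - 0.284496736) * t + 0.254829592) * t * Math.exp(-x * x);
        return sign * y;
    }

    // Phi(diff / (200*sqrt(2))): cumulative normal of the rating
    // difference when each player's performance has sigma 200.
    public static double gaussian(double diff) {
        double x = diff / (200.0 * Math.sqrt(2.0)); // standardized difference
        return 0.5 * (1.0 + erf(x / Math.sqrt(2.0)));
    }

    public static void main(String[] args) {
        for (int d = 0; d <= 400; d += 100)
            System.out.printf("%4d  logistic %.3f  gaussian %.3f%n",
                    d, logistic(d), gaussian(d));
    }
}
```

At a 400-point gap the logistic curve gives about 0.909 where this Gaussian gives about 0.921, which is roughly the kind of discrepancy visible between the two tables.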
Re: [computer-go] CGOS ELO questions
On Jan 14, 2009, at 3:40 PM, Don Dailey wrote: However, the best thing to do is to ignore that page and go to the Bayes Rated link, which is updated every day. This is the total performance rating over all time of any player on CGOS. Everything is rated together, even if you have only played 1 or 2 games, but I do not display players with fewer than 20 games. I see, I wasn't aware of that page. Doesn't it make sense then to occasionally calibrate the ratings of the active programs with their rating on this list? There seems to be quite a difference between the rating of many programs on that list compared to what the CGOS server shows while playing. Often a difference of 100 points or more. I'm particularly surprised by the high rating of the MCTS-2K.v814 program after 349 games. Somehow an ELO of 1753 seems way too high for a program with just 2,000 semi-light playouts, compared to a rating of 1340 for the same program with 2,000 light playouts. Of course I introduced a few improvements, but my own tests didn't indicate such a big difference. Conversely, my tests show a 90% win-rate for the 32K version against the 2K version. But the 32K version is only rated 2014 after some 600 games. According to your table I'd expect a 400 point difference between the two. Is it possible that after 350 games a program's 'bayeselo' rating is off by 100-150 ELO points? Isn't that extremely unlikely, bordering on the impossible? Mark
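To put a number on the 90% win-rate observation: inverting the logistic formula converts a measured win rate directly into an implied rating gap. A small sketch (class and method names are mine):

```java
// Invert the logistic Elo expectation: given a win rate against an
// opponent, compute the implied rating difference. Names are mine.
public class EloGap {
    public static double eloDiff(double winRate) {
        return 400.0 * Math.log10(winRate / (1.0 - winRate));
    }

    public static void main(String[] args) {
        // A 90% win rate implies a gap of about 382 Elo points.
        System.out.println(eloDiff(0.90));
    }
}
```

eloDiff(0.90) comes out near 382, which is where the "I'd expect a 400 point difference" figure comes from.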
Re: [computer-go] Re: Hardware limits
On Jan 14, 2009, at 8:39 PM, David Doshay wrote: Saving energy is a fine thing. Let's leave that to the various hardware engineers in the semiconductor industry. Or, if you think this is such a grand idea then you should offer up the prize money, and then we can all see who comes to compete for it. Actually, this is one of IBM's sales-pitches: that their big computers use less energy per transaction than most other setups, whether they're super-computers or clusters. Maybe we can ask them to sponsor an event :-). I think we need to encourage a wide variety of approaches to make progress. Severe restrictions on the hardware to be used don't fit with that. But MCTS programs are known to scale well, so it's also not desirable to have a single-CPU computer compete with a 3,000-CPU cluster and call it even. So personally I'm still of the opinion that you can roughly divide competitions between 'all you can carry' and 'as large and powerful as you can arrange'. That will suffice for now. I'm sure computer-Go has quite a way to go before it's in a situation similar to computer-chess. Mark
Re: [computer-go] How to properly implement RAVE?
On Jan 14, 2009, at 10:55 PM, Daniel Waeber wrote: Accessing the AMAF values depends on your implementation. The following is a code-snippet from my MCTS reference implementation that updates the AMAF values in the tree:

if (_useAMAF)
{
    IntStack playoutMoves = _searchAdministration.getMoveStack();
    byte color = _monteCarloAdministration.getColorToMove();
    int start = _monteCarloAdministration.getMoveStack().getSize();
    int end = playoutMoves.getSize();
    double weight = 1.0;
    double weightDelta = 1.0 / (end - start + 1); // Michael Williams' idea to use decreasing weights
    GoArray.clear(_weightMap);
    GoArray.clear(_colorMap);
    for (int i=start; i<end; i++)
    {
        int moveXY = playoutMoves.get(i);
        if (_colorMap[moveXY]==0)
        {
            _colorMap[moveXY] = color;
            _weightMap[moveXY] = weight;
        }
        weight -= weightDelta;
        color = opposite(color);
    }

until here it's clear to me. OK, I hope so. Until here it's pretty much the same as in the original AMAF ref-bot without tree-search as defined by Don.

    while (playoutNode!=null)
    {
        color = opposite(playoutNode.getContent().getMove().getColor());
        boolean playerWins = (blackWins && color==BLACK) || (!blackWins && color==WHITE);
        double score = playerWins ? MonteCarloTreeSearchResult.MAX_SCORE : MonteCarloTreeSearchResult.MIN_SCORE;
        for (int i=0; i<playoutNode.getChildCount(); i++)
        {
            TreeNode<MonteCarloTreeSearchResult<MoveType>> nextNode = playoutNode.getChildAt(i);
            MonteCarloTreeSearchResult<MoveType> result = nextNode.getContent();
            GoMove move = (GoMove) result.getMove();
            int xy = move.getXY();
            if (_colorMap[xy]==color)
                result.increaseVirtualPlayouts(_weightMap[xy]*score, _weightMap[xy]);

If I understand this code correctly, it updates the amaf values of all direct children of the playoutNode according to the weight/color maps. Yes. And that update is done for all nodes on the selected path. Yes, until the root, which is the starting position from which the search is performed.
} playoutNode = playoutNode.getParent(); First of all, I miss a weight/colorMap update for xy here. Shouldn't the move of the current playoutNode be considered as an amaf move for all the nodes below this one? This is in fact done in this code, see further remarks below. } } But, most of all, I miss that the code only updates the amaf values of all direct children, and not of all nodes n below the playoutNode where there is no play at n.move on the path between n and the playoutNode. Finding all these nodes n would be a costly thing to do, but wouldn't that be the right thing to do? Implementing a realistic subset of RAVE is another story, but first of all I want to understand the pure concept of RAVE. You have to understand that the 'start' variable really starts at the root, the position for which we do the search. So all the moves 'below' the playoutNode are also taken into account. The check in the earlier part, if (_colorMap[moveXY]==0), makes sure there's no move between the playoutNode and the n.move, as you call it. That is, if I understand you correctly, because I'm not quite sure what you mean by 'finding all these nodes n'. This is not a subset of RAVE as I understand it. What you see in the code is the accumulation of the AMAF values after each playout. The RAVE value is calculated using these values when comparing move-candidates, which happens in an altogether different place (the MonteCarloTreeSearchResult class in my project). I'm afraid I haven't made things much clearer for you. Mark
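For completeness, since the question is really about how the accumulated AMAF statistics get used: when comparing move-candidates, a RAVE-style program blends the AMAF (virtual) mean with the real playout mean, with a weight that decays as real visits accumulate. The beta schedule below, sqrt(k/(3n+k)), is one published choice from David Silver's work, not necessarily what the implementation discussed here uses; all names are mine:

```java
// Sketch of how accumulated AMAF ("virtual") statistics are blended
// with real playout statistics when comparing move candidates.
// The beta schedule sqrt(k / (3n + k)) is one published choice
// (Silver's RAVE formula); the equivalence parameter k is a tuning knob.
public class RaveBlend {
    public static double raveValue(double mean, int visits,
                                   double amafMean, double k) {
        if (visits == 0)
            return amafMean;              // no real data yet: trust AMAF fully
        double beta = Math.sqrt(k / (3.0 * visits + k));
        return beta * amafMean + (1.0 - beta) * mean;
    }

    public static void main(String[] args) {
        // With few visits the AMAF mean dominates; with many it fades out.
        System.out.println(raveValue(0.40, 10, 0.70, 1000.0));
        System.out.println(raveValue(0.40, 100000, 0.70, 1000.0));
    }
}
```

The design point is exactly what the email describes: the tree only ever accumulates AMAF counts during backup; the blend is computed lazily at selection time, in a separate place.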
Re: [computer-go] Re: Hardware limits
On Jan 10, 2009, at 8:16 AM, Gian-Carlo Pascutto wrote: Dave Dyer wrote: I think general hardware limits are good, because they will permit more teams to be competitive without altering the nature of the competition. So in effect, it's an admission that the strength of some teams should be crippled in a completely arbitrary way, because they are too good for the others. It's nice that someone admits this in writing. Please, don't sneer. Different people have different ideas about things. If you don't agree with them, try to make them see your point of view by way of arguments. Don't get personal; it won't help you in any way. Rather the contrary: people will become less inclined to listen to you. We are trying to make computers play Go as well as possible. That inevitably has both a hardware and a software side to it, so it seems arbitrary to put limitations on the hardware. However, if two programs are essentially the same, but one side manages to bring a more powerful computer than the other, is it fair to award one program a prize and not the other? This is not an easy matter. Taking an extreme standpoint one way or the other is going to be difficult to maintain. For now I tend to be of the opinion that in competitions, one should be able to bring one's own hardware or run on standard hardware provided by the organizers. The restriction that the hardware be physically present allows for enough flexibility that people or teams can try different set-ups (like a row of PS3s) while avoiding having people with access to a big cluster compete with people who only have access to a PC. But similarly to the competition for building the most powerful computer in the world, I can see room for a competition between big clusters that play Go as well. One doesn't have to be to the exclusion of the other. Think of car-racing.
You have drag-racing, where they use rockets to cross half a mile as fast as possible, and you have F1 racing, where the 'hardware' is constrained within certain limits. Mark
[computer-go] Strange Ko
The attached game played on CGOS was awarded a win for white due to an illegal move. Black tried to play J3, which according to the game-record is a ko. Nothing could be further from the truth... Rather shocking really. What happened here? Mark 684276.sgf Description: Binary data
Re: [computer-go] RefBot (thought-) experiments
On Tue, Dec 16, 2008 at 12:20 PM, Jason House jason.james.ho...@gmail.com wrote: When thinking about the apparent strength loss, I came up with a potential theory: consistency. With more simulations, noise has less of an impact. I'm going to guess that the known bias of AMAF leads to a blunder that is played more consistently. Bots with fewer simulations would make the blunder too, but also pick sub-optimal moves due to evaluation noise. This is something I noticed while watching a few games on CGOS. The higher the number of playouts, the more often it plays the first moves exactly the same. That may lead to skewed results against an individual opponent. For example, if it always plays the same losing sequence, the loss ratio against that opponent becomes larger than normal. This gets averaged out with a large number of opponents, but CGOS has just a few participants. Mark
Re: [computer-go] RefBot (thought-) experiments
By the way, what does scratch100k.sh look like?
Re: [computer-go] RefBot (thought-) experiments
On Tue, Dec 16, 2008 at 11:35 PM, Weston Markham weston.mark...@gmail.com wrote: On Wed, Dec 17, 2008 at 1:32 AM, Mark Boon tesujisoftw...@gmail.com wrote: By the way, what does scratch100k.sh look like? ../gogui-1.1.3/bin/gogui-twogtp -auto -black java -jar jrefgo.jar 10 -games 1 -komi 0.5 -maxmoves 10 -referee gnugo --mode gtp --score aftermath --chinese-rules --positional-superko -sgffile games/jr100k-v-mogo-10m -size 9 -white `cygpath -w /home/Experience/projects/go/MoGo_release3/mogo` Thanks. I just realized that you set the komi to 0.5. That doesn't sound like a good idea. I wanted to make sure you had the same for the 100k version. Were your earlier experiments also with 0.5 komi? MC programs are highly sensitive to komi, so I'd use something more reasonable, like 6.5 or 7.5. Mark
Re: [computer-go] UEC cup
On Tue, Dec 16, 2008 at 8:52 PM, Michael Goetze mgoe...@mgoetze.net wrote: I wish people would stop spreading such incorrect information. The correlation between professional ranks and playing strength is quite bad, and EGF 7dans are not, generally speaking, professional strength. I'm not claiming to be an authority on the matter, but I beg to differ. Name me an EGF 7-dan who's not professional level. And then explain how come they are listed among players that are anywhere from 1p to 5p in different Asian countries. I used to be an EGF 6-dan and have beaten top 9p players with 3 stones on occasion. For a while I had a Japanese 2p teacher but stopped taking lessons when I started to beat him on black pretty consistently. That was when I was still 5-dan. So I don't think it's so far off to say 7-dan amateur is pro level. The main problem is that ranks of different countries differ considerably, even for professionals. I also think as an amateur my chances would have been much lower had there been anything at stake for the pro. But it's little use to quibble about it. If CrazyStone is 1-dan or more, it will become clear sooner or later. It's just a matter of time and enough games. Mark
Re: [computer-go] UEC cup
My understanding of the PlayStation is that it's a Cell architecture, with one main CPU and six auxiliary processing units with limited capability. Of course you don't need much for something to do MC playouts, so it seems a very suitable architecture. So 8 PS3s give a total of 56 CPUs. Plus the four of the desktop, that would make 60. I have mixed feelings about this piling up of hardware. On the one hand it's exciting: complex parallel processing to improve the level of play is very interesting. On the other hand, I hope attention doesn't only go towards putting more computing power together. Mark On 15-dec-08, at 08:23, Darren Cook wrote: Advertisement: Fudo Go used a desktop pc (Intel Q9550) and _eight_ Playstation 3 consoles on a private Gigabit Ethernet LAN. Hello Kato-sensei, Are you able to use all 8 cores of the playstation? So, with the 4 of the Q9550, 68 cores altogether? Do you, or your students, have any papers on the hardware challenges/solutions? Darren -- Darren Cook, Software Researcher/Developer http://dcook.org/mlsn/ (English-Japanese-German-Chinese-Arabic open source dictionary/semantic network) http://dcook.org/work/ (About me and my work) http://dcook.org/blogs.html (My blogs and articles)
Re: [computer-go] UEC cup
On Mon, Dec 15, 2008 at 12:37 PM, Don Dailey drdai...@cox.net wrote: So I think we have to embrace the fact that hardware is a part of these kinds of advancements. In fact I have always believed this anyway; the whole idea behind computing is to perform simple and stupid operations very very quickly. It's easy to forget that everything about computing and what is possible is tied to the power of the hardware. There is another school of thought that I somewhat subscribe to and I think you are alluding to, that we have been spoiled by the power and do not look for the most efficient way to do things. That is another matter, one that I agree with. But that was not what I was alluding to. If anything, the fact that MC programs are highly scalable puts much more focus on efficient algorithms than has been the case in prior years. As with any IT problem, computer-Go is about both hardware and software. I have no problem with that; it's the nature of computers and software. What I was alluding to was that I hope the software doesn't take a back-seat to the increase in hardware. I don't think it's nearly as interesting if it becomes a competition of who can bring the biggest piece of iron or who can arrange the biggest sponsor to pay for hardware (which is a bit what happened with Deep Blue). In the meantime, I think the advances that MC programs have brought are great. And so is the attention the matches get when playing against a pro with a super-computer. Mark
Re: [computer-go] RefBot (thought-) experiments
Weston, Although those results sound intriguing, it also looks like a convoluted experiment. I wouldn't call GNU Go an expert judge, although it is an impartial one. The fact that it says the 5K ref-bot is ahead after 10 moves 46% of the time alone makes it suspect in my eyes. But it is curious that it consistently shows a much lower percentage for the bot with more playouts. It would have been much more persuasive if you had simply run a 5K-playout bot against a 100K bot and seen which wins more. It shouldn't take much more than a day to gather a significant number of games; twogtp is perfect for this. Or connect both to CGOS and see which ends up with a higher rating, but in that case it will take a week or more before you get conclusive data, unless the difference is really clear. I did in fact put up a 100K+ ref-bot on CGOS for a little while, and it ended up with a rating slightly (possibly insignificantly) higher than the 2K ref-bot. Maybe I didn't leave it there long enough, certainly not for thousands of games. But it didn't look anywhere near to supporting your findings. I say 100K+ because I didn't set it to a specific number, just to run as many as it could within the time allowed. Generally it would reach well over 100K per move, probably more like 250K-500K. That should only make things worse according to your hypothesis. So although I think the result of your experiment is very curious, I think it might be a bit hasty to draw your conclusion. Mark On Mon, Dec 15, 2008 at 8:30 PM, Weston Markham weston.mark...@gmail.com wrote: Hi. This is a continuation of a month-old conversation about the possibility that the quality of AMAF Monte Carlo can degrade as the number of simulations increases: Me: running 10k playouts can be significantly worse than running 5k playouts.
On Tue, Nov 18, 2008 at 2:27 PM, Don Dailey drdai...@cox.net wrote: On Tue, 2008-11-18 at 14:17 -0500, Weston Markham wrote: On Tue, Nov 18, 2008 at 12:02 PM, Michael Williams michaelwilliam...@gmail.com wrote: It doesn't make any sense to me from a theoretical perspective. Do you have empirical evidence? I used to have data on this, from a program that I think was very nearly identical to Don's reference spec. When I get a chance, I'll try to reproduce it. Unless the difference is large, you will have to run thousands of games to back this up. - Don I am comparing the behavior of the AMAF reference bot with 5000 playouts against the behavior with 100k playouts, and I am only considering the first ten moves (five from each player) of the (9x9) games. I downloaded a copy of Don's reference bot, as well as a copy of Mogo, which is used as an opponent for each of the two settings. gnugo version 3.7.11 is also used, in order to judge which side won (jrefgo or mogo) after each individual match. gnugo was used because it is simple to set it up for this sort of thing via command-line options, and it seems plausible that it should give a somewhat realistic assessment of the situation. jrefgo always plays black, and Mogo plays white. Komi is set to 0.5, so that jrefgo has a reasonable number of winning lines available to it, although the general superiority of Mogo means that egregiously bad individual moves will be punished. In the games played, Mogo would occasionally crash. (This was run under Windows Vista; perhaps there is some incompatibility of the binary I downloaded.) I have discarded these games (about 1 out of 50, I think) from the statistics gathered. As far as I know, there would be no reason to think that this would skew the comparison between 5k playouts and 100k playouts. Other than occasional crashes, the behavior of Mogo seemed reasonable in other games that I observed.
I have no reason to think that it was not playing at a relatively high level in the retained results. Out of 3637 matches using 5k playouts, jrefgo won (i.e., was ahead after 10 moves, as estimated by gnugo) 1688 of them. (46.4%) Out of 2949 matches using 100k playouts, jrefgo won 785. (26.6%) It appears clear to me that increasing the number of playouts from 5k to 100k certainly degrades the performance of jrefgo. Below, I am including the commands that I used to run the tests and tally the results. Weston

$ cat scratch5k.sh
../gogui-1.1.3/bin/gogui-twogtp -auto -black "\"C:\Program Files\Java\jdk1.6.0_06\bin\java.exe\" -jar jrefgo.jar 5000" -games 1 -komi 0.5 -maxmoves 10 -referee "gnugo --mode gtp --score aftermath --chinese-rules --positional-superko" -sgffile games/jr5k-v-mogo -size 9 -white C:\cygwin\home\Experience\projects\go\MoGo_release3\mogo.exe

$ grep B+ games/jr5k-v-mogo.dat | grep -v unexp | wc -l
1688
$ grep W+ games/jr5k-v-mogo.dat | grep -v unexp | wc -l
1949
$ grep B+ games/jr100k-v-mogo.dat | grep -v unexp | wc -l
785
$ grep W+ games/jr100k-v-mogo.dat | grep -v unexp | wc -l
2164
Re: [computer-go] MC Opening stage
People sometimes play all kinds of silly things. It's not necessary to anticipate it all, just make sure you can keep playing reasonable moves when the other side plays strangely. There's little problem with continuing to play in the corners and sides when your opponent decides to do something else. Mark On Wed, Dec 10, 2008 at 7:02 PM, Heikki Levanto [EMAIL PROTECTED] wrote: On Wed, Dec 10, 2008 at 09:57:18PM +0100, [EMAIL PROTECTED] wrote: And 6-7 every now and then (humans imitating MC bots?). Well, didn't Bruce Wilcox recommend an opening that built a line across the board, starting at 4-10 (middle of the side). Was supposed to be effective against people who didn't expect it, and not too bad even against those who knew it... Not something I would dare to play in a serious tournament, but then again, I don't play serious tournaments these days. -H -- Heikki Levanto In Murphy We Turst heikki (at) lsd (dot) dk
Re: [computer-go] latest and greatest ref bots?
Well, I don't know about 'latest' or 'greatest', but I've posted a Java implementation of Don's reference definition a while ago. And my recent effort to come to a MCTS reference implementation is there as well. http://plug-and-go.dev.java.net I don't think the site has seen much traffic yet. Mark On 8-dec-08, at 19:41, terry mcintyre wrote: What's the status of the greatest-and-latest reference bots? Are the sources available anywhere? Terry McIntyre [EMAIL PROTECTED] -- Libertarians Do It With Consent!
Re: [computer-go] Monte-Carlo Tree Search reference bot
On 3-dec-08, at 06:09, Sylvain Gelly wrote: What I did (it was a long time ago, I don't know if it is still used in Mogo) is to compute the m best moves every so often and most of the time just do the max over those m moves. m was on the order of 5, and 'every so often' was an increasing function like sqrt or log (I don't remember). That speeds things up quite a lot (when the max becomes the bottleneck), and if you tune the parameters well you almost don't lose any strength at a fixed number of playouts. I thought about that, but I was afraid the code would become too obscure. After all, this is supposed to be a reference implementation. But maybe I should actually give it a try to see what it would look like. I also played with precomputing a table of the log() function but found little speedup. Instead I decided to (p)recompute the log(nr-parent-visits) only once in 8 times. According to my profiler that reduced the log() from using 8% to 1% of computation time. Another thing I did was replace log(nr-virtual-parent-visits) in the RAVE calculation by the same log(nr-parent-visits) used in the UCT calculation, cutting the number of times I need to call log in half and reducing the amount of storage in the node a little. After these modifications I didn't notice any degradation in play. In terms of memory use, I did a few obvious things to keep storage of the nodes to a minimum. I had seen people write about memory usage of the tree, but never understood the concerns. But that was because I always expanded the tree one node at a time. Of course, if you create all the children in one go like this reference implementation does, it's a little bit of a different story. But still, I have to push my computer hard to run out of space before I run out of time, so I didn't see much need to reduce memory too aggressively. Using just a moderately small number of playouts before expansion is already enough to never have memory problems.
But if memory is a concern to anyone, I see two obvious ways to make significant reductions. One is to flatten the data-structure, reducing the object overhead Java imposes. That could save up to a third. The other is to use smaller placeholders for nodes that have been created but not yet visited. Once a node gets visited it can be replaced by a full-blown node, but until then all you need is the move-coordinate and the AMAF statistics. That should save up to 75% or so. Mark
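The 'recompute log(nr-parent-visits) only once in 8 times' idea mentioned above can be sketched as follows; the field and method names are mine, not those of the actual project:

```java
// Sketch of amortizing the Math.log() call in the UCT exploration term:
// the log of the parent's visit count is refreshed only on every 8th
// visit, trading a tiny amount of accuracy for far fewer transcendental
// calls. Names are mine, not those of the referenced project.
public class UctNodeSketch {
    private int visits;
    private double logVisits;   // stale by at most 7 visits

    public void registerVisit() {
        visits++;
        if ((visits & 7) == 1)          // refresh on visits 1, 9, 17, ...
            logVisits = Math.log(visits);
    }

    // UCT exploration bonus for a child, using the (slightly stale) log.
    public double uctTerm(int childVisits, double explorationFactor) {
        return explorationFactor * Math.sqrt(logVisits / childVisits);
    }

    public int getVisits() { return visits; }
    public double getLogVisits() { return logVisits; }
}
```

Since log grows so slowly, a value stale by at most 7 visits barely perturbs the exploration term, which is consistent with the report of no degradation in play.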
Re: [computer-go] Monte-Carlo Tree Search reference bot
On 3-dec-08, at 10:31, Heikki Levanto wrote: Having said that, I can't help responding to one detail: I had seen people write about memory usage of the tree, but never understood the concerns. One thing to remember is that more memory use means more cache misses, and more access to the main memory. On modern computers, those can cost as much as executing a thousand instructions! So memory optimizing can often pay off, also with respect to time! Of course, Java does a lot of memory management behind the scenes, so optimizing such can be harder... But when writing in C or C++, it does make sense. I've seen this stated often. But color me skeptical. If what you say is true, then the MC-AMAF bot, which hardly uses any memory at all and which fits in the secondary cache entirely, code and data combined, should show an enormous speed-up in terms of number of playouts per second. Instead it does only about twice the number of playouts per second as the tree-search version, which can be almost if not entirely explained by the extra work that needs to be done to traverse and maintain the tree. If this is because I use Java, then Don's concise C implementation of the MC-AMAF bot should be a lot faster than my bloated Java version. Which, ahem, is not the case at all. I think a well-crafted C program can be up to twice as fast as a similar Java program. But I doubt that has much to do with cache misses. I think in computer-Go, trying to control cache misses is futile no matter the language. Maybe you can do it for a relatively trivial, well-understood and stable sub-part. But then you risk losing the gain again as soon as it gets incorporated into a larger whole. So your efforts go to waste. So this is in the show me category for me. You can use C or C++ if you like, but show me a tree-search bot that speeds up the number of playouts significantly by reducing the node-size. Maybe it's possible, but I don't believe it until I see it. 
This could be different for Chess programs, at least I've seen it claimed often. But I don't envy them, it must be excruciatingly frustrating to deal with. Mark ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Mogo MCTS is not UCT ? (GENERALIZED_AMAF)
At first blush, it sounds like a reasonable idea. However, as always with these things, the proof of the pudding is in the eating. So you'd have to try it out against a ref-bot to see if it improves things. You also need to keep an eye on time used, as it sounds a lot more CPU intensive than plain AMAF.

As to the quote below: what is meant by the 'very beginning'? If that's the very beginning of the game, then I suppose an opening book can be used for the initial statistics. If it means the beginning of the search process for a move, then I suppose you can start by using the data generated by the previous search. Most likely it's very rare that your opponent plays a move that has not been investigated by your previous search at all.

Another thing you can do is use the opponent's time to do a little preparation. I haven't spent time looking at 'pondering' yet, but if it were me I'd start by building a two-ply tree of AMAF values. Say you expand the tree after N playouts. You do N simulations at the current level. Then you choose the best node and do N simulations there. Then at the second-best node, etc. After N*m (m = number of legal moves) simulations, you have the initial AMAF data for every possible move your opponent can play. Even with N fairly large you're likely to be able to finish this process before your opponent plays his move. Of course the time saved is very little, as will always be the case with pondering. The 'weakness' as seen in the article is a very small one.

Mark

On 2-dec-08, at 06:48, Denis fidaali wrote: In the mentioned article it is said that: "A weakness remains, due to the initial time: at the very beginning, neither AMAF values nor standard statistics are available." I have developed a model that aims at healing that part. It gives virtual simulations out of an arbitrary ancestor at a rate of 1/2 for each node you go through. I called that process GENERALIZED_AMAF. 
I'm not familiar with mathematical terms, but I think the idea is dead simple. AMAF goes like this: evaluate the expected result KNOWING that move A has been played. GENERALIZED_AMAF: evaluate the expected result KNOWING that move A (AND) move B (AND) move C have been played. -- You can easily build, for example, an AMAF map, which holds both the moves that have been marked as played by black on one side, and the moves that have been played by white on the other side. That is associated with the result (black win/lose). Given a node, classical AMAF would then check every one of those maps, for each simulation starting from that node where the child you are trying to score has been played. Generalized AMAF can build statistics out of simulations from arbitrary ancestors of the considered node. You would get (1/2)^n of the ancestor's simulations as GENERALIZED_AMAF simulations (n being the number of choices made from the ancestor to the node you assess). Suppose that you have 1000 simulations for a root node R. Then AMAF gives you about 500 simulations for every child; let's call them Cn, where n is the id of the child. Cn also represents the move made from R to reach that child. Then for each child of Cn, you can get 250 generalized-AMAF simulations: you would consider, from the set of simulations from the root, only those where Cn has been played, and then aggregate the results in the AMAF fashion for each child of Cn. My prototype was scoring 30% wins against GNU Go level 0, without any tuning, using plain standard light simulations. It would use the generalized AMAF as a way to get RAVE values. Then it would guarantee that the most promising nodes are explored exponentially more. (It used a raw non-deterministic algorithm.) I did not, however, try to compare that win ratio with the win ratio I would have got out of standard AMAF. Best regards, Denis FIDAALI. 
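A minimal sketch of the generalized-AMAF idea as I read Denis's description — from the simulations recorded at an ancestor, keep only those in which every move on the path down to the node was played, then score a candidate child AMAF-style over the survivors. Each filtering step keeps roughly half the simulations, which is where the (1/2)^n rate comes from. The data layout (sets of `(color, point)` pairs) is my own assumption, not anything from the post:

```python
def generalized_amaf(simulations, path, candidate):
    """Generalized AMAF, one reading of Denis's post.

    simulations: list of (moves, black_won) pairs recorded at an ancestor,
                 where `moves` is a set of (color, point) tuples.
    path:        list of (color, point) choices from the ancestor to the node.
    candidate:   (color, point) child move being scored.
    Returns (virtual_wins, virtual_visits) for the candidate.
    """
    wins = visits = 0
    for moves, black_won in simulations:
        # keep only simulations consistent with every choice on the path...
        if all(m in moves for m in path) and candidate in moves:
            visits += 1                     # ...in which the candidate was played
            wins += 1 if black_won else 0   # scored from black's point of view
    return wins, visits
```

With an empty path this reduces to plain AMAF over the whole simulation set, which matches the claim that it generalizes the classical scheme.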
Date: Mon, 1 Dec 2008 21:55:03 +0100 From: [EMAIL PROTECTED] To: computer-go@computer-go.org Subject: Re: [computer-go] Mogo MCTS is not UCT ?

> I think it's now well known that Mogo doesn't use UCT. I realize that I have no idea at all what Mogo does use for its MCTS.

A complicated formula mixing (i) patterns (ii) rules (iii) RAVE values (iv) online statistics. Also we have a little learning (i.e. late parts of simulations are evolved based on online statistics and not only the early parts).

> I really wonder if there was an article describing the new MCTS of Mogo somewhere that I missed. How is it better than UCT?

http://www.lri.fr/~teytaud/eg.pdf contains most of the information (many other things have been tried and kept as they provided small improvements, but essentially the ideas are in this version). Best regards, Olivier
Re: [computer-go] regression testing
Yes, I asked that question. So GnuGo already uses a comment-node to store the information in the SGF tree. But twogtp uses information from the .tst file. So why the difference?

On 2-dec-08, at 01:18, terry mcintyre wrote: Someone recently asked about regression testing, and methods of expressing the expected results. Unfortunately, I can't find the post in question. The GnuGo regression suite appears to encode expected results in the .sgf file; here is an example:

$ cat Goemate990902-1.sgf
(;N[Goemate990902-1.gob]ID[Goemate990902-1.gob]BS[0]WS[0]GM[1]FF[3]SZ[19]
AB[eb][gb][lb][qb][cc][dc][hc][ic][kc][mc][qc][ad][dd][nd][od][pd][rd][be][de][ke][af][cf][ef][ff][gf][hf][mf][bg][cg][gg][mg][hh][ih][lh][nh][gi][hi][ji][ki][mi][qi][hj][ij][kj][mj][qj][bk][dk][ek][ik][jk][nk][qk][fl][gl][hl][nl][ql][cm][pm][kn][mn][nn][pn][sn][bo][jo][lo][po][dp][fp][jp][mp][pp][qp][rp][cq][eq][er][hr][ir][jr][gs]
AW[mb][nb][rb][ec][oc][pc][rc][bd][cd][ed][gd][hd][id][jd][qd][ce][ee][pe][qe][re][df][if][jf][dg][hg][ig][lg][og][qg][bh][ch][dh][fh][gh][jh][oh][ei][ni][oi][ej][fj][nj][pj][gk][hk][kk][lk][mk][pk][il][jl][ll][pl][rl][hm][km][lm][qm][rm][dn][gn][in][qn][eo][fo][qo][ep][hp][lp][np][op][fq][gq][lq][mq][pq][qq][rq][fr][kr][fs][js]
;C[move(rk,black):best ])

Terry McIntyre [EMAIL PROTECTED] -- Libertarians Do It With Consent!
Re: [computer-go] Monte-Carlo Tree Search reference bot
I have made some minor performance improvements and this is as far as I intend to take this particular project. I might make some small changes if necessary, but most likely I'll leave this largely unchanged from now on. I had set myself an arbitrary goal that it should do at least 20K playouts. But with real liberties, AMAF and a RAVE formula I got stuck in the 16K-17K range. According to my profiler that is mostly due to the expensive formula used to compare nodes, where it says it spends 25% of total time. The formula I'm using is:

beta * (virtual-win-ratio + RAVE) + (1-beta) * (win-ratio + UCT)

beta = 1 - log(parent-visits) / 20
UCT = exploration-factor * sqrt( log(parent-visits) / (nr-visits+1) )
RAVE = exploration-factor * sqrt( log(parent-visits) / (nr-virtual-visits+1) )

There are probably quite a few possibilities still to tune this program with regards to playing strength and performance. But I felt it doesn't help to obscure the implementation by too many details. The implementation of the search algorithm was entirely game-independent, until I introduced AMAF. I didn't see how to fix that, as there's no way getting around the fact that it's based on a move being represented by a single coordinate, which is mapped to an array that stores the statistical values. But strip the AMAF part, which is a block of 30-odd lines of code, and I think it can be used for other games basically as-is. I did this not because I ever see myself programming another game, but because in my experience doing so gives me a cleaner separation between modules. At 2,000 playouts, it's still quite a bit weaker than the plain MC-AMAF refbot. It wins only about 33%. But that's probably because the 1,000-2,000 playouts range is the sweet spot for that particular type of playing engine. It doesn't scale from there, whereas the MCTS ref-bot only just gets warmed up with 2,000 playouts. This leads me to a question. I suppose it might be of some interest to put this bot up on CGOS. 
But what parameters to use? The main one being the number of playouts, naturally. Mark
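The node-comparison formula above can be sketched directly. This is one reading of the post, not the actual refbot code: the exploration factor is not given, so a placeholder value is used, and the clamping of beta to [0, 1] (it goes negative once parent-visits exceeds e^20) is my assumption:

```python
from math import log, sqrt

EXPLORATION = 1.0  # assumed constant; the post doesn't state the value used

def selection_value(wins, visits, virtual_wins, virtual_visits,
                    parent_visits, c=EXPLORATION):
    """beta * (virtual-win-ratio + RAVE) + (1-beta) * (win-ratio + UCT),
    with beta = 1 - log(parent-visits)/20, per the post above."""
    beta = 1.0 - log(parent_visits) / 20.0
    beta = min(1.0, max(0.0, beta))          # clamp: an assumption, not stated
    uct = c * sqrt(log(parent_visits) / (visits + 1))
    rave = c * sqrt(log(parent_visits) / (virtual_visits + 1))
    win_ratio = wins / visits if visits else 0.0
    virtual_ratio = virtual_wins / virtual_visits if virtual_visits else 0.0
    return beta * (virtual_ratio + rave) + (1 - beta) * (win_ratio + uct)
```

For small parent-visit counts beta is near 1, so the virtual (AMAF) statistics dominate; as the parent accumulates visits the real win ratio plus UCT term takes over, which is the usual RAVE-style transition.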
[computer-go] log base e vs. log base 10
Just now I realized that I'm using the standard Java Math.log() function in places where it computes log(visits). In Java, this log() function is actually the logarithm of base e, which I suppose is normally written as ln(). When I read articles about UCT and they say log(), does that actually mean log base e, or log base 10? I figured it probably won't make an awful lot of difference. But there should be some difference. Just to make sure, I replaced Math.log() by Math.log10(). Now I'm seeing a slight degradation of play, so I suppose that should answer the question. That doesn't surprise me an awful lot; somehow it intuitively seems to make more sense to use log base e. But maybe adjusting the exploration-factor a little would bring them closer still. I just wanted to make sure... Another thing I tried was replacing log(virtual-parent-visits) by log(parent-visits) in the RAVE calculation. I see no effect on the level of play, so apparently it's a wash. But using the latter saves a little memory and/or (depending on your implementation) a little performance, since the log() function is expensive. Mark
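For what it's worth, the two bases are not genuinely different formulas: log10(x) = ln(x) / ln(10), so inside the UCT square root the change of base is just a constant factor of sqrt(1/ln 10) ≈ 0.659 on the whole exploration term. Rescaling the exploration factor by that constant makes the two variants identical, which is consistent with the guess above that tuning would bring them closer:

```python
from math import log, log10, sqrt, isclose

def uct_term(parent_visits, visits, c, logf=log):
    """Generic UCT exploration term with a pluggable log base."""
    return c * sqrt(logf(parent_visits) / (visits + 1))

# Switching log -> log10 is the same as shrinking c by sqrt(1/ln(10)):
scale = sqrt(1.0 / log(10))          # ≈ 0.659
assert isclose(uct_term(5000, 30, 1.0, log10),
               uct_term(5000, 30, 1.0 * scale, log))
```

So any strength difference observed must come from the effective exploration constant changing, not from the base itself.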
Re: [computer-go] Mogo MCTS is not UCT ?
On 1-dec-08, at 18:55, Olivier Teytaud wrote:

> I think it's now well known that Mogo doesn't use UCT. I realize that I have no idea at all what Mogo does use for its MCTS. A complicated formula mixing (i) patterns (ii) rules (iii) RAVE values (iv) online statistics.

Isn't that technically still UCT? I mean, you use different input and probably a different formula, but most likely what you do is still establish an upper bound on the extent to which you trust the win-ratio (and possibly other data) to determine which node to extend next. When that upper bound is passed, you decide to extend a less promising node to make sure you don't overlook an unlikely but possibly very good candidate. It's just that people here have come to associate UCT with a particular formula, but that formula is not the only way to establish an upper confidence bound. Mark
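To make the point concrete, here is the textbook UCB1 rule that the name "UCT" usually refers to. Any rule of the shape "estimate + optimistic uncertainty bonus" selects moves in the same spirit, whatever inputs feed the estimate — which is the argument being made above. This is the standard bandit formula, not anything specific to MoGo:

```python
from math import log, sqrt

def ucb1(wins, visits, parent_visits, c=1.0):
    """Plain UCB1: mean value plus a confidence bonus that shrinks
    as a move is visited more, so neglected moves eventually get tried."""
    if visits == 0:
        return float('inf')        # unvisited moves are always tried first
    return wins / visits + c * sqrt(log(parent_visits) / visits)

# Pick the child with the highest upper confidence bound:
stats = [(3, 10), (5, 10), (0, 0)]   # (wins, visits) per child
best = max(range(len(stats)),
           key=lambda i: ucb1(stats[i][0], stats[i][1], 20))
```

Replacing `wins / visits` with a blend of patterns, rules, RAVE values and online statistics changes the estimate, but the select-by-upper-bound mechanism stays the same.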
Re: [computer-go] RAVE formula of David Silver (reposted)
Indeed, the scaling question is very important. Even though I think I have AMAF/RAVE working now, it's still not so clear-cut what it's worth. With just 2,000 playouts I'm seeing an 88% win-rate against plain old UCT tree-search without RAVE. At 10,000 playouts this win-rate drops to 75%. With 50,000, to 69%. All of these results have a margin of error of a few points, but the trend is obvious: UCT plays weaker than UCT+RAVE, but it scales a little better. This doesn't necessarily mean they converge. From the few data-points that I have, it looks like UCT+RAVE might converge to a winning rate of 66% against plain UCT search with playouts in the hundred-thousands or millions. Is that about 100 ELO points? That in itself would be justification enough to keep it. But there's a computation cost as well. Plus, as soon as you start to introduce other move-selection procedures it may eat into the gain RAVE provides even further. Anyhow, the way I have it set up now I can easily switch between using AMAF information to compute RAVE or not. There are also still some parameters to tune. So this is not the last word on it by far. It's more like a good starting point. Also, even if it's not something to use in a final playing engine, it's good to have a 'base-line' that provides the best possible time/strength combination to run quick tests against. Is there actually a well-understood basis for the diminishing return of UCT+RAVE vs. UCT? I have given it a little thought, but it's not entirely obvious to me why UCT+RAVE wouldn't scale better than what I'm seeing. I've also run into a few 'fluke' results. Winning streaks of a dozen games in a row (or more) happen between equally strong programs. So to be reasonably sure I'd like to get about 1,000 games. If you want to make sure two implementations are equivalent, like in the case of the ref-bots, I'd recommend 10,000 games. If all I want to know is whether something is an improvement or not, then I usually settle for fewer games. 
If after a (few) hundred games I see a win-rate of 50% or less I decide it's not an improvement (not one worth anything anyway); if I see a winning rate of around 60% or more I keep it. Anything in between, I might decide to let it run a bit more. The improvements that I keep I run with longer thinking times overnight to see if they scale. After all, the only real test worth anything is under realistic playing circumstances. Mark

On 29-nov-08, at 11:32, Don Dailey wrote: On Sat, 2008-11-29 at 11:58 +0100, Denis fidaali wrote:

> From my own experience, an important testing case whenever trying to get AMAF to work, is the scaling study.

No truer words were ever spoken. This is one of the secrets to strong programs: if they scale, they are probably soundly designed. I do that with Chess. I find that some program changes scale up, particularly sound algorithms that reduce the branching factor. I have to run tests pretty fast in order to get results I can interpret, but I also plot the results visually with gnuplot. As many here will recall, my own Fatman study vs Leela showed that Leela scaled better with increasing depth than Fatman. Nothing like a graph to reveal this very clearly, although you can also look at the numbers if you are careful. It's important to point out that you will be completely misled if you don't get enough samples. It's very rare that 100 or 200 games are enough to draw any conclusions (unless it's really lopsided). I remember once thinking I had found a clear scalable improvement but decided that it must run longer - but I was hopeful. When the improvement started to decline, I discovered that I had by accident been running the same exact program against itself. The point is that it is not uncommon to get really lucky, and have an equal program look substantially superior - for a while. 
- Don

> My prototype was quite strong considering that it used only 1000 light playouts (and scored 25-30% wins against GnuGo level 0), but it seemed to not get much over that as the number of playouts grew... (Also, it had a serious exponential complexity problem, which I never got around to investigating :) ) I know that Zoe was about 2000 Elo with, I think, 50k simulations, and... never got any real better as the number of simulations increased. Both prototypes were toying with AMAF, so I really think you need a bit of scalability study whenever trying to assess an engine employing it. (Although it could very well be that the scalability trouble came out of some nasty bugs. Both aforementioned prototypes were quite messy...)

From: [EMAIL PROTECTED] Subject: Re: [computer-go] RAVE formula of David Silver (reposted) Date: Sat, 29 Nov 2008 03:39:58 -0200 To: computer-go@computer-go.org CC: On 28-nov-08, at 17:28, [EMAIL PROTECTED] wrote: I would be very interested to see the RAVE code from Valkyria. I'm sure others would
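As a sanity check on the "is 66% about 100 ELO points?" question above, the standard logistic Elo model converts a win rate into a rating gap as 400 * log10(p / (1 - p)):

```python
from math import log10

def elo_gap(win_rate):
    """Elo difference implied by a head-to-head win rate,
    under the standard logistic Elo model."""
    return 400.0 * log10(win_rate / (1.0 - win_rate))

# 66% works out to roughly 115 Elo, so "about 100 ELO points" is in
# the right ballpark; 69% is ~139 Elo and 75% is ~191 Elo.
```

So the observed drop from 88% to 75% to 69% corresponds to the RAVE advantage shrinking from several hundred Elo toward something just over 100.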
Re: [computer-go] RAVE formula of David Silver (reposted)
On 30-nov-08, at 16:51, Jason House wrote:

> You've claimed to be non-statistical, so I'm hoping the following is useful... You can compute the likelihood that you made an improvement as: erf(# of standard deviations). Where # of standard deviations = (win rate - 0.5) / (0.5 / sqrt(#games)). Erf has no simple closed form, and in practice people use lookup tables to translate between standard deviations and confidence levels. In practice, people set a goal confidence and directly translate it to a number of standard deviations (3.0 for 99.85%). This situation requires the one-tailed p test. After about 20 or 30 games, this approximation is accurate and can be used for early termination of your test.

Lately I use twogtp for my test runs. It computes the winning percentage and puts a ± value after it in parentheses. Is that the value of one standard deviation? (I had always assumed so.) Even after 1,000 games it stays in the 1.5% neighbourhood. Maybe 20-30 games usually gives an accurate approximation. But if you perform tests often, you'll occasionally bump into that unlikely event where what you thought was a big improvement turns out to be no improvement at all. Or the other way around. Only when I see 20+ games with a zero winning percentage do I stop a test early, assuming I made a mistake. Mark
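The calculation described above can be written out directly. The key point is that the per-game standard deviation of a 50/50 result is 0.5, so the standard error of the win rate over n games is 0.5/sqrt(n) — about 1.6% at 1,000 games, which matches the ± value twogtp reports (so yes, that ± is one standard deviation of the win rate). This is a standard normal-approximation sketch, not twogtp's actual code:

```python
from math import erf, sqrt

def improvement_confidence(wins, games):
    """One-tailed confidence that the true win rate exceeds 50%,
    using the normal approximation to the binomial."""
    p = wins / games
    z = (p - 0.5) / (0.5 / sqrt(games))      # number of standard deviations
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))  # one-tailed normal CDF

def std_err(p, games):
    """One standard deviation of an observed win rate p over `games` games
    (the twogtp-style ± value, assuming that's what it reports)."""
    return sqrt(p * (1.0 - p) / games)
```

For example, 550 wins out of 1,000 is about 3.2 standard deviations above even, i.e. better than 99.9% confidence of a real improvement, while the same 55% over 100 games is only about one standard deviation.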
[computer-go] Test set for reference bots
I think I've seen some discussion a while ago about a set of tests used to test playing engines. I suppose it would make sense to compile such a set for the reference bots. Matching a win-rate and game-length after 1,000,000 playouts is a quick first milestone for verification that can be done quickly. A winning rate of something very close to 50% after thousands of games takes hours if not a whole day. A set of situations with their expected results can provide something in between. Maybe the original set posted by Yamato can be used as a starting point. But I'd also include some very simple situations that check whether the engines properly implement things like ko, super-ko, seki, end of game etc... So far I've been using unit-tests for things like this. But that's not very easy to transfer to other engine implementations. I haven't used the GoGui regression-test application yet but I'm planning to look into it. The one thing I did notice is that the problem diagram and the problem's solution are separated. Wouldn't it have been easier to include the latter in the SGF? A simple format like #c? [result-set], where c is the color to play, similar to the format already used. Put that as a comment in the first or last SGF node and it should be easy enough to parse. Or you could even define a special SGF property for this information. Maybe there's a well thought out reason why it is the way it is? Mark
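The proposed comment format would indeed be trivial to parse. The sketch below is entirely hypothetical — the `#c? [result-set]` syntax is only proposed above, so the separator inside the result set (`|` here) and the exact spelling are my assumptions:

```python
import re

# Hypothetical parser for the proposed "#c? [result-set]" comment format,
# e.g. "#B? [rk|qd]" meaning: black to play, rk or qd are accepted answers.
PATTERN = re.compile(r'#([BW])\?\s*\[([^\]]*)\]')

def parse_test_comment(comment):
    """Return (color, set of accepted moves), or None if the comment
    carries no test annotation."""
    m = PATTERN.search(comment)
    if m is None:
        return None
    color, results = m.groups()
    return color, set(results.split('|'))
```

Embedding this in the SGF C[] property of the first or last node, as suggested, would keep diagram and solution in one file while staying readable to any SGF viewer.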
Re: [computer-go] Monte-Carlo Tree Search reference bot
Denis, thanks for the feedback. What you describe is pretty much what I do, with the small difference that from A I don't assign any real scoring to the first move unless A has had the minimum number of playouts before it expands further. So assume that limit is 10 playouts. For the first 10 playouts all the virtual scores are distributed among A's children. Then it chooses the child with the best AMAF score (let's call that B), creates its children nodes, and does a playout. Then the virtual score is distributed over the child-nodes of B. And over the child nodes of A. I've never seen distributing the AMAF values up the tree mentioned, so maybe it's not necessary. I didn't see much difference whether I did or didn't. The idea is not to waste any AMAF information gathered. So it acts exactly like Don's ref-bot if it never expands. Also, if it expands after each playout it naturally always awards the real score to the first move, making this approach the same as what you described again. It's quite possible the devil is in the details. At one time I ran a test where I did see the expected gain of AMAF. And then I didn't manage to reproduce it. So I don't know if I accidentally ran the wrong version or what. Usually when I run into an anomaly that I can't reproduce I rely on Time Machine to go back to what my computer was like when I started the test. But incidentally, just when running that test I had unplugged the disk to take elsewhere. Murphy's law, I guess. I did expect to end up with two different versions. But I already have a different ref-bot implementation and I have a different tree-search implementation. Having those two anchors I can freely experiment with the current one. If that leads me to split up versions further I'll do that, but I haven't found reason to do so yet. I'll also try weighting the real simulations more, like you did. I just didn't have the time yet. Hopefully I can do a lot more experimenting next week. 
Mark

On 28-nov-08, at 06:32, Denis fidaali wrote: My own experiments showed a very big increase in strength, especially with about 1000 light playouts. I had many features and parameters which may make a difference. But my main hypothesis was that it shouldn't... I think that I weighted real simulations (aka first move) 10 times more than virtual ones (aka AMAF moves). I assume that by expansion you mean allocating data for every child of a node. So let's assume that your root node is called A; you would then allocate for every one of its children. Then when making a simulation run from A, you would first apply the UCT formula to determine which first move you should make... Then you would both get back the AMAF score for every move that has been played during the simulation, for virtual scoring, AND use the true first move for real scoring (the normal UCT part). Of course, this shouldn't behave exactly like AMAF, because for one thing you can get a better judgment for the first move because of UCT, and then you also make the first move weigh more. (Although you probably technically should also do so with AMAF, despite the first move still being 100% random. Which is part of what the weighted AMAF does. But the first-move weight would be arbitrary, thus defeating the purpose of a standard way.) Last, you could try to downgrade the weight of the virtual-score part as the number of simulations grows, like (if nbsim < 1000): winRatio = [realRatio * (1000+nbsim)]/3000 + [virtualRatio * (2000-(1000+nbsim))]/3000. I'm not sure about the formula :) but you get the idea of taking the realRatio more and more into account. I think what I did was not take the realRatio into account at all at first for estimating the winRatio, and then increase it linearly to the point where I wouldn't take into account the virtualRatio. The drawback with all that is that you will certainly have to make several versions, although they would share a lot of code. 
I don't think that trying to get one version to behave as several is a good idea. You probably should keep the code safe, once you have made a lot of testing of it, for reference purposes. Then you'll be able to know exactly the strength of a particular piece of software, even two years from now. I didn't do that. I tried to make it evolve. And with poor svn versioning capacity, and poor management of the measures/software link, I'm now at a point where all I can do is make random guesses, although I made millions of experiments in the past years :) I just can't link those experiments back to a fixed piece of software. Then, if by any means you think a code is bug-free (which is hard to bet on), you really should put that somewhere, with all the proof (and experiments) you can associate with it. (Even on a web page if you ask me, or a thesis.) To me that is the most useful part of the standard ref bot :
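Denis's verbal description — ignore the real ratio at first, then phase it in linearly until the virtual ratio is ignored entirely — can be sketched as below. Note this follows his description rather than his bracketed formula (he flags the formula as uncertain himself, and as written its weights don't sum to one), and the 1000-simulation ramp is just the example figure from his post:

```python
def blended_ratio(real_ratio, virtual_ratio, nbsim, ramp=1000):
    """Phase the real (UCT) win ratio in linearly over the first `ramp`
    simulations, starting from the pure virtual (AMAF) ratio.
    One reading of Denis's scheme; the exact weights are left open."""
    w = min(nbsim / ramp, 1.0)               # 0 at the start, 1 after `ramp`
    return w * real_ratio + (1.0 - w) * virtual_ratio
```

At nbsim = 0 this returns the virtual ratio alone; at nbsim >= 1000 the virtual statistics no longer contribute at all, matching the stated intent.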
Re: [computer-go] RAVE formula of David Silver (reposted)
Thanks for posting that, Rémi. I do remember seeing that before, but somehow I didn't notice it when looking for RAVE-related stuff recently. Mathematics is not my strong point, so I have a hard time making sense of those formulas. I do get the gist that it uses a UCT value and a RAVE value in a similar fashion, one based on actual playouts and the other based on virtual playouts (based on AMAF). The balance in which the two values influence node-selection is calculated by beta, which favours UCT for frequently visited nodes and RAVE for infrequently visited nodes. But I'm not totally clear on what b_r and q_ur actually are in formula (11). (I don't know how to denote subscript symbols in mail.) At first glance this seems to be a bit more sophisticated version of what Denis was trying to explain. What is also not clear to me from the article is how this UCT-RAVE value is used after it's calculated. In plain UCT search you select the node with the highest win/loss+UCT value. How does the virtual win/loss ratio get used in combination with the UCT-RAVE value resulting from formula (14)? Is this explained in the original by Gelly and Silver? Mark

On 28-nov-08, at 07:38, Rémi Coulom wrote: Hi Mark, Maybe you missed the nice RAVE formula that David Silver posted in this message: http://computer-go.org/pipermail/computer-go/2008-February/014095.html Unfortunately, the list archive does not keep attachments. I attached another copy to this message. I am not sure it is better than your formula, but I thought it would be good to repost it, since it seems that it is not available online anywhere. Rémi [attachment: rave.pdf]
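For readers without the attachment: the mixing schedule commonly cited from Gelly and Silver's MC-RAVE work has the shape below. I can't verify it against the reposted rave.pdf, so treat this as the generally quoted form rather than a transcription — the bias parameter `b` in particular is game- and engine-specific, and 0.1 here is purely illustrative:

```python
def silver_beta(n, n_rave, b=0.1):
    """RAVE mixing weight as commonly cited from Gelly & Silver:
    beta = n_rave / (n + n_rave + 4 * n * n_rave * b^2).
    beta -> 1 when real visits n are few, -> 0 as n grows, so the
    blend shifts from AMAF statistics toward real playout statistics."""
    return n_rave / (n + n_rave + 4.0 * n * n_rave * b * b)

def uct_rave_value(wins, n, rave_wins, n_rave, b=0.1):
    """Blend of real and virtual win ratios: (1-beta)*Q + beta*Q_rave.
    The exploration bonus, if any, would be added on top of this."""
    beta = silver_beta(n, n_rave, b)
    q = wins / n if n else 0.0
    q_rave = rave_wins / n_rave if n_rave else 0.0
    return (1.0 - beta) * q + beta * q_rave
```

This also answers the "how is it used" question in the usual reading: the blended value simply replaces the plain win ratio in the node-selection comparison.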
Re: [computer-go] RAVE formula of David Silver (reposted)
On 28-nov-08, at 17:28, [EMAIL PROTECTED] wrote:

> I would be very interested to see the RAVE code from Valkyria. I'm sure others would be too.

I'm much more interested in a general, concise description. If such a description cannot be given easily, then I think there's little point including it in the definition of a MCTS reference engine. I found a serious flaw in my code collecting the AMAF scores, which explains why I wasn't seeing any gains so far with AMAF turned on. Now over the first 100+ games UCT+RAVE scores 90% over plain UCT. I'm going to run a test overnight, but so far it looks good. It should have collected a few thousand samples by tomorrow. Hopefully next week I can go back to testing my list of playout improvements, which is why I started making the MCTS reference implementation in the first place. This RAVE stuff caused a bit of a distraction, but it's nice to have if it works. Mark
Re: [computer-go] Re: hex robot
Denis, a CGOS equivalent for Hex would probably be good to have. But since the CGOS server software is open source you can easily adapt it. GTP you can simply use as-is; I don't see why that wouldn't work. GoGui is also open source and can probably be adapted to Hex easily as well. But to be honest, I don't really need a GUI that much; twogtp, on the other hand, is really useful. A KGS equivalent will only come into existence if there are actually enough people playing the game. You say you're going to try to make a prototype first and then, when it shows promise, move to a more flexible language like Java. What language are you intending to use? It seems the wrong way around to me: develop the prototype in a flexible language and, when it seems to work, move it to a specific language that can leverage some specific CPU features. That seems to make much more sense to me. If you want to save even more time, you can consider using my CGOS client and plugin framework. Most of it can probably be used as-is. Good luck. Mark

On 27-nov-08, at 07:52, Denis fidaali wrote: Hi. I have put a lot of thought into that Hex bot stuff. I realize now how eager I am to try my ideas with a Hex bot. It's been a long time since I first realized how much more elegant it would be to try those ideas on the Hex game anyway. Your site seems a great source of interest for that game. When I tried to find a community for computer Hex, I stumbled upon it. But what it really lacks (or I wasn't able to find, anyway) is a strong community like there is for Go: a CGOS equivalent, a GTP equivalent, a GoGui equivalent, a KGS equivalent. That made it hard for me to settle on putting a lot of work into the Hex bot. But as I said, it seems so much more reasonable to me to first try my ideas with Hex, rather than with Go. It seems that Monte Carlo is as sound for Hex as it is for Go, given that some of the good-playing programs are Monte Carlo ones, although it seems the strongest one is not. 
So what I will do is make up a prototype that suits my goals: a 64-bit assembly Monte Carlo prototype, with all the tricks I want to try in it. I will then provide a Java GUI for playing against one instance of the bot, when there is hardware online willing to support it. I will use a protocol that seems promising, or make up a new one from scratch if nothing strikes me as being standard. It would be logical to use the same protocol that the Computer Games Olympiad uses, but I wasn't able to figure out where to find it. So if all goes well, I should have a prototype available for trying punctually through a Java applet by the end of January 2009. If this prototype shows promise, then I might try to port it to a more flexible language (like Java). The name of the prototype will be Stohex.

Date: Wed, 26 Nov 2008 13:56:19 -0800 To: computer-go@computer-go.org From: [EMAIL PROTECTED] Subject: [computer-go] Re: hex robot At 01:31 PM 11/26/2008, Denis fidaali wrote:

> Speaking of hex ... I really think it would be a nice intermediary game before tackling the complexity of go. Do you know of any good community (and protocol equivalent to GTP) where i could start to look for submitting a bot ?

There are a couple of pretty decent programs and some nice underlying theory. Also a perfect-play bot for a small board. I would start at the Hex wiki page to research them. A credible Hex bot is on my wish list for Boardspace. The one I wrote is laughably weak, but it will be a significant effort to write a better one. If you're willing to work in Java and within the constraints of a realtime web site, let's talk.