Re: [SPAM] Re: [SPAM] Re: [computer-go] First ever win of a computer against a pro 9P as black (game of Go, 9x9).
> > > >If there are people interested in a ph.D. or a post-doc around Monte-Carlo > >Tree Search, candidates are welcome (Monte-Carlo Tree Search, and not > >necessarily / not only computer-go). > > Excuse me, but what press conference and where to ask? > People interested in a ph.D. or a post doc can contact me. >This was during a press conference at Taipei around a French-Taiwanese > grant > >for joint research. > but I can find no links even with Google. > I'll ask to the taiwanese people if there is something on the web about the press conference. I was only there through a video. I don't know if there is something on the web. This is essentially for the launching of a France/Taiwan collaboration around Monte-Carlo Tree Search, I guess there are not thousards of journalists from tenths of countries interested in it :-) Best regards, Olivier ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [SPAM] Re: [SPAM] Re: [SPAM] Re: [computer-go] First ever win of a computer against a pro 9P as black (game of Go, 9x9).
> > > But is it shown that "the score is well done" for these properties to > hold in case of RAVE-guided exploration? Since it massively perpetuates > any kind of MC bias... > This only matters for the fact that we don't visit all the tree. For the consistency (the fact that asymptotically we will find the best possible decision), there's no problem. If "score ~ success rate" for n--> infinity (which holds for most usual rules, including rave rules) we also have that, for binary games, we have some good properties on the part of the tree which is visited. Please not that I do not claim that major improvements are possible in computer-go thanks to this. We only observed a very small improvement on success rates, and a good behavior on the situation which appeared during the game against Fan Hui. It might be interesting to know, for people who have similar problems in their bot (a situation in which, even with huge computation time, the good estimate is not found), they solve it with this. > > > We use a stupid method, i.e. the success rate. The pattenrs are bigger > than > > 3x3, with jokers in them. Bandits (Bernstein races, slightly modified) > are > > used > > for distributing the computational effort among the tested patterns. > > Thank you for pointing me to more study material. :-) > > The following paper is great for Bernstein races: http://icml2008.cs.helsinki.fi/papers/523.pdf Please note, however, that we had only very small improvements with races. Maybe our code has had too many tuning steps in the past for being strongly improved by random generation of patterns and bernstein races for validating them. Best regards, Olivier ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [SPAM] Re: [SPAM] Re: [computer-go] First ever win of a computer against a pro 9P as black (game of Go, 9x9).
On Tue, Oct 27, 2009 at 06:32:44PM +0200, Olivier Teytaud wrote: > > > > > > AIUI, once upon N simulations in a node you take let's say the node with > > the lowest value, pick one son of it at random within the tree and start > > a simulation? > > > > I'll try to write it clearly (for binary deterministic games, extensions can > be shown but they are too long and out of topic in this mailing list I guess > :-) ): > > If (average value of father < threshold ) > then > randomly pick up one son >else >pick up the son with maximum score > end > Aha, thanks for clearing that up. > If the score is asymptotically equivalent to the success rate, and if the > threshold is >0 and < 1, then this ensures > consistency (convergence to optimal move). If the score is well done, then > this is consistent without visiting all the tree. But is it shown that "the score is well done" for these properties to hold in case of RAVE-guided exploration? Since it massively perpetuates any kind of MC bias... > > Wow - one of my planned little projects was genetic development of the > > 3x3 patterns... To evaluate patterns, do you use tournaments or some > > smarter method? I feared one generation would take awfully long... > > > > We use a stupid method, i.e. the success rate. The pattenrs are bigger than > 3x3, with jokers in them. Bandits (Bernstein races, slightly modified) are > used > for distributing the computational effort among the tested patterns. Thank you for pointing me to more study material. :-) -- Petr "Pasky" Baudis A lot of people have my books on their bookshelves. That's the problem, they need to read them. -- Don Knuth ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [SPAM] Re: [SPAM] Re: [computer-go] First ever win of a computer against a pro 9P as black (game of Go, 9x9).
> > > AIUI, once upon N simulations in a node you take let's say the node with > the lowest value, pick one son of it at random within the tree and start > a simulation? > I'll try to write it clearly (for binary deterministic games, extensions can be shown but they are too long and out of topic in this mailing list I guess :-) ): If (average value of father < threshold ) then randomly pick up one son else pick up the son with maximum score end If the score is asymptotically equivalent to the success rate, and if the threshold is >0 and < 1, then this ensures consistency (convergence to optimal move). If the score is well done, then this is consistent without visiting all the tree. UCT (with non-zero constant) visits all the tree, and does so infinitely often. UCT (with zero constant) does not visit all the tree, but it is not necessarily consistent. Wow - one of my planned little projects was genetic development of the > 3x3 patterns... To evaluate patterns, do you use tournaments or some > smarter method? I feared one generation would take awfully long... > We use a stupid method, i.e. the success rate. The pattenrs are bigger than 3x3, with jokers in them. Bandits (Bernstein races, slightly modified) are used for distributing the computational effort among the tested patterns. Best regards, Olivier ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/