Re: [SPAM] Re: [SPAM] Re: [computer-go] First ever win of a computer against a pro 9P as black (game of Go, 9x9).

2009-10-28 Thread Olivier Teytaud
>
>
> >If there are people interested in a ph.D. or a post-doc around Monte-Carlo
> >Tree Search, candidates are welcome (Monte-Carlo Tree Search, and not
> >necessarily / not only computer-go).
>
> Excuse me, but what press conference and where to ask?
>

People interested in a ph.D. or a post doc can contact me.

>This was during a press conference at Taipei around a French-Taiwanese
> grant
> >for joint research.
> but I can find no links even with Google.
>

I'll ask to the taiwanese people if there is something on the web about the
press conference. I was only there through
a video. I don't know if there is something on the web. This is
essentially for the launching of a France/Taiwan collaboration around
Monte-Carlo Tree Search, I guess there are not thousards of journalists from
tenths of countries interested in it :-)

Best regards,
Olivier
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [SPAM] Re: [SPAM] Re: [SPAM] Re: [computer-go] First ever win of a computer against a pro 9P as black (game of Go, 9x9).

2009-10-28 Thread Olivier Teytaud
>
>
> But is it shown that "the score is well done" for these properties to
> hold in case of RAVE-guided exploration? Since it massively perpetuates
> any kind of MC bias...
>

This only matters for the fact that we don't visit all the tree. For the
consistency (the fact that
asymptotically we will find the best possible decision), there's no problem.
If "score ~ success rate" for n--> infinity (which holds for most usual
rules, including rave rules) we also
have that, for binary games, we have some good properties on the part of the
tree which is visited.

Please not that I do not claim that major improvements are possible in
computer-go thanks to this.
We only observed a very small improvement on success rates, and a good
behavior on the situation
which appeared during the game against Fan Hui. It might be interesting to
know, for people who have
similar problems in their bot (a situation in which, even with huge
computation time, the good estimate is
not found), they solve it with this.

>
> > We use a stupid method, i.e. the success rate. The pattenrs are bigger
> than
> > 3x3, with jokers in them. Bandits (Bernstein races, slightly modified)
> are
> > used
> > for distributing the computational effort among the tested patterns.
>
> Thank you for pointing me to more study material. :-)
>
>
The following paper is great for Bernstein races:

http://icml2008.cs.helsinki.fi/papers/523.pdf

Please note, however, that we had only very small improvements with races.
Maybe our code has had too many tuning steps in the past for being strongly
improved
by random generation of patterns and bernstein races for validating them.

Best regards,
Olivier
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [SPAM] Re: [SPAM] Re: [computer-go] First ever win of a computer against a pro 9P as black (game of Go, 9x9).

2009-10-27 Thread Petr Baudis
On Tue, Oct 27, 2009 at 06:32:44PM +0200, Olivier Teytaud wrote:
> >
> >
> > AIUI, once upon N simulations in a node you take let's say the node with
> > the lowest value, pick one son of it at random within the tree and start
> > a simulation?
> >
> 
> I'll try to write it clearly (for binary deterministic games, extensions can
> be shown but they are too long and out of topic in this mailing list I guess
> :-) ):
> 
> If (average value of father < threshold )
> then
> randomly pick up one son
>else
>pick up the son with maximum score
> end
> 

Aha, thanks for clearing that up.

> If the score is asymptotically equivalent to the success rate, and if the
> threshold is >0 and < 1, then this ensures
> consistency (convergence to optimal move). If the score is well done, then
> this is consistent without visiting all the tree.

But is it shown that "the score is well done" for these properties to
hold in case of RAVE-guided exploration? Since it massively perpetuates
any kind of MC bias...

> > Wow - one of my planned little projects was genetic development of the
> > 3x3 patterns... To evaluate patterns, do you use tournaments or some
> > smarter method? I feared one generation would take awfully long...
> >
> 
> We use a stupid method, i.e. the success rate. The pattenrs are bigger than
> 3x3, with jokers in them. Bandits (Bernstein races, slightly modified) are
> used
> for distributing the computational effort among the tested patterns.

Thank you for pointing me to more study material. :-)

-- 
Petr "Pasky" Baudis
A lot of people have my books on their bookshelves.
That's the problem, they need to read them. -- Don Knuth
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [SPAM] Re: [SPAM] Re: [computer-go] First ever win of a computer against a pro 9P as black (game of Go, 9x9).

2009-10-27 Thread Olivier Teytaud
>
>
> AIUI, once upon N simulations in a node you take let's say the node with
> the lowest value, pick one son of it at random within the tree and start
> a simulation?
>

I'll try to write it clearly (for binary deterministic games, extensions can
be shown but they are too long and out of topic in this mailing list I guess
:-) ):

If (average value of father < threshold )
then
randomly pick up one son
   else
   pick up the son with maximum score
end

If the score is asymptotically equivalent to the success rate, and if the
threshold is >0 and < 1, then this ensures
consistency (convergence to optimal move). If the score is well done, then
this is consistent without visiting all the tree.

UCT (with non-zero constant) visits all the tree, and does so infinitely
often.
UCT (with zero constant) does not visit all the tree, but it is not
necessarily consistent.

Wow - one of my planned little projects was genetic development of the
> 3x3 patterns... To evaluate patterns, do you use tournaments or some
> smarter method? I feared one generation would take awfully long...
>

We use a stupid method, i.e. the success rate. The pattenrs are bigger than
3x3, with jokers in them. Bandits (Bernstein races, slightly modified) are
used
for distributing the computational effort among the tested patterns.

Best regards,
Olivier
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/