Re: [computer-go] November KGS bot tournament: large boards, slow

2008-11-15 Thread Nick Wedd

Reminder - it's tomorrow

In message [EMAIL PROTECTED], Nick Wedd 
[EMAIL PROTECTED] writes
Registration is now open for the next bot tournament on KGS, which will 
be on Sunday November 16th.  Each division will be a 5-round Swiss on 
19x19 boards, 43 minutes each plus very fast Canadian overtime of 25 
moves in 20 seconds.  The divisions will start at 16:00 and 16:05 UTC 
(= GMT) respectively, and end seven hours thirty minutes later.


Note that the event is not this weekend, but next.  I will be away on 
holiday this weekend, and will not receive emails until I return on the 
morning of Wednesday 12th.


Registration is as described at
http://www.weddslist.com/kgs/how/index.html

When you register you should tell me the processor power (number of 
processors, processor speed, and any other significant details) of the 
platform that your program will be running on.  This information will 
appear in my report of the event, making comparisons between programs 
more meaningful.  I have limited understanding of processor power, and 
am likely to publish it as given; see for example the final section of

http://www.weddslist.com/kgs/past/43/index.html

The tournaments are on the KGS site at
   http://www.gokgs.com/tournInfo.jsp?id=429 and
   http://www.gokgs.com/tournInfo.jsp?id=430
These pages may give the times of the rounds in your local timezone,
depending on your browser and its settings.

Nick


--
Nick Wedd    [EMAIL PROTECTED]
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


RE: [computer-go] Monte carlo play

2008-11-15 Thread dave.devos
Hello Tony,
 
I'm just speaking for myself here, but I suspect you may not get the answers 
you are looking for, because you're not asking the right questions.
 
I get the impression that the situation is similar to this scenario:
 

A young guy with a surfboard and a Hawaiian shirt turns up at 
base camp on Mount Everest and says:
Hey dudes, what's up!
This morning I had this wonderful idea: I wanna climb Mount 
Everest!
So here I am, and I even brought my hiking shoes!
But can anyone give me directions to the nearest McDonalds, cuz 
I'm starving.
 
Everybody in the tent stares at him speechlessly.

I may be wrong, but that's my impression.
I'm not even sure whether I myself should be here at base camp, even though I 
am a 4d in Go and even though I have 25 years of programming experience in a 
couple of different languages (including the last 8 years as a full-time 
professional developer).
It's only logical that ever since I learned Go 20 years ago I've had a 
special interest in computer-go, and I have spent some time over the years 
attempting to build a strong go-playing program myself (although not nearly as 
much time as some others here).
I'm a mountaineer, but climbing Mount Everest is something else. I feel a 
bit like a rookie among all the seasoned mountaineers hanging around here at 
the Everest base camp. Most of the time I keep my mouth shut when one of them is 
talking, because I'm a bit apprehensive that I might say something that is 
utterly obvious to the others, or even outright stupid.
 
And then this guy turns up in his Hawaiian shirt.
 
I hope you don't feel offended. You have indeed taken up a wonderful endeavour, 
but I sense that you're not quite ready to go for the summit today.
 
Best Regards,
Dave



From: [EMAIL PROTECTED] on behalf of tony tang
Sent: Thu 13-11-2008 22:03
To: go mailing list
Subject: RE: [computer-go] Monte carlo play


Thanks Darren,
 
After reading a couple of interesting papers in this field I am applying 
pattern matching to limit the number of possibilities in the Monte Carlo tree 
(on 19x19 it appears to take a day to play).
I was wondering if you could comment on whether my following concept is right 
regarding the evaluation part.
 
The possible patterns for the current situation are passed to a class where 
self-play begins. Each different pattern runs in an instance, and at the end the 
results are written to a file. The game tree will be constructed using the 
stats from the file. 
 
Thanks again
 
Tony
 
 


 Date: Wed, 12 Nov 2008 10:34:21 +0900
 From: [EMAIL PROTECTED]
 To: computer-go@computer-go.org
 Subject: Re: [computer-go] Monte carlo play
 
  I am new to programming go, could someone explain to me how a Monte
  Carlo based evaluation manages to play random games by itself? ie:
  who/what is the opponent which supplies the opposing moves which
  allows another move to be randomly played after making the initial
 
 It is self-play, so both players are random.
 
 I'm not aware of any programs that use different playout strategies for
 black/white, though it has been briefly mentioned before on this list,
 and I personally think it might be interesting as a way of discovering
 and more accurately evaluating the imbalances present in most go
 positions (i.e. usually one side has more strength and one side has more
 territory, and so to correctly understand the position one side has to
 play aggressive moves and the other has to play defensive moves).
 
  move at the root. I am implementing in java, is there a
  package/framework which allows me to train my software.
 
 This page, recently started by Eric Marchand, might be a good starting
 place:
 http://ricoh51.free.fr/go/engineeng.htm
 
 Darren
 
 
 -- 
 Darren Cook, Software Researcher/Developer
 http://dcook.org/mlsn/ (English-Japanese-German-Chinese-Arabic
 open source dictionary/semantic network)
 http://dcook.org/work/ (About me and my work)
 http://dcook.org/blogs.html (My blogs and articles)






Re: [computer-go] Monte carlo play

2008-11-15 Thread Nick Wedd
In message
[EMAIL PROTECTED],
[EMAIL PROTECTED] writes

[Hawaiian shirt analogy snipped]

I hope you don't feel offended. Indeed you took up a wonderful
endeavour, but I sense that you're not quite ready to go for the summit
today.

But Tony never expressed any interest in going for the summit.  Maybe
he just wants to write a simple MC implementation.  Here is what he
wrote:

I am new to programming go, could someone explain to me how a Monte
Carlo based evaluation manages to play random games by itself?
ie: who/what is the opponent which supplies the opposing moves which
allows another move to be randomly played after making the initial move
at the root.

I am implementing in java, is there a package/framework which allows me
to train my software.

I will try to answer these questions myself.

 who/what is the opponent which supplies the opposing moves

It is the same piece of your program that supplies your own moves.  For
each random game or "rollout", your program places both black and
white stones, alternately, at random.
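Nick's point can be sketched in a few lines of Java (Tony's language). This is not a real go engine: the "board" below is just a shuffled list of points, with no captures or ko, purely to show that one piece of code supplies the moves for both colors.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// The SAME code supplies moves for both colors in a random rollout.
// The "board" here is only a shuffled list of empty points -- no captures,
// no ko, no legality beyond emptiness -- this shows the alternation
// structure of a rollout, nothing more.
public class RandomRollout {
    public static void main(String[] args) {
        Random rng = new Random(42);            // one generator for both players
        List<Integer> empties = new ArrayList<>();
        for (int p = 0; p < 81; p++) empties.add(p);  // 9x9 points, numbered 0..80
        Collections.shuffle(empties, rng);      // a random move order

        boolean blackToMove = true;
        for (int point : empties) {
            // identical move-selection code; only the color label alternates
            System.out.println((blackToMove ? "B " : "W ") + point);
            blackToMove = !blackToMove;
        }
    }
}
```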

is there a package/framework which allows me
to train my software.

Nothing trains your software.  This isn't a neural net.  The way it
works is:
  you write a program
  A: it plays, badly
  you modify it so it plays differently, and with luck, better
  repeat from A

Nick
-- 
Nick Wedd    [EMAIL PROTECTED]


RE: [computer-go] Monte carlo play

2008-11-15 Thread dave.devos
You are absolutely right. Tony did not imply he wanted to go for the summit 
today. Also, Tony's qualifications are surely a lot better than a Hawaiian 
shirt and a surfboard. 
 
My analogy was an exaggeration, with the intention of explaining to Tony that he 
has more catching up to do than he might have expected. 
By no means was it my intention to put Tony down. I hope Tony understands this, 
and I sincerely hope that Tony won't give up on joining us in this wonderful 
endeavour, because we need all the help we can get.
 
Best Regards,
Dave



From: [EMAIL PROTECTED] on behalf of Nick Wedd
Sent: Sat 15-11-2008 15:48
To: computer-go
Subject: Re: [computer-go] Monte carlo play



[Nick's message of 15:48, quoted in full above, snipped]



RE: [computer-go] Monte carlo play

2008-11-15 Thread dave.devos
Just a quick start here, and I'm not one of the real experts, so it may contain 
inaccuracies. Perhaps you already know all this, perhaps not.
 
The basis of many game-playing programs (like chess programs) is the minimax 
(or negamax) algorithm.
It is a simple algorithm by itself, and you can find a lot of background and 
examples for chess and other games if you google it.
Basically, minimax explores all possible variations stemming from the current 
position and then selects the best move.
 
This algorithm works in principle, but there is a problem with it:
for interesting games like chess and go, it will usually take the program 
billions of years to find the best move in a given game position.
 
In the past decades several adaptations of the minimax algorithm were invented 
to enable it to find a reasonably good move in a timespan that humans are 
prepared to wait for (such as evaluation functions, alpha-beta pruning 
and transposition tables). The program keeps investigating possibilities for 
some time, and then the move seeming best so far is played. The longer the 
program is allowed to think (and the faster the computer), the better the move 
it will come up with.
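For concreteness, here is a minimal negamax with alpha-beta pruning over a tiny hand-built tree. This is a sketch of the bare skeleton only (real programs add evaluation functions and transposition tables on top of it); the node counter shows the pruning skipping part of the tree.

```java
import java.util.List;

// Minimal negamax with alpha-beta pruning over a tiny hand-built game tree.
public class Negamax {
    static int nodesVisited = 0;

    static int negamax(Node n, int alpha, int beta) {
        nodesVisited++;
        if (n.children == null) return n.value;          // leaf: static value
        int best = Integer.MIN_VALUE + 1;                // +1 so negation is safe
        for (Node child : n.children) {
            int score = -negamax(child, -beta, -alpha);  // flip sign and window
            if (score > best) best = score;
            if (best > alpha) alpha = best;
            if (alpha >= beta) break;                    // beta cutoff: prune rest
        }
        return best;
    }

    static class Node {                                  // leaf value OR children
        int value;
        List<Node> children;
        Node(int v) { value = v; }
        Node(List<Node> c) { children = c; }
    }

    public static void main(String[] args) {
        // depth-2 tree: root with three replies, each having three leaves
        Node root = new Node(List.of(
            new Node(List.of(new Node(3), new Node(12), new Node(8))),
            new Node(List.of(new Node(2), new Node(4), new Node(6))),
            new Node(List.of(new Node(14), new Node(5), new Node(2)))));
        System.out.println("best = " + negamax(root, -1000, 1000));
        // full tree has 13 nodes; pruning skips two leaves of the middle branch
        System.out.println("visited = " + nodesVisited);
    }
}
```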
 
These adaptations work by reducing the number of possibilities the program 
explores, without weakening the program too much. The program must ignore many 
possibilities to save time (this is called pruning), but the more possibilities 
it ignores, the higher the risk that it accidentally ignores the best move. 
Also, there is always the danger of the horizon effect: the program 
misinterprets the situation because it simply doesn't get enough time to fully 
investigate every possibility. 
 
The challenge for the programmer is to strike an optimal balance between time 
consumption and the risk of losing by accidentally ignoring or misinterpreting 
the best move. Some game programs are even able to fine-tune themselves to 
optimize this balance over time (learning).
 
A combination of these adaptations has proven to be good enough to enable 
chess programs on current hardware to beat any human with reasonable thinking 
time.
But computer-go is a harder problem than computer-chess. The adaptations that 
are successful in computer-chess are just not enough for computer-go, mainly 
because there are far more possibilities (and I mean FAR more) to explore in go. 
 
But now there is a recent adaptation that has proven to be fairly successful in 
computer-go: Monte Carlo playouts.
The program reduces the work to be done by gambling a number of times and 
measuring the average payoff. If the average payoff is good, the move is 
assumed to be good for the time being. Also, it doesn't automatically ignore 
moves with a low payoff (remember the risk of ignoring the best move). The 
program just postpones further investigation of seemingly bad moves, until the 
moves seeming good initially give more and more disappointing payoffs on 
further investigation. 
Although this invention has had only moderate success in computer-chess, it has 
proven to be fairly successful in computer-go.
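The "gambling and averaging" idea above can be sketched as follows. The random fill-in is replaced here by a coin flip with a hidden win probability per candidate move (a stand-in, not real go); the point is only the estimate-by-averaging structure.

```java
import java.util.Random;

// Estimate each candidate move's value by averaging many random payoffs.
// A Bernoulli coin flip stands in for a full random playout here.
public class AveragePayoff {
    public static void main(String[] args) {
        Random rng = new Random(7);
        double[] hiddenWinProb = {0.3, 0.5, 0.7};  // unknown to the "program"
        int playouts = 10000;
        int bestMove = -1;
        double bestAvg = -1;
        for (int move = 0; move < hiddenWinProb.length; move++) {
            int wins = 0;
            for (int i = 0; i < playouts; i++) {
                // one "fill-in": a random game whose binary result we record
                if (rng.nextDouble() < hiddenWinProb[move]) wins++;
            }
            double avg = (double) wins / playouts;
            System.out.println("move " + move + ": average payoff " + avg);
            if (avg > bestAvg) { bestAvg = avg; bestMove = move; }
        }
        System.out.println("best move = " + bestMove);
    }
}
```

With 10,000 playouts per move, the sampling noise is far smaller than the gaps between the hidden probabilities, so the averages reliably identify the best move.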
 
But at the heart of a Monte Carlo go program, you can still recognize the 
minimax algorithm.
 
Best Regards,
Dave
 
 


From: [EMAIL PROTECTED] on behalf of tony tang
Sent: Wed 12-11-2008 2:23
To: computer-go@computer-go.org
Subject: [computer-go] Monte carlo play


Greetings all go gurus,
 
I am new to programming go, could someone explain to me how a Monte Carlo 
based evaluation manages to play random games by itself?
ie: who/what is the opponent which supplies the opposing moves which allows 
another move to be randomly played after making the initial move at the root.
I am implementing in Java; is there a package/framework which allows me to 
train my software?
 
Thank you
 
Kind regards
 
Tony





[computer-go] FW: [computer-go] Monte carlo play?

2008-11-15 Thread dave.devos
From: tony tang [mailto:[EMAIL PROTECTED]
Sent: Sat 15-11-2008 21:22
To: [EMAIL PROTECTED]
Subject: RE: [computer-go] Monte carlo play?


Hello Dave,

Thank you for a thorough introduction to the theory, and I sincerely hope I am 
not wasting your time with amateur questions.
Because it was quite late when I sent the question, I think I didn't word it 
correctly and miscommunicated my question. 

Being a computer scientist but new to go, I can grasp some of the theory. The 
question I was trying to get across was:

In a game of self-play, if both parties are employing only Monte Carlo, surely 
it's not a good conceptual representation of a human, and if the reinforcement 
learning is based on random simulations, wouldn't it be very weak when playing 
a real human?
According to Nick Wedd, and some playing around of my own, I am clearly 
mistaken, because it works quite well.
Is it because they are training (not in the neural network sense) their Monte 
Carlo based engine against another engine with prior knowledge, or something 
of that nature?

I have seen papers implementing patterns and prior knowledge with Monte Carlo 
(dated 2006); has that become a standard now, and when people refer to Monte 
Carlo do they not mean absolutely random? 

Thank you

Kind regards

Tony





[computer-go] FW: [computer-go] Monte carlo play?

2008-11-15 Thread dave.devos
From: [EMAIL PROTECTED]
Sent: Sat 15-11-2008 22:44
To: tony tang
Subject: RE: [computer-go] Monte carlo play?


Hello Tony,
 
Ok, so you know about the minimax algorithm and the like. My impression was 
wrong and I'm very sorry for my analogy.
 
I'm no expert on Monte Carlo, so I can only say what I know. But most of the 
real experts have been rather quiet during the last week, so perhaps I can act 
as a stand-in.
 
The basis is indeed absolutely random play (alternating turns and removing 
captured stones as in normal play) until the game finishes. The only exceptions 
to random play are occupied intersections, simple ko recaptures, suicide moves 
(I'm not sure whether all programs prune suicide if the rules allow it) and 
moves that fill one-point eyes of the color who is to play (there may be slight 
differences in the exact definition of a one-point eye).
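Since the exact definition varies, here is one deliberately simplified version in Java: an empty point whose orthogonal neighbours are all friendly stones or board edges counts as a one-point eye. (Real programs usually also check diagonals to rule out false eyes, which this sketch omits.)

```java
// Simplified one-point-eye test for a playout move filter: an empty point
// whose orthogonal neighbours are all friendly stones or board edges.
// Random playouts skip such moves, so games terminate.
public class EyeCheck {
    static final int SIZE = 5;

    // board cells: '.' empty, 'B' black, 'W' white; indexed board[y][x]
    static boolean isOnePointEye(char[][] board, int x, int y, char color) {
        if (board[y][x] != '.') return false;           // must be empty
        int[][] deltas = {{1, 0}, {-1, 0}, {0, 1}, {0, -1}};
        for (int[] d : deltas) {
            int nx = x + d[0], ny = y + d[1];
            if (nx < 0 || ny < 0 || nx >= SIZE || ny >= SIZE) continue; // edge counts as friendly
            if (board[ny][nx] != color) return false;   // neighbour must be our stone
        }
        return true;
    }

    public static void main(String[] args) {
        char[][] board = {
            "B.B..".toCharArray(),   // (1,0) is walled in by B and the edge
            ".B...".toCharArray(),
            ".....".toCharArray(),
            ".....".toCharArray(),
            ".....".toCharArray(),
        };
        System.out.println(isOnePointEye(board, 1, 0, 'B'));  // eye for black
        System.out.println(isOnePointEye(board, 1, 0, 'W'));  // not for white
        System.out.println(isOnePointEye(board, 3, 3, 'B'));  // open neighbours
    }
}
```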
 
Without the one-eye pruning, random games would never finish, because both 
random players would keep on killing their own stones and refilling the empty 
space after the opponent has captured. 
Even with eye detection and simple ko detection, the game could occasionally end 
up in a cycle because of a triple ko or more pathological situations, so one 
should probably also limit the number of moves and discard the result if that 
limit is exceeded. This is probably a much cheaper solution to the cycle 
problem than more sophisticated detection. 
 
A purely random game is played out until both players are forced to pass. This 
is called a light playout. Claus Reinke recently dubbed the term random 
fill-in for this process 
(http://www.mail-archive.com/computer-go@computer-go.org/msg09749.html). I like 
this term better than light playout, but it doesn't seem to catch on here.
 
The board is then counted to determine the winner, and this binary result is 
added to a statistic for the move that is currently being evaluated. The 
accumulated statistic is effectively the evaluation value for the position. It 
is then backed up to the game-tree root in a minimax way.
 
In go it is hard to quickly evaluate a position statically (or even slowly, for 
that matter). So it is nice to have a dynamic evaluation that becomes more 
accurate as you spend more CPU cycles on it. You can see that it's important 
that random fill-ins are fast, but it's still a rather large investment to do a 
lot of playouts when you compare this with a static evaluation function. So it 
is even more important to avoid wasting effort in evaluating positions by 
random fill-ins. The trick is to use crude statistics (few fill-ins) to decide 
which statistic to refine by investing more fill-ins. This is where the UCT 
value comes in (http://senseis.xmp.net/?UCT). It is derived from the statistic, 
and it decides which position to investigate further, preferring good 
statistics over bad ones, but also preferring crude statistics over accurate 
ones.
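The trade-off just described is usually written as the UCB1 formula behind UCT: winRate + C * sqrt(ln(parentVisits) / childVisits). A sketch (the constant 1.4 is just an illustrative choice; in practice it is tuned per program):

```java
// UCB1, the selection value behind UCT: the first term prefers good
// statistics, the second prefers crude (little-visited) ones.
public class Uct {
    static double ucb1(int wins, int visits, int parentVisits, double c) {
        if (visits == 0) return Double.POSITIVE_INFINITY;  // unvisited: try first
        return (double) wins / visits
             + c * Math.sqrt(Math.log(parentVisits) / visits);
    }

    public static void main(String[] args) {
        int parent = 100;
        double c = 1.4;   // exploration constant, illustrative only
        // a barely-explored mediocre move (2/5) outranks a well-explored
        // good one (60/90): crude statistics get refined first
        System.out.println(ucb1(2, 5, parent, c) > ucb1(60, 90, parent, c));
        System.out.println(ucb1(0, 0, parent, c));  // unvisited dominates everything
    }
}
```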
 
There is no training involved, and there is no a priori knowledge involved here 
except for the rules and eye detection. The program has absolutely no idea 
about go concepts.
 
Would you call this process learning?
 
Programs using this algorithm are called pure MC programs, and although they 
get better with more CPU cycles, they aren't very strong with normal time 
limits on current hardware (or hardware in the near future). You are right: by 
itself, pure MC is just not effective enough.
 
So programmers started to build in more go knowledge to improve on this. This 
makes the fill-in process slower, so these are called heavy playouts. And this 
is also where my knowledge ends. My guess is that instead of selecting moves in 
a uniform random way, moves are weighted by a priori go knowledge built in by 
the programmer, but I know no details and I don't know if there have been 
publications on this. Perhaps there aren't any, because it is also a 
competitive field, and commercial interests might be involved too.
 
Other contributors can probably tell you more about this.
 
Best Regards,
Dave 
 



[Tony's message of 21:22, quoted in full above, snipped]

[computer-go] FW: [computer-go] Monte carlo play?

2008-11-15 Thread dave.devos
From: tony tang [mailto:[EMAIL PROTECTED]
Sent: Sat 15-11-2008 23:08
To: [EMAIL PROTECTED]
Subject: RE: [computer-go] Monte carlo play?


Thanks Dave, that was incredibly helpful; hopefully this new hobby of mine will 
materialise into a decent project.
 
All the best
 
Tony
 




Subject: RE: computer-go] Monte carlo play?
Date: Sat, 15 Nov 2008 22:44:51 +0100
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]



[Dave's reply of 22:44, quoted in full above, snipped]

Re: [computer-go] FW: [computer-go] Monte carlo play?

2008-11-15 Thread Darren Cook
 In a game of self-play, if both parties are employing only monte
 carlo, ... random simulations... wouldn't it be very weak...
 ... and some playing around I am clearly mistaken because it works
 quite well.

Yes, it doesn't make sense but it does indeed seem to work :-).

 I have seen papers implementing patterns and prior knowledge with
 monte carlo (dated 2006) has that become a standard now, and when
 people refer to monte carlo they don't mean absolutely random?

Just using monte carlo playouts is still weak.
Monte carlo + UCT is strong. (UCT is a specific algorithm, so people are
now referring to the set of similar algorithms as MCTS for Monte Carlo
Tree Search.)

No one is using purely random playouts in any top program, but David
Fotland described Many Faces' playouts as fairly light. I think Crazy
Stone (as described in detail in one of Rémi Coulom's papers) or
Valkyria perhaps have the heaviest playouts, and Mogo is somewhere between?

With too much knowledge in the Monte Carlo playouts you use too many CPU
cycles and cannot run enough of them. (And the real situation is even
more complex than that, as sometimes adding knowledge that seems really
fundamental to human players does not seem to help, even before
allowing for the extra CPU cycles used.)

Darren


-- 
Darren Cook, Software Researcher/Developer
http://dcook.org/mlsn/ (English-Japanese-German-Chinese-Arabic
open source dictionary/semantic network)
http://dcook.org/work/ (About me and my work)
http://dcook.org/blogs.html (My blogs and articles)