Re: [computer-go] State of the art of pattern matching

2008-04-02 Thread Moi de Quoi
On Tue, 2008-04-01 at 15:18 -0400, Joshua Shriver wrote:
 Do you have a link to those papers?

There is still one listed on the computer go bibliography:
http://www.cs.ualberta.ca/~games/go/compgo_biblio/

The links don't seem to work, so I set up a copy here:

http://nngs.ziar.net/cgml/gotxt/kojima.ps

HTH,
AvK

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] State of the art of pattern matching

2008-04-02 Thread Jacques Basaldúa

Jonas Kahn wrote:

 I guess you have checked that, with your rules for getting probability
 distributions out of gammas, the mean of the probability of your move 1
 was the one that you observed (about 40%)?

If I understand your post, there may be a misunderstanding, and it is my
fault. Here gamma is not a gamma function nor a gamma distribution but a
constant. It is just the same Greek letter. I don't remember whether it was
in Crazystone's description or in some other paper I read to understand
what Bradley-Terry models are; I just got used to that notation.

The best thing about BT models (for me) is the extreme simplicity at runtime
(calibration may involve more black magic):

  prob of move i = gamma[i] / SUM_j gamma[j]

where gamma[·] is a constant each pattern has. Setting those constants is
the learning process.
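For concreteness, the runtime step can be sketched in a few lines of Python. The gamma values and move names below are made up for illustration; only the formula is from the post:

```python
# Bradley-Terry move probabilities: each candidate move carries the
# gamma constant of its pattern; probabilities are gammas normalized
# over the legal moves. (Illustrative values, not learned ones.)
gammas = {"A1": 2.0, "B2": 6.0, "C3": 2.0}

total = sum(gammas.values())
probs = {move: g / total for move, g in gammas.items()}

print(probs["B2"])  # 6.0 / 10.0 = 0.6
```
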

The 40% is obtained between moves 20 and 119 of over 55K games. That is
more than 5 M competitions. The patterns are learned for other move numbers
as well. It considers the urgency modified by the first time the pattern
appears.

Also ko, but ko is learned to (hopefully) find ko threats; its impact on
guessing is less important. It counts the number of times the first move
was the right move, then the first + second, etc.

The reason why two different ways of measuring the same thing:

a. Probability expected for each move.
b. Number of guesses of the best move, 2nd best, 3rd best, etc.

don't match is mainly academic, because from a practical point of view the
important thing is to have a good move generator and (even more important)
to understand its limitations. It cannot be considered the only source of
moves, but the first moves in terms of urgency are moves that deserve
attention.

I admit that what I call a probability distribution over the legal moves
is not really balanced. I don't understand why, but, nevertheless, in terms
of urgency, better moves get higher urgency. This is all I really need.
Of course, I would welcome an explanation of why the two things don't
match, or if someone else can verify whether his patterns give correct
values.


Jacques.


Re: [computer-go] State of the art of pattern matching

2008-04-02 Thread Jonas Kahn
On Wed, Apr 02, 2008 at 02:13:45PM +0100, Jacques Basaldúa wrote:
 Jonas Kahn wrote:

  I guess you have checked that, with your rules for getting probability
  distributions out of gammas, the mean of the probability of your move 1
  was the one that you observed (about 40%)?

 If I understand your post, there may be a misunderstanding, and it is my
 fault. Here gamma is not a gamma function nor a gamma distribution but a
 constant. It is just the same Greek letter. I don't remember whether it
 was in Crazystone's description or in some other paper I read to
 understand what Bradley-Terry models are; I just got used to that
 notation.

Sorry, you were clear; it was just me not remembering BT correctly. I
thought of a transform P = Lambda(gamma_2 - gamma_1) in the end, like for
Elo estimation.

The heavy tails could still very well come from inadequacy of the
formula P(i) = gamma(i) / sum_j gamma(j) for the smaller probabilities.

The most natural explanation for your underestimating the probability
that the first move is the right one would come from correlations
between patterns.

A short toy model where only the first move can have one or two
patterns, with same individual value, but with varying combined value,
suggests that correlations between patterns should be positive to
explain your result.

I am a bit surprised, since I would have thought that correlations
between patterns would rather be negative...

Maybe you can test that in the following way: if you have say 500
patterns that have a much greater value than others, you could add to
your learning an entry 'there is both pattern 1 and pattern 2' for all
pairs of patterns among those 500, and see if you get better predictions
in the end. 
Of course, if too many patterns are really relevant, this is hard to do.  
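A minimal sketch of that suggestion, assuming features are labelled by strings and that a small set of high-gamma patterns has already been identified (all names below are invented for illustration):

```python
from itertools import combinations

# Hypothetical set of high-gamma patterns singled out for pairing.
TOP_PATTERNS = {"p1", "p2", "p3"}

def augment(features):
    """Add a joint 'both patterns present' entry for each pair of
    top patterns that occur together in this move's feature list."""
    present = sorted(TOP_PATTERNS & set(features))
    pairs = ["+".join(p) for p in combinations(present, 2)]
    return list(features) + pairs

print(augment(["p1", "p3", "other"]))  # ['p1', 'p3', 'other', 'p1+p3']
```

Each joint entry then gets its own gamma during learning, letting the model express the combined value of a pair beyond the product of its parts.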

Anyhow, as you said, what's important is having good suggestions for
urgent moves, not this kind of discrepancy.

Jonas


Re: [computer-go] State of the art of pattern matching

2008-04-01 Thread Mark Boon


On 31-mrt-08, at 20:28, Don Dailey wrote:


You could be
blind-siding the program.


I think this is the crux of the matter. Not just in MC but in Go
programming in general. If you add 'strong' knowledge you can create
blind spots. For example, I guess a ko rarely gets played during
playouts because strong priority is given to connecting the stone in
atari. It's only during the actual search that ko is discovered. But
if finding the blind spot of a piece of knowledge is almost always out
of the scope of the search tree, that can lead to very poor play.


Mark



Re: [computer-go] State of the art of pattern matching

2008-04-01 Thread Jacques Basaldúa

Hi Lukasz

In Rémi's paper about Bradley-Terry models he found a way to give a
comparable gamma score to things that are different, for instance:
capturing a stone vs. a given 3x3 pattern.
His model is much more general, but has fewer patterns (nowhere near
the ~200K patterns of my system). Additionally, my mask-pattern-only
system has a reasonable a priori value:

times_played / (times_played + times_postponed).

But there is an additional issue handled by Rémi's system that is
ignored by the naive system: the fact that some competitions are
much harder than others. If the sum of concurrent scores in the
average competition is, say, 100, the value of winning a competition
of sum 40 is obviously much less than winning one of sum 300. That
is not included in the simple formula. Therefore, a simple intuitive
idea is adding more for the hard competition (300/100) and less for
the easy one (40/100). Because there is feedback (when you change
the scores you also change the sums of concurrent scores), there are
many variables to adjust (~200K), and the a priori values are
already reasonable, I prefer to adjust in small steps, since
this is an offline process and speed is irrelevant.
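The weighting idea above can be sketched as follows; the average sum of 100 and the 300/40 examples come from the paragraph, while the function name is illustrative:

```python
# Weight a competition by its hardness: the sum of the competing
# gamma values relative to the average competition's sum (here 100,
# as in the example above).
AVG_SUM = 100.0

def competition_weight(gamma_sum):
    """A win in a harder-than-average competition counts for more."""
    return gamma_sum / AVG_SUM

print(competition_weight(300.0))  # 3.0  (hard competition)
print(competition_weight(40.0))   # 0.4  (easy competition)
```
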

The hits of each move don't change that much.

At the beginning of the adjustment:

Hits of move 1 = 0.382405 acc = 0.382405
Hits of move 2 = 0.113530 acc = 0.495935
Hits of move 3 = 0.067430 acc = 0.563365
Hits of move 4 = 0.048055 acc = 0.611420
Hits of move 5 = 0.036304 acc = 0.647725
Hits of move 6 = 0.028978 acc = 0.676703
Hits of move 7 = 0.023769 acc = 0.700472
Hits of move 8 = 0.019793 acc = 0.720264
Hits of move 9 = 0.016693 acc = 0.736957
Hits of move 10 = 0.014164 acc = 0.751121


At the end of the adjustment: (Additional steps don't
improve things further; some values may even decrease.)

Hits of move 1 = 0.409278 acc = 0.409278
Hits of move 2 = 0.115598 acc = 0.524877
Hits of move 3 = 0.062659 acc = 0.587536
Hits of move 4 = 0.044014 acc = 0.631550
Hits of move 5 = 0.033681 acc = 0.665230
Hits of move 6 = 0.027696 acc = 0.692926
Hits of move 7 = 0.022885 acc = 0.715811
Hits of move 8 = 0.019029 acc = 0.734840
Hits of move 9 = 0.015758 acc = 0.750598
Hits of move 10 = 0.012886 acc = 0.763484

What I call the distribution is this: the gamma value of each legal
move defines a probability distribution function over the legal moves.
If we had no information (uniform random distribution) we would expect
that with 20% of the legal moves we hit 20% of the time, etc. So
for 0.1, 0.2 .. 0.9 of the legal moves we hit around 0.1, 0.2 .. 0.9.
Now, what happens when we use our pdf over the legal moves? We can
get 10%, 20%, 30%, etc. with very few move candidates. The boundary
move has to be counted fractionally: if the first move has, say, 0.17,
the first two sum to 0.21, and I want to get 0.2, then if the first is
a match count 1, if the second is a match count
3/4 = (0.2 - 0.17)/(0.21 - 0.17), and if any other move is a match
count 0.
Do I really get 0.1, 0.2, .. 0.9 (with far fewer moves, of course)?

No. For a reason I don't understand, I get something like:

Distribution fit expected 0.1 found 0.153164
Distribution fit expected 0.2 found 0.298602
Distribution fit expected 0.3 found 0.433074
Distribution fit expected 0.4 found 0.551575
Distribution fit expected 0.5 found 0.651135
Distribution fit expected 0.6 found 0.727419
Distribution fit expected 0.7 found 0.776876
Distribution fit expected 0.8 found 0.804008
Distribution fit expected 0.9 found 0.823292

So my distribution is distorted: when I try to get 30% of
the guessing chance I get 43.3%, but when I try to get
90% I only get 82.3%. I don't know why.
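The fractional-hit bookkeeping described above can be sketched in Python. The function name is invented; the 0.17/0.21 example is the one from the post:

```python
def fit_score(probs_sorted, played_index, target):
    """Credit for a played move when covering `target` probability
    mass with the highest-gamma moves: 1 if the move lies fully
    inside the mass, partial credit for the boundary move, 0 outside."""
    acc = 0.0
    for i, p in enumerate(probs_sorted):
        prev = acc
        acc += p
        if i == played_index:
            if acc <= target:
                return 1.0                   # fully inside the target mass
            if prev < target:
                return (target - prev) / p   # boundary move: fractional count
            return 0.0                       # outside the target mass
    return 0.0

# Example from the post: first move 0.17, first two 0.21, target 0.2
print(fit_score([0.17, 0.04], 1, 0.2))  # (0.2 - 0.17) / 0.04 ~ 0.75
```

Averaging this score over many positions, for each target 0.1 .. 0.9, reproduces the "distribution fit" table above.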


Jacques.

Łukasz Lew wrote:


On Sat, Mar 29, 2008 at 3:47 PM, Jacques Basaldúa [EMAIL PROTECTED] wrote:
 


Mark Boon wrote:

 I suppose there's no reason why  it has to be assembler,
 you could just as well generate C-code.

I don't think so. But I don't write as much asm as it may appear.
I see what the compiler emits when I debug, and I hand-write critical
parts only, plus automatically generated code.


 I don't see yet how you go from there to finding patterns.

Depending on the mode, it computes either the hash or the mask.
That mask has to be translated into urgency by a table search.

(Btw. urgency is not just:

  times_played / (times_played + times_postponed)

but that value is used as an a priori value for Bradley-Terry
models as Rémi introduced them. My adjustment is not as advanced
as what Rémi describes. I just give a weight to each competition
based on the sum of the competing gamma values, and that gives
me another point:

SUM weights_of_played / (SUM weights_of_played + SUM weights_of_postponed)



Can you tell how you came up with such an equation?
Does it improve much?

Thanks
Lukasz

 


And I advance a step (like 1/4 of the distance) in that direction
and use the new point as the current value. Iterating this way
improves the wins of the best move but distorts the distribution
(I extract lots of statistics at each step) so I stop rather early.)
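That small-step iteration can be sketched as follows; the 1/4 step size is the one mentioned above, while the function and values are illustrative:

```python
# Offline adjustment: instead of jumping to the newly computed point,
# move each gamma a fixed fraction (1/4) of the distance toward it,
# then recompute, since changing one gamma changes the competition sums.
STEP = 0.25

def adjust(gamma, target):
    """One small step from the current gamma toward the new estimate."""
    return gamma + STEP * (target - gamma)

g = 1.0
g = adjust(g, 2.0)
print(g)  # 1.25
```

Stopping early, as the post says, avoids over-fitting the best move's hit rate at the cost of distorting the rest of the distribution.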

The macro for searching in an ordered list 

Re: [computer-go] State of the art of pattern matching

2008-04-01 Thread Petr Baudis
On Mon, Mar 31, 2008 at 03:12:39PM -0700, Christoph Birk wrote:

 On Mar 31, 2008, at 1:05 PM, Don Dailey wrote:


 Christoph Birk wrote:

 On Mar 31, 2008, at 10:48 AM, Mark Boon wrote:
 I don't know about this. I'm pretty sure MoGo checks if the stone can
  make at least two liberties (ladder problem) in which case it can
  still be horrible but very seldom worse than random.

 I would expect playing a not-working ladder to be worse than random
 most
 of the time.
 Of course this is true, but presumably a move that answers a direct
 atari threat would classify as being better than random.

 Not if it's a working ladder.

I think this is not obvious, since there is still about 50% chance on
who gets the second move there. The original MoGo pattern describes only
capture-possibility check, not atari-possibility check, so even if you
play out the ladder, the opponent will most likely tenuki. So I think
not playing out working ladders as white is actually better because it
saves _black_ from getting a bad result!

I implemented a naive ladder check (which does not catch a lot of cases,
especially when you have to decide whether to slide the stones in the
ladder along the border or along some of your friendly stones) and tested
its efficiency in UCT-driven playouts with only a 50%-probability
capture-possibility check heuristic.

Against GNUGo Level8, I get 37% +-5.7% wins with ladder check, 28%
+-4.6% without ladder check. Unfortunately, the numbers are not very
precise and the difference is probably much smaller in reality - 37%
seems overinflated given later measurements; I will redo the test with
newer codebase and more games.

(I plan to post detailed measurements of effects of various heuristics
and parameters anyway; so far it seems my code is still too buggy
though: AMAF, prior knowledge and FPU all make my program weaker :-(.)

-- 
Petr Pasky Baudis
Whatever you can do, or dream you can, begin it.
Boldness has genius, power, and magic in it.-- J. W. von Goethe


Re: [computer-go] State of the art of pattern matching

2008-04-01 Thread Jonas Kahn
Hi Jacques


 No. For a reason I don't understand, I get something like:

 Distribution fit expected 0.1 found 0.153164
 Distribution fit expected 0.2 found 0.298602
 Distribution fit expected 0.3 found 0.433074
 Distribution fit expected 0.4 found 0.551575
 Distribution fit expected 0.5 found 0.651135
 Distribution fit expected 0.6 found 0.727419
 Distribution fit expected 0.7 found 0.776876
 Distribution fit expected 0.8 found 0.804008
 Distribution fit expected 0.9 found 0.823292

 So my distribution is distorted: when I try to get 30% of
 the guessing chance I get 43.3%, but when I try to get
 90% I only get 82.3%. I don't know why.

I guess you have checked that, with your rules for getting probability
distributions out of gammas, the mean of the probability of your move 1
was the one that you observed (about 40%)? And the same for the following
ones?
For the tails (many moves), I guess that a proportion of available moves
is a better reference.

In fact, doing that for the following moves too would be a way to
calibrate your function (gamma_1, gamma_2) -> Prob(gamma_1);
all the logits and so on are somewhat arbitrary and can be wrong,
especially in the tails.

For a distribution fit of .1, I guess you merely always keep the first
move (except in very special cases). Then, if you had a homogeneous 40%
probability that the first choice is the right one, and you decide it's
half the time 30% and half the time 50%, you see that you get
(.1/.3 + .1/.5) / 2 > .1/.4.
Hence if your fluctuations in the estimation of the first choice are not
adapted, you will get a bigger fit.
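The toy computation checks out numerically:

```python
# With a fluctuating first-choice probability (30% half the time,
# 50% half the time) the average fit exceeds the homogeneous 40% case.
fluctuating = (0.1 / 0.3 + 0.1 / 0.5) / 2   # ~0.267
homogeneous = 0.1 / 0.4                     # 0.25

print(fluctuating > homogeneous)  # True
```

This is just convexity of x -> 1/x: averaging over a spread of first-choice probabilities inflates the measured fit.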

On the other hand, when you are not among the first moves, shape is not
a factor that really decides the move.
Suppose that for 50% of moves, shape counts and the right move is one of
the first 5 candidates, and that for 50% it does not count, and the good
move is chosen uniformly at random with respect to your patterns
criterion;
Then achieving .9 fit merely asks for always taking 80% of the possible
moves (including of course the 'good pattern' ones).
If you have a wrong estimation of your tail probabilities, you can easily
select far fewer moves.

Jonas


Re: [computer-go] State of the art of pattern matching

2008-04-01 Thread Joshua Shriver
Do you have a link to those papers?
-Josh



 My go-programming efforts are very much concentrated on patterns.
 (maybe I have been influenced by the Kojima-papers)

Re: [computer-go] State of the art of pattern matching

2008-03-31 Thread Don Dailey


Mark Boon wrote:
 I did an experiment that looks rather similar. I generated patterns
 and only kept the ones that had a minimum amount of 'urgency' and a
 minimum number of occurrences. But I noticed two things when using
 these patterns in a MC playout:

 1) There are many important moves missing. Apparently they were not
 picked up from the game-database even though the number of games is in
 the 10-thousands.


Are these fixed patterns or wildcard patterns?  I'm interested in
wildcard patterns too and how to automatically generate them.

A wildcard pattern is exactly the same as a decision tree (it can be
represented perfectly by a decision tree.) There are several methods
in AI that have similar function and performance to decision trees, and
that's why these methods are interesting to me. They are different
ways of looking at wildcard pattern generation.


 2) When I used these patterns during simulation the playouts suddenly
 look surprisingly like normal Go compared to random playouts. However,
 the program started to play worse. Much worse. Even when I let it
 compute as many playouts as a program without patterns.
This of course is now a well known phenomenon.


 The first observation made me wary to rely on harvested patterns. I
 think it shows at least it needs to be used in combination with
 hand-crafted patterns. Also it means that the fact you harvest a large
 number of patterns isn't necessarily meaningful.
It isn't necessarily the quality of the moves that is important, it
seems. All playouts, including uniformly random playouts, set up a
certain type of bias. Uniformly random playouts have the
characteristic that stupid errors are compensated by stupid errors by
the opponent, and so tend to weakly balance out and return a reasonable
evaluation.

Heavier playouts have been shown to be far superior, but just placing
stronger moves in the playouts does not seem to be the right
formula. My guess is that if you place arbitrary knowledge into the
playouts, you create terrible imbalances. Perhaps the secret (I'm
entering uncharted territory here) is that it's ok to create imbalance
if it's the type the tree is easily able to correct. This is my
working theory; let's call it "maximum information gain". The
early Mogo paper showed that it's good to look at defending opponents'
atari. Defending an atari tends to be either one of the better or one
of the worst moves on the board. It's a horrible move if it cannot be
defended. HOWEVER, what you can say is that Mogo's playout strategy
pushes the search right away in the direction of "finding out for
sure", so this appears to cause the playouts to create a hypothesis
that is either quickly refuted or quickly confirmed. I really think
you want to push towards positions that work out what is really going
on, even if you have to insert bias into the playouts to get this
(which you probably do.)


 The second observation I think may have been caused by not enough
 randomness. But that means I first have to find an answer to how much
 randomness I need to put into the patterns. I'm first looking at this
 question with some hand-crafted patterns to get a better handle on
 this issue.
Let us know. The whole issue is pretty interesting. I think
randomness is required only to counteract systematic biases, because
obviously if your playouts played perfectly there would be no need for
randomness. And yet better-than-random playouts can also lead to worse
play in general.


- Don




 Mark


 On 30-mrt-08, at 09:14, Jacques Basaldúa wrote:

 Mark Boon wrote:


 There are 16 4-distance points, so if you spill into that by one
  point you get 3^15, or a little over 14 million patterns. Multiplied
  by 3 for every don't-care point within less than 4 distance. Ouch.

 True, but the number of patterns is learned automatically. When I
 first learn from the 55K+ games, there are so many patterns that I
 can only create a pattern file from fewer than 2000 games. I create
 32 such files and call "importance" the number of files in which
 each pattern is found (a value from 1 to 32). The numbers of
 patterns are:

 # of imp = 32   97132  (97132)
 # of imp = 31   26493  (123625)
 # of imp = 30   21460  (145085)
 # of imp = 29   19335  (164420)
 # of imp = 28   18415  (182835)
 # of imp = 27   18703  (201538)
 # of imp = 26   18619  (220157)
 # of imp = 25   19345  (239502)
 # of imp = 24   20390  (259892)
 # of imp = 23   21611  (281503)
 # of imp = 22   22959  (304462)
 # of imp = 21   24675  (329137)
 # of imp = 20   26808  (355945)
 # of imp = 19   29081  (385026)
 # of imp = 18   31938  (416964)
 # of imp = 17   35319  (452283)
 # of imp = 16   39188  (491471)
 # of imp = 15   43899  (535370)
 # of imp = 14   50391  (585761)
 # of imp = 13   57259  (643020)
 # of imp 

Re: [computer-go] State of the art of pattern matching

2008-03-31 Thread terry mcintyre
I think Don Dailey makes a good point about creating a
line of search which can quickly prove or refute a
line of play.

I've been using life-and-death problems to improve my
own level of play, and sometimes vital move patterns
quickly lead to proper solutions - but sometimes they
fail, and one must look for unlikely patterns which
seldom or never show up in game records.

When I play Mogo and other MC programs, I find it
highly profitable to set up conditions for a life and
death problem which the programs can't resolve - yet!
I'm sure future generations will be more clever.

13x13 and 19x19 boards give a lot more scope for such
shenanigans; on a 9x9 board, it's harder to isolate
groups, and fewer playouts will pretty much exhaust
the important possibilities.




 

Terry McIntyre lt;[EMAIL PROTECTED]gt;

“Wherever is found what is called a paternal government, there is found state 
education. It has been discovered that the best way to insure implicit 
obedience is to commence tyranny in the nursery.”

Benjamin Disraeli, Speech in the House of Commons [June 15, 1874]




Re: [computer-go] State of the art of pattern matching

2008-03-31 Thread Mark Boon


On 31-mrt-08, at 12:36, Don Dailey wrote:


Are these fixed patterns or wildcard patterns?  I'm interested in
wildcard patterns too and how to automatically generate them.

A wildcard pattern is exactly the same as a decision tree (it can be
represented perfectly by a decision tree.) There are several methods
in AI that have similar function and performance to decision trees, and
that's why these methods are interesting to me. They are different
ways of looking at wildcard pattern generation.



These are wild-card patterns. And I do indeed represent them as a  
decision tree.




2) When I used these patterns during simulation the playouts suddenly
look surprisingly like normal Go compared to random playouts. However,
the program started to play worse. Much worse. Even when I let it
compute as many playouts as a program without patterns.

This of course is now a well known phenomenon.


Aha, well sometimes reinventing a few wheels is inevitable :-) It did  
surprise me.






The first observation made me wary to rely on harvested patterns. I
think it shows at least it needs to be used in combination with
hand-crafted patterns. Also it means that the fact you harvest a large
number of patterns isn't necessarily meaningful.

It isn't necessarily the quality of the moves that is important, it
seems. All playouts, including uniformly random playouts, set up a
certain type of bias. Uniformly random playouts have the
characteristic that stupid errors are compensated by stupid errors by
the opponent, and so tend to weakly balance out and return a reasonable
evaluation.


They balance, but tend to have a large deviation from the norm.  
'Better' play during playouts should, in theory, reduce the size of  
the deviation.




Heavier playouts have been shown to be far superior, but just placing
stronger moves in the playouts does not seem to be the right
formula. My guess is that if you place arbitrary knowledge into the
playouts, you create terrible imbalances. Perhaps the secret (I'm
entering uncharted territory here) is that it's ok to create imbalance
if it's the type the tree is easily able to correct. This is my
working theory; let's call it "maximum information gain". The
early Mogo paper showed that it's good to look at defending opponents'
atari. Defending an atari tends to be either one of the better or one
of the worst moves on the board. It's a horrible move if it cannot be
defended. HOWEVER, what you can say is that Mogo's playout strategy
pushes the search right away in the direction of "finding out for
sure", so this appears to cause the playouts to create a hypothesis
that is either quickly refuted or quickly confirmed. I really think
you want to push towards positions that work out what is really going
on, even if you have to insert bias into the playouts to get this
(which you probably do.)



I don't know about this. I'm pretty sure MoGo checks if the stone can
make at least two liberties (ladder problem), in which case it can
still be horrible but very seldom worse than random. It seems to me
that in principle whenever you have a 'pattern' that has a higher
success rate than a random move over a large number of games, then it
should always be preferred over the random choice. I think it's the
lack of randomness that's the problem. The large number of playouts
tends to converge to a mistake if it's deterministic. It doesn't even
have to be totally deterministic, I think.




The second observation I think may have been caused by not enough
randomness. But that means I first have to find an answer to how much
randomness I need to put into the patterns. I'm first looking at this
question with some hand-crafted patterns to get a better handle on
this issue.

Let us know. The whole issue is pretty interesting. I think
randomness is required only to counteract systematic biases, because
obviously if your playouts played perfectly there would be no need for
randomness. And yet better-than-random playouts can also lead to worse
play in general.


- Don






Re: [computer-go] State of the art of pattern matching

2008-03-31 Thread Don Dailey




 Heavier playouts have been shown to be far superior,  but just placing
 stronger moves in the playouts does not seem to be the right
 formula. My guess is that if you place arbitrary knowledge into the
 playouts, you create terrible imbalances.   Perhaps the secret (I'm
 entering uncharted territory here)  is that it's ok to create imbalance
 if it's the type the tree is easily able to correct.This is my
 working theory,  let's call it maximum information gain.  The
 early Mogo paper showed that it's good to look at defending opponents
 atari.Defending an atari tends to be either one of the better or one
 of the worst moves on the board.It's a horrible move if it cannot be
 defended.HOWEVER,  what you can say is that Mogo's playout strategy
 pushes the search right away in the direction of  finding out for
 sure   so this appears to cause the playouts to create a hypothesis
 that is either quickly refuted or quickly confirmed.I really think
 you want to push towards positions that work out what is really going
 on, even if you have to insert bias into the the playouts to get this
 (which you probably do.)


 I don't know about this. 
I don't know about it either - I'm still trying to figure it out.  
Let's say this is just a theory of mine for now.
 I'm pretty sure MoGo checks if the stone can make at least two
 liberties (ladder problem) in which case it can still be horrible but
 very seldom worse than random.
I don't know what Mogo does today, but a successful early version of
Mogo checked whether a random move defends. Defends in the temporary
sense that it either creates 2 liberties directly, as you say, or
captures one of the surrounding groups.

But it doesn't defend absolutely of course.   This means you might
continue to defend a group that is certain to die eventually via a
ladder.  The game is virtually over if you continue to defend a long
ladder and fail. 

About whether it's worse than random or not: I don't think that's
a good litmus test. You are sure to have terrible playouts if one
side plays better than random and the other side doesn't. As long
as one side, or one type of move, is emphasized without a corresponding
balance, the playouts are going to be heavily biased and consistently
give the wrong answer.

Just for illustration purposes, pretend that we find some really
wonderful heuristic that can find the very best move, but for some odd
reason it only works for WHITE. We could have a playout that plays
half of the moves perfectly, far better than random. The obvious
problem with such an algorithm is that it would almost always show white
winning the game! It would be virtually worthless in the playouts!

Evidently,   the way we add knowledge to playouts can produce a similar
type of effect,  putting some kind of bias into the playouts that makes
it return win/loss information that is not as useful as even random
playouts in some cases.  I can only guess that any knowledge that
makes the playouts play much better than random, and yet makes the total
program play worse, must have some kind of systematic bias that gives
some types of positions much better chances than they deserve.  


 It seems to me that in principle whenever you have a 'pattern' that
 has a higher success-rate than a random move over a large number of
 games, then it should always be preferred over the random choice. 
Intuition fails us in this case, I think. Your goal is not to find
the best move; it is to fairly decide who has the better
chances. It's probably no good if your pattern set just happens to
favor one side or the other in a specific position. I'm not sure
why YOUR patterns are failing, because I would assume an automated
procedure like yours would be fair and give overall even
coverage compared to using careful hand-crafted patterns (which would
surely be biased.)

I think Mogo used 8 or 12 patterns  that were carefully chosen to cover
really broad principles and not highly specific cases. 

 I think it's the lack of randomness that's the problem. The large
 number of playouts tend to converge to a mistake if it's
 deterministic. It doesn't even have to be totally deterministic I think.
I think you are very likely correct. The more patterns you have, the
more deterministic your program is. Your playouts should not always
win (for one side) in fairly even positions due to small changes in
the position. Perhaps this is a general principle that could be tested
somehow?




 The second observation I think may have been caused by not enough
 randomness. But that means I first have to find an answer to how much
 randomness I need to put into the patterns. I'm first looking at this
 question with some hand-crafted patterns to get a better handle on
 this issue.
There is at least one program I have heard of that tries to play moves
directly proportionate to their value somehow. In 

Re: [computer-go] State of the art of pattern matching

2008-03-31 Thread Christoph Birk


On Mar 31, 2008, at 10:48 AM, Mark Boon wrote:
I don't know about this. I'm pretty sure MoGo checks if the stone
can make at least two liberties (ladder problem) in which case it
can still be horrible but very seldom worse than random.


I would expect playing a not-working ladder to be worse than random
most of the time.

Christoph



Re: [computer-go] State of the art of pattern matching

2008-03-31 Thread Christoph Birk


On Mar 31, 2008, at 1:05 PM, Don Dailey wrote:



Christoph Birk wrote:


On Mar 31, 2008, at 10:48 AM, Mark Boon wrote:
I don't know about this. I'm pretty sure MoGo checks if the stone  
can

make at least two liberties (ladder problem) in which case it can
still be horrible but very seldomly worse than random.


I would expect playing a not-working ladder to be worse than random
most
of the time.

Of course this is true, but presumably a move that answers a direct
atari threat would classify as being better than random.


Not if it's a working ladder.

Christoph



Re: [computer-go] State of the art of pattern matching

2008-03-31 Thread Don Dailey


Christoph Birk wrote:

 On Mar 31, 2008, at 1:05 PM, Don Dailey wrote:


 Christoph Birk wrote:

 On Mar 31, 2008, at 10:48 AM, Mark Boon wrote:
 I don't know about this. I'm pretty sure MoGo checks if the stone can
 make at least two liberties (ladder problem) in which case it can
 still be horrible but very seldomly worse than random.

 I would expect playing a not-working ladder to be worse than random
 most
 of the time.
 Of course this is true, but presumably a move that answers a direct
 atari threat would classify as being better than random.

 Not if it's a working ladder.
That's what I'm saying Christoph.   When it's a ladder and the defender
cannot succeed then I responded Of course this is true.

In all other cases,  the odds are very strong that the move is much
better than random.

Something else not caught well by patterns and clever playout logic
is that winning probability gets confused with good local moves, so
too much of a good thing may be counter-productive.  You could be
blind-siding the program.  This could be more of an issue than we
think.

- Don



 Christoph



Re: [computer-go] State of the art of pattern matching

2008-03-30 Thread Jacques Basaldúa

Mark Boon wrote:


There are 16 4-distance points, so if you spill into that by one
 point you get 3^15, or a little over 14 million patterns. Multiplied
 by 3 for every don't-care point within less than 4 distance. Ouch.

True, but the number of patterns is learned automatically. When I
first learn from the 55K+ games, there are so many patterns that I
can only create a pattern file from fewer than 2000 games. I create
32 such files and call importance the number of files in which each
pattern is found (a value from 1 to 32). The numbers of patterns
are:

# of imp = 32   97132  (97132)
# of imp = 31   26493  (123625)
# of imp = 30   21460  (145085)
# of imp = 29   19335  (164420)
# of imp = 28   18415  (182835)
# of imp = 27   18703  (201538)
# of imp = 26   18619  (220157)
# of imp = 25   19345  (239502)
# of imp = 24   20390  (259892)
# of imp = 23   21611  (281503)
# of imp = 22   22959  (304462)
# of imp = 21   24675  (329137)
# of imp = 20   26808  (355945)
# of imp = 19   29081  (385026)
# of imp = 18   31938  (416964)
# of imp = 17   35319  (452283)
# of imp = 16   39188  (491471)
# of imp = 15   43899  (535370)
# of imp = 14   50391  (585761)
# of imp = 13   57259  (643020)
# of imp = 12   67062  (710082)
# of imp = 11   79013  (789095)
# of imp = 10   95292  (884387)
# of imp =  9  117109  (1001496)
# of imp =  8  147810  (1149306)

Depending on the threshold value used (and also on the number
of times the pattern is seen) I can create databases from about
100K patterns to 1M patterns. More than that means including
patterns that occur too seldom; their urgency information won't be
very accurate anyway due to the small sample size.
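If each pattern's shard membership were recorded as a 32-bit presence mask, one bit per pattern file, the importance value (1..32) would simply be the number of set bits. A sketch under that assumption; the mask representation is illustrative, not Jacques' actual code:

```c
/* Importance sketch: one bit per shard (pattern file) in which the
   pattern was found; importance is the popcount of the mask. */
int importance(unsigned shard_mask)
{
    int n = 0;
    while (shard_mask) {               /* Kernighan popcount: clears  */
        shard_mask &= shard_mask - 1;  /* the lowest set bit per step */
        n++;
    }
    return n;
}
```

A pattern present in every one of the 32 files gets importance 32; one seen in a single file gets 1.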

Jacques.



Re: [computer-go] State of the art of pattern matching

2008-03-29 Thread Jacques Basaldúa

Mark Boon wrote:

 I suppose there's no reason why  it has to be assembler,
 you could just as well generate C-code.

I don't think so. But I don't write as much asm as it may appear:
I see what the compiler writes when I debug, and hand-write only the
critical parts, plus automatically generated code.

 I don't see yet how you go from there to finding patterns.

Depending on the mode, it computes either the hash or the mask.
That mask has to be translated into urgency by a table search.

(Btw. urgency is not just:

   times_played / (times_played + times_postponed)

but that value is used as an a priori value for Bradley Terry
models as Rémi introduced. My adjustment is not as advanced
as what Rémi describes. I just give a weight to a competition
based on the sum of the competing gamma values, and that gives
me another point:

   SUM weights_of_played / (SUM weights_of_played + SUM weights_of_postponed)

And I advance a step (like 1/4 of the distance) in that direction
and use the new point as the current value. Iterating this way
improves the wins of the best move but distorts the distribution
(I extract lots of statistics at each step) so I stop rather early.)
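The Bradley-Terry runtime computation described above is simple enough to sketch in C. Assuming each candidate move carries one learned gamma constant, the probability of move i is gamma[i] divided by the sum of the competing gammas; this is an illustration, not Jacques' actual code:

```c
/* Bradley-Terry runtime sketch: prob(move i) = gamma[i] / SUM_j gamma[j].
   The gamma constants are set by the learning process; the values used
   in any example are invented for illustration. */
double bt_prob(const double *gamma, int n, int i)
{
    double sum = 0.0;
    for (int j = 0; j < n; j++)   /* normalising constant over all */
        sum += gamma[j];          /* competing candidate moves     */
    return gamma[i] / sum;
}
```

With gammas 8, 4, 2, 2 the first candidate gets probability 0.5, matching the intuition that doubling a gamma doubles a move's share.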

The macro for searching in an ordered list is an efficient
implementation of the canonical method.

Copy/paste from Numerical Recipes in C, ISBN 0-521-43108-5,
(c) Cambridge University Press 1988-1992

void locate(float xx[], unsigned long n, float x, unsigned long *j)
{
   unsigned long ju, jm, jl;
   int ascnd;
   jl = 0;                    // Initialize lower
   ju = n + 1;                // and upper limits.
   ascnd = (xx[n] >= xx[1]);
   while (ju - jl > 1) {      // If we are not yet done,
      jm = (ju + jl) >> 1;    // compute a midpoint,
      if ((x >= xx[jm]) == ascnd)
         jl = jm;             // and replace either the lower limit
      else
         ju = jm;             // or the upper limit, as appropriate.
   }                          // Repeat until the test condition is satisfied.
   if (x == xx[1]) *j = 1;    // Then set the output
   else if (x == xx[n]) *j = n - 1;
   else *j = jl;
}                             // and return.

/Copy/paste

Improvement 1: Instead of if/else, use conditional moves.

Improvement 2: Since the while loop has a fixed number of iterations
for a fixed table size when the value is not found, unroll the loop
into a sequence of iterations. As mentioned, an iteration has only
8 instructions.

Improvement 3: When the value is found, break.
(That wouldn't come for free in C; it would
require an additional if (x == xx[jm]).)

   mov     eax, ebx
   add     eax, ecx
   shr     eax, 1
   and     al,  0fch   ;; Align to a valid pointer
   cmp     edx, [eax]
   jz      name&out
   cmovc   ecx, eax    ;; This is true if [eax] < value
   cmovnc  ebx, eax    ;; This is true if the previous is false

Here ecx is a pointer to xx[jl], ebx is a pointer to xx[ju],
edx is x and eax is a pointer to xx[jm].
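For readers who prefer C, the combined effect of improvements 1 and 2 can be sketched as a fixed-count, branch-free search over a table of exactly 2^k keys (improvement 3's early exit is omitted; this is an illustration, not the actual generated code):

```c
/* Branch-free bisection sketch: for a table of exactly 2^k sorted keys
   the loop runs k times regardless of the data, and the conditional
   assignment typically compiles to a conditional move (cmov) rather
   than a branch.  Returns the index of the largest key <= x, assuming
   xx is ascending, k >= 1, and xx[0] <= x. */
unsigned locate_pow2(const unsigned *xx, unsigned k, unsigned x)
{
    unsigned lo = 0;
    for (unsigned step = 1u << (k - 1); step > 0; step >>= 1) {
        unsigned mid = lo + step;        /* candidate in upper half */
        lo = (xx[mid] <= x) ? mid : lo;  /* branchless select       */
    }
    return lo;
}
```

Doubling the table size adds exactly one loop step, matching the "one more iteration per doubling" observation.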

 I assume you allow some of the points to be undefined
 (also called 'don't care points') but I don't see how. And
 if you allow undefined points, why would you need masks
 of smaller manhatten distances?

I don't use don't care. I mask-code each point in 2 bits: void, own,
opponent, out_of_board. This produces a bigger database,
because the only way to introduce don't care is to repeat the
pattern. Since my patterns are learned automatically and database
size is not a problem at all, I didn't see the need for don't care.
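A sketch of what such a 2-bit-per-point coding could look like; the enum names and bit order are illustrative, not taken from the original code:

```c
/* 2-bit-per-point coding sketch: four states per point, so e.g. the
   12 points at Manhattan distance <= 2 pack into 24 bits. */
enum point_code { PT_VOID = 0, PT_OWN = 1, PT_OPPONENT = 2, PT_OFF_BOARD = 3 };

unsigned pack_points(const enum point_code *pts, int n)
{
    unsigned mask = 0;
    for (int i = 0; i < n; i++)
        mask |= (unsigned)pts[i] << (2 * i);  /* 2 bits per point */
    return mask;
}
```

Because own/opponent is encoded rather than black/white, color inversion needs no extra table entries, as noted later in the thread.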

The bigger patterns have the disadvantage that around move 150
about 2/3 of the legal moves have unknown patterns. The smaller
distances improve this. I don't know yet if that is required, because
patterns are mainly intended to break the ice; once complex shapes
emerge, the moves suggested by the search may be much better candidates.

Jacques.


Re: [computer-go] State of the art of pattern matching

2008-03-29 Thread Mark Boon


On 29-mrt-08, at 10:47, Jacques Basaldúa wrote:


I don't use don't care. I mask code in 2 bits: void, own,
opponent, out_of_board. This produces bigger database size,
because the only way to introduce don't care is to repeat the
pattern.


OK, so we were comparing apples and oranges. I know that using a
hash-code with fixed-size patterns is a whole lot faster. You don't
need to revert to assembler to make that really fast. If you see my
original message you'll see that I mentioned that. I was specifically
asking about a pattern-matcher that allows for don't-care points
because I think I need that.


It's possible that speed considerations will make it necessary to use  
this kind of fixed-size patterns. I'm not yet convinced of it, but I  
don't rule out the possibility either. But before turning to the dark  
side I'd prefer to first explore how far (and fast) I can get with a  
'traditional' pattern-matcher that allows for don't care points.


Mark




Re: [computer-go] State of the art of pattern matching

2008-03-29 Thread terry mcintyre

--- Mark Boon [EMAIL PROTECTED] wrote:

 
 On 29-mrt-08, at 10:47, Jacques Basaldúa wrote:
 
  I don't use don't care. I mask code in 2 bits:
 void, own,
  opponent, out_of_board. This produces bigger
 database size,
  because the only way to introduce don't care is
 to repeat the
  pattern.
 
 OK, so we were comparing apples and oranges. I know
 that using a hash- 
 code with fixed-size patterns is a whole lot faster.
 You don't need  
 to revert to assembler to make that really fast. If
 you see my  
 original message you'll see that I mentioned that. I
 was specifically  
 asking about a pattern-matcher that allows for
 don't-care points  
 because I think I need that.
 
 It's possible that speed considerations will make it
 necessary to use  
 this kind of fixed-size patterns. I'm not yet
 convinced of it, but I  
 don't rule out the possibility either. But before
 turning to the dark  
 side I'd prefer to first explore how far (and fast)
 I can get with a  
 'traditional' pattern-matcher that allows for don't
 care points.

Considering how inexpensive memory is, and how
branches cause processor pipelines to stall, it seems
to make sense to convert don't-care patterns into
however many fixed patterns would be equivalent. If
there are three don't-care elements, each of which could be
instantiated three ways, then 3^3 = 27 patterns would
replace one, for example. If these were converted into
optimized assembler, per Jacques' suggestion, the
pattern matcher would be extremely fast.
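As a rough illustration of that expansion, here is a sketch that instantiates every '?' (don't-care) point of a textual pattern with each of the three possible stone states, yielding 3^k fixed patterns for k don't-cares; the string encoding ('.', 'X', 'O', '?') and helper are hypothetical:

```c
#include <string.h>

/* Expand a pattern string in place: each '?' is replaced by every one
   of the three states, and every fully instantiated variant is copied
   into out[].  Returns the running count of emitted patterns. */
int expand(char *pat, int pos, char out[][16], int count)
{
    char *q = strchr(pat + pos, '?');
    if (!q) {                          /* fully instantiated: emit it */
        strcpy(out[count], pat);
        return count + 1;
    }
    const char states[3] = { '.', 'X', 'O' };
    for (int i = 0; i < 3; i++) {      /* substitute each state here  */
        *q = states[i];
        count = expand(pat, (int)(q - pat) + 1, out, count);
    }
    *q = '?';                          /* restore for the caller      */
    return count;
}
```

Two don't-cares yield 9 fixed patterns, three yield 27, exactly the blow-up Terry describes.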



Terry McIntyre <[EMAIL PROTECTED]>

“Wherever is found what is called a paternal government, there is found state 
education. It has been discovered that the best way to insure implicit 
obedience is to commence tyranny in the nursery.”

Benjamin Disraeli, Speech in the House of Commons [June 15, 1874]


  



Re: [computer-go] State of the art of pattern matching

2008-03-29 Thread Mark Boon


On 29-mrt-08, at 14:13, terry mcintyre wrote:


Considering how inexpensive memory is, and how
branches cause processor pipelines to stall, it seems
to make sense to convert don't care patterns into
however many fixed patterns would be equivalent. If
there are three don't care elements, which could be
instantiated three ways, then 3^3 patterns would
replace one, for example. If these were converted into
optimized assembler, per Jacques' suggestion, the
pattern matcher would be extremely fast.


Yes, of course.

Maybe I'd need to do an experiment at some point: compute it from the
pattern-set that I have and see how many fixed patterns that results
in. I'm a bit afraid though of the worst-case scenario where you
have a pattern that doesn't quite fit in the 3-Manhattan-distance
mask. There are 16 4-distance points, so if you spill into that by one
point you get 3^15, or a little over 14 million patterns. Multiplied
by 3 for every don't-care point within less than 4 distance. Ouch.


Mark


Re: [computer-go] State of the art of pattern matching

2008-03-28 Thread Jacques Basaldúa

Mark Boon wrote:

Sorry, without a bit more explanation, the assembler code is very  
hard to understand. What exactly does it do?


The first source code was just an example to see what kind of code
is generated. The second is useful; if you understand asm you should
understand it. The board has 4 modes as far as patterns are concerned,
so the following applies for each mode and for each possible mapping
scheme. When 40 neighbors are used (i.e. Manhattan distance 4) there
are 81 possible situations (all the cells of a 9x9 board are different
as far as board limits are concerned). On bigger boards, each cell maps
to one of the 81 possible 9x9 cells. The board system cannot play on
less than 9x9. And for each of these 4 x (maximum) 81 cases (some board
modes have smaller distance) the generator writes 2 functions: one for
creating the mask/hash from scratch and another to update all
masks/hashes in the neighborhood of a new stone.



Does it relate to a pattern? 


Yes the complete pattern is:

                +---+
                | 4 |
            +---+---+---+
            | 4 | 3 | 4 |
        +---+---+---+---+---+
        | 4 | 3 | 2 | 3 | 4 |
    +---+---+---+---+---+---+---+
    | 4 | 3 | 2 | 1 | 2 | 3 | 4 |
+---+---+---+---+---+---+---+---+---+
| 4 | 3 | 2 | 1 | · | 1 | 2 | 3 | 4 |
+---+---+---+---+---+---+---+---+---+
    | 4 | 3 | 2 | 1 | 2 | 3 | 4 |
    +---+---+---+---+---+---+---+
        | 4 | 3 | 2 | 3 | 4 |
        +---+---+---+---+---+
            | 4 | 3 | 4 |
            +---+---+---+
                | 4 |
                +---+

Justification of this shape:

1. It includes all standard nobi, iken tobi, niken tobi, kosumi,
keima & ogeima relations (+ double iken tobi + double kosumi).
2. It detects if a pattern is at the 4th line or in the 4x4 corner
(and below, obviously). Less than that is too small:
the 4th line is too different from the center.
3. It is a nested structure; {dist12, dist12 + dist3} are both usable.
4. It has reasonable size for the information it contains.
5. The bit arrangement is optimized for 90-degree rotation.

Additionally, I keep urgency information for: normal, the pattern
shows up for the first time and urgency in a ko.



Did you write a paper or post explaining this in more detail?


Not yet.



Do I understand correctly that you generate this code automatically?


Yes. It would be a nightmare to write about 70K lines by hand, and
debugging would be hard as well. Although what the functions do is
simple enough to create a test that verifies 100% of the cases.


You are talking about updating 40 hashes. Does it mean your patterns  
have fixed size? 


Yes. 3 sizes: Manhattan distance 2, 3 and 4


In the 500 nanoseconds, how many patterns do you compare? 


That was updating (max) 40 hashes in the neighborhood of a newly
placed stone. The precise number of hashes depends on the board
coordinate and the surrounding stones, although that is achieved
without a single conditional jump. (It is a very conservative
estimate for approx 140 instructions in a jump-free sequence.
The typical case is probably more like half of that.) But, as
mentioned, it includes neither the board logic nor the
search that translates hash -> urgency.


How about rotations and color inversions? 


In the slowest mode the pattern is rotated using macros like this
one (which rotates a 40-neighbor pattern)


   asm
   mov  edx, eax        // @jm
   mov  eax, [edx + 8]  // jm.mask4
   rol  eax, 8
   mov  [edx + 8], eax  // jm.mask4

   mov  eax, [edx + 4]  // jm.mask3
   shl  eax, 6
   mov  ecx, eax
   and  eax, 0ffh
   rol  ecx, 8
   or   al, cl
   mov  [edx + 4], eax  // jm.mask3

   mov  eax, [edx]      // jm.msk12
   mov  ecx, eax
   shl  eax, 4
   rol  cl, 2
   mov  al, cl
   mov  ecx, eax
   shr  ecx, 16
   and  eax, 0ffF0FFh
   and  ecx, 0F00h
   or   eax, ecx
   mov  [edx], eax      // jm.msk12
   end

and converted to canonical. Color inversion is automatic
because the pattern is own/opponent instead of black/white.

In the fastest mode, the hash table has x8 size and includes
the hashes of the rotated patterns.
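The "x8 table with rotated hashes" idea can be illustrated by canonicalising a pattern under the 8 board symmetries and keeping the lexicographically smallest form. This toy version works on a 3x3 pattern stored as a 9-character string and only sketches the principle; it is not the actual hashing scheme:

```c
#include <string.h>

/* Canonicalisation sketch: generate all 4 rotations and their mirrors
   of a 3x3 pattern (rows concatenated into a 9-char string) and keep
   the lexicographically smallest as the canonical key. */
void rot90(const char *in, char *out)       /* (r,c) -> (c, 2-r) */
{
    for (int r = 0; r < 3; r++)
        for (int c = 0; c < 3; c++)
            out[c * 3 + (2 - r)] = in[r * 3 + c];
}

void mirror3(const char *in, char *out)     /* flip columns */
{
    for (int r = 0; r < 3; r++)
        for (int c = 0; c < 3; c++)
            out[r * 3 + (2 - c)] = in[r * 3 + c];
}

void canonical(const char *pat, char *best) /* best must hold 10 chars */
{
    char cur[10], tmp[10];
    memcpy(cur, pat, 9); cur[9] = '\0';
    memcpy(best, cur, 10);
    for (int i = 0; i < 4; i++) {           /* 4 rotations x 2 mirrors */
        rot90(cur, tmp); tmp[9] = '\0';
        memcpy(cur, tmp, 10);
        if (strcmp(cur, best) < 0) memcpy(best, cur, 10);
        mirror3(cur, tmp); tmp[9] = '\0';
        if (strcmp(tmp, best) < 0) memcpy(best, tmp, 10);
    }
}
```

Two patterns that are rotations or reflections of each other map to the same key, so one table entry serves all eight variants.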


Jacques.





Re: [computer-go] State of the art of pattern matching

2008-03-28 Thread Jacques Basaldúa

terry mcintyre wrote:


It is possible to get some remarkably high correlation
between the moves played by pros and a predictor - yet
still not have a good program. Why?



We have a random variable, the place at which a player
plays, and some variables that we can compute. The
distribution of the random variable depends on these
variables. Our variables are shape urgency and life.


Of course, I agree that it would be great to include
life in our statistical model. It would be a much better
model. The problem is that we can compute shape in a few
clock cycles, but we cannot compute life/death. (The revised
Benson methods are of no use in real games because they
detect life/death when it is way too late.)


All the pattern logic is not intended to determine what
move has to be played, but only to select candidates for
later methods. Also, it is not about what a player did
once, but what is statistically sound over 10M+ positions.
We ignored life/death when learning and we ignore it
when we just suggest the best moves in terms of shape.
That is fair but, of course, the interaction between
shape and life would change things for the better.


UCT methods are (among other reasons) strong in 9x9
because a random move (not on the first 2 lines in the
beginning) is a reasonable move, and then the stochastic
parts of the search find good candidates (RAVE and similar).

But in real go, there are 361 legal moves for move 1!

And CrazyStone (as far as I know still the strongest 19x19
program) is still 2 kyu. That is the state of computer go
in 2008, not the remark made by a pro about how hard it was
to beat MoGo by 9 stones.

Most people on this list seem to be against human-learned
shape, but the truth is that we don't know if it is useful
or not. That depends on the price we pay for urgency. In
terms of RAM it is a couple of tens of MB; that's nothing
today, though it was a lot years ago. Some people assume that
you can predict what airplane society would be like without
building an airplane: just move people in a bus, the airplane
is the same, only faster. That's not how things work.
Unfortunately, you have to build an airplane to see what it
can or cannot do.

Some call it overengineering, but if it is not fast, then
I agree it will be useless almost for sure.


Jacques.



Re: [computer-go] State of the art of pattern matching

2008-03-28 Thread Mark Boon


On 28-mrt-08, at 09:43, Jacques Basaldúa wrote:


The first source code was just an example to see what kind of code
is generated. The second is useful, if you understand asm you  
should understand it.


Well, the only serious assembler I ever wrote was on a 6502 :-) And  
that was a very long time ago, so I'm sorry to say the asm is a bit  
too hard for me to see what it does. I suppose there's no reason why  
it has to be assembler, you could just as well generate C-code. Maybe  
C-compilers aren't good enough still compared to hand-crafted code in  
cases like these?


So although I start to understand some things, there's still a lot  
unclear to me. I think I see how you generate functions to  
efficiently update the hash-codes but I don't see yet how you go from  
there to finding patterns. I assume you allow some of the points to  
be undefined (also called 'don't care points') but I don't see how.  
And if you allow undefined points, why would you need masks of  
smaller manhatten distances?


Mark


Re: [computer-go] State of the art of pattern matching

2008-03-27 Thread Santiago Basaldúa

Mark Boon wrote:

What I have now takes 10-15 microseconds on a 2Ghz Intel computer to  
find all the patterns on a board (on average for 9x9, for 19x19 it's  
more like 15-20 usec) 



From your difference between 9x9 and 19x19 I assume that you are updating
the patterns of the cells after a stone is placed (otherwise the
difference should be a factor of 361/81), not computing from scratch.
I want to make sure of this so that we compare apples to apples. As far
as the pattern computation is concerned (i.e. excluding the board part
verifying captures, etc.), for a pattern of 40 neighbors I get times
easily below 500 nanoseconds (even on an old P4 3.4 GHz) for updating
40 hashes. I explained that about June last year, when I wrote it.
Since it passed my tests (June 07) I have never changed a character in
the code that writes the 67090 asm lines. It is just a black box that
works 100%. Each board cell has an updating function that knows where
the board limits are; the resulting code (for 32-bit hash mode) is an
endless sequence like:

  lea ebx, [edx + ecx*2 - SizeOf_bccCell]
  mov eax, 0967f6762h
  bt  dword ptr [ebx], bidx_StoneBit
  cmovc   eax, esi
  xor [ebx + 4], eax
  lea ebx, [edx + ecx*2]
  mov eax, 0d83b6518h
  bt  dword ptr [ebx], bidx_StoneBit
  cmovc   eax, esi
  xor [ebx + 4], eax
  lea ebx, [edx + ecx*2 + SizeOf_bccCell]
  mov eax, 0121001f7h
  bt  dword ptr [ebx], bidx_StoneBit
  cmovc   eax, esi
  xor [ebx + 4], eax

The longest of these functions has 206 lines, the typical one about 120.
There is not a single conditional jump, no vars in RAM, etc.
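In C, each generated update step amounts to a branchless "select one of two XOR deltas" per neighbour (the asm uses bt + cmovc for the select). A sketch of the idea; the struct, field names, and constants are invented, and the real generated code inlines one constant per cell:

```c
#include <stdint.h>

/* One update step per neighbour of a newly placed stone: XOR a
   precomputed delta into that neighbour's running hash, choosing
   between the two possible deltas without a branch. */
struct cell { uint32_t flags, hash; };
#define STONE_BIT 0x1u

void update_neighbor(struct cell *c, uint32_t empty_delta, uint32_t stone_delta)
{
    /* mask is all-ones iff the cell's stone bit is set */
    uint32_t mask = -(uint32_t)((c->flags & STONE_BIT) != 0);
    uint32_t delta = (empty_delta & ~mask) | (stone_delta & mask);
    c->hash ^= delta;  /* XOR keeps the update reversible */
}
```

Because the update is an XOR, applying the same delta again undoes it, which is what makes incremental maintenance cheap.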



It's written in Java so making it in C would possibly give a two-fold
speedup.


I hate the Java war but, when I said that it takes a 10-year handicap,
at least for patterns that is true without doubt. My code running on 1998
hardware (it is compatible) outperforms Java code for updating patterns.



Somehow smaller sets don't go much faster, but larger sets do slow down;
every ten-fold increase in the number of patterns seems to double the time.


What I wrote above updates the hashes (or the masks, because the board
has many modes) but does not search the hash to get the urgency. For
searching in a sorted list, I use the second best language after
automatically generated asm ...

... manually written asm, of course ;-)

;; One iteration:
;; --------------
CompStep MACRO name

    mov     eax, ebx
    add     eax, ecx
    shr     eax, 1
    and     al,  0fch   ;; Align to a valid pointer
    cmp     edx, [eax]
    jz      name&out
    cmovc   ecx, eax    ;; This is true if [eax] < value
    cmovnc  ebx, eax    ;; This is true if the previous is false

ENDM

Also (almost) without conditional jumps. The only conditional jump is
taken once, when the hash is found, and exits. 10 steps of that can
search a 1024-entry list, 20 steps a 1024^2-entry list. Doubling the
table size just adds one step (8 instructions, say 64 clock cycles).

I hope this helps.

Jacques.



Re: [computer-go] State of the art of pattern matching (Oops)

2008-03-27 Thread Jacques Basaldúa

Santiago wrote:

... Oops, wrong name ...

(Santiago is my official name, because I was born in a dictatorship that 
did not allow foreign names. After that, I was too lazy to change it.

Jacques, like my French grandfather, is my real name.)


Jacques.




Re: [computer-go] State of the art of pattern matching

2008-03-27 Thread Don Dailey


Mark Boon wrote:
 Thanks for the pointer Don, that might be worth looking into at some
 point.

 When you say 'amazing accuracy despite this speed and simplicity' I
 still have to ask: how fast?. I think 10usec is actually pretty
 fast. But when talking about using it during MC playouts for example,
 it's still rather slow. This is due to the pile-up of numbers. A
 (relatively) big area to look for, and a large number of occasions to
 do the lookup.

 The patterns don't necessarily need to be hand-crafted. But throwing a
 million generated patterns at it also has its price in terms of
 performance.
Pattern lookup has nothing to do with the number of patterns it was
trained on, of course.  Think of this as a kind of neural net.  The
data produced is compact, and is a function of the number of
attributes and the number of classes (this is a classifier), not the
number of patterns used for training.

Therefore 10 million training patterns would take the same storage as 1
million, or even just one.  I generate 16 versions of each pattern for
each possible color and orientation, and the training time, I'm
guessing, is dominated by the time it takes to read the training file
(a text file) into memory, parse the fields, and rotate the patterns.
Of course I don't really care about the training time itself, since this
is a one-time operation, but it's fast nevertheless, not like training a
neural net or running a genetic algorithm.  All it is, is parsing data
and collecting it into a data structure from which to calculate
probabilities.

If you are just interested in patterns generated the old-fashioned way,
hand-crafted without learning or generalization (other than wild
cards), then this has nothing to offer you.  The techniques for
doing this have been around for decades, you are an expert in them,
and there is nothing new in this.  David Fotland and many of the top
Go people have very fast pattern matchers that work well and are
super-optimized to do what they do very quickly.

I definitely agree with you that when it comes to pure state-of-the-art
pattern matching (the old-fashioned way), variable pattern shapes
with wild-card matching are very important.  This is all fine for a
programmer who is also an expert Go player, but for someone like
myself there is no other choice (other than taking a few years to become
a Go expert) but to explore this in a machine-learning context.

Naive Bayes is called naive for a reason.  It makes assumptions about
the independence of observations that are not true; however it has been
shown both empirically and formally that it can very often perform
surprisingly well despite these limitations, even in domains where
attributes are highly correlated.  In Go, pattern attributes are of
course highly correlated, probably to an absurd degree, but there
are many simple enhancements that correct the shortcomings while
maintaining most of the simplicity.

But it is a whole different thing from what you are doing. 

- Don



 Mark


 On 26-mrt-08, at 16:17, Don Dailey wrote:

 Mark,

 I am doing some experimentation with something similar to patterns,  but
 based  on Naive Bayes classifiers.   The idea is to use  Naive Bayes
 classifiers to generalize patterns. The classifiers would still be
 focused on some constrained area of the board, such as the 5x5 matrix or
 something,  but they would be extremely compact compared to patterns and
 very good at generalizing.  I'm convinced they would have to be enhanced
 with additional attributes concerning the points being considered,  but
 one of their strengths is that you can pile on huge numbers of
 attributes without killing the speed or memory consumption very
 significantly.

 Recently there has been a huge surge of interest in Naive Bayes
 Classifiers due to their simplicity and speed, and yet amazing accuracy
 despite this speed and simplicity.Nothing has been found that
 improves all that much on Naive Bayes for many applications,  and some
 simple improvements to Naive Bayes has put it in the same league as
 other far more complex and computationally expensive methods such as
 neural networks and decision trees.

 I have no idea whether I'm barking up the wrong tree - but I think it's
 definitely worth taking a look at. I don't think GO programs can be
 improved much more by simply adding more hand crafted patterns - we need
 to find ways to generalize knowledge in powerful ways.

 Naive Bayes is trained by example and it's trivial, basically a single
 pass statistics gathering.   So you must basically show it a gazillion
 sample patterns with known classifications.You could build these
 from games of strong players for instance.

 - Don




 Mark Boon wrote:
 Lately I have been putting some effort into pattern-matching. Although
 I have made progress, the result was not as good as what I had hoped
 for by about an order of magnitude. This makes me wonder what is
 

Re: [computer-go] State of the art of pattern matching

2008-03-27 Thread Mark Boon

Jacques,

Yes, I do an incremental update so the board-size should be almost
irrelevant. I'm not sure why I see any difference between 9x9 and
19x19, but it may have to do with the fact that the edge cuts a lot of
pattern search short.



On 27-mrt-08, at 10:13, Santiago Basaldúa wrote:


Mark Boon wrote:

What I have now takes 10-15 microseconds on a 2Ghz Intel computer  
to  find all the patterns on a board (on average for 9x9, for  
19x19 it's  more like 15-20 usec)


From your difference between 9x9 and 19x19 I assume that you are  
updating
the patterns of the cells after a stone was placed, (else the  
difference should be 361/81 times) not a computation from scratch.  
I make this sure so that we compare apples to apples. As far as the  
patterns computing is
concerned, (i.e. excluding the board part verifying captures, etc.)  
for
a pattern of 40 neighbors I get times easily below 500 nanoseconds  
(even on an old P4 3.4 GHz) for updating 40 hashes. I explained  
that about
June last year, when I wrote it. Since it passed my tests (June 07)  
I have never changed a character in the code that writes the 67090 asm
lines. It is just a black box, that works 100%. Each board cell has  
an updating function that knows where the board limits are, the  
resulting

code (for hash 32 bit mode) is something like an endless:

  lea ebx, [edx + ecx*2 - SizeOf_bccCell]
  mov eax, 0967f6762h
  bt  dword ptr [ebx], bidx_StoneBit
  cmovc   eax, esi
  xor [ebx + 4], eax
  lea ebx, [edx + ecx*2]
  mov eax, 0d83b6518h
  bt  dword ptr [ebx], bidx_StoneBit
  cmovc   eax, esi
  xor [ebx + 4], eax
  lea ebx, [edx + ecx*2 + SizeOf_bccCell]
  mov eax, 0121001f7h
  bt  dword ptr [ebx], bidx_StoneBit
  cmovc   eax, esi
  xor [ebx + 4], eax



Sorry, without a bit more explanation, the assembler code is very  
hard to understand. What exactly does it do? Does it relate to a  
pattern? Did you write a paper or post explaining this in more detail?


Do I understand correctly that you generate this code automatically?
I believe David Fotland does something similar. I have thought
about that but I wasn't sure if it was really much faster.


You are talking about updating 40 hashes. Does it mean your patterns  
have fixed size? In the 500 nanoseconds, how many patterns do you  
compare? One? A thousand? A million? How about rotations and color  
inversions? To make it really an apples to apples comparison I'd like  
to make sure we're talking about the same thing.


Mark



Re: [computer-go] State of the art of pattern matching

2008-03-27 Thread terry mcintyre
I've been thinking a bit about the collection of
patterns from games, whether of professionals or of
programs.

It is possible to get some remarkably high correlation
between the moves played by pros and a predictor - yet
still not have a good program. Why?

One possible answer is that many moves are considered
but never played; this information is not captured by
looking at game records alone. Ordinarily, both
players analyze parts of the game - life-and-death
situations, for example - and know exactly what
outcome to expect. For instance, the L group is dead
- therefore, one would not create such a provably dead
shape.

The game record will show the results of decisions
made by pros, but not the process of rejecting bad
shapes.

A tree search based only on game records is unlikely
to  have enough information to weed out situations
which are almost right - just a little bit dead.

Suppose a group can be defended - four liberties in a
row, for example. If the opponent plays inside those
four liberties, you play to divide the area into two
eyes - unless the situation is such that the group has
a second eye elsewhere. Game records won't show such
frivolous plays, but it is essential to know how to
respond to programs which do make such plays. 

It might be worthwhile for tree search to include
patterns generated by life-and-death solvers, which
determine the status of groups using moves that
seldom appear in game records but are essential for
gathering the tactical information about group
status on which top-level strategic decisions rest.

To summarize, the tree search needs to know about
patterns which are unlikely to ever be expressed in
the game record itself. 



Terry McIntyre lt;[EMAIL PROTECTED]gt;

“Wherever is found what is called a paternal government, there is found state 
education. It has been discovered that the best way to insure implicit 
obedience is to commence tyranny in the nursery.”

Benjamin Disraeli, Speech in the House of Commons [June 15, 1874]


  



Re: [computer-go] State of the art of pattern matching

2008-03-27 Thread Petr Baudis
On Thu, Mar 27, 2008 at 12:14:06PM -0700, terry mcintyre wrote:
 Suppose a group can be defended - four liberties in a
 row, for example. If the opponent plays inside those
 four liberties, you play to divide the area into two
 eyes - unless the situation is such that the group has
 a second eye elsewhere. Game records won't show such
 frivolous plays, but it is essential to know how to
 respond to programs which do make such plays. 

Such plays that almost work are often played as ko threats. I'm not
sure you get enough samples of these patterns, though; and the
opponent might often tenuki from the play to connect the ko, so I'm
not sure that helps you either.

I think this is one reason why people seem to often include some sample
of lower-level games during pattern learning.

-- 
Petr Pasky Baudis
Whatever you can do, or dream you can, begin it.
Boldness has genius, power, and magic in it.-- J. W. von Goethe


Re: [computer-go] State of the art of pattern matching

2008-03-27 Thread Don Dailey


terry mcintyre wrote:
 I've been thinking a bit about the collection of
 patterns from games, whether of professionals or of
 programs.

 It is possible to get some remarkably high correlation
 between the moves played by pros and a predictor - yet
 still not have a good program. Why?
   

In my opinion it's because playing go is more about the quality of your
weakest moves than about the average quality of your moves. You pretty
much have to play every move at a given level to actually be playing at
that level. Of course that's a simplification, but it gets the point
across. Some moves can be found by relatively weak players, and the
difference between good and great players lies in a small percentage of
the moves.


 One possible answer is that many moves are considered
 but never played; this information is not captured by
 looking at game records alone. Ordinarily, both
 players analyze parts of the game - life-and-death
 situations, for example - and know exactly what
 outcome to expect. For instance, the L group is dead
 - therefore, one would not create such a provably dead
 shape.

 The game record will show the results of decisions
 made by pros, but not the process of rejecting bad
 shapes.

 A tree search based only on game records is unlikely
 to  have enough information to weed out situations
 which are almost right - just a little bit dead.

 Suppose a group can be defended - four liberties in a
 row, for example. If the opponent plays inside those
 four liberties, you play to divide the area into two
 eyes - unless the situation is such that the group has
 a second eye elsewhere. Game records won't show such
 frivolous plays, but it is essential to know how to
 respond to programs which do make such plays. 

 It might be worthwhile for tree search to include
 patterns which have been generated by life-and-death
 solvers, determining the status of groups using moves
 which seldom appear in game records, but which are
 essential to gather tactical information about the
 status of groups, used to make top-level strategic
 decisions. 

 To summarize, the tree search needs to know about
 patterns which are unlikely to ever be expressed in
 the game record itself. 

   
I remember we did talk about this as a group. We came to the
conclusion that relevant patterns and positions rarely or never occur
on the actual playing board, but are just understood. In a trivial way
I discovered this when I was looking for king and pawn endings in a
huge grandmaster database: I found very few. And yet these are pretty
much fundamental to understanding the game of chess, and even influence
the openings you play and the goals of the game. It turns out that
Grandmasters usually know the outcome of any king and pawn ending that
is threatened over the board, and so they basically work around them.
And yet these endings are of fundamental importance to the outcome.

In some of my pattern learning experiments,  I discovered that only a
very small subset of possible patterns occur on the real board,  and yet
for a game tree searcher it would be pretty important to understand
those patterns that are constantly avoided in order to understand why
they are being avoided.

That's why I believe that patterns culled from games are pretty much
useless.That probably extends to most learning based on observing
games.  

Nevertheless, I got a pretty good improvement using patterns that
Lazarus never plays as a form of selectivity in Lazarus. So that is
evidence that refutes my idea to a certain extent. I figured that if
Lazarus would never (or almost never, given many opportunities) play to
create a certain pattern, that was reasonable evidence that it is not a
good move.


- Don






Re: [computer-go] State of the art of pattern matching

2008-03-27 Thread Don Dailey
Actually, a better way to put this:

King and pawn positions are generally understood so well by Grandmasters
that they know what the outcome of the game will be once such a position
appears on the board.

Therefore, at least one player is actively trying to avoid this ending,
because the game is essentially over once it appears! And yet you
can never be a good chess player without an intimate understanding of
this ending, which rarely occurs!

So how could we expect a machine learning algorithm to learn  this 
ending by showing it hundreds of thousands of example games?

- Don







Re: [computer-go] State of the art of pattern matching

2008-03-27 Thread terry mcintyre

--- Don Dailey [EMAIL PROTECTED] wrote:

 terry mcintyre wrote:
  I've been thinking a bit about the collection of
  patterns from games, whether of professionals or
 of
  programs.
 
  It is possible to get some remarkably high
 correlation
  between the moves played by pros and a predictor -
 yet
  still not have a good program. Why?

 
 In my opinion it's because playing go is more about
 the quality of your weakest moves, it's not the
 average quality of the moves.   You pretty much have
 to play every move at a given level to actually be
 playing at
 that level.   Of course that's a simplification but
 gets the point across.  Some moves can be found by
 relatively weak players and the difference between
 good and great players is in a small percentage of
 the moves.

Very true! I have the very good fortune to play
against a pro from time to time. (He gives me 9
stones and I try to minimize my losses after that
point.) Many of my moves look very like those of a
pro player, but I'm still somewhere around 5 kyu on
KGS. It's one thing to play the first three moves of
the Chinese Opening; it's another to correctly handle
the umpty-seven variations which follow - especially
variants which would never appear in pro games.

I've watched a pro patiently wait for one mistake,
after which the game unravels. It's like breaking the
weakest link of a chain. We lower-level players
usually have many such weaknesses, not obvious when we
play against people or programs at a similar level.





Re: [computer-go] State of the art of pattern matching

2008-03-27 Thread Stuart A. Yeates
Don Dailey [EMAIL PROTECTED] wrote:

  In some of my pattern learning experiments,  I discovered that only a
  very small subset of possible patterns occur on the real board,  and yet
  for a game tree searcher it would be pretty important to understand
  those patterns that are constantly avoided in order to understand why
  they are being avoided.

  That's why I believe that patterns culled from games are pretty much
  useless.That probably extends to most learning based on observing
  games.

I agree wholeheartedly with your observation, but not with your conclusion.

It is undoubtedly true that dan level players foresee, understand and
avoid many patterns, but it is also true that many players develop
those abilities largely through playing many games of go, as well as
studying books and problems.

Given that there are collections of tens of thousands of games played
by kyu-level players as they improve to dan level, and given that there
are collections of thousands of life-and-death problems studied by
those players to the same end, I see no rational explanation why the
lessons learnt by humans as they improve (or equivalent lessons
expressed computationally) can't be inferred.

cheers
stuart


[computer-go] State of the art of pattern matching

2008-03-26 Thread Mark Boon
Lately I have been putting some effort into pattern-matching.
Although I have made progress, the result fell short of what I had
hoped for by about an order of magnitude. This makes me wonder what
currently actually is the state of the art of pattern matching in Go.


Of course it's a bit of a vague question, as there are many possible
forms of pattern-matching. Fixed-size patterns are the easiest, since
a hash code can be used. Nothing is going to beat that in terms of
speed, but its use is limited to some special occasions: joseki is one,
and apparently 3x3 patterns are used in Monte-Carlo programs.
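For the fixed-size case, a common scheme canonicalizes the 3x3 neighbourhood over the eight board symmetries and colour inversion before hashing. This is a generic sketch, not any particular program's code:

```python
# One common way to use fixed-size 3x3 patterns: encode the 8 points
# around a candidate move as 2 bits each (empty/black/white/edge) and
# take the minimum over the 8 board symmetries and colour inversion as
# a canonical 16-bit key. Generic sketch, not a specific program's scheme.
EMPTY, BLACK, WHITE, EDGE = 0, 1, 2, 3

def encode(ring):
    """Pack 8 point states (2 bits each) into a 16-bit integer."""
    key = 0
    for state in ring:
        key = (key << 2) | state
    return key

def canonical(ring):
    """Minimum key over 4 rotations x 2 reflections x colour swap.
    `ring` is the clockwise list of the 8 neighbours, so a 90-degree
    rotation is a cyclic shift by 2 and a reflection is a reversal."""
    swap = {EMPTY: EMPTY, BLACK: WHITE, WHITE: BLACK, EDGE: EDGE}
    best = None
    for colours in (ring, [swap[s] for s in ring]):
        for flipped in (colours, colours[::-1]):   # reflection
            for r in range(4):                     # rotations by 90 deg
                rot = flipped[2 * r:] + flipped[:2 * r]
                key = encode(rot)
                if best is None or key < best:
                    best = key
    return best

# Two rotations of the same local shape map to the same key:
a = [BLACK, EMPTY, EMPTY, EMPTY, WHITE, EMPTY, EMPTY, EMPTY]
b = a[2:] + a[:2]  # same shape rotated 90 degrees
assert canonical(a) == canonical(b)
```

The canonical key then indexes directly into a table of pattern values, which is what makes the fixed-size case so hard to beat on speed.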


But I think the most generally useful form is one that can do
pattern-matching for variable shapes. (Or which can have don't-care
points, if you like.) I thought I had a solution that would be pretty
close to the theoretical best performance. Formally proving that would
probably be a dissertation in itself; what matters most to me is that
it works, and with modest memory requirements. That is the good part.
The bad part is that if I'm right that this is the best it can get,
I'm a bit disappointed it isn't faster. I'd rather be proven wrong
here. It's written in Java, so rewriting it in C would possibly give a
two-fold speedup, but that's all I can think of.


What I have now takes 10-15 microseconds on a 2 GHz Intel computer to
find all the patterns on a board (on average for 9x9; for 19x19 it's
more like 15-20 usec), and it also gives me the 'new' patterns, i.e.
patterns that match now but didn't match the previous move (I believe
that's a useful feature, but it's a detail). The set is under a
thousand patterns. Somehow smaller sets don't go much faster, but
larger sets do slow down; every ten-fold increase in the number of
patterns seems to double the time.


So I was wondering if others cared to share the performance of their
pattern-matchers. I just want to find out whether I'm chasing a
unicorn by trying to make something faster.


Mark




Re: [computer-go] State of the art of pattern matching

2008-03-26 Thread David Doshay

Our pattern matching work is just now starting to run.
We will post details when we have done more testing.

Cheers,
David





Re: [computer-go] State of the art of pattern matching

2008-03-26 Thread Don Dailey
Mark,

I am doing some experimentation with something similar to patterns,
but based on Naive Bayes classifiers. The idea is to use Naive Bayes
classifiers to generalize patterns. The classifiers would still be
focused on some constrained area of the board, such as a 5x5 matrix or
something, but they would be extremely compact compared to patterns
and very good at generalizing. I'm convinced they would have to be
enhanced with additional attributes concerning the points being
considered, but one of their strengths is that you can pile on huge
numbers of attributes without hurting speed or memory consumption very
significantly.

Recently there has been a huge surge of interest in Naive Bayes
classifiers due to their simplicity and speed, and their amazing
accuracy despite both. Nothing has been found that improves all that
much on Naive Bayes for many applications, and some simple
improvements to Naive Bayes have put it in the same league as far more
complex and computationally expensive methods such as neural networks
and decision trees.

I have no idea whether I'm barking up the wrong tree, but I think it's
definitely worth taking a look at. I don't think Go programs can be
improved much more by simply adding more hand-crafted patterns; we need
to find ways to generalize knowledge in powerful ways.

Naive Bayes is trained by example, and training is trivial: basically
a single pass of statistics gathering. So you must basically show it a
gazillion sample patterns with known classifications. You could build
these from games of strong players, for instance.
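As a loose illustration of "training is basically statistics gathering", here is a minimal two-class Naive Bayes sketch; the binary move features named in the comment are invented for the example:

```python
# Minimal Naive Bayes for a two-class problem (move played vs. not
# played), trained by one pass of counting, as described above. The
# features (binary attributes of a candidate point) are invented.
import math
from collections import defaultdict

class NaiveBayes:
    def __init__(self, n_features):
        self.n = n_features
        self.class_count = defaultdict(int)
        # feat_count[c][i] = how often feature i was 1 in class c
        self.feat_count = defaultdict(lambda: [0] * n_features)

    def train(self, features, label):
        """Single-pass statistics gathering: just counting."""
        self.class_count[label] += 1
        for i, f in enumerate(features):
            if f:
                self.feat_count[label][i] += 1

    def log_score(self, features, label):
        """log P(label) + sum_i log P(feature_i | label), with
        Laplace smoothing so unseen combinations never hit log(0)."""
        total = sum(self.class_count.values())
        score = math.log((self.class_count[label] + 1) / (total + 2))
        for i, f in enumerate(features):
            p1 = (self.feat_count[label][i] + 1) / (self.class_count[label] + 2)
            score += math.log(p1 if f else 1.0 - p1)
        return score

    def classify(self, features):
        return max(self.class_count, key=lambda c: self.log_score(features, c))

nb = NaiveBayes(3)  # e.g. [atari_nearby, last_move_adjacent, on_edge]
nb.train([1, 1, 0], "played")
nb.train([1, 0, 0], "played")
nb.train([0, 0, 1], "not_played")
nb.train([0, 1, 1], "not_played")
assert nb.classify([1, 1, 0]) == "played"
```

Classification cost is O(kn) for k classes and n attributes, and the model is just the count tables, which is why both training and memory stay so cheap.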

- Don


 



Re: [computer-go] State of the art of pattern matching

2008-03-26 Thread Mark Boon
Thanks for the pointer Don, that might be worth looking into at some
point.

When you say 'amazing accuracy despite this speed and simplicity' I
still have to ask: how fast? I think 10 usec is actually pretty fast,
but when talking about using it during MC playouts, for example, it's
still rather slow. This is due to the pile-up of numbers: a
(relatively) big area to look at, and a large number of occasions to
do the lookup.

The patterns don't necessarily need to be hand-crafted. But throwing
a million generated patterns at it also has its price in terms of
performance.


Mark




Re: [computer-go] State of the art of pattern matching

2008-03-26 Thread Moi de Quoi
On Wed, 2008-03-26 at 15:08 -0300, Mark Boon wrote:
 Lately I have been putting some effort into pattern-matching.  
 Although I have made progress, the result was not as good as what I  
 had hoped for by about an order of magnitude. This makes me wonder  
 what is currently actually the state of the art of pattern matching  
 in Go.

My go-programming efforts are very much concentrated on patterns
(maybe I have been influenced by the Kojima papers).
My patterns are centered around the point to be played.
Currently an 11x11 square area (120 points) is covered.
I use a polar coordinate system.
The patterns themselves are basically stored in an n-Kd tree, using a
bitmask of 4 bits (for E/B/W + off-board) per stored point. Not all the
points are stored, only the ones that are needed to build the tree.
Currently, the pattern's payload consists of the number of times it
was seen, and the proposed value for Black or White to move here.

I have the pattern database running as a separate process, communicating
via UDP packets. (I don't care about performance; the idea is that the
main process can do other useful things while the pattern matcher
fetches the patterns, and that - once hardware is cheap - I could add
more pattern servers.)
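A minimal sketch of that process split, with an invented wire format (a pattern index in, an urgency value out); a real system would batch queries and handle packet loss and reordering:

```python
# Sketch of the pattern-database-as-separate-process idea: the matcher
# runs as a UDP server and the engine sends it pattern queries. The
# wire format and payload here are invented for illustration.
import socket
import struct
import threading

PATTERNS = {0x00BEEF: 0.75}  # hypothetical pattern index -> urgency

def serve(sock):
    """Answer one query: 4-byte pattern index in, 8-byte float out."""
    data, addr = sock.recvfrom(4)
    (idx,) = struct.unpack("!I", data)
    value = PATTERNS.get(idx, 0.0)
    sock.sendto(struct.pack("!d", value), addr)

# Pattern server on a loopback socket with an OS-assigned port.
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))
port = server.getsockname()[1]
threading.Thread(target=serve, args=(server,), daemon=True).start()

# The engine side: fire a query, collect the answer.
client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.sendto(struct.pack("!I", 0x00BEEF), ("127.0.0.1", port))
reply, _ = client.recvfrom(8)
(urgency,) = struct.unpack("!d", reply)
assert urgency == 0.75
```

The engine can overlap other work with the network round-trip, which is the stated point of the design even though each individual lookup is slow.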

I estimate the patterns' values by making them compete (just like
Kojima). Currently, the database is about 1 GB in size and contains
about 8M 'positions'.
Values have been obtained by extracting from professional games (twice).
To avoid bad moves being absent, I am currently training with
(N=xxx) games obtained from the NNGS archives (clipped at ~10 kyu). I
intend to do another run of the professional games, to resettle good
ordering.

Currently the patterns last up to about move number 100 (harvesting
ceases once there are no more patterns left to compete with), and in
most games, up to about moves 20-60, circa 50% of the moves are within
the four top-ranked patterns.

I don't do speed tests. (I am only in it for the data.) I guess it does
about 2 moves/sec.

Yes, I only refetch the (120) patterns for the locations that have been
affected by the previous move (fewer than 120 near the edges and
corners, of course).
Yes, this makes the board code very heavy (basically, each point's
surroundings are stored with the point data, to be used as an argument
for the pattern search), but that is negligible in relation to the
network round-trip + DB lookup/update.

Recently, I started experimenting with 12-point (24-bit) patterns in a
Swiss-flag shape, which can be kept in memory, since there are only
16M possible values. Once it works, I may use it in a pattern-enhanced
UCT experiment.
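Since 12 points at 2 bits each (empty/black/white/off-board) pack into 24 bits, the whole pattern space is directly indexable in memory. A sketch follows; the exact Swiss-flag point set is a guess at the intended shape:

```python
# 12 points x 2 bits (empty/black/white/off-board) = 24 bits, so every
# possible pattern has a direct index into a table of 2**24 = 16M
# entries. The "Swiss flag" point set below is a guess at the intended
# shape (a Swiss cross around the move, centre excluded).
EMPTY, BLACK, WHITE, OFF = 0, 1, 2, 3

SHAPE = [(0, -2),
         (-1, -1), (0, -1), (1, -1),
         (-2, 0), (-1, 0), (1, 0), (2, 0),
         (-1, 1), (0, 1), (1, 1),
         (0, 2)]
assert len(SHAPE) == 12

def pattern_index(board, x, y, size):
    """Pack the 12 surrounding point states into one 24-bit index."""
    idx = 0
    for dx, dy in SHAPE:
        px, py = x + dx, y + dy
        state = board[py][px] if 0 <= px < size and 0 <= py < size else OFF
        idx = (idx << 2) | state
    return idx

size = 9
board = [[EMPTY] * size for _ in range(size)]
board[4][3] = BLACK                       # a stone left of the centre
i = pattern_index(board, 4, 4, size)
assert 0 <= i < 2 ** 24
# The payload (e.g. an urgency value) would live in a flat in-memory
# table of 2**24 entries; a dict stands in for it here.
table = {i: 1.0}
```

At 16,777,216 entries, even a multi-byte payload per pattern stays well within ordinary RAM, so no tree search or network round-trip is needed for lookup.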

Adriaan van Kessel.






Re: [computer-go] State of the art of pattern matching

2008-03-26 Thread Don Dailey
The speed depends on how many attributes the item you wish to classify
has, as well as on how many classes there are. The complexity of
classifying something is O(kn), where k is the number of classes and n
the number of attributes.

If your patterns carry more than just the trivial [black, white, empty,
edge] type of attributes, so that other features must be computed -
which I think they should - then speed probably won't be an issue,
because your time will be dominated by feature extraction. I think you
need to know, for each point, information about the liberties, whether
the last move was close, and other things, to really get something
powerful.

It's also very small in memory, and training is basically statistics
gathering. In one test I trained Naive Bayes on about 200,000
instances, and it took just seconds.

Of course, if this is done for every empty point on the board during
play-outs, it will not be as fast as simple light play-outs, but just a
few hundred heavy play-outs dominate many thousands of light play-outs
in quality, so we are willing to sacrifice a lot if we get quality in
return.

Also, you can probably find ways to avoid a lot of computation in a
polished program, such as result caching, or simple pattern-based
pre-tests that eliminate some points from consideration.

- Don





Mark Boon wrote:
 Thanks for the pointer Don, that might be worth looking into at some
 point.

 When you say 'amazing accuracy despite this speed and simplicity' I
 still have to ask: how fast?. I think 10usec is actually pretty
 fast. But when talking about using it during MC playouts for example,
 it's still rather slow. This is due to the pile-up of numbers. A
 (relatively) big area to look for, and a large number of occasions to
 do the lookup.

 The patterns don't necessarily need to be hand-crafted. But throwing a
 million generated patterns at it also has its price in terms of
 performance.

 Mark


 On 26-mrt-08, at 16:17, Don Dailey wrote:

 Mark,

 I am doing some experimentation with something similar to patterns,  but
 based  on Naive Bayes classifiers.   The idea is to use  Naive Bayes
 classifiers to generalize patterns. The classifiers would still be
 focused on some constrained area of the board, such as the 5x5 matrix or
 something,  but they would be extremely compact compared to patterns and
 very good at generalizing.  I'm convinced they would have to be enhanced
 with additional attributes concerning the points being considered,  but
 one of their strengths is that you can pile on huge numbers of
 attributes without killing the speed or memory consumption very
 significantly.

 Recently there has been a huge surge of interest in Naive Bayes
 Classifiers due to their simplicity and speed, and yet amazing accuracy
 despite this speed and simplicity. Nothing has been found that
 improves all that much on Naive Bayes for many applications, and some
 simple improvements to Naive Bayes have put it in the same league as
 other far more complex and computationally expensive methods such as
 neural networks and decision trees.

 I have no idea whether I'm barking up the wrong tree - but I think it's
 definitely worth taking a look at. I don't think GO programs can be
 improved much more by simply adding more hand crafted patterns - we need
 to find ways to generalize knowledge in powerful ways.

 Naive Bayes is trained by example, and it's trivial: basically
 single-pass statistics gathering. So you must basically show it a
 gazillion sample patterns with known classifications. You could build
 these from games of strong players, for instance.

 - Don




 Mark Boon wrote:
 Lately I have been putting some effort into pattern-matching. Although
 I have made progress, the result was not as good as what I had hoped
 for by about an order of magnitude. This makes me wonder what is
 currently actually the state of the art of pattern matching in Go.

 Of course it's a bit of a vague question, as there are many possible
 forms of pattern-matching. Fixed-size patterns are the easiest since a
 hash-code can be used. Nothing is going to beat that in terms of
 speed, but its use is limited to some special occasions. Joseki is one
 and apparently 3x3 patterns are used in Monte-Carlo programs.

 But I think the most generally useful form is one that can do
 pattern-matching for variable shapes. (Or which can have don't-care
 points if you like.) I thought I had a solution that would be pretty
 close to the theoretical best performance. Formally proving that would
 probably be a dissertation in itself; most important for me is that it
 works, and with modest memory requirements. That is the good part. The
 bad part is: if I'm right that this is the best it can get, I'm a bit
 disappointed it isn't faster. I'd rather be proven wrong here.
 It's written in Java, so making it in C would possibly give a two-fold
 speedup, but that's all I can think of.

 What I have now takes 10-15 microseconds on a 2GHz Intel 

Re: [computer-go] State of the art of pattern matching

2008-03-26 Thread Stuart A. Yeates
My computer-go player is a single-pattern system. It linearises
patterns and stores them in a very large suffix tree. At each node in
the tree, counts are kept of the number of times the node has been
played or not played.
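A stripped-down version of that idea can be sketched as follows (a hypothetical simplification for illustration, not the actual jgogears code): linearised neighbourhoods become strings, inserted into a trie whose nodes count played versus seen, and prediction reads off the deepest node that still matches:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of a pattern trie with play counts (hypothetical, simplified).
// Board neighbourhoods are linearised into strings over {B, W, ., #};
// each node counts how often the centre point was subsequently played.
public class PatternTrie {
    static class Node {
        Map<Character, Node> children = new HashMap<>();
        int played, seen;
    }

    private final Node root = new Node();

    // Record one linearised pattern and whether its centre was played.
    public void train(String pattern, boolean wasPlayed) {
        Node n = root;
        for (char c : pattern.toCharArray()) {
            n = n.children.computeIfAbsent(c, k -> new Node());
            n.seen++;
            if (wasPlayed) n.played++;
        }
    }

    // Estimate P(played) from the deepest matching node; an unseen
    // pattern falls back to the longest prefix that has been seen.
    public double score(String pattern) {
        Node n = root, last = root;
        for (char c : pattern.toCharArray()) {
            n = n.children.get(c);
            if (n == null) break;
            last = n;
        }
        return last.seen == 0 ? 0.0 : (double) last.played / last.seen;
    }
}
```

The fallback to shorter prefixes is what gives this kind of structure its generalisation: rare full-size patterns still get an estimate from their more common prefixes.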

http://code.google.com/p/jgogears/

It's currently at the stage where it plays a game that looks like go
in self-play, after training on only 200 games. Training is currently
very slow; play is very fast.

Java/javacc/eclipse/junit

cheers
stuart
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] State of the art of pattern matching

2008-03-26 Thread Gunnar Farnebäck

Mark Boon wrote:
Lately I have been putting some effort into pattern-matching. Although I 
have made progress, the result was not as good as what I had hoped for 
by about an order of magnitude. This makes me wonder what is currently 
actually the state of the art of pattern matching in Go.


Of course it's a bit of a vague question, as there are many possible 
forms of pattern-matching. Fixed-size patterns are the easiest since a 
hash-code can be used. Nothing is going to beat that in terms of speed, 
but its use is limited to some special occasions. Joseki is one and 
apparently 3x3 patterns are used in Monte-Carlo programs.
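For those fixed-size patterns the match really can be a single array access. A minimal sketch (hypothetical code illustrating the general technique, not any particular program's implementation): pack the 8 neighbours of a point into a base-4 index, so matching a 3x3 pattern is one table lookup:

```java
// Hypothetical sketch of fixed-size 3x3 pattern lookup: the 8 neighbours
// of a point are packed into a 16-bit index (2 bits per point), so a
// match is a single array read.
public class Pattern3x3 {
    public static final int EMPTY = 0, BLACK = 1, WHITE = 2, EDGE = 3;
    // One entry per possible neighbourhood: 4^8 = 65536 values.
    private final byte[] table = new byte[1 << 16];

    // Pack the 8 neighbour colours (in a fixed order, e.g. clockwise
    // from top-left) into a table index.
    public static int index(int[] neighbours) {
        int idx = 0;
        for (int c : neighbours) idx = (idx << 2) | c;
        return idx;
    }

    public void put(int[] neighbours, byte value) {
        table[index(neighbours)] = value;
    }

    public byte get(int[] neighbours) {
        return table[index(neighbours)];
    }
}
```

A real implementation would also fold the 8 symmetries of the square into the table, but the lookup cost stays the same.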


But I think the most generally useful form is one that can do 
pattern-matching for variable shapes. (Or which can have don't-care 
points if you like.) I thought I had a solution that would be pretty 
close to the theoretical best performance. Formally proving that would 
probably be a dissertation in itself; most important for me is that it
works, and with modest memory requirements. That is the good part. The
bad part is: if I'm right that this is the best it can get, I'm a bit
disappointed it isn't faster. I'd rather be proven wrong here. It's
written in Java, so making it in C would possibly give a two-fold
speedup, but that's all I can think of.


What I have now takes 10-15 microseconds on a 2GHz Intel computer to
find all the patterns on a board (on average for 9x9; for 19x19 it's
more like 15-20 usec), and it also gives me the 'new' patterns, i.e.
patterns that match now but didn't match the previous move (I believe
that's a useful feature, but it's a detail). The set contains under a
thousand patterns. Somehow smaller sets don't go much faster, but
larger sets do slow down: every ten-fold increase in the number of
patterns seems to double the time.


So I was wondering if others cared to share the performance of their 
pattern-matcher. I just want to find out if I'm chasing a unicorn or not 
by trying to make something faster.


I tried this with GNU Go and got about 40-80 usec on 9x9 for a database
of 500 patterns with, in some cases, lots of wildcards (black or empty,
white or empty, black or white or empty (but not off board)). This was
on a 2.2 GHz AMD64.


/Gunnar