subject:"\[computer\-go\] 10k UCT bots"

Re: [computer-go] 10k UCT bots

2008-05-20 Thread Michael Williams


I think the general distaste for modulo is based on the historical tendency for 
the low-order bits to be less random than the high-order bits.


Hideki Kato wrote:

Thank you for detailed explanation.  I've understood well now.

It's essentially the mapping problem from [0..N) to [0..M) where  
N > M and N % M != 0  or N is greater than M and M don't divide N.  
The frequencies of the mapping have to have the least difference, one 
(unless discarding extra part of source, mentioned in my previous 
post).


The modulo operation, however, is not preferable because it 
introduces very systematic bias, smaller numbers always have the 
bonus for every M.  In this sense, multiply and divide operations are 
not the same as modulo.


Thanks again,
Hideki

Gunnar Farnebäck: <[EMAIL PROTECTED]>:

I wrote:

Hideki Kato wrote:

Gunnar Farnebäck: <[EMAIL PROTECTED]>:

Hideki Kato wrote:

I didn't against you, Álvaro, rather I just made a caution for
programmers who will use your pseudo code as is. :)

First, I prefer SFMT (SIMD-oriented Fast Mersenne Twister) rather
than integer pseudo random number generators in practice where the
quality of play-out is important.  Modern processors execute floating
operations as fast as interger ones and
picked = mt_rand() * (double) num_candidates;
is the simplest and safe.

Please note that for uniformity purists this method has exactly the
same problem as good_quality_int_rand() % num_candidates.

Mt_rand() has very good uniform distributions in [0..1)
while
good_quality_int_rand() % num_candidates
doesn't disribute uniformly when num_candidates is not a power of 2,
assuming good_quality_int_rand() ranges [0..2^32 or so) due to modulo
operations.  They are not the same, aren't they?

Well, there's nothing magic about floating point numbers. Even a very
good uniform distribution in some interval is implemented by
distributing N discrete values over the interval as uniformly as
possible. When those N values by some mapping procedure are transformed
into a smaller range M, some of those will get at least one more hit
than some others, unless M divides N. It doesn't matter whether the
mapping procedure is an integer modulo operation or a floating point
multiplication + rounding.

It seems like this short explanation was too abstract. Let's try with
a more concrete one.

The function dsfmt_genrand_close1_open2 in the file dSFMT.h from
dSFMT-src-1.3, downloadable from
http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/SFMT/index.html,
generates uniformly distributed double precision floating point
numbers in the interval 1 <= x < 2.  This is done by taking uniformly
distributed 52 bit integers and placing those as the least significant
bits of a 64 bit IEEE 754 double precision floating point number with
the bit pattern

0 011 y,

where y denotes the 52 bits uniformly distributed integer. This is
interpreted as the floating point number with value

x = 1 + y / 2^52

The function dsfmt_genrand_close_open generates uniformly distributed
double precision floating point numbers in the interval 0 <= x < 1
simply by subtracting one from the numbers above. These are still
exactly representable by IEEE754 double precision floating point
numbers although their bit representations become somewhat more
involved due precisely to the point floating around.

Thus we now have the uniform integer distribution 0, ..., 2^52-1
mapped onto the floating point numbers 0, 1*2^(-52), 2*2^(-52), ...,
(2^52-1)*2^(-52).

Next we multiply this by num_candidates and round down to the nearest
integer. Let's say that num_candidates is 7. Then the 52 bit integers
are mapped onto the interval 0, 1, 2, 3, 4, 5, 6 so that

   0 -  643371375338642 are mapped to 0
 643371375338643 - 1286742750677284 are mapped to 1
1286742750677285 - 1930114126015926 are mapped to 2
1930114126015927 - 2573485501354569 are mapped to 3
2573485501354570 - 3216856876693211 are mapped to 4
3216856876693212 - 3860228252031853 are mapped to 5
3860228252031854 - 4503599627370495 are mapped to 6

This means that moves 0, 3 have a relative chance of 643371375338643
to be chosen whereas moves 1, 2, 4, 5, 6 have a relative chance of
only 643371375338642. Of course this is for almost any purpose
negligible but it is exactly the same bias you get by taking a
uniformly distributed 52-bit integer modulo 7, except that then it's
the moves 0 and 1 that are favored instead of 0 and 3.

Well, that's what naive theoretical computations give at least, but
it's usually dangerous to trust intuition too much when dealing with
floating point numbers. Actually trying this out with the attached
program gives the following result on my computer,

0 0.00 0
643371375338642 0.142857 0
643371375338643 0.142857 1
1286742750677284 0.285714 1
1286742750677285 0.285714 2
1930114126015926 0.428571 2
1930114126015927 0.428571 3
2573485501354568 0.571429 3
2573485501354569 0.571429 4
2573485501354570 0.571429 4
3216856876693211 0.714286 4
3216856876693

Re: [computer-go] 10k UCT bots

2008-05-19 Thread Hideki Kato

Thank you for detailed explanation.  I've understood well now.

It's essentially the mapping problem from [0..N) to [0..M) where
N > M and N % M != 0  or N is greater than M and M don't divide N.
The frequencies of the mapping have to have the least difference, one
(unless discarding extra part of source, mentioned in my previous
post).

The modulo operation, however, is not preferable because it
introduces very systematic bias, smaller numbers always have the
bonus for every M.  In this sense, multiply and divide operations are
not the same as modulo.

Thanks again,
Hideki

Gunnar Farnebäck: <[EMAIL PROTECTED]>:
>I wrote:
> > Hideki Kato wrote:
> >> Gunnar Farnebäck: <[EMAIL PROTECTED]>:
> >>> Hideki Kato wrote:
>  I didn't against you, Álvaro, rather I just made a caution for
>  programmers who will use your pseudo code as is. :)
> 
>  First, I prefer SFMT (SIMD-oriented Fast Mersenne Twister) rather
>  than integer pseudo random number generators in practice where the
>  quality of play-out is important.  Modern processors execute floating
>  operations as fast as interger ones and
>  picked = mt_rand() * (double) num_candidates;
>  is the simplest and safe.
> >>> Please note that for uniformity purists this method has exactly the
> >>> same problem as good_quality_int_rand() % num_candidates.
> >>
> >> Mt_rand() has very good uniform distributions in [0..1)
> >> while
> >> good_quality_int_rand() % num_candidates
> >> doesn't disribute uniformly when num_candidates is not a power of 2,
> >> assuming good_quality_int_rand() ranges [0..2^32 or so) due to modulo
> >> operations.  They are not the same, aren't they?
> >
> > Well, there's nothing magic about floating point numbers. Even a very
> > good uniform distribution in some interval is implemented by
> > distributing N discrete values over the interval as uniformly as
> > possible. When those N values by some mapping procedure are transformed
> > into a smaller range M, some of those will get at least one more hit
> > than some others, unless M divides N. It doesn't matter whether the
> > mapping procedure is an integer modulo operation or a floating point
> > multiplication + rounding.
>
>It seems like this short explanation was too abstract. Let's try with
>a more concrete one.
>
>The function dsfmt_genrand_close1_open2 in the file dSFMT.h from
>dSFMT-src-1.3, downloadable from
>http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/SFMT/index.html,
>generates uniformly distributed double precision floating point
>numbers in the interval 1 <= x < 2.  This is done by taking uniformly
>distributed 52 bit integers and placing those as the least significant
>bits of a 64 bit IEEE 754 double precision floating point number with
>the bit pattern
>
>0 011 y,
>
>where y denotes the 52 bits uniformly distributed integer. This is
>interpreted as the floating point number with value
>
>x = 1 + y / 2^52
>
>The function dsfmt_genrand_close_open generates uniformly distributed
>double precision floating point numbers in the interval 0 <= x < 1
>simply by subtracting one from the numbers above. These are still
>exactly representable by IEEE754 double precision floating point
>numbers although their bit representations become somewhat more
>involved due precisely to the point floating around.
>
>Thus we now have the uniform integer distribution 0, ..., 2^52-1
>mapped onto the floating point numbers 0, 1*2^(-52), 2*2^(-52), ...,
>(2^52-1)*2^(-52).
>
>Next we multiply this by num_candidates and round down to the nearest
>integer. Let's say that num_candidates is 7. Then the 52 bit integers
>are mapped onto the interval 0, 1, 2, 3, 4, 5, 6 so that
>
>0 -  643371375338642 are mapped to 0
>  643371375338643 - 1286742750677284 are mapped to 1
>1286742750677285 - 1930114126015926 are mapped to 2
>1930114126015927 - 2573485501354569 are mapped to 3
>2573485501354570 - 3216856876693211 are mapped to 4
>3216856876693212 - 3860228252031853 are mapped to 5
>3860228252031854 - 4503599627370495 are mapped to 6
>
>This means that moves 0, 3 have a relative chance of 643371375338643
>to be chosen whereas moves 1, 2, 4, 5, 6 have a relative chance of
>only 643371375338642. Of course this is for almost any purpose
>negligible but it is exactly the same bias you get by taking a
>uniformly distributed 52-bit integer modulo 7, except that then it's
>the moves 0 and 1 that are favored instead of 0 and 3.
>
>Well, that's what naive theoretical computations give at least, but
>it's usually dangerous to trust intuition too much when dealing with
>floating point numbers. Actually trying this out with the attached
>program gives the following result on my computer,
>
>0 0.00 0
>643371375338642 0.142857 0
>643371375338643 0.142857 1
>1286742750677284 0.285714 1
>1286742750677285 0.285714 2
>1930114126015926 0.428571 2
>1930114126015927 0.428571 3
>2573485501354568 0.571429 3
>2573485501354569 0.571429 4
>2573485501354570 0.571429 4

Re: [computer-go] 10k UCT bots

2008-05-19 Thread Gunnar Farnebäck

I wrote:
> Hideki Kato wrote:
>> Gunnar Farnebäck: <[EMAIL PROTECTED]>:
>>> Hideki Kato wrote:
 I didn't against you, Álvaro, rather I just made a caution for
 programmers who will use your pseudo code as is. :)

 First, I prefer SFMT (SIMD-oriented Fast Mersenne Twister) rather
 than integer pseudo random number generators in practice where the
 quality of play-out is important.  Modern processors execute floating
 operations as fast as interger ones and
 picked = mt_rand() * (double) num_candidates;
 is the simplest and safe.
>>> Please note that for uniformity purists this method has exactly the
>>> same problem as good_quality_int_rand() % num_candidates.
>>
>> Mt_rand() has very good uniform distributions in [0..1)
>> while
>> good_quality_int_rand() % num_candidates
>> doesn't disribute uniformly when num_candidates is not a power of 2,
>> assuming good_quality_int_rand() ranges [0..2^32 or so) due to modulo
>> operations.  They are not the same, aren't they?
>
> Well, there's nothing magic about floating point numbers. Even a very
> good uniform distribution in some interval is implemented by
> distributing N discrete values over the interval as uniformly as
> possible. When those N values by some mapping procedure are transformed
> into a smaller range M, some of those will get at least one more hit
> than some others, unless M divides N. It doesn't matter whether the
> mapping procedure is an integer modulo operation or a floating point
> multiplication + rounding.

It seems like this short explanation was too abstract. Let's try with
a more concrete one.

The function dsfmt_genrand_close1_open2 in the file dSFMT.h from
dSFMT-src-1.3, downloadable from
http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/SFMT/index.html,
generates uniformly distributed double precision floating point
numbers in the interval 1 <= x < 2.  This is done by taking uniformly
distributed 52 bit integers and placing those as the least significant
bits of a 64 bit IEEE 754 double precision floating point number with
the bit pattern

0 011 y,

where y denotes the 52 bits uniformly distributed integer. This is
interpreted as the floating point number with value

x = 1 + y / 2^52

The function dsfmt_genrand_close_open generates uniformly distributed
double precision floating point numbers in the interval 0 <= x < 1
simply by subtracting one from the numbers above. These are still
exactly representable by IEEE754 double precision floating point
numbers although their bit representations become somewhat more
involved due precisely to the point floating around.

Thus we now have the uniform integer distribution 0, ..., 2^52-1
mapped onto the floating point numbers 0, 1*2^(-52), 2*2^(-52), ...,
(2^52-1)*2^(-52).

Next we multiply this by num_candidates and round down to the nearest
integer. Let's say that num_candidates is 7. Then the 52 bit integers
are mapped onto the interval 0, 1, 2, 3, 4, 5, 6 so that

   0 -  643371375338642 are mapped to 0
 643371375338643 - 1286742750677284 are mapped to 1
1286742750677285 - 1930114126015926 are mapped to 2
1930114126015927 - 2573485501354569 are mapped to 3
2573485501354570 - 3216856876693211 are mapped to 4
3216856876693212 - 3860228252031853 are mapped to 5
3860228252031854 - 4503599627370495 are mapped to 6

This means that moves 0, 3 have a relative chance of 643371375338643
to be chosen whereas moves 1, 2, 4, 5, 6 have a relative chance of
only 643371375338642. Of course this is for almost any purpose
negligible but it is exactly the same bias you get by taking a
uniformly distributed 52-bit integer modulo 7, except that then it's
the moves 0 and 1 that are favored instead of 0 and 3.

Well, that's what naive theoretical computations give at least, but
it's usually dangerous to trust intuition too much when dealing with
floating point numbers. Actually trying this out with the attached
program gives the following result on my computer,

0 0.00 0
643371375338642 0.142857 0
643371375338643 0.142857 1
1286742750677284 0.285714 1
1286742750677285 0.285714 2
1930114126015926 0.428571 2
1930114126015927 0.428571 3
2573485501354568 0.571429 3
2573485501354569 0.571429 4
2573485501354570 0.571429 4
3216856876693211 0.714286 4
3216856876693212 0.714286 5
3860228252031853 0.857143 5
3860228252031854 0.857143 6
4503599627370495 1.00 6

showing that 2573485501354569 maps to 4 rather than 3, meaning that
it's moves 0 and 4 which are favored.

I could go on but I think this should be enough to demonstrate my
point.

/Gunnar
#include 

int main(int argc, char **argv)
{
  unsigned long u;
  double x;
  int k;

  unsigned long v[] = {
0ul,
643371375338642ul,
643371375338643ul,
1286742750677284ul,
1286742750677285ul,
1930114126015926ul,
1930114126015927ul,
2573485501354568ul,
2573485501354569ul,
2573485501354570ul,
3216856876693211ul,
3216856876693212ul,
3860228252031853ul,
38602282

Re: [computer-go] 10k UCT bots

2008-05-19 Thread Hideki Kato

Assumming good_quality_int_rand() ranges [0..15] and N is 9, the
mapping is 0->0, 1->1, ..., 8->8, 9->0, ..., 15->6.  This shows 0 to 6
have twice of the chance of 7 and 8.  This is the bias caused by
the modulo operation.

In addition,
>multiplication + rounding.
may return N.  Use trancate instead for the range [0..N).
#I used float instead of double and also got N (very rare). Please be
careful.

-Hideki

Gunnar Farnebäck: <[EMAIL PROTECTED]>:
>Hideki Kato wrote:
>> Gunnar Farnebäck: <[EMAIL PROTECTED]>:
>>> Hideki Kato wrote:
 I didn't against you, Álvaro, rather I just made a caution for
 programmers who will use your pseudo code as is. :)

 First, I prefer SFMT (SIMD-oriented Fast Mersenne Twister) rather
 than integer pseudo random number generators in practice where the
 quality of play-out is important.  Modern processors execute floating
 operations as fast as interger ones and
picked = mt_rand() * (double) num_candidates;
 is the simplest and safe.
>>> Please note that for uniformity purists this method has exactly the
>>> same problem as good_quality_int_rand() % num_candidates.
>> 
>> Mt_rand() has very good uniform distributions in [0..1)
>> while
>> good_quality_int_rand() % num_candidates
>> doesn't disribute uniformly when num_candidates is not a power of 2, 
>> assuming good_quality_int_rand() ranges [0..2^32 or so) due to modulo 
>> operations.  They are not the same, aren't they?
>
>Well, there's nothing magic about floating point numbers. Even a very 
>good uniform distribution in some interval is implemented by 
>distributing N discrete values over the interval as uniformly as 
>possible. When those N values by some mapping procedure are transformed 
>into a smaller range M, some of those will get at least one more hit 
>than some others, unless M divides N. It doesn't matter whether the 
>mapping procedure is an integer modulo operation or a floating point 
>multiplication + rounding.
>
>/Gunnar
>___
>computer-go mailing list
>computer-go@computer-go.org
>http://www.computer-go.org/mailman/listinfo/computer-go/
--
[EMAIL PROTECTED] (Kato)
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-18 Thread Gunnar Farnebäck


Hideki Kato wrote:

Gunnar Farnebäck: <[EMAIL PROTECTED]>:

Hideki Kato wrote:

I didn't against you, Álvaro, rather I just made a caution for
programmers who will use your pseudo code as is. :)

First, I prefer SFMT (SIMD-oriented Fast Mersenne Twister) rather
than integer pseudo random number generators in practice where the
quality of play-out is important.  Modern processors execute floating
operations as fast as interger ones and
picked = mt_rand() * (double) num_candidates;
is the simplest and safe.

Please note that for uniformity purists this method has exactly the
same problem as good_quality_int_rand() % num_candidates.


Mt_rand() has very good uniform distributions in [0..1)
while
good_quality_int_rand() % num_candidates
doesn't disribute uniformly when num_candidates is not a power of 2, 
assuming good_quality_int_rand() ranges [0..2^32 or so) due to modulo 
operations.  They are not the same, aren't they?


Well, there's nothing magic about floating point numbers. Even a very 
good uniform distribution in some interval is implemented by 
distributing N discrete values over the interval as uniformly as 
possible. When those N values by some mapping procedure are transformed 
into a smaller range M, some of those will get at least one more hit 
than some others, unless M divides N. It doesn't matter whether the 
mapping procedure is an integer modulo operation or a floating point 
multiplication + rounding.


/Gunnar
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-18 Thread Hideki Kato

Gunnar Farnebäck: <[EMAIL PROTECTED]>:
>Hideki Kato wrote:
> > I didn't against you, Álvaro, rather I just made a caution for
> > programmers who will use your pseudo code as is. :)
> >
> > First, I prefer SFMT (SIMD-oriented Fast Mersenne Twister) rather
> > than integer pseudo random number generators in practice where the
> > quality of play-out is important.  Modern processors execute floating
> > operations as fast as interger ones and
> > picked = mt_rand() * (double) num_candidates;
> > is the simplest and safe.
>
>Please note that for uniformity purists this method has exactly the
>same problem as good_quality_int_rand() % num_candidates.

Mt_rand() has very good uniform distributions in [0..1)
while
good_quality_int_rand() % num_candidates
doesn't disribute uniformly when num_candidates is not a power of 2,
assuming good_quality_int_rand() ranges [0..2^32 or so) due to modulo
operations.  They are not the same, aren't they?

-Hideki

>/Gunnar
>___
>computer-go mailing list
>computer-go@computer-go.org
>http://www.computer-go.org/mailman/listinfo/computer-go/
--
[EMAIL PROTECTED] (Kato)
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-18 Thread Gunnar Farnebäck


Hideki Kato wrote:
> I didn't against you, Álvaro, rather I just made a caution for
> programmers who will use your pseudo code as is. :)
>
> First, I prefer SFMT (SIMD-oriented Fast Mersenne Twister) rather
> than integer pseudo random number generators in practice where the
> quality of play-out is important.  Modern processors execute floating
> operations as fast as interger ones and
>picked = mt_rand() * (double) num_candidates;
> is the simplest and safe.

Please note that for uniformity purists this method has exactly the
same problem as good_quality_int_rand() % num_candidates.

/Gunnar
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-14 Thread Jason House


On May 14, 2008, at 3:39 PM, Jeff Nowakowski <[EMAIL PROTECTED]> wrote:

The 10k refers to ten thousand playouts, not rank, and yes it's  
9x9.  As

for open source UCT, off the top of my head there's libego (C++) and
Orego (Java).


HouseBot is open source too (D). I really should add the random  
resampling move selector as an option to my bot. It's really easy to  
implement, I've even done it before...






-Jeff

On Wed, 2008-05-14 at 12:14 -0700, Carter Cheng wrote:
I assume this implies that there arent any open-source basic-UCT  
bots which utilize the basic eye rule and a simple permute and  
retry scheme as described by many ppl on the group? When we speak  
of these sorts of bots playing at about 10kyu I assume what is  
meant is 10kyu at 9x9 not 19x19.



--- On Wed, 5/14/08, [EMAIL PROTECTED] <[EMAIL PROTECTED] 
> wrote:



From: [EMAIL PROTECTED] <[EMAIL PROTECTED]>
Subject: Re: [computer-go] 10k UCT bots
To: computer-go@computer-go.org
Date: Wednesday, May 14, 2008, 10:44 AM

-Original Message-
From: Jacques Basaldúa <[EMAIL PROTECTED]>
To: computer-go@computer-go.org
Sent: Wed, 14 May 2008 6:38 am
Subject: Re: [computer-go] 10k UCT bots




Don Dailey wrote:



[EMAIL PROTECTED] wrote:



For those currently coding this up, I think

the most important thing

about this playout algorithm is that it is
*temporary*. You will  almost certainly

be replacing it with something different and better

just a little bit down the road.  So you

probably don't want to worry

about hair-splitting tweaks except as an

academic exercise.


Yes, I agree. Also my hair brained scheme of

pre-generated tables of

list traversal orderings was just an academic

exercise as you say.


But the problem is that when you do heavy playouts you

have the same

problem except that the probabilities of the legal

moves are no longer equal.

The problem doesn't go away but the trade-offs change
considerably. This is an interesting and relevant
discussion, but if I were trying to code up light MC
playouts for the first time, right now, I would be feeling
that this dead-simple algorithm was actually very difficult
and confusing.

For someone in that position (and only them), my advice is
1. Implement light playouts first. It's simple; you
will find many bugs that way; it's standardized enough
that other people will understand what you're talking
about; it's a fast way to get a basic bot; it will be a
very handy thing to have as a baseline when you test other
things.
2. Get it working the standard way before improving it.
It's your baseline that you'll be testing
improvements against.
3. Make it fast but don't spend excessive effort
optimizing. "Better is the enemy of good
enough."

- Dave
Hillis___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/




___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-14 Thread Jeff Nowakowski

The 10k refers to ten thousand playouts, not rank, and yes it's 9x9.  As
for open source UCT, off the top of my head there's libego (C++) and
Orego (Java).

-Jeff

On Wed, 2008-05-14 at 12:14 -0700, Carter Cheng wrote:
> I assume this implies that there arent any open-source basic-UCT bots which 
> utilize the basic eye rule and a simple permute and retry scheme as described 
> by many ppl on the group? When we speak of these sorts of bots playing at 
> about 10kyu I assume what is meant is 10kyu at 9x9 not 19x19.
> 
> 
> --- On Wed, 5/14/08, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> 
> > From: [EMAIL PROTECTED] <[EMAIL PROTECTED]>
> > Subject: Re: [computer-go] 10k UCT bots
> > To: computer-go@computer-go.org
> > Date: Wednesday, May 14, 2008, 10:44 AM
> > > -Original Message-
> > > From: Jacques Basaldúa <[EMAIL PROTECTED]>
> > > To: computer-go@computer-go.org
> > > Sent: Wed, 14 May 2008 6:38 am
> > > Subject: Re: [computer-go] 10k UCT bots
> > 
> > 
> > > Don Dailey wrote: 
> >  
> > > > [EMAIL PROTECTED] wrote: 
> >  
> > > >> For those currently coding this up, I think
> > the most important thing 
> > >>>  about this playout algorithm is that it is 
> > >> >  *temporary*. You will  almost certainly
> > be replacing it with something different and better 
> > >> > just a little bit down the road.  So you
> > probably don't want to worry 
> > >> > about hair-splitting tweaks except as an
> > academic exercise. 
> > 
> > > Yes, I agree. Also my hair brained scheme of
> > pre-generated tables of
> > > > list traversal orderings was just an academic 
> >  exercise as you say.  
> > 
> > > But the problem is that when you do heavy playouts you
> > have the same 
> > > problem except that the probabilities of the legal
> > moves are no longer equal. 
> > 
> > The problem doesn't go away but the trade-offs change
> > considerably. This is an interesting and relevant
> > discussion, but if I were trying to code up light MC
> > playouts for the first time, right now, I would be feeling
> > that this dead-simple algorithm was actually very difficult
> > and confusing. 
> > 
> > For someone in that position (and only them), my advice is
> > 1. Implement light playouts first. It's simple; you
> > will find many bugs that way; it's standardized enough
> > that other people will understand what you're talking
> > about; it's a fast way to get a basic bot; it will be a
> > very handy thing to have as a baseline when you test other
> > things.
> > 2. Get it working the standard way before improving it.
> > It's your baseline that you'll be testing
> > improvements against.
> > 3. Make it fast but don't spend excessive effort
> > optimizing. "Better is the enemy of good
> > enough." 
> > 
> > - Dave
> > Hillis___
> > computer-go mailing list
> > computer-go@computer-go.org
> > http://www.computer-go.org/mailman/listinfo/computer-go/
> 
> 
>   
> ___
> computer-go mailing list
> computer-go@computer-go.org
> http://www.computer-go.org/mailman/listinfo/computer-go/

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-14 Thread Carter Cheng

I assume this implies that there arent any open-source basic-UCT bots which 
utilize the basic eye rule and a simple permute and retry scheme as described 
by many ppl on the group? When we speak of these sorts of bots playing at about 
10kyu I assume what is meant is 10kyu at 9x9 not 19x19.


--- On Wed, 5/14/08, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:

> From: [EMAIL PROTECTED] <[EMAIL PROTECTED]>
> Subject: Re: [computer-go] 10k UCT bots
> To: computer-go@computer-go.org
> Date: Wednesday, May 14, 2008, 10:44 AM
> > -Original Message-
> > From: Jacques Basaldúa <[EMAIL PROTECTED]>
> > To: computer-go@computer-go.org
> > Sent: Wed, 14 May 2008 6:38 am
> > Subject: Re: [computer-go] 10k UCT bots
> 
> 
> > Don Dailey wrote: 
>  
> > > [EMAIL PROTECTED] wrote: 
>  
> > >> For those currently coding this up, I think
> the most important thing 
> >>>  about this playout algorithm is that it is 
> >> >  *temporary*. You will  almost certainly
> be replacing it with something different and better 
> >> > just a little bit down the road.  So you
> probably don't want to worry 
> >> > about hair-splitting tweaks except as an
> academic exercise. 
> 
> > Yes, I agree. Also my hair brained scheme of
> pre-generated tables of
> > > list traversal orderings was just an academic 
>  exercise as you say.  
> 
> > But the problem is that when you do heavy playouts you
> have the same 
> > problem except that the probabilities of the legal
> moves are no longer equal. 
> 
> The problem doesn't go away but the trade-offs change
> considerably. This is an interesting and relevant
> discussion, but if I were trying to code up light MC
> playouts for the first time, right now, I would be feeling
> that this dead-simple algorithm was actually very difficult
> and confusing. 
> 
> For someone in that position (and only them), my advice is
> 1. Implement light playouts first. It's simple; you
> will find many bugs that way; it's standardized enough
> that other people will understand what you're talking
> about; it's a fast way to get a basic bot; it will be a
> very handy thing to have as a baseline when you test other
> things.
> 2. Get it working the standard way before improving it.
> It's your baseline that you'll be testing
> improvements against.
> 3. Make it fast but don't spend excessive effort
> optimizing. "Better is the enemy of good
> enough." 
> 
> - Dave
> Hillis___
> computer-go mailing list
> computer-go@computer-go.org
> http://www.computer-go.org/mailman/listinfo/computer-go/



___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-14 Thread dhillismail

> -Original Message-
> From: Jacques Basaldúa <[EMAIL PROTECTED]>
> To: computer-go@computer-go.org
> Sent: Wed, 14 May 2008 6:38 am
> Subject: Re: [computer-go] 10k UCT bots

> Don Dailey wrote: 

> > [EMAIL PROTECTED] wrote: 

> >> For those currently coding this up, I think the most important thing 
>>>  about this playout algorithm is that it is 
>> >  *temporary*. You will  almost certainly be replacing it with something 
>>different and better 
>> > just a little bit down the road.  So you probably don't want to worry 
>> > about hair-splitting tweaks except as an academic exercise. 

> Yes, I agree. Also my hair brained scheme of pre-generated tables of
> > list traversal orderings was just an academic   exercise as you say.  

> But the problem is that when you do heavy playouts you have the same 
> problem except that the probabilities of the legal moves are no longer equal. 

The problem doesn't go away but the trade-offs change considerably. This is an 
interesting and relevant discussion, but if I were trying to code up light MC 
playouts for the first time, right now, I would be feeling that this 
dead-simple algorithm was actually very difficult and confusing. 

For someone in that position (and only them), my advice is
1. Implement light playouts first. It's simple; you will find many bugs that 
way; it's standardized enough that other people will understand what you're 
talking about; it's a fast way to get a basic bot; it will be a very handy 
thing to have as a baseline when you test other things.
2. Get it working the standard way before improving it. It's your baseline that 
you'll be testing improvements against.
3. Make it fast but don't spend excessive effort optimizing. "Better is the 
enemy of good enough." 

- Dave Hillis

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-14 Thread Don Dailey




Norbert Gábor Papp wrote:

Thanks for your fast reply,but sorry, I don't really understand this...

The situation -> both player pass, end of the game, I need the score.

I want to remove dead-stones which means :
  
There is no known perfect algorithm for doing this in every case and 
it's a function of how good your program is.   So if you figure out how 
to do it,  please let us know.


Typically, some kind of local search is done and the details of how to 
do that vary from program to program.   Some do it better than 
others.It's not trivial.Search the web,  there are papers and 
articles about it.


If you use tromp/taylor scoring you don't worry about it.   After 2 
passes EVERY stone is considered alive.   If there are dead stones then 
the opponent should not have passed.  

If you are asking because you want to implement monte carlo,  you just 
play until there are no legal eye filling moves, then you pass.   Then 
you score the board as is with no consideration of dead stones. It's 
very simple.Sometimes one side has no moves that fit this criteria 
but the other side does,  so you just play until both players run out of 
non-eye filling moves,  which gives 2 consecutive passes.


A reasonable but slow algorithm for identifying dead stones is to play a 
few hundred random games (without filling eyes) and if existing stones 
rarely or never survive they are dead with a high probability. This 
covers most of the simple and visually intuitive cases.



- Don



if (IsGameEnded) {
for (int i=0, int ,j=0; i:

  

That's a function of how smart your bot is. If you play until you only have
eye-filling moves, you can safely assume all of your opponent's stones are
alive, all your groups with two eyes are alive, and everything else is dead.
Note the asymetry - your opponent may use a different strategy.

If you use random playouts, you could compute the probability of specific
points being owned by each player, and use that for both passing and marking
dead stones.

There are many other variants that use life and death modules, but I'll
assume you don't have them yet

Sent from my iPhone


On May 14, 2008, at 9:10 AM, "Norbert Gábor Papp" <
[EMAIL PROTECTED]> wrote:

 Thanks! How can I identify dead stones?


I haven't seen algorithm for this, and it is a very important part of
a go program
2008/5/14, Don Dailey <[EMAIL PROTECTED]>:

  

This probably explains it better than I could:

 http://senseis.xmp.net/?TrompTaylorRules

- Don



Norbert Gábor Papp wrote:



Hi!

Can you tell me some algorithm to compute the score ? (Both players
pass,
and who is the winner...)

Thanks, Norbert




___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

  

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

 ___


computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

  

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/




  



___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

RE: [computer-go] 10k UCT bots

2008-05-14 Thread David Fotland

Is it true that most programs using heavy playouts use a probability
distribution on the moves to pick the next move?  I thought Mogo needed a
uniform distribution of nonlocal moves to make RAVE work well.

David

> 
> But the problem is that when you do heavy playouts you have the same
> problem except that the probabilities of the legal moves are no longer
> equal. Unless, of course, the board system keeps a list of legal moves,
> not just empty points and own eyes in playout mode have zero score
> which
> is the same as if they were illegal moves. Am I the only one doing
> this?
> 
> Jacques.
> 


___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-14 Thread Hideki Kato

I didn't against you, Álvaro, rather I just made a caution for
programmers who will use your pseudo code as is. :)

First, I prefer SFMT (SIMD-oriented Fast Mersenne Twister) rather
than integer pseudo random number generators in practice where the
quality of play-out is important.  Modern processors execute floating
operations as fast as interger ones and
picked = mt_rand() * (double) num_candidates;
is the simplest and safe. 
http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/SFMT/index.html
#Converting int to double is, in fact, not so fast that
using double num_candidate through out the code may be faster.

When using integer pseudo random number generators and keeping exact
uniform distribution, following code can be used (slower twice).
This discards extra random numbers that bias the distribution.
#As Álvaro wrote, if n << RAND_MAX the bias is very small and may be
ignored.

unsigned long exactly_uniform_random(unsigned long n) { 
  unsigned long m, r; 
  if  (n == 0) return 0; 
  m = RAND_MAX % n; 
  do { 
r = rand(); 
  } while (r <= m) ; 
  return r % n; 
} 

In addition, xor_shift is better than builtin rand() and faster and
much smaller than MT.  #Hiroshi, the author of Aya, uses this.
http://en.wikipedia.org/wiki/Xorshift
http://www.jstatsoft.org/v08/i14/paper

unsigned long xorshiftrand(void) { 
  static unsigned long
x=123456789, y=362436069,
z=521288629, w=88675123; 
  unsigned long t; 
  t = x ^ (x << 11);
  x = y; y = z; z = w;
  return w = (w ^ (w >> 19)) ^ (t ^ (t >> 8)); 
} 

-Hideki

Álvaro Begué: <[EMAIL PROTECTED]>:
>On Tue, May 13, 2008 at 11:57 PM, Hideki Kato <[EMAIL PROTECTED]> wrote:
>>
>>  Álvaro Begué: <[EMAIL PROTECTED]>:
>>  >Ooops! I hit sent before I finished writing the pseudo code. Sorry.
>>  >
>>  >int pick(Move *empties, int num_empties) {
>>  > int num_candidates = num_empties;
>>  > int picked;
>>  >
>>  > while(1) {
>>  >   picked = rand()%num_candidates;
>>
>>  This code introduces few bias unless num_candidates is a power of two,
>>  though.
>>  #Assuming rand() returns [0 .. power of two).
>
>Oh, please. I should probably just take that as a provocation and
>ignore it, but I am weak. :)
>
>The pseudo-code I posted was meant to illustrate the process by which
>you move an element to the end and then exclude it from the lottery.
>rand()%num_candidates is just a quick way of telling the reader "pick
>a random integer in [0,num_candidates)".
>
>Besides, some people didn't care if a point had probability that was
>twice as large as the rest. In my system, RAND_MAX is 2147483647,
>which means that in the worst case, some points will have a
>probability that is a factor of 5948709/5948708=1.0016810372941485
>larger than the rest. Even I can live with that.
>
>Preemptive argument: Now someone will point out that the last few bits
>of rand() may not be very random in some common implementations, so in
>the case where num_candidates is a power of two I may have some
>biases. Please, reread two paragraphs above, or substitute rand() with
>the Mersenne twister.
>
>
>Álvaro.
>___
>computer-go mailing list
>computer-go@computer-go.org
>http://www.computer-go.org/mailman/listinfo/computer-go/
--
[EMAIL PROTECTED] (Kato)
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-14 Thread Carter Cheng

How do L&D modules generally function? Is this discussed in the literature 
somewhere? The only open source L&D module I am aware of is the one in GNU-go 
and I am not certain how good or bad it is since my own playing strength isn't 
that good. I have found some papers on this topic but most do not go into much 
detail on how they are evaluating L&D and most just discuss performance issues 
and general algorithmic approaches (like proof-number-search). 

Regards,

Carter.


  
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-14 Thread Álvaro Begué

On Wed, May 14, 2008 at 10:12 AM, Heikki Levanto <[EMAIL PROTECTED]> wrote:
> On Wed, May 14, 2008 at 03:47:55PM +0200, Norbert Gábor Papp wrote:
>> I want to remove dead-stones which means :
>> [...]
>> I'm interested in the function dead(), which is true when a stone is dead
>> after both player pass,and the game is ended.
>
> The simple answer is that there is no such function!

In Tromp/Taylor rules, it's really easy to implement that function:
All stones are alive. If you are playing and you think some of the
opponent's stones are dead, don't pass; capture them instead.
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-14 Thread Heikki Levanto

On Wed, May 14, 2008 at 03:47:55PM +0200, Norbert Gábor Papp wrote:
> I want to remove dead-stones which means :
> [...]
> I'm interested in the function dead(), which is true when a stone is dead
> after both player pass,and the game is ended.

The simple answer is that there is no such function!

Most Monte-Carlo programs play the game out to the bitter end, when they can
assume that all stones on board are alive, and all territory is segmented
into one-point eyes. Then it is trivial to see who owns what.

The alternative is to collect the stones into groups, and check which groups
have (or can make) two eyes. This gets quite complex. Even humans make the
occasional mistake in evaluating the position after both have passed,
especially beginners.

Some say that the game should never reach the point of counting, since at
some point the loosing part will see that he can not win, and will resign on
the spot. I think it will be a long time before programs can do that with
confidence...

-H

-- 
Heikki Levanto   "In Murphy We Turst" heikki (at) lsd (dot) dk

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-14 Thread Jason House

On May 14, 2008, at 9:47 AM, "Norbert Gábor Papp" <[EMAIL PROTECTED] 
m> wrote:


Thanks for your fast reply,but sorry, I don't really understand  
this...


The situation -> both player pass, end of the game, I need the score.

I want to remove dead-stones which means :

if (IsGameEnded) {
for (int i=0, int ,j=0; iI'm interested in the function dead(), which is true when a stone is  
dead after both player pass,and the game is ended.


A proper answer requires knowing what your pass criteria is and  
possibly which logic is available. If you pass when your remaining  
options are illegal or fill one point eyes, then the logic is  
relatively simple 99% of the time... All stones in atari are dead. Ko  
messes that up if your opponent passes instead filling a ko.







2008/5/14 Jason House <[EMAIL PROTECTED]>:
That's a function of how smart your bot is. If you play until you  
only have eye-filling moves, you can safely assume all of your  
opponent's stones are alive, all your groups with two eyes are  
alive, and everything else is dead. Note the asymetry - your  
opponent may use a different strategy.


If you use random playouts, you could compute the probability of  
specific points being owned by each player, and use that for both  
passing and marking dead stones.


There are many other variants that use life and death modules, but  
I'll assume you don't have them yet


Sent from my iPhone


On May 14, 2008, at 9:10 AM, "Norbert Gábor Papp" <[EMAIL PROTECTED] 
m> wrote:


Thanks! How can I identify dead stones?
I haven't seen algorithm for this, and it is a very important part of
a go program
2008/5/14, Don Dailey <[EMAIL PROTECTED]>:

This probably explains it better than I could:

 http://senseis.xmp.net/?TrompTaylorRules

- Don



Norbert Gábor Papp wrote:
Hi!

Can you tell me some algorithm to compute the score ? (Both players  
pass,

and who is the winner...)

Thanks, Norbert


--- 
-


___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
___
computer-go mailing list
computer-go@computer-go.org
http:
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-14 Thread Norbert Gábor Papp

Thanks for your fast reply,but sorry, I don't really understand this...

The situation -> both player pass, end of the game, I need the score.

I want to remove dead-stones which means :

if (IsGameEnded) {
for (int i=0, int ,j=0; i:

> That's a function of how smart your bot is. If you play until you only have
> eye-filling moves, you can safely assume all of your opponent's stones are
> alive, all your groups with two eyes are alive, and everything else is dead.
> Note the asymetry - your opponent may use a different strategy.
>
> If you use random playouts, you could compute the probability of specific
> points being owned by each player, and use that for both passing and marking
> dead stones.
>
> There are many other variants that use life and death modules, but I'll
> assume you don't have them yet
>
> Sent from my iPhone
>
>
> On May 14, 2008, at 9:10 AM, "Norbert Gábor Papp" <
> [EMAIL PROTECTED]> wrote:
>
>  Thanks! How can I identify dead stones?
>> I haven't seen algorithm for this, and it is a very important part of
>> a go program
>> 2008/5/14, Don Dailey <[EMAIL PROTECTED]>:
>>
>>>
>>> This probably explains it better than I could:
>>>
>>>  http://senseis.xmp.net/?TrompTaylorRules
>>>
>>> - Don
>>>
>>>
>>>
>>> Norbert Gábor Papp wrote:
>>>
 Hi!

 Can you tell me some algorithm to compute the score ? (Both players
 pass,
 and who is the winner...)

 Thanks, Norbert


 

 ___
 computer-go mailing list
 computer-go@computer-go.org
 http://www.computer-go.org/mailman/listinfo/computer-go/

>>> ___
>>> computer-go mailing list
>>> computer-go@computer-go.org
>>> http://www.computer-go.org/mailman/listinfo/computer-go/
>>>
>>>  ___
>> computer-go mailing list
>> computer-go@computer-go.org
>> http://www.computer-go.org/mailman/listinfo/computer-go/
>>
> ___
> computer-go mailing list
> computer-go@computer-go.org
> http://www.computer-go.org/mailman/listinfo/computer-go/
>
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-14 Thread Jason House

That's a function of how smart your bot is. If you play until you only  
have eye-filling moves, you can safely assume all of your opponent's  
stones are alive, all your groups with two eyes are alive, and  
everything else is dead. Note the asymetry - your opponent may use a  
different strategy.


If you use random playouts, you could compute the probability of  
specific points being owned by each player, and use that for both  
passing and marking dead stones.


There are many other variants that use life and death modules, but  
I'll assume you don't have them yet


Sent from my iPhone

On May 14, 2008, at 9:10 AM, "Norbert Gábor Papp" <[EMAIL PROTECTED] 
m> wrote:



Thanks! How can I identify dead stones?
I haven't seen algorithm for this, and it is a very important part of
a go program
2008/5/14, Don Dailey <[EMAIL PROTECTED]>:


This probably explains it better than I could:

  http://senseis.xmp.net/?TrompTaylorRules

- Don



Norbert Gábor Papp wrote:

Hi!

Can you tell me some algorithm to compute the score ? (Both  
players pass,

and who is the winner...)

Thanks, Norbert


--- 
--- 
--


___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-14 Thread Norbert Gábor Papp

Thanks! How can I identify dead stones?
I haven't seen algorithm for this, and it is a very important part of
a go program
2008/5/14, Don Dailey <[EMAIL PROTECTED]>:
>
> This probably explains it better than I could:
>
>http://senseis.xmp.net/?TrompTaylorRules
>
> - Don
>
>
>
> Norbert Gábor Papp wrote:
> > Hi!
> >
> > Can you tell me some algorithm to compute the score ? (Both players pass,
> > and who is the winner...)
> >
> > Thanks, Norbert
> >
> >
> > 
> >
> > ___
> > computer-go mailing list
> > computer-go@computer-go.org
> > http://www.computer-go.org/mailman/listinfo/computer-go/
> ___
> computer-go mailing list
> computer-go@computer-go.org
> http://www.computer-go.org/mailman/listinfo/computer-go/
>
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-14 Thread Don Dailey



This probably explains it better than I could:

  http://senseis.xmp.net/?TrompTaylorRules

- Don



Norbert Gábor Papp wrote:

Hi!

Can you tell me some algorithm to compute the score ? (Both players pass,
and who is the winner...)

Thanks, Norbert

  



___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-14 Thread Norbert Gábor Papp

Hi!

Can you tell me some algorithm to compute the score ? (Both players pass,
and who is the winner...)

Thanks, Norbert
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-14 Thread Jacques Basaldúa


Don Dailey wrote:


[EMAIL PROTECTED] wrote:


For those currently coding this up, I think the most important thing 
about this playout algorithm is that it is *temporary*. You will 
almost certainly be?replacing it with something different and better 
just a little bit down the road. So you probably don't want to worry 
about hair-splitting tweaks except as an academic exercise.
  


Yes,  I agree. Also  my hair brained scheme of pre-generated tables of 
list traversal orderings was just an academic exercise as you say. 


But the problem is that when you do heavy playouts you have the same
problem except that the probabilities of the legal moves are no longer 
equal. Unless, of course, the board system keeps a list of legal moves, 
not just empty points and own eyes in playout mode have zero score which

is the same as if they were illegal moves. Am I the only one doing this?

I don't know if the impact is measurable in strength or not. But as the
game advances, the own eyes are found in the root position. When that
happens, the bias applies to the same move in all the playouts. In this
case the program is favoring systematically the exploration of a move
for no reason at all and it is always the same move. Since the number 
of simulations is limited, that pays less attention to other moves.

I don't know if it is bad, but at least it is ugly. ;-)

Jacques.

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-14 Thread Álvaro Begué

On Tue, May 13, 2008 at 11:57 PM, Hideki Kato <[EMAIL PROTECTED]> wrote:
>
>  Álvaro Begué: <[EMAIL PROTECTED]>:
>  >Ooops! I hit sent before I finished writing the pseudo code. Sorry.
>  >
>  >int pick(Move *empties, int num_empties) {
>  > int num_candidates = num_empties;
>  > int picked;
>  >
>  > while(1) {
>  >   picked = rand()%num_candidates;
>
>  This code introduces few bias unless num_candidates is a power of two,
>  though.
>  #Assuming rand() returns [0 .. power of two).

Oh, please. I should probably just take that as a provocation and
ignore it, but I am weak. :)

The pseudo-code I posted was meant to illustrate the process by which
you move an element to the end and then exclude it from the lottery.
rand()%num_candidates is just a quick way of telling the reader "pick
a random integer in [0,num_candidates)".

Besides, some people didn't care if a point had probability that was
twice as large as the rest. In my system, RAND_MAX is 2147483647,
which means that in the worst case, some points will have a
probability that is a factor of 5948709/5948708=1.0016810372941485
larger than the rest. Even I can live with that.

Preemptive argument: Now someone will point out that the last few bits
of rand() may not be very random in some common implementations, so in
the case where num_candidates is a power of two I may have some
biases. Please, reread two paragraphs above, or substitute rand() with
the Mersenne twister.

Álvaro.
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-14 Thread Joel Veness

I was waiting for this one... :)

Joel

On Wed, May 14, 2008 at 1:57 PM, Hideki Kato <[EMAIL PROTECTED]> wrote:
>
>  Álvaro Begué: <[EMAIL PROTECTED]>:
>
> >Ooops! I hit sent before I finished writing the pseudo code. Sorry.
>  >
>  >int pick(Move *empties, int num_empties) {
>  > int num_candidates = num_empties;
>  > int picked;
>  >
>  > while(1) {
>  >   picked = rand()%num_candidates;
>
>  This code introduces few bias unless num_candidates is a power of two,
>  though.
>  #Assuming rand() returns [0 .. power of two).
>
>  -Hideki
>
>
>
>  >   if(!acceptable(empties[picked])) {
>  > num_candidates--;
>  > swap(empties[picked],empties[num_candidates]);
>  >   }
>  >   else
>  > break;
>  > }
>  > return picked;
>  >}
>  >
>  >
>  >
>  >On Tue, May 13, 2008 at 1:57 PM, Álvaro Begué <[EMAIL PROTECTED]> wrote:
>  >> On Tue, May 13, 2008 at 1:51 PM, Mark Boon <[EMAIL PROTECTED]> wrote:
>  >>  >
>  >>  > On 13-mei-08, at 14:10, Álvaro Begué wrote:
>  >>  >
>  >>  >
>  >>  > What others do is the right thing to do. Your method will introduce
>  >>  >
>  >>  > some biases.
>  >>  > Could you elaborate what bias it could lead to? I also do the same as 
> Jason.
>  >>  > I did consider the possibility of a bias but couldn't immediately 
> think of
>  >>  > one.
>  >>
>  >>  This has been explained already. After the first eye appears on the
>  >>  board, the first empty point after the eye has a probability of being
>  >>  picked that is twice the probability of any other empty point.
>  >>
>  >>
>  >>  > What good does moving it to the end of the list do? Next time around 
> it's
>  >>  > just as likely to be picked as when left in place. Or do you only 
> process
>  >>  > the 'end' after the 'front' moves are all tried?
>  >>
>  >>  You move it to the end and you reduce the number of candidates by one.
>  >>  The code is roughly this:
>  >>
>  >>  int pick(Move *empties, int num_empties) {
>  >>   int num_candidates = num_empties;
>  >>   int picked;
>  >>
>  >>   do {
>  >> picked = rand()%num_candidates;
>  >> num_candidates--;
>  >>   } while(!acceptable(empties[picked]);
>  >>
>  >>   return picked;
>  >>  }
>  >>
>  >>
>  >>  Álvaro.
>  >>
>  >___
>  >computer-go mailing list
>  >computer-go@computer-go.org
>  >http://www.computer-go.org/mailman/listinfo/computer-go/
>  --
>  [EMAIL PROTECTED] (Kato)
>
>
> ___
>  computer-go mailing list
>  computer-go@computer-go.org
>  http://www.computer-go.org/mailman/listinfo/computer-go/
>
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-13 Thread Hideki Kato


Álvaro Begué: <[EMAIL PROTECTED]>:
>Ooops! I hit sent before I finished writing the pseudo code. Sorry.
>
>int pick(Move *empties, int num_empties) {
> int num_candidates = num_empties;
> int picked;
>
> while(1) {
>   picked = rand()%num_candidates;

This code introduces few bias unless num_candidates is a power of two,
though.
#Assuming rand() returns [0 .. power of two).

-Hideki

>   if(!acceptable(empties[picked])) {
> num_candidates--;
> swap(empties[picked],empties[num_candidates]);
>   }
>   else
> break;
> }
> return picked;
>}
>
>
>
>On Tue, May 13, 2008 at 1:57 PM, Álvaro Begué <[EMAIL PROTECTED]> wrote:
>> On Tue, May 13, 2008 at 1:51 PM, Mark Boon <[EMAIL PROTECTED]> wrote:
>>  >
>>  > On 13-mei-08, at 14:10, Álvaro Begué wrote:
>>  >
>>  >
>>  > What others do is the right thing to do. Your method will introduce
>>  >
>>  > some biases.
>>  > Could you elaborate what bias it could lead to? I also do the same as 
>> Jason.
>>  > I did consider the possibility of a bias but couldn't immediately think of
>>  > one.
>>
>>  This has been explained already. After the first eye appears on the
>>  board, the first empty point after the eye has a probability of being
>>  picked that is twice the probability of any other empty point.
>>
>>
>>  > What good does moving it to the end of the list do? Next time around it's
>>  > just as likely to be picked as when left in place. Or do you only process
>>  > the 'end' after the 'front' moves are all tried?
>>
>>  You move it to the end and you reduce the number of candidates by one.
>>  The code is roughly this:
>>
>>  int pick(Move *empties, int num_empties) {
>>   int num_candidates = num_empties;
>>   int picked;
>>
>>   do {
>> picked = rand()%num_candidates;
>> num_candidates--;
>>   } while(!acceptable(empties[picked]);
>>
>>   return picked;
>>  }
>>
>>
>>  Álvaro.
>>
>___
>computer-go mailing list
>computer-go@computer-go.org
>http://www.computer-go.org/mailman/listinfo/computer-go/
--
[EMAIL PROTECTED] (Kato)
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-13 Thread Heikki Levanto

On Tue, May 13, 2008 at 01:34:37PM -0400, Don Dailey wrote:
> The point after an illegal move is quite a bit more likely to be 
> selected.   If the list had just 1 illegal point, then the point after 
> it in the list is twice as likely to be selected as any other point.
> Perhaps if you added a little noise after each move (perhaps swapping to 
> points at random)  you could minimize any seriously bad effects?  

I think the problem becomes more pronounced near the end of the simulation,
when most of the boars is filled with stones or 1-point eyes. If, at that
time, there are two legal moves left, one near the beginning of the list, and
one near the end, then the second one can end up as much more likely to be
chosen. And if the first one is the key point to the position, it may never
even be noticed.

No idea how likely this scenario is in real games, but at least it is
possible. And if your search tree is large enough, once-in-a-thousand
situations occur all the time...

-H

-- 
Heikki Levanto   "In Murphy We Turst" heikki (at) lsd (dot) dk

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-13 Thread Don Dailey




[EMAIL PROTECTED] wrote:

For those currently coding this up, I think the most important thing about this 
playout algorithm is that it is *temporary*. You will almost certainly 
be?replacing it with something different and better just a little bit down the 
road.

Creating an MC-UCT bot has a well worn path and its kind of an ontology 
recapitulates phylogeny deal. First you implement light playouts (the random 
non-eyefilling moves people are describing), then you implement UCT, then you 
throw away the light playouts and replace them?with heavy playouts, then you 
start extreme modifications to UCT...

So you probably don't want to worry about hair-splitting tweaks except as an 
academic exercise.
  
Yes,  I agree.   Also  my hair brained scheme of pre-generated tables of 
list traversal orderings was just an academic exercise as you say.  


- Don




- Dave Hillis


-Original Message-
From: Christoph Birk <[EMAIL PROTECTED]>
To: computer-go 
Sent: Tue, 13 May 2008 3:40 pm
Subject: Re: [computer-go] 10k UCT bots


On Tue, 13 May 2008, Mark Boon wrote:?
  

If this asymmetry really bothers you, you could very easily fix this by?
wrapping the search around. There's no asymmetry in a circle.?


That doesn't fix anything.?
  

?
Why not? The whole argument is about a bias against points towards the end. > 
In a circular list there is no 'end'.?


?
No, it was a bias towards moves "behind" illegal moves.?
Those moves are twice as likely to be played than other moves. Consider a list 
with 5 moves:?
?
[Move1] [Move2] [Move3] [Move4] [Move5]?
?
You create a random number between 1 and 5. If Move2 is illegeal?
for example, then you will play?
?Move1 if random#=1?
?Move3 if random#=2 or 3,?
?Move4 =4?
?Move5 =5?
?
Move3 is twice as likely to be played. Even if you make a circular?
list.?
?
Christoph?
?
___?
computer-go mailing list?
[EMAIL PROTECTED]
http://www.computer-go.org/mailman/listinfo/computer-go/?


  



___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-13 Thread dhillismail

For those currently coding this up, I think the most important thing about this 
playout algorithm is that it is *temporary*. You will almost certainly 
be?replacing it with something different and better just a little bit down the 
road.

Creating an MC-UCT bot has a well worn path and its kind of an ontology 
recapitulates phylogeny deal. First you implement light playouts (the random 
non-eyefilling moves people are describing), then you implement UCT, then you 
throw away the light playouts and replace them?with heavy playouts, then you 
start extreme modifications to UCT...

So you probably don't want to worry about hair-splitting tweaks except as an 
academic exercise.

- Dave Hillis

-Original Message-
From: Christoph Birk <[EMAIL PROTECTED]>
To: computer-go 
Sent: Tue, 13 May 2008 3:40 pm
Subject: Re: [computer-go] 10k UCT bots

On Tue, 13 May 2008, Mark Boon wrote:?
>>> If this asymmetry really bothers you, you could very easily fix this by?
>>> wrapping the search around. There's no asymmetry in a circle.?
>> >> That doesn't fix anything.?
>?
> Why not? The whole argument is about a bias against points towards the end. > 
> In a circular list there is no 'end'.?
?
No, it was a bias towards moves "behind" illegal moves.?
Those moves are twice as likely to be played than other moves. Consider a list 
with 5 moves:?
?
[Move1] [Move2] [Move3] [Move4] [Move5]?
?
You create a random number between 1 and 5. If Move2 is illegeal?
for example, then you will play?
?Move1 if random#=1?
?Move3 if random#=2 or 3,?
?Move4 =4?
?Move5 =5?
?
Move3 is twice as likely to be played. Even if you make a circular?
list.?
?
Christoph?
?
___?
computer-go mailing list?
[EMAIL PROTECTED]
http://www.computer-go.org/mailman/listinfo/computer-go/?

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-13 Thread Jason House


On May 13, 2008, at 3:20 PM, Don Dailey <[EMAIL PROTECTED]> wrote:





I don't care much about it being noticeable. This thread is about
putting bots on CGOS that use a reproducible algorithm, to help  
people
detect bugs in their implementations. As part of specifying what  
these
bots do, we should all pick the next move in a playout using the  
same

criteria. If we agree to use uniform distribution among empty
non-eyeish points, that's what should be implemented.

I agree with this.   If we get too clever we lose confidence that  
we are comparing apples to apples by risking bugs and making  
assumptions that may not hold.



I take an opposite approach: the data will tell us when we are  
comparing apples and oranges. It will help us identify the major  
factors leading to strength. There's probably more subtle variation  
than we realize...


But when someone is explicitly asking for a direct comparison,   you  
don't want to just throw in your own magic and then try to sort it  
out later. You are probably looking for implementation bugs, not  
implementation differences.   That has it's place but not in this  
context.


Is this not in the context of the original post that started this  
thread? I thought I understood the OP pretty well...




- Don








- Don




I would imagine moving an illegal point towards the end and only  
start
including it when the other 'legal' moves run out can lead to  
terrible bias
however because they may not remain illegal for very long and  
actually
become important points to play. A ko-point is probably the most  
extreme

example of that.



I don't think you understood the algorithm. The eyeish point is
removed from the lottery only for picking this particular move, not
for the rest of the playout.


Anyway, I don't bother to order the empty-point-list or scramble  
them in any
way prior to the game. So the first point is the 1-1 point and  
the last is
the 19-19 point (or whatever boardsize you're playing) so I have  
no qualms
about those moves being a little less likely to be played. Or  
even a lot

less. I think it would actually be beneficial.



Reproducibility was the point, not strength of the bot.


If this asymmetry really bothers you, you could very easily fix  
this by

wrapping the search around. There's no asymmetry in a circle.



That doesn't fix anything.


Álvaro.
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/



___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-13 Thread Don Dailey




Mark Boon wrote:


On 13-mei-08, at 16:17, Don Dailey wrote:

I missed this from you.   I assumed that you did this anyway.   If 
you choose a random point and then traverse linearly to the end,  
what do you do when you reach the end? Do you just pass?I 
assumed you viewed the empty point list as a circular queue.


No I didn't. When I  reach the end I traverse in the opposite 
direction, starting from the originally chosen location. Using a 
circular list may actually be simpler.
It's probably simpler.   You have to choose between using branching or 
division:


   nextLoc = curLoc % LIST_SIZE;

or

  if (nextLoc > MAX_ELEMENT) nextLoc = firstLoc;

- Don




Mark





___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-13 Thread Mark Boon



On 13-mei-08, at 16:17, Don Dailey wrote:

I missed this from you.   I assumed that you did this anyway.   If  
you choose a random point and then traverse linearly to the end,   
what do you do when you reach the end? Do you just pass?I  
assumed you viewed the empty point list as a circular queue.


No I didn't. When I  reach the end I traverse in the opposite  
direction, starting from the originally chosen location. Using a  
circular list may actually be simpler.


Mark


___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-13 Thread Christoph Birk


On Tue, 13 May 2008, Mark Boon wrote:

If this asymmetry really bothers you, you could very easily fix this by
wrapping the search around. There's no asymmetry in a circle.


That doesn't fix anything.


Why not? The whole argument is about a bias against points towards the end. 
In a circular list there is no 'end'.


No, it was a bias towards moves "behind" illegal moves.
Those moves are twice as likely to be played than other moves. 
Consider a list with 5 moves:


[Move1] [Move2] [Move3] [Move4] [Move5]

You create a random number between 1 and 5. If Move2 is illegeal
for example, then you will play
 Move1 if random#=1
 Move3 if random#=2 or 3,
 Move4   =4
 Move5   =5

Move3 is twice as likely to be played. Even if you make a circular
list.

Christoph

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-13 Thread Álvaro Begué

On Tue, May 13, 2008 at 3:08 PM, Mark Boon <[EMAIL PROTECTED]> wrote:
>
>  On 13-mei-08, at 15:44, Álvaro Begué wrote:
>
>
> > On Tue, May 13, 2008 at 2:28 PM, Mark Boon <[EMAIL PROTECTED]>
> wrote:
> >
> > >
> > > On 13-mei-08, at 15:08, Jason House wrote:
> > >
> > > The range of the random number is reduced by one after each failed
> lookup.
> > > Shuffled data has no impact on future use of the array of empty points.
> > >
> > > OK, I understand now why a point at the end (or beginning) is a little
> less
> > > likely to be picked. Although I still have doubts whether that will lead
> to
> > > a noticable bias, I'll try to think about it.
> > >
> >
> > I don't care much about it being noticeable. This thread is about
> > putting bots on CGOS that use a reproducible algorithm, to help people
> > detect bugs in their implementations. As part of specifying what these
> > bots do, we should all pick the next move in a playout using the same
> > criteria. If we agree to use uniform distribution among empty
> > non-eyeish points, that's what should be implemented.
> >
> >
>
>  Indeed it seems I misunderstood. I thought the general algorithm needed to
> be the same, not necessarily the exact implementation. Why not publish some
> pseudo-code with the exact statements then? Seems a bit less prone to errors
> then by talking about it.

No, we don't need to all use the same implementations. You can
implement uniformly picking a random non-eyeish point any way you
like. I actually don't use the method we've talked about. The problem
with the other method is that it implements something else.

> > > I would imagine moving an illegal point towards the end and only start
> > > including it when the other 'legal' moves run out can lead to terrible
> bias
> > > however because they may not remain illegal for very long and actually
> > > become important points to play. A ko-point is probably the most extreme
> > > example of that.
> > >
> >
> > I don't think you understood the algorithm. The eyeish point is
> > removed from the lottery only for picking this particular move, not
> > for the rest of the playout.
> >
>
>  Yes, again I misunderstood. So you do another random lookup each time you
> hit an illegal point? That could turn out very expensive, especially by the
> end of the game.

You can try it. It's actually not that expensive. A long time ago I
modified libego to use this method and it didn't slow it down much
(sorry, I can't remember the fraction of speed lost).

> > > Anyway, I don't bother to order the empty-point-list or scramble them in
> any
> > > way prior to the game. So the first point is the 1-1 point and the last
> is
> > > the 19-19 point (or whatever boardsize you're playing) so I have no
> qualms
> > > about those moves being a little less likely to be played. Or even a lot
> > > less. I think it would actually be beneficial.
> > >
> >
> > Reproducibility was the point, not strength of the bot.
> >
>
>  I'd say, share the source if that's the objective. I doubt you'll get real
> reproducibility any other way.

Why not? The only things we haven't clearly specified are the UCT
formula to use and how we end playouts (although there is only one
natural choice for this last part, I believe).

> > > If this asymmetry really bothers you, you could very easily fix this by
> > > wrapping the search around. There's no asymmetry in a circle.
> > >
> >
> > That doesn't fix anything.
> >
>
>  Why not? The whole argument is about a bias against points towards the end.
> In a circular list there is no 'end'.

No, the whole argument is about bias towards points that happen to be
immediately after eyeish or illegal points.


Álvaro.
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-13 Thread Don Dailey





I don't care much about it being noticeable. This thread is about
putting bots on CGOS that use a reproducible algorithm, to help people
detect bugs in their implementations. As part of specifying what these
bots do, we should all pick the next move in a playout using the same
criteria. If we agree to use uniform distribution among empty
non-eyeish points, that's what should be implemented.

I agree with this.   If we get too clever we lose confidence that we 
are comparing apples to apples by risking bugs and making assumptions 
that may not hold.



I take an opposite approach: the data will tell us when we are 
comparing apples and oranges. It will help us identify the major 
factors leading to strength. There's probably more subtle variation 
than we realize...


But when someone is explicitly asking for a direct comparison,   you 
don't want to just throw in your own magic and then try to sort it out 
later. You are probably looking for implementation bugs, not 
implementation differences.   That has it's place but not in this context.


- Don








- Don





I would imagine moving an illegal point towards the end and only start
including it when the other 'legal' moves run out can lead to 
terrible bias

however because they may not remain illegal for very long and actually
become important points to play. A ko-point is probably the most 
extreme

example of that.



I don't think you understood the algorithm. The eyeish point is
removed from the lottery only for picking this particular move, not
for the rest of the playout.


Anyway, I don't bother to order the empty-point-list or scramble 
them in any
way prior to the game. So the first point is the 1-1 point and the 
last is
the 19-19 point (or whatever boardsize you're playing) so I have no 
qualms
about those moves being a little less likely to be played. Or even 
a lot

less. I think it would actually be beneficial.



Reproducibility was the point, not strength of the bot.


If this asymmetry really bothers you, you could very easily fix 
this by

wrapping the search around. There's no asymmetry in a circle.



That doesn't fix anything.


Álvaro.
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/



___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-13 Thread Don Dailey






If this asymmetry really bothers you, you could very easily fix this by
wrapping the search around. There's no asymmetry in a circle.


That doesn't fix anything.


Why not? The whole argument is about a bias against points towards the 
end. In a circular list there is no 'end'.


I missed this from you.   I assumed that you did this anyway.   If you 
choose a random point and then traverse linearly to the end,  what do 
you do when you reach the end? Do you just pass?I assumed you 
viewed the empty point list as a circular queue.


- Don




Mark


___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-13 Thread Jason House


On May 13, 2008, at 3:02 PM, Don Dailey <[EMAIL PROTECTED]> wrote:




Álvaro Begué wrote:
On Tue, May 13, 2008 at 2:28 PM, Mark Boon  
<[EMAIL PROTECTED]> wrote:



On 13-mei-08, at 15:08, Jason House wrote:

The range of the random number is reduced by one after each failed  
lookup.
Shuffled data has no impact on future use of the array of empty  
points.


OK, I understand now why a point at the end (or beginning) is a  
little less
likely to be picked. Although I still have doubts whether that  
will lead to

a noticable bias, I'll try to think about it.



I don't care much about it being noticeable. This thread is about
putting bots on CGOS that use a reproducible algorithm, to help  
people
detect bugs in their implementations. As part of specifying what  
these

bots do, we should all pick the next move in a playout using the same
criteria. If we agree to use uniform distribution among empty
non-eyeish points, that's what should be implemented.

I agree with this.   If we get too clever we lose confidence that we  
are comparing apples to apples by risking bugs and making  
assumptions that may not hold.



I take an opposite approach: the data will tell us when we are  
comparing apples and oranges. It will help us identify the major  
factors leading to strength. There's probably more subtle variation  
than we realize...






- Don




I would imagine moving an illegal point towards the end and only  
start
including it when the other 'legal' moves run out can lead to  
terrible bias
however because they may not remain illegal for very long and  
actually
become important points to play. A ko-point is probably the most  
extreme

example of that.



I don't think you understood the algorithm. The eyeish point is
removed from the lottery only for picking this particular move, not
for the rest of the playout.


Anyway, I don't bother to order the empty-point-list or scramble  
them in any
way prior to the game. So the first point is the 1-1 point and the  
last is
the 19-19 point (or whatever boardsize you're playing) so I have  
no qualms
about those moves being a little less likely to be played. Or even  
a lot

less. I think it would actually be beneficial.



Reproducibility was the point, not strength of the bot.


If this asymmetry really bothers you, you could very easily fix  
this by

wrapping the search around. There's no asymmetry in a circle.



That doesn't fix anything.


Álvaro.
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/



___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-13 Thread Mark Boon



On 13-mei-08, at 15:44, Álvaro Begué wrote:

On Tue, May 13, 2008 at 2:28 PM, Mark Boon  
<[EMAIL PROTECTED]> wrote:


On 13-mei-08, at 15:08, Jason House wrote:

The range of the random number is reduced by one after each failed  
lookup.
Shuffled data has no impact on future use of the array of empty  
points.


OK, I understand now why a point at the end (or beginning) is a  
little less
likely to be picked. Although I still have doubts whether that  
will lead to

a noticable bias, I'll try to think about it.


I don't care much about it being noticeable. This thread is about
putting bots on CGOS that use a reproducible algorithm, to help people
detect bugs in their implementations. As part of specifying what these
bots do, we should all pick the next move in a playout using the same
criteria. If we agree to use uniform distribution among empty
non-eyeish points, that's what should be implemented.



Indeed it seems I misunderstood. I thought the general algorithm  
needed to be the same, not necessarily the exact implementation. Why  
not publish some pseudo-code with the exact statements then? Seems a  
bit less prone to errors then by talking about it.


I would imagine moving an illegal point towards the end and only  
start
including it when the other 'legal' moves run out can lead to  
terrible bias
however because they may not remain illegal for very long and  
actually
become important points to play. A ko-point is probably the most  
extreme

example of that.


I don't think you understood the algorithm. The eyeish point is
removed from the lottery only for picking this particular move, not
for the rest of the playout.


Yes, again I misunderstood. So you do another random lookup each time  
you hit an illegal point? That could turn out very expensive,  
especially by the end of the game.




Anyway, I don't bother to order the empty-point-list or scramble  
them in any
way prior to the game. So the first point is the 1-1 point and the  
last is
the 19-19 point (or whatever boardsize you're playing) so I have  
no qualms
about those moves being a little less likely to be played. Or even  
a lot

less. I think it would actually be beneficial.


Reproducibility was the point, not strength of the bot.


I'd say, share the source if that's the objective. I doubt you'll get  
real reproducibility any other way.




If this asymmetry really bothers you, you could very easily fix  
this by

wrapping the search around. There's no asymmetry in a circle.


That doesn't fix anything.


Why not? The whole argument is about a bias against points towards  
the end. In a circular list there is no 'end'.


Mark


___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-13 Thread Don Dailey




Álvaro Begué wrote:

On Tue, May 13, 2008 at 2:28 PM, Mark Boon <[EMAIL PROTECTED]> wrote:
  

On 13-mei-08, at 15:08, Jason House wrote:

The range of the random number is reduced by one after each failed lookup.
Shuffled data has no impact on future use of the array of empty points.

OK, I understand now why a point at the end (or beginning) is a little less
likely to be picked. Although I still have doubts whether that will lead to
a noticable bias, I'll try to think about it.



I don't care much about it being noticeable. This thread is about
putting bots on CGOS that use a reproducible algorithm, to help people
detect bugs in their implementations. As part of specifying what these
bots do, we should all pick the next move in a playout using the same
criteria. If we agree to use uniform distribution among empty
non-eyeish points, that's what should be implemented.
  
I agree with this.   If we get too clever we lose confidence that we are 
comparing apples to apples by risking bugs and making assumptions that 
may not hold.


- Don


  

I would imagine moving an illegal point towards the end and only start
including it when the other 'legal' moves run out can lead to terrible bias
however because they may not remain illegal for very long and actually
become important points to play. A ko-point is probably the most extreme
example of that.



I don't think you understood the algorithm. The eyeish point is
removed from the lottery only for picking this particular move, not
for the rest of the playout.

  

Anyway, I don't bother to order the empty-point-list or scramble them in any
way prior to the game. So the first point is the 1-1 point and the last is
the 19-19 point (or whatever boardsize you're playing) so I have no qualms
about those moves being a little less likely to be played. Or even a lot
less. I think it would actually be beneficial.



Reproducibility was the point, not strength of the bot.

  

If this asymmetry really bothers you, you could very easily fix this by
wrapping the search around. There's no asymmetry in a circle.



That doesn't fix anything.


Álvaro.
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

  

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-13 Thread Don Dailey




Don Dailey wrote:



Jason House wrote:

On May 13, 2008, at 1:51 PM, Mark Boon <[EMAIL PROTECTED]> wrote:



On 13-mei-08, at 14:10, Álvaro Begué wrote:

What others do is the right thing to do. Your method will introduce
some biases.


Could you elaborate what bias it could lead to? I also do the same 
as Jason. I did consider the possibility of a bias but couldn't 
immediately think of one.


And I was thinking "let's not repeat this topic"...

The probability of picking a point is proportional to 1 + number of 
illegal points before it.


In practice, illegal moves are pretty rare until endgame. At that 
point, it's a trade off between bias and speed. Random number 
generation is not cheap. I have yet to see empirical proof that the 
pick and scan is bad. I've tried both methods in the past and saw no 
measurable difference in strength.
I may perhaps take a look at this myself.I think there is a way to 
deal with this issue and get the best of both worlds or at least 
approach it.   If this method does not degrade the quality, it's a 
moot point.   Otherwise, here is an idea:


  1. For each move list size, pre-compute N tables of move list 
traversal orderings.

  2. At move selection time alternative between them.

So you would have something in the C language like: 



int *order[S][C]


I forget to mention that S is the move list size (number of empty points 
you are interested in) and C is the number of list orderings you want to 
alternate between.   And a list is returns that is of size S-1 (since 
your first move is selected by calling the random number generater.)   A 
number like 8 or 16 is good (a power of two) because each time you using 
the list to select a move, you can increment a counter so that you use a 
different move list ordering the next time you are confronted with a 
move list of this same exact size.



Your first move is selected randomly, then from then on you use the 
array returned to form an offset from this.
This would pay off if you really wanted something very close to 
uniform random and your random number generator was clearly expensive.
I think there is a great deal of chaos that would make this "almost" 
perfectly uniform.   The first move would be selected as you already 
do, using the random number generator but after this you traverse the 
list in random order as specified by this pre-computed table. Even 
though the bias is still there for any given move,  it's basically 
non-predictable.   On the very next move you will using a different 
table which you choose from in round robin fashion. 
Don't know how flawed this idea is.   I believe it is would come close 
to uniformly random but wouldn't be perfect.  There would be a minor 
bias when measured over a lot of games,  it would probably occur that 
a certain move was being chosen slightly more often,  but nothing like 
the bias in the current method. 
Is it worth it?   Probably not,  but just a crazy idea.   Likely there 
are simpler ways to eliminate most of the bias while still minimizing 
the number of random move generations.

- Don
 







What good does moving it to the end of the list do? Next time around 
it's just as likely to be picked as when left in place. Or do you 
only process the 'end' after the 'front' moves are all tried?


The range of the random number is reduced by one after each failed 
lookup. Shuffled data has no impact on future use of the array of 
empty points.






Mark

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/




___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-13 Thread Don Dailey




Mark Boon wrote:


On 13-mei-08, at 15:08, Jason House wrote:

The range of the random number is reduced by one after each failed 
lookup. Shuffled data has no impact on future use of the array of 
empty points.




OK, I understand now why a point at the end (or beginning) is a little 
less likely to be picked. Although I still have doubts whether that 
will lead to a noticable bias, I'll try to think about it.


I don't believe there is any bias based on where a good point is, only 
where the bad points are.   Because you can think of this list of empty 
points as a circular array,  where no particular location is any more 
significant than any other.




I would imagine moving an illegal point towards the end and only start 
including it when the other 'legal' moves run out can lead to terrible 
bias however because they may not remain illegal for very long and 
actually become important points to play. A ko-point is probably the 
most extreme example of that.
I had a "bug" in my program that may have been an actual improvement.   
I did not always resurrect empty points that were eyes.   I went ahead 
and fixed this and didn't try to assess whether it was a "good bug" or 
not but I didn't notice any improvement.  The net effect was that if a 
move was an eye move that you shouldn't move to,   and the opponent 
moved to a diagonal,  the program still would not move to it.   That's 
probably sound in some cases and not in others.




Anyway, I don't bother to order the empty-point-list or scramble them 
in any way prior to the game. So the first point is the 1-1 point and 
the last is the 19-19 point (or whatever boardsize you're playing) so 
I have no qualms about those moves being a little less likely to be 
played. Or even a lot less. I think it would actually be beneficial.


If this asymmetry really bothers you, you could very easily fix this 
by wrapping the search around. There's no asymmetry in a circle.


Mark




___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-13 Thread Don Dailey




Jason House wrote:

On May 13, 2008, at 1:51 PM, Mark Boon <[EMAIL PROTECTED]> wrote:



On 13-mei-08, at 14:10, Álvaro Begué wrote:

What others do is the right thing to do. Your method will introduce
some biases.


Could you elaborate what bias it could lead to? I also do the same as 
Jason. I did consider the possibility of a bias but couldn't 
immediately think of one.


And I was thinking "let's not repeat this topic"...

The probability of picking a point is proportional to 1 + number of 
illegal points before it.


In practice, illegal moves are pretty rare until endgame. At that 
point, it's a trade off between bias and speed. Random number 
generation is not cheap. I have yet to see empirical proof that the 
pick and scan is bad. I've tried both methods in the past and saw no 
measurable difference in strength.
I may perhaps take a look at this myself.I think there is a way to 
deal with this issue and get the best of both worlds or at least 
approach it.   If this method does not degrade the quality, it's a moot 
point.   Otherwise, here is an idea:


  1. For each move list size, pre-compute N tables of move list 
traversal orderings.

  2. At move selection time alternative between them.

So you would have something in the C language like:  

int *order[S][C] 

Your first move is selected randomly, then from then on you use the 
array returned to form an offset from this. 

This would pay off if you really wanted something very close to uniform 
random and your random number generator was clearly expensive. 

I think there is a great deal of chaos that would make this "almost" 
perfectly uniform.   The first move would be selected as you already do, 
using the random number generator but after this you traverse the list 
in random order as specified by this pre-computed table. Even though 
the bias is still there for any given move,  it's basically 
non-predictable.   On the very next move you will using a different 
table which you choose from in round robin fashion.  

Don't know how flawed this idea is.   I believe it is would come close 
to uniformly random but wouldn't be perfect.  There would be a minor 
bias when measured over a lot of games,  it would probably occur that a 
certain move was being chosen slightly more often,  but nothing like the 
bias in the current method.  

Is it worth it?   Probably not,  but just a crazy idea.   Likely there 
are simpler ways to eliminate most of the bias while still minimizing 
the number of random move generations. 


- Don
 







What good does moving it to the end of the list do? Next time around 
it's just as likely to be picked as when left in place. Or do you 
only process the 'end' after the 'front' moves are all tried?


The range of the random number is reduced by one after each failed 
lookup. Shuffled data has no impact on future use of the array of 
empty points.






Mark

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/




___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-13 Thread Álvaro Begué

On Tue, May 13, 2008 at 2:28 PM, Mark Boon <[EMAIL PROTECTED]> wrote:
>
> On 13-mei-08, at 15:08, Jason House wrote:
>
> The range of the random number is reduced by one after each failed lookup.
> Shuffled data has no impact on future use of the array of empty points.
>
> OK, I understand now why a point at the end (or beginning) is a little less
> likely to be picked. Although I still have doubts whether that will lead to
> a noticable bias, I'll try to think about it.

I don't care much about it being noticeable. This thread is about
putting bots on CGOS that use a reproducible algorithm, to help people
detect bugs in their implementations. As part of specifying what these
bots do, we should all pick the next move in a playout using the same
criteria. If we agree to use uniform distribution among empty
non-eyeish points, that's what should be implemented.

> I would imagine moving an illegal point towards the end and only start
> including it when the other 'legal' moves run out can lead to terrible bias
> however because they may not remain illegal for very long and actually
> become important points to play. A ko-point is probably the most extreme
> example of that.

I don't think you understood the algorithm. The eyeish point is
removed from the lottery only for picking this particular move, not
for the rest of the playout.

> Anyway, I don't bother to order the empty-point-list or scramble them in any
> way prior to the game. So the first point is the 1-1 point and the last is
> the 19-19 point (or whatever boardsize you're playing) so I have no qualms
> about those moves being a little less likely to be played. Or even a lot
> less. I think it would actually be beneficial.

Reproducibility was the point, not strength of the bot.

> If this asymmetry really bothers you, you could very easily fix this by
> wrapping the search around. There's no asymmetry in a circle.

That doesn't fix anything.

Álvaro.
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-13 Thread Mark Boon



On 13-mei-08, at 15:08, Jason House wrote:

The range of the random number is reduced by one after each failed  
lookup. Shuffled data has no impact on future use of the array of  
empty points.




OK, I understand now why a point at the end (or beginning) is a  
little less likely to be picked. Although I still have doubts whether  
that will lead to a noticable bias, I'll try to think about it.


I would imagine moving an illegal point towards the end and only  
start including it when the other 'legal' moves run out can lead to  
terrible bias however because they may not remain illegal for very  
long and actually become important points to play. A ko-point is  
probably the most extreme example of that.


Anyway, I don't bother to order the empty-point-list or scramble them  
in any way prior to the game. So the first point is the 1-1 point and  
the last is the 19-19 point (or whatever boardsize you're playing) so  
I have no qualms about those moves being a little less likely to be  
played. Or even a lot less. I think it would actually be beneficial.


If this asymmetry really bothers you, you could very easily fix this  
by wrapping the search around. There's no asymmetry in a circle.


Mark

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-13 Thread Don Dailey




Mark Boon wrote:


On 13-mei-08, at 14:15, Don Dailey wrote:

Yes, it's not random at all.   The points near the end of the list 
are much less likely to be chosen for instance.


OK, I'm not very good at statistics, but I don't see how the last 
points are "much" less likely to be picked. At best they are a 
"little" less likely to be picked but I actually don't see that 
either. I would like to see some logical argument supporting that claim.

I already retracted my statement in a previous message and I agree with you.

In fact, I don't think there is any bias based on where the points are 
in the list,  it has more to do with where the non-usable points are in 
the list.


I think it's pretty complicated too.   Assuming the list is static for 
each game,  you are going to get some ugly clustering effects similar to 
hash tables that use open addressing for conflict resolution.All of 
the point after an un-usable point has a locally increased chance of 
being selected "soon", depending on how you gather up the empty points.  

I think it may not be so bad if your algorithm is to compress the list 
by moving a random point into the newly occupied slot.   It won't be 
uniformly random, but it's hard for me to predict if this is a serious 
crime or not without thorough testing.


- Don





Mark





___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-13 Thread Jason House


On May 13, 2008, at 1:51 PM, Mark Boon <[EMAIL PROTECTED]> wrote:



On 13-mei-08, at 14:10, Álvaro Begué wrote:

What others do is the right thing to do. Your method will introduce
some biases.


Could you elaborate what bias it could lead to? I also do the same  
as Jason. I did consider the possibility of a bias but couldn't  
immediately think of one.


And I was thinking "let's not repeat this topic"...

The probability of picking a point is proportional to 1 + number of  
illegal points before it.


In practice, illegal moves are pretty rare until endgame. At that  
point, it's a trade off between bias and speed. Random number  
generation is not cheap. I have yet to see empirical proof that the  
pick and scan is bad. I've tried both methods in the past and saw no  
measurable difference in strength.



What good does moving it to the end of the list do? Next time around  
it's just as likely to be picked as when left in place. Or do you  
only process the 'end' after the 'front' moves are all tried?


The range of the random number is reduced by one after each failed  
lookup. Shuffled data has no impact on future use of the array of  
empty points.






Mark

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-13 Thread Mark Boon



On 13-mei-08, at 14:15, Don Dailey wrote:

Yes, it's not random at all.   The points near the end of the list  
are much less likely to be chosen for instance.


OK, I'm not very good at statistics, but I don't see how the last  
points are "much" less likely to be picked. At best they are a  
"little" less likely to be picked but I actually don't see that  
either. I would like to see some logical argument supporting that claim.


Mark


___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-13 Thread Álvaro Begué

Ooops! I hit sent before I finished writing the pseudo code. Sorry.

int pick(Move *empties, int num_empties) {
 int num_candidates = num_empties;
 int picked;

 while(1) {
   picked = rand()%num_candidates;
   if(!acceptable(empties[picked])) {
 num_candidates--;
 swap(empties[picked],empties[num_candidates]);
   }
   else
 break;
 }
 return picked;
}



On Tue, May 13, 2008 at 1:57 PM, Álvaro Begué <[EMAIL PROTECTED]> wrote:
> On Tue, May 13, 2008 at 1:51 PM, Mark Boon <[EMAIL PROTECTED]> wrote:
>  >
>  > On 13-mei-08, at 14:10, Álvaro Begué wrote:
>  >
>  >
>  > What others do is the right thing to do. Your method will introduce
>  >
>  > some biases.
>  > Could you elaborate what bias it could lead to? I also do the same as 
> Jason.
>  > I did consider the possibility of a bias but couldn't immediately think of
>  > one.
>
>  This has been explained already. After the first eye appears on the
>  board, the first empty point after the eye has a probability of being
>  picked that is twice the probability of any other empty point.
>
>
>  > What good does moving it to the end of the list do? Next time around it's
>  > just as likely to be picked as when left in place. Or do you only process
>  > the 'end' after the 'front' moves are all tried?
>
>  You move it to the end and you reduce the number of candidates by one.
>  The code is roughly this:
>
>  int pick(Move *empties, int num_empties) {
>   int num_candidates = num_empties;
>   int picked;
>
>   do {
> picked = rand()%num_candidates;
> num_candidates--;
>   } while(!acceptable(empties[picked]);
>
>   return picked;
>  }
>
>
>  Álvaro.
>
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-13 Thread Don Dailey


Hi Mark,

Did you read my last email post?   Using Jason's method,  the point 
immediately AFTER an illegal point (perhaps an eye space) is TWICE as 
likely to be selected because you are scanning sequentially forward.   
Hitting on either point is going to lead to the same move selection.



Occasionally you could get 2 or 3 or more illegal points in a row, 
especially near the end of the game.  In such a case, you could be 3 or 
more times as likely to get the first point immediately after this 
sequence selected as your move choice. 



If your empty list is actually scrambled after every move, I think this 
is uniformly random except that it would be expensive.  



- Don


Mark Boon wrote:


On 13-mei-08, at 14:10, Álvaro Begué wrote:


What others do is the right thing to do. Your method will introduce
some biases.


Could you elaborate what bias it could lead to? I also do the same as 
Jason. I did consider the possibility of a bias but couldn't 
immediately think of one.


What good does moving it to the end of the list do? Next time around 
it's just as likely to be picked as when left in place. Or do you only 
process the 'end' after the 'front' moves are all tried?


Mark




___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-13 Thread Álvaro Begué

On Tue, May 13, 2008 at 1:51 PM, Mark Boon <[EMAIL PROTECTED]> wrote:
>
> On 13-mei-08, at 14:10, Álvaro Begué wrote:
>
>
> What others do is the right thing to do. Your method will introduce
>
> some biases.
> Could you elaborate what bias it could lead to? I also do the same as Jason.
> I did consider the possibility of a bias but couldn't immediately think of
> one.

This has been explained already. After the first eye appears on the
board, the first empty point after the eye has a probability of being
picked that is twice the probability of any other empty point.

> What good does moving it to the end of the list do? Next time around it's
> just as likely to be picked as when left in place. Or do you only process
> the 'end' after the 'front' moves are all tried?

You move it to the end and you reduce the number of candidates by one.
The code is roughly this:

int pick(Move *empties, int num_empties) {
  int num_candidates = num_empties;
  int picked;

  do {
picked = rand()%num_candidates;
num_candidates--;
  } while(!acceptable(empties[picked]);

  return picked;
}

Álvaro.
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-13 Thread Mark Boon



On 13-mei-08, at 14:10, Álvaro Begué wrote:


What others do is the right thing to do. Your method will introduce
some biases.


Could you elaborate what bias it could lead to? I also do the same as  
Jason. I did consider the possibility of a bias but couldn't  
immediately think of one.


What good does moving it to the end of the list do? Next time around  
it's just as likely to be picked as when left in place. Or do you  
only process the 'end' after the 'front' moves are all tried?


Mark

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-13 Thread Don Dailey

It's not clear how bad Jason's method is however.   The points near the 
end of the list are LESS likely to be chosen but probably not "much less 
likely" and this method is probably pretty fast.I wonder how bad it 
really is?   

The point after an illegal move is quite a bit more likely to be 
selected.   If the list had just 1 illegal point, then the point after 
it in the list is twice as likely to be selected as any other point.
Perhaps if you added a little noise after each move (perhaps swapping to 
points at random)  you could minimize any seriously bad effects?  


- Don


Don Dailey wrote:



Christoph Birk wrote:


On May 13, 2008, at 10:04 AM, Jason House wrote:
On May 13, 2008, at 12:57 PM, Carter Cheng <[EMAIL PROTECTED]> 
wrote:


I have a list of empty points. I pick one at random and then scan 
until I find a legal one.


That's not random.
Yes, it's not random at all.   The points near the end of the list are 
much less likely to be chosen for instance.


I have a list of empty points and I choose from this list at random 
and temporarily "put away" any non-legal choices and try again until I 
get a legal move.


- Don




Christoph

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-13 Thread Don Dailey




Christoph Birk wrote:


On May 13, 2008, at 10:04 AM, Jason House wrote:
On May 13, 2008, at 12:57 PM, Carter Cheng <[EMAIL PROTECTED]> 
wrote:


I have a list of empty points. I pick one at random and then scan 
until I find a legal one.


That's not random.
Yes, it's not random at all.   The points near the end of the list are 
much less likely to be chosen for instance.


I have a list of empty points and I choose from this list at random and 
temporarily "put away" any non-legal choices and try again until I get a 
legal move.


- Don




Christoph

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-13 Thread Álvaro Begué

On Tue, May 13, 2008 at 1:04 PM, Jason House
<[EMAIL PROTECTED]> wrote:
> [,,,]
>  I have a list of empty points. I pick one at random and then scan until I
> find a legal one. Others reduce the list size (swap to end?) and repick.

What others do is the right thing to do. Your method will introduce
some biases. I think libego used to do it your way but was then fixed.

Álvaro.
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-13 Thread Don Dailey

I think Christoph just described the eye rule most of us are using. 
It's pretty much standard and I think it's mainly just a mechanism to 
prevent the play-outs from going on and on for hundreds of moves.  
You can probably improve on the rule,  but this is probably not going to 
make any measurable difference until you cover much more important 
things such as improving the quality of the play-outs,  what Dave Hillis 
calls "heavy play-outs."   

My suggestion is to just start with the uniform random play-out policy  
with the simple and effective eye rule we are using first,  then you can 
obsess over the eye rule and heavy play-out policy stuff later to 
improve on what you have.


- Don



Carter Cheng wrote:
I am curious about the eye rule situation since I am at that stage of my implementation of the board/playout code. currently I have only implemented the basic rules of the game so that no illegal moves are possible (no superko rule) but undesirable stuff like filling your own eye spaces still are. 

I am curious how and when are the eye lists being calculated using the scheme described above. Also one related issue is that for certain LD shapes wont you still inadvertently fill your own eye space? 


Regards,

Carter


  
___

computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

  

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-13 Thread Christoph Birk



On May 13, 2008, at 10:04 AM, Jason House wrote:
On May 13, 2008, at 12:57 PM, Carter Cheng <[EMAIL PROTECTED]>  
wrote:


I have a list of empty points. I pick one at random and then scan  
until I find a legal one.


That's not random.

Christoph

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-13 Thread Christoph Birk



On May 13, 2008, at 10:00 AM, Jason House wrote:
On May 13, 2008, at 12:00 PM, "David Fotland" <[EMAIL PROTECTED] 
games.com> wrote:


When you say pure uct, what is the playout policy?  Pure random  
moves except

don't fill one point eyes?


That's exactly what I meant. I'd also assume other stuff like the  
UCB1 formula, no RAVE, and no initial move bias. I'm not opposed to  
other variants to see the effects, but I do want to ensure my  
implementation is correct.


'my-Ctest-10k-UCT' uses none of these, except guiding the search by UCT.

Christoph

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-13 Thread Jason House

On May 13, 2008, at 12:57 PM, Carter Cheng <[EMAIL PROTECTED]>  
wrote:


I am curious about the eye rule situation since I am at that stage  
of my implementation of the board/playout code. currently I have  
only implemented the basic rules of the game so that no illegal  
moves are possible (no superko rule) but undesirable stuff like  
filling your own eye spaces still are.


I am curious how and when are the eye lists being calculated using  
the scheme described above. Also one related issue is that for  
certain LD shapes wont you still inadvertently fill your own eye  
space?


I have a list of empty points. I pick one at random and then scan  
until I find a legal one. Others reduce the list size (swap to end?)  
and repick.


Yes, the playouts are still extremely stupid and violate what even 30k  
humans know.




Regards,

Carter



___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-13 Thread Jason House

On May 13, 2008, at 12:00 PM, "David Fotland" <[EMAIL PROTECTED] 
games.com> wrote:


When you say pure uct, what is the playout policy?  Pure random  
moves except

don't fill one point eyes?


That's exactly what I meant. I'd also assume other stuff like the UCB1  
formula, no RAVE, and no initial move bias. I'm not opposed to other  
variants to see the effects, but I do want to ensure my implementation  
is correct.




What's your eye rule?


All direct neighbors are the eye-owner's color and no more than one  
enemy stone in the diagonals. Corner/edge require no enemy stones. I  
experimented with others, but that one is the simplest/best. With  
large patterns, someone could do better.







David


-Original Message-
From: [EMAIL PROTECTED] [mailto:computer-go-
[EMAIL PROTECTED] On Behalf Of Christoph Birk
Sent: Tuesday, May 13, 2008 8:51 AM
To: computer-go
Subject: Re: [computer-go] 10k UCT bots


On May 13, 2008, at 7:25 AM, Jason House wrote:

I'm testing my bot on CGOS using pure UCT, no pondering, and 10,000
playouts per move. Can someone put up a comparable bot?



I will re-start 'myCtest-10k-UCT' later today.

Christoph

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-13 Thread Carter Cheng

I am curious about the eye rule situation since I am at that stage of my 
implementation of the board/playout code. currently I have only implemented the 
basic rules of the game so that no illegal moves are possible (no superko rule) 
but undesirable stuff like filling your own eye spaces still are. 

I am curious how and when are the eye lists being calculated using the scheme 
described above. Also one related issue is that for certain LD shapes wont you 
still inadvertently fill your own eye space? 

Regards,

Carter


  
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-13 Thread Christoph Birk



On May 13, 2008, at 9:00 AM, David Fotland wrote:
When you say pure uct, what is the playout policy?  Pure random  
moves except

don't fill one point eyes?


Yes.


What's your eye rule?


all four neighbors occupied by "my" stones and not more 1 diagonal
occupied by opponent (center) or no diagonal occupied by opponent
(edge, corner).

Christoph


___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

RE: [computer-go] 10k UCT bots

2008-05-13 Thread David Fotland

When you say pure uct, what is the playout policy?  Pure random moves except
don't fill one point eyes?  What's your eye rule?

David

> -Original Message-
> From: [EMAIL PROTECTED] [mailto:computer-go-
> [EMAIL PROTECTED] On Behalf Of Christoph Birk
> Sent: Tuesday, May 13, 2008 8:51 AM
> To: computer-go
> Subject: Re: [computer-go] 10k UCT bots
> 
> 
> On May 13, 2008, at 7:25 AM, Jason House wrote:
> > I'm testing my bot on CGOS using pure UCT, no pondering, and 10,000
> > playouts per move. Can someone put up a comparable bot?
> >
> 
> I will re-start 'myCtest-10k-UCT' later today.
> 
> Christoph
> 
> ___
> computer-go mailing list
> computer-go@computer-go.org
> http://www.computer-go.org/mailman/listinfo/computer-go/

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-13 Thread Christoph Birk



On May 13, 2008, at 7:25 AM, Jason House wrote:
I'm testing my bot on CGOS using pure UCT, no pondering, and 10,000  
playouts per move. Can someone put up a comparable bot?




I will re-start 'myCtest-10k-UCT' later today.

Christoph
 
___

computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

[computer-go] 10k UCT bots

2008-05-13 Thread Jason House

I'm testing my bot on CGOS using pure UCT, no pondering, and 10,000  
playouts per move. Can someone put up a comparable bot?


A while back, someone else made a similar request, and I discovered  
that my bot had somehow broken. I've scoured for bugs and I believe I  
have a functional implementation.


There are two hb-739-UCT-10k-# bots because an external library leaks  
memory and it's the easiest way to make cgosGtp restart the bot  
periodically. My bot may still be a bit weaker than 10k playouts would  
imply since it's slow and may be cutting back from time restrictions.


Sent from my iPhone
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

65 matches

Mail list logo