Weston,

Although those results sound intriguing, it also looks like a convoluted experiment. I wouldn't call gnugo an expert judge, although it is an impartial one. The fact alone that it scores the 5K ref-bot as ahead after 10 moves 46% of the time makes it suspect in my eyes. But it is curious that it consistently shows a much lower percentage for the bot with more playouts.
It would have been much more persuasive if you had simply run a 5K-playout bot against a 100K-playout bot and seen which wins more often. It shouldn't take much more than a day to gather a significant number of games; twogtp is perfect for this. Or connect both to CGOS and see which ends up with the higher rating, although in that case it will take a week or more before you get conclusive data, unless the difference is really clear.

I did in fact put a 100K+ ref-bot on CGOS for a little while, and it ended up with a rating slightly (possibly insignificantly) higher than the 2K ref-bot. Maybe I didn't leave it there long enough, certainly not for thousands of games. But it didn't look anywhere near supporting your findings. I say 100K+ because I didn't set it to a specific number of playouts; I just let it run as many as it could within the time allowed. Generally it would reach well over 100K per move, probably more like 250K-500K. That should only make things worse, according to your hypothesis.

So although I think the result of your experiment is very curious, I also think it might be a bit hasty to draw your conclusion from it.

Mark

On Mon, Dec 15, 2008 at 8:30 PM, Weston Markham <weston.mark...@gmail.com> wrote:
> Hi. This is a continuation of a month-old conversation about the
> possibility that the quality of AMAF Monte Carlo can degrade as the
> number of simulations increases:
>
> Me: "running 10k playouts can be significantly worse than running 5k
> playouts."
>
> On Tue, Nov 18, 2008 at 2:27 PM, Don Dailey <drdai...@cox.net> wrote:
>> On Tue, 2008-11-18 at 14:17 -0500, Weston Markham wrote:
>>> On Tue, Nov 18, 2008 at 12:02 PM, Michael Williams
>>> <michaelwilliam...@gmail.com> wrote:
>>> > It doesn't make any sense to me from a theoretical perspective.
>>> > Do you have empirical evidence?
>>>
>>> I used to have data on this, from a program that I think was very
>>> nearly identical to Don's reference spec. When I get a chance, I'll
>>> try to reproduce it.
>>
>> Unless the difference is large, you will have to run thousands of games
>> to back this up.
>>
>> - Don
>
> I am comparing the behavior of the AMAF reference bot with 5000
> playouts against its behavior with 100000 playouts, and I am only
> considering the first ten moves (five from each player) of the (9x9)
> games. I downloaded a copy of Don's reference bot, as well as a copy
> of Mogo, which is used as an opponent for each of the two settings.
> gnugo version 3.7.11 is also used, in order to judge which side won
> (jrefgo or mogo) after each individual match. gnugo was used because
> it is simple to set up for this sort of thing via command-line
> options, and it seems plausible that it should give a somewhat
> realistic assessment of the situation.
>
> jrefgo always plays black, and Mogo plays white. Komi is set to 0.5,
> so that jrefgo has a reasonable number of winning lines available to
> it, although the general superiority of Mogo means that egregiously
> bad individual moves will be punished.
>
> In the games played, Mogo would occasionally crash. (This was run
> under Windows Vista; perhaps there is some incompatibility in the
> binary I downloaded.) I have discarded these games (about 1 out of 50,
> I think) from the statistics gathered. As far as I know, there is no
> reason to think that this would skew the comparison between 5k
> playouts and 100k playouts. Apart from the occasional crashes, the
> behavior of Mogo seemed reasonable in the games I observed, and I
> have no reason to think that it was not playing at a relatively high
> level in the retained results.
>
> Out of 3637 matches using 5k playouts, jrefgo won (i.e., was ahead
> after 10 moves, as estimated by gnugo) 1688 of them. (46.4%)
> Out of 2949 matches using 100k playouts, jrefgo won 785. (26.6%)
>
> It appears clear to me that increasing the number of playouts from 5k
> to 100k degrades the performance of jrefgo.
> Below, I am including the commands that I used to run the tests and
> tally the results.
>
> Weston
>
>
> $ cat scratch5k.sh
>
> ../gogui-1.1.3/bin/gogui-twogtp -auto
>   -black "\"C:\\\\Program Files\\\\Java\\\\jdk1.6.0_06\\\\bin\\\\java.exe\" -jar jrefgo.jar 5000"
>   -games 10000 -komi 0.5 -maxmoves 10
>   -referee "gnugo --mode gtp --score aftermath --chinese-rules --positional-superko"
>   -sgffile games/jr5k-v-mogo -size 9
>   -white C:\\\\cygwin\\\\home\\\\Experience\\\\projects\\\\go\\\\MoGo_release3\\\\mogo.exe
>
>
> $ grep B+ games/jr5k-v-mogo.dat | grep -v unexp | wc -l
> 1688
>
> $ grep W+ games/jr5k-v-mogo.dat | grep -v unexp | wc -l
> 1949
>
> $ grep B+ games/jr100k-v-mogo.dat | grep -v unexp | wc -l
> 785
>
> $ grep W+ games/jr100k-v-mogo.dat | grep -v unexp | wc -l
> 2164
> _______________________________________________
> computer-go mailing list
> computer-go@computer-go.org
> http://www.computer-go.org/mailman/listinfo/computer-go/
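P.S. The head-to-head I'm suggesting is a small variation on Weston's own scratch5k.sh: pair the two playout settings against each other directly instead of against Mogo. Something like the command-line sketch below, untested on my end; the `java` on PATH, the relative jrefgo.jar location, and the game count are my assumptions, while the gogui-twogtp flags and the trailing playout-count argument mirror the script quoted above:

```shell
# Sketch only (paths assumed): 5K-playout ref bot as black vs. 100K as white.
# jrefgo takes its playout count as the final argument, as in scratch5k.sh.
../gogui-1.1.3/bin/gogui-twogtp -auto \
  -black "java -jar jrefgo.jar 5000" \
  -white "java -jar jrefgo.jar 100000" \
  -games 1000 -size 9 -komi 0.5 \
  -sgffile games/jr5k-v-jr100k
```

With self-play there is no need for a referee, since both engines score identically; alternating colors over two runs would cancel any first-move advantage.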
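One thing worth noting about the reported numbers themselves: whatever gnugo's judging is worth, the 46.4% vs. 26.6% gap is far too large to be sampling noise at these game counts. A quick two-proportion z-test on the reported counts (a sketch using only the Python standard library) puts the difference at roughly 16 standard errors:

```python
from math import sqrt

# Reported results: wins / games for the two playout settings.
w1, n1 = 1688, 3637   # 5k playouts:  jrefgo "ahead after 10 moves" 46.4%
w2, n2 = 785, 2949    # 100k playouts: 26.6%

p1, p2 = w1 / n1, w2 / n2
p = (w1 + w2) / (n1 + n2)                     # pooled proportion
se = sqrt(p * (1 - p) * (1 / n1 + 1 / n2))    # standard error of the difference
z = (p1 - p2) / se

print(f"p1={p1:.3f} p2={p2:.3f} z={z:.1f}")   # z is about 16.5
```

So if the gnugo-after-10-moves measurement means anything at all, the degradation it reports is not a statistical fluke; the open question is whether the measurement means anything.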
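For anyone re-running this, the grep tally can also be mirrored in a few lines of Python that compute win rates directly. The .dat format details here are my assumption, inferred from how the grep filters are used above (one game per line, a result field containing B+ or W+, and "unexp" marking the discarded games):

```python
def tally(dat_text):
    """Count black and white wins in gogui-twogtp .dat output,
    skipping any line flagged 'unexp' (e.g. crashed games)."""
    b = w = 0
    for line in dat_text.splitlines():
        if "unexp" in line:
            continue          # discarded game, mirrors `grep -v unexp`
        if "B+" in line:
            b += 1
        elif "W+" in line:
            w += 1
    return b, w

# Example with made-up lines in the assumed format:
sample = "1 B+2.5\n2 W+0.5\n3 B+unexpected\n4 B+6.5\n"
b, w = tally(sample)
print(b, w, f"{b / (b + w):.1%}")  # -> 2 1 66.7%
```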