I agree that the experiment is interesting in itself.
I also agree that it's hard to draw any conclusion
from it :) Running the game to the end would probably
give a near 0% win rate for the AMAF bot.
Running the 5k bot against the 100k bot is certainly
something you would want to do if you were to argue
that 5k is indeed stronger. Although it might also
be that for some reason the 5k bot is better at
the opening. The 5k bot has a wider range of choices
while playing than the 100k bot, so it's easy
to imagine that it plays the good (opening) moves more often.
All in all, trying to assess the strength of a bot
is awfully hard. It can make very good moves
and yet be very weak. It can have good global
perception, or good move ordering, and be very
weak. It can predict pro moves with incredible
accuracy, and still be very weak (although you
could then use this prediction feature
in a Monte-Carlo bot, as CrazyStone does).
I guess any hard data will always be welcome. Your
experiment was very original, in that few people
would have tried it. I have no idea what one should
conclude from it. But it certainly can't hurt our
understanding :) (or un-understanding). Maybe some day
someone will look back at this particular experiment
and come up with the next computer-go revolution :)
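For what it's worth, the counts reported below are large enough that the gap between the two settings cannot be sampling noise. A quick two-proportion z-test on those numbers (a standard check, not something from the original thread) gives a z-statistic well above any conventional significance threshold:

```python
import math

# Counts quoted from Weston's experiment below (gnugo's verdict after 10 moves):
wins_5k, games_5k = 1688, 3637      # 5k playouts:   46.4% judged ahead
wins_100k, games_100k = 785, 2949   # 100k playouts: 26.6% judged ahead

def two_proportion_z(w1, n1, w2, n2):
    """z-statistic for the difference between two binomial proportions."""
    p1, p2 = w1 / n1, w2 / n2
    pooled = (w1 + w2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

z = two_proportion_z(wins_5k, games_5k, wins_100k, games_100k)
print(round(z, 1))  # far beyond the ~2 needed for significance at the 5% level
```

Whether gnugo's 10-move judgement measures anything meaningful is a separate question, but the difference itself is real.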
> Date: Mon, 15 Dec 2008 21:10:07 -0200
> From: tesujisoftw...@gmail.com
> To: computer-go@computer-go.org
> Subject: Re: [computer-go] RefBot (thought-) experiments
>
> Weston,
>
> Although those results sound intriguing, it also looks like a
> convoluted experiment. I wouldn't call gnu-go an expert judge,
> although it is an impartial one. The fact that it says that the 5K
> ref-bot is ahead after 10 moves 46% of the time alone makes it suspect
> in my eyes. But it is curious that it consistently shows a much lower
> percentage for the bot with more playouts.
>
> It would have been much more persuasive if you had simply run a 5K
> playout bot against a 100K bot and seen which wins more. It shouldn't
> take much more than a day to gather a significant number of games.
> twogtp is perfect for this. Or connect both to CGOS and see which ends
> up with a higher rating. But in that case it will take a week or more
> before you get conclusive data. Unless the difference is really clear.
>
> I did in fact put up a 100K+ ref-bot on CGOS for a little while, and
> it ended up with a rating slightly (possibly insignificantly) higher
> than the 2K ref-bot. Maybe I didn't leave it there long enough,
> certainly not for thousands of games. But it didn't look anywhere near
> to supporting your findings.
>
> I say 100K+ because I didn't set it to a specific number, just let it
> run as many as it could within the time allowed. Generally it would
> reach well over 100K per move, probably more like 250K-500K. That
> should only make things worse according to your hypothesis.
>
> So although I think the result of your experiment is very curious, I
> think it might be a bit hasty to draw your conclusion.
>
> Mark
>
>
> On Mon, Dec 15, 2008 at 8:30 PM, Weston Markham
> <weston.mark...@gmail.com> wrote:
> > Hi. This is a continuation of a month-old conversation about the
> > possibility that the quality of AMAF Monte Carlo can degrade as the
> > number of simulations increases:
> >
> > Me: "running 10k playouts can be significantly worse than running
> > 5k playouts."
> >
> > On Tue, Nov 18, 2008 at 2:27 PM, Don Dailey <drdai...@cox.net> wrote:
> >> On Tue, 2008-11-18 at 14:17 -0500, Weston Markham wrote:
> >>> On Tue, Nov 18, 2008 at 12:02 PM, Michael Williams
> >>> <michaelwilliam...@gmail.com> wrote:
> >>> > It doesn't make any sense to me from a theoretical perspective.
> >>> > Do you have empirical evidence?
> >>>
> >>> I used to have data on this, from a program that I think was very
> >>> nearly identical to Don's reference spec. When I get a chance, I'll
> >>> try to reproduce it.
> >>
> >> Unless the difference is large, you will have to run thousands of
> >> games to back this up.
> >>
> >> - Don
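Don's rule of thumb follows directly from the binomial standard error of a measured win rate; a minimal sketch (my own illustration, not from the thread):

```python
import math

def se(p, n):
    """Standard error of a win rate p measured over n independent games."""
    return math.sqrt(p * (1 - p) / n)

# To separate two bots reliably, the error bar must be well below the
# true strength gap.  Near an even match (p = 0.5):
for n in (100, 1000, 10000):
    print(n, round(se(0.5, n), 3))
```

At 100 games the error bar is about five percentage points, so only a large difference shows up; resolving a gap of a few percent takes thousands of games.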
> >
> > I am comparing the behavior of the AMAF reference bot with 5000
> > playouts against the behavior with 100000 playouts, and I am only
> > considering the first ten moves (five from each player) of the (9x9)
> > games. I downloaded a copy of Don's reference bot, as well as a copy
> > of Mogo, which is used as an opponent for each of the two settings.
> > gnugo version 3.7.11 is also used, in order to judge which side won
> > (jrefgo or mogo) after each individual match. gnugo was used because
> > it is simple to set it up for this sort of thing via command-line
> > options, and it seems plausible that it should give a somewhat
> > realistic assessment of the situation.
> >
> > jrefgo always plays black, and Mogo plays white. Komi is set to 0.5,
> > so that jrefgo has a reasonable number of winning lines available to
> > it, although the general superiority of Mogo means that egregiously
> > bad individual moves will be punished.
> >
> > In the games played, Mogo would occasionally crash. (This was run
> > under Windows Vista; perhaps there is some incompatibility in the
> > binary I downloaded.) I have discarded these games (about 1 out of
> > 50, I think) from the statistics gathered. As far as I know, there
> > would be no reason to think that this would skew the comparison
> > between 5k playouts and 100k playouts. Other than occasional crashes,
> > the behavior of Mogo seemed reasonable in other games that I
> > observed. I have no reason to think that it was not playing at a
> > relatively high level in the retained results.
> >
> > Out of 3637 matches using 5k playouts, jrefgo won (i.e., was ahead
> > after 10 moves, as estimated by gnugo) 1688 of them (46.4%).
> > Out of 2949 matches using 100k playouts, jrefgo won 785 (26.6%).
> >
> > It appears clear to me that increasing the number of playouts from
> > 5k to 100k degrades the performance of jrefgo. Below, I am
> > including the commands that I used to run the tests and tally the
> > results.
> >
> > Weston
> >
> >
> > $ cat scratch5k.sh
> >
> > ../gogui-1.1.3/bin/gogui-twogtp -auto \
> >   -black "\"C:\\\\Program Files\\\\Java\\\\jdk1.6.0_06\\\\bin\\\\java.exe\" -jar jrefgo.jar 5000" \
> >   -games 10000 -komi 0.5 -maxmoves 10 \
> >   -referee "gnugo --mode gtp --score aftermath --chinese-rules --positional-superko" \
> >   -sgffile games/jr5k-v-mogo -size 9 \
> >   -white C:\\\\cygwin\\\\home\\\\Experience\\\\projects\\\\go\\\\MoGo_release3\\\\mogo.exe
> >
> >
> > $ grep B+ games/jr5k-v-mogo.dat | grep -v unexp | wc -l
> > 1688
> >
> > $ grep W+ games/jr5k-v-mogo.dat | grep -v unexp | wc -l
> > 1949
> >
> > $ grep B+ games/jr100k-v-mogo.dat | grep -v unexp | wc -l
> > 785
> >
> > $ grep W+ games/jr100k-v-mogo.dat | grep -v unexp | wc -l
> > 2164
> > _______________________________________________
> > computer-go mailing list
> > computer-go@computer-go.org
> > http://www.computer-go.org/mailman/listinfo/computer-go/
> >