On Tue, Apr 22, 2008 at 4:23 PM, Don Dailey <[EMAIL PROTECTED]> wrote:
>  Here is what I'm going to do:
>
>  I will take an open source chess program, Toga,  and run  a multi-round
>  robin between 7 versions from fixed depth 1 to fixed depth 7.   Two
>  versions of Toga at these 7 levels where one version has pawn structure,
>  king safety, and passed pawns turned off.
...

I am not familiar with chess programming, and I haven't been paying
complete attention to this discussion, but I thought that I should
comment on this.  Without any background knowledge, I would expect
that the bits of "knowledge" that you are turning off are present in
the starting program largely because they do scale well.

Furthermore, if your claim is:

"a chess program with a better evaluation function improves MORE with
increasing depth than one with a lesser evaluation function"

...then I don't see how you will make much progress at Settling the
Matter with this study, since all it will show (at best) is that there
exists one pair of evaluation functions that match your rule.

A better approach, to my mind, would be to test a wide variety of
different evaluation functions.  As I understand it, you want to show
that there is a strong correlation between their relative playing
ability at (widely) different depths.  Ideally, you should include as
many evaluation functions as you can manage, and ones that are as
different from each other as possible.  You might also want to combine
them with multiple, different kinds of
pruning/searching/whatever-else-goes-into-a-chess-engine-that-isn't-considered-evaluation.
This would show that you are exposing the general rule, rather than
just an example of that rule.  Am I misunderstanding your claim?

Of course, what I am suggesting would be quite a bit of work.
Perhaps a modest step in this direction would be to run tournaments
between 3 versions of Toga, each with only one enabled feature out of
the three that you identify.  (Or perhaps two.  However, avoid
including combinations where one version has features that are a
subset of another.  This may help to mitigate objections such as my
initial one above.)  On the plus side, though, I see no reason to run
any depths other than 1 and 7, since I think that you just want the
rank correlation between two different depths.
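
To make the rank-correlation idea concrete, here is a minimal sketch in
Python of how one might compare the tournament standings at two depths
using Spearman's rank correlation.  The version names and the scores
are invented purely for illustration (they are not results from any
actual run), and the sketch assumes no tied scores.

```python
def ranks(scores):
    """Map each version name to its rank (1 = highest score).

    Assumes no two versions have the same score.
    """
    order = sorted(scores, key=scores.get, reverse=True)
    return {name: i + 1 for i, name in enumerate(order)}


def spearman(scores_a, scores_b):
    """Spearman rank correlation between two standings over the same versions."""
    ra, rb = ranks(scores_a), ranks(scores_b)
    n = len(ra)
    # Sum of squared rank differences, version by version.
    d2 = sum((ra[name] - rb[name]) ** 2 for name in ra)
    return 1 - 6 * d2 / (n * (n * n - 1))


# Hypothetical tournament points for three single-feature versions.
depth1 = {"pawn_structure_only": 14.5, "king_safety_only": 9.0, "passed_pawns_only": 6.5}
depth7 = {"pawn_structure_only": 12.0, "king_safety_only": 13.5, "passed_pawns_only": 4.5}

rho = spearman(depth1, depth7)
print(rho)
```

A value near 1 would mean the versions finish in nearly the same order
at both depths; a value near 0 or below would suggest that the ordering
at depth 1 says little about the ordering at depth 7.  With only three
versions the statistic is very coarse, which is one more reason to
include as many evaluation functions as is practical.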

Weston
_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
