Re: [Bug-gnubg] Confused

Ian Shaw Fri, 12 Jun 2015 07:34:10 -0700

Hi Lucas, 

You make a good point about Michael's benchmark. Are you repeating these tests? 
If not, and somebody has the positions in a file, I might be able to devote 
some CPU time to it.


The tests I ran were several years ago, so they weren't with the latest 
weights. I think they'd have been the 0.14 weights.

Did Michael give you evidence that 3-ply was worse than 2-ply? I don’t remember 
any tests being reported on the mailing list, and I'd be very interested.

-- Ian


-----Original Message-----
From: Lucas [mailto:[email protected]] 
Sent: 12 June 2015 14:37
To: Ian Shaw; [email protected]
Subject: Re: [Bug-gnubg] Confused

Hi Ian

First, Michael Depreli's tests are no longer relevant, because since 2013 gnubg 
has used a different weights table.

I'm testing the latest version of gnubg, 1.05.00, which uses the latest weights 
table (2013).

As I wrote, I tested on FIBS: 5000 5-point matches (more than 17000 games, 
actually). That might not be enough, but 3-ply was behind all the time.

I talked about it with Michael Petch, one of the maintainers/coders of gnubg, 
and he CONFIRMED that 3-ply does not play as well as 2-ply.
So I stopped testing 2-ply against 3-ply.
Why should I continue testing when the coders of gnubg are telling me that 
2-ply does indeed play better than 3-ply?
Michael Petch also told me that they have partly corrected this, but not fully.

As for 0-ply against 2-ply, I stopped at 1000 5-point matches because the 
result was always about 50%, with less than 0.5% either way.

When you tested 0-ply against 2-ply, did gnubg at that time use the 2013 
weights table? Which version of gnubg did you use?

Currently I am running gnubg at the standard World Class setting against the 
old DLL by Mike Rudman. The DLL is set to play at expert level, which is 
actually 2-ply (GreedyGammon uses that DLL as a bot on its server).

So far they have played 449 matches, of which the old DLL won 235 and lost 214, 
which gives the DLL a win rate of 52.34%. Too soon to come to any conclusion ;-)
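
One rough way to judge a score like 235 out of 449 is a normal-approximation 
test against an even split. The sketch below assumes that every match is an 
independent trial with the same win chance; the function name is made up for 
illustration and is not part of gnubg.

```python
import math

def two_sided_z_test(wins, matches, p0=0.5):
    """Normal-approximation test of an observed win count against p0.

    Assumption (for illustration only): matches are independent and each
    has the same win probability. Returns (z, two_sided_p_value).
    """
    expected = matches * p0
    std_dev = math.sqrt(matches * p0 * (1.0 - p0))
    z = (wins - expected) / std_dev
    # Two-sided tail probability of the standard normal beyond |z|.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value

z, p_value = two_sided_z_test(235, 449)
print(f"z = {z:.2f}, two-sided p = {p_value:.2f}")
```

For 235 wins in 449 matches the z-score is about 1, which is well within what 
luck alone would produce.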

Lucas

-----Original Message-----
From: Ian Shaw
Sent: Friday, June 12, 2015 12:34 PM
To: Lucas ; [email protected]
Subject: RE: [Bug-gnubg] Confused

Hi Lucas,

I agree that it is unwise to make assumptions.

I'd be surprised if just 3000 5-point matches (maybe 12000 games) were 
sufficient to produce statistical significance. I no longer have the results, 
but when I tested 0-ply vs 2-ply over 100000+ games, 2-ply was definitely 
ahead. I'm convinced it plays better.
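
As a rough guide to how many trials are needed, the 95% margin of error on an 
observed win rate can be estimated as below. This is only a sketch: it assumes 
independent trials with a win chance near 50%, and the helper names are 
hypothetical.

```python
import math

Z_95 = 1.96  # two-sided 95% quantile of the standard normal distribution

def margin_of_error(trials, p=0.5):
    """Approximate 95% half-width of the observed win rate after `trials`."""
    return Z_95 * math.sqrt(p * (1.0 - p) / trials)

def trials_needed(margin, p=0.5):
    """Approximate number of trials for the 95% half-width to reach `margin`."""
    return math.ceil(p * (1.0 - p) * (Z_95 / margin) ** 2)

for n in (1000, 3000, 100000):
    print(n, f"+/- {100 * margin_of_error(n):.1f} percentage points")
```

Under these assumptions 3000 trials give a margin of roughly 1.8 percentage 
points either way, so only differences larger than that would stand out.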

I agree that 3-ply might not be better than 2-ply, due to odd vs. even ply 
evaluation effects. Michael Depreli's tests 
(http://www.bkgm.com/articles/Keith/DepreliBotComparison/) showed that 3-ply 
had slightly better chequer play than 2-ply, but over-valued the doubler's 
position. This resulted in numerous wrong doubles and wrong takes.

If I recall correctly, 3-ply chequer play with 2-ply cube was in vogue for 
rollouts for a while.

-- Ian




-----Original Message-----
From: Lucas [mailto:[email protected]]
Sent: 12 June 2015 10:35
To: Ian Shaw; [email protected]
Subject: Re: [Bug-gnubg] Confused

Hi Ian

It might not be true that gnubg at expert level plays worse than World Class or 
makes mistakes. I ran expert against World Class locally for 1000 5-point 
matches, and World Class had a win rate of 50.2%. To me that 0.2% might be due 
to luck.
I agree that expert sometimes plays a different move from World Class in the 
same position, etc.

The assumption that a higher level of gnubg will play better than a lower level 
is not true.
Last year I tested this on FIBS (where I have run 8 bots in the past) with 2 
bots, one set to play World Class and the other Grandmaster, so 2-ply against 
3-ply. They played 3000 5-point matches, and World Class, the lesser setting, 
had a win rate of 55%. I mailed Michael Petch, one of the maintainers of gnubg, 
about it, and he confirmed that World Class does indeed play better than 
Grandmaster.

Greetings

Lucas


-----Original Message-----
From: Ian Shaw
Sent: Thursday, June 11, 2015 6:22 PM
To: Michael Brennan ; [email protected]
Subject: Re: [Bug-gnubg] Confused

Hi Michael,

Welcome to GNU Backgammon.

1) Dice are no respecter of ability! Even the best players can be defeated by 
bad rolls. Look at the Luck ratings to see how gnubg thinks it played out.

Gnubg thinks it made no errors because it is using itself as the best judge of 
perfection!

If you ask gnubg to play on Expert Level but analyse on World Class, you will 
start to see some errors creep into its game.

2) It depends on how beginnerish you are.

According to the FIBS Rating Calculator at 
http://www.netadelica.com/bg/fibscalc.html a player with a 500 point Elo 
advantage should win a 13-point match about 89% of the time. This increases to 
93% for a 21-pointer.

You can play with the calculator yourself for other scenarios. (The experience 
fields should be set to at least 400 for both players.)
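
The percentages above are consistent with the published FIBS rating formula, 
in which the favourite's winning chance depends on the rating difference and 
the square root of the match length. The sketch below assumes that this is the 
formula the calculator applies; the function name is illustrative only.

```python
import math

def fibs_win_probability(rating_diff, match_length):
    """Chance that the higher-rated player wins, per the FIBS rating formula.

    rating_diff:  rating advantage of the stronger player (0 or more)
    match_length: length of the match in points
    """
    exponent = rating_diff * math.sqrt(match_length) / 2000.0
    return 1.0 - 1.0 / (10.0 ** exponent + 1.0)

for length in (13, 21):
    print(length, f"{100 * fibs_win_probability(500, length):.0f}%")
```

With a 500 point advantage this gives roughly 89% for a 13-point match and 
roughly 93% for a 21-point match. The value keeps rising with match length but 
never reaches 100%, so no match length guarantees a win.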

I hope this helps.
Ian Shaw




-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of 
Michael Brennan
Sent: 11 June 2015 00:19
To: [email protected]
Subject: [Bug-gnubg] Confused

I would be obliged for some help on the following. Please forward it on to 
someone who could answer if it arrives at the wrong destination.

gnubg describes my ability as a “beginner” - fair comment. However, in a 13pt 
match, I beat gnubg 13-6. The analysis described gnubg’s ability as 
“supernatural” and indicated that it made no errors.

I have two queries in relation to this:

1. How come gnubg lost?

2. How many points should a match be in order to ensure that gnubg wins every 
time against a beginner?

Regards

Michael Brennan
_______________________________________________
Bug-gnubg mailing list
[email protected]
https://lists.gnu.org/mailman/listinfo/bug-gnubg