CPANTS is not a game.

2006-05-23 Thread Michael G Schwern

I haven't looked at what's going on in CPANTS for a while but Andy's post
made me have a look and oh dear.  There's a problem.  CPANTS is not a game.
If you make it a game, the system does not work.

Let's review.

CPANTS is not a measure of module quality since module quality is not well
defined and difficult to measure.  CPANTS is a measure of those things which
can be measured about quality modules.  To use a real world example, fast
cars tend to be red.

There is no direct relationship between a CPANTS kwalitee indicator and the
actual quality of the module, it is merely a statistical indicator.  A car
can be red without being fast.  Conversely, a blue car can be fast.

If I alter my distribution to increase its CPANTS score I have not reduced
the bug count, made it more efficient, improved the documentation or given
it an easier interface.  I have not done any of the difficult to measure
things which we associate with improving the real quality of a module. [1]
If I paint my car red, it does not go faster.

If people *think* red cars are faster, and I want to sell my car, I'm going
to paint my car red.  People will now be more likely to buy my car thinking
it goes faster, but they have been fooled.  I have gamed the system, used my
knowledge of its rules to my own gain.

The more people game CPANTS, the more they alter their modules specifically to
increase their CPANTS score, the less useful the CPANTS indicators will be.
If everyone paints their car red then it's no longer a valid indicator of
whether it's a fast car.  The indicator is now useless.

CPANTS is already easy to game since its rules are published.  This is hard
enough to deal with, but gaming is gleefully encouraged!  It's called The
CPANTS Game.  There's a scoreboard with a top 40, hall of fame and shame.
Failure to have a CPANTS indicator is expressed as a shortcoming which
must be remedied.

In order to best perform its function and reduce the urge to game, CPANTS
should...

* not express itself as a game or competition
* not widely publish its rules
* express its indicators in a neutral fashion without indication as to which
state of the indicator is better
* not suggest ways to fix your module
* not publish a numerical score which one may be compelled to raise; instead
use a less precise indicator such as color
* not publish a scoreboard, top ten, hall of fame or shame
* add a feedback loop so that certain human-checked low and high quality
modules are checked against their CPANTS indicators to see which still have
value.  Adjust the indicator's weights accordingly.
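
The feedback loop in the last item could be sketched roughly as follows.
This is only an illustration: the sample data, the indicator names and the
weighting rule (the absolute difference between pass rates among good and
bad modules) are all assumptions, not anything CPANTS actually does.

```perl
use strict;
use warnings;

# Hypothetical human-checked sample: good => 1 means "judged high quality".
# Indicator names and values are invented for illustration.
my @sample = (
    { good => 1, indicators => { has_readme => 1, has_tests => 1 } },
    { good => 1, indicators => { has_readme => 1, has_tests => 1 } },
    { good => 0, indicators => { has_readme => 1, has_tests => 0 } },
    { good => 0, indicators => { has_readme => 1, has_tests => 0 } },
);

# Assumed rule: weight = |pass rate among good - pass rate among bad|.
# An indicator that everyone passes (or everyone games) ends up near 0.
sub indicator_weights {
    my @dists = @_;
    my (%pass, %count);
    for my $d (@dists) {
        my $class = $d->{good} ? 'good' : 'bad';
        $count{$class}++;
        for my $name (keys %{ $d->{indicators} }) {
            $pass{$name}{$class} += $d->{indicators}{$name} ? 1 : 0;
        }
    }
    my %weight;
    for my $name (keys %pass) {
        my $good_rate = $count{good} ? ($pass{$name}{good} || 0) / $count{good} : 0;
        my $bad_rate  = $count{bad}  ? ($pass{$name}{bad}  || 0) / $count{bad}  : 0;
        $weight{$name} = abs($good_rate - $bad_rate);
    }
    return %weight;
}

my %w = indicator_weights(@sample);
printf "%-12s %.2f\n", $_, $w{$_} for sort keys %w;
```

In this made-up sample every module has a README, so has_readme tells us
nothing and gets weight 0, while has_tests separates the two groups.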

Sorry if it sucks all the fun out of it, and I don't mean to poo-poo the
work Thomas and others have done, but I think we've forgotten what CPANTS
is.


[1] I may have improved the real quality of the distribution, but not the
payload contained within.  Using CPANTS as a distribution improvement tool
is another story and may ultimately be its best destiny.


Re: CPANTS is not a game.

2006-05-23 Thread Smylers
Michael G Schwern writes:

 There's a problem.  CPANTS is not a game.  If you make it a game, the
 system does not work.

Hi there.  I made a similarish point on this list about a year ago, to
which you replied:

  http://groups.google.co.uk/[EMAIL PROTECTED]

Your reply included:

  Finally, the scoreboard does have a purpose.  Part of the original
  idea of CPANTS was to provide an automated checklist for a good
  distribution.  ... Then, if this were a web page, the author could
  just click on that to get an explanation of why this is a Good Thing
  and what they can do to fix it. ...
  
  How do you get authors to actually look at the CPANTS information and
  make corrections?  Well, we like competition.  Make it a game! 

So it was you -- or somebody impersonating you on this list -- who
managed to persuade me that actually Cpants being a game was a good
thing!

Smylers


Re: CPANTS is not a game.

2006-05-23 Thread Andy Lester
  How do you get authors to actually look at the CPANTS information and
  make corrections?  Well, we like competition.  Make it a game!

So it was you -- or somebody impersonating you on this list -- who
managed to persuade me that actually Cpants being a game was a good
thing!


The key is that we're playing for different goals.  Schwern was  
saying that the improvement of the modules is a game.  PerlGirl is  
making a game out of improving the numeric score for her modules, but  
without any improvement of the module itself.


--
Andy Lester = [EMAIL PROTECTED] = www.petdance.com = AIM:petdance





Re: CPANTS is not a game.

2006-05-23 Thread David Golden

Andy Lester wrote:

  How do you get authors to actually look at the CPANTS information and
  make corrections?  Well, we like competition.  Make it a game!

So it was you -- or somebody impersonating you on this list -- who
managed to persuade me that actually Cpants being a game was a good
thing!


The key is that we're playing for different goals.  Schwern was saying 
that the improvement of the modules is a game.  PerlGirl is making a 
game out of improving the numeric score for her modules, but without any 
improvement of the module itself.


How does is_prereq improve quality?

Or, put differently, how does measuring something that an author can't 
control create an incentive to improve?


If CPANTS is an objective quality measure, then it makes sense.  If
CPANTS is a quality game -- i.e. a friendly competition to improve
one's scores -- then it doesn't.


If CPANTS stays with a narrow set of well-defined, objective criteria, 
then it can serve both purposes.  Remove or refine the subjective or 
hard-to-measure ones and the numerical gaming that doesn't change 
apparent quality goes away.


Regards,
David Golden



Re: CPANTS is not a game.

2006-05-23 Thread Chris Dolan

On May 23, 2006, at 8:39 AM, David Golden wrote:


How does is_prereq improve quality?

Or, put differently, how does measuring something that an author  
can't control create an incentive to improve?


is_prereq is usually a proxy metric for software maturity: if someone  
thinks your module is good enough that he would rather depend on it  
than reinvent it, then it's probably a better-than-average module on  
CPAN.  is_prereq is usually a vote of confidence, so it is likely a  
good proxy for quality.  In fact I believe that because the author  
(usually) can't control it directly, is_prereq is one of the best  
proxies for quality among the current kwalitee metrics.


CPANTS by itself does nothing to improve quality.  However, by  
drawing attention to kwalitee metrics, I argue that CPANTS draws  
attention to quality too.  Even if many authors don't understand
that, the ones that do make CPANTS worthwhile.  If making it a game
helps to further raise awareness, then I think that should be  
tolerated until CPANTS matures.


IMHO, the best way to improve CPANTS and move away from the game  
mentality is to continually add more tests.  Each added test  
diminishes the weight of previous tests.  This will annoy the  
gamers because their modules keep dropping in kwalitee, while those  
that genuinely care about quality will appreciate the additional  
measurements.  If some gamers get annoyed enough to quit the game,  
that's not a big deal because they didn't really understand the point  
of CPANTS anyway.  If some keep playing the game by cleaving to the  
standards the community sets for them, then all the better for the  
rest of us.
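
The arithmetic behind "each added test diminishes the weight of previous
tests" can be sketched like this, assuming kwalitee is simply the fraction
of checks passed (an assumed scoring model, chosen for illustration):

```perl
use strict;
use warnings;

# Kwalitee as a plain fraction of passed checks (assumed scoring model).
sub kwalitee {
    my ($passed, $total) = @_;
    return $total ? $passed / $total : 0;
}

# A distribution passes all 10 of the original checks...
printf "before: %.2f\n", kwalitee(10, 10);

# ...then 5 new checks are added, of which it passes only 2.
# Each check is now worth 1/15 of the score rather than 1/10.
printf "after:  %.2f\n", kwalitee(12, 15);
```

Under that model a module tuned to the old ten checks drops from 1.00 to
0.80 without changing at all.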


As an example, consider pod_coverage.  It's a rather annoying metric,
most of us agree.  Test::Pod::Coverage really only needs to be run on
the author's machine, not on every user's machine.  However, by
adding pod_coverage to kwalitee we got LOTS of authors to improve
their POD at the cost of wasting cycles on users' machines.  I
think that's a price worth paying -- at least until we rewrite the
metric to actually test POD coverage (which is a decent proxy for POD
quality) instead of just checking for the presence of a
t/pod_coverage.t file (which is a weak proxy for POD quality, but
dramatically easier to measure).


Chris
--
Chris Dolan, Software Developer, Clotho Advanced Media Inc.
608-294-7900, fax 294-7025, 1435 E Main St, Madison WI 53703
vCard: http://www.chrisdolan.net/ChrisDolan.vcf

Clotho Advanced Media, Inc. - Creators of MediaLandscape Software  
(http://www.media-landscape.com/) and partners in the revolutionary  
Croquet project (http://www.opencroquet.org/)





Re: CPANTS is not a game.

2006-05-23 Thread H.Merijn Brand
On Tue, 23 May 2006 09:35:27 -0500, Chris Dolan [EMAIL PROTECTED] wrote:

 On May 23, 2006, at 8:39 AM, David Golden wrote:
 
  How does is_prereq improve quality?
 
  Or, put differently, how does measuring something that an author  
  can't control create an incentive to improve?
 
 is_prereq is usually a proxy metric for software maturity: if someone  
 thinks your module is good enough that he would rather depend on it  
 than reinvent it, then it's probably a better-than-average module on  
 CPAN.

Very true for base modules like Getopt::Long, Test::More, or DBI.
They are built to be used as basic blocks. DBI by itself is quite useless. It
only shows its value combined with a DBD. The DBD itself, however, is less
likely to be required, as it is an end-block. That does not prevent other
authors from extending it (see DBD::Pg), but the functionality by itself is
quite often enough to not invite people to make it a requirement for a new
module (see DBD::mysql). These modules, however, are mature enough to compete.

 is_prereq is usually a vote of confidence,

I respectfully disagree completely.
It's been more than once that I did *not* install a module because it
required a module that I did not trust, either because of (the programming
style of) the author of that module, or because that module required yet
another zillion things I do not want installed (think YAML).

 so it is likely a good proxy for quality.  In fact I believe that because
 the author (usually) can't control it directly, is_prereq is one of the best  
 proxies for quality among the current kwalitee metrics.

I'd say: drop it. It's a worthless metric.

 CPANTS by itself does nothing to improve quality.  However, by  
 drawing attention to kwalitee metrics, I argue that CPANTS draws  
 attention to quality too.  Even if many authors don't understand
 that, the ones that do make CPANTS worthwhile.  If making it a game
 helps to further raise awareness, then I think that should be  
 tolerated until CPANTS matures.

Hurray!
Yes, I used it to go over all my modules again, and use coverage and POD
testing because of CPANTS. That indeed increased the quality of my modules.

 IMHO, the best way to improve CPANTS and move away from the game  
 mentality is to continually add more tests.  Each added test  
 diminishes the weight of previous tests.  This will annoy the  
 gamers because their modules keep dropping in kwalitee, while those  
 that genuinely care about quality will appreciate the additional  
 measurements.  If some gamers get annoyed enough to quit the game,  
 that's not a big deal because they didn't really understand the point  
 of CPANTS anyway.  If some keep playing the game by cleaving to the  
 standards the community sets for them, then all the better for the  
 rest of us.

Tests should make sense. I still think there should be a test for copyright
notices, and TODO lists.

 As an example, consider pod_coverage.  It's a rather annoying metric,  
 most of us agree.  Test::Pod::Coverage really only needs to be run on  
 the author's machine, not on every user's machine.  However, by  
 adding pod_coverage to kwalitee we got LOTS of authors to improve  
 their POD with the cost of wasting cycles on users' machines.

Yep. Here too.

 I think that's a price worth paying -- at least until we rewrite the
 metric to actually test POD coverage (which is a decent proxy for POD
 quality) instead of just checking for the presence of a
 t/pod_coverage.t file (which is a weak proxy for POD quality, but
 dramatically easier to measure).


-- 
H.Merijn Brand        Amsterdam Perl Mongers (http://amsterdam.pm.org/)
using & porting perl 5.6.2, 5.8.x, 5.9.x  on HP-UX 10.20, 11.00, 11.11,
& 11.23, SuSE 10.0, AIX 4.3 & 5.2, and Cygwin.   http://qa.perl.org
http://mirrors.develooper.com/hpux/   http://www.test-smoke.org
   http://www.goldmark.org/jeff/stupid-disclaimers/


Re: CPANTS is not a game.

2006-05-23 Thread David Golden

Chris Dolan wrote:
is_prereq is usually a proxy metric for software maturity: if someone 
thinks your module is good enough that he would rather depend on it than 
reinvent it, then it's probably a better-than-average module on CPAN.  
is_prereq is usually a vote of confidence, so it is likely a good proxy 
for quality.  In fact I believe that because the author (usually) can't 
control it directly, is_prereq is one of the best proxies for quality 
among the current kwalitee metrics.


I'd go so far as to argue that is_prereq is perhaps a more significant
metric than Kwalitee itself, as it is really a measure of *utility*.  I'd
be very interested to see it explored fully, not just as a binary --
e.g. how many different authors used a module in at least one of their
distributions.
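
A non-binary version of is_prereq along those lines might look like the
sketch below. The author names and prerequisite lists are made up, and the
data structure is an assumption rather than how CPANTS stores anything.

```perl
use strict;
use warnings;

# Hypothetical data: each distribution, its author, and what it requires.
my @dists = (
    { author => 'ALICE', requires => [ 'DBI', 'Test::More' ] },
    { author => 'ALICE', requires => [ 'DBI' ] },
    { author => 'BOB',   requires => [ 'DBI', 'YAML' ] },
    { author => 'CAROL', requires => [ 'Test::More' ] },
);

# For each module, count how many *distinct* authors depend on it in at
# least one of their distributions.
sub prereq_author_counts {
    my @d = @_;
    my %authors_for;
    for my $dist (@d) {
        for my $module (@{ $dist->{requires} }) {
            $authors_for{$module}{ $dist->{author} } = 1;
        }
    }
    return map { ( $_ => scalar keys %{ $authors_for{$_} } ) } keys %authors_for;
}

my %count = prereq_author_counts(@dists);
print "$_: $count{$_}\n" for sort keys %count;
```

ALICE requiring DBI twice still counts once, so one prolific author cannot
inflate the number alone.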


That said, it doesn't mean much for quality -- people may well use a 
poor quality distribution if it is sufficiently useful.


As an example, consider pod_coverage.  It's a rather annoying metric, 
most of us agree.  Test::Pod::Coverage really only needs to be run on 
the author's machine, not on every user's machine.  However, by adding 
pod_coverage to kwalitee we got LOTS of authors to improve their POD 
with the cost of wasting cycles on users' machines.  I think that's a 
price worth paying -- at least until we rewrite the metric to actually 
test POD coverage (which is a decent proxy for POD quality) instead of 
just checking for the presence of a t/pod_coverage.t file (which is a 
weak proxy for POD quality, but dramatically easier to measure).


It doesn't check for the existence of a t/pod_coverage.t file.  It
checks that a string like "use Test::Pod::Coverage" appears properly
formatted.  E.g. I believe this is sufficient to get the Kwalitee point:


  # t/pod_coverage.t
  __END__
  use Test::Pod::Coverage;

And, unfortunately, it also misses actual perl that doesn't meet its 
regex expectations.  (E.g. see the bug I recently filed for 
Module::ExtractUse.)
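
To illustrate why a purely textual check can be fooled this way, here is a
small sketch. The regexes are assumptions written for this example; they are
not the code CPANTS or Module::ExtractUse actually uses.

```perl
use strict;
use warnings;

# A test file that never loads Test::Pod::Coverage at run time.
my $source = <<'PERL';
# t/pod_coverage.t
__END__
use Test::Pod::Coverage;
PERL

# Naive check (assumed style of check): any line that looks like the
# use statement counts, wherever it appears in the file.
sub naive_uses_tpc {
    my ($src) = @_;
    return $src =~ /^\s*use\s+Test::Pod::Coverage\b/m ? 1 : 0;
}

# Slightly stricter: drop everything from __END__ or __DATA__ onward first.
sub stricter_uses_tpc {
    my ($src) = @_;
    $src =~ s/^__(?:END|DATA)__\b.*//ms;
    return naive_uses_tpc($src);
}

print "naive:    ", naive_uses_tpc($source), "\n";
print "stricter: ", stricter_uses_tpc($source), "\n";
```

The stricter version is still only a text match; the same statement inside
POD or a string would fool it too.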


Regards,
David



Re: CPANTS is not a game.

2006-05-23 Thread Chris Dolan

On May 23, 2006, at 10:15 AM, H.Merijn Brand wrote:


is_prereq is usually a vote of confidence,


I respectfully disagree completely.
It's been more than once that I did *not* install a module because it
required a module that I did not trust, either because of (the programming
style of) the author of that module, or because that module required yet
another zillion things I do not want installed (think YAML).


I believe we are largely in agreement, but I think your example  
demonstrates that you missed my point.  As you well know, CPANTS is  
not making recommendations whether a module is a good solution for  
your problem, or whether you should trust a given module.  Instead,  
it is currently measuring maturity of a module and the author's  
attention to detail.  is_prereq just measures whether *any* other  
humans trust the module in question.  In that way, is_prereq is like  
a simplistic binary version of Google's page rank.  Just because  
Google rates a page highly doesn't mean it's a good page.  Similarly  
just because CPANTS ranks a module highly doesn't mean it's a good  
module.  However, both is_prereq and page rank are among the current  
best automatable proxies we have for approximating human judgment of  
quality.  Yes, there are great modules with is_prereq of 0 and there  
are great web sites with low page ranks.  But in both cases they're  
harder to find than their highly-linked counterparts, except via word  
of mouth (perlmonks, cpanratings, etc).


I keep advocating for is_prereq because currently it's the only
non-author-controlled input to CPANTS.  That's its primary value, and it
will continue to be important until some better proxy for human
confidence comes along, like incorporating cpanratings into CPANTS (I
do NOT advocate that!) or getting download stats from CPAN (never
gonna happen) or adding voluntary "Someone installed module X" pings
from CPAN.pm.


Chris





Re: CPANTS is not a game.

2006-05-23 Thread Chris Dolan

On May 23, 2006, at 10:34 AM, David Golden wrote:


Chris Dolan wrote:
... just checking for the presence of a t/pod_coverage.t file  
(which is a weak proxy for POD quality, but dramatically easier to  
measure).


It doesn't check for the existence of a t/pod_coverage.t file.  It
checks that a string like "use Test::Pod::Coverage" appears
properly formatted.  E.g. I believe this is sufficient to get the
Kwalitee point:


  # t/pod_coverage.t
  __END__
  use Test::Pod::Coverage;

And, unfortunately, it also misses actual perl that doesn't meet  
its regex expectations.  (E.g. see the bug I recently filed for  
Module::ExtractUse.)


Point taken, apologies for the inaccuracy.  However, that supports my  
argument that pod_coverage is a weak proxy.  I say it's much better  
than nothing, but still weak and the brittleness documented above  
makes it weaker.


Actually, I'd rather see a robust pod_coverage that just checks for
the existence of t/.*pod_coverage.t than a slightly brittle one that
parses that file.  That is, I'd rather see false positives than false
negatives.  To put it another way, I'll tolerate cheaters to avoid
annoying the well-intentioned authors.
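
A minimal sketch of that existence-only check, assuming the distribution's
file list is already available as relative paths (for example from its
MANIFEST); the function name is made up for illustration:

```perl
use strict;
use warnings;

# True if any file under t/ has a name ending in pod_coverage.t.
# Nothing inside the file is inspected, so false positives are possible
# by design.
sub has_pod_coverage_test {
    my @files = @_;
    return ( grep { m{^t/.*pod_coverage\.t$} } @files ) ? 1 : 0;
}

print has_pod_coverage_test('lib/Foo.pm', 't/99_pod_coverage.t'), "\n";
print has_pod_coverage_test('lib/Foo.pm', 't/pod.t'), "\n";
```

Note that the dot before the final "t" is escaped here, which is slightly
stricter than the pattern as written in the message.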


Chris





Re: CPANTS is not a game.

2006-05-23 Thread Michael G Schwern

On 5/23/06, Andy Lester [EMAIL PROTECTED] wrote:


   How do you get authors to actually look at the CPANTS information and
   make corrections?  Well, we like competition.  Make it a game!

 So it was you -- or somebody impersonating you on this list -- who
 managed to persuade me that actually Cpants being a game was a good
 thing!



See, now that's why I write stuff down.  On mailing lists.  So someone else
can remember it for me. ;)


The key is that we're playing for different goals.  Schwern was
saying that the improvement of the modules is a game.  PerlGirl is
making a game out of improving the numeric score for her modules, but
without any improvement of the module itself.



Therein lies the problem.  CPANTS is a fairly direct measure of distribution
quality (as opposed to code quality), so it has become useful as a
distribution improvement tool.  Trouble is, CPANTS as distribution quality
tool and CPANTS as kwalitee measurement have mutually exclusive methods to
reach their goals.  One works better as a game, one does not.

So I guess it's down to this: pick a goal.  Either drop the gaming aspects or
drop any remaining pretense that it's a measurement of module quality.  Since
the whole kwalitee thing is pretty flimsy to begin with, I'd go with just
making it a distribution improvement game.  That's what it seems to do best,
what people like to use it for, and games are fun!


Re: CPANTS is not a game.

2006-05-23 Thread Thomas Klausner
Hi!

I missed most of this discussion due to work and a very important
shopping trip to IKEA (well, maybe not that important, but I'll let you
argue this out with my girlfriend...)

I'm also a bit exhausted now, so here are just some semi-random comments
on this thread:

- I think the biggest problem with CPANTS now is lack of (meaningful)
  tests. There were a lot of suggestions for more tests on this list, in
  private mail (and some even in my brain..). The only problem is that I
  never had time to implement them ($job etc, you know). Then, in
  January this year, I changed jobs, so I had to move CPANTS to a new
  server. At the same time I made some fundamental changes to the internals
  (e.g. factoring out Module::CPANTS::Analyse to allow for stuff like
  Test::Kwalitee and cpants_lint.pl).

  But...

  I'm now settled in my new job (and new apartment), the new and
  improved CPANTS is running on a new server (provided by yi.org, thanks
  again to Tyler MacDonald!). So basically all the time I can spend on
  CPANTS will go into new tests (e.g. a check whether the used modules
  (minus stuff in Module::CoreList) match PREREQ_PM).


- Until I grok PPI and marry it with CPANTS, testing distribution
  kwalitee is basically the only halfway serious option. Even this
  doesn't work all the time (see has_test_pod*).

  Dist tests are low-hanging fruit. But I promise I'll reach
  further. Later...


- CPANTS as a multiplayer online game is an easy way to get people's
  attention without totally offending them. I /could/ send an email to
  everybody on CPAN with some 'helpful hints' on how to improve
  kwalitee. I guess the biggest effect would be to get added to some SPAM
  blacklists etc...

  But with the tongue-in-cheek 'highscore lists', people get
  interested/hooked and DO improve their code. I got several mails from
  people who discovered semi-serious problems in their code (e.g. missing
  'use strict' statements) because they checked their CPANTS ratings.

  If people want to 'cheat', that's ok for me. As soon as I have some
  time to spend on the issue, I can improve the tests (but that's rather
  low on my todo list, as I like to assume that we are all grown-ups and
  do not need faked cpants ratings to boost our ego (I might be
  wrong...)).

  And no, I won't take the fun out of CPANTS.

- With regard to various problems with certain metrics: I won't remove a
  single metric unless I (or somebody else...) have implemented a new one
  (and even then I'll think very hard before removing it).
  
  Again, several people found bugs or gaps in their documentation thanks to
  has_test_pod_coverage. Yes, some people use other tools to check
  pod/code coverage. Ok, some people don't ship their developer test
  suite to the world. But those are very few and very able authors. They
  do not need CPANTS to increase their kwalitee.
  
  But there are hundreds of authors who do need hints to increase
  kwalitee (most likely because there's a new trend in Perl, and not
  everyone attends YAPCs / reads all the lists / etc). CPANTS is a way
  to introduce those new (or not so new) trends to the majority of CPAN
  authors who do not participate in the 'inner circles' of Perl.



-- 
#!/usr/bin/perl   http://domm.zsi.at
for(ref bless{},just'another'perl'hacker){s-:+-$"-g&&print$_.$/}


Re: CPANTS is not a game.

2006-05-23 Thread Andy Lester


On May 23, 2006, at 9:24 PM, James E Keenan wrote:

I've mostly ignored CPANTS, in large part because I refuse to  
include t/pod.t and t/pod_coverage.t in my distributions because  
they don't pick up the format in which some of my best  
documentation is written.  And refusing to include those tests  
lowers my kwalitee score.


Have we talked about this?  I'd like to make those more useful to you  
if I can.


--
Andy Lester = [EMAIL PROTECTED] = www.petdance.com = AIM:petdance