CPANTS is not a game.
I haven't looked at what's going on in CPANTS for a while but Andy's post made me have a look and oh dear.

There's a problem. CPANTS is not a game. If you make it a game, the system does not work.

Let's review. CPANTS is not a measure of module quality, since module quality is not well defined and is difficult to measure. CPANTS is a measure of those things which can be measured about quality modules.

To use a real world example, fast cars tend to be red. There is no direct relationship between a CPANTS kwalitee indicator and the actual quality of the module; it is merely a statistical indicator. A car can be red without being fast. Conversely, a blue car can be fast.

If I alter my distribution to increase its CPANTS score I have not reduced the bug count, made it more efficient, improved the documentation or given it an easier interface. I have not done any of the difficult-to-measure things which we associate with improving the real quality of a module. [1] If I paint my car red, it does not go faster.

If people *think* red cars are faster, and I want to sell my car, I'm going to paint my car red. People will now be more likely to buy my car thinking it goes faster, but they have been fooled. I have gamed the system, used my knowledge of its rules to my own gain.

The more people game CPANTS, the more they alter their modules specifically to increase their CPANTS score, the less useful the CPANTS indicators will be. If everyone paints their car red then it's no longer a valid indicator of whether it's a fast car. The indicator is now useless.

CPANTS is already easy to game since its rules are published. This is hard enough to deal with, but gaming is gleefully encouraged! It's called The CPANTS Game. There's a scoreboard with a top 40, hall of fame and shame. Failure to have a CPANTS indicator is expressed as a shortcoming which must be remedied.

In order to best perform its function and reduce the urge to game, CPANTS should...

* not express itself as a game or competition
* not widely publish its rules
* express its indicators in a neutral fashion without indication as to which state of the indicator is better
* not suggest ways to fix your module
* not publish a numerical score which one may be compelled to raise, instead a less precise indicator such as color
* not publish a scoreboard, top ten, hall of fame or shame
* add a feedback loop so that certain human-checked low and high quality modules are checked against their CPANTS indicators to see which still have value. Adjust the indicator's weights accordingly.

Sorry if it sucks all the fun out of it, and I don't mean to poo-poo the work Thomas and others have done, but I think we've forgotten what CPANTS is.

[1] I may have improved the real quality of the distribution, but not the payload contained within. Using CPANTS as a distribution improvement tool is another story and may ultimately be its best destiny.
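The feedback loop in the last bullet of the list above could be sketched roughly as follows. This is only an illustration in Python, not anything CPANTS does: the function name `indicator_weights`, the indicator names, and the sample data are all made up, and "weight" is assumed to mean how much more often an indicator is set among human-rated good modules than among human-rated poor ones.

```python
def indicator_weights(labelled):
    """Weight each indicator by how well it separates good modules from poor ones.

    `labelled` is a list of (is_high_quality, indicators) pairs, where
    `indicators` maps an indicator name to True/False. Both quality groups
    must be non-empty. All names here are hypothetical.
    """
    names = sorted({name for _, ind in labelled for name in ind})
    good = [ind for ok, ind in labelled if ok]
    poor = [ind for ok, ind in labelled if not ok]
    weights = {}
    for name in names:
        good_rate = sum(ind.get(name, False) for ind in good) / len(good)
        poor_rate = sum(ind.get(name, False) for ind in poor) / len(poor)
        # An indicator that everyone satisfies (every car painted red)
        # no longer separates the groups and gets weight 0.
        weights[name] = max(0.0, good_rate - poor_rate)
    return weights


# Invented, human-checked sample: every module has a README,
# but only the good ones ship tests.
sample = [
    (True, {"has_readme": True, "has_tests": True}),
    (True, {"has_readme": True, "has_tests": True}),
    (False, {"has_readme": True, "has_tests": False}),
    (False, {"has_readme": True, "has_tests": False}),
]
weights = indicator_weights(sample)
print(weights)
```

In this toy data the README indicator ends up with no weight because it has been satisfied across the board, which is the "everyone paints their car red" situation described above.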
Re: CPANTS is not a game.
Michael G Schwern writes:

> There's a problem. CPANTS is not a game. If you make it a game, the system does not work.

Hi there. I made a similarish point on this list about a year ago, to which you replied:

http://groups.google.co.uk/[EMAIL PROTECTED]

Your reply included:

> Finally, the scoreboard does have a purpose. Part of the original idea of CPANTS was to provide an automated checklist for a good distribution. ... Then, if this were a web page, the author could just click on that to get an explanation of why this is a Good Thing and what they can do to fix it. ... How do you get authors to actually look at the CPANTS information and make corrections? Well, we like competition. Make it a game!

So it was you -- or somebody impersonating you on this list -- who managed to persuade me that actually Cpants being a game was a good thing!

Smylers
Re: CPANTS is not a game.
> > How do you get authors to actually look at the CPANTS information and make corrections? Well, we like competition. Make it a game!
>
> So it was you -- or somebody impersonating you on this list -- who managed to persuade me that actually Cpants being a game was a good thing!

The key is that we're playing for different goals. Schwern was saying that the improvement of the modules is a game. PerlGirl is making a game out of improving the numeric score for her modules, but without any improvement of the module itself.

--
Andy Lester = [EMAIL PROTECTED] = www.petdance.com = AIM:petdance
Re: CPANTS is not a game.
Andy Lester wrote:

> > > How do you get authors to actually look at the CPANTS information and make corrections? Well, we like competition. Make it a game!
> >
> > So it was you -- or somebody impersonating you on this list -- who managed to persuade me that actually Cpants being a game was a good thing!
>
> The key is that we're playing for different goals. Schwern was saying that the improvement of the modules is a game. PerlGirl is making a game out of improving the numeric score for her modules, but without any improvement of the module itself.

How does is_prereq improve quality? Or, put differently, how does measuring something that an author can't control create an incentive to improve?

If CPANTS is an objective quality measure, then it makes sense. If CPANTS is a quality game -- i.e. a friendly competition to improve one's scores -- then it doesn't.

If CPANTS stays with a narrow set of well-defined, objective criteria, then it can serve both purposes. Remove or refine the subjective or hard-to-measure ones and the numerical gaming that doesn't change apparent quality goes away.

Regards,
David Golden
Re: CPANTS is not a game.
On May 23, 2006, at 8:39 AM, David Golden wrote:

> How does is_prereq improve quality? Or, put differently, how does measuring something that an author can't control create an incentive to improve?

is_prereq is usually a proxy metric for software maturity: if someone thinks your module is good enough that he would rather depend on it than reinvent it, then it's probably a better-than-average module on CPAN. is_prereq is usually a vote of confidence, so it is likely a good proxy for quality. In fact I believe that because the author (usually) can't control it directly, is_prereq is one of the best proxies for quality among the current kwalitee metrics.

CPANTS by itself does nothing to improve quality. However, by drawing attention to kwalitee metrics, I argue that CPANTS draws attention to quality too. Even if many authors don't understand that, the ones that do make CPANTS worthwhile. If making it a game helps to further raise awareness, then I think that should be tolerated until CPANTS matures.

IMHO, the best way to improve CPANTS and move away from the game mentality is to continually add more tests. Each added test diminishes the weight of previous tests. This will annoy the gamers because their modules keep dropping in kwalitee, while those that genuinely care about quality will appreciate the additional measurements. If some gamers get annoyed enough to quit the game, that's not a big deal because they didn't really understand the point of CPANTS anyway. If some keep playing the game by cleaving to the standards the community sets for them, then all the better for the rest of us.

As an example, consider pod_coverage. It's a rather annoying metric, most of us agree. Test::Pod::Coverage really only needs to be run on the author's machine, not on every user's machine. However, by adding pod_coverage to kwalitee we got LOTS of authors to improve their POD at the cost of wasting cycles on users' machines. I think that's a price worth paying -- at least until we rewrite the metric to actually test POD coverage (which is a decent proxy for POD quality) instead of just checking for the presence of a t/pod_coverage.t file (which is a weak proxy for POD quality, but dramatically easier to measure).

Chris

--
Chris Dolan, Software Developer, Clotho Advanced Media Inc.
608-294-7900, fax 294-7025, 1435 E Main St, Madison WI 53703
vCard: http://www.chrisdolan.net/ChrisDolan.vcf

Clotho Advanced Media, Inc. - Creators of MediaLandscape Software (http://www.media-landscape.com/) and partners in the revolutionary Croquet project (http://www.opencroquet.org/)
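The claim above that each added test diminishes the weight of the previous ones is simple arithmetic if kwalitee is taken to be the fraction of metrics passed with every metric weighted equally. That scoring rule is an assumption made for this sketch, and the counts (10 metrics growing to 15) are invented.

```python
from fractions import Fraction


def kwalitee(passed, total):
    """Score as the fraction of metrics passed, all metrics weighted equally."""
    return Fraction(passed, total)


def metric_weight(total):
    """Share of the overall score carried by any single metric."""
    return Fraction(1, total)


# Before: 10 metrics exist and both authors pass all of them.
before_gamer = kwalitee(10, 10)
before_careful = kwalitee(10, 10)

# After: 5 new metrics are added. The author who only chased the old
# metrics passes none of the new ones; the careful author passes 4.
after_gamer = kwalitee(10, 15)
after_careful = kwalitee(14, 15)

print("weight per metric:", metric_weight(10), "->", metric_weight(15))
print("gamer:", before_gamer, "->", after_gamer)
print("careful author:", before_careful, "->", after_careful)
```

Under these assumptions each old metric's share falls from 1/10 to 1/15, and the score that was tuned only against the old metrics drops further than the one backed by general care.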
Re: CPANTS is not a game.
On Tue, 23 May 2006 09:35:27 -0500, Chris Dolan [EMAIL PROTECTED] wrote:

> On May 23, 2006, at 8:39 AM, David Golden wrote:
>
> > How does is_prereq improve quality? Or, put differently, how does measuring something that an author can't control create an incentive to improve?
>
> is_prereq is usually a proxy metric for software maturity: if someone thinks your module is good enough that he would rather depend on it than reinvent it, then it's probably a better-than-average module on CPAN.

Very true for base modules like Getopt::Long, Test::More, or DBI. They are built to be used as basic blocks. DBI on itself is quite useless. It only shows its value combined with a DBD. The DBD itself however is less likely to be required, as it is an end-block. That does not stop other authors from extending it (see DBD::Pg), but the functionality on itself quite often is enough to not invite people to make it a requirement for a new module (see DBD::mysql). These modules however are mature enough to compete.

> is_prereq is usually a vote of confidence,

I respectfully disagree completely. It's been more than once that I did *not* install a module because it required a module that I did not trust, either because of (the programming style of) the author of that module, or because that module required yet another zillion things I do not want installed (think YAML).

> so it is likely a good proxy for quality. In fact I believe that because the author (usually) can't control it directly, is_prereq is one of the best proxies for quality among the current kwalitee metrics.

I'd say: drop it. It's a worthless metric.

> CPANTS by itself does nothing to improve quality. However, by drawing attention to kwalitee metrics, I argue that CPANTS draws attention to quality too. Even if many authors don't understand that, the ones that do makes CPANTS worthwhile. If making it a game helps to further raise awareness, then I think that should be tolerated until CPANTS matures.

Hurray! Yes, I used it to go over all my modules again, and use coverage and pod testing because of CPANTS. That indeed increased the qualitee of my modules.

> IMHO, the best way to improve CPANTS and move away from the game mentality is to continually add more tests. Each added test diminishes the weight of previous tests. This will annoy the gamers because their modules keep dropping in kwalitee, while those that genuinely care about quality will appreciate the additional measurements. If some gamers get annoyed enough to quit the game, that's not a big deal because they didn't really understand the point of CPANTS anyway. If some keep playing the game by cleaving to the standards the community sets for them, then all the better for the rest of us.

Tests should make sense. I still think there should be a test for copyright notices, and TODO lists.

> As an example, consider pod_coverage. It's a rather annoying metric, most of us agree. Test::Pod::Coverage really only needs to be run on the author's machine, not on every user's machine. However, by adding pod_coverage to kwalitee we got LOTS of authors to improve their POD with the cost of wasting cycles on users' machines.

Yep. Here too.

> I think that's a price worth paying -- at least until we rewrite the metric to actually test POD coverage (which is a decent proxy for POD quality) instead of just checking for the presence of a t/pod_coverage.t file (which is a weak proxy for POD quality, but dramatically easier to measure).

--
H.Merijn Brand    Amsterdam Perl Mongers (http://amsterdam.pm.org/)
using & porting perl 5.6.2, 5.8.x, 5.9.x on HP-UX 10.20, 11.00, 11.11, 11.23, SuSE 10.0, AIX 4.3 & 5.2, and Cygwin.
http://qa.perl.org
http://mirrors.develooper.com/hpux/
http://www.test-smoke.org
http://www.goldmark.org/jeff/stupid-disclaimers/
Re: CPANTS is not a game.
Chris Dolan wrote:

> is_prereq is usually a proxy metric for software maturity: if someone thinks your module is good enough that he would rather depend on it than reinvent it, then it's probably a better-than-average module on CPAN. is_prereq is usually a vote of confidence, so it is likely a good proxy for quality. In fact I believe that because the author (usually) can't control it directly, is_prereq is one of the best proxies for quality among the current kwalitee metrics.

I'd go so far as to argue that is_prereq is perhaps a more significant metric than Kwalitee itself, as it is really a measure of *utility*. I'd be very interested to see it explored fully, not just as a binary -- e.g. how many different authors used a module in at least one of their distributions. That said, it doesn't mean much for quality -- people may well use a poor quality distribution if it is sufficiently useful.

> As an example, consider pod_coverage. It's a rather annoying metric, most of us agree. Test::Pod::Coverage really only needs to be run on the author's machine, not on every user's machine. However, by adding pod_coverage to kwalitee we got LOTS of authors to improve their POD with the cost of wasting cycles on users' machines. I think that's a price worth paying -- at least until we rewrite the metric to actually test POD coverage (which is a decent proxy for POD quality) instead of just checking for the presence of a t/pod_coverage.t file (which is a weak proxy for POD quality, but dramatically easier to measure).

It doesn't check for the existence of a t/pod_coverage.t file. It checks that a string like "use Test::Pod::Coverage" appears properly formatted. E.g. I believe this is sufficient to get the Kwalitee point:

    # t/pod_coverage.t
    __END__
    use Test::Pod::Coverage;

And, unfortunately, it also misses actual perl that doesn't meet its regex expectations. (E.g. see the bug I recently filed for Module::ExtractUse.)

Regards,
David
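Both failure modes described above (a false positive from text after `__END__`, and a false negative from real Perl that the pattern does not expect) can be reproduced with a toy text scan. The sketch below is in Python and uses a made-up regular expression; it is not the pattern CPANTS or Module::ExtractUse actually uses, and the function names are hypothetical.

```python
import re

# Hypothetical pattern: a line that begins with "use Test::Pod::Coverage".
USE_LINE = re.compile(r"^\s*use\s+Test::Pod::Coverage\b", re.MULTILINE)
# Perl stops compiling at a line consisting of __END__ or __DATA__.
END_MARKER = re.compile(r"^__(?:END|DATA)__\s*$", re.MULTILINE)


def naive_check(source):
    """Pass if the use line appears anywhere in the file text."""
    return USE_LINE.search(source) is not None


def check_before_end(source):
    """Same scan, but ignore everything after __END__ / __DATA__."""
    marker = END_MARKER.search(source)
    code = source[:marker.start()] if marker else source
    return USE_LINE.search(code) is not None


# The file from the message above: the use line is never compiled.
gamed = "# t/pod_coverage.t\n__END__\nuse Test::Pod::Coverage;\n"

# A genuine test that loads the module inside a string eval,
# which a line-anchored pattern does not recognise.
genuine = 'use Test::More;\neval "use Test::Pod::Coverage 1.04";\n'

print(naive_check(gamed), check_before_end(gamed))
print(naive_check(genuine), check_before_end(genuine))
```

Cutting the text at `__END__` closes the first hole, but the string-eval case still fails either way, which suggests that pattern matching alone will keep producing false negatives for well-intentioned authors.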
Re: CPANTS is not a game.
On May 23, 2006, at 10:15 AM, H.Merijn Brand wrote:

> > is_prereq is usually a vote of confidence,
>
> I respectfully disagree completely. It's been more than once that I did *not* install a module because it required a module that I did not trust, either because of (the programming style of) the author of that module, or because that module required yet another zillion things I do not want installed (think YAML).

I believe we are largely in agreement, but I think your example demonstrates that you missed my point. As you well know, CPANTS is not making recommendations whether a module is a good solution for your problem, or whether you should trust a given module. Instead, it is currently measuring maturity of a module and the author's attention to detail. is_prereq just measures whether *any* other humans trust the module in question.

In that way, is_prereq is like a simplistic binary version of Google's page rank. Just because Google rates a page highly doesn't mean it's a good page. Similarly, just because CPANTS ranks a module highly doesn't mean it's a good module. However, both is_prereq and page rank are among the current best automatable proxies we have for approximating human judgment of quality.

Yes, there are great modules with is_prereq of 0 and there are great web sites with low page ranks. But in both cases they're harder to find than their highly-linked counterparts, except via word of mouth (perlmonks, cpanratings, etc).

I keep advocating for is_prereq because currently it's the only non-author-controlled input to CPANTS. That's its primary value, and it will continue to be important until some better proxy for human confidence comes along, like incorporating cpanratings into CPANTS (I do NOT advocate that!) or getting download stats from CPAN (never gonna happen) or adding voluntary "Someone installed module X" pings from CPAN.pm.

Chris
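The difference between the binary is_prereq described above and the per-author count suggested earlier in the thread can be shown with a small sketch. Everything here is invented for illustration: the distribution names, the author IDs, and the decision to ignore an author depending on their own distribution (chosen because the argument rests on the input not being author-controlled).

```python
from collections import defaultdict


def dependent_authors(dists):
    """Map each prerequisite to the set of *other* authors who depend on it.

    `dists` is a list of dicts with 'name', 'author' and 'prereqs' keys.
    """
    owner = {d["name"]: d["author"] for d in dists}
    users = defaultdict(set)
    for d in dists:
        for prereq in d["prereqs"]:
            # Skip self-dependencies: an author can create those at will.
            if owner.get(prereq) != d["author"]:
                users[prereq].add(d["author"])
    return users


# Made-up index: one base library, several consumers, one end-block.
dists = [
    {"name": "Base-Lib", "author": "ALICE", "prereqs": []},
    {"name": "Base-Lib-Extra", "author": "ALICE", "prereqs": ["Base-Lib"]},
    {"name": "App-One", "author": "BOB", "prereqs": ["Base-Lib"]},
    {"name": "App-Two", "author": "CAROL", "prereqs": ["Base-Lib"]},
    {"name": "End-Block", "author": "DAVE", "prereqs": ["Base-Lib"]},
]

users = dependent_authors(dists)
for d in dists:
    count = len(users[d["name"]])
    print(d["name"], "is_prereq:", int(count > 0), "distinct authors:", count)
```

The binary form only distinguishes Base-Lib from everything else, while the count also says how widely it is relied on; an end-block such as the DBD drivers mentioned in the thread scores 0 in both forms regardless of its maturity.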
Re: CPANTS is not a game.
On May 23, 2006, at 10:34 AM, David Golden wrote:

> Chris Dolan wrote:
>
> > ... just checking for the presence of a t/pod_coverage.t file (which is a weak proxy for POD quality, but dramatically easier to measure).
>
> It doesn't check for the existence of a t/pod_coverage.t file. It checks that a string like "use Test::Pod::Coverage" appears properly formatted. E.g. I believe this is sufficient to get the Kwalitee point:
>
>     # t/pod_coverage.t
>     __END__
>     use Test::Pod::Coverage;
>
> And, unfortunately, it also misses actual perl that doesn't meet its regex expectations. (E.g. see the bug I recently filed for Module::ExtractUse.)

Point taken, apologies for the inaccuracy. However, that supports my argument that pod_coverage is a weak proxy. I say it's much better than nothing, but still weak, and the brittleness documented above makes it weaker.

Actually, I'd rather see a robust pod_coverage that just checks for the existence of t/.*pod_coverage.t than a slightly brittle one that parses that file. That is, I'd rather see false positives than false negatives. To put it another way, I'll tolerate cheaters to avoid annoying the well-intentioned authors.

Chris
Re: CPANTS is not a game.
On 5/23/06, Andy Lester [EMAIL PROTECTED] wrote:

> > > How do you get authors to actually look at the CPANTS information and make corrections? Well, we like competition. Make it a game!
> >
> > So it was you -- or somebody impersonating you on this list -- who managed to persuade me that actually Cpants being a game was a good thing!

See, now that's why I write stuff down. On mailing lists. So someone else can remember it for me. ;)

> The key is that we're playing for different goals. Schwern was saying that the improvement of the modules is a game. PerlGirl is making a game out of improving the numeric score for her modules, but without any improvement of the module itself.

Therein lies the problem. CPANTS is a fairly direct measure of distribution quality (as opposed to code quality), so it has become useful as a distribution improvement tool. Trouble is, CPANTS as distribution quality tool and CPANTS as kwalitee measurement have mutually exclusive methods to reach their goals. One works better as a game, one does not.

So I guess it's down to this: pick a goal. Either drop the gaming aspects or drop any remaining pretense that it's a measurement of module quality.

Since the whole kwalitee thing is pretty flimsy to begin with, I'd go with just making it a distribution improvement game. That's what it seems to do best, what people like to use it for, and games are fun!
Re: CPANTS is not a game.
Hi!

I missed most of this discussion due to work and a very important shopping trip to IKEA (well, maybe not that important, but I'll let you argue this out with my girlfriend...)

I'm also a bit exhausted now, so here are just some semi-random comments on this thread:

- I think the biggest problem with CPANTS now is lack of (meaningful) tests. There were a lot of suggestions for more tests on this list, in private mail (and some even in my brain..). The only problem is that I never had time to implement them ($job etc, you know). Then, in January this year, I changed jobs, so I had to move CPANTS to a new server. At the same time I did some fundamental changes to the internals (e.g. factor out Module::CPANTS::Analyse to allow for stuff like Test::Kwalitee and cpants_lint.pl).

  But... I'm now settled in my new job (and new apartment), and the new and improved CPANTS is running on a new server (provided by yi.org, thanks again to Tyler MacDonald!). So basically all the time I can spend on CPANTS will go into new tests (eg a check if used modules (minus stuff in Module::CoreList) matches PREREQ_PM).

- Until I grok PPI and marry it with CPANTS, testing distribution kwalitee is basically the only halfway serious option. Even this doesn't work all the time (see has_test_pod*). Dist tests are low-hanging fruit. But I promise I'll reach further. Later...

- CPANTS as a multiplayer online game is an easy way to get people's attention without totally offending them. I /could/ send an email to everybody on CPAN with some 'helpful hints' on how to improve kwalitee. I guess the biggest effect would be to get added to some SPAM blacklists etc... But with the tongue-in-cheek 'highscore lists', people get interested/hooked and DO improve their code. I got several mails from people who discovered semi-serious problems in their code (eg missing 'use strict' statements) because they checked their CPANTS ratings.

  If people want to 'cheat', that's ok for me. As soon as I have some time to spend on the issue, I can improve the tests (but that's rather low on my todo list, as I like to assume that we are all grown-ups and do not need faked cpants ratings to boost our ego (I might be wrong...)).

  And no, I won't take the fun out of CPANTS.

- With regard to various problems with certain metrics: I won't remove a single metric unless I (or somebody else...) have implemented a new one (and even then I'll think very hard before removing it).

  Again, several people found bugs or missing documentation thanks to has_test_pod_coverage. Yes, some people use other tools to check pod/code coverage. Ok, some people don't ship their developer test suite to the world. But those are very few and very able authors. They do not need CPANTS to increase their kwalitee. But there are hundreds of authors who do need hints to increase kwalitee (most likely because there's a new trend in Perl, and not everyone attends YAPCs / reads all the lists / etc). CPANTS is a way to introduce those new (or not so new) trends to the majority of CPAN authors who do not participate in the 'inner circles' of Perl.

--
#!/usr/bin/perl http://domm.zsi.at
for(ref bless{},just'another'perl'hacker){s-:+-$-gprint$_.$/}
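The planned test mentioned above, comparing used modules (minus core modules) against PREREQ_PM, amounts to a couple of set differences. The following is a rough Python sketch of that comparison only; the function name and the module lists are illustrative, the core set is a hand-written stand-in for what Module::CoreList would supply, and extracting the used modules from source code is assumed to have happened already.

```python
def prereq_mismatch(used, declared, core):
    """Compare modules a distribution uses with the prerequisites it declares.

    Returns modules that are used, not in core, and not declared ("missing"),
    and modules that are declared but never used ("unused").
    """
    used, declared, core = set(used), set(declared), set(core)
    needed = used - core
    return {
        "missing": needed - declared,
        "unused": declared - used,
    }


# Illustrative inputs, not taken from any real distribution.
core = {"strict", "warnings", "File::Spec", "Test::More"}
used = {"strict", "warnings", "File::Spec", "DBI", "YAML"}
declared = {"DBI", "Text::Template"}

result = prereq_mismatch(used, declared, core)
print("missing from PREREQ_PM:", sorted(result["missing"]))
print("declared but unused:", sorted(result["unused"]))
```

A real check would also have to decide which Perl version's core list applies and how to treat optional or test-only dependencies, neither of which this sketch attempts.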
Re: CPANTS is not a game.
On May 23, 2006, at 9:24 PM, James E Keenan wrote:

> I've mostly ignored CPANTS, in large part because I refuse to include t/pod.t and t/pod_coverage.t in my distributions because they don't pick up the format in which some of my best documentation is written. And refusing to include those tests lowers my kwalitee score.

Have we talked about this? I'd like to make those more useful to you if I can.

--
Andy Lester = [EMAIL PROTECTED] = www.petdance.com = AIM:petdance