On Sep 12, 2006, at 9:24 AM, Salve J Nilsen wrote:

Any metric that catches bad things, particularly bad technical things, is going to be just fine. Metrics that try to push "good" behavior are fraught with trouble, because they start pushing people in odd directions.

Do you have an example on this? (Any pointer would be wonderful.)

I have two: pod.t and pod_coverage.t. These are pointless to run on an end-user's machine. At best they merely repeat checks whose results cannot differ from the author's own run, and just waste processor cycles. At worst they fail and cause false negative test reports. The prevalence of those two tests in CPAN modules is almost entirely due to the influence of CPANTS.
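For reference, the t/pod.t that gets copied into distribution after distribution is essentially the boilerplate from the Test::Pod synopsis, something like this:

    use Test::More;

    # Skip unless Test::Pod is installed on the machine running the tests.
    eval "use Test::Pod 1.00";
    plan skip_all => "Test::Pod 1.00 required for testing POD" if $@;

    # Check every POD file in the distribution for syntax errors.
    all_pod_files_ok();

Every installation re-runs that check, even though the POD in the tarball is identical to what the author already tested.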

Despite the criticisms above, the CPANTS POD tests have ultimately succeeded: they have convinced authors to do a better job of documenting all methods, or of marking private methods as such. I think it's hard to argue that the POD tests, in particular, have had anything but a net positive effect on the quality of CPAN.

 ===

Now begins a huge digression on encouraging good behavior vs. discouraging bad behavior, leading to a recommendation for CPANTS.

One flaw in the language of Adam's assertion is that he doesn't clearly distinguish CPANTS's goals from its metrics. Discouraging a specific bad behavior is just a way of encouraging other, unspecified behavior, which could be good or bad. IF FEASIBLE, it's always better to encourage good behavior. The danger is not metrics that encourage good behavior, but metrics that encourage one specific good behavior when there are a multitude of equally valid good behaviors. In that case, discouraging the bad behaviors is the best you can do. I believe that's what Adam was trying to say.

I'm going to continue with the specific example of POD Kwalitee. The CPANTS goal is (obviously) to encourage higher quality documentation. However, that's a hard thing for a computer to measure. So, instead we try to discourage specific bad behaviors: POD syntax errors and undocumented subroutines.

Let me run through an exercise. As a first step, consider how one would arrive at the need for the CPANTS POD tests:

Goal: encourage high-quality CPAN packages
  Assertion: high-quality packages have high-quality documentation
    Assertion: high-quality documentation is parseable by doc tools
      Subgoal: discourage invalid POD
        Measure: Is the POD valid for each module in the package?
    Assertion: high-quality documentation describes every public subroutine
      Subgoal: discourage undocumented subs
        Measure: Does each module in the package have documentation for every public sub?

The next step in the exercise is how to implement those measures. The current CPANTS uses simple proxies: it assumes that if a t/*pod.t file is present then the POD is valid, and that if a t/*pod_coverage.t file is present then all subroutines are documented.
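In other words, the existence check is the whole metric. A hypothetical sketch of the proxy logic (not the actual Module::CPANTS code) is just:

    # Hypothetical sketch of the existence-check proxy, run from the
    # unpacked distribution's top directory -- not the actual CPANTS code.
    my @pod_tests      = glob('t/*pod.t');
    my @coverage_tests = glob('t/*pod_coverage.t');

    printf "ships a POD syntax test:   %s\n", @pod_tests      ? 'yes' : 'no';
    printf "ships a POD coverage test: %s\n", @coverage_tests ? 'yes' : 'no';

Nothing in that check ever looks at the POD itself.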

Note that my subgoals are stated as discouraging bad behavior. It's always easier to test for failures than successes (case in point: governments usually create laws, not commandments). The CPANTS POD tests, however, check for good behavior ("Thou shalt add pod.t to thy package") instead of checking for bad behavior ("Thou shalt not include invalid POD in thy module").

Wouldn't it be better to just measure the POD validity directly instead of using a proxy for that measurement? As an outsider, I'll guess that CPANTS resorts to the proxies for these reasons, in order of importance:
 1) reliability
 2) ease of implementation
 3) speed of evaluation
Certainly, CPANTS wants to avoid false negatives at all costs. Its influence on the community is purely voluntary, so it needs to avoid antagonizing authors. If CPANTS mistakenly says that your module has incomplete POD coverage when you *know* that you have documented every method, you're going to be annoyed. Some authors may decline to participate in CPANTS if they get annoyed enough. So, false negatives are perilous to the success of the entire project.
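For concreteness, measuring directly would mean running the checks themselves rather than looking for test files. A rough sketch using Pod::Checker and Pod::Coverage follows; both are real modules, but the file and package names here are made up, and a production implementation would have to cope with sandboxing, @INC setup, and platform quirks that this ignores:

    use Pod::Checker;
    use Pod::Coverage;

    # Syntax: podchecker() reports problems to the given handle and returns
    # the number of errors found (-1 if the file contains no POD at all).
    my $errors = podchecker('lib/Foo/Bar.pm', \*STDERR);
    print "POD syntax errors: $errors\n";

    # Coverage: list the public subs that have no documentation.
    # (Assumes the hypothetical Foo::Bar can be loaded, e.g. run with -Ilib.)
    my $pc    = Pod::Coverage->new(package => 'Foo::Bar');
    my @naked = $pc->naked;
    print "undocumented subs: @naked\n" if @naked;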

I believe the main reason that CPANTS measures t/*pod*.t existence instead of directly running Test::Pod/Test::Pod::Coverage is that the latter is harder for the author to predict consistently before he/she uploads to CPAN. But with the improved availability of offline CPANTS analysis (via Module::CPANTS::Analyse), it should be feasible for authors to catch and fix even the more complex false negatives before uploading to CPAN.

So, as a technological expedient, CPANTS is encouraging a sub-optimal good behavior (adding t/*pod*.t to CPAN releases) in the process of trying to discourage a bad behavior. To fix this, we need to remove the need for the expedient. That means letting CPANTS perform more complicated analyses and letting authors run those analyses offline exactly as they would be performed online on cpants.perl.org.

Thus, I finally get to an action item: CPANTS should encourage authors to run Module::CPANTS::Analyse offline before uploading to CPAN. I assert that if we can convince authors to perform more thorough tests of their packages at author-time, then the quality of CPAN will improve. And the more closely the metrics match our real quality goals, the bigger the quality delta we will achieve.
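To make that concrete, author-time use of Module::CPANTS::Analyse would look roughly like the following. I'm writing this from memory of the module's documentation, so treat the exact method names as assumptions and double-check its POD; I believe the distribution also ships a cpants_lint.pl script that wraps the same steps.

    use Module::CPANTS::Analyse;
    use Data::Dumper;

    # Method names below are assumed from the Module::CPANTS::Analyse docs;
    # verify against the module's POD. The tarball name is hypothetical.
    my $analyser = Module::CPANTS::Analyse->new({
        dist => 'Foo-Bar-0.01.tar.gz',
    });
    $analyser->unpack;
    $analyser->analyse;
    $analyser->calc_kwalitee;

    # The accumulated per-metric results end up in the analyser's data hash.
    print Dumper($analyser->d->{kwalitee});

If every author ran something like that before each upload, failing metrics would be fixed before the release ever hit CPAN.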

Chris

--
Chris Dolan, Software Developer, Clotho Advanced Media Inc.
608-294-7900, fax 294-7025, 1435 E Main St, Madison WI 53703
vCard: http://www.chrisdolan.net/ChrisDolan.vcf

Clotho Advanced Media, Inc. - Creators of MediaLandscape Software (http://www.media-landscape.com/) and partners in the revolutionary Croquet project (http://www.opencroquet.org/)

