I've been following the progress of pugs, in awe of its speed, the
breadth of its scope and the exemplary way in which Autrijus has run
the project.  And I've been considering how best I could contribute.

One of the many aspects of the project that is to be admired is the test
driven development.  Knowing a little about code coverage I wondered
whether there were any Haskell code coverage tools that we could use to
aid in that process.  The best option seemed to be something based on
hat (http://www.haskell.org/hat/) but unfortunately pugs uses some
libraries which hat does not support.  The last release of hat came two
years ago, so I'm not expecting those libraries to be supported in the
near future.  If anyone knows of a way to do code coverage in Haskell,
and specifically of pugs, please pipe up.

So next I thought about code coverage of Perl6 itself.  Now that people
are starting to write Perl6 modules, code coverage could be useful.  It
seemed to me that walking the AST and hooking into the reduce function
could provide the required data.  Autrijus confirmed this, but thought
there might be a better way to do it and asked me to send requirements
in time for the refactoring of Eval.hs next week.  So here they are.

At the most basic level two sets of data are required: the constructs in
the program, and which of them were exercised.  So for subroutine
coverage, for example, that would mean a list of all the subroutines in
the program and which of them were called.  But a little more
information can be useful.  For example, the exact location of the
construct in the source code, the number of times it was exercised, how
long it took to be exercised, and so on.
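
To make those two sets of data concrete, here is a minimal sketch of
walking an AST to list the constructs and hooking a reduce function to
count how often each is exercised.  It is written in Python purely for
brevity.  The toy Node type, reduce_node and run_with_coverage are all
hypothetical names invented for illustration; nothing here reflects how
Eval.hs is actually structured.

```python
# Hypothetical toy AST and evaluator; none of these names come from pugs.
from collections import Counter
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Node:
    kind: str                      # "num", "add" or "if"
    line: int                      # source location of the construct
    value: int = 0                 # used by "num" nodes only
    kids: List["Node"] = field(default_factory=list)


def constructs(node: Node) -> List[Node]:
    """First data set: every construct in the program, found by walking the AST."""
    found = [node]
    for kid in node.kids:
        found.extend(constructs(kid))
    return found


def reduce_node(node: Node, hook: Optional[Callable[[Node], None]] = None) -> int:
    """Evaluate a node, calling the hook (if any) each time a node is reduced."""
    if hook is not None:
        hook(node)
    if node.kind == "num":
        return node.value
    if node.kind == "add":
        return sum(reduce_node(k, hook) for k in node.kids)
    if node.kind == "if":
        cond, then_branch, else_branch = node.kids
        chosen = then_branch if reduce_node(cond, hook) else else_branch
        return reduce_node(chosen, hook)
    raise ValueError(f"unknown node kind: {node.kind}")


def run_with_coverage(program: Node):
    """Second data set: how many times each construct was exercised."""
    hits: Counter = Counter()
    # id(n) stands in for a unique construct identifier in this sketch.
    result = reduce_node(program, lambda n: hits.update([id(n)]))
    report = [(n.kind, n.line, hits[id(n)]) for n in constructs(program)]
    return result, report


# if (0) { 10 } else { 1 + 2 }, spread over three source lines
program = Node("if", 1, kids=[
    Node("num", 1, value=0),
    Node("num", 2, value=10),
    Node("add", 3, kids=[Node("num", 3, value=1), Node("num", 3, value=2)]),
])
result, report = run_with_coverage(program)
```

In this run the report contains ("num", 2, 0) for the branch that was
never taken, which is exactly the kind of entry a coverage tool exists
to point out.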

In Perl5, Devel::Cover gets information about the constructs in the
program from two sources: the optree and the source code.  Information
from the optree is obtained by calling B::Deparse and overriding some of
its functions.  Information about which constructs were exercised is
obtained by installing a custom runops function and squirrelling away
data as ops are run.

So, requirements:

1.  Access to all the coverable constructs in the program: subroutines,
    statements, branches, conditions, loops, rules, pod ...

2.  Information about those constructs.  Where to find them in the
    source, textual representation, information about the parts of the
    constructs.

3.  Which constructs were exercised.  How many times, the order, when
    and how long it took.

Not all this information is required all the time.  For example, timing
and count information may not be wanted for efficiency reasons.
Depending on how this information is available other requirements may
surface.  For example, in Perl5 it is necessary to know that coverage is
to be collected before any code in the program is executed, and it is
necessary to process the data after the last code in the program has
been executed.
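
The lifecycle and the optional extras described above might look
roughly like the following sketch, again in Python for brevity.  The
Collector class, the coverage context manager and the "sub:..."
identifiers are hypothetical and only meant to show the shape of the
interface: collection is set up before any program code runs, counts
and timing can be switched off, and the data is processed after the
last code has run.

```python
import time
from contextlib import contextmanager
from typing import Callable, Dict, List


class Collector:
    """Hypothetical coverage collector; counts and timing are optional extras."""

    def __init__(self, counts: bool = True, timing: bool = False) -> None:
        self.counts = counts
        self.timing = timing
        self.hits: Dict[str, int] = {}       # construct id -> times exercised
        self.seconds: Dict[str, float] = {}  # construct id -> total time taken
        self.order: List[str] = []           # order of first execution

    def exercise(self, construct_id: str, body: Callable[[], object]) -> object:
        """Run one construct, recording only what was asked for."""
        if construct_id not in self.hits:
            self.order.append(construct_id)
            self.hits[construct_id] = 0
        # With counts switched off, hits is only a 0/1 "was exercised" flag.
        if self.counts or self.hits[construct_id] == 0:
            self.hits[construct_id] += 1
        if not self.timing:
            return body()
        start = time.perf_counter()
        try:
            return body()
        finally:
            spent = time.perf_counter() - start
            self.seconds[construct_id] = self.seconds.get(construct_id, 0.0) + spent


@contextmanager
def coverage(reports: list, **options):
    """Set up before any program code runs; process the data after the last."""
    collector = Collector(**options)
    try:
        yield collector
    finally:
        reports.append((collector.order, dict(collector.hits)))


reports: list = []
with coverage(reports, counts=False) as cov:
    for _ in range(3):
        cov.exercise("sub:greet", lambda: "hello")
    cov.exercise("sub:main", lambda: None)
```

With counts switched off, "sub:greet" is recorded as exercised once
even though it ran three times, and no timing data is kept at all.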

Aspects of Perl5 that make collecting coverage difficult:

 - doesn't allow constructs to be exactly regenerated from optree data

 - doesn't store enough information, e.g. can't determine the location
   of an elsif condition; here the warning is reported at line 1 even
   though the addition is on line 2:
   $ perl -we 'if ($a) {}
   elsif ($a + 1) {}'
   Use of uninitialized value in addition (+) at -e line 1.

 - filenames stored as relative paths so they can be difficult to find
   later

 - not always easy to determine the source file of a construct

 - deletes parts of the optree, e.g. code in modules which is not
   within a sub

 - doesn't provide a way to uniquely identify constructs

 - difficult to determine the value of an arbitrary boolean expression

I can probably go into nauseating detail on any of these points if
required.

If you've got this far, thanks for listening!

-- 
Paul Johnson - [EMAIL PROTECTED]
http://www.pjcj.net
