I've been following the progress of pugs, in awe of its speed, the breadth of its scope and the exemplary way in which Autrijus has run the project. And I've been considering how best I could contribute.

One of the many aspects of the project that is to be admired is the test driven development. Knowing a little about code coverage I wondered whether there were any Haskell code coverage tools that we could use to aid in that process. The best option seemed to be something based on hat (http://www.haskell.org/hat/) but unfortunately pugs uses some libraries which hat does not support. The last release of hat came two years ago, so I'm not expecting those libraries to be supported in the near future. If anyone knows of a way to do code coverage in Haskell, and specifically of pugs, please pipe up.

So next I thought about code coverage of Perl6 itself. Now that people are starting to write Perl6 modules code coverage could be useful. It seemed to me that walking the AST and hooking into the reduce function could provide the required data. Autrijus confirmed this, but thought there might be a better way to do it and asked me to send requirements in time for the refactoring of Eval.hs next week. So here they are.

At the most basic level two sets of data are required: the constructs in the program, and which of them were exercised. So for subroutine coverage, for example, that would mean a list of all the subroutines in the program and which of them were called.

But a little more information can be useful. For example, the exact location of the construct in the source code, the number of times it was exercised, how long it took to be exercised, and so on.

In Perl5, Devel::Cover gets information about the constructs in the program from two sources: the optree and the source code. Information from the optree is obtained by calling B::Deparse and overriding some of its functions. Information about which constructs were exercised is obtained by installing a custom runops function and squirrelling away data as ops are run.

So, requirements:

 1. Access to all the coverable constructs in the program: subroutines, statements, branches, conditions, loops, rules, pod ...

 2. Information about those constructs. Where to find them in the source, textual representation, information about the parts of the constructs.

 3. Which constructs were exercised. How many times, the order, when and how long it took.

Not all this information is required all the time. For example, timing and count information may not be wanted for efficiency reasons.

Depending on how this information is available other requirements may surface. For example, in Perl5 it is necessary to know that coverage is to be collected before any code in the program is executed, and it is necessary to process the data after the last code in the program has been executed.

Aspects of Perl5 that make collecting coverage difficult:

 - doesn't allow constructs to be exactly regenerated from optree data
 - doesn't store enough information, eg can't determine location of elsif condition

     $ perl -we 'if ($a) {} elsif ($a + 1) {}'
     Use of uninitialized value in addition (+) at -e line 1.

 - filenames stored as relative paths so they can be difficult to find later
 - not always easy to determine the source file of a construct
 - deletes parts of the optree, eg code in modules which is not within a sub
 - doesn't provide a way to uniquely identify constructs
 - difficult to determine the value of an arbitrary boolean expression

I can probably go into nauseating detail on any of these points if required.

If you've got this far, thanks for listening!

-- 
Paul Johnson - [EMAIL PROTECTED]
http://www.pjcj.net