The next gen TAP, bringing it together.

Michael G Schwern Tue, 27 Mar 2007 15:22:03 -0800

Andy Armstrong wrote:
> On 27 Mar 2007, at 18:16, Gary Hawkins wrote:
>> What about diag() output.
> 
> It goes to STDERR and is therefore not really part of TAP. Many uses of
> diag are expected to be replaced by the YAML based mechanism.


To expand on that, the plan is to transition test module authors away from
free-format diag() and toward the structured YAML diagnostics instead.  In
general, when a test module uses diag(), it really wants structured
diagnostics anyway.

As for users who call diag() themselves, those messages really do tend to be
free-form.  For those, diag() will start to use the proposed TAP logging
syntax and no longer print to STDERR.
http://testanything.org/wiki/index.php/TAP_logging_syntax

diag() would be at the level "notice" and thus normally displayed.  I'll
probably add to Test::More an info() function which outputs at level "info",
roughly the equivalent of printing to STDOUT now.  It's not normally
displayed.
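
A rough sketch of what that could look like on the producer side, assuming
the "**level** message" form used in the example output later in this
message.  The helper name tap_log() and these standalone diag()/info() subs
are hypothetical stand-ins for illustration, not Test::More's actual
implementation:

```perl
use strict;
use warnings;

# Hypothetical helper: format a message as TAP log lines of the form
# "**level** message".  A multi-line message yields one log line per line.
sub tap_log {
    my ($level, $message) = @_;
    my $out = '';
    $out .= "**$level** $_\n" for split /\n/, $message;
    return $out;
}

# Sketch of diag(): level "notice", written to STDOUT rather than STDERR.
# Like the current diag(), it returns false so "ok(...) or diag(...)" works.
sub diag {
    print STDOUT tap_log('notice', join '', @_);
    return 0;
}

# Sketch of the proposed info(): level "info", also on STDOUT.
sub info {
    print STDOUT tap_log('info', join '', @_);
    return;
}
```

Since both end up in the same stream as the test results, the harness sees
them in order and can decide per level whether to show them.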

So, let's bring this together.  A simple test like this:

    use File::Temp qw(tempfile);

    use Test::More tests => 3;

    my $dog = "Mutt";    # sample value that makes the first test fail
    is $dog, "Basset hound", "Basset hounds got long ears";

    my($fh, $tmpfile) = tempfile();
    print "# Using file $tmpfile\n";
    ok( -w $tmpfile, "tmpfile is writable" )
        or diag "$tmpfile not writable: $!";

    ok( -x $tmpfile, "tmpfile is executable" )
        or diag "$tmpfile not executable: $!";


Currently outputs this:

1..3
not ok 1 - Basset hounds got long ears
#   Failed test 'Basset hounds got long ears'
#   at /Users/schwern/tmp/foo.t line 8.
#          got: 'Mutt'
#     expected: 'Basset hound'
# Using file /tmp/HBN5dFYRBm
ok 2 - tmpfile is writable
not ok 3 - tmpfile is executable
#   Failed test 'tmpfile is executable'
#   at /Users/schwern/tmp/foo.t line 15.
# /tmp/HBN5dFYRBm not executable:
# Looks like you failed 2 tests of 3.


And when run through Test::Harness it looks like this:

$ prove ~/tmp/foo.t
/Users/schwern/tmp/foo....
#   Failed test 'Basset hounds got long ears'
#   at /Users/schwern/tmp/foo.t line 8.
#          got: 'Mutt'
#     expected: 'Basset hound'
/Users/schwern/tmp/foo....NOK 1/3
/Users/schwern/tmp/foo....ok 2/3#   Failed test 'tmpfile is executable'
#   at /Users/schwern/tmp/foo.t line 15.
# /tmp/U2x9QTnKoL not executable:
# Looks like you failed 2 tests of 3.
/Users/schwern/tmp/foo....dubious
        Test returned status 2 (wstat 512, 0x200)
DIED. FAILED tests 1, 3
        Failed 2/3 tests, 33.33% okay
Failed Test              Stat Wstat Total Fail  List of Failed
-------------------------------------------------------------------------------
/Users/schwern/tmp/foo.t    2   512     3    2  1 3
Failed 1/1 test scripts. 2/3 subtests failed.
Files=1, Tests=3,  0 wallclock secs ( 0.05 cusr +  0.01 csys =  0.06 CPU)
Failed 1/1 test programs. 2/3 subtests failed.

A bit messy.  Since all the failure diagnostics go to STDERR, Test::Harness
can't format their output, control it, or even be aware of it.  So it's a
mess.  They're also unparsable: nobody but a human can say why the tests
failed.

The next gen TAP, using the same code except with the print to STDOUT replaced
by an info() call, will look something like this:

TAP version 16
    ---
    datetime: Tue, 27 Mar 2007 15:56:58 -0700
    file:     /Users/schwern/tmp/foo.t
    hostname: windhund.schwern.org
    producer:
        name:    Test::Builder
        version: 0.72
    ...
1..3
not ok 1 - Basset hounds got long ears
    ---
    line: 8
    code: 'is $dog, "Basset hound", "Basset hounds got long ears";'
    got:      'Mutt'
    expected: 'Basset hound'
    display: |2
           got: 'Mutt'
      expected: 'Basset hound'
    ...
**info** Using file /tmp/HBN5dFYRBm
ok 2 - tmpfile is writable
    ---
    line: 12
    code: ok( -w $tmpfile, "tmpfile is writable" )
    ...
not ok 3 - tmpfile is executable
    ---
    line: 15
    code: ok( -x $tmpfile, "tmpfile is executable" )
    ...
**notice** /tmp/HBN5dFYRBm not executable:
**fail** Looks like you failed 2 tests of 3.


This incorporates these three proposals, as well as the new version syntax,
which has been accepted.
http://testanything.org/wiki/index.php/Test_meta_information
http://testanything.org/wiki/index.php/TAP_diagnostic_syntax
http://testanything.org/wiki/index.php/TAP_logging_syntax

The amount of information to put in the YAML is up to the TAP producer.  All
of the keys are optional, as is outputting any YAML at all.  This example is
probably about the level of default verbosity I'd expect.  The "display" key
is just a suggestion from the TAP producer about how to display the failure.

The "code" key is an example of extra information it will be possible for a
producer to send on to the parser.  The parser will then choose what to do
with it.
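
To make the parser side concrete, here is a minimal sketch of reading a stream
like the one above.  It is only an illustration under stated assumptions:
parse_tap() and the event hash layout are made up for this example, and it
only understands flat "key: value" diagnostics.  Nested YAML and block
scalars such as "display: |2" are skipped, so a real consumer would hand the
block to a proper YAML parser instead.

```perl
use strict;
use warnings;

# Parse next-gen TAP text into a list of event hashes (plan, test, log).
# Assumption: YAML blocks are indented, open with "---", close with "...".
sub parse_tap {
    my ($text) = @_;
    my (@events, $in_yaml);
    for my $line (split /\n/, $text) {
        if ($line =~ /^\s+---\s*$/) {             # start of a YAML block
            $in_yaml = 1;
        }
        elsif ($line =~ /^\s+\.\.\.\s*$/) {       # end of a YAML block
            $in_yaml = 0;
        }
        elsif ($in_yaml) {
            # Only attach top-level pairs, and only to a preceding test line
            # (the stream header block has no test to attach to).
            if (@events and $events[-1]{type} eq 'test'
                and $line =~ /^\s{4}(\w+):\s+(.+?)\s*$/) {
                my ($key, $value) = ($1, $2);
                next if $value =~ /^[|>]/;        # skip YAML block scalars
                $value =~ s/^'(.*)'$/$1/;         # strip YAML single quotes
                $events[-1]{diag}{$key} = $value;
            }
        }
        elsif ($line =~ /^(not )?ok (\d+)(?: - (.*))?$/) {
            push @events, { type => 'test', ok => $1 ? 0 : 1,
                            number => $2, name => $3, diag => {} };
        }
        elsif ($line =~ /^\*\*(\w+)\*\* (.*)$/) { # "**level** message"
            push @events, { type => 'log', level => $1, message => $2 };
        }
        elsif ($line =~ /^1\.\.(\d+)$/) {         # the plan
            push @events, { type => 'plan', tests => $1 };
        }
    }
    return @events;
}
```

Unknown keys simply ride along in the "diag" hash, which is the point: the
producer can send whatever it likes and the consumer picks what it cares
about.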


So the raw test output is a little more verbose than before and possibly a
little harder for a human to read.  But now it's all in one stream (STDOUT)
and all parsed by the parser.  We trade off a little verbosity and a little
human readability of the raw stream for a much more flexible TAP display.  A
hypothetical display might look like this:


$ prove3 ~/tmp/foo.t
Running /Users/schwern/tmp/foo....1/3
-------------------------------------------------------
!!!! Failed test #1: 'Basset hounds got long ears' !!!!
!!!! at /Users/schwern/tmp/foo.t line 8.           !!!!
!!!!       code: 'is $dog, "Basset hound", "Basset hounds got long ears";'
!!!!     actual: 'Mutt'                            !!!!
!!!!   expected: 'Basset hound'                    !!!!
-------------------------------------------------------
       ...........................3/3
-------------------------------------------------------
!!!! Failed test #3: 'tmpfile is executable'       !!!!
!!!! at /Users/schwern/tmp/foo.t line 15.          !!!!
!!!!   code: 'ok( -x $tmpfile, "tmpfile is executable" )'
-------------------------------------------------------
[notice] /tmp/U2x9QTnKoL not executable:
[fail] Looks like you failed 2 tests of 3.
    Test returned status 2 (wstat 512, 0x200)
    DIED. FAILED tests 1, 3
    Failed 2/3 tests, 33.33% okay

Failed Test              Stat Wstat Total Fail  List of Failed
-------------------------------------------------------------------------------
/Users/schwern/tmp/foo.t    2   512     3    2  1 3
Failed 1/1 test scripts. 2/3 subtests failed.
Files=1, Tests=3,  0 wallclock secs ( 0.05 cusr +  0.01 csys =  0.06 CPU)
Failed 1/1 test programs. 2/3 subtests failed.


Not terribly imaginative, I admit, but still much improved readability.  This
is possible because the displayer has full control over all the diagnostics.
It can parse and reformat them as it likes.  Everything in the example above
is printed by the parser.  Nothing comes directly from the producer.  It's not
a mistake that I changed the "got" key to "actual", as some people have
requested.

The parser can accept flags to display information about passing tests, show
different levels of logs, be very quiet, be very verbose, put it all into a
file, translate it to XML, send it out via email... sky's the limit.
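
As a sketch of how little code such options need once everything is parsed,
here is a hypothetical renderer working on already-parsed events.  The
render() name, its options (min_level, show_passing, rename), the event hash
layout, and the severity order info < notice < fail are all assumptions for
illustration; the logging proposal defines the real levels.

```perl
use strict;
use warnings;

# Assumed severity order for the three levels used in this message.
my %RANK = (info => 0, notice => 1, fail => 2);

# Turn parsed events into display lines according to the caller's options.
sub render {
    my ($events, %opt) = @_;
    my $min    = $RANK{ $opt{min_level} || 'notice' };
    my $rename = $opt{rename} || {};
    my @out;
    for my $e (@$events) {
        if ($e->{type} eq 'log') {
            my $rank = $RANK{ $e->{level} };
            $rank = $RANK{notice} unless defined $rank;  # unknown level
            next if $rank < $min;
            push @out, "[$e->{level}] $e->{message}";
        }
        elsif ($e->{type} eq 'test') {
            next if $e->{ok} and !$opt{show_passing};
            push @out, ($e->{ok} ? 'Passed' : 'Failed')
                     . " test #$e->{number}: '$e->{name}'";
            # Show diagnostics, relabelling keys if the user asked for it.
            for my $key (sort keys %{ $e->{diag} }) {
                my $label = $rename->{$key} || $key;
                push @out, "    $label: $e->{diag}{$key}";
            }
        }
    }
    return @out;
}

# Example: default verbosity, with "got" shown as "actual".
my @events = (
    { type => 'test', ok => 0, number => 1,
      name => 'Basset hounds got long ears',
      diag => { got => 'Mutt', expected => 'Basset hound' } },
    { type => 'log', level => 'info',
      message => 'Using file /tmp/HBN5dFYRBm' },
    { type => 'test', ok => 1, number => 2,
      name => 'tmpfile is writable', diag => {} },
    { type => 'log', level => 'fail',
      message => 'Looks like you failed 2 tests of 3.' },
);
print "$_\n" for render(\@events, rename => { got => 'actual' });
```

Passing min_level => 'info' and show_passing => 1 to the same call would give
the verbose view from the same events, without the producer changing at all.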

Folks who do more radical things like TAP::HTML::Matrix (which is radical only
in a relative sense) will find this new syntax extremely useful as they can
now archive and display all the same information you get by running a test on
the terminal.  They no longer have to scrape things like file and line number
out of the diagnostics.
