Re: Sparse Test Output

2006-10-27 Thread chromatic
On Thursday 26 October 2006 07:37, Paul Beckingham wrote:

 To each his own, but my thoughts were not that it takes time to  
 parse, or that there is any unreasonable performance issue here -  
 just that it is so completely *unnecessary* to say ok lots of times.

Crazy thought: don't call ok() so many times.

-- c


Re: Sparse Test Output

2006-10-27 Thread Nik Clayton

Michael Peters wrote:

Jonathan Rockway wrote:


That said, I am willing to clean it up when I have time, and hopefully
be able to provide a libtap that makes TAP support easy for any language.


There is already a libtap -
http://www.onlamp.com/pub/a/onlamp/2006/01/19/libtap.html


Indeed.  Is anyone actually using it for anything?

N


Re: Sparse Test Output

2006-10-26 Thread Ovid
--- Michael G Schwern [EMAIL PROTECTED] wrote:
 TAPx::Parser has not yet been optimized so some profiling information
 would likely prove interesting.  I would much rather improve the
 parser than change the protocol.

This has been a concern of mine for quite some time as I've noticed
this, too.  However, I've needed a break from the parser for a bit and
with the current upheaval of my company shutting our offices and moving
to London, I've been a bit distracted.

Cheers,
Ovid

--

Buy the book -- http://www.oreilly.com/catalog/perlhks/
Perl and CGI -- http://users.easystreet.com/ovid/cgi_course/


Re: Sparse Test Output

2006-10-26 Thread Nicholas Clark
On Wed, Oct 25, 2006 at 03:53:58PM -0700, Michael G Schwern wrote:

 I am totally unconvinced.  I would love to see some profiling on this.  I 
 would hope that the act of running 50,000 tests would swamp the cost of 
 parsing 50,000 tests.

I don't have the exact numbers any more, but had experimented with
multithreading the core tests (needs some refactoring of Test::Harness to
do properly)

I had 1 process parsing the TAP streams, with the others running the tests and
generating the TAP. Above about 7 processes in parallel the TAP parser
process was the bottleneck. This was on a 32 core Sun, so I had plenty of
headroom to run more TAP generating threads.

Nicholas Clark


Re: Sparse Test Output

2006-10-26 Thread Paul Beckingham
 This raises the question, why are you storing the raw results  
rather than

 storing the results after having been run through a TAP parser?

I am storing all test results, from all test runs, to drive a  
tinderbox report with drill-down capability.  I can (and do) post- 
process the TAP output and strip the ^ok lines to achieve part of  
my goal.


...for a large product.


 Its important to know that a test ran.

Absoutely.  Or perhaps to invert that, it's important to know when a  
test did not run.  Either way, there is no loss of information.



 Essentially, I'm unconvinced that this is A) a real performance  
problem,
 running 10 million tests should swamp parsing 10 million TAP lines  
and

 B) can't be solved with a more flexible TAP parser than Test::Harness
 such as Ovid's TAPx::Parser.

For me, it's not about speed, just about the inefficiency.  Consider  
representing N tests as


1..N
ok 1
ok 2
not ok 3
ok 4
...
ok N

-or-

1..N sparse
not ok 3

There is no loss of information, but there is a saving as N  
approaches and exceeds 4.  Less so with test names, diag output etc.



 One of the benefits of having a decent protocol is that you don't
 have to keep changing the protocol to do interesting things.

Agreed - stable protocols are important, and this is indeed a good  
protocol.  But does that rule out any future modifications?  What if  
the protocol could be improved (questionably) to consume fewer bits  
without data loss, then there are some folks (one so far) who could  
really benefit.


Paul



Re: Sparse Test Output

2006-10-26 Thread Chris Dolan

On Oct 26, 2006, at 9:37 AM, Paul Beckingham wrote:

 and waiting for the test output/TAP to be parsed takes quite a  
while.

 Just my $0.02 worth.
 -=Chris

To each his own, but my thoughts were not that it takes time to  
parse, or that there is any unreasonable performance issue here -  
just that it is so completely *unnecessary* to say ok lots of times.


It's only unnecessary if the test program doesn't crash!  Compare  
ease of debugging for these two cases:


1..10_000_000 sparse
*segfault*

vs.

1..10_000_000
ok 1 - use_ok Foo::Bar
ok 2 - input = 0
ok 3 - input = 1
[... snip ...]
ok 99 - input = 97
ok 100 - input = 98
ok 101 - input = 99
*segfault*

In the former case, perhaps all 10_000_000 tests passed and the test  
program failed in END{}, or perhaps it died on the first test.  You  
just don't know without re-running the tests in a debugger, or  
turning off sparse.


In the latter case, the harness can tell you how far along you were  
before the tests failed.  If the harness is smart enough to report  
the last test result before the crash, you can even guess that input  
= 100 caused the crash.


In even scarier terms, would you prefer to see an RT or smoke report  
about your module with the former or the latter detail?


Chris

--
Chris Dolan, Software Developer, Clotho Advanced Media Inc.
608-294-7900, fax 294-7025, 1435 E Main St, Madison WI 53703
vCard: http://www.chrisdolan.net/ChrisDolan.vcf

Clotho Advanced Media, Inc. - Creators of MediaLandscape Software  
(http://www.media-landscape.com/) and partners in the revolutionary  
Croquet project (http://www.opencroquet.org/)



--
Chris Dolan, Software Developer, http://www.chrisdolan.net/
Public key: http://www.chrisdolan.net/public.key
vCard: http://www.chrisdolan.net/ChrisDolan.vcf





Re: Sparse Test Output

2006-10-26 Thread Michael G Schwern
Paul Beckingham wrote:
 and waiting for the test output/TAP to be parsed takes quite a while.
 Just my $0.02 worth.
 -=Chris
 
 To each his own, but my thoughts were not that it takes time to parse,
 or that there is any unreasonable performance issue here - just that it
 is so completely *unnecessary* to say ok lots of times.

The wonderful thing about computers is they're really good at handling lots and 
lots and lots of data.
http://www.angryflower.com/allthe.gif


Re: Sparse Test Output

2006-10-26 Thread Michael Peters


Jonathan Rockway wrote:

 That said, I am willing to clean it up when I have time, and hopefully
 be able to provide a libtap that makes TAP support easy for any language.

There is already a libtap -
http://www.onlamp.com/pub/a/onlamp/2006/01/19/libtap.html

-- 
Michael Peters
Developer
Plus Three, LP



Re: Sparse Test Output

2006-10-26 Thread Jonathan Rockway

 There is already a libtap

Ah!  Thanks for pointing that out.  I was planning a libtap to *parse*
TAP, whereas this libtap *emits* TAP.  If necessary, the name could be
changed to libparsetap or something :)

-- 
package JAPH;use Catalyst qw/-Debug/;($;=JAPH)-config(name = do {
$,.=reverse qw[Jonathan tsu rehton lre rekca Rockway][$_].[split //,
;$;]-[$_].q; ;for 1..4;$,=~s;^.;;;$,});$;-setup;


Re: Sparse Test Output

2006-10-26 Thread A. Pagaltzis
* Jonathan Rockway [EMAIL PROTECTED] [2006-10-26 21:15]:
 If necessary, the name could be changed to libparsetap or
 something :)

Maybe Yet Another Parser for TAP – libyaptap. :^)

Regards,
-- 
Aristotle Pagaltzis // http://plasmasturm.org/


Re: Sparse Test Output

2006-10-25 Thread Paul Beckingham

 I'm with Adrian.  Printing out ok 100,000 times shouldn't be a
 big deal unless you're reading the TAP via some sort of IP over
 clay tablets protocol.  But...

My test estimate is two orders of magnitude larger, so it actually is  
a big deal to capture and store those results.


But I would like to point out that there is a very low information  
density in 100,000 ok messages.  I was looking to see what folks  
thought of an *optional* feature that doesn't bother outputting the  
ok messages, just the interesting ones.  Hence the sparse.


The basic idea was that instead of:

1..10_000_000
ok 1
ok 2
...
not ok 123
...
ok 10_000_000

The output could be collapsed to the following, with no loss of  
important information:


1..10_000_000 sparse
not ok 123

Yes, I could achieve this with grep, sed, Perl etc, but I thought it  
might make a nice addition to TAP.


Paul.


Re: Sparse Test Output

2006-10-25 Thread Christopher H. Laco
Paul Beckingham wrote:
 I'm with Adrian.  Printing out ok 100,000 times shouldn't be a
 big deal unless you're reading the TAP via some sort of IP over
 clay tablets protocol.  But...
 
 My test estimate is two orders of magnitude larger, so it actually is a
 big deal to capture and store those results.
 
 But I would like to point out that there is a very low information
 density in 100,000 ok messages.  I was looking to see what folks
 thought of an *optional* feature that doesn't bother outputting the ok
 messages, just the interesting ones.  Hence the sparse.
 
 The basic idea was that instead of:
 
 1..10_000_000
 ok 1
 ok 2
 ...
 not ok 123
 ...
 ok 10_000_000
 
 The output could be collapsed to the following, with no loss of
 important information:
 
 1..10_000_000 sparse
 not ok 123
 
 Yes, I could achieve this with grep, sed, Perl etc, but I thought it
 might make a nice addition to TAP.
 
 Paul.
 
 

I'm in the same boat. Recently, I've started testing my environment when
things go wrong. (I blame Andy). I have one test alone that has a test
count of 500,000+. That's a lot of oks to be processed, when I only want
the ones that didn't pass.

Now, add in a few more tests that easily go into the 50,000 range in
addition to 'the bi one', and waiting for the test output/TAP to be
parsed takes quite a while.

Just my $0.02 worth.

-=Chris



signature.asc
Description: OpenPGP digital signature


Re: Sparse Test Output

2006-10-25 Thread Michael G Schwern
Christopher H. Laco wrote:
 I'm in the same boat. Recently, I've started testing my environment when
 things go wrong. (I blame Andy). I have one test alone that has a test
 count of 500,000+. That's a lot of oks to be processed, when I only want
 the ones that didn't pass.
 
 Now, add in a few more tests that easily go into the 50,000 range in
 addition to 'the bi one', and waiting for the test output/TAP to be
 parsed takes quite a while.
 
 Just my $0.02 worth.

I am totally unconvinced.  I would love to see some profiling on this.  I would 
hope that the act of running 50,000 tests would swamp the cost of parsing 
50,000 tests.


Re: Sparse Test Output

2006-10-25 Thread Christopher H. Laco
Michael G Schwern wrote:
 Christopher H. Laco wrote:
 I'm in the same boat. Recently, I've started testing my environment when
 things go wrong. (I blame Andy). I have one test alone that has a test
 count of 500,000+. That's a lot of oks to be processed, when I only want
 the ones that didn't pass.

 Now, add in a few more tests that easily go into the 50,000 range in
 addition to 'the bi one', and waiting for the test output/TAP to be
 parsed takes quite a while.

 Just my $0.02 worth.
 
 I am totally unconvinced.  I would love to see some profiling on this.  I 
 would hope that the act of running 50,000 tests would swamp the cost of 
 parsing 50,000 tests.
 
 

Depends on what the tests are. If it's simply verifying that 500,000
pieces of data confirm to a certain format, then those kinds of tests
aren't very expensive, or long running.

Just for sake of curiosity, I'll profile one of the tests tomorrow just
to see what happens.

-=Chris



signature.asc
Description: OpenPGP digital signature


Re: Sparse Test Output

2006-10-25 Thread Michael G Schwern
Paul Beckingham wrote:
 I'm with Adrian.  Printing out ok 100,000 times shouldn't be a
 big deal unless you're reading the TAP via some sort of IP over
 clay tablets protocol.  But...
 
 My test estimate is two orders of magnitude larger, so it actually is a
 big deal to capture and store those results.

This raises the question, why are you storing the raw results rather than 
storing the results after having been run through a TAP parser?


 But I would like to point out that there is a very low information
 density in 100,000 ok messages.  I was looking to see what folks
 thought of an *optional* feature that doesn't bother outputting the ok
 messages, just the interesting ones.  Hence the sparse.

Its important to know that a test ran.

Essentially, I'm unconvinced that this is A) a real performance problem, 
running 10 million tests should swamp parsing 10 million TAP lines and B) can't 
be solved with a more flexible TAP parser than Test::Harness such as Ovid's 
TAPx::Parser.  TAPx::Parser can be used to strip out all the ok lines, then you 
store what's left.

One of the benefits of having a decent protocol is that you don't have to keep 
changing the protocol to do interesting things.  The parser can do the 
interesting things.  And we finally have an interesting parser in TAPx::Parser.

Here ya go.

#!/usr/bin/perl -w  


use strict;
use TAPx::Parser;

my $parser = TAPx::Parser-new( { source = $ARGV[0] } );

while( my $result = $parser-next ) {
print $result-as_string, \n if !$result-is_test or !$result-is_ok;
}


It does add significant overhead.  Here's the example of one of 
Regexp::Common's tests.

0 windhund /private/var/local/cpan_shell/build/Regexp-Common-2.120$ time perl 
-Ilib ~/tmp/strip_ok t/number/integer.t 
1..23534

real0m4.882s
user0m5.469s
sys 0m0.155s
0 windhund /private/var/local/cpan_shell/build/Regexp-Common-2.120$ time perl 
-Ilib t/number/integer.t   /dev/null

real0m2.051s
user0m2.020s
sys 0m0.018s

TAPx::Parser has not yet been optimized so some profiling information would 
likely prove interesting.  I would much rather improve the parser than change 
the protocol.


Re: Sparse Test Output

2006-10-25 Thread Michael G Schwern
Michael G Schwern wrote:
 It does add significant overhead.  Here's the example of one of 
 Regexp::Common's tests.
 
 0 windhund /private/var/local/cpan_shell/build/Regexp-Common-2.120$ time perl 
 -Ilib ~/tmp/strip_ok t/number/integer.t 
 1..23534
 
 real0m4.882s
 user0m5.469s
 sys 0m0.155s
 0 windhund /private/var/local/cpan_shell/build/Regexp-Common-2.120$ time perl 
 -Ilib t/number/integer.t   /dev/null
 
 real0m2.051s
 user0m2.020s
 sys 0m0.018s
 
 TAPx::Parser has not yet been optimized so some profiling information would 
 likely prove interesting.  I would much rather improve the parser than change 
 the protocol.

A Devel::DProf run of the above is attached.  I don't trust the times but I do 
trust the counts.  Here's the top routines to get an idea...

Total Elapsed Time = 4.270165 Seconds
  User+System Time =0 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 0.00   1.300  3.318 11   0. 0.  TAPx::Parser::Streamed::_lex
 0.00   1.279  1.268 10   0. 0.  TAPx::Parser::_aggregate_results
 0.00   0.910  1.129 10   0. 0.  
TAPx::Parser::Grammar::_make_test_token
 0.00   0.889  2.277 900025   0. 0.  TAPx::Parser::__ANON__
 0.00   0.700  0.700 11   0. 0.  TAPx::Parser::Iterator::FH::next
 0.00   0.600  1.729 11   0. 0.  TAPx::Parser::Grammar::__ANON__
 0.00   0.340  0.340 11   0. 0.  TAPx::Parser::Results::new
 0.00   0.220  0.220 50   0. 0.  TAPx::Parser::Grammar::_trim
 0.00   0.180  0.180 20   0. 0.  
TAPx::Parser::Results::Test::is_actual_ok
 0.00   0.170  0.170 22   0. 0.  TAPx::Parser::Results::type
 0.00   0.160  1.927 11   0. 0.  TAPx::Parser::_validate
 0.00   0.110  0.110 14   0. 0.  TAPx::Parser::Grammar::syntax_for
 0.00   0.100  0.040 10   0. 0.  TAPx::Parser::_check_ending_plan
 0.00   0.100  0.100 50   0. 0.  
TAPx::Parser::Results::Test::has_todo
 0.00   0.090  0.090 11   0. 0.  
TAPx::Parser::Iterator::FH::is_first
 0.00   0.080  0.080 33   0. 0.  TAPx::Parser::Results::Test::number
 0.00   0.050  0.050 11   0. 0.  TAPx::Parser::Grammar::token_types
 0.00   0.040  0.010 10   0. 0.  
TAPx::Parser::Results::Test::todo_failed
 0.00   0.020  0.020 10   0. 0.  
TAPx::Parser::Results::Test::has_skip
 0.00   0.020  0.020 11   0. 0.  TAPx::Parser::Grammar::handler_for
 0.00   0.010  0.010  9   0.0011 0.0011  TAPx::Parser::Results::BEGIN
 0.00   0.010  0.018  9   0.0011 0.0020  TAPx::Parser::BEGIN






dproff.out
Description: Binary data


Re: Sparse Test Output

2006-10-24 Thread Michael G Schwern
Adrian Howard wrote:
 
 On 5 Oct 2006, at 15:11, Paul Beckingham wrote:
 
 Recently I was required to create another flavor of test harness that
 runs tests, then captures and stores output.

 The nature of my testing means that I am running millions of tests,
 and the resultant captured output is therefore huge.  So I modified my
 tests to do the *equivalent* of the following:
 [snip]
 
 Personally I would say it would be better to leave the things generating
 the TAP output alone, and have the test harness only store the test
 failures. Ovid's TAPx::Parser makes this pretty darn easy, and it's not
 stupidly difficult with Test::Harness::Straps although it needs some
 sneakyness with private methods.

I'm with Adrian.  Printing out ok 100,000 times shouldn't be a big deal 
unless you're reading the TAP via some sort of IP over clay tablets protocol.  
But...

Test::Builder does store the test history and this can get prohibitive.  There 
is a wishlist item for this and will gladly accept a patch.  I originally 
intended this for Test::AtRuntime which can produce an infinite stream of TAP.
http://rt.cpan.org/Public/Bug/Display.html?id=8387

Something like:

$tb-no_history(1);

with the caveat that this makes summary() and details() no longer work.