Re: Sparse Test Output
On Thursday 26 October 2006 07:37, Paul Beckingham wrote: To each his own, but my thoughts were not that it takes time to parse, or that there is any unreasonable performance issue here - just that it is so completely *unnecessary* to say ok lots of times. Crazy thought: don't call ok() so many times. -- c
Re: Sparse Test Output
Michael Peters wrote: Jonathan Rockway wrote: That said, I am willing to clean it up when I have time, and hopefully be able to provide a libtap that makes TAP support easy for any language. There is already a libtap - http://www.onlamp.com/pub/a/onlamp/2006/01/19/libtap.html Indeed. Is anyone actually using it for anything? N
Re: Sparse Test Output
--- Michael G Schwern [EMAIL PROTECTED] wrote: TAPx::Parser has not yet been optimized so some profiling information would likely prove interesting. I would much rather improve the parser than change the protocol. This has been a concern of mine for quite some time as I've noticed this, too. However, I've needed a break from the parser for a bit and with the current upheaval of my company shutting our offices and moving to London, I've been a bit distracted. Cheers, Ovid -- Buy the book -- http://www.oreilly.com/catalog/perlhks/ Perl and CGI -- http://users.easystreet.com/ovid/cgi_course/
Re: Sparse Test Output
On Wed, Oct 25, 2006 at 03:53:58PM -0700, Michael G Schwern wrote: I am totally unconvinced. I would love to see some profiling on this. I would hope that the act of running 50,000 tests would swamp the cost of parsing 50,000 tests. I don't have the exact numbers any more, but had experimented with multithreading the core tests (needs some refactoring of Test::Harness to do properly) I had 1 process parsing the TAP streams, with the others running the tests and generating the TAP. Above about 7 processes in parallel the TAP parser process was the bottleneck. This was on a 32 core Sun, so I had plenty of headroom to run more TAP generating threads. Nicholas Clark
Re: Sparse Test Output
This raises the question, why are you storing the raw results rather than storing the results after having been run through a TAP parser? I am storing all test results, from all test runs, to drive a tinderbox report with drill-down capability. I can (and do) post- process the TAP output and strip the ^ok lines to achieve part of my goal. ...for a large product. Its important to know that a test ran. Absoutely. Or perhaps to invert that, it's important to know when a test did not run. Either way, there is no loss of information. Essentially, I'm unconvinced that this is A) a real performance problem, running 10 million tests should swamp parsing 10 million TAP lines and B) can't be solved with a more flexible TAP parser than Test::Harness such as Ovid's TAPx::Parser. For me, it's not about speed, just about the inefficiency. Consider representing N tests as 1..N ok 1 ok 2 not ok 3 ok 4 ... ok N -or- 1..N sparse not ok 3 There is no loss of information, but there is a saving as N approaches and exceeds 4. Less so with test names, diag output etc. One of the benefits of having a decent protocol is that you don't have to keep changing the protocol to do interesting things. Agreed - stable protocols are important, and this is indeed a good protocol. But does that rule out any future modifications? What if the protocol could be improved (questionably) to consume fewer bits without data loss, then there are some folks (one so far) who could really benefit. Paul
Re: Sparse Test Output
On Oct 26, 2006, at 9:37 AM, Paul Beckingham wrote: and waiting for the test output/TAP to be parsed takes quite a while. Just my $0.02 worth. -=Chris To each his own, but my thoughts were not that it takes time to parse, or that there is any unreasonable performance issue here - just that it is so completely *unnecessary* to say ok lots of times. It's only unnecessary if the test program doesn't crash! Compare ease of debugging for these two cases: 1..10_000_000 sparse *segfault* vs. 1..10_000_000 ok 1 - use_ok Foo::Bar ok 2 - input = 0 ok 3 - input = 1 [... snip ...] ok 99 - input = 97 ok 100 - input = 98 ok 101 - input = 99 *segfault* In the former case, perhaps all 10_000_000 tests passed and the test program failed in END{}, or perhaps it died on the first test. You just don't know without re-running the tests in a debugger, or turning off sparse. In the latter case, the harness can tell you how far along you were before the tests failed. If the harness is smart enough to report the last test result before the crash, you can even guess that input = 100 caused the crash. In even scarier terms, would you prefer to see an RT or smoke report about your module with the former or the latter detail? Chris -- Chris Dolan, Software Developer, Clotho Advanced Media Inc. 608-294-7900, fax 294-7025, 1435 E Main St, Madison WI 53703 vCard: http://www.chrisdolan.net/ChrisDolan.vcf Clotho Advanced Media, Inc. - Creators of MediaLandscape Software (http://www.media-landscape.com/) and partners in the revolutionary Croquet project (http://www.opencroquet.org/) -- Chris Dolan, Software Developer, http://www.chrisdolan.net/ Public key: http://www.chrisdolan.net/public.key vCard: http://www.chrisdolan.net/ChrisDolan.vcf
Re: Sparse Test Output
Paul Beckingham wrote: and waiting for the test output/TAP to be parsed takes quite a while. Just my $0.02 worth. -=Chris To each his own, but my thoughts were not that it takes time to parse, or that there is any unreasonable performance issue here - just that it is so completely *unnecessary* to say ok lots of times. The wonderful thing about computers is they're really good at handling lots and lots and lots of data. http://www.angryflower.com/allthe.gif
Re: Sparse Test Output
Jonathan Rockway wrote: That said, I am willing to clean it up when I have time, and hopefully be able to provide a libtap that makes TAP support easy for any language. There is already a libtap - http://www.onlamp.com/pub/a/onlamp/2006/01/19/libtap.html -- Michael Peters Developer Plus Three, LP
Re: Sparse Test Output
There is already a libtap Ah! Thanks for pointing that out. I was planning a libtap to *parse* TAP, whereas this libtap *emits* TAP. If necessary, the name could be changed to libparsetap or something :) -- package JAPH;use Catalyst qw/-Debug/;($;=JAPH)-config(name = do { $,.=reverse qw[Jonathan tsu rehton lre rekca Rockway][$_].[split //, ;$;]-[$_].q; ;for 1..4;$,=~s;^.;;;$,});$;-setup;
Re: Sparse Test Output
* Jonathan Rockway [EMAIL PROTECTED] [2006-10-26 21:15]: If necessary, the name could be changed to libparsetap or something :) Maybe Yet Another Parser for TAP – libyaptap. :^) Regards, -- Aristotle Pagaltzis // http://plasmasturm.org/
Re: Sparse Test Output
I'm with Adrian. Printing out ok 100,000 times shouldn't be a big deal unless you're reading the TAP via some sort of IP over clay tablets protocol. But... My test estimate is two orders of magnitude larger, so it actually is a big deal to capture and store those results. But I would like to point out that there is a very low information density in 100,000 ok messages. I was looking to see what folks thought of an *optional* feature that doesn't bother outputting the ok messages, just the interesting ones. Hence the sparse. The basic idea was that instead of: 1..10_000_000 ok 1 ok 2 ... not ok 123 ... ok 10_000_000 The output could be collapsed to the following, with no loss of important information: 1..10_000_000 sparse not ok 123 Yes, I could achieve this with grep, sed, Perl etc, but I thought it might make a nice addition to TAP. Paul.
Re: Sparse Test Output
Paul Beckingham wrote: I'm with Adrian. Printing out ok 100,000 times shouldn't be a big deal unless you're reading the TAP via some sort of IP over clay tablets protocol. But... My test estimate is two orders of magnitude larger, so it actually is a big deal to capture and store those results. But I would like to point out that there is a very low information density in 100,000 ok messages. I was looking to see what folks thought of an *optional* feature that doesn't bother outputting the ok messages, just the interesting ones. Hence the sparse. The basic idea was that instead of: 1..10_000_000 ok 1 ok 2 ... not ok 123 ... ok 10_000_000 The output could be collapsed to the following, with no loss of important information: 1..10_000_000 sparse not ok 123 Yes, I could achieve this with grep, sed, Perl etc, but I thought it might make a nice addition to TAP. Paul. I'm in the same boat. Recently, I've started testing my environment when things go wrong. (I blame Andy). I have one test alone that has a test count of 500,000+. That's a lot of oks to be processed, when I only want the ones that didn't pass. Now, add in a few more tests that easily go into the 50,000 range in addition to 'the bi one', and waiting for the test output/TAP to be parsed takes quite a while. Just my $0.02 worth. -=Chris signature.asc Description: OpenPGP digital signature
Re: Sparse Test Output
Christopher H. Laco wrote: I'm in the same boat. Recently, I've started testing my environment when things go wrong. (I blame Andy). I have one test alone that has a test count of 500,000+. That's a lot of oks to be processed, when I only want the ones that didn't pass. Now, add in a few more tests that easily go into the 50,000 range in addition to 'the bi one', and waiting for the test output/TAP to be parsed takes quite a while. Just my $0.02 worth. I am totally unconvinced. I would love to see some profiling on this. I would hope that the act of running 50,000 tests would swamp the cost of parsing 50,000 tests.
Re: Sparse Test Output
Michael G Schwern wrote: Christopher H. Laco wrote: I'm in the same boat. Recently, I've started testing my environment when things go wrong. (I blame Andy). I have one test alone that has a test count of 500,000+. That's a lot of oks to be processed, when I only want the ones that didn't pass. Now, add in a few more tests that easily go into the 50,000 range in addition to 'the bi one', and waiting for the test output/TAP to be parsed takes quite a while. Just my $0.02 worth. I am totally unconvinced. I would love to see some profiling on this. I would hope that the act of running 50,000 tests would swamp the cost of parsing 50,000 tests. Depends on what the tests are. If it's simply verifying that 500,000 pieces of data confirm to a certain format, then those kinds of tests aren't very expensive, or long running. Just for sake of curiosity, I'll profile one of the tests tomorrow just to see what happens. -=Chris signature.asc Description: OpenPGP digital signature
Re: Sparse Test Output
Paul Beckingham wrote: I'm with Adrian. Printing out ok 100,000 times shouldn't be a big deal unless you're reading the TAP via some sort of IP over clay tablets protocol. But... My test estimate is two orders of magnitude larger, so it actually is a big deal to capture and store those results. This raises the question, why are you storing the raw results rather than storing the results after having been run through a TAP parser? But I would like to point out that there is a very low information density in 100,000 ok messages. I was looking to see what folks thought of an *optional* feature that doesn't bother outputting the ok messages, just the interesting ones. Hence the sparse. Its important to know that a test ran. Essentially, I'm unconvinced that this is A) a real performance problem, running 10 million tests should swamp parsing 10 million TAP lines and B) can't be solved with a more flexible TAP parser than Test::Harness such as Ovid's TAPx::Parser. TAPx::Parser can be used to strip out all the ok lines, then you store what's left. One of the benefits of having a decent protocol is that you don't have to keep changing the protocol to do interesting things. The parser can do the interesting things. And we finally have an interesting parser in TAPx::Parser. Here ya go. #!/usr/bin/perl -w use strict; use TAPx::Parser; my $parser = TAPx::Parser-new( { source = $ARGV[0] } ); while( my $result = $parser-next ) { print $result-as_string, \n if !$result-is_test or !$result-is_ok; } It does add significant overhead. Here's the example of one of Regexp::Common's tests. 0 windhund /private/var/local/cpan_shell/build/Regexp-Common-2.120$ time perl -Ilib ~/tmp/strip_ok t/number/integer.t 1..23534 real0m4.882s user0m5.469s sys 0m0.155s 0 windhund /private/var/local/cpan_shell/build/Regexp-Common-2.120$ time perl -Ilib t/number/integer.t /dev/null real0m2.051s user0m2.020s sys 0m0.018s TAPx::Parser has not yet been optimized so some profiling information would likely prove interesting. I would much rather improve the parser than change the protocol.
Re: Sparse Test Output
Michael G Schwern wrote: It does add significant overhead. Here's the example of one of Regexp::Common's tests. 0 windhund /private/var/local/cpan_shell/build/Regexp-Common-2.120$ time perl -Ilib ~/tmp/strip_ok t/number/integer.t 1..23534 real0m4.882s user0m5.469s sys 0m0.155s 0 windhund /private/var/local/cpan_shell/build/Regexp-Common-2.120$ time perl -Ilib t/number/integer.t /dev/null real0m2.051s user0m2.020s sys 0m0.018s TAPx::Parser has not yet been optimized so some profiling information would likely prove interesting. I would much rather improve the parser than change the protocol. A Devel::DProf run of the above is attached. I don't trust the times but I do trust the counts. Here's the top routines to get an idea... Total Elapsed Time = 4.270165 Seconds User+System Time =0 Seconds Exclusive Times %Time ExclSec CumulS #Calls sec/call Csec/c Name 0.00 1.300 3.318 11 0. 0. TAPx::Parser::Streamed::_lex 0.00 1.279 1.268 10 0. 0. TAPx::Parser::_aggregate_results 0.00 0.910 1.129 10 0. 0. TAPx::Parser::Grammar::_make_test_token 0.00 0.889 2.277 900025 0. 0. TAPx::Parser::__ANON__ 0.00 0.700 0.700 11 0. 0. TAPx::Parser::Iterator::FH::next 0.00 0.600 1.729 11 0. 0. TAPx::Parser::Grammar::__ANON__ 0.00 0.340 0.340 11 0. 0. TAPx::Parser::Results::new 0.00 0.220 0.220 50 0. 0. TAPx::Parser::Grammar::_trim 0.00 0.180 0.180 20 0. 0. TAPx::Parser::Results::Test::is_actual_ok 0.00 0.170 0.170 22 0. 0. TAPx::Parser::Results::type 0.00 0.160 1.927 11 0. 0. TAPx::Parser::_validate 0.00 0.110 0.110 14 0. 0. TAPx::Parser::Grammar::syntax_for 0.00 0.100 0.040 10 0. 0. TAPx::Parser::_check_ending_plan 0.00 0.100 0.100 50 0. 0. TAPx::Parser::Results::Test::has_todo 0.00 0.090 0.090 11 0. 0. TAPx::Parser::Iterator::FH::is_first 0.00 0.080 0.080 33 0. 0. TAPx::Parser::Results::Test::number 0.00 0.050 0.050 11 0. 0. TAPx::Parser::Grammar::token_types 0.00 0.040 0.010 10 0. 0. TAPx::Parser::Results::Test::todo_failed 0.00 0.020 0.020 10 0. 0. TAPx::Parser::Results::Test::has_skip 0.00 0.020 0.020 11 0. 0. TAPx::Parser::Grammar::handler_for 0.00 0.010 0.010 9 0.0011 0.0011 TAPx::Parser::Results::BEGIN 0.00 0.010 0.018 9 0.0011 0.0020 TAPx::Parser::BEGIN dproff.out Description: Binary data
Re: Sparse Test Output
Adrian Howard wrote: On 5 Oct 2006, at 15:11, Paul Beckingham wrote: Recently I was required to create another flavor of test harness that runs tests, then captures and stores output. The nature of my testing means that I am running millions of tests, and the resultant captured output is therefore huge. So I modified my tests to do the *equivalent* of the following: [snip] Personally I would say it would be better to leave the things generating the TAP output alone, and have the test harness only store the test failures. Ovid's TAPx::Parser makes this pretty darn easy, and it's not stupidly difficult with Test::Harness::Straps although it needs some sneakyness with private methods. I'm with Adrian. Printing out ok 100,000 times shouldn't be a big deal unless you're reading the TAP via some sort of IP over clay tablets protocol. But... Test::Builder does store the test history and this can get prohibitive. There is a wishlist item for this and will gladly accept a patch. I originally intended this for Test::AtRuntime which can produce an infinite stream of TAP. http://rt.cpan.org/Public/Bug/Display.html?id=8387 Something like: $tb-no_history(1); with the caveat that this makes summary() and details() no longer work.