Re: Dude, where's my diagnostics? (Re: Halting on first test failure)

2008-01-12 Thread Ovid
--- Michael G Schwern <[EMAIL PROTECTED]> wrote:

> > Set a flag that T::B should quit when the next test result is
> > about to be recorded?
> 
> I guess it works, but it leaves you dead halfway through another test
> function which is weird.

The code I sent is conceptually similar, but since a failure is always
followed by a diag(), it quits when the next diag() is hit.  So far it
seems to work.
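
Conceptually, it's something like this (a simplified sketch of the
shape, not the exact code from my earlier mail):

```perl
# Simplified sketch -- wraps Test::Builder so the program dies only
# *after* the failing test's diagnostics have been printed.
use Test::Builder;

my $orig_ok   = \&Test::Builder::ok;
my $orig_diag = \&Test::Builder::diag;
my $failed    = 0;

no warnings 'redefine';

# Remember that the last test failed...
*Test::Builder::ok = sub {
    my $self = shift;
    my $ret  = $orig_ok->( $self, @_ );
    $failed = 1 unless $ret;
    return $ret;
};

# ...and halt only once its diag() has gone out.
*Test::Builder::diag = sub {
    my $self = shift;
    my $ret  = $orig_diag->( $self, @_ );
    die "Test failed.  Halting.\n" if $failed;
    return $ret;
};
```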

> > Load the debugger and set a breakpoint?
> 
> Oh, good one.  If the debugger wasn't so damned full of bugs that
> might just work as a general solution.

Agreed.  The debugger is too buggy.  Maybe Devel::Ebug?

Cheers,
Ovid

--
Buy the book  - http://www.oreilly.com/catalog/perlhks/
Perl and CGI  - http://users.easystreet.com/ovid/cgi_course/
Personal blog - http://publius-ovidius.livejournal.com/
Tech blog - http://use.perl.org/~Ovid/journal/


Re: Dude, where's my diagnostics? (Re: Halting on first test failure)

2008-01-12 Thread Ovid
--- Michael G Schwern <[EMAIL PROTECTED]> wrote:

> PS  Couldn't you have the TAP harness kill the test process on first
> failure?

I then have even less control over the diagnostics than I would if
Test::Builder handled this responsibility.

It's also an improper separation of concerns.  Test::Builder produces
the output and the harness interprets it.  BAIL_OUT isn't an
exception: the "Bail out!" line tells the harness to stop running more
tests, but it's Test::Builder which handles the termination:

  sub BAIL_OUT {
      my($self, $reason) = @_;

      $self->{Bailed_Out} = 1;
      $self->_print("Bail out!  $reason");
      exit 255;
  }

Cheers,
Ovid



Re: Dude, where's my diagnostics? (Re: Halting on first test failure)

2008-01-12 Thread Michael G Schwern
Aristotle Pagaltzis wrote:
> * Michael G Schwern <[EMAIL PROTECTED]> [2008-01-12 12:00]:
>> Ovid wrote:
>>> I'll go fix that diagnostic thing now. Unfortunately, I
>>> think I'll have to violate encapsulation :(
>> If you know how to fix it let me know, because other than
>> enumerating each testing module you might use and lex-wrapping
>> all the functions they export, I'm not sure how to do it.
> 
> Set a flag that T::B should quit when the next test result is
> about to be recorded?

I guess it works, but it leaves you dead halfway through another test function
which is weird.


>> One possibility involves taking advantage of $Level, so at
>> least Test::Builder knows which is the test function the user
>> called, and then, somehow, inserting the code necessary to
>> cause failure when that function exits. I don't know how you
>> insert code to run when a function that's already being
>> executed exits.
> 
> Load the debugger and set a breakpoint?

Oh, good one.  If the debugger wasn't so damned full of bugs that might just
work as a general solution.


-- 
Robrt:   People can't win
Schwern: No, but they can riot after the game.


Re: Dude, where's my diagnostics? (Re: Halting on first test failure)

2008-01-12 Thread Michael G Schwern
Ovid wrote:
> --- Michael G Schwern <[EMAIL PROTECTED]> wrote:
> 
>> The whole idea of halting on first failure was introduced to me by
>> some XUnit
>> folks ... As any field scientist knows, there's no such thing as
>> uncontaminated data.
> 
> As any tester knows, a "one size fits all suit" often doesn't fit.  Let
> people decide for themselves when a particular method of testing is
> appropriate.  I hate "you must halt testing on a failure" as much as I
> hate "you must not halt testing on failure".  It's not XOR.

When it comes to failure, I like to err on the side of more information.


> There's a certain irony that beginning testers are often told to fix
> the *first* error *first* and subsequent errors go away.  I'm not
> saying this is a silver bullet to solve testing, but sometimes it's
> very useful.  

That's the general idea for dealing with syntax errors, too.

The trick is, you don't know ahead of time whether the information
from the follow-on failures will prove to be useful.  You can't tell
until you see it.  So "don't freak out over all the subsequent
failures, fix the first thing and re-run" is a decent plan, but you
can't just ignore them either.


> I am feeling a bit stupid because I can't figure out your conclusion. 
> Humor me.  At times it sounds like you're telling people not to do this
> and at times it sounds like you're telling people it's hard to do with
> Test::Builder :)

Yes, I'm saying both.  I don't like it AND it appears impossible to do
right with TB.  Though I do still ponder how to make it work anyway.


PS  Couldn't you have the TAP harness kill the test process on first failure?

-- 
24. Must not tell any officer that I am smarter than they are, especially
if it’s true.
-- The 213 Things Skippy Is No Longer Allowed To Do In The U.S. Army
   http://skippyslist.com/?page_id=3


Re: Dude, where's my diagnostics? (Re: Halting on first test failure)

2008-01-12 Thread Aristotle Pagaltzis
* Michael G Schwern <[EMAIL PROTECTED]> [2008-01-12 12:00]:
> Ovid wrote:
> > I'll go fix that diagnostic thing now. Unfortunately, I
> > think I'll have to violate encapsulation :(
> 
> If you know how to fix it let me know, because other than
> enumerating each testing module you might use and lex-wrapping
> all the functions they export, I'm not sure how to do it.

Set a flag that T::B should quit when the next test result is
about to be recorded?
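
Roughly this shape, say (the subclass and flag name are invented for
illustration, not real Test::Builder internals):

```perl
# Hypothetical patch sketch of the flag idea.
package My::Test::Builder;
use base 'Test::Builder';

our $Quit_Before_Next_Test = 0;

sub ok {
    my $self = shift;
    # Quit just before the *next* result would be recorded, so the
    # previous failure's diagnostics have already been printed.
    exit 255 if $Quit_Before_Next_Test;
    return $self->SUPER::ok(@_);
}
```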

> One possibility involves taking advantage of $Level, so at
> least Test::Builder knows which is the test function the user
> called, and then, somehow, inserting the code necessary to
> cause failure when that function exits. I don't know how you
> insert code to run when a function that's already being
> executed exits.

Load the debugger and set a breakpoint?
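
Something along these lines, perhaps (the hook name is invented):

```perl
# Sketch: instead of killing the process on failure, drop into the
# debugger.  Only has an effect when running under perl -d.
sub on_first_failure {
    # $DB::single makes the debugger stop at the next statement, as
    # if a breakpoint had been set there.
    $DB::single = 1;
}
```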

Regards,
-- 
Aristotle Pagaltzis // 


Re: Dude, where's my diagnostics? (Re: Halting on first test failure)

2008-01-12 Thread Ovid
--- Michael G Schwern <[EMAIL PROTECTED]> wrote:

> The whole idea of halting on first failure was introduced to me by
> some XUnit
> folks ... As any field scientist knows, there's no such thing as
> uncontaminated data.

As any tester knows, a "one size fits all suit" often doesn't fit.  Let
people decide for themselves when a particular method of testing is
appropriate.  I hate "you must halt testing on a failure" as much as I
hate "you must not halt testing on failure".  It's not XOR.

There's a certain irony that beginning testers are often told to fix
the *first* error *first* and subsequent errors go away.  I'm not
saying this is a silver bullet to solve testing, but sometimes it's
very useful.  

I am feeling a bit stupid because I can't figure out your conclusion. 
Humor me.  At times it sounds like you're telling people not to do this
and at times it sounds like you're telling people it's hard to do with
Test::Builder :)

Cheers,
Ovid



Re: Dude, where's my diagnostics? (Re: Halting on first test failure)

2008-01-12 Thread Michael G Schwern
The whole idea of halting on first failure was introduced to me by some XUnit
folks.  Their rationale was not to avoid spewing output, they had no such
problem since it's all done via a GUI, but that once one failure has happened
the failing code might hose the environment and all following results are now
considered contaminated.  This might make sense in a laboratory, but
throwing out perfectly fine data seems like overkill for day-to-day
software testing.  As any field scientist knows, there's no such thing as
uncontaminated data.

The idea that you can diagnose everything from the first failure reminded me
of a gag about tech support that goes something like this:
http://www.netfunny.com/rhf/jokes/97/Oct/techsupport.html

TECH: "Ridge Hall computer assistant; may I help you?"

CUST: "Yes, well, I'm having trouble with WordPerfect."

TECH: "What sort of trouble?"

CUST: "Well, I was just typing along, and all of a sudden the words went
away."

TECH: "Went away?"

CUST: "They disappeared."

TECH: "Hmm. So what does your screen look like now?"

CUST: "Nothing."

TECH: "Nothing?"

CUST: "It's blank; it won't accept anything when I type."

TECH: "Are you still in WordPerfect, or did you get out?"

CUST: "How do I tell?"

TECH: "Can you see the "C" prompt on the screen?"

CUST: "What's a sea-prompt?"

TECH: "Never mind. Can you move the cursor around on the screen?"

CUST: "There isn't any cursor: I told you, it won't accept anything I
type."

TECH: "Does your monitor have a power indicator?"

CUST: "What's a monitor?"

TECH: "It's the thing with the screen on it that looks like a TV. Does it
have a little light that tells you when it's on?"

CUST: "I don't know."

TECH: "Well, then look on the back of the monitor and find where the power
cord goes into it. Can you see that?"

CUST: "...Yes, I think so."

TECH: "Great! Follow the cord to the plug, and tell me if it's plugged into
the wall."

CUST: "...Yes, it is."

TECH: "When you were behind the monitor, did you notice that there were two
cables plugged into the back of it, not just one?"

CUST: "No."

TECH: "Well, there are. I need you to look back there again and find the
other cable."

CUST: "...Okay, here it is."

TECH: "Follow it for me, and tell me if it's plugged securely into the back
of your computer."

CUST: "I can't reach."

TECH: "Uh huh. Well, can you see if it is?"

CUST: "No."

TECH: "Even if you maybe put your knee on something and lean way over?"

CUST: "Oh, it's not because I don't have the right angle-it's because it's
dark."

TECH: "Dark?"

CUST: "Yes-the office light is off, and the only light I have is coming in
from the window."

TECH: "Well, turn on the office light then."

CUST: "I can't."

TECH: "No? Why not?"

CUST: "Because there's a power outage."

TECH: "A power... a power outage? Aha! Okay, we've got it licked now.  Do
you still have the boxes and manuals and packing stuff your computer came
in?"

CUST: "Well, yes, I keep them in the closet."

TECH: "Good! Go get them, and unplug your system and pack it up just like
it was when you got it. Then take it back to the store you bought it from."

CUST: "Really? Is it that bad?"

TECH: "Yes, I'm afraid it is."

CUST: "Well, all right then, I suppose. What do I tell them?"

TECH: "Tell them you're too stupid to own a computer."


-- 
94. Crucifixes do not ward off officers, and I should not test that.
-- The 213 Things Skippy Is No Longer Allowed To Do In The U.S. Army
   http://skippyslist.com/?page_id=3


Re: Dude, where's my diagnostics? (Re: Halting on first test failure)

2008-01-12 Thread Michael G Schwern
Ovid wrote:
> I'll go fix that diagnostic thing now.  Unfortunately, I think I'll
> have to violate encapsulation :(

If you know how to fix it let me know, because other than enumerating each
testing module you might use and lex-wrapping all the functions they export,
I'm not sure how to do it.  Test::Builder could cheat and register each module
as they load Test::Builder, but that relies on their using Test::Builder and
not requiring it.  Or builder() or new() could do the registration, but
there's no guarantee that they'll be called in the right package... however,
it is very likely.

One possibility involves taking advantage of $Level, so at least Test::Builder
knows which is the test function the user called, and then, somehow, inserting
the code necessary to cause failure when that function exits.  I don't know
how you insert code to run when a function that's already being executed exits.

This is why I altered the recommended calling conventions for Test::Builder to
call ->builder at the beginning of each function rather than just use one
global.  Then at least I can use the builder object's DESTROY method to
indicate the end of a test to trigger this stuff.
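
The shape would be something like this (the class and method names are
invented to show the idea, not actual Test::Builder code):

```perl
package Test::Builder::EndGuard;

my $Halt_Requested = 0;

sub new          { bless {}, shift }
sub request_halt { $Halt_Requested = 1 }

sub DESTROY {
    # Fires when the guard created at the top of a test function goes
    # out of scope, i.e. when that function returns.  By then the
    # failure's diagnostics have been printed, so halting is safe.
    exit 255 if $Halt_Requested;
}

package main;

sub my_test_function {
    my $guard = Test::Builder::EndGuard->new;
    # ... run the assertions; call request_halt() on failure ...
}   # $guard is destroyed here, at function exit
```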

There is, of course, a way to eliminate the problem at the source.  Since the
issue is the spewing test output and then having to scroll up to find the
original point of failure, perhaps the solution is not to truncate the output
but to use something better than just raw terminal output.  If only there was
something that could... I don't know... read the TAP and error messages and
produce a nicer output.  Some sort of TAP parser... :P
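
For example, with TAP::Parser (now shipping as part of Test::Harness
3) a tool could read the stream and stop displaying -- or running --
at the first failure, no Test::Builder surgery required.  Something
like this, where the test file name is just a placeholder:

```perl
use TAP::Parser;

my $parser = TAP::Parser->new({ source => 't/some_test.t' });

while ( my $result = $parser->next ) {
    print $result->as_string, "\n";
    # Stop at the first real test failure.
    last if $result->is_test && !$result->is_ok;
}
```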


-- 
 What we learned was if you get confused, grab someone and swing
  them around a few times
-- Life's lessons from square dancing


Re: Dude, where's my diagnostics? (Re: Halting on first test failure)

2008-01-12 Thread Ovid
--- Geoffrey Young <[EMAIL PROTECTED]> wrote:

> 
> > There are two usual rebuttals.  
> 
> the third being "just add it and let me decide"
> 
> :)

Right in one!

I'll go fix that diagnostic thing now.  Unfortunately, I think I'll
have to violate encapsulation :(

As a side note:  after reading everything Michael wrote, I remember an
incident at the company he mentioned where the test suites died on
failure (I *hated* that feature because it was mandatory).  One day I
was in a meeting with a vice president and he mentioned to a programmer
that a particular "adjustment percentage" on a page had changed
and the customers (a bunch of very powerful prima donnas) wanted it
changed back.

The programmer took a huge amount of time explaining why the number had
changed, drawing diagrams on the board, explaining the new statistical
formulas which were being used for the calculations and making it very,
very clear why the *new* number was correct.  The vice president's eyes
glazed over and when the programmer was done, the veep said "yeah, but
can you just change the fucking number back?"

So the programmer did and the customers were happy.

Cheers,
Ovid



Re: Dude, where's my diagnostics? (Re: Halting on first test failure)

2008-01-11 Thread Geoffrey Young


> There are two usual rebuttals.  


the third being "just add it and let me decide"

:)

--Geoff


Re: Dude, where's my diagnostics? (Re: Halting on first test failure)

2008-01-11 Thread demerphq
On 12/01/2008, Michael G Schwern <[EMAIL PROTECTED]> wrote:
> --
> 184. When operating a military vehicle I may *not* attempt something
>  "I saw in a cartoon".
> -- The 213 Things Skippy Is No Longer Allowed To Do In The U.S. Army
>http://skippyslist.com/?page_id=3

That was one of the funniest things i have read in quite a long time.

yves

-- 
perl -Mre=debug -e "/just|another|perl|hacker/"


Dude, where's my diagnostics? (Re: Halting on first test failure)

2008-01-11 Thread Michael G Schwern
Ovid wrote:
> I've posted a trimmed down version of the custom 'Test::More' we use
> here:
> 
>   http://use.perl.org/~Ovid/journal/35363
> 
> I can't recall who was asking about this, but you can now do this:
> 
>   use Our::Test::More 'no_plan', 'fail';
> 
> If 'fail' is included in the import list, the test program will die
> immediately after the first failure.  VERY HANDY at times.

I've experimented with this idea in the past to use Test::Builder to replace
home rolled "die on failure" assert() style test suites.  Unfortunately
there's a major problem:

$ perl -wle 'use OurMore "fail", "no_plan";  is 23, 42'
not ok 1
#   Failed test at /usr/local/perl/5.8.8/lib/Test/More.pm line 329.
Test failed.  Halting at OurMore.pm line 44.
1..1

Dude, where's my diagnostics?

In Test::Builder, the diagnostics are printed *after* the test fails.  So
dying on ok() will kill those very important diagnostics.  Sure, you don't
have to read a big list of garbage but now you don't have anything to read at 
all!

Since the diagnostics are printed by a calling function outside of
Test::Builder's control (even if you cheated and wrapped all of Test::More
there's all the Test modules on CPAN, too) I'd considered die on failure
impossible. [1]  The diagnostics are far more important.
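
To see why, here's roughly the shape of a Test::More-style comparison
function (a simplified sketch, not the real source):

```perl
# Simplified sketch of an is()-style function: the (not) ok line is
# printed first and the useful diagnostics only afterwards, so dying
# inside ok() loses them.
sub is_sketch {
    my($got, $expected, $name) = @_;
    my $ok = $got eq $expected;

    print $ok ? "ok - $name\n" : "not ok - $name\n";

    unless ($ok) {
        # These are the lines a die-on-ok scheme would never print.
        print "#          got: '$got'\n";
        print "#     expected: '$expected'\n";
    }
    return $ok;
}
```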


Now, getting into opinion, I really, really hate die on failure.  I had to use
a system that implemented it for a year (Ovid knows just what I'm talking
about) and I'd rather scroll up through an occasional burst of errors and
warnings than ever not be able to fully diagnose a bug because a test bailed
out before it was done giving me all the information I needed to fix it.  For
example, let's look at the ExtUtils::MakeMaker tests for generating a PPD file.

ok( open(PPD, 'Big-Dummy.ppd'), '  .ppd file generated' );
my $ppd_html;
{ local $/; $ppd_html = <PPD> }
close PPD;
like( $ppd_html, qr{^}m,
   '  ' );
like( $ppd_html, qr{^\s*Big-Dummy}m,'  '   );
like( $ppd_html, qr{^\s*Try "our" hot dog's}m,
   '  ');
like( $ppd_html,
  qr{^\s*Michael G Schwern <[EMAIL PROTECTED]>}m,
   '  '  );
like( $ppd_html, qr{^\s*}m,  '  ');
like( $ppd_html, qr{^\s*}m,
   '  ' );
like( $ppd_html, qr{^\s*}m,
   '  '  );
my $archname = $Config{archname};
$archname .= "-". substr($Config{version},0,3) if $] >= 5.008;
like( $ppd_html, qr{^\s*}m,
   '  ');
like( $ppd_html, qr{^\s*}m,'  ');
like( $ppd_html, qr{^\s*}m,   '  ');
like( $ppd_html, qr{^\s*}m,  '  ');

Let's say the first like() fails.  So you go into the PPD code and fix that.
Rerun the test.  Oh, the second like failed.  Go into the PPD code and fix
that.  Oh, the fifth like failed.  Go into the PPD code and fix that...

Might it be faster and useful to see all the related failures at once?

And then sometimes tests are combinatorial.  A failure of A means one thing
but A + B means another entirely.

Again, let's look at the MakeMaker test to see if files got installed.

ok( -e $files{'dummy.pm'}, '  Dummy.pm installed' );
ok( -e $files{'liar.pm'},  '  Liar.pm installed'  );
ok( -e $files{'program'},  '  program installed'  );
ok( -e $files{'.packlist'},'  packlist created'   );
ok( -e $files{'perllocal.pod'},'  perllocal.pod created' );

If the first test fails, what does that mean?  Well, it could mean...

A)  Only Dummy.pm failed to get installed and it's a special case.
B)  None of the .pm files got installed, but everything else installed ok.
C)  None of the .pm files or the programs got installed, but the
generated files are ok
D)  Nothing got installed and the whole thing is broken.

Each of these things suggests different debugging tactics.  But with a "die on
failure" system they all look exactly the same.


Oooh, and if you're the sort of person that likes to use the debugger it's
jolly great fun to have the test suite just KILL THE PROGRAM when you want to
diagnose a post-failure problem.


There are two usual rebuttals.  The first is "well just turn off
die-on-failure and rerun the test."  Ovid's system is at least capable of
being turned off; many hard-code "failure == die".  Unfortunately Ovid's
switch is at the file level when it should be at the user level, since "do
I or do I not want to see the gobbledygook" is a user preference.

But we all know the problems with the "just rerun the tests" approach.

Maybe re-running the tests just isn't possible, or it's really slow to do so?
 What if these are tests on a shipped module and all you've got is an email
with a cut & pasted report?  Now you've lost time waiting for the user to
rerun the tests with a special flag set... assuming you hea