Ovid wrote:
> ----- Original Message ----
> From: Adam Kennedy <[EMAIL PROTECTED]>
> 
>> What I find surprising is the concept that a TODO test is assumed to fail.
>>
>> If I mark a test (block) as TODO, I would have thought by flagging some 
>> tests as not done yet then for the purposes of PASS/FAIL I don't care 
>> what the result it.
> 
> As a test author, you may not care, but as someone using your code, I most 
> assuredly DO care.  Consider:
> 
>   my $object = Object->new;
>   # do a bunch of stuff
>   $object->store;
> 
> Now imagine that the code blows up when I try to store the object.  If I only 
> use features 
> that are listed as 'OK' by the test suite, I should presume that they not 
> leave the object 
> in some sort of inconsistent state that store() fails on.  If TODO tests mark 
> a complete 
> feature and some parts of that feature pass but I don't notice that they've 
> unexpectedly 
> succeeded, I have no way of really trusting what state they'll leave my 
> object in.  Thus, 
> I can't trust TODO tests.

Let me poke some holes in this scenario.

You are, presumably, using an documented feature since one hopes the authors 
wouldn't document a feature which is not yet implemented.  If its a bug, bugs 
don't get documented anyway so you're no better off than you would be without 
the TODO test.

In order to find out what features are listed as "OK" you have presumably 
looked at the raw TAP output.  Most tests are displayed at a high enough 
granularity under normal operations for you to determine the pass/fail of 
individual features.  How did you misinterpret miss the "not ok # TODO" line as 
a pass?  It says right there "not ok".

Finally, this argument seems to be that "not ok # TODO" should fail which is to 
say "don't use TODO tests".  But the alternative is to SKIP or delete the test 
(presuming the author has not fixed the bug), both result in a passing test 
suite AND you can't even look at the TAP output to see the broken (yet TODO) 
test.  The alternative sucks more.


> 
> This is why we normalize databases and use transactions:  to make it 
> impossible (in theory) 
> to leave the database in an inconsistent state.
> 
> Let's look at something concrete:
> 
>   TODO: {
>     local $TODO = 'We do not have anti-gravity drives';
>     can_ok $spachship, 'engage_anti_grativity';
>     ok $spaceship->engage_anti_gravity,
>       'We can use anti-gravity';
>     ok $spaceship->is_floating,
>       'The spaceship is floating'
>   }
> 
> And if someone wrote the following stub:
> 
>   sub engage_anti_gravity { 1 }
> 
> Then under the current Test::Harness behavior, that would be reported as a 
> success, 
> the first two tests would be reported as success.
> 
> And the developer sees that engage_anti_gravity() passed so she writes:
> 
>   $spaceship->engage_anti_gravity;
> 
> And everybody dies.
> 
> That would be disappointing.

Same arguments apply as above about insufficient displayed test granularity.  
In order for a user to get into this scenario they have to look at the raw TAP.

In order for this scenario to happen the *developer*, not the user, is the one 
making the mistakes.  The developer must release code with a stub in it.  The 
developer is the one who is using the test harness which does not display TODO 
tests.  The developer shipped busted code.  At that point I'd say "your shit is 
broke, use a better harness".

And even with your proposed change, where passing TODO tests become failures, 
the developer could misinterpret it just as your proposed user does.  Maybe 
developer A added the stub and developer B, not knowing about the stub, removes 
the TODO wrapper.  Or maybe the developer just forgot.  The scenario continues 
unchanged.

The real problem is the test ain't so hot.  I would call "everybody dies" a 
condition I would like to test a little more thoroughly than "it returned true".

    ok( $spaceship->engage_anti_gravity );
    is( gravity(), 0 );

Reply via email to