Harald:

Thanks for providing these answers; I appreciate your taking the time to
walk me through your thought process on this.  I suppose that the challenge
is in conveying the 'rules' for interpreting the test data and the
assertions/failures.  Once you internalize the rules for yourself, I can see
how that makes the tests (potentially) more understandable (like anything,
once you know the rules you can better follow along with what's going on).

Steve Bohlen
[email protected]
http://blog.unhandled-exceptions.com
http://twitter.com/sbohlen


On Tue, Apr 26, 2011 at 9:10 AM, Harald Mueller <[email protected]> wrote:

> Hi -
>
> > As an example, if/when any of these tests fail, it doesn't seem as if it
> > would be trivial (or even straightforward) to determine the *meaning* of
> > that failure without an inordinate amount of time spent in the debugger
> (or
> > much worse IMO, spent staring at the test data) to understand just what
> > exactly failed.
>
> No, that's not so. I know this because some of the tests failed "quite
> nicely" during development (I don't know about Patrick's experience,
> however). The full input coverage creates quite readable output. I number all
> the objects per class with a static variable; so if you run a single test
> (that failed in a regression run), you get a failure like
>
>    Expected: <1, 3, 4, 5, 7>
>    But was: <1, 4, 5>
>
> So, you know that the condition did not work as expected on object 3. But
> if you now look at the "Setters" definition, you see e.g. that it is
>
>    Setters<TK, TBO1_I>(SetK1, SetBO1_I1)
>
> you know that there are 3 * 4 = 12 records in the primary table ("MyBO"),
> which will have the values
>
>    K1        BO1.I1
>    ----------------
>    0 (null)  null
>    0 (null)  null
>    0 (null)  0
>    0 (null)  1
>    0 (zero)  null
>    0 (zero)  null
>    0 (zero)  0
>    0 (zero)  1
> etc.
>
> So you know that the first difference is in the 3rd line, which in this case
> is where K1 is 0 and BO1.I1 is also 0.
> After a few failing tests, you can almost read the "failure data" off the
> test output.
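>
> In sketch form - the names MyBO.Id, s_nextId, BO1 and CreateAll here are
> only illustrative, not the actual generator code - the numbering and the
> cross product look roughly like this:
>
>    // assumes: using System.Collections.Generic;
>    public class BO1 {
>        public int? I1 { get; set; }
>    }
>
>    public class MyBO {
>        private static int s_nextId;      // reset between test runs
>        public int Id = ++s_nextId;       // objects are numbered 1, 2, 3, ... per class
>        public int K1 { get; set; }
>        public BO1 BO1 { get; set; }
>    }
>
>    static class TestData {
>        // Cross product of the configured value lists; each combination becomes
>        // one numbered MyBO record, in the order of the table above.
>        public static IEnumerable<MyBO> CreateAll(IEnumerable<int?> k1Values,
>                                                  IEnumerable<int?> i1Values) {
>            foreach (var k1 in k1Values)
>                foreach (var i1 in i1Values)
>                    yield return new MyBO {
>                        K1 = k1 ?? 0,      // null silently becomes 0 - see below
>                        BO1 = new BO1 { I1 = i1 }
>                    };
>        }
>    }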
>
> HOWEVER: My introduction of ValueNull for all possible values creates
> multiple absolutely identical object graphs. The reason is that - as you can
> see above - for a non-nullable int property (like K1), both TK.ValueNull and
> TK.Zero produce the same value. Reason: I cannot assign null to K1, so I
> assign 0. A better implementation would throw an exception when trying to
> assign null to K1; and by this, tell the data generator that this value does
> not make sense --> then, these objects would not be generated at all. I'll
> try to change this shortly - and check how much smaller the number of
> objects then gets.
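>
> Roughly what I have in mind (SetK1 here is an illustrative sketch, not the
> current code):
>
>    // assumes: using System;
>    // A strict setter rejects the meaningless value instead of silently
>    // mapping it to 0 ...
>    public static void SetK1(MyBO bo, int? value) {
>        if (!value.HasValue)
>            throw new ArgumentException("K1 is not nullable - this combination makes no sense");
>        bo.K1 = value.Value;
>    }
>
>    // ... so the generator can simply drop any combination whose setter
>    // refuses the value:
>    //    try { SetK1(bo, candidate); } catch (ArgumentException) { /* skip this object */ }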
>
> > Some of these tests seem to lack clear assertions.
>
> All of them assert that the Linq2Objects result is equal to the
> NHib.Linq result, except for exceptional objects. That is just the - current
> - definition of the semantics of NHib.Linq; the special cases have to be
> handwritten anyway.
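>
> Concretely, that shared assertion has roughly this shape (Check, allBOs,
> session and the usings are illustrative fixture details, not the exact
> repository code):
>
>    // assumes: using System; using System.Linq; using System.Linq.Expressions;
>    //          using NHibernate.Linq; using NUnit.Framework;
>    void Check(Expression<Func<MyBO, bool>> condition) {
>        // Linq2Objects over the in-memory object graphs defines the expected semantics ...
>        var expected = allBOs.AsQueryable().Where(condition)
>                             .Select(bo => bo.Id).OrderBy(id => id).ToList();
>        // ... and the same condition through the NHib.Linq provider must yield the same Ids.
>        var actual = session.Query<MyBO>().Where(condition)
>                            .Select(bo => bo.Id).OrderBy(id => id).ToList();
>        CollectionAssert.AreEqual(expected, actual);
>    }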
>
> > Presumably
> > these are testing for no-exceptions-thrown(?) but this kind of test
> failure
> > isn't going to result in clarity-of-reason-for-failure.
>
> Yes, that would be too little to get out of tests. I avoid such trivial
> tests as much as possible (if one has slipped through, it can only be a
> hand-written one I forgot to upgrade to a useful one).
>
> > It could be that "this is in fact the best way to go about this" but
> > intuitively it just doesn't feel 'right' to me for these tests to survive
> as-is
> > in the project.
>
> I agree - that would be useless.
>
> > Generative tests seem to have their purpose in re: their being a tool to
> > 'prove out' via probing to find the proper behavior of the codebase (all
> > edge cases, etc) but I'm honestly ambivalent about their long-term
> maintenance
> > prospects.
>
> The problem here seems to be to understand the right *space* to be tested:
> It is *not* the states of objects, but the possible variations of
> conditions. So each of these tests is, like all other unit tests,
> hand-crafted to a special condition construct (e.g. doubly nested plus in
> the most recent tests; something Patrick wanted because he found an error
> there). The assertion to be fulfilled by the NHib.Linq machinery is then that
> the generated SQL is equivalent (in a defined sense) to that behavior. The only
> alternative I can see is to write a prover that shows that the resulting SQL
> is equivalent in that sense to the Linq query - something no one would try to
> do, and no one would accept as a "test", IMHO.
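>
> To make that concrete: the only hand-crafted part of such a test is the
> condition construct itself; the generated data and the Check comparison
> sketched above stay the same. Purely illustrative conditions, not the
> actual generated tests:
>
>    [Test] public void SimpleEquality()   { Check(bo => bo.BO1.I1 == 0); }
>    [Test] public void NestedArithmetic() { Check(bo => bo.K1 + (bo.BO1.I1 ?? 0) == 1); }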
>
> > In a sense, this feels like the results of PEX where you look
> > at the generated test suite and say "this was a valuable exercise, but
> > there's little value in hanging on to it long-term".
>
> Well, did you try to "failurize" (or whatever you'd call that - "change the
> code so that it fails in a way you designed") any test, e.g. by modifying
> the condition in a semantically interesting way (e.g. adding existence
> guards or the like) and then scrutinize the result? I cannot believe that
> you would still have this "feeling" then ... but I may be wrong if you did it!
> If so, please tell us about it!
>
> >
> > As an example, I'd wonder what would happen in the following scenario: we
> > decide to change the semantic behavior of some part of the LINQ provider
> > (and thus, presumably, a number of these tests would then be broken).
>  Would
> > the next step be modifying all the now-broken tests + dependent test data
> > to get them to pass or would the next step be to discard all the existing
> > tests and regenerate them all again?
>
> As I did just that weekend (look into the changeset!), you look at each
> failing test and decide why it fails - in these cases, the semantics
> definition of NHib.Linq differs from Linq2Objects in cases that we (at least
> Patrick and I) have not discussed before. Then you either change the
> implementation (if the test is right) or change the test (if you want the
> implementation as is, and have a good explanation of why the test is wrong -
> this is why I sent all of you the modified tests with comments that explain
> the complexity of the change).
>
> So, as far as I see it, you do exactly what you do with other tests.
>
> Why should you discard all these well-understood tests??
>
> >
> > I honestly don't have a well-formed opinion on this, but it sort of seems
> > to me as if the answer to this Q would lead us to a better-informed
> choice
> > about the long-term viability of this generated test suite.
> >
> > Does this make any sense --?
>
> Do my answers make any? (Patrick, may I ask what you say ...?)
>
> Regards
> Harald M.
>
>
