Harald: Thanks for providing these answers; I appreciate your taking the time to walk me through your thought process on this. I suppose that the challenge is in conveying the 'rules' for interpreting the test data and the assertions/failures. Once you internalize the rules for yourself, I can see how that makes the tests (potentially) more understandable (like anything, once you know the rules you can better follow along with what's going on).
Steve Bohlen
[email protected]
http://blog.unhandled-exceptions.com
http://twitter.com/sbohlen

On Tue, Apr 26, 2011 at 9:10 AM, Harald Mueller <[email protected]> wrote:

> Hi -
>
> > As an example, if/when any of these tests fail, it doesn't seem as if it
> > would be trivial (or even straightforward) to determine the *meaning* of
> > that failure without an inordinate amount of time spent in the debugger
> > (or much worse IMO, spent staring at the test data) to understand just
> > what exactly failed.
>
> No, that's not so. I know this because some of the tests failed "quite
> nicely" during development (I don't know about Patrick's experience,
> however). The full input coverage creates quite a nice output. I number all
> the objects per class with a static variable; so if you run a single test
> (that failed in a regression run), you get a failure like
>
>   Expected: <1, 3, 4, 5, 7>
>   But was:  <1, 4, 5>
>
> So you know that the condition did not work as expected on object 3. And if
> you now look at the "Setters" definition and see, e.g., that it is
>
>   Setters<TK, TBO1_I>(SetK1, SetBO1_I1)
>
> then you know that there are 3 * 4 = 12 records in the primary table
> ("MyBO"), which will have the values
>
>   K1        BO1.I1
>   ----------------
>   0 (null)  null
>   0 (null)  null
>   0 (null)  0
>   0 (null)  1
>   0 (zero)  null
>   0 (zero)  null
>   0 (zero)  0
>   0 (zero)  1
>   etc.
>
> So you know that the first difference is in the 3rd line, which in this
> case is where K1 is 0 and BO1.I1 is also 0. After a few failing tests, you
> can almost read the "failure data" off the test output.
>
> HOWEVER: My introduction of ValueNull for all possible values creates
> multiple absolutely identical object graphs. The reason is that - as you
> can see above - for a non-nullable int property (like K1), both TK.ValueNull
> and TK.Zero produce the same value. Reason: I cannot assign null to K1, so I
> assign 0. A better implementation would throw an exception when trying to
> assign null to K1 and, by this, tell the data generator that this value
> does not make sense --> then, these objects would not be generated at all.
> I'll try to change this shortly - and check how much smaller the number of
> objects then gets.
>
> > Some of these tests seem to lack clear assertions.
>
> All of them have the assertion that the Linq2Objects result is equal to the
> NHib.Linq result, except for exceptional objects. That's just the - current
> - definition of the semantics of NHib.Linq, except for special cases which
> have to be handwritten anyway.
>
> > Presumably these are testing for no-exceptions-thrown(?) but this kind of
> > test failure isn't going to result in clarity-of-reason-for-failure.
>
> Yes, that would be too little to get out of tests. I avoid such trivial
> tests as much as possible (if one has slipped through, it can only be a
> hand-written one I forgot to upgrade to a useful one).
>
> > It could be that "this is in fact the best way to go about this" but
> > intuitively it just doesn't feel 'right' to me for these tests to survive
> > as-is in the project.
>
> I agree - that would be useless.
>
> > Generative tests seem to have their purpose in re: their being a tool to
> > 'prove out' via probing to find the proper behavior of the codebase (all
> > edge cases, etc) but I'm honestly ambivalent about their long-term
> > maintenance prospects.
>
> The problem here seems to be to understand the right *space* to be tested:
> it is *not* the states of objects, but the possible variations of
> conditions.
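>
> Schematically, one of these generated tests boils down to something like
> the sketch below. The entity names (MyBO, BO1, K1, I1) are the ones from
> above; the helper names, the concrete value lists and the condition are
> only illustrative - the real suite builds all of this through the
> Setters<>/value-enum machinery rather than by hand:
>
>     using System;
>     using System.Collections.Generic;
>     using System.Linq;
>     using System.Linq.Expressions;
>     using NHibernate;
>     using NHibernate.Linq;
>     using NUnit.Framework;
>
>     [TestFixture]
>     public class GeneratedConditionTest
>     {
>         private ISession session;      // opened (and the objects persisted) in the fixture set-up
>
>         private static int _nextId;    // numbers the generated objects per class
>
>         private static IList<MyBO> GenerateAll()
>         {
>             _nextId = 0;
>             var result = new List<MyBO>();
>             foreach (int? k1 in new int?[] { null, 0, 1 })            // TK: ValueNull, Zero, One
>                 foreach (int? i1 in new int?[] { null, null, 0, 1 })  // TBO1_I: ValueNull, Null, Zero, One
>                     result.Add(new MyBO
>                     {
>                         Id = ++_nextId,
>                         K1 = k1 ?? 0,          // non-nullable: null collapses to 0 (the duplicate-graph issue)
>                         BO1 = new BO1 { I1 = i1 }
>                     });
>             return result;                     // 3 * 4 = 12 objects
>         }
>
>         [Test]
>         public void Condition_BehavesLikeLinq2Objects()
>         {
>             IList<MyBO> all = GenerateAll();
>
>             // The hand-crafted part: the condition under test.
>             Expression<Func<MyBO, bool>> cond = bo => bo.BO1.I1 == bo.K1;
>
>             var expected = all.AsQueryable().Where(cond).Select(bo => bo.Id).ToList();      // Linq2Objects
>             var actual   = session.Query<MyBO>().Where(cond).Select(bo => bo.Id).ToList();  // NHib.Linq
>
>             // A failure then reads like the "Expected: <1, 3, 4, 5, 7> / But was: <1, 4, 5>"
>             // diff above, and the object numbers map straight back to the table of combinations.
>             CollectionAssert.AreEqual(expected, actual);
>         }
>     }
>
> The point is only the shape: generate every combination, number the
> objects, evaluate the same condition once in memory and once through
> NHib.Linq, and compare the id lists.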
> So each of these tests is, like all other unit tests, hand-crafted to a
> special condition construct (e.g. doubly nested plus in the most recent
> tests; something Patrick wanted because he found an error there). The
> assertion to be fulfilled by the NHib.Linq machine is then that the SQL is
> equivalent (in a defined sense) to some behavior. The only alternative I
> can see is to write a prover that shows that the resulting SQL is
> equivalent in that sense to the Linq query - something no one would try,
> nor accept as a "test", IMHO.
>
> > In a sense, this feels like the results of PEX where you look at the
> > generated test suite and say "this was a valuable exercise, but there's
> > little value in hanging on to it long-term".
>
> Well, did you try to "failurize" (or whatever you'd call that - "change the
> code so that it fails in a way you designed") any test, e.g. by modifying
> the condition in a semantically interesting way (e.g. adding existence
> guards or the like) and then scrutinize the result? I cannot believe that
> you would then have this "feeling" ... but I may be wrong if you did it!
> Then, please, tell us about it!
>
> > As an example, I'd wonder what would happen in the following scenario: we
> > decide to change the semantic behavior of some part of the LINQ provider
> > (and thus, presumably, a number of these tests would then be broken).
> > Would the next step be modifying all the now-broken tests + dependent
> > test data to get them to pass or would the next step be to discard all
> > the existing tests and regenerate them all again?
>
> As I did just that last weekend (look into the changeset!), you look at
> each failing test; decide why it fails - in these cases, the semantics
> definition of NHib.Linq differs from Linq2Objects in cases that we (at
> least Patrick and I) have not discussed before; and then either change the
> implementation (if the test is right) or change the test (if you want the
> implementation as is, and have a good explanation of why the test is wrong
> - this is why I sent all of you the modified tests with the comments that
> explain the complexity of the change).
>
> So, as far as I see it, you do exactly what you do with other tests.
>
> Why should you discard all these well-understood tests??
>
> > I honestly don't have a well-formed opinion on this, but it sort of seems
> > to me as if the answer to this Q would lead us to a better-informed
> > choice about the long-term viability of this generated test suite.
> >
> > Does this make any sense --?
>
> Do my answers make any? (Patrick, may I ask what you say ...?)
>
> Regards
> Harald M.
