Hi Patrick - then let me jump in with a list of items on my mind:

a. Let's talk about design & code!
b. Dialects / database vendors to support
c. Still create table joins? or always build inner joins?
d. Unit testing vs. system testing
e. A first list of concrete tests to be done

Ad a. I think it is necessary to explain the design and code (decisions). Here 
is the most important item: I try to minimize the number of clauses and left 
joins. The most important part of this is the attempt to "pull up" the 
"existence clauses" as high as possible. What are "existence clauses?" Well, as 
I wrote in a previous posting, the whole idea is to rewrite (Linq)

    ... || ...a.b.c.d.Prop...
    
to (HQL)

    ... LEFT JOIN a.b AS a_b
    ... LEFT JOIN a_b.c AS a_b_c
    ... LEFT JOIN a_b_c.d AS a_b_c_d
    ...
    WHERE
        ... OR (a_b IS NOT NULL AND a_b_c IS NOT NULL AND a_b_c_d IS NOT NULL 
AND (...a.b.c.d.Prop...))
        
Thus, the objects on the "path" a.b.c.d are outer joined so that the || 
operator can still "succeed on the left side". But before anything is evaluated 
for a.b.c.d on the right side, it is guaranteed that the objects exist by the 
IS NOT NULL clauses. I call these clauses "existence clauses".

They are especially important in the case

    ... || a.b.Prop == null
    
which can NOT be transformed simply to

    ... LEFT JOIN a.b AS a_b
    ...
    WHERE
        ... OR a_b.Prop IS NULL

because this will obviously be true also when a_b is not at all present. 
Analyzing the condition for IS NULL etc. is in my experience (we tried it) 
futile, as you can have arbitrarily complex conditions like !(!(!(a.b.Prop != 
null))). So the (or rahter, one possible) correct rewrite is

    ... LEFT JOIN a.b AS a_b
    ...
    WHERE
        ... OR (a_b IS NOT NULL AND (a_b.Prop IS NULL))

Naive rewriting of this sort has a problem with a query expression like

    a.b.Prop == 4 || a.b.Prop == 5
    
This would yield

    ... LEFT JOIN a.b AS a_b
    ...
    WHERE
        (a_b IS NOT NULL AND (a_b.Prop == 4)) 
     OR (a_b IS NOT NULL AND (a_b.Prop == 5))

Obviously, neither the left join nor the existence clauses are necessary for 
this frequently occurring pattern. Therefore, the code "pulls up" existence 
clauses that occur on both sides of an ||.

However, I do *not* go the extra length of reordering clauses. For example,

    (a.b.Prop == 4 || a.OtherProp == 100) || a.b.Prop == 5
    
will create left join + existence clauses for a.b.

... "and so on": I could write here or in comments more about examples and 
considerations. However, maybe now you tune in with questions and comments of 
yours!

Ad b. Right now, I only test against SQL Server 2008 (Express and Standard). 
The || problem is independent of that. Against which databases should I (or we) 
test such changes? Something like SQL Server + MySql would sound useful to me 
...

Ad c. Right now, my code has an if at the end where it essentially says "if I 
did not produce an outer join, then don't do anything different" = "use table 
joins." 
I did this to not change the behavior in all cases that already work correctly 
- which could be important for applications that have tuned their indices to 
query plans resulting from the table joins. Personally, I'd prefer to have "the 
one and only" join method - i.e., always create inner and outer joins, never 
table joins.

Ad d. "System tests," for this code, are tests that set up a database and check 
whether the results are correct.
"Unit tests" would be tests that test only a single inner unit, e.g. the 
AddJoinReWriter or even smaller parts.

I have the gut feeling that with the non-trivial algorithms, I'd like to also 
write unit tests. Seeing problems only through the lens of the subsequent HQL 
machine is not enough, neither for exploratory testing (if someone else wants 
to understand what goes on) nor for bug fixing. Do you agree? ... although 
this, of course, adds more work.

Ad e. Last, but not least, here a quickly written list of possible test cases 
(inputs and partial output expectations) that should cover the code (which 
could be checked with a coverage tool). The notation is: 

- A, B, ... are Linq clauses
- xyzP is shorthand for a path x.y.P, where 
-- y and z are :1 associations, and 
-- P denotes one or more simple properties.
- IJ means "expected to be inner joined *without* existence clauses"
- OJ means "...to be outer joined *with* existence clauses"

For example, the first test case

              (A && B) && (C || D)
                    // xyP in A, C --> xy IJ

could be expanded to

     session.Query<MyBO>().Where(x =>
         (x.BO1.I1 == value1 && x.Flag1)
         &&
         (x.BO1.I2 == value2 || x.Flag2)

For this expression, it would now be necessary to do the following tests:

(a) Is the joining as designed - i.e., is BO1 joined via an inner join?

(b) Is the complete rewrite ok? In other words, one should test via "condition 
coverage," i.e., produce enough data so that each side of each && and || 
becomes true and false. I'd hope that there is some sort of tooling for this 
(maybe another Linq tree walker? :-) ) - would you know any?

Towards the end, the list gets more and more sketchy ...

                    // 1. One || tree: (A && B) && (C || D)
                    
                    // 1a. One path
                    // xyP in A, C --> xy IJ
                    // xyP in A, C, D --> xy IJ
                    // xyP in C, D --> xy IJ
                    // xyP in C --> xy OJ

                    // 1b. Two paths
                    // xyP in A, C; rsQ in C --> xy IJ, rs OJ
                    // xyP in A, C; rsQ in C, D --> xy IJ, rs IJ
                    // xyP in C, D; rsQ in C, D --> xy IJ, rs IJ
                    // xyP in C, rsQ in C --> xy OJ, rs OJ
                    // xyP in C, rsQ in D --> xy OJ, rs OJ

                    // 1c. Path and subpath
                    // wxyP in A, C --> IJ wx, IJ wxy
                    // wxy in C, wxyP D --> IJ wx, OJ wxy
                    // wxy in A, wxyP D --> IJ wx, OJ wxy

                    // 1d. Partially overlapping paths
                    // wxyP in A, C --> IJ wx, IJ wxy
                    // wxvP in C, wxyP D --> IJ wx, OJ wxv, OJ wxy
                    // wxvP in A, wxyP D --> IJ wx, IJ wxv, OJ wxy

                    // 2. ORDER BY: (A && B) && (C || D) ORDER BY E
                    
                    // xyP in E --> xy OJ
                    // ...and all of 1a. with xyP in E

                    // 3. SELECT: (A && B) && (C || D) SELECT F
                    
                    // xyP in F --> xy OJ
                    // xy in A, xyP in F --> xy OJ
                    // xyP in A, xyP in F --> xy IJ
                    // xyP in A, xy in F --> xy IJ
                    // ...and all of 1a. with xyP in F

                    // 4. SELECT and ORDER BY: (A && B) && (C || D) ORDER BY E 
SELECT F
                    
                    // ...all of 3. with xyP in E

                    // 5. Two || trees side by side: (A || B) && (C || D)
                    
                    // 5a. One path
                    // xyP in A, C --> xy OJ
                    // xyP in A, C, D --> xy IJ
                    // xyP in C, D --> xy IJ
                    // xyP in C --> xy OJ

                    // 5b. Two paths
                    // xyP in A, C; rsQ in C --> xy OJ, rs OJ
                    // xyP in A, C; rsQ in C, D --> xy OJ, rs IJ
                    // xyP in A, B; rsQ in C, D --> xy IJ, rs IJ
                    // xyP in C, rsQ in C --> xy OJ, rs OJ
                    // xyP in A, C, D; rsQ in A, B, D --> xy IJ, rs IJ

                    // 5c. Path and subpath
                    // wxyP in A, C --> OJ wx, OJ wxy
                    // ...and many more...???...
                    // ...also partial paths as IJ, then rest as OJ...???...

                    // 6. Two || trees side by side and an &&: (A || B) && (C 
|| D) && (E && F)
                    
                    // ...Idea: E filters out OJ from A and C ???
                    // ...???...

                    // 7. Nested || trees: (A || B) || (C || D)
                    
                    // ...TBD...

                    // 8. Deeply nested || trees: (A || (B || (C || D)))
                    
                    // ... all of 7. (semantically equivalent! - but not for 
the tree walking algorithms)

                    // 9. Any: Any(q => A || B)
                    
                    // ???

                    // 10. Nested Any: Any(e => A || Any(f => B)
                    
                    // ???

                    // 11. Nested Any with ||: Any(e => A || Any(f => B || C)
                    
                    // ???

                    // 12. ...Join???, Group???...

So much for some input.

Best regards
Harald M.

-------- Original-Nachricht --------
> Datum: Wed, 23 Mar 2011 15:57:02 -0600
> Von: Patrick Earl <[email protected]>
> An: [email protected]
> Betreff: Re: [nhibernate-development] NH-2583 - Query with || operator and 
> navigations (many-to-one) creates wrong joins

> Hi Harald.
> 
> I'd like to discuss it with you as well, though I haven't had time to
> get my brain into the problem yet.  I've seen your issue and
> appreciate the effort you're putting in.  If you don't get traction
> from me or somebody else soon, feel free to bug me again.  It's
> definitely an important case we need to take care of.
> 
>        Patrick Earl
> 

-- 
GMX DSL Doppel-Flat ab 19,99 Euro/mtl.! Jetzt mit 
gratis Handy-Flat! http://portal.gmx.net/de/go/dsl

Reply via email to