Hi Patrick - Interesting :-) [all of the discussion!] - especially the argument
> Actually, a.P == a.Q is translated to ...P == ...Q.

Without even reading the arguments, I risk saying that the logic thus
defined is inconsistent, i.e. you can write (complex) queries where it is
unclear whether the result should be true or false. But if people can live
with this sort of "experimental logic" (i.e., program, try out what you
get, then re-program), then so be it. EF and Linq2SQL are the same sort of
thing ... and therefore making them a standard is a little bit
"interesting". (SQL and C#, on the other hand, have a consistency that can
be formally argued.) But as I said: I can live with any sort of semantics.

What I'd ask *you* (and/or others) to do before I change the code: Rewrite
the test cases (which are now, in the best tradition of TDD and "test
first", "specificational tests") so that they match your semantics. What I
do not want is to change the code *and* the tests myself according to how
I believe you meant it to behave - and then have a debate about whether
this was "right" or not. We are now lucky to have those 74 test cases I
wrote (which happily test *all combinations*, so one could argue they are
more like 300 or so test cases) - and many of them test null logic - so
you have the chance to describe, at least for all these expressions and
value combinations, what we should get. If you think that agreement with
(the current version of) EF is paramount, you could simply run them
through EF and say "those are the results we need". I'll then change the
code so that it conforms to what you want!

Some remaining remarks inline ...

[...]

> > There are more anomalies, like the &&-commutativity and
> > ||-commutativity, which are apparently accepted by all ...
>
> What are you referring to here regarding the commutativity?

Oh - just a little thing: && in Linq2Objects is not commutative, but a
short-circuit operator. So

    a.B != null && a.B.P == 4

is *not* the same as

    a.B.P == 4 && a.B != null

On the other hand, C# has always had a commutative logical operator &:

    a.B.Q == 3 & a.B.P == 4

is *exactly* the same as

    a.B.P == 4 & a.B.Q == 3

For reasons which I do not know, we *do* use the non-commutative &&
operator in Linq expressions intended to be translated to the commutative
SQL AND operator - but we do not support the (intentionally commutative) &
operator. Same for || and |. It's just interesting how the eco-system of a
language gets used down the road ...
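Here is a minimal Linq2Objects sketch of that difference (the classes and
values are made up for illustration):

    using System;
    using System.Linq;

    class B { public int P; }
    class A { public B B; }

    static class CommutativityDemo
    {
        static void Main()
        {
            var items = new[] { new A { B = null },
                                new A { B = new B { P = 4 } } };

            // The null check short-circuits &&, so the dereference is
            // guarded: no exception, one match.
            Console.WriteLine(items.Count(a => a.B != null && a.B.P == 4)); // 1

            // Swapped operands: a.B.P is evaluated first and throws for
            // the element whose B is null.
            try
            {
                Console.WriteLine(items.Count(a => a.B.P == 4 && a.B != null));
            }
            catch (NullReferenceException)
            {
                Console.WriteLine("&& is order-dependent in Linq2Objects");
            }

            // The non-short-circuit & evaluates both operands in either
            // order, so swapping them can never change the outcome.
        }
    }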
[...]

> > If you really want to convince me (and probably many others), you have
> > to write down the *disadvantages* of our proposal. If they are
> > visible - i.e., there are examples for them - then one can see and
> > accept them.
>
> These are the disadvantages I see, though they're not particularly
> formal.

I missed a crucial letter, it seems: I want to know the disadvantages of
*your* proposal in order to accept it. I know the disadvantages of mine
:-) - except one: that keeping to a (maybe somewhat flawed) EF or Linq2SQL
semantics is better than having a ... mhm, let me say "straight" semantics
(I'd like to write "consistent" - but I did not provide consistency proofs
for (||-5) either ...). Anyway, you summed up the problems quite nicely
--> so what now about the disadvantages of your "outer join proposal"?

> Of these, the first is of greatest importance. While the semantics
> for a.B.C.D == null could be argued in either direction, the fact that
> (x!=y) is not equivalent to !(x == y) is rather disturbing from a user
> perspective IMHO.

Just to re-iterate: We are *only* talking about the cases where
Linq2Objects would throw an exception. In all other cases, this *is*
equivalent. And the answer is: No, it's not disturbing for us. We have not
had one problem with it over the years. Have you ever programmed against
such a model?

> To me, both of the following mean a.B.C exists and is not equal to 1.
> I would not expect to get rows where a.B is null when I perform the
> second query. Mixing existence and equality is confusing IMHO.
>
>     a.B.C != 1
>     !(a.B.C == 1)

I see your point. I'm also quite sure that with this reasoning, you can
get contradictory results ... here is an attempt to give an example, just
so that you can see how one might argue:

Let's say you can navigate

* from a to B and C, and C is (in your application) an integer >= 0; and
* from a to X and Y, and Y is also an integer >= 0.

The starting point of NH-2583 was that for

    a.B.C != 1 || a.X.Y != 1        [X]

to become true, it should be allowed that a.B is null when a.X.Y != 1. In
other words, objects where a.B is null, but a.X != null and a.X.Y != 1,
should be found by that query. By de Morgan (which holds in both C# and
SQL), this is equivalent to

    !(a.B.C == 1 && a.X.Y == 1)     [Y]

I am not yet sure whether this includes "a.B != null && a.X != null" in
your semantics. But I think we agree that under the conditions above,

    a.B.C == 1 && a.X.Y == 1

is equivalent to

    a.B.C * a.X.Y == 1

(the latter follows from the former; but also the other way round - the
only way to get a product of one from non-negative integers is one times
one). So the whole condition [Y] is equivalent to

    !(a.B.C * a.X.Y == 1)

or, by your expectation,

    a.B.C * a.X.Y != 1              [Z]

I assume now that you would expect that this implicitly includes the fact
that both a.B and a.X are *not* null. But now we have a contradiction: The
same condition, just formulated differently,

* on the one hand should allow that a.B is null (at [X]);
* on the other hand should imply that a.B is not null (at [Z]).

I do not say that you cannot define the operator logic like this. But
then one should probably add the "de Morgan anomaly" to the documentation
(i.e., in contrast to C# and SQL, de Morgan no longer holds). Personally,
to me this logic sounds "risky": At first glance, it is maybe more natural
than others. But when you start arguing about more complex conditions, you
get into muddy water like the above. Maybe you can "define my example
away" - but logic has shown us again and again that the more you keep to
the fundamental laws of propositional and predicate logic, the fewer
problems you get ...
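To see the contradiction as actual queries, here is a hedged sketch (the
entity classes are made up to match the navigations above; 'query' stands
for any provider's IQueryable<AEnt>):

    using System.Linq;

    class BEnt { public int C; }   // C is an integer >= 0
    class XEnt { public int Y; }   // Y is an integer >= 0
    class AEnt { public BEnt B; public XEnt X; }

    static class DeMorganSketch
    {
        static void Compare(IQueryable<AEnt> query)
        {
            // [X]: per NH-2583, this should also find objects where
            // a.B is null but a.X.Y != 1.
            var x = query.Where(a => a.B.C != 1 || a.X.Y != 1);

            // [Y]: de Morgan applied to [X] - in C# and SQL, exactly
            // the same rows as [X].
            var y = query.Where(a => !(a.B.C == 1 && a.X.Y == 1));

            // [Z]: with C >= 0 and Y >= 0 this product test is
            // equivalent to [Y] - but under "!= implies existence" it
            // suddenly demands a.B != null, contradicting [X].
            var z = query.Where(a => a.B.C * a.X.Y != 1);
        }
    }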
It reminds me somewhat of my children asking "but why is 10 to the zeroth
power equal to 1? If you don't write down 10 at all, it should be zero!",
to which the answer is (you would probably say: against "intuition" and
"simplicity"): "If you want a consistent system where you are free of
surprises later, it is just better to have the zeroth power of every
non-zero number be one. Look, here, I can give you examples of what fails
if you take your definition of 10^0 = 0 ... like the failure of
a^x * a^y = a^(x+y): with 10^0 = 0, the product 10^0 * 10^1 would be 0
instead of 10^(0+1) = 10."

Anyway, although I have never seen a programming language that risks such
effects, it is interesting to see that you go for it! You are maybe deeper
into language semantics than I am, and so you know what you do!

> You are correct that we don't share the same assumption here.
> However, my assumption comes not from looking at the source
> expression, but in assuming the outer join operating model and working
> backwards.

An interesting method - writing the compiler before defining the language
;-) ... just being a little nasty. But yes, that's exactly what I did with
my (||-3) "definition"! So you are probably also thinking about examples
of how this can fail - that's how I found the or-sum-anomaly example. Have
you found any other interesting effect?

> Here are the latest reasons:
>
> >> 1. Significantly simpler implementation.
> >
> > That could be the case --> let's try it (although I'm not yet
> > convinced - see below after 5.).
>
> Well, as far as I can see, the most trivial implementation just uses
> outer joins everywhere and neither adds nor removes any clauses. This
> should require almost no code.

Well, the current implementation of NHibernate is even more trivial!

> The thing that I have no real concept of is how difficult it is to
> optimize to inner joins when possible.

Ye(eeee?)s - I did not think it through. I guess that's essentially
optimization B (unless you accept the or-sum-anomaly). Probably about as
complex as the current implementation - after all, you have to distinguish
the operators (below || and ! and ?:, you need possible outer joins; you
then lift them up, and if they come out on top together, you can transform
them to inner joins). So not that complex, but also not that easy.

[...]

> > Linq to RDBMS does have different semantics than Linq to Objects,

That's probably the core observation. In my BEHAV-1...BEHAV-4, I argued
differently: There should be *no* difference in the semantics of a query
when there is no exception in Linq2Objects. If this cornerstone falls,
there is no reason not to adopt arbitrary semantics.

> By easier to understand, I just meant the generated query with fewer
> clauses; I wasn't referring to the programming model. I could argue
> here that II-4 is satisfactory to many developers, since it's being
> used by Linq to SQL and EF and there doesn't seem to be much
> discussion about it.

Did you talk to people using Linq to SQL or EF about 3-valued logic,
logical consistency and anomalies? All people I have met to this day, when
shown the effects (e.g. the famous "null" effect in Linq to SQL, where
replacing a constant null with a variable of value null creates wrong
SQL), at some point said something like "Well, maybe in the next version
Microsoft will iron out those problems" and "we did not write complex
queries up to now; mostly, we avoid the not operator" etc. NHibernate has
seen so many more applications and interesting cases than any Linq
system, I'd say. Still, you are right - people accept quite haphazard
logic, because with some rewriting you get most things to work.
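For readers who have not seen that effect: a hedged sketch (the entity
and the query source are made up):

    using System.Linq;

    class Customer { public string Name; }

    static class NullEffectSketch
    {
        // 'customers' stands for a Linq to SQL table, e.g. a
        // Table<Customer> obtained from a DataContext.
        static void Show(IQueryable<Customer> customers)
        {
            // Constant null: rendered as "WHERE [Name] IS NULL", which
            // finds the rows with a missing name.
            var q1 = customers.Where(c => c.Name == null);

            // Variable of value null: rendered as "WHERE [Name] = @p0"
            // with @p0 = NULL - under SQL's 3-valued logic this matches
            // nothing, although Linq2Objects treats both queries alike.
            string name = null;
            var q2 = customers.Where(c => c.Name == name);
        }
    }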
> While I don't have time to think about it right now, it may also be
> worth considering what happens when writing something like
> a.B.C == a.B.D instead of a.B.C == 1.

This could tie in with my example above ... maybe I'll find time on my
commute to think about it!

Best regards, and thanks for a not-at-all-easy discussion!!!

Harald