Re: [sqlite] bug with NULL in NOT IN

Dennis Cote Wed, 25 Jun 2008 14:50:32 -0700

D. Richard Hipp wrote:
> 
> If I understand Peter correctly, he is saying that NULL should mean  
> "unknown" in the context of the RHS of a NOT IN operator.  SQLite does  
> not currently operate this way.  SQLite currently interprets a NULL in  
> the RHS of a NOT IN operator to mean "nothing".
> 
> Can you or anybody else point to text in any  
> SQL spec that would suggest that SQLites behavior in this case is wrong?



I believe that your interpretation is correct, and that SQLite's current 
behavior is incorrect according to the standard, at least the SQL:1999 
standard.

The IN predicate is a synonym for the quantified predicate = ANY, or = 
SOME which is equivalent. The inversion of this predicate, NOT IN, is 
therefore the inversion of the quantified predicate. The rules defined 
in section 8.4 <in predicate> Syntax Rules 3 and 4 show how this is 
transformed.

   col NOT IN subquery

becomes

   NOT col IN subquery

which becomes

   NOT (col = SOME subquery)

And the inversion of the quantified predicate is

   col <> ALL subquery

In the standard the rules for evaluating this quantified subquery are in 
   section 8.8 <quantified comparison predicate> General Rules 1 through 
2e copied below.

     1) Let R be the result of the <row value expression> and let T be 
the result of the <table subquery>.
     2) The result of ‘‘R <comp op> <quantifier> T’’ is derived by the 
application of the implied <comparison predicate> ‘‘R <comp op> RT’’ to 
every row RT in T:
     Case:
         a) If T is empty or if the implied <comparison predicate> is 
true for every row RT in T, then ‘‘R <comp op> <all> T’’ is true.
         b) If the implied <comparison predicate> is false for at least 
one row RT in T, then ‘‘R <comp op> <all> T’’ is false.
         c) If the implied <comparison predicate> is true for at least 
one row RT in T, then ‘‘R <comp op> <some> T’’ is true.
         d) If T is empty or if the implied <comparison predicate> is 
false for every row RT in T, then ‘‘R <comp op> <some> T’’ is false.
         e) If ‘‘R <comp op> <quantifier> T’’ is neither true nor false, 
then it is unknown .

For the example given, the engine evaluates the following predicates.

   1 <> ALL (NULL, 3, 4, 5)
   2 <> ALL (NULL, 3, 4, 5)
   3 <> ALL (NULL, 3, 4, 5)

In each case the first comparison is

   X <> NULL

And from 8.2 <comparison predicate> General Rules 1a we have

     a)  If either XV or YV is the null value, then
             X <comp op> Y
         is unknown.

The results of the comparisons are therefore:

   (unknown, true, true, true)
   (unknown, true, true, true)
   (unknown, false, true, true)

So all three rows result in an unknown result for the first, NULL, 
element. The last row is handled buy case b in section 8.8 General Rule 
2 above since one subquery result is false. The other rows all fall 
through to case e in section 8.8 General Rule 2 above. The result of 
each quantified comparison is therefore unknown.

The where clause only returns rows where the condition is true according 
to section 7.8 <where clause> General Rule 1.

     1) The <search condition> is applied to each row of T. The result 
of the <where clause> is a table of those rows of T for which the result 
of the <search condition> is true.

It looks like this should be changed to match the other database engines 
for improved standard compliance.

Dennis Cote
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] bug with NULL in NOT IN

Reply via email to