D. Richard Hipp wrote:
>
> If I understand Peter correctly, he is saying that NULL should mean
> "unknown" in the context of the RHS of a NOT IN operator. SQLite does
> not currently operate this way. SQLite currently interprets a NULL in
> the RHS of a NOT IN operator to mean "nothing".
>
> Can you or anybody else point to text in any
> SQL spec that would suggest that SQLites behavior in this case is wrong?
I believe that your interpretation is correct, and that SQLite's current
behavior is incorrect according to the standard, at least the SQL:1999
standard.
The IN predicate is a synonym for the quantified predicate = ANY, or =
SOME which is equivalent. The inversion of this predicate, NOT IN, is
therefore the inversion of the quantified predicate. The rules defined
in section 8.4 <in predicate> Syntax Rules 3 and 4 show how this is
transformed.
col NOT IN subquery
becomes
NOT col IN subquery
which becomes
NOT (col = SOME subquery)
And the inversion of the quantified predicate is
col <> ALL subquery
In the standard the rules for evaluating this quantified subquery are in
section 8.8 <quantified comparison predicate> General Rules 1 through
2e copied below.
1) Let R be the result of the <row value expression> and let T be
the result of the <table subquery>.
2) The result of ‘‘R <comp op> <quantifier> T’’ is derived by the
application of the implied <comparison predicate> ‘‘R <comp op> RT’’ to
every row RT in T:
Case:
a) If T is empty or if the implied <comparison predicate> is
true for every row RT in T, then ‘‘R <comp op> <all> T’’ is true.
b) If the implied <comparison predicate> is false for at least
one row RT in T, then ‘‘R <comp op> <all> T’’ is false.
c) If the implied <comparison predicate> is true for at least
one row RT in T, then ‘‘R <comp op> <some> T’’ is true.
d) If T is empty or if the implied <comparison predicate> is
false for every row RT in T, then ‘‘R <comp op> <some> T’’ is false.
e) If ‘‘R <comp op> <quantifier> T’’ is neither true nor false,
then it is unknown .
For the example given, the engine evaluates the following predicates.
1 <> ALL (NULL, 3, 4, 5)
2 <> ALL (NULL, 3, 4, 5)
3 <> ALL (NULL, 3, 4, 5)
In each case the first comparison is
X <> NULL
And from 8.2 <comparison predicate> General Rules 1a we have
a) If either XV or YV is the null value, then
X <comp op> Y
is unknown.
The results of the comparisons are therefore:
(unknown, true, true, true)
(unknown, true, true, true)
(unknown, false, true, true)
So all three rows result in an unknown result for the first, NULL,
element. The last row is handled buy case b in section 8.8 General Rule
2 above since one subquery result is false. The other rows all fall
through to case e in section 8.8 General Rule 2 above. The result of
each quantified comparison is therefore unknown.
The where clause only returns rows where the condition is true according
to section 7.8 <where clause> General Rule 1.
1) The <search condition> is applied to each row of T. The result
of the <where clause> is a table of those rows of T for which the result
of the <search condition> is true.
It looks like this should be changed to match the other database engines
for improved standard compliance.
Dennis Cote
_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users