On Mon, May 6, 2013 at 10:29 PM, Simon Slavin <slav...@bigfraud.org> wrote:

>
> The problem, I think, is that a COLLATE shouldn't change any value anyhow,
> no matter which SELECT the ORDER clause is attached to.  The COLLATE
> modifier is part of the ORDER BY clause.  It is there to change the ORDER
> that the values are returned in, not the values themselves.


And, indeed, that is exactly what COLLATE is doing.

The problem is this:  When SQLite sees the ORDER BY clause on the EXCEPT it
tries to do the EXCEPT using a merge.  In other words, it computes two
subqueries:  (SELECT .. FROM x ORDER BY ...) and (SELECT ... FROM y ORDER
BY ...).  Then it looks at the output of these subqueries, row by row.

(1)  x<y:  output x
(2)  x>y:  pop y
(3)  x=y:  pop and discard both x and y

You can implement INTERSECT, UNION, and UNION ALL in much the same way, by
supplying different actions for each of the above cases.

The above works great (and is very efficient) if the collating sequence of
the ORDER BY is the same as the natural collating sequence of the output
columns.  If it isn't, then the above code gives the wrong answer.  The
basic problem is that SQLite is not recognizing that the collating
sequences are different and is trying to use the algorithm above when it it
shouldn't.

This was an oversight when I first implemented the merging algorithm 5
years ago.  It didn't occur to me then (and apparently hasn't occurred to
anybody else in the last 5 years) that the collating sequence in the ORDER
BY might be different from the natural collating sequence of the result
columns.

Unfortunately, the merge algorithm outlined above is the only means SQLite
currently has for doing a compound select that contains an ORDER BY.  In
order to fix this, I'm going to have to come up with a whole new algorithm,
just for this case.  ON the other hand, since nobody has noticed it in 5
years, presumably it doesn't come up that often, so there isn't a huge rush
to get the fix in.  So I'm going to take my time and try to come up with
the minimally disruptive fix.



>  And something like
>
> SELECT x EXCEPT y
>
> is subtracting one set from another, and in sets the order doesn't matter.
>  The problem is something like doing
>
> SELECT words FROM dictionary ORDER BY words COLLATE NOCASE
>
> and getting all the words back as capital letters.  This shouldn't happen.
>
> Simon.
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
D. Richard Hipp
d...@sqlite.org
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to