On Wed, Feb 23, 2011 at 04:24:14AM -0800, Wiktor Adamski scratched on the wall:
> SQLite version 3.7.5
> Enter ".help" for instructions
> Enter SQL statements terminated with a ";"
> sqlite> CREATE TABLE t1 (a INT);
> sqlite> CREATE TABLE t2 (b INT);
> sqlite> CREATE TABLE t3 (a INT);
> sqlite> SELECT * FROM t1 JOIN t2 ON t1.a = t2.b AND a /* obviously
> t1.a */ != 1 JOIN t3 ON t1.a = t3.a;
> Error: ambiguous column name: a
Not a bug, not obviously anything, and just the joy of SQL.
In short, you cannot assume conditions are processed left-to-right,
including JOIN conditions. You have two tables in your statement
with an "a" column, so any reference, anywhere in the statement, must
be qualified or it will be ambiguous.
In this case the expression "t1.a != 1" would be much more appropriate
in a WHERE clause, since it is just a row filter and has nothing to do
with the JOIN itself (it only involves one table, after all). If we
rewrite the query with that condition in the WHERE clause, the column
name is clearly ambiguous (even if the structure of the query tells
us that all "a" columns must have the same value):
SELECT *
FROM t1
JOIN t2 ON t1.a = t2.b
JOIN t3 ON t1.a = t3.a
WHERE a != 1; -- which "a"?
Internally, SQLite actually moves all JOIN conditions to the WHERE
clause, effectively processing all statement conditions in one batch.
That means the statement that SQLite is actually processing looks a lot
more like this:
SELECT *
FROM t1
JOIN t2
JOIN t3
WHERE t1.a = t2.b
AND t1.a = t3.a
AND a != 1; -- which "a"?
Again, viewed this way, the "a" reference is clearly ambiguous.
Moving the conditions to the WHERE clause is allowed under the SQL spec,
as is reordering those conditions, mostly because this is how all JOINs
used to be written before the "ANSI JOIN syntax" came about (which I
greatly prefer).
You can see more evidence of this in a statement like this, which, at
face value, is even more clear about which "a" you want but still
throws an "ambiguous column name" error:
SELECT *
FROM t1
JOIN t2 ON a = b -- can't figure out which "a" this is.
JOIN t3 ON t1.a = t3.a;
Once again, things become more clear when you realize the statement
being processed might look like this:
SELECT *
FROM t1
JOIN t2
JOIN t3
WHERE t1.a = t3.a
AND a = b; -- which "a"?
While these kinds of issues are the root of many headaches and a lot
of cursing at the designers of SQL, there are very good reasons for
this behavior. First, there are the historical issues in how the
language has evolved. It would be very bad to have a query output
change just because an alternate syntax was used, especially a syntax
that is supposed to be equivalent.
But more importantly, moving all the conditions into the WHERE clause
allows the query optimizer to consider all the different JOINs and
all the different filter conditions at once. This allows it to re-order
conditions, filters and joins. For example, the query optimizer
might reorder a series of JOINs based off table size and available
indexes. Given the ability of a JOIN to generate a vast number of
rows, this reordering can have an extremely significant impact on the
processing time for a query. So the actual query might join t2 to t3,
and then add t1 to the mix, especially if the optimizer could guess
that the output of (t2 JOIN t3) was only a handful of rows, while the
output of (t1 JOIN t2) might produce a vast number of rows.
The take-away from all this is, if you ever mix tables with similar
column names, make sure you always qualify your column references,
because you really don't know exactly what the query optimizer might
do with your statement, and it is better to be safe than sorry. This
is not unlike using extra ()s in complex math statements, even if the
language doesn't actually require them.
-j
--
Jay A. Kreibich < J A Y @ K R E I B I.C H >
"Intelligence is like underwear: it is important that you have it,
but showing it to the wrong people has the tendency to make them
feel uncomfortable." -- Angela Johnson
_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users