Jay A. Kreibich wrote:
> On Sun, Oct 18, 2009 at 02:17:42PM +0200, Kristoffer Danielsson scratched on 
> the wall:
>> Clearly, SQLite executes a cartesian product!
> 
>   Look at the output.  It does not produce a Cartesian product.  All
>   the rows are valid:
> 
> SQLite version 3.6.19
> Enter ".help" for instructions
> Enter SQL statements terminated with a ";"
> sqlite> create table t (c1, c2);
> sqlite> insert into t values ( 1, 2 );
> sqlite> insert into t values ( 3, 4 );
> sqlite> insert into t values ( 5, 6 );
> sqlite> select * from t natural join t; 
> 1|2
> 1|2
> 1|2
> 3|4
> 3|4
> 3|4
> 5|6
> 5|6
> 5|6
> 
>   I'm not sure I'd call it correct, but it isn't a product.

Jay, you've just proven Kristoffer's point.  That result demonstrates that a 
cartesian product *was* produced.  The table t had 3 rows, and the result had 
3*3 rows, which is a cartesian product by definition.

Your query should have produced the same result as this query:

   select t1.* from t as t1 inner join t as t2 using (c1,c2);

... but instead it produced the same result as this query:

   select t1.* from t as t1 cross join t as t2 using (c1,c2);

Now I would be ready to consider that SQLite has a bug, but then looking at the 
syntax at http://sqlite.org/lang_select.html I see that SQLite defines multiple 
versions of natural join; it has *both* NATURAL INNER JOIN and NATURAL CROSS 
JOIN, and I'm guessing that if you leave the middle word out it is using CROSS 
by default, ostensibly for consistency for when you simply say JOIN.

So if that is the case, then the current behavior is clearly documented as 
expected and so not an implementation bug.  And so then you would have to say 
this:

   select * from t natural inner join t;

... to get the expected result of 3 rows.

This all being said, the whole mess strikes me as a *design bug*.  It simply 
doesn't make sense to have both NATURAL INNER and NATURAL CROSS syntax.  One 
should simply be able to say NATURAL and it would do the right thing, which is 
a 
cartesian product when no column names are the same, an intersect when all 
column names are the same, and an inner join otherwise.

My proposal is certainly logically sound.  A natural join by definition only 
has 
a result row for each distinct pair of source rows that have the same values 
for 
the subset of their columns with the same names; a cartesian product is a 
degenerate case where that subset of columns has zero members, and so since the 
empty set matches the empty set every row from each source rowsets would match 
every row from the other rowsets.

The only variations that make sense on a natural join is OUTER.

-- Darren Duncan
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to