Richard, Thanks for acknowledging this, you are absolutely right, that's why I stated that no DB does perfect optimisations and that computing the alternative -OR- based closures are probably much harder to tackle. Also E. Pasma pointed out the -OR- queries as I wrote them are not really semantically equivalent unless the 2 joins are disjunct.
However, the case of the imperative WHERE NOT NULL implying INNER JOIN is just a matter of replacing a predicate with a stronger one, so in all fairness I imagined it a far lesser overhead than, say, the query flattener. And I imagine it's a much more common situation, too, especially when users are adding additional filters via WHERE clauses to a base query, so it might benefit a lot of users, too. I know it would us, by not having to rewrite these queries when porting; we are working on x86 servers, and a stick of memory or a hard drive cost less than a programmer's day for us :) For the extra memory, I know for computing relational closures the spatial complexity can get big, but only when the structure of the query is written warrants it in the first place, so it shouldn't manifest heavily on a query that doesn't have this structure. This is just my best view on this, obviously it's a political decision to be made so it's no make-it-or-break-it thing, like mentioned before, we are porting some pretty big system and when I notice differences with SQLite, I jolt them down, in the hope it might benefit you or the millions of users, if not by changing SQLite, then simply by pointing out the workaround to other users, such as moving the WHERE condition out to the ON clause, it's not necessarily a trivial thing to consider for everyone. Ryan, You cannot ask SQL a Method query, that's where my whole RDBMS understanding takes me. It nullifies the purpose of queries as well as all efforts you yourselves have put into a lot of things, query flattening to mention just one. The "same result" is not accidental, the equivalent queries will produce the same result no matter which data populates the tables. That is the only deffinition I know of semantic equivalence. SQL is declarative and thus everything that describes the same thing is the same thing. E.Pasma, Thanks for taking the time to make the TC. This is always a huge putdown for me, because finally the execution plan depends on the data indexes are populated with (via ANALYZE) and are tables are huge so it's always a putdown for me to create a minimal TC. Indeed I noticed just now the 2 queries are not equivalent that way :) Thanks for pointing that out! I will work on an equivalent -and- optimized rewrite :) For the query plans though, here is where the index stats come in: here a "SCAN a" makes sense, but in our case the number of records in a is on the order of 10000x records to b and c, and also the cardinality of b.d and c.d is on the order of 1000; so a "SEARCH b, SEARCH c" works out. At minimum you should have indexes on b.d, c.d, a.ab, a.ac; but even so and with adding another 1000 records on a, b and c and running the query: EXPLAIN QUERY PLAN SELECT * FROM a JOIN b ON b=ab AND b.d IN (1,2,3) JOIN c ON c=ac AND c.d IN (4,5) selectid |order |from |detail | ---------|------|-----|---------------------------------------------------| 0 |0 |0 |SCAN TABLE a | 0 |1 |1 |SEARCH TABLE b USING INTEGER PRIMARY KEY (rowid=?) | 0 |0 |0 |EXECUTE LIST SUBQUERY 1 | 0 |2 |2 |SEARCH TABLE c USING INTEGER PRIMARY KEY (rowid=?) | it still plans a "SCAN a" first. So I guess I'll have to backtrack from the real data to generate a TC. -- Sent from: http://sqlite.1065341.n5.nabble.com/ _______________________________________________ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users