On 23/01/2012 12:48 PM, Simon Slavin wrote:
On 23 Jan 2012, at 4:30pm, Ryan Johnson wrote:
Bump?
On 21/01/2012 2:47 PM, Ryan Johnson wrote:
On 21/01/2012 2:44 PM, Simon Slavin wrote:
On 21 Jan 2012, at 7:23pm, Ryan Johnson wrote:
It's a SQL89 join. Sqlite docs say SQL89 and SQL92 joins perform identically
[1], which I confirmed before sending the OP.
Oh. Okay. If it spits out the same EXPLAIN QUERY PLAN then SQLite is
interpreting it the same way.
Which brings us back to the original question: why does sqlite spit out a bad
query plan when a vastly better one exists? There's no clear reason the better
answer should have been hard to find.
Well, the approach would be to find a specific query, post the query and query
plan, then to simplify the query until the strangeness in the query plan goes
away. The last simplification is the one that caused the problem.
On 21/01/2012 12:49 PM, Ryan Johnson wrote:
consider the following query:
select count(*) from orders O, Customer C where C.custkey=O.custkey
and C.name like '%115';
.stats/explain reports 149999 fullscan steps for the query plan:
0|0|TABLE orders AS O
1|1|TABLE Customer AS C USING PRIMARY KEY
Putting Customer first in the FROM clause makes the query markedly
faster and executes only 14999 fullscan steps. The query plan confirms
the change:
0|0|TABLE Customer AS C
1|1|TABLE orders AS O WITH INDEX OrderCustomers
In what way does the above query from the OP not sufficient to
demonstrate the problem? There are only two tables involved. There are
only two relevant indexes -- the automatically-created PK and manually
created index the query *should* use; I don't think it's reasonable to
suggest dropping the primary key index. Changing "count(*)" to "*"
doesn't change the chosen access path for the query. Removing the
predicate doesn't affect the access path either, thought it does
increase by 1000x the number of rows returned (runtime stays about the
same).
Given that ANALYZE ran and established |Customer|=15k vs. |Orders|=150k,
why does the optimizer choose to do a full table scan of Orders, instead
of scanning Customer and accessing Orders using the index created
specifically for that purpose? (It's even marked as a foreign key in the
schema, though I'm pretty sure sqlite doesn't pay attention to that).
This seems a cut-and-dried join reordering scenario that the optimizer
should eat for lunch, and instead it seems to limit its search to
selecting the best index(es) to use given the join ordering the original
SQL specified. If the optimizer were merely making a bad decision --
which the second example of the OP seems to demonstrate -- then it would
reorder (wrongly) the "correct" version of the SQL; it does not.
Thoughts?
Ryan
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users