In the two queries below, there's a 5x performance difference.

But logically, it seems to me that the limit 1 on the order by is logically
equivalent to a min or max depending on ascending or descending ordering,
and is "optimize-able" by using an appropriate transformation.

This use case is not as bogus as it seems. We have tree UI components,
where the child nodes of a tree node is determined by a query, with some of
these "child queries" having order by clauses. The UI needs to know if the
parent node is expendable or not, and for that we currently add the limit 1
clause to the "child query", but the limit 1 is applied *after* the
ordering (or after the union all, see below), which forces getting all the
children rows and columns, a major performance issue.

The queries can be quite complex, so we avoid trying to "parse" it to
transform it, as it would be very brittle code, unless SQLite itself
exposed the real query AST it uses internally to prepare the query so we
could reliably introspect it and transform it to a form that tells us if it
will return any rows if run, in a way that's as efficient as possible.

Also note that quite a few of our queries are "union" or "union all"
queries, joining rows from several different tables, so in this specific
use case, any query returning a row would be enough to stop executing the
other queries and without the need to union all the inner result sets.

Can someone recommend a strategy to find out if an arbitrarily complex
query, possibly with ordering clause(s), will return any rows, as fast as
possible, with some kind of "short-circuit" evaluation, that avoid going
thru the normal query processing? Something that would make query#2 below
run in query#1's run time for example?

Thanks for any advice on this matter, --DD

C:\Users\DDevienne>sqlite3
SQLite version 3.7.15.2 2013-01-09 11:53:05
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> create table t10 (id int);
sqlite> insert into t10 values (1), (2), (3), (4), (5), (6), (7), (8), (9),
(10);
sqlite> create table t100 as select (lhs.id*10)+rhs.id as id from t10 lhs
cross join t10 rhs;
sqlite> create table t10k as select (lhs.id*100)+rhs.id as id from t100 lhs
cross join t100 rhs;
sqlite> create table t100m as select (lhs.id*10*1000)+rhs.id as id from
t10k lhs cross join t10k rhs;
sqlite> select count(*) from t100m;
100000000
sqlite> .timer on
sqlite> select max(id) from t100m limit 1;
111111110
CPU Time: user 15.085297 sys 0.000000
sqlite> select id from t100m order by id desc limit 1;
111111110
CPU Time: user 78.328102 sys 0.000000
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to