I have a view gathering statistics from 3 different tables, one of which is
largish (~ 2M rows).

The view is similar to this:

with
kcounts(t, c) as (
  select code_type, count(*)
   from keys
  group by code_type
),
...
select ...,
       (select c from kcounts where t = 'SET') as "#sets",
       (select c from kcounts where t = 'CST') as "#constants",
       (select c from kcounts where t = 'CMT') as "#comments",
       (select c from kcounts where t = 'LOG') as "#logs",
       ...;

With all 4 counts, the views takes ~ 6.5s to compute its 1 row (with the
file in cache, was 13.5s with a cold filesystem cache, while DB is only
380MB in size).

If I remove 3 of the lines, keeping a single "select c from kcounts where t
= ..." that drops to 2.15s, or about one third. (this is inside SQLiteSpy
using SQLite 3.13.0 BTW).

As a programmer, I can see this can be done in a single pass, but most
likely the SQL as written does several passes over kcounts, despite using a
CTE (I had hoped kcounts would be instantiated into an ephemeral table, and
thus accessing a 4 row tables would be "free").

An explain query plan indeed shows 4x "SCAN TABLE keys" steps, not just 1.

Is there a way to reformulate this query differently to somehow achieve a
single pass on the keys table? Or force the query planner to "instantiate"
the kcounts "CTE view" to achieve that single scan, via some kind of hint?
If none exists, could one such hint be added? Ideally the planner itself
would decide kcounts should be instantiated, but I'm happy to tell him it
should via a hint.

The speed up is non-negligible at 3x, thus my request for comments from
experts on this.

Thanks, --DD

PS: BTW, I did run ANALYZE and that didn't change the plan or execution
time.
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to