Public bug reported:

THE PROBLEM:
I am seeing query times around 200ms (and 150ms with my latest performance 
tweak in trunk), which surprised me as quite slow since my queries where quite 
simple. I had expected times around 1-2ms.

Reading up on the sqlite documentation I see that the queries we
generate are pretty far from optimized in an sqlite world [1]. The case
is that when ever you use an OR sqlite will no longer use an index. Thus
this query is NOT using the indexes:

  SELECT * FROM event WHERE interpretation=1 OR interpretation=2

But if we rewrite it using IN instead the indexes will be used:

  SELECT * FROM event WHERE interpretation IN (1, 2)

This also explains the case where Michal where seeing tremendously slow
query times when searching for a big range of mimetypes.

Looking in _zeitgeist.engine.sql.WhereClause.add_text_condition() I am
also pretty sure we are not using the indexes for prefix queries (eg
file://home/*).


THE SOLUTION
I *definitely* don't think we should panic and feverishly start rewriting our 
query compilation. Here's what I propose:

 1) Implement an envvar ZEITGEIST_DEBUG_QUERY_PLANS which will spit out
all our SQL calls and the query plans for each of our calls. The query
plan will tell us how the db is queried and which indexes are used if
any. The query plan is obtained by prefixing the SQL statement with
EXPLAIN QUERY PLAN.

 2) Collect some useful intelligence with this new tool, and generally
learn more about how we can optimize sqlite queries. A big question here
is how the event_view VIEW impacts the query plan.

 3) Write a new template -> SQL compilation engine that generates SQL
optimized for sqlite. We can actually be quite clever about grouping our
OR statements into IN clauses - but it will be tricky to get right.

NOTE: That this doesn't imply any change in the public API or event
template system. That would be the wrong solution imho. Our current API
is nice and simple by my standards. Let's keep it that way.

[1]: See fx the section "Using Indexes To Speed Searching" in
http://www.sqlite.org/vdbe.html

** Affects: zeitgeist
     Importance: Undecided
         Status: New

-- 
Slow queries: SQL indexes not used
https://bugs.launchpad.net/bugs/632363
You received this bug notification because you are a member of Zeitgeist
Framework Team, which is subscribed to Zeitgeist Framework.

Status in Zeitgeist Framework: New

Bug description:
THE PROBLEM:
I am seeing query times around 200ms (and 150ms with my latest performance 
tweak in trunk), which surprised me as quite slow since my queries where quite 
simple. I had expected times around 1-2ms.

Reading up on the sqlite documentation I see that the queries we generate are 
pretty far from optimized in an sqlite world [1]. The case is that when ever 
you use an OR sqlite will no longer use an index. Thus this query is NOT using 
the indexes:

  SELECT * FROM event WHERE interpretation=1 OR interpretation=2

But if we rewrite it using IN instead the indexes will be used:

  SELECT * FROM event WHERE interpretation IN (1, 2)

This also explains the case where Michal where seeing tremendously slow query 
times when searching for a big range of mimetypes.

Looking in _zeitgeist.engine.sql.WhereClause.add_text_condition() I am also 
pretty sure we are not using the indexes for prefix queries (eg file://home/*).


THE SOLUTION
I *definitely* don't think we should panic and feverishly start rewriting our 
query compilation. Here's what I propose:

 1) Implement an envvar ZEITGEIST_DEBUG_QUERY_PLANS which will spit out all our 
SQL calls and the query plans for each of our calls. The query plan will tell 
us how the db is queried and which indexes are used if any. The query plan is 
obtained by prefixing the SQL statement with EXPLAIN QUERY PLAN.

 2) Collect some useful intelligence with this new tool, and generally learn 
more about how we can optimize sqlite queries. A big question here is how the 
event_view VIEW impacts the query plan.

 3) Write a new template -> SQL compilation engine that generates SQL optimized 
for sqlite. We can actually be quite clever about grouping our OR statements 
into IN clauses - but it will be tricky to get right.

NOTE: That this doesn't imply any change in the public API or event template 
system. That would be the wrong solution imho. Our current API is nice and 
simple by my standards. Let's keep it that way.

[1]: See fx the section "Using Indexes To Speed Searching" in 
http://www.sqlite.org/vdbe.html



_______________________________________________
Mailing list: https://launchpad.net/~zeitgeist
Post to     : zeitgeist@lists.launchpad.net
Unsubscribe : https://launchpad.net/~zeitgeist
More help   : https://help.launchpad.net/ListHelp

Reply via email to