I was analyzing hundreds of queries in our system and a few stood out
(average query time is subsecond, but these few that stood out were 10+
seconds).

Here is the basic query for these problem few:

SELECT T1.A from T1 INNER JOIN T2 on T1.B=T2.B and T1.C=T2.C where
T1.D='2015-12-31' and T2.E=2 GROUP BY A

When run as-is: it takes 20 seconds.  and explain query plan is 3 lines:
1: SEARCH TABLE T2 USING AUTOMATIC PARTIAL COVERING INDEX (E=?)
2: SEARCH TABLE T1 USING INDEX Indx_T1 (D=?)
3: USE TEMP B-TREE FOR GROUP BY

With my knowledge of the DB, these seem like the right indexes to use.  In
trying different options I noticed that when I added an INDEXED BY phrase
to T1 it helped significantly -  even though it was already using the same
index I am suggesting!

SELECT T1.A from T1 INDEXED BY Indx_T1 INNER JOIN T2 on T1.B=T2.B and
T1.C=T2.C where T1.D='2015-12-31' and T2.E=2 GROUP BY A

This query takes 0.6 second and this this explain query plan:
1: SEARCH TABLE T1 USING INDEX Indx_T1 (D=?)
2: SEARCH TABLE T2 USING AUTOMATIC PARTIAL COVERING INDEX (E=? AND B=? AND
C=?)
3: USE TEMP B-TREE FOR GROUP BY

FWIW. T1 has 1.4M rows and T2 has 2000.

So my questions are simply: why would these two very similar queries have
such difference performance results since it looks to me like SQLite is
doing nearly the same thing in both cases, and since they are so different,
why would the planner need that INDEXED BY hint in order to get the best
performance?

Reply via email to