following up on my own response to Dan…

I see what you mean Dan. I am not an expert at reading the query plans, but I 
do see that effectively my new query has the same query plan as the last two 
queries combined as well as the original query. The only difference is the 
order in which it proceeds. So, the original (slow query)

|--SCAN TABLE vt1 VIRTUAL TABLE INDEX 0:m
|--SEARCH TABLE t2 USING INDEX ix_t2_scm (deleted=? AND deleted=?)
`--SEARCH TABLE t1 USING COVERING INDEX ix_t1_t1Id (deleted=? AND t1Id=?)

The two sub-queries separately that run very fast

`--SCAN TABLE vt1 VIRTUAL TABLE INDEX 0:m

and 
|--SEARCH TABLE t2 USING INDEX ix_t2_scm (deleted=? AND deleted=?)
`--SEARCH TABLE t1 USING COVERING INDEX ix_t1_t1Id (deleted=? AND t1Id=?)

and the new query that also runs very fast using the two sub-queries

QUERY PLAN
|--CO-ROUTINE 1
|  |--SEARCH TABLE t2 USING INDEX ix_t2_scm (deleted=? AND deleted=?)
|  `--SEARCH TABLE t1 USING COVERING INDEX ix_t1_t1Id (deleted=? AND t1Id=?)
|--SCAN SUBQUERY 1 AS a
`--LIST SUBQUERY 2
  `--SCAN TABLE vt1 VIRTUAL TABLE INDEX 0:m


No idea what is going on.


> On Mar 9, 2020, at 2:08 PM, P Kishor <punk.k...@gmail.com> wrote:
> 
> 
> 
>> On Mar 9, 2020, at 1:04 PM, Dan Kennedy <danielk1...@gmail.com> wrote:
>> 
>> 
>> On 9/3/63 01:44, Puneet Kishor wrote:
>>> Update: so, after much hitting of my head against the wall of sql, I came 
>>> up with the following – as noted above, I really have two distinct set of 
>>> queries I can do separately like so
>>> 
>>> Q1: (SELECT t1Id FROM t1 WHERE …) AS a
>>> 
>>> Q2: (SELECT t1Id FROM vt1 WHERE vt1 MATCH ‘bar’) AS b
>>> 
>>> Then, I can do the following -
>>> 
>>> SELECT Count(*) FROM a WHERE a.t1Id IN b
>>> 
>>> Of course, in reality, I don’t do this separately but all in one go to make 
>>> a really messy SQL but a really fast query, a couple of hundred ms as 
>>> opposed to > 25s
>> 
>> 
>> I don't see why that would be any different from your join query. What does 
>> EXPLAIN QUERY PLAN say for the version that takes a few hundred ms?
> 
> 
> Here it is, the new query made up of two queries derived from the original 
> query (summarize again below). The query itself takes 301ms. 
> 
> ```
> SQLite version 3.30.0 2019-10-04 15:03:17
> Enter ".help" for usage hints.
> sqlite> EXPLAIN QUERY PLAN
>   ...> SELECT Count(*) FROM
>   ...> 
>   ...> -- this one is 'QUERY a'
>   ...> (SELECT Count(DISTINCT t1.t1Id) t1Id 
>   ...> FROM t2 JOIN t1 ON t2.t1Id = t1.t1Id 
>   ...> WHERE t1.deleted = 0 AND t2.deleted = 0 AND scm != '') a
>   ...>
>   ...> WHERE a.t1Id IN 
>   ...> 
>   ...> -- and this one is 'QUERY b'
>   ...> (SELECT Count(DISTINCT t1Id) FROM vt1 WHERE vt1 MATCH "foo");
> QUERY PLAN
> |--CO-ROUTINE 1
> |  |--SEARCH TABLE t2 USING INDEX ix_t2 (deleted=? AND deleted=?)
> |  `--SEARCH TABLE t1 USING COVERING INDEX ix_t1_t1Id (deleted=? AND t1Id=?)
> |--SCAN SUBQUERY 1 AS a
> `--LIST SUBQUERY 2
>   `--SCAN TABLE vt1 VIRTUAL TABLE INDEX 0:m
> sqlite>
> ```
> 
> Here is the original query (QUERY zero). This is the one that takes a very 
> long time, 10.611s. Note, in this one I’ve added an additional table ’t2’ but 
> the concept is the same. Normal tables JOINed to each other, and then JOINed 
> to a virtual table.
> 
> ```
> sqlite> EXPLAIN QUERY PLAN
>   ...> SELECT Count(DISTINCT t1.t1Id)
>   ...> FROM t2 JOIN t1 ON t2.t1Id = t1.t1Id JOIN vt1 ON t1.t1Id = vt1.t1Id
>   ...> WHERE t1.deleted = 0 AND t2.deleted = 0 AND scm != '' AND vt1 MATCH 
> "foo";
> QUERY PLAN
> |--SCAN TABLE vt1 VIRTUAL TABLE INDEX 0:m
> |--SEARCH TABLE t2 USING INDEX ix_t2_scm (deleted=? AND deleted=?)
> `--SEARCH TABLE t1 USING COVERING INDEX ix_t1_t1Id (deleted=? AND t1Id=?)
> sqlite>
> ```
> 
> Here are the two queries derived from QUERY zero. The first one (QUERY a) 
> takes 324ms
> 
> ```
> sqlite> EXPLAIN QUERY PLAN
>   ...> SELECT Count(DISTINCT t1.t1Id)
>   ...> FROM t2 JOIN t1 ON t2.t1Id = t1.t1Id
>   ...> WHERE t1.deleted = 0 AND t2.deleted = 0 AND scm != '';
> QUERY PLAN
> |--SEARCH TABLE t2 USING INDEX ix_t2_scm (deleted=? AND deleted=?)
> `--SEARCH TABLE t1 USING COVERING INDEX ix_t1_t1Id (deleted=? AND t1Id=?)
> sqlite>
> ```
> 
> The second query (QUERY b) takes: 27ms
> 
> ```
> sqlite> EXPLAIN QUERY PLAN
>   ...> SELECT Count(DISTINCT t1Id) FROM vt1 WHERE vt1 MATCH “foo";
> QUERY PLAN
> `--SCAN TABLE vt1 VIRTUAL TABLE INDEX 0:m
> sqlite>
> ```
> 
> One more thing: All of this goes to hell if the virtual table returns way too 
> many matches. For example, I have a term that returns 80K rows from the FTS 
> MATCH. In that case, even my new query very slow because, well, because the 
> QUERY b above is slow.

_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to