> On Mar 9, 2020, at 1:04 PM, Dan Kennedy <danielk1...@gmail.com> wrote: > > > On 9/3/63 01:44, Puneet Kishor wrote: >> Update: so, after much hitting of my head against the wall of sql, I came up >> with the following – as noted above, I really have two distinct set of >> queries I can do separately like so >> >> Q1: (SELECT t1Id FROM t1 WHERE …) AS a >> >> Q2: (SELECT t1Id FROM vt1 WHERE vt1 MATCH ‘bar’) AS b >> >> Then, I can do the following - >> >> SELECT Count(*) FROM a WHERE a.t1Id IN b >> >> Of course, in reality, I don’t do this separately but all in one go to make >> a really messy SQL but a really fast query, a couple of hundred ms as >> opposed to > 25s > > > I don't see why that would be any different from your join query. What does > EXPLAIN QUERY PLAN say for the version that takes a few hundred ms?
Here it is, the new query made up of two queries derived from the original query (summarize again below). The query itself takes 301ms. ``` SQLite version 3.30.0 2019-10-04 15:03:17 Enter ".help" for usage hints. sqlite> EXPLAIN QUERY PLAN ...> SELECT Count(*) FROM ...> ...> -- this one is 'QUERY a' ...> (SELECT Count(DISTINCT t1.t1Id) t1Id ...> FROM t2 JOIN t1 ON t2.t1Id = t1.t1Id ...> WHERE t1.deleted = 0 AND t2.deleted = 0 AND scm != '') a ...> ...> WHERE a.t1Id IN ...> ...> -- and this one is 'QUERY b' ...> (SELECT Count(DISTINCT t1Id) FROM vt1 WHERE vt1 MATCH "foo"); QUERY PLAN |--CO-ROUTINE 1 | |--SEARCH TABLE t2 USING INDEX ix_t2 (deleted=? AND deleted=?) | `--SEARCH TABLE t1 USING COVERING INDEX ix_t1_t1Id (deleted=? AND t1Id=?) |--SCAN SUBQUERY 1 AS a `--LIST SUBQUERY 2 `--SCAN TABLE vt1 VIRTUAL TABLE INDEX 0:m sqlite> ``` Here is the original query (QUERY zero). This is the one that takes a very long time, 10.611s. Note, in this one I’ve added an additional table ’t2’ but the concept is the same. Normal tables JOINed to each other, and then JOINed to a virtual table. ``` sqlite> EXPLAIN QUERY PLAN ...> SELECT Count(DISTINCT t1.t1Id) ...> FROM t2 JOIN t1 ON t2.t1Id = t1.t1Id JOIN vt1 ON t1.t1Id = vt1.t1Id ...> WHERE t1.deleted = 0 AND t2.deleted = 0 AND scm != '' AND vt1 MATCH "foo"; QUERY PLAN |--SCAN TABLE vt1 VIRTUAL TABLE INDEX 0:m |--SEARCH TABLE t2 USING INDEX ix_t2_scm (deleted=? AND deleted=?) `--SEARCH TABLE t1 USING COVERING INDEX ix_t1_t1Id (deleted=? AND t1Id=?) sqlite> ``` Here are the two queries derived from QUERY zero. The first one (QUERY a) takes 324ms ``` sqlite> EXPLAIN QUERY PLAN ...> SELECT Count(DISTINCT t1.t1Id) ...> FROM t2 JOIN t1 ON t2.t1Id = t1.t1Id ...> WHERE t1.deleted = 0 AND t2.deleted = 0 AND scm != ''; QUERY PLAN |--SEARCH TABLE t2 USING INDEX ix_t2_scm (deleted=? AND deleted=?) `--SEARCH TABLE t1 USING COVERING INDEX ix_t1_t1Id (deleted=? AND t1Id=?) sqlite> ``` The second query (QUERY b) takes: 27ms ``` sqlite> EXPLAIN QUERY PLAN ...> SELECT Count(DISTINCT t1Id) FROM vt1 WHERE vt1 MATCH “foo"; QUERY PLAN `--SCAN TABLE vt1 VIRTUAL TABLE INDEX 0:m sqlite> ``` One more thing: All of this goes to hell if the virtual table returns way too many matches. For example, I have a term that returns 80K rows from the FTS MATCH. In that case, even my new query very slow because, well, because the QUERY b above is slow. _______________________________________________ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users