Re: [sqlite] speeding up an FTS5 search with a JOIN

P Kishor Mon, 09 Mar 2020 06:08:58 -0700


> On Mar 9, 2020, at 1:04 PM, Dan Kennedy <[email protected]> wrote:
> 
> 
> On 9/3/63 01:44, Puneet Kishor wrote:
>> Update: so, after much hitting of my head against the wall of sql, I came up 
>> with the following – as noted above, I really have two distinct set of 
>> queries I can do separately like so
>> 
>> Q1: (SELECT t1Id FROM t1 WHERE …) AS a
>> 
>> Q2: (SELECT t1Id FROM vt1 WHERE vt1 MATCH ‘bar’) AS b
>> 
>> Then, I can do the following -
>> 
>> SELECT Count(*) FROM a WHERE a.t1Id IN b
>> 
>> Of course, in reality, I don’t do this separately but all in one go to make 
>> a really messy SQL but a really fast query, a couple of hundred ms as 
>> opposed to > 25s
> 
> 
> I don't see why that would be any different from your join query. What does 
> EXPLAIN QUERY PLAN say for the version that takes a few hundred ms?



Here it is, the new query made up of two queries derived from the original 
query (summarize again below). The query itself takes 301ms. 

```
SQLite version 3.30.0 2019-10-04 15:03:17
Enter ".help" for usage hints.
sqlite> EXPLAIN QUERY PLAN
   ...> SELECT Count(*) FROM
   ...> 
   ...> -- this one is 'QUERY a'
   ...> (SELECT Count(DISTINCT t1.t1Id) t1Id 
   ...> FROM t2 JOIN t1 ON t2.t1Id = t1.t1Id 
   ...> WHERE t1.deleted = 0 AND t2.deleted = 0 AND scm != '') a
   ...>
   ...> WHERE a.t1Id IN 
   ...> 
   ...> -- and this one is 'QUERY b'
   ...> (SELECT Count(DISTINCT t1Id) FROM vt1 WHERE vt1 MATCH "foo");
QUERY PLAN
|--CO-ROUTINE 1
|  |--SEARCH TABLE t2 USING INDEX ix_t2 (deleted=? AND deleted=?)
|  `--SEARCH TABLE t1 USING COVERING INDEX ix_t1_t1Id (deleted=? AND t1Id=?)
|--SCAN SUBQUERY 1 AS a
`--LIST SUBQUERY 2
   `--SCAN TABLE vt1 VIRTUAL TABLE INDEX 0:m
sqlite>
```

Here is the original query (QUERY zero). This is the one that takes a very long 
time, 10.611s. Note, in this one I’ve added an additional table ’t2’ but the 
concept is the same. Normal tables JOINed to each other, and then JOINed to a 
virtual table.

```
sqlite> EXPLAIN QUERY PLAN
   ...> SELECT Count(DISTINCT t1.t1Id)
   ...> FROM t2 JOIN t1 ON t2.t1Id = t1.t1Id JOIN vt1 ON t1.t1Id = vt1.t1Id
   ...> WHERE t1.deleted = 0 AND t2.deleted = 0 AND scm != '' AND vt1 MATCH 
"foo";
QUERY PLAN
|--SCAN TABLE vt1 VIRTUAL TABLE INDEX 0:m
|--SEARCH TABLE t2 USING INDEX ix_t2_scm (deleted=? AND deleted=?)
`--SEARCH TABLE t1 USING COVERING INDEX ix_t1_t1Id (deleted=? AND t1Id=?)
sqlite>
```

Here are the two queries derived from QUERY zero. The first one (QUERY a) takes 
324ms

```
sqlite> EXPLAIN QUERY PLAN
   ...> SELECT Count(DISTINCT t1.t1Id)
   ...> FROM t2 JOIN t1 ON t2.t1Id = t1.t1Id
   ...> WHERE t1.deleted = 0 AND t2.deleted = 0 AND scm != '';
QUERY PLAN
|--SEARCH TABLE t2 USING INDEX ix_t2_scm (deleted=? AND deleted=?)
`--SEARCH TABLE t1 USING COVERING INDEX ix_t1_t1Id (deleted=? AND t1Id=?)
sqlite>
```

The second query (QUERY b) takes: 27ms

```
sqlite> EXPLAIN QUERY PLAN
   ...> SELECT Count(DISTINCT t1Id) FROM vt1 WHERE vt1 MATCH “foo";
QUERY PLAN
`--SCAN TABLE vt1 VIRTUAL TABLE INDEX 0:m
sqlite>
```

One more thing: All of this goes to hell if the virtual table returns way too 
many matches. For example, I have a term that returns 80K rows from the FTS 
MATCH. In that case, even my new query very slow because, well, because the 
QUERY b above is slow.
_______________________________________________
sqlite-users mailing list
[email protected]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] speeding up an FTS5 search with a JOIN

Reply via email to