Clemens Ladisch wrote:

>Try to put the FTS matches into a subquery (this is usually done to
>avoid inefficient joins with the FTS tables):
>
>EXPLAIN QUERY PLAN
>SELECT files.key FROM files, search_hierarchy
>WHERE search_hierarchy.child = files.parent AND
>( search_hierarchy.parent = 12 OR files.parent = 12 ) AND
>(files.key IN (SELECT rowid FROM filename_fts WHERE file_name_fts MATCH '"X*"' 
>UNION
>             SELECT rowid FROM file_comment_metadata WHERE fileComment MATCH 
> '"X*"'));


Thanks for the suggestions - Unfortunately this form (using a subquery for the 
FTS) also takes minutes, proportional to the number of results returned by FTS 
query.   In the example case, there are about 1000000 records returned by the 
FTS result.   The form that uses joins is nice in that it seems to be able to 
stop working on the FTS lookups once the LIMIT conditions are met.

Seems UNIONing repetitions of whole query for each OR section is offering the 
best option so far.  I'll keep trying things. 

Cheers,
Paul


________________________________
 From: Clemens Ladisch <clem...@ladisch.de>
To: sqlite-users@sqlite.org 
Sent: Friday, May 31, 2013 4:11 AM
Subject: Re: [sqlite] Avoiding table scans with complex queries
 

Paul Vercellotti wrote:
> I've got a very slow query that seems to be oh-so-close to running
> quickly, and it seems to come down to handling of OR operators.
> When the query has one OR criteria, it uses the correct indexes, but
> when there are two, it does a table scan, which in this case brings
> the query time from milliseconds up to minutes in length when the
> database is large (~3GB).   I don't understand why it thinks it needs
> a table scan

The _estimated_ cost for executing the ORs separately is higher than
that for single table scan.

> 0|0|1|SCAN TABLE search_hierarchy (~1000000 rows)

You did not run ANALYZE.
<http://www.sqlite.org/lang_analyze.html>

> Are there any suggestions to eliminate the table scan in the original
> query, since it seems quite capable of efficiently handling each of
> the OR expressions on their own.

Try to put the FTS matches into a subquery (this is usually done to
avoid inefficient joins with the FTS tables):

EXPLAIN QUERY PLAN
SELECT files.key FROM files, search_hierarchy
WHERE search_hierarchy.child = files.parent AND
( search_hierarchy.parent = 12 OR files.parent = 12 ) AND
(files.key IN (SELECT rowid FROM filename_fts WHERE file_name_fts MATCH '"X*"' 
UNION
               SELECT rowid FROM file_comment_metadata WHERE fileComment MATCH 
'"X*"'));

0|0|0|SEARCH TABLE files USING INTEGER PRIMARY KEY (rowid=?) (~25 rows)
0|0|0|EXECUTE LIST SUBQUERY 1
2|0|0|SCAN TABLE filename_fts VIRTUAL TABLE INDEX 2: (~0 rows)
3|0|0|SCAN TABLE file_comment_metadata VIRTUAL TABLE INDEX 2: (~0 rows)
1|0|0|COMPOUND SUBQUERIES 2 AND 3 USING TEMP B-TREE (UNION)
0|1|1|SEARCH TABLE search_hierarchy USING INDEX search_hierarchy_child_idx 
(child=?) (~5 rows)

> (I could do a UNION of multiple versions of the whole query with each
> OR section separated out, but that's not terribly satisfying.)

Replacing this UNION with OR does not appear to help (I guess the other
OR interferes too much):

EXPLAIN QUERY PLAN
SELECT files.key FROM files, search_hierarchy
WHERE search_hierarchy.child = files.parent AND
( search_hierarchy.parent = 12 OR files.parent = 12 ) AND
(files.key IN (SELECT rowid FROM filename_fts WHERE file_name_fts MATCH '"X*"') 
OR
files.key IN (SELECT rowid FROM file_comment_metadata WHERE fileComment MATCH 
'"X*"'));

0|0|0|EXECUTE LIST SUBQUERY 1
1|0|0|SCAN TABLE filename_fts VIRTUAL TABLE INDEX 2: (~0 rows)
0|0|0|SEARCH TABLE files USING INTEGER PRIMARY KEY (rowid=?) (~25 rows)
0|0|0|EXECUTE LIST SUBQUERY 2
2|0|0|SCAN TABLE file_comment_metadata VIRTUAL TABLE INDEX 2: (~0 rows)
0|0|0|SEARCH TABLE files USING INTEGER PRIMARY KEY (rowid=?) (~25 rows)
0|0|0|EXECUTE LIST SUBQUERY 3
3|0|0|SCAN TABLE filename_fts VIRTUAL TABLE INDEX 2: (~0 rows)
0|0|0|EXECUTE LIST SUBQUERY 4
4|0|0|SCAN TABLE file_comment_metadata VIRTUAL TABLE INDEX 2: (~0 rows)
0|1|1|SEARCH TABLE search_hierarchy USING INDEX search_hierarchy_child_idx 
(child=?) (~5 rows)


Regards,
Clemens
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to