Hi there,
I've got a very slow query that seems to be oh-so-close to running quickly, and it seems to come down to handling of OR operators. When the query has one OR criteria, it uses the correct indexes, but when there are two, it does a table scan, which in this case brings the query time from milliseconds up to minutes in length when the database is large (~3GB). I don't understand why it thinks it needs a table scan, and would love some suggestions. This is with SQLite 3.7.16.2. It's a hierarchical data layout (as for a filesystem tree), and uses FTS4 tables. Here's a simplified schema (sorry I couldn't simplify it further): CREATE TABLE files( key INTEGER DEFAULT 0 primary key, name TEXT COLLATE NOCASE, type INTEGER, parent INTEGER ); CREATE INDEX files_parent_idx_idx ON files (parent); CREATE TABLE search_hierarchy( parent INTEGER REFERENCES files(key), child INTEGER REFERENCES files(key) ); CREATE INDEX search_hierarchy_parent_idx ON search_hierarchy (parent); CREATE INDEX search_hierarchy_child_idx ON search_hierarchy (child); CREATE VIRTUAL TABLE filename_fts USING fts4( file_name_fts ); CREATE VIRTUAL TABLE file_comment_metadata USING fts4( fileComment ); And the query in question: SELECT files.key FROM files, search_hierarchy, filename_fts, file_comment_metadata WHERE search_hierarchy.child = files.parent AND filename_fts.rowid = files.key AND file_comment_metadata.rowid = files.key AND ( search_hierarchy.parent = 12 OR files.parent = 12 ) AND ( (filename_fts.file_name_fts MATCH '"X*"') OR (file_comment_metadata.fileComment MATCH '"X*"') ) LIMIT 0, 1000; This gets run as 0|0|1|SCAN TABLE search_hierarchy (~1000000 rows) <<<<< 0|1|0|SEARCH TABLE files USING COVERING INDEX files_parent_idx_idx (parent=?) (~5 rows) 0|2|2|SCAN TABLE filename_fts VIRTUAL TABLE INDEX 1: (~0 rows) 0|3|3|SCAN TABLE file_comment_metadata VIRTUAL TABLE INDEX 1: (~0 rows) It's that first SCAN TABLE search_hierarchy that is killing it (~3 minutes for this query). It seems to be the OR's on the last two lines that seem to throw it for a loop. If I change it to the following, by removing the first OR: SELECT files.key FROM files, search_hierarchy, filename_fts, file_comment_metadata WHERE search_hierarchy.child = files.parent AND filename_fts.rowid = files.key AND file_comment_metadata.rowid = files.key AND ( search_hierarchy.parent = 12 ) AND ( (filename_fts.file_name_fts MATCH '"X*"') OR (file_comment_metadata.fileComment MATCH '"X*"') ) LIMIT 0, 1000; It runs very fast (75ms) as 0|0|1|SEARCH TABLE search_hierarchy USING INDEX search_hierarchy_parent_idx (parent=?) (~10 rows) 0|1|0|SEARCH TABLE files USING COVERING INDEX files_parent_idx_idx (parent=?) (~10 rows) 0|2|2|SCAN TABLE filename_fts VIRTUAL TABLE INDEX 1: (~0 rows) 0|3|3|SCAN TABLE file_comment_metadata VIRTUAL TABLE INDEX 1: (~0 rows) Or if change it to: SELECT files.key FROM files, search_hierarchy, filename_fts, file_comment_metadata WHERE search_hierarchy.child = files.parent AND filename_fts.rowid = files.key AND file_comment_metadata.rowid = files.key AND ( search_hierarchy.parent = 12 OR files.parent = 12 ) AND ( (filename_fts.file_name_fts MATCH '"X*"') ) LIMIT 0, 1000; it runs very quickly (~70ms) as well, explained as 0|0|2|SCAN TABLE filename_fts VIRTUAL TABLE INDEX 2: (~0 rows) 0|1|0|SEARCH TABLE files USING INTEGER PRIMARY KEY (rowid=?) (~1 rows) 0|2|3|SCAN TABLE file_comment_metadata VIRTUAL TABLE INDEX 1: (~0 rows) 0|3|1|SEARCH TABLE search_hierarchy USING INDEX search_hierarchy_child_idx (child=?) (~5 rows) Are there any suggestions to eliminate the table scan in the original query, since it seems quite capable of efficiently handling each of the OR expressions on their own. (I could do a UNION of multiple versions of the whole query with each OR section separated out, but that's not terribly satisfying.) Many thanks! -Paul _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users