Hi I was trying to optimize this FTS query which can be a bottleneck in my application:
SELECT docId,matchinfo(ftsStreets,'pcx') FROM ftsStreets WHERE ftsStreets MATCH '...' AND docId BETWEEN docid1 AND docid2; Can't SQLite optimize the query to use the fact that docid cannot be more than the specified upper limit (docid2) given in the BETWEEN clause here? Looking that the SQLite code (3.7.14), I see that it unpacks docid lists (in function fts3SegReaderNextDocid.c) of the FTS inverted index. Those docid lists are stored in ascending order and stored by delta. Each delta is stored in variable size (1 bytes to 10 bytes depending on size of delta). When running above query, I see that SQLite unpacks sequentially all the docids of the lists, even after it reaches beyond upper bound docid2, which seems useless at least in above query. I made a temporary change to sqlite3.c (function fts3SegReaderNextDocid.c) just to prove that there is a potential optimization here: $ p4 diff -d-c sqlite3.c *************** *** 126024,126029 **** --- 126024,126030 ---- }else{ rc = fts3SegReaderRequire(pReader, p, FTS3_VARINT_MAX); if( rc==SQLITE_OK ){ + extern sqlite3_int64 docidMax; sqlite3_int64 iDelta; pReader->pOffsetList = p + sqlite3Fts3GetVarint(p, &iDelta); if( pTab->bDescIdx ){ *************** *** 126031,126036 **** --- 126032,126042 ---- }else{ pReader->iDocid += iDelta; } + if ((unsigned long long)pReader->iDocid > (unsigned long long)docidMax) + { + fprintf(stdout, "*** MAX DOCID (0x%llx) REACHED! (abort unpack of docid list)\n", docidMax); + pReader->pOffsetList = 0; + } } } } The docidMax is normally initialized to 0xffffffffffffffffUL (-1), so my optimization normally do not do anything. But I set doidMax to docid2 (upper bound of the BETWEEN clause) temporarily when I run my query and then put it back to 0xffffffffffffffffUL. Needless to say that this is an ugly hack which should definitely not be checked-in as-is. But it clearly helps to speed up my query and it seems to work (without being 100% sure it's correct). I get the same results, faster. And I hope it's good enough at least to prove that there is a potential optimization in FTS SQLite. I don't know the SQLite code well enough to come up with a clean patch for the optimization. I measured timing several times and it consistently speeds up. For example, I see some queries that used to take 203 ms which now take 68 ms. Of course speed up depends on the upper bound value docid2. Regards Dominique _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users