And the indexes are not supposed to be used ? Shouldn't a index cover the ranking too ?
It seems a mistake for me. Cheers ! > Mon Apr 04 2016 11:07:51 AM CEST from "Dan Kennedy" ><danielk1977 at gmail.com> Subject: Re: [sqlite] FTS5 problem ? more than >3,000 >queries ! > > On 04/04/2016 05:43 AM, Domingo Alvarez Duarte wrote: > >>Hello ! >> >> I'm testing fts5 on a database with data from hacker news (around 13M >> records) and it took more than 2 hours to index, I have this application >>to >> log all queries executed for debugging and although the search isn't slow >> it's making a lot of queries >> >> Is this normal ? More than 3,000 queries ? >> > Probably. Your query does "ORDER BY rank" with "rank" set to the built > in bm25() function. One of the things bm25() needs is the size of each > document that it is ranking. And obtaining this requires a separate > query on the %_sz table for each row. So if the MATCH expression matches > 3000 rows, FTS5 makes 3000 SQL queries to obtain the size data. > > Dan. > > > > > > > >>Cheers ! >> >> >> >> CREATE TABLE "items" ( >> 'id' integer PRIMARY KEY, >> 'parent' INTEGER, >> 'by' text COLLATE NOCASE, >> 'score' integer DEFAULT 0, >> 'title' text COLLATE NOCASE, >> 'type' text COLLATE NOCASE, >> 'url' text COLLATE NOCASE, >> 'deleted' BOOLEAN DEFAULT 0, >> 'dead' BOOLEAN DEFAULT 0, >> 'comment' TEXT COLLATE NOCASE, >> 'time' integer NOT NULL, >> descendants integer default 0 >> ); >> >> CREATE VIRTUAL TABLE fts_idx_items USING fts5(title, comment, >>content=items, >> content_rowid=id, prefix=3); >> >> >> The query: >> >> SQL : SELECT a.id, a.parent, a.by, a.score, >> snippet(fts_idx_items, 0, '<em>', '</em>', '...', >> 12) as title, >> a.type, a.url, >> --snippet(fts_idx_items, 1, '<em>', '</em>', >> '...', 12) as comment, >> a.descendants, >> datetime(a."time", 'unixepoch') AS time, >> (julianday('now') - julianday(datetime(a."time", >> 'unixepoch'))) as elapsed_time >> FROM items a, fts_idx_items b >> where fts_idx_items match 'title:sql' >> and a.id = b.rowid >> --and a.type != 'comment' >> ORDER BY rank --bm25(fts_idx_items) >> LIMIT 30 OFFSET 0 >> >> Generates: >> >> SQL : -- SELECT rowid, rank FROM 'main'.'fts_idx_items' ORDER BY >> bm25(fts_idx_items) ASC >> >> Followed by 28 of this: >> >> SQL : -- SELECT pgno FROM 'main'.'fts_idx_items_idx' WHERE >> segid=? AND term<=? ORDER BY term DESC LIMIT 1 >> >> Followed by around 3,000 of: >> >> SQL : -- SELECT sz FROM 'main''fts_idx_items_docsize' WHERE id=? >> >> >> Followed by 18 of: >> >> SQL : -- SELECT T.'id', T.'title', T.'comment' FROM >> 'main'.'items' T WHERE T.'id'=? >> >> SQL : -- SELECT sz FROM 'main'.'fts_idx_items_docsize' WHERE >> id=? >> >> _______________________________________________ >> sqlite-users mailing list >> sqlite-users at mailinglists.sqlite.org >> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users >> > _______________________________________________ > sqlite-users mailing list > sqlite-users at mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > > > ?