And the indexes are not supposed to be used ?  

Shouldn't a index cover the ranking too ?  

It seems a mistake for me.  

Cheers !  
>  Mon Apr 04 2016 11:07:51 AM CEST from "Dan Kennedy"
><danielk1977 at gmail.com>  Subject: Re: [sqlite] FTS5 problem ? more than 
>3,000
>queries !
>
>  On 04/04/2016 05:43 AM, Domingo Alvarez Duarte wrote:
>  
>>Hello !
>> 
>> I'm testing fts5 on a database with data from hacker news (around 13M
>> records) and it took more than 2 hours to index, I have this application
>>to
>> log all queries executed for debugging and although the search isn't slow
>> it's making a lot of queries
>> 
>> Is this normal ? More than 3,000 queries ?
>> 

>  Probably. Your query does "ORDER BY rank" with "rank" set to the built 
> in bm25() function. One of the things bm25() needs is the size of each 
> document that it is ranking. And obtaining this requires a separate 
> query on the %_sz table for each row. So if the MATCH expression matches 
> 3000 rows, FTS5 makes 3000 SQL queries to obtain the size data.
> 
> Dan.
> 
> 
> 
> 
> 
> 
>  
>>Cheers !
>> 
>> 
>> 
>> CREATE TABLE "items" (
>> 'id' integer PRIMARY KEY,
>> 'parent' INTEGER,
>> 'by' text COLLATE NOCASE,
>> 'score' integer DEFAULT 0,
>> 'title' text COLLATE NOCASE,
>> 'type' text COLLATE NOCASE,
>> 'url' text COLLATE NOCASE,
>> 'deleted' BOOLEAN DEFAULT 0,
>> 'dead' BOOLEAN DEFAULT 0,
>> 'comment' TEXT COLLATE NOCASE,
>> 'time' integer NOT NULL,
>> descendants integer default 0
>> );
>> 
>> CREATE VIRTUAL TABLE fts_idx_items USING fts5(title, comment,
>>content=items,
>> content_rowid=id, prefix=3);
>> 
>> 
>> The query:
>> 
>> SQL : SELECT a.id, a.parent, a.by, a.score,
>> snippet(fts_idx_items, 0, '<em>', '</em>', '...',
>> 12) as title,
>> a.type, a.url,
>> --snippet(fts_idx_items, 1, '<em>', '</em>',
>> '...', 12) as comment,
>> a.descendants,
>> datetime(a."time", 'unixepoch') AS time,
>> (julianday('now') - julianday(datetime(a."time",
>> 'unixepoch'))) as elapsed_time
>> FROM items a, fts_idx_items b
>> where fts_idx_items match 'title:sql'
>> and a.id = b.rowid
>> --and a.type != 'comment'
>> ORDER BY rank --bm25(fts_idx_items)
>> LIMIT 30 OFFSET 0
>> 
>> Generates:
>> 
>> SQL : -- SELECT rowid, rank FROM 'main'.'fts_idx_items' ORDER BY
>> bm25(fts_idx_items) ASC
>> 
>> Followed by 28 of this:
>> 
>> SQL : -- SELECT pgno FROM 'main'.'fts_idx_items_idx' WHERE
>> segid=? AND term<=? ORDER BY term DESC LIMIT 1
>> 
>> Followed by around 3,000 of:
>> 
>> SQL : -- SELECT sz FROM 'main''fts_idx_items_docsize' WHERE id=?
>> 
>> 
>> Followed by 18 of:
>> 
>> SQL : -- SELECT T.'id', T.'title', T.'comment' FROM
>> 'main'.'items' T WHERE T.'id'=?
>> 
>> SQL : -- SELECT sz FROM 'main'.'fts_idx_items_docsize' WHERE
>> id=?
>> 
>> _______________________________________________
>> sqlite-users mailing list
>> sqlite-users at mailinglists.sqlite.org
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>> 

>  _______________________________________________
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> 
>
>  



?

Reply via email to