On 04/04/2016 05:43 AM, Domingo Alvarez Duarte wrote:
> Hello !
>
> I'm testing fts5 on a database with data from hacker news (around 13M
> records) and it took more than 2 hours to index, I have this application to
> log all queries executed for debugging and although the search isn't slow
> it's making a lot of queries.
>
> Is this normal ? More than 3,000 queries ?

Probably. Your query does "ORDER BY rank" with "rank" set to the built 
in bm25() function. One of the things bm25() needs is the size of each 
document that it is ranking. And obtaining this requires a separate 
query on the %_sz table for each row. So if the MATCH expression matches 
3000 rows, FTS5 makes 3000 SQL queries to obtain the size data.

Dan.





>
> Cheers !
>
>     
>
> CREATE TABLE "items" (
>      'id' integer PRIMARY KEY,
>      'parent' INTEGER,
>      'by' text COLLATE NOCASE,
>      'score' integer DEFAULT 0,
>      'title' text  COLLATE NOCASE,
>      'type' text  COLLATE NOCASE,
>      'url' text  COLLATE NOCASE,
>      'deleted' BOOLEAN DEFAULT 0,
>      'dead' BOOLEAN DEFAULT 0,
>      'comment' TEXT COLLATE NOCASE,
>      'time' integer NOT NULL,
>      descendants integer default 0
> );
>
> CREATE VIRTUAL TABLE fts_idx_items USING fts5(title, comment, content=items,
> content_rowid=id, prefix=3);
>
>
> The query:
>
> SQL    :  SELECT     a.id, a.parent, a.by, a.score,
>                  snippet(fts_idx_items, 0, '<em>', '</em>', '...',
> 12) as title,
>                  a.type, a.url,
>                  --snippet(fts_idx_items, 1, '<em>', '</em>',
> '...', 12) as comment,
>                  a.descendants,
>                  datetime(a."time", 'unixepoch') AS time,
>                  (julianday('now') - julianday(datetime(a."time",
> 'unixepoch'))) as elapsed_time
>              FROM items a, fts_idx_items b
>              where fts_idx_items match 'title:sql'
>              and a.id = b.rowid
>              --and a.type != 'comment'
>              ORDER BY rank --bm25(fts_idx_items)
>              LIMIT 30 OFFSET 0
>
> Generates:
>
> SQL    :    -- SELECT rowid, rank FROM 'main'.'fts_idx_items' ORDER BY
> bm25(fts_idx_items) ASC
>
> Followed by 28 of this:
>
> SQL    :    -- SELECT pgno FROM 'main'.'fts_idx_items_idx' WHERE
> segid=? AND term<=? ORDER BY term DESC LIMIT 1
>
> Followed by around 3,000 of:
>
> SQL    :    -- SELECT sz FROM 'main'.'fts_idx_items_docsize' WHERE id=?
>   
>
> Followed by 18 of:
>
> SQL    :    -- SELECT T.'id', T.'title', T.'comment' FROM
> 'main'.'items' T WHERE T.'id'=?
>
> SQL    :    -- SELECT sz FROM 'main'.'fts_idx_items_docsize' WHERE
> id=?
>
> _______________________________________________
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to