Hi,

I'm currently working on a project that makes use of the SQLite FTS
functionality (version 3.7.14) on a large data set of POI (point of
interest). Most of the time it works fine but I stumbled on a few queries
that take a lot of time to complete. It's confusing to me what actually is
going on here so I hope someone can help me out here.

The query that gives a problem is the following:
SELECT matchinfo(ftsPois, 'cs'), ftsPois.docId FROM ftsPois WHERE ftsPois
MATCH "(categories:7310 OR mot) haven";

This returns 2602 results and takes 9.95 seconds to complete! When I would
remove the matchinfo part of the query you get a totally different number:
SELECT ftsPois.docId FROM ftsPois WHERE ftsPois MATCH "(categories:7310 OR
mot) haven";

Now it only takes 0.05 seconds to complete the query. So a difference of
9.9 seconds.

To make it more interesting you can see the performance degradation getting
more prominent around 1500 results. Here are the timings for the query with
the matchinfo in steps of 250 results (obtained by adding a LIMIT to the
query):
Result count: Time spent
 250: CPU Time: user 0.020001 sys 0.000000
 500: CPU Time: user 0.028001 sys 0.000000
 750: CPU Time: user 0.028002 sys 0.000000
1000: CPU Time: user 0.036003 sys 0.000000
1250: CPU Time: user 0.044003 sys 0.000000
1500: CPU Time: user 0.644040 sys 0.012000
1750: CPU Time: user 2.772173 sys 0.016001
2000: CPU Time: user 4.900307 sys 0.012001
2250: CPU Time: user 7.000438 sys 0.008001
2500: CPU Time: user 9.136571 sys 0.028002

So after 1500 results you pay an extra 2 seconds for each extra 250 results.

I found two ways to somewhat work around this. The first one is to take the
union of two queries:
SELECT ftsPois.docId,matchinfo(ftsPois,'cs') FROM ftsPois WHERE ftsPois
MATCH "mot haven" UNION SELECT ftsPois.docId,matchinfo(ftsPois,'cs') FROM
ftsPois WHERE ftsPois MATCH "categories:7310 haven";

This query takes 0.07 seconds to complete.

And then there is this workaround:
SELECT matchinfo(ftsPois, 'cs'),ftsPois.docId FROM ftsPois WHERE ftsPois
MATCH "(categories:7310 OR mot) haven" ORDER BY ftsPois.docId DESC;

Which takes 0.08 seconds to complete.

Especially the second workaround baffles me. I wouldn't know why this would
speed the matchinfo part of the query.

I hope someone can help me out here or at least could explain the observed
behavior. Thanks in advance.
_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to