Well, defeated by FTS4 for now, I will try the following approach --

1. drop the fts tables and rebuild them and test.

2. if the above doesn't work, then either migrate the data to Postgres and use 
its fts, or implement e-Swish or httpdig for full text search.

On Sep 28, 2011, at 4:35 PM, Puneet Kishor wrote:

> 
> On Sep 28, 2011, at 4:18 PM, Black, Michael (IS) wrote:
> 
>> I have no idea if this would work...but...here's some more thoughts...
>> 
>> 
>> 
>> #1 How long does this take:
>> 
>>    select count(*) from fts_uri match 'education school';
>> 
>> 
>> 
>> #2 Create a view on uris with just what you need and use that in your join 
>> (I'm guessing that uri_content takes up most of your database space).
>> 
>> 
>> 
>> create view v_uris as select uri_id,feed_history_id from uri;
>> 
>> 
>> ..
> 
> 
> <snipped a bunch of stuff>
> 
> I did a query on just the fts table and got the answers relatively quickly. 
> Not instantly, but very fast compared to all the attempts so far. So, 
> assuming that the bottleneck is the multiple JOINs to get the data for the 
> correct project_id, I created a temp table with all that JOIN nonsense
> 
>       sqlite> CREATE TEMP TABLE tmp_uris AS SELECT u.uri_id, u.uri uri, 
> u.u_downloaded_on  
>          ...> FROM projects p 
>          ...>   JOIN feeds f ON f.project_id = p.project_id 
>          ...>   JOIN feed_history fh ON f.feed_id = fh.feed_id 
>          ...>   JOIN uris u ON fh.feed_history_id = u.feed_history_id 
>          ...> WHERE p.project_id = 3 AND u.u_downloaded_on >= 
> p.u_project_start;
>       CPU Time: user 16.369556 sys 81.393235
>       
> 
>       sqlite> EXPLAIN QUERY PLAN SELECT u.uri_id uri_id, u.uri uri, 
> u.u_downloaded_on, 
>          ...>   Snippet(fts_uri, '<span class="hilite">', '</span>', 
> '&hellip;', -1, 64) snippet
>          ...> FROM fts_uri f JOIN tmp_uris u ON f.uri_id = u.uri_id 
>          ...> WHERE fts_uri MATCH 'education school'
>          ...> ORDER BY u.uri_id, u_downloaded_on DESC;
>       0|0|0|SCAN TABLE fts_uri AS f VIRTUAL TABLE INDEX 4: (~0 rows)
>       0|1|1|SEARCH TABLE tmp_uris AS u USING INDEX tmp_uris_uri_id (uri_id=?) 
> (~10 rows)
>       0|0|0|USE TEMP B-TREE FOR ORDER BY
>       CPU Time: user 0.000086 sys 0.000006
> 
> and yet
> 
>       sqlite> SELECT u.uri_id uri_id, u.uri uri, u.u_downloaded_on, 
>          ...>   Snippet(fts_uri, '<span class="hilite">', '</span>', 
> '&hellip;', -1, 64) snippet
>          ...> FROM fts_uri f JOIN tmp_uris u ON f.uri_id = u.uri_id 
>          ...> WHERE fts_uri MATCH 'education school'
>          ...> ORDER BY u.uri_id, u_downloaded_on DESC;
> 
>       CPU Time: user 21.871541 sys 26.414337
> 
> 
> A lot better, but simply not usable for a web application.
> 
> --
> Puneet Kishor

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to