Why do you think postgres would be any better?

Have you thought about writing your own code to process the tables rather than 
letting the database do all the work?

Load your data into memory and then slog through the uris to winnow out the 
matches?
Probably a LOT faster than letting the db do it.



Michael D. Black
Senior Scientist
NG Information Systems
Advanced Analytics Directorate




From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on 
behalf of Mr. Puneet Kishor [punk.k...@gmail.com]
Sent: Thursday, September 29, 2011 8:30 AM
To: General Discussion of SQLite Database
Subject: EXT :Re: [sqlite] speeding up FTS4


Well, defeated by FTS4 for now, I will try the following approach --

1. drop the fts tables and rebuild them and test.

2. if the above doesn't work, then either migrate the data to Postgres and use 
its fts, or implement e-Swish or httpdig for full text search.

On Sep 28, 2011, at 4:35 PM, Puneet Kishor wrote:

>
> On Sep 28, 2011, at 4:18 PM, Black, Michael (IS) wrote:
>
>> I have no idea if this would work...but...here's some more thoughts...
>>
>>
>>
>> #1 How long does this take:
>>
>>    select count(*) from fts_uri match 'education school';
>>
>>
>>
>> #2 Create a view on uris with just what you need and use that in your join 
>> (I'm guessing that uri_content takes up most of your database space).
>>
>>
>>
>> create view v_uris as select uri_id,feed_history_id from uri;
>>
>>
>> ..
>
>
> <snipped a bunch of stuff>
>
> I did a query on just the fts table and got the answers relatively quickly. 
> Not instantly, but very fast compared to all the attempts so far. So, 
> assuming that the bottleneck is the multiple JOINs to get the data for the 
> correct project_id, I created a temp table with all that JOIN nonsense
>
>        sqlite> CREATE TEMP TABLE tmp_uris AS SELECT u.uri_id, u.uri uri, 
> u.u_downloaded_on
>           ...> FROM projects p
>           ...>   JOIN feeds f ON f.project_id = p.project_id
>           ...>   JOIN feed_history fh ON f.feed_id = fh.feed_id
>           ...>   JOIN uris u ON fh.feed_history_id = u.feed_history_id
>           ...> WHERE p.project_id = 3 AND u.u_downloaded_on >= 
> p.u_project_start;
>        CPU Time: user 16.369556 sys 81.393235
>
>
>        sqlite> EXPLAIN QUERY PLAN SELECT u.uri_id uri_id, u.uri uri, 
> u.u_downloaded_on,
>           ...>   Snippet(fts_uri, '<span class="hilite">', '</span>', 
> '&hellip;', -1, 64) snippet
>           ...> FROM fts_uri f JOIN tmp_uris u ON f.uri_id = u.uri_id
>           ...> WHERE fts_uri MATCH 'education school'
>           ...> ORDER BY u.uri_id, u_downloaded_on DESC;
>        0|0|0|SCAN TABLE fts_uri AS f VIRTUAL TABLE INDEX 4: (~0 rows)
>        0|1|1|SEARCH TABLE tmp_uris AS u USING INDEX tmp_uris_uri_id 
> (uri_id=?) (~10 rows)
>        0|0|0|USE TEMP B-TREE FOR ORDER BY
>        CPU Time: user 0.000086 sys 0.000006
>
> and yet
>
>        sqlite> SELECT u.uri_id uri_id, u.uri uri, u.u_downloaded_on,
>           ...>   Snippet(fts_uri, '<span class="hilite">', '</span>', 
> '&hellip;', -1, 64) snippet
>           ...> FROM fts_uri f JOIN tmp_uris u ON f.uri_id = u.uri_id
>           ...> WHERE fts_uri MATCH 'education school'
>           ...> ORDER BY u.uri_id, u_downloaded_on DESC;
>
>        CPU Time: user 21.871541 sys 26.414337
>
>
> A lot better, but simply not usable for a web application.
>
> --
> Puneet Kishor

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to