all of the below is really good advice that I shall follow over this weekend.

Many thanks.


On Sep 29, 2011, at 10:05 AM, Petite Abeille wrote:

> 
> On Sep 29, 2011, at 3:30 PM, Mr. Puneet Kishor wrote:
> 
>> Well, defeated by FTS4 for now, I will try the following approach --
> 
> [didn't follow the thread blow by blow, so apologies if this was already 
> covered and dismissed :)]
> 
> Before you jump to the deep end...
> 
> FTS tables are meant to be accessed by either their rowid or queried with a 
> match qualifier. Anything else will be rather slow, as it will result in a 
> full table scan. See section 1.4. "Simple FTS Queries" of the fine manual [1].
> 
> Looking at the DDL you posted original, you have:
> 
>   CREATE VIRTUAL TABLE fts_uri USING fts4 (uri_id, uri_content); 
> 
> That uri_id is not helping anything, as you cannot really use it to lookup 
> the table (see section "1.3. Populating FTS Tables"  [2]) . You should 
> instead drop it, and simply set the rowid of your FTS table to the 
> uris.uri_id.
> 
> So:
> 
>   CREATE VIRTUAL TABLE fts_uri USING fts4 (uri_content);  -- don't bother 
> with uri_id here, as it cannot be used for lookups
> 
> Now populate fts_uri with the content of  uris, using the same rowid for both:
> 
> insert
> into          fts_uri
>               (
>                       rowid,
>                       uri_content
>               )
> select        uris.uri_id,
>               uris. uris
> from  uris
> 
> Also, it would appear that you are storing your content twice: once in uris. 
> uri_content and once again in fts_uri. uri_content. That's once too many.
> 
> From uris, you can retrieve your content directly from fts_uri:
> 
> select  *
> from    uris
> join     fts_uri
> on       fts_uri.rowid = uris.uri_id
> 
> No point in storing the data twice as you can retrieve the text verbatim from 
> fts_uri without much ado.
> 
> And now you can access uris directly from fts_uri as well as they share the 
> same rowid.
> 
> Now, for your search, decompose the problem:
> 
> (1) do the FTS first, assuming it's the most selective part of your query
> 
> select  fts_uri.rowid as uri_id,
>        snippet(fts_uri, '<span class="hilite">', '</span>', '&hellip;', -1, 
> 64) snippet 
> from    fts_uri
> 
> where   fts_uri.uri_content match 'education,school' 
> 
> Is that slow? 
> 
> (2) Add joins one by one
> 
> select  fts_uri.rowid as uri_id,
>        snippet(fts_uri, '<span class="hilite">', '</span>', '&hellip;', -1, 
> 64) snippet, 
>        uris.uri as uri, 
>        uris.downloaded_on as downloaded_on
> from    fts_uri
> 
> join    uris
> on      uris.uri_id = fts_uri.rowid
> 
> where   fts_uri.uri_content match 'education,school' 
> 
> Is that slow?
> 
> Repeat and rinse :)
> 
> 
> 
> 
> [1] http://www.sqlite.org/fts3.html#section_1_4
> [2] http://www.sqlite.org/fts3.html#section_1_3
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to