all of the below is really good advice that I shall follow over this weekend.
Many thanks. On Sep 29, 2011, at 10:05 AM, Petite Abeille wrote: > > On Sep 29, 2011, at 3:30 PM, Mr. Puneet Kishor wrote: > >> Well, defeated by FTS4 for now, I will try the following approach -- > > [didn't follow the thread blow by blow, so apologies if this was already > covered and dismissed :)] > > Before you jump to the deep end... > > FTS tables are meant to be accessed by either their rowid or queried with a > match qualifier. Anything else will be rather slow, as it will result in a > full table scan. See section 1.4. "Simple FTS Queries" of the fine manual [1]. > > Looking at the DDL you posted original, you have: > > CREATE VIRTUAL TABLE fts_uri USING fts4 (uri_id, uri_content); > > That uri_id is not helping anything, as you cannot really use it to lookup > the table (see section "1.3. Populating FTS Tables" [2]) . You should > instead drop it, and simply set the rowid of your FTS table to the > uris.uri_id. > > So: > > CREATE VIRTUAL TABLE fts_uri USING fts4 (uri_content); -- don't bother > with uri_id here, as it cannot be used for lookups > > Now populate fts_uri with the content of uris, using the same rowid for both: > > insert > into fts_uri > ( > rowid, > uri_content > ) > select uris.uri_id, > uris. uris > from uris > > Also, it would appear that you are storing your content twice: once in uris. > uri_content and once again in fts_uri. uri_content. That's once too many. > > From uris, you can retrieve your content directly from fts_uri: > > select * > from uris > join fts_uri > on fts_uri.rowid = uris.uri_id > > No point in storing the data twice as you can retrieve the text verbatim from > fts_uri without much ado. > > And now you can access uris directly from fts_uri as well as they share the > same rowid. > > Now, for your search, decompose the problem: > > (1) do the FTS first, assuming it's the most selective part of your query > > select fts_uri.rowid as uri_id, > snippet(fts_uri, '<span class="hilite">', '</span>', '…', -1, > 64) snippet > from fts_uri > > where fts_uri.uri_content match 'education,school' > > Is that slow? > > (2) Add joins one by one > > select fts_uri.rowid as uri_id, > snippet(fts_uri, '<span class="hilite">', '</span>', '…', -1, > 64) snippet, > uris.uri as uri, > uris.downloaded_on as downloaded_on > from fts_uri > > join uris > on uris.uri_id = fts_uri.rowid > > where fts_uri.uri_content match 'education,school' > > Is that slow? > > Repeat and rinse :) > > > > > [1] http://www.sqlite.org/fts3.html#section_1_4 > [2] http://www.sqlite.org/fts3.html#section_1_3 > _______________________________________________ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users