On Fri, Feb 18, 2011 at 12:08 AM, David M. Cotter <m...@davecotter.com> wrote:
> so i am still left wondering if searching substrings is really any faster 
> using FTS.

You may want to search the archives, as this has come up before.  I
don't recall if anyone had an inspired solution.  You could possibly
use a custom tokenizer which indexed all sub-string points within each
term (at obvious cost to index size).

FTS stores a sorted index of the of the indexed terms.  This means
that prefix searches ('th*', for instance) can be found by considering
the sub-tree of terms starting with 'th' - prefix search was
essentially free to add.  But full sub-string matches require a full
scan, just like for a regular table.  It's possible that the current
code is doing a full table scan, though a full index scan might be
faster (though you should note that it would still be substantially
slower than a search anchored at the front of a term).

Indexing strategies which allow for fast substring searches are
available, but it would require a different (and much bigger) index
structure, and such structures generally don't show good locality.  So
it would be a definite loss for use cases which don't need substring
searches.

-scott
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to