Re: [sqlite] Substring (LIKE "%key%") searches, would FTS3 with suffix-tree tokenizer be the fast way?

Black, Michael (IS) Mon, 09 Aug 2010 05:20:31 -0700

Sounds to me like Boyer-Moore is needed
http://en.wikipedia.org/wiki/Boyer%E2%80%93Moore_string_search_algorithm
 
And...I would probably pre-load the database table into 26 seperate memory 
tables to avoid any SQL interactivity at all other than the initial loading.  
Adding the SQL layer slows things down far too much.
 
Michael D. Black
Senior Scientist
Advanced Analytics Directorate
Northrop Grumman Information Systems

________________________________

From: sqlite-users-boun...@sqlite.org on behalf of Tim Romano
Sent: Mon 8/9/2010 7:00 AM
To: General Discussion of SQLite Database
Subject: EXTERNAL:Re: [sqlite] Substring (LIKE "%key%") searches, would FTS3 
with suffix-tree tokenizer be the fast way?

First, permit me a little rant. As a user, I dislike this kind of
incremental search feature if there's no easy way to toggle it or to
configure it and the list of items will be large enough to cause a typing
lag. The feature can become an intrusive nuisance, the opposite of what is
intended.  Browsers put this feature on the URL address bar and Google has
it on its search-input. Keystrokes entered often get swallowed up. It's
worse than typing on a 300 baud dumb terminal, for at least on those ancient
machines your characters would eventually be displayed on the green screen,
whereas with today's browsers the characters often just get eaten; I find
myself having to retype the first few characters of a URL or search term far
too often.

I agree with Radzi's suggestion. Once you have the initial set of of hits
(rowid, name)  in an array, do the rest in procedurally rather than going
back against the database with a new SQL query and a longer search string.
That will be much faster that issuing a new SQL query after every keystroke.
 I would wait until the user had typed at least two characters before
kicking off the initial search because finding every value that contains a
common letter is not helpful when the list of matches is a very long one.

Regards
Tim Romano
Swarthmore PA

On Fri, Aug 6, 2010 at 9:54 PM, Scott Hess <sh...@google.com> wrote:

> On Fri, Aug 6, 2010 at 6:08 PM, Sam Roberts <vieuxt...@gmail.com> wrote:
> > On Fri, Aug 6, 2010 at 11:32 AM, Scott Hess <sh...@google.com> wrote:
> >> On Thu, Aug 5, 2010 at 12:42 PM, Sam Roberts <vieuxt...@gmail.com>
> wrote:
> >>> FTS3 only searches full terms/words by default, but I think if I built
> a custom
> >>> tokenizer that returned all the suffix trees for a name:
> >>
> >> FTS3 can do prefix searches, MATCH 'a*'.  Also, it aimed to support
> >
> > Prefix searches don't allow matching in the middle of words. For
> > example, I want  "bert"
> > to match my name, "roberts".
>
> Darn.  Sorry, was only thinking with half my brain, and that half
> connected your problem up with some past idea.  You're right, you'd
> need the tidbits to get at the interior substrings.
>
> That said, you should be able to pretty easily copy the current
> tokenizer and modify it to return multiple tokens at a single
> location.
>
> -scott
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Substring (LIKE "%key%") searches, would FTS3 with suffix-tree tokenizer be the fast way?

Reply via email to