-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 25/01/13 12:59, Paul Vercellotti wrote:
> As I understand, it's tricky to get FTS to do substring matching, no?
> What's the best way to do that?

In what way is it tricky? There are several examples of doing it in the
doc I pointed to. Even when it has to do a full scan, going through the
list of all words should be less work than visiting each source row.

I recommend you actually go ahead and use FTS before deciding it doesn't
work. You'll be able to get accurate performance information for your
data set.

  http://c2.com/cgi/wiki?PrematureOptimization

If you want to do substring matching using an index then you need to use
n-grams. This involves taking fragments from the text. For example, if
your source text is "hi there" and you are doing n-grams between 2 and 4
letters, then you would index these:

  'hi'  'hi '  'hi t'
  'i '  'i t'  'i th'
  ' t'  ' th'  ' the'
  'th'  'the'  'ther'
  'he'  'her'  'here'
  'er'  'ere'
  're'

You can possibly also use an FTS tokenizer that produces n-grams.

Roger
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)

iEYEARECAAYFAlEDFdwACgkQmOOfHg372QTShgCfXMmtiWFbWL9INRMF4TfTUTGb
5+IAn2LrTYKTm9mLcJ6mR6piRQ8LT6nw
=taL+
-----END PGP SIGNATURE-----
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
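
For anyone who wants to try the n-gram expansion described in the message, here is a minimal sketch in Python. It is an illustration only, not anything FTS does internally: the names ngrams, build_index and rows are made up for the example, and a plain dictionary stands in for the index. It assumes fragments are taken at every character offset, spaces included, which is what the "hi there" list in the message shows.

```python
from collections import defaultdict


def ngrams(text, min_n=2, max_n=4):
    """Return every fragment of text that is min_n to max_n characters long.

    Fragments are taken at each character offset, shortest first, and
    spaces are kept as ordinary characters.
    """
    grams = []
    for start in range(len(text)):
        for n in range(min_n, max_n + 1):
            if start + n > len(text):
                break  # longer fragments would run past the end of the text
            grams.append(text[start:start + n])
    return grams


def build_index(rows, min_n=2, max_n=4):
    """Map each fragment to the set of row ids whose text contains it.

    rows is a hypothetical {row_id: text} mapping used only for this sketch.
    """
    index = defaultdict(set)
    for row_id, text in rows.items():
        for gram in ngrams(text, min_n, max_n):
            index[gram].add(row_id)
    return index


rows = {1: "hi there", 2: "hello world"}
index = build_index(rows)

print(ngrams("hi there"))
print(sorted(index.get("the", set())))  # ids of rows containing "the"
```

With this scheme a search string of 2 to 4 characters is a single lookup. A shorter string is not in the index at all, and a longer one would have to be split into fragments and the candidate rows checked against the original text.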