Thanks for the reply, Dan!

> On Dec 12, 2018, at 7:08 AM, Dan Kennedy <danielk1...@gmail.com> wrote:
>
> Leaving stop words in while parsing queries won't quite work anyway. If your
> tokenizer returns "the" when parsing a query, FTS3/4 will search for "the" in
> the index. And it won't be there if the tokenizer used for parsing documents
> stripped it out.

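As a rough illustration of the rule under discussion, here is a small Python sketch. It is not the actual tokenizer, which would be C code behind FTS4's tokenizer interface. The stop-word list and the name `query_tokens` are made up for the example, and the sketch ignores query operators and quoting entirely. It drops a stop word from a query unless the character right after it is "*".

```python
import re

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"the", "a", "an", "of"}

# A run of word characters, optionally followed directly by "*".
TOKEN_RE = re.compile(r"(\w+)(\*?)")


def query_tokens(query):
    """Return the tokens to keep from a query string.

    A stop word is dropped unless it is immediately followed by "*",
    in which case it is kept so it can act as a prefix term.
    """
    kept = []
    for word, star in TOKEN_RE.findall(query):
        lowered = word.lower()
        if lowered in STOP_WORDS and not star:
            continue  # plain stop word: it was never indexed, so skip it
        kept.append(lowered + star)
    return kept
```

Under these assumptions, `query_tokens("the cat")` yields `["cat"]`, while `query_tokens("the* cat")` yields `["the*", "cat"]`.
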
I was only talking about leaving them in when followed immediately by a “*”, so it would preserve “the*” but not “the”. Then FTS4 will interpret “the*” as a prefix match, not the word “the”.

> I think your best options might be to switch to FTS5

I haven’t looked into how hard it would be to switch to FTS5. I recall that when I started writing this code a few years ago, FTS5 had some issues or limitations that led me to use FTS4 instead.

Also, there are by now many databases out in the field that have FTS4 tables/indexes in them. If I switch to FTS5, will those be upgraded, or do I need to do so manually?

> or to write a tokenizer smart enough to remove the AND or other syntax
> tokens when required.

Not sure what you mean by this. The “when required” part is the sticking point, which is the reason I posted.

--Jens
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users