I've never used FTS, so I'm just throwing an off-the-wall idea out: instead of
tokenising partial words, could you tokenise/store the reverse of each word
(possibly in a separate place, if that can be done):

enihsnoom
enihs
enihsnus

Then search for "enihs" as well as "shine". If you can't separate the forward
and reversed versions, you'd have to filter out cases where "dog" matches "god".
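
A rough sketch of the idea in plain Python, not FTS itself: two sorted lists
stand in for separate forward and reversed term indexes, and a binary search
plays the role of the prefix query. The names prefix_matches, starts_with and
ends_with are made up for illustration, and whether FTS can keep the two sets
of terms apart like this is an assumption.

```python
from bisect import bisect_left

words = ["moon", "moonlight", "moonshine", "shine", "sunshine", "dog", "god"]

# Two separate sorted lists stand in for two term indexes (an assumption;
# a real FTS setup would need its own way to keep them apart).
forward_index = sorted(words)
reversed_index = sorted(w[::-1] for w in words)


def prefix_matches(index, prefix):
    """Return every entry of the sorted list `index` that starts with `prefix`."""
    start = bisect_left(index, prefix)
    matches = []
    for term in index[start:]:
        if not term.startswith(prefix):
            break  # sorted order: no later entry can match
        matches.append(term)
    return matches


def starts_with(prefix):
    """Ordinary prefix query on the forward terms."""
    return prefix_matches(forward_index, prefix)


def ends_with(suffix):
    """Prefix query on the reversed terms, with the hits reversed back."""
    hits = prefix_matches(reversed_index, suffix[::-1])
    return sorted(term[::-1] for term in hits)


print(starts_with("moon"))   # words beginning with "moon"
print(ends_with("shine"))    # words ending in "shine"
```

Because the reversed terms live in their own list, searching for words ending
in "dog" does not pick up "god". As far as I can see, this only finds words
that start or end with the term; a word with "shine" in the middle would still
be missed.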

Graham

-------- Original message --------
From: "Mario M. Westphal" <m...@mwlabs.de> 
Date: 07/01/2016  18:31  (GMT+00:00) 
To: sqlite-users at mailinglists.sqlite.org 
Subject: [sqlite] Some FTS5 guidance 

Hello,

I recently looked into FTS5.

The documentation is clear and I was able to get it running with a small
test database quickly. And the response times are awesome :-)

My question:

At least as I understand it at this point, FTS can only do prefix queries.

If my database contains the words

moon
moonlight
moonshine
shine
sunshine

An FTS query like "moon*" will find all three terms starting with "moon" -
very fast.

But there is no way to find "moonshine" or "sunshine" by running a query for
"shine" or "shine*"?

Currently I search using LIKE, and there such 'contains' queries are easy. My
users of course don't understand all this and want to find all words
containing "shine", wherever the term appears in the word.
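
For reference, the LIKE approach looks roughly like this with Python's sqlite3
module. The table name "words" and the helper contains() are hypothetical, made
up only for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical plain table; no FTS involved here.
conn.execute("CREATE TABLE words (word TEXT)")
conn.executemany(
    "INSERT INTO words (word) VALUES (?)",
    [("moon",), ("moonlight",), ("moonshine",), ("shine",), ("sunshine",)],
)


def contains(conn, term):
    """Return all words containing `term` anywhere, via LIKE '%term%'."""
    rows = conn.execute(
        "SELECT word FROM words WHERE word LIKE ? ORDER BY word",
        (f"%{term}%",),
    )
    return [row[0] for row in rows]


print(contains(conn, "shine"))
```

The leading wildcard means SQLite cannot use an index for this pattern, so
every row is scanned, which is why it gets slow on large tables.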



The only idea I had so far was to write my own tokenizer and to store each
word with every possible 'sub-word':

When "moonshine" is added to FTS, it is split into multiple words:

moonshine
oonshine
onshine
nshine
shine
...

(maybe I limit this to a minimum of 2 or 3 characters).
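
A minimal sketch of that splitting in plain Python, not an actual FTS5
tokenizer: suffixes(), build_index() and contains() are hypothetical helper
names, and a dict scan stands in for the FTS prefix query. A prefix match
against any stored suffix then behaves like a 'contains' search.

```python
def suffixes(word, min_length=3):
    """Return the word plus each trailing sub-word of at least min_length chars."""
    # max(1, ...) keeps the full word even when it is shorter than min_length.
    count = max(1, len(word) - min_length + 1)
    return [word[i:] for i in range(count)]


def build_index(words, min_length=3):
    """Map every stored suffix term to the set of original words it came from."""
    index = {}
    for word in words:
        for term in suffixes(word, min_length):
            index.setdefault(term, set()).add(word)
    return index


def contains(index, query):
    """Prefix-match `query` against all suffix terms (stand-in for "query*")."""
    hits = set()
    for term, originals in index.items():
        if term.startswith(query):
            hits |= originals
    return sorted(hits)


words = ["moon", "moonlight", "moonshine", "shine", "sunshine"]
index = build_index(words)

print(suffixes("moonshine"))
print(contains(index, "shine"))
print(len(words), "words became", len(index), "index terms")
```

The last line gives a feel for how much the number of stored terms grows
compared to the number of words.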



This of course produces a lot of extra entries in FTS and may impact
performance and database size.

I hence wonder if this problem has been tackled already and if there is a
"standard" solution.

