Hi all,

I found a previous FTS5 thread and, encouraged by the comments of Dan
Kennedy, thought I would comment on the issue.

- Smaller memory footprint and more speed are always great. I'm already very
impressed with the speed but even faster is even better of course. My
experience is that searches that produce few hits are very fast (well under a
second on a db with 10M+ records). Searches that produce many hits (tens of
thousands) are much slower: several seconds, or even minutes if there are
100,000+ hits. I can live with that, but improvement on queries with many
hits would be welcome.
- I would probably also make tokenize=unicode61 "remove_diacritics=0" the
default tokenization behaviour instead of simple, but that's a minor issue.
- In my usage, the most inconvenient limitation is that the first search term
can't be negative in queries (i.e. MATCH foo NOT bar is good but MATCH NOT bar
foo throws an error). I would also like to have negative-only queries (MATCH
NOT foo, returning all records that don't contain foo). Negative-only queries
would mostly be used in combination (INTERSECT) with a positive query on
another column. I know this is probably not a common need, but one can dream.
- Fuzzy matching would be useful as well, but obviously that's a major
feature and introducing it might well compromise performance.
- Same for in-word matching (e.g. MATCH reasonable also matching "unreasonable").
- Same for advanced matching like matching 3 out of 4 search terms if there is 
no match with 4 out of 4, or ranking hits based on how close to each other 
terms occur.
- For some reason, searches like SELECT * FROM ftstable WHERE col1 MATCH ? 
INTERSECT SELECT * FROM ftstable WHERE col2 MATCH ? run very slowly for me. 
Much slower than running the two queries separately. This may not be related to 
FTS per se, and maybe the query could be written better.
- BTW, will there be full backwards compatibility? And I assume one will need 
to recreate (export/reimport) existing databases with FTS5 in order to enjoy 
the new features, right?

AF

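On the INTERSECT and negative-query points above, here is a minimal sketch of two possible rewrites, assuming an FTS4 table and a Python sqlite3 module whose SQLite library was built with FTS4. The table and column names (ftstable, col1, col2) come from the query above; the sample rows and the ids() helper are made up for illustration. It only shows that the rewrites return the same rows on a toy table. Whether they are faster on a large database has not been measured here.

```python
import sqlite3

# Assumption: the underlying SQLite build has FTS4 enabled.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE ftstable USING fts4(col1, col2)")
conn.executemany(
    "INSERT INTO ftstable(col1, col2) VALUES (?, ?)",
    [
        ("apple pie", "dessert recipe"),    # rowid 1
        ("apple cider", "drink menu"),      # rowid 2
        ("cherry pie", "dessert recipe"),   # rowid 3
        ("plum tart", "drink menu"),        # rowid 4
    ],
)


def ids(sql, params=()):
    """Hypothetical helper: run a query and return its rowids, sorted."""
    return sorted(row[0] for row in conn.execute(sql, params))


# Shape used in the post: two single-column MATCH queries joined by INTERSECT
# (here selecting rowid rather than * to keep the comparison simple).
via_intersect = ids(
    "SELECT rowid FROM ftstable WHERE col1 MATCH ? "
    "INTERSECT "
    "SELECT rowid FROM ftstable WHERE col2 MATCH ?",
    ("apple", "recipe"),
)

# Possible rewrite: one MATCH on the whole table, with a column filter
# (column:term) on each term. Terms are implicitly ANDed.
via_column_filter = ids(
    "SELECT rowid FROM ftstable WHERE ftstable MATCH ?",
    ("col1:apple col2:recipe",),
)

# Positive query on one column minus a match on another column,
# emulating "col2 contains recipe AND col1 does NOT contain apple".
positive_minus_negative = ids(
    "SELECT rowid FROM ftstable WHERE col2 MATCH ? "
    "EXCEPT "
    "SELECT rowid FROM ftstable WHERE col1 MATCH ?",
    ("recipe", "apple"),
)

# Negative-only emulation: every rowid except those matching the term.
# This scans the whole table, so it is a workaround, not an index lookup.
negative_only = ids(
    "SELECT rowid FROM ftstable "
    "EXCEPT "
    "SELECT rowid FROM ftstable WHERE col1 MATCH ?",
    ("apple",),
)

print(via_intersect, via_column_filter, positive_minus_negative, negative_only)
```

The column-filter form lets the FTS module evaluate both terms in a single query, while the EXCEPT forms stay within plain SQL set operations and therefore do not depend on a leading NOT being accepted by the MATCH syntax.
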
Context:
"Fts5 will use less memory and be faster than fts4 (I think - initial 
testing has been positive). It will also be smaller, as we can do 
without a bunch of code that is used to workaround problems inherent in 
the file-format. 
...
The most user-visible change is the addition of an API that allows users 
to write their own auxiliary (i.e. snippet(), rank(), offsets()) functions: 
   http://www.sqlite.org/src/artifact/064f9bf705e59d
The included snippet() and rank() functions use this API. 
...
Fts5 is still in the experimental stage at the moment. 
If anybody has any ideas for useful features, or knows of problems with 
FTS4 that could be fixed in FTS5, don't keep them to yourself! - Dan 
Kennedy"
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
