Thank you, this helped a lot and confirmed what I expected.
Best, Martin



________________________________
From: D. Richard Hipp <d...@hwaci.com>
To: General Discussion of SQLite Database <sqlite-users@sqlite.org>
Sent: Tuesday, May 26, 2009, 12:27:59
Subject: Re: [sqlite] FTS3


On May 26, 2009, at 5:03 AM, Martin Pfeifle wrote:

> Dear all,
> we need full and fuzzy text search for addresses.
> Currently we are looking into Lucene and SQLite's FTS extension.
> For us it is crucial to understand the file structures and the  
> concepts behind the libraries.
> Is there a self-contained, comprehensive document for FTS3 (besides  
> the comments in fts3.c) ?

There is no information on FTS3 apart from the code comments and the  
README files in the source tree.

The file formats for FTS3 and Lucene are completely different at the  
byte level.  But if you dig deeper, you will find that they both use  
the same underlying concepts and ideas and really are two different  
implementations of the same algorithm.  During development, we were  
constantly testing the performance and index size of FTS3 against  
CLucene using the Enron email corpus.  Our goal was for FTS3 to run  
significantly faster than CLucene and to generate an index that was no  
larger in size.  That goal was easily met at the time, though we have  
not tested FTS3 against CLucene lately to see if anything has changed.

One of the issues with CLucene that FTS3 sought to address was that  
when inserting new elements into the index, the insertion time was  
unpredictable.  Usually the insertions would be very fast.  But Lucene  
will occasionally take a very long time for a single insertion in  
order to merge multiple smaller indices into larger indices.  This was  
seen as undesirable.  FTS3 strives to give much better worst-case  
insertion times by doing index merges incrementally and spreading the  
cost of index merges across many inserts.
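
To illustrate the trade-off described above, here is a minimal toy cost
model in Python. It is not taken from the FTS3 or Lucene sources; the
function names, the fanout of 4, and the per-insert budget are all
assumptions chosen for the example. It counts "work units" per insert
when a full level of small indices is merged all at once, and again when
the same merge work is queued and paid off a few units per insert.

```python
def eager_costs(n_inserts, fanout=4):
    """Work per insert when a full level is merged immediately (toy model)."""
    levels = [[]]   # levels[i] holds the sizes of the segments at level i
    costs = []
    for _ in range(n_inserts):
        work = 1                    # writing the new one-document segment
        levels[0].append(1)
        lvl = 0
        # A full level is merged into one segment on the next level,
        # which may in turn fill that level and cascade further.
        while len(levels[lvl]) == fanout:
            merged = sum(levels[lvl])
            work += merged          # cost of rewriting every merged document
            levels[lvl] = []
            if lvl + 1 == len(levels):
                levels.append([])
            levels[lvl + 1].append(merged)
            lvl += 1
        costs.append(work)
    return costs


def incremental_costs(n_inserts, fanout=4, budget=4):
    """Same merge work, but each insert pays at most `budget` units of it."""
    backlog = 0     # merge work that is due but not yet done
    costs = []
    for work in eager_costs(n_inserts, fanout):
        backlog += work - 1         # merge work that became due on this insert
        step = min(backlog, budget)
        backlog -= step
        costs.append(1 + step)
    return costs, backlog


if __name__ == "__main__":
    eager = eager_costs(64)
    incremental, leftover = incremental_costs(64)
    print("worst insert, eager merge:      ", max(eager))
    print("worst insert, incremental merge:", max(incremental))
    print("merge work still pending:       ", leftover)
```

In this model the total amount of merge work is the same in both cases;
only its distribution over inserts differs. The worst single insert in
the eager variant grows with the size of the index, while in the
incremental variant it is capped at 1 + budget, at the price of some
merge work remaining pending. The model ignores the extra query cost of
searching segments that have not been merged yet.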

D. Richard Hipp
d...@hwaci.com



_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users



      