Just for information:
A full-text indexer based only on the SQLite B-tree index, without using tables:
http://www.codeproject.com/useritems/Text_Indexer.asp
-
To unsubscribe, send email to [EMAIL PROTECTED]
---
Scott Hess wrote:
>>I am optimistic that the proper implementation will use even less than 50%:
>
>Indeed :-).
Glad to read this ;-)
>>I found that _not_ adding the original text turned out to be a great time
>>saver. This makes sense if we know that the original text is about 4 times
>>the si
On 3/13/07, Ralf Junker <[EMAIL PROTECTED]> wrote:
Scott Hess wrote:
>Keeping track of that information would probably double the
>size of the index.
With your estimate, the SQLite full text index (without document storage) would
still take up only 50% of the documents' size. In my opinion, this
Hello Scott,
I was hoping that you would read my message, many thanks for your reply!
>UPDATE and DELETE need to have the previous document text, because the
>docids are embedded in the index, and there is no docid->term index
>(or, put another way, the previous document text _is_ the docid->term
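To illustrate the point quoted above, here is a minimal sketch of a generic inverted index in Python. It is not FTS code; the class and function names (InvertedIndex, tokenize) are made up for this example. Because only a term->docid mapping is stored, delete() has to be handed the old document text to know which posting lists to touch.

```python
from collections import defaultdict


def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    return text.lower().split()


class InvertedIndex:
    """Toy term -> {docid} index with no docid -> term mapping."""

    def __init__(self):
        self.postings = defaultdict(set)

    def insert(self, docid, text):
        for term in tokenize(text):
            self.postings[term].add(docid)

    def delete(self, docid, old_text):
        # Without old_text we would have to scan every posting list;
        # the previous text tells us exactly which terms to revisit.
        for term in set(tokenize(old_text)):
            docs = self.postings.get(term)
            if docs is None:
                continue
            docs.discard(docid)
            if not docs:
                del self.postings[term]

    def search(self, term):
        return sorted(self.postings.get(term.lower(), ()))


idx = InvertedIndex()
idx.insert(1, "full text index")
idx.insert(2, "text storage")
idx.delete(1, "full text index")
print(idx.search("text"))
print(idx.search("full"))
```

An UPDATE in this model is a delete() with the old text followed by an insert() with the new text, which is why the old text has to be available from somewhere.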
Ion Silvestru wrote:
>Just a question: did you eliminate stop-words in your tests?
No, I did not eliminate any stop-words. The two test runs were equal except for
the small changes in FTS 2.
My stop words question was not intended for source code but for human language
texts.
Ralf
--
On 3/13/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
Ion Silvestru <[EMAIL PROTECTED]> wrote:
> To Ralf:
> >As a side effect, the offsets() and snippet() functions stopped working,
> >as they seem to rely on the presence of the full document text in the
> >current implementation.
>
> Did you
Ion Silvestru <[EMAIL PROTECTED]> wrote:
> To Ralf:
>
> >As a side effect, the offsets() and snippet() functions stopped working, as
> >they seem to rely on the presence of the full document text in the current
> >implementation.
>
> Did you test "phrase" searching on the index-only version,
To Ralf:
>As a side effect, the offsets() and snippet() functions stopped working, as
>they seem to rely on the presence of the full document text in the current
>implementation.
Did you test "phrase" searching on the index-only version? Doesn't this
kind of search rely on offsets()?
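As background for the question above: in a generic positional inverted index, phrase matching can be done from stored term positions alone, without the original text. The sketch below shows that idea in Python. It makes no claim about how FTS stores positions or how offsets() is implemented, and the function names are invented for this example.

```python
from collections import defaultdict


def build_positional_index(docs):
    # term -> docid -> list of word positions (0-based)
    index = defaultdict(lambda: defaultdict(list))
    for docid, text in docs.items():
        for pos, term in enumerate(text.lower().split()):
            index[term][docid].append(pos)
    return index


def phrase_search(index, phrase):
    terms = phrase.lower().split()
    if not terms or any(t not in index for t in terms):
        return []
    hits = []
    # Try every position of the first term as a candidate phrase start.
    for docid, starts in index[terms[0]].items():
        for start in starts:
            if all(start + i in index[t].get(docid, ())
                   for i, t in enumerate(terms)):
                hits.append(docid)
                break
    return sorted(hits)


docs = {1: "the full text index", 2: "text of the full document"}
index = build_positional_index(docs)
print(phrase_search(index, "full text"))
```

Only the positions are consulted here; the document strings are needed at indexing time but not at query time.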
---
>Just a question: did you eliminate stop-words in your tests?
Sorry, you specified that you indexed source code files, so no
stop-words are applicable here.
---
Thank you.
Just a question: did you eliminate stop-words in your tests?
>Concluding: Given the great database size savings possible by separating full
>text index from data storage, I wish that
>developers would consider adding such an option to the SQLite FTS interface.
If such an option wil
>But what about:
>
>I am very interested to know if it would be possible to use an FTS indexing
>module to store the inverted index only, but
>not the document's text. This would save disk space if the text to index is
>stored on disk rather than inside the database.
This is possible with just