postings without position information ?

robert engels Thu, 07 Feb 2008 10:44:08 -0800

I think there are many uses of Lucene that would benefit from 'enum' fields, aka categories.


When documents are classified, they often belong to one or more categories.

Lucene could write these postings very efficiently using VINT and RLE (run-length encoding) if the position information was not stored (since it is not really useful in these typical cases).

StartingDocNum|NumberOfDocuments ... StartingDocNum|NumberOfDocuments

using a bit of the StartingDocNum to know whether the entry is a series.

When a lot of documents are in the same category, and they are added at the same time, the document numbers would be nearly sequential, allowing very efficient compression.
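
To make the idea concrete, here is a rough sketch in Python (not Lucene code, and not our old implementation). It assumes the low bit of each VINT-encoded start value flags a series, and that the count follows only when the flag is set. Storing the start as a gap from the end of the previous run is my own assumption to keep the numbers small; the function names are made up for illustration.

```python
def write_vint(out, value):
    """Append value as a variable-length int: 7 bits per byte, high bit set = more bytes follow."""
    while value >= 0x80:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)


def read_vint(data, pos):
    """Read one variable-length int starting at pos; return (value, next_pos)."""
    value, shift = 0, 0
    while True:
        b = data[pos]
        pos += 1
        value |= (b & 0x7F) << shift
        if b < 0x80:
            return value, pos
        shift += 7


def encode_postings(doc_ids):
    """Encode a strictly increasing list of doc numbers as runs, with no positions."""
    out = bytearray()
    prev_end = 0  # first doc number not covered by an earlier run
    i = 0
    while i < len(doc_ids):
        # Extend the run while doc numbers stay consecutive.
        j = i
        while j + 1 < len(doc_ids) and doc_ids[j + 1] == doc_ids[j] + 1:
            j += 1
        run_len = j - i + 1
        gap = doc_ids[i] - prev_end  # assumption: start stored as a gap
        if run_len > 1:
            write_vint(out, (gap << 1) | 1)  # low bit 1: a series, count follows
            write_vint(out, run_len)
        else:
            write_vint(out, gap << 1)        # low bit 0: a single document
        prev_end = doc_ids[i] + run_len
        i = j + 1
    return bytes(out)


def decode_postings(data):
    """Inverse of encode_postings: expand the runs back into doc numbers."""
    docs, pos, prev_end = [], 0, 0
    while pos < len(data):
        header, pos = read_vint(data, pos)
        start = prev_end + (header >> 1)
        run_len = 1
        if header & 1:
            run_len, pos = read_vint(data, pos)
        docs.extend(range(start, start + run_len))
        prev_end = start + run_len
    return docs
```

With this layout a category whose documents were all added together collapses to a single start/count pair, while an isolated document costs only its start value.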

Has anyone worked on this? Our previous custom IndexReader/Writer supported it, and I was wondering if this has made it into the core. I checked the docs/email and could not find anything.


Thanks.

Robert




