Michael McCandless wrote:
> One thing I have been wondering is whether it really is necessary to
> sort the term vectors before writing to the index....

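Terms in vectors are prefix-compressed: each term is written as the length of the prefix it shares with the previous term, plus the remaining suffix. Sorted order keeps similar terms adjacent, so the shared prefixes are long. Not sorting would make indexes bigger, and slower to read and write.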

http://lucene.apache.org/java/docs/fileformats.html#Term%20Vectors
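
To make the size argument concrete, here is a rough sketch in Python (not Lucene's actual code, and not its exact on-disk layout, which also stores suffix lengths and frequencies). The function names and the sample terms are made up for illustration. It only counts how many suffix characters have to be written when the same terms are encoded sorted versus unsorted.

```python
def shared_prefix_len(a, b):
    """Number of leading characters that a and b have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def prefix_compress(terms):
    """Encode each term as (prefix length shared with previous term, suffix)."""
    encoded = []
    prev = ""
    for term in terms:
        k = shared_prefix_len(prev, term)
        encoded.append((k, term[k:]))
        prev = term
    return encoded


def prefix_decompress(encoded):
    """Rebuild the original term list from (prefix length, suffix) pairs."""
    terms = []
    prev = ""
    for k, suffix in encoded:
        term = prev[:k] + suffix
        terms.append(term)
        prev = term
    return terms


def suffix_chars(encoded):
    """Total number of suffix characters that would have to be stored."""
    return sum(len(suffix) for _, suffix in encoded)


# Hypothetical sample terms, in the order they might appear in a document.
terms = ["index", "search", "indexing", "searcher", "indexed", "searching"]

unsorted_cost = suffix_chars(prefix_compress(terms))
sorted_cost = suffix_chars(prefix_compress(sorted(terms)))
print("unsorted:", unsorted_cost, "sorted:", sorted_cost)
```

With these sample terms the unsorted order shares no prefixes at all, while the sorted order stores "indexed" as just a prefix length plus "ed".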

Also, having them sorted makes it much easier to do dot products between document vectors, a potentially common operation.
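
The reason sorting helps here is that two sorted term lists can be walked in step, like a merge, so the dot product takes one linear pass and no hash lookups. A minimal sketch in Python, assuming each vector is a list of (term, frequency) pairs sorted by term; the function name and the sample documents are hypothetical, and this is not the Lucene API:

```python
def sorted_dot(a, b):
    """Dot product of two term vectors given as (term, freq) lists sorted by term."""
    i = j = 0
    total = 0
    while i < len(a) and j < len(b):
        term_a, freq_a = a[i]
        term_b, freq_b = b[j]
        if term_a == term_b:
            # Term occurs in both documents: it contributes to the product.
            total += freq_a * freq_b
            i += 1
            j += 1
        elif term_a < term_b:
            # term_a cannot appear later in b, since b is sorted.
            i += 1
        else:
            j += 1
    return total


# Hypothetical term vectors for two documents.
doc1 = [("index", 2), ("lucene", 1), ("term", 3)]
doc2 = [("lucene", 4), ("sort", 1), ("term", 2)]
print(sorted_dot(doc1, doc2))
```

If the vectors were not sorted, one of them would first have to be loaded into a map (or sorted at read time) before the same computation could be done.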

Doug
