"Doug Cutting" <[EMAIL PROTECTED]> wrote: > Michael McCandless wrote: > > One thing I have been wondering is whether it really is necessary to > > sort the term vectors before writing to the index.... > > Terms in vectors are prefix-compressed. So not sorting would make > indexes bigger, and slower to read & write. > > http://lucene.apache.org/java/docs/fileformats.html#Term%20Vectors
Duh, I forgot about that :)  So I think we should indeed continue to
write them sorted.

> Also, having them sorted makes it much easier to do dot products between
> document vectors, a potentially common operation.

True.

Mike
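
To illustrate the dot product point: when both vectors keep their terms in sorted order, the product can be computed with a single merge-style pass and no hash lookups. The sketch below is only an assumption about how one might do it. The class name and the parallel-array layout (sorted terms plus matching frequencies) are hypothetical and not taken from the Lucene API.

```java
// Hypothetical illustration only; not part of Lucene.
final class SortedVectorDotProduct {

    // Dot product of two sparse term vectors. Each vector is given as
    // parallel arrays: terms in ascending String order, and the
    // frequency of each term at the same position.
    static long dot(String[] termsA, int[] freqsA,
                    String[] termsB, int[] freqsB) {
        long sum = 0;
        int i = 0;
        int j = 0;
        while (i < termsA.length && j < termsB.length) {
            int cmp = termsA[i].compareTo(termsB[j]);
            if (cmp == 0) {
                // Term occurs in both documents: it contributes to the sum.
                sum += (long) freqsA[i] * freqsB[j];
                i++;
                j++;
            } else if (cmp < 0) {
                i++; // term only in A, skip it
            } else {
                j++; // term only in B, skip it
            }
        }
        return sum;
    }
}
```

Each term is visited at most once, so the cost is linear in the combined length of the two vectors. With unsorted vectors one side would first have to be sorted or loaded into a map.
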