Hi all,

Recently I wanted to try out some modifications to Lucene's postings
format (namely, copying blocks that have no deletions without
int-decoding/encoding -- this is similar to what was described here:
https://issues.apache.org/jira/browse/LUCENE-2082). I started by changing
the Lucene 4.1 postings format to check what can be done there.

I came across the following problem: in Lucene41PostingsReader the length
(number of bytes) of the last, vInt-encoded, block of postings is not known
until all individual postings have been read and decoded. When reading this
block we only know the number of postings that should be read and decoded,
not the number of bytes they occupy, because each vInt takes a variable
number of bytes depending on its value.
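
To illustrate what I mean, here is a minimal standalone sketch (not code
from Lucene41PostingsReader; the class VIntSize and its methods are names I
made up for this example). It assumes the usual vInt layout of 7 data bits
per byte with the high bit as a continuation flag, and shows that two runs
with the same number of values can have different byte lengths:

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper for illustration only; not part of Lucene.
final class VIntSize {

    // Writes one value as a vInt: 7 data bits per byte, low-order bits
    // first, with the high bit set on every byte except the last.
    static void writeVInt(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    // Returns the total number of bytes needed to vInt-encode all values.
    static int encodedLength(int[] values) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int v : values) {
            writeVInt(out, v);
        }
        return out.size();
    }

    public static void main(String[] args) {
        int[] smallValues = {1, 5, 100};     // each fits in one byte
        int[] largeValues = {1, 200, 70000}; // 1 + 2 + 3 bytes
        System.out.println(encodedLength(smallValues)); // prints 3
        System.out.println(encodedLength(largeValues)); // prints 6
    }
}
```

Both arrays hold three values, yet they encode to 3 and 6 bytes, so the
count of postings alone does not tell me how many bytes to copy.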

If I want to copy the whole block without vInt decoding/encoding, I need
to know how many bytes to read from the postings index input. So, my
question is: is there a clean way to determine the length of this block
(i.e. the number of bytes that this block has)? Is the number of bytes in a
posting list tracked somewhere in the Lucene 4.1 postings format?

Thanks,
Aleksandra
