Sorry about that. In my original message, I highlighted the relevant parts, but the highlighting probably didn't make it to the mail archive.
I would expect the note to state the following (unless I misunderstood some of the details):

"Once data is loaded into memory, you can lookup the start pointer of any document chunk by performing two binary searches: a first one based on the values of DocBase in order to find the right block, and then inside the block based on DocBaseDeltas (by reconstructing the doc bases for every chunk)."

instead of:

"Once data is loaded into memory, you can lookup the start pointer of any document by performing two binary searches: a first one based on the values of DocBase in order to find the right block, and then inside the block based on DocBaseDeltas (by reconstructing the doc bases for every chunk)."

The difference between the two is the added word 'chunk' after the word 'document'.

Thanks,
Roman Margolis

On Fri, Nov 24, 2017 at 11:24 AM, Adrien Grand <jpou...@gmail.com> wrote:

> Hi Roman,
>
> It's unclear to me what modification you are suggesting, could you please
> share what the updated comment would look like?
>
> On Wed, Nov 22, 2017 at 14:17, Roman Margolis <roman.margo...@gmail.com>
> wrote:
>
> > Hi,
> >
> > I was reading some internal info about Lucene, and was confused by a note
> > on this page:
> >
> > https://lucene.apache.org/core/7_1_0/core/org/apache/lucene/codecs/compressing/CompressingStoredFieldsIndexWriter.html
> >
> > The note (the last note at the bottom) says:
> >
> > - Once data is loaded into memory, you can lookup the start pointer of
> > any document by performing two binary searches: a first one based on the
> > values of DocBase in order to find the right block, and then inside the
> > block based on DocBaseDeltas (by reconstructing the doc bases for every
> > chunk).
> >
> > Shouldn't it say chunk, or document chunk (referring to document chunks in
> > the field data file)?
> >
> > Thanks in advance,
> > Roman Margolis
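
For reference, here is a rough Java sketch of the two-step lookup the note describes, as I understand it. It is only an illustration: ChunkIndexSketch, Block, floorIndex and startPointer are hypothetical names, not Lucene's classes, and I assume the per-chunk doc bases and start pointers have already been reconstructed from the stored deltas. I also assume every chunk holds at least one document, so doc bases are strictly increasing.

```java
import java.util.Arrays;

/** Hypothetical in-memory view of the stored fields index; not Lucene's actual code. */
final class ChunkIndexSketch {

    /** One index block covering a run of consecutive chunks. */
    static final class Block {
        final int docBase;               // first doc ID covered by this block
        final int[] chunkDocBases;       // first doc ID of each chunk, relative to docBase
        final long[] chunkStartPointers; // file offset at which each chunk starts

        Block(int docBase, int[] chunkDocBases, long[] chunkStartPointers) {
            this.docBase = docBase;
            this.chunkDocBases = chunkDocBases;
            this.chunkStartPointers = chunkStartPointers;
        }
    }

    private final Block[] blocks;
    private final int[] blockDocBases; // docBase of every block, in increasing order

    ChunkIndexSketch(Block[] blocks) {
        this.blocks = blocks;
        this.blockDocBases = new int[blocks.length];
        for (int i = 0; i < blocks.length; i++) {
            blockDocBases[i] = blocks[i].docBase;
        }
    }

    /** Index of the last element <= key in a sorted array, or -1 if there is none. */
    static int floorIndex(int[] sorted, int key) {
        int idx = Arrays.binarySearch(sorted, key);
        // When the key is absent, binarySearch returns -(insertionPoint) - 1.
        return idx >= 0 ? idx : -idx - 2;
    }

    /** Start pointer of the chunk containing docID (not of the document itself). */
    long startPointer(int docID) {
        // First binary search: find the block whose docBase is the largest one <= docID.
        int b = floorIndex(blockDocBases, docID);
        if (b < 0) {
            throw new IllegalArgumentException("docID precedes the first block: " + docID);
        }
        Block block = blocks[b];
        // Second binary search: find the chunk inside that block.
        // No upper-bound check on docID is done in this sketch.
        int c = floorIndex(block.chunkDocBases, docID - block.docBase);
        return block.chunkStartPointers[c];
    }
}
```

With two blocks whose chunks start at doc IDs 0, 4, 9 and 15, 21, documents 0 through 3 all resolve to the same start pointer. That is why I think the note should say "document chunk": the lookup returns the start of the chunk, and the document still has to be located inside it.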