Re: Lucene's internal doc ID space

Michael McCandless Sat, 12 May 2012 13:13:20 -0700

On Sat, May 12, 2012 at 9:12 AM, Valeriy Felberg
<valeri.felb...@gmail.com> wrote:
>> the Document IDs in Lucene are per segment. ie. they are always
>> segment based.
>
> @Simon I'm just wondering: If the document IDs are per segment how
> does it work if I call Searcher.search(Query, int) and get TopDocs
> referencing ScoreDocs which contain document IDs? What happens if
> there are two matching documents in different segments? How does
> Lucene know which segment is meant if I call Searcher.doc(docId) with
> some docId from the search result?


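The per-segment docIDs are "rebased" before Searcher.search returns,
i.e. each one is turned into a global docID against the top-level reader
by adding the starting offset (doc base) of its segment. Searcher.doc(docId)
then maps that global docID back to the right segment.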
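
To make the rebasing concrete, here is a rough sketch in plain Java. It
does not use Lucene's own classes; the class and method names
(DocIdRebase, docBases, toGlobal, segmentOf) are made up for
illustration, and it assumes every segment holds at least one document.

```java
// Hypothetical illustration of docID rebasing; not Lucene's actual code.
final class DocIdRebase {

    // bases[i] = total number of docs in all segments before segment i.
    static int[] docBases(int[] maxDocs) {
        int[] bases = new int[maxDocs.length];
        int sum = 0;
        for (int i = 0; i < maxDocs.length; i++) {
            bases[i] = sum;
            sum += maxDocs[i];
        }
        return bases;
    }

    // Per-segment docID -> global docID.
    static int toGlobal(int[] bases, int segment, int localDocId) {
        return bases[segment] + localDocId;
    }

    // Global docID -> index of the segment that holds it: the last
    // segment whose base is <= the global docID. A linear scan keeps
    // the sketch short; a binary search would give the same answer.
    static int segmentOf(int[] bases, int globalDocId) {
        for (int i = bases.length - 1; i >= 0; i--) {
            if (bases[i] <= globalDocId) {
                return i;
            }
        }
        throw new IllegalArgumentException("negative docID: " + globalDocId);
    }

    public static void main(String[] args) {
        int[] bases = docBases(new int[] {5, 3, 4}); // three segments
        int global = toGlobal(bases, 1, 2);          // doc 2 of segment 1
        int segment = segmentOf(bases, global);
        int local = global - bases[segment];
        System.out.println(global + " -> segment " + segment + ", local " + local);
    }
}
```

So two hits in different segments never collide: each gets its
segment's base added, and the lookup subtracts it again.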

Also: when a merge runs, it removes any deleted documents (thus
renumbering all non-deleted docIDs), so docIDs can change after a merge.
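
As a rough picture of that renumbering, again in plain Java rather than
Lucene's own code (MergeRenumber and renumber are made-up names, and the
boolean array stands in for whatever tracks live documents):

```java
// Hypothetical illustration of docID renumbering during a merge.
final class MergeRenumber {

    // live[i] is true if old docID i has not been deleted.
    // Returns old docID -> new docID, with -1 for deleted documents.
    static int[] renumber(boolean[] live) {
        int[] newIds = new int[live.length];
        int next = 0;
        for (int i = 0; i < live.length; i++) {
            // Surviving docs are packed together, closing the gaps.
            newIds[i] = live[i] ? next++ : -1;
        }
        return newIds;
    }
}
```

Every document after the first deleted one ends up with a smaller
docID than it had before the merge.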

Mike McCandless

http://blog.mikemccandless.com

