Doron Cohen/Haifa/[EMAIL PROTECTED] wrote on 14/01/2007 23:04:27:

> David <[EMAIL PROTECTED]> wrote on 14/01/2007 20:08:05:
>
> > thanks, How do Lucene give each document an ID  when the document is
> added?
> > Is the document ID unchanged until the document is deleted?
> >
>
> Not exactly.
>
> When the first doc is added, it is assigned id 0.
> Next one assigned id 1, etc.
> When a doc is deleted, it is first only marked as such.
> So if there are 10 docs they have ids 0 to 9.
>
> Now doc 2 and 4 are deleted, - there is no change in ids.
> Next doc added is assigned id 10.
>
> Now if/when the segment containing the deleted docs is merged, all info
on
> those docs is really removed, and docids are modified to remove any holes
> in the numbering - result is: 0 docs with ids 0 to 8. Now, next doc added

That shuld be 9 docs (not 0...)

> gets id 9.
>
> Btw, segments are merged either as result of explicit call to optimize(),
> or implicitly following addDoc or indexWriter.close() (and depending on
> Lucene's merge policy).
>
> Docids are therefore internal, with unstable values.
>
> See also the FAQ - http://wiki.apache.org/jakarta-lucene/LuceneFAQ
> Especially "When is it possible for document IDs to change?"
>

Also take a look at the http://lucene.apache.org/java/docs/fileformats.html
- at the "Document Numbers" section.


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to