I'd add to Michael's mail the *strong* recommendation that you provide your own unique doc IDs and use *those* instead. It'll save you a world of grief. Whenever you need to add a new doc to an existing index, you can get the maximum of *your* unique IDs and increment it yourself.
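To make the "maximum of *your* IDs, plus one" idea concrete, here is a minimal sketch. It deliberately does not use the Lucene API: the index is modeled as a plain in-memory list of field maps, and the field name `my_id` and the helpers `maxId`, `addDocument` and `deleteById` are all hypothetical names made up for illustration. With a real index you would store the ID as an ordinary field and look it up there.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class OwnDocIds {
    // Hypothetical name of the field holding the application-assigned ID.
    static final String ID_FIELD = "my_id";

    // In-memory stand-in for an index: one field-name -> value map per document.
    static final List<Map<String, String>> index = new ArrayList<>();

    // Largest application-assigned ID currently in the index, or -1 if there is none.
    static long maxId() {
        long max = -1;
        for (Map<String, String> doc : index) {
            String id = doc.get(ID_FIELD);
            if (id != null) {
                max = Math.max(max, Long.parseLong(id));
            }
        }
        return max;
    }

    // Adds a document under the next free ID and returns that ID.
    static long addDocument(String body) {
        long id = maxId() + 1;
        Map<String, String> doc = new HashMap<>();
        doc.put(ID_FIELD, Long.toString(id));
        doc.put("body", body);
        index.add(doc);
        return id;
    }

    // Removes the document carrying the given application-assigned ID, if any.
    static void deleteById(long id) {
        String key = Long.toString(id);
        index.removeIf(doc -> key.equals(doc.get(ID_FIELD)));
    }
}
```

The IDs of the remaining docs do not move when another doc is deleted, which is the property you want here. One caveat of scanning for the maximum: if the doc holding the highest ID is deleted, that ID will be handed out again.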
One thing to remember is that not all Lucene docs need to have the same fields. So it's even possible to have a *very special* document that contains meta-data about your index, say the last ID you generated, and to keep that meta-data doc up to date. If you put fields in that doc that are NOT in any other doc, you don't have to worry about accidentally getting this meta-data doc in your searches.

Best
Erick

On Tue, Aug 19, 2008 at 8:01 AM, Ivan Vasilev <[EMAIL PROTECTED]> wrote:

> Hi Lucene Guys,
>
> I have a question that is simple but important for me. I did not find
> the answer in the javadoc, so I am asking here.
> When adding Documents with the method IndexWriter.addDocument(doc), do
> the documents obtain Lucene IDs in the order that they are added to the
> IndexWriter? I mean, will the first added doc have Lucene ID 0, the
> second Lucene ID 1, etc.?
>
> Below I describe why I am asking this.
> We plan to split our index into two separate indexes that will be read
> by the ParallelReader class. This is because one of them will contain
> field(s) that will be indexed and stored and will be changed frequently.
> So, to always have correct data returned from the ParallelReader when
> changing documents in the small index, the Lucene IDs of these docs have
> to remain the same.
> To do this, Karl Wettin suggests a solution described in LUCENE-879
> <https://issues.apache.org/jira/browse/LUCENE-879>. I do not like this
> solution because it requires changing Lucene source code, and after
> each refactoring I will potentially have problems. The solution is
> related to optimizing the index, so it will not be significantly faster
> than the one that I prefer, which is:
> 1. Read the whole index and reconstruct the documents, including index
>    data, by using the TermDocs and TermEnum classes;
> 2. Change the needed documents;
> 3. Index the documents in a new index that will replace the initial one.
> I can even simplify this algorithm (and speed it up) if all the fields
> are always stored: I can read just the stored data, use it to
> reconstruct the content of the docs, and re-index them in a new index.
>
> But anyway, everything in my approaches depends on this: are Lucene IDs
> in the index ordered in the same way as docs are added to the
> IndexWriter?
>
> Thanks in advance,
> Ivan