I'm using Lucene 1.9.1, and I'm seeing some odd behavior that I hope someone can help me with.

My application depends on Lucene keeping documents in exactly the order in which I add them. Lucene is supposed to maintain document order, even across index merges, correct?

My indexing process works as follows (some of this is a holdover from the time before Lucene had a compound file format, so bear with me):

I open a file-based index with a merge factor of 90 and, in my current test, the compound file format. Once I have added 100,000 documents, I close that index and start a new one. I continue this way until all of the documents have been indexed. Then, as a last step, I open a new, empty index and call addIndexes(Directory[]), passing in the directories in the same order in which I created them.


This allows me to use higher merge factors without running into file handle issues, and without having to call optimize.

The problem I am seeing right now is that when I look at my large combined index with Luke, document number 899 is the 899th document that I added. However, document 900 is the 49860th document that I added. This continues until document 910, where it suddenly jumps to the 99720th document.
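
A check along the following lines could pin down every point where the order breaks without paging through Luke by hand. This is only a minimal sketch in plain Java. It assumes that each document carries a stored field holding its insertion sequence number, and that those values have already been read back into an array in document-number order. The names DocOrderCheck and firstGap are hypothetical and are not part of the Lucene API.

```java
// Hypothetical helper, not part of Lucene.
// Input: the insertion sequence numbers read back from the merged index,
// one per document, in document-number order (assumed to be available).
final class DocOrderCheck {
    private DocOrderCheck() {}

    /**
     * Returns the first array position whose sequence number is not exactly
     * one greater than the previous entry, or -1 if the run is contiguous.
     */
    static int firstGap(int[] seqByDocNumber) {
        for (int i = 1; i < seqByDocNumber.length; i++) {
            if (seqByDocNumber[i] != seqByDocNumber[i - 1] + 1) {
                return i;
            }
        }
        return -1;
    }
}
```

Running this over the sequence numbers from the combined index, and again over each of the smaller indexes, would show whether the reordering is already present before addIndexes(Directory[]) is called.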

Is this a bug, or am I misusing something in the API?

Thanks,

Dan


--
****************************
Daniel Armbrust
Biomedical Informatics
Mayo Clinic Rochester
daniel.armbrust(at)mayo.edu
http://informatics.mayo.edu/
