I'm using Lucene 1.9.1, and I'm seeing some odd behavior that I hope
someone can help me with.
My application counts on Lucene maintaining the order of the documents
exactly the same as how I insert them. Lucene is supposed to maintain
document order, even across index merges, correct?
My indexing process works as follows (and some of this is hold-over from
the time before lucene had a compound file format - so bear with me)
I open up a File based index - using a merge factor of 90, and in my
current test, the compound index format. When I have added 100,000
documents, I close this index, and start on a new index. I continue
this until I'm done with all of the documents. Then, as a last step, I
open up a new empty index, and I call addIndexes(Directory[]) - and I
pass in the directories in the same order that I created them.
This allows me to use higher merge factors without running into file
handle issues, and without having to call optimize.
The problem that I am seeing right now, is that when I look into my
large combined index with Luke, Document number 899 is the 899th
document that I added. However, Document 900 is the 49860th document
that I added. This continues until Document 910, where it suddenly
jumps to the 99720th document.
Is this a bug, or am I misusing something in the API?
Thanks,
Dan
--
****************************
Daniel Armbrust
Biomedical Informatics
Mayo Clinic Rochester
daniel.armbrust(at)mayo.edu
http://informatics.mayo.edu/
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]