Hi all,
We've been waiting for LUCENE-1879 and LUCENE-2425 and have written our own
ParallelWriter class in the meantime. Apparently our indexes are falling out
of sync (I suspect my colleague is seeing error messages come from
ParallelReader stating the the number of documents must be the same).
Here's a code snippet from our ParallelWriter which extends Object:
writer1 = new IndexWriter(dir, analyzer,
create,
new IndexWriter.MaxFieldLength(MFL));
writer1.setMergePolicy(new LogDocMergePolicy());
writer1.setMergeScheduler(new SerialMergeScheduler());
writer1.setMaxBufferedDocs(MBD);
writer1.setRAMBufferSizeMB(IndexWriter.DISABLE_AUTO_FLUSH);
My colleague suspects that merging or flushing is being triggered on something
other than the doc count which leads to the writers' different behaviors. I
suspect our next step is to scatter breakpoints around Lucene source (we've got
tr...@926791 to take advantage of latest NRT readers).
Does anyone have ideas on how the indexes would get out of sync? Process
close, committing, optimizing,... they all should work okay?
Thanks,
Justin
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]