One of our indexes is frequently rebuilt from scratch (a "batch update" or 
"re-index"). 
Each time, more than 2 million documents are added to or updated in that index, 
which creates an immense I/O load on our system. Does it make sense to set the 
merge scheduler to NoMergeScheduler (and/or the MergePolicy to NoMergePolicy)? 
Or is merging "not relevant" because the commit is done only at the very end?

Context information:
At the moment the writer's config looks like this:
IndexWriterConfig config = new IndexWriterConfig( IndexManager.CURRENT_LUCENE_VERSION, analyzer );
config.setMergePolicy( NoMergePolicy.NO_COMPOUND_FILES );
//config.setMergeScheduler( NoMergeScheduler.INSTANCE );
config.setRAMBufferSizeMB( 20 );

The update logic is as follows:
indexWriter.deleteAll();
...
for each element {
    ...
    indexWriter.updateDocument( term, doc ); // to avoid duplicate entries
    ...
}
indexWriter.commit();
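
To pin down the semantics I am assuming for the loop above, here is a toy in-memory model. It is not Lucene code: the class BatchUpdateModel and its methods are hypothetical stand-ins. As far as I understand, updateDocument behaves like "delete by term, then add", and nothing is visible to readers of the committed index until commit. Please correct me if that assumption is wrong.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of the re-index loop; hypothetical names, NOT the Lucene API.
final class BatchUpdateModel {
    // What readers of the committed index currently see: key ("term") -> document.
    private Map<String, String> committed = new LinkedHashMap<>();
    // Uncommitted state of the writer; starts as a copy of the committed state.
    private Map<String, String> pending = new LinkedHashMap<>();

    // Models indexWriter.deleteAll(): drops every document in the pending state.
    void deleteAll() {
        pending = new LinkedHashMap<>();
    }

    // Models indexWriter.updateDocument(term, doc): delete by key, then add.
    void updateDocument(String key, String doc) {
        pending.remove(key);
        pending.put(key, doc);
    }

    // Models indexWriter.commit(): the pending state becomes visible.
    void commit() {
        committed = new LinkedHashMap<>(pending);
    }

    // Read-only view of what has been committed so far.
    Map<String, String> visible() {
        return Collections.unmodifiableMap(committed);
    }

    public static void main(String[] args) {
        BatchUpdateModel index = new BatchUpdateModel();
        index.deleteAll();
        index.updateDocument("a", "first version");
        index.updateDocument("b", "other document");
        index.updateDocument("a", "second version"); // replaces the earlier "a"
        index.commit();
        System.out.println(index.visible());
    }
}
```

If this model is right, then after deleteAll() the updateDocument call only matters when the same term occurs twice within one batch; otherwise a plain add would give the same result.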

What is the recommended way to perform such a batch update?
