Hi everyone,

We have been using Lucene integrated with our application for over a year now. Indexing and searching were fast until recently, but we are now running into scalability issues. We have a job that adds around 20,000 documents to the index every day. There are two processes: one downloads articles, and the other processes the articles and adds them to the index. The processing time for the articles has increased because the index keeps growing.
We have been using the default values for the mergeFactor, maxMergeDocs and minMergeDocs parameters in the IndexWriter. Now that our indexing load has increased, would it be better to change these values? We run the Lucene indexing on a dual-core, 4-CPU machine with 4 GB of RAM. What would be the optimal values for these parameters for this kind of indexing workload? Any pointers would be appreciated.

Thanks,
Harini