Hi Everyone,

We have been using Lucene integrated with our application for over a year
now. Indexing and searching were pretty fast until recently, but now we are
running into scalability issues. We have a job that adds around 20,000
documents to the index every day. There are 2 processes: one that downloads
articles and another that processes the articles and adds them to the
index. The processing time for the articles has increased as the index has
grown.

We have been using the default values for the mergeFactor, maxMergeDocs and
minMergeDocs parameters in the IndexWriter. Since our indexing load has
increased, would it be a better idea to modify these values? We are using a
dual core, 4 CPU machine with 4GB RAM for the Lucene indexing. What would be
the optimum values for these indexing parameters for this kind of indexing
workload?

Any pointers would be appreciated.

Thanks,
Harini
