As of Lucene 2.1 you can make optimal use of RAM by monitoring IndexWriter.ramSizeInBytes() and calling IndexWriter.flush() when memory is tight.
This avoids the issue of trying to estimate a value for maxBufferedDocs which you think can fit into RAM. Cheers Mark ----- Original Message ---- From: Laxmilal Menaria <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Monday, 12 March, 2007 11:27:21 AM Subject: Re: Scalability Issues with Indexing I think you can try MergeFactor =1000 MaxMergeDocs=2147483647 MaxBufferedDocs=1000 --LM On 3/12/07, Harini Raghavan <[EMAIL PROTECTED]> wrote: > > Hi Everyone, > > We have been using Lucene integrated with our application for over a year > now. The indexing and searching has been pretty fast until recently. But > now > we are having some scalability issues. We have a job that indexes around > 20000 documents in to index every day. There are 2 processes, one that > download articles and another one that processes the articles and adds > them > to the index. The processing time for the articles has increased because > of > the growing index. > > We have been using default values for mergeFactor, maxMergeDocs and > minMergeDocs parameters in the IndexWriter. But since our indexing load > has > increased, would it be a better idea to modify these values? We are using > a > dual core, 4 CPU machine with 4GB RAM for the lucene indexing. What would > be > the optimum values for the indexing parameters we should be using for this > kind of Indexing requirement? > > Any pointers would be appreciated. > > Thanks, > Harini > ___________________________________________________________ The all-new Yahoo! Mail goes wherever you go - free your email address from your Internet provider. http://uk.docs.yahoo.com/nowyoucan.html --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]