Nick!

I had the same problem. Now, in my SearchEngine class, whenever I
write a document to the index, I check whether the number of documents
mod 100 is 0. If it is, I call optimize().

optimize() merges the index segments, which reduces the number of files
used by the index, so the number of open files is reduced as well.

Take a look:

        private synchronized void write(Document document) throws IOException {
                logger.debug("writing document");
                IndexWriter writer = openWriter();
                try {
                        if (writer.docCount() % 100 == 0) {
                                // Merge segments every 100 documents to avoid "too many open files".
                                logger.info("optimizing index...");
                                writer.optimize();
                        }
                        writer.addDocument(document);
                } finally {
                        // Always close the writer, even if optimizing or adding fails.
                        writer.close();
                }
                reopenSearcher();
                logger.debug("document written");
        }

I did not try to find the best value. 100 seems OK, although optimizing
my indexes already takes 2 seconds (and inside a synchronized method
this is not so good, because every other write has to wait for it).
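
To make the 100 easier to tune, the check could be pulled out into a
small helper. This is only a sketch: OptimizePolicy and shouldOptimize
are names I made up, not Lucene API, and unlike the code above it skips
the optimize when the index is still empty (docCount 0).

```java
// Hypothetical helper (not part of Lucene): decides when to call optimize().
final class OptimizePolicy {
    private final int interval;

    OptimizePolicy(int interval) {
        if (interval <= 0) {
            throw new IllegalArgumentException("interval must be positive");
        }
        this.interval = interval;
    }

    // True when docCount is a positive multiple of the interval.
    boolean shouldOptimize(int docCount) {
        return docCount > 0 && docCount % interval == 0;
    }
}
```

In write(), the condition would then become
policy.shouldOptimize(writer.docCount()), with the interval passed in
from configuration instead of being hard-coded.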

Tell me what you think.


On 3/16/06, Nick Atkins <[EMAIL PROTECTED]> wrote:
> Hi,
>
> What's the best way to manage the number of open files used by Lucene
> when it's running under Tomcat?  I have an indexing application running
> as a web app and I index a huge number of mail messages (upwards of
> 40000 in some cases).  Lucene's merging routine always craps out
> eventually with the "too many open files" regardless of how large I set
> ulimit to.  lsof tells me they are all "deleted" but they still seem to
> count as open files.  I don't want to set ulimit to some enormous value
> just to solve this (because it will never be large enough).  What's the
> best strategy here?
>
> I have tried setting various parameters on the IndexWriter such as the
> MergeFactor, MaxMergeDocs and MaxBufferedDocs but they seem to only
> affect the merge timing algorithm wrt memory usage.  The number of files
> used seems to be unaffected by anything I can set on the IndexWriter.
>
> Any hints much appreciated.
>
> Cheers,
>
> Nick.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
>


--
Paulo E. A. Silveira
Caelum Ensino e Soluções em Java
http://www.caelum.com.br/
