Thanks, Hannes.  On my Fedora machine the maximum I can set is "ulimit
-n 1048576", i.e. 1M files.  That should be enough for most sane cases,
but it makes me uneasy.  I assume the "deleted" file entries reported by
lsof will be cleaned up eventually?

I can't believe this is really the only option, and that there is no way
within Lucene itself to control the number of files it opens.  Hmm...
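
For reference, here's roughly what I'm doing now -- a minimal sketch
against the Lucene 1.9.1 API, with illustrative values only (the index
path and all the settings below are placeholders, not recommendations):

    import java.io.IOException;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.index.IndexWriter;

    public class IndexTuning {
        public static void main(String[] args) throws IOException {
            // NOTE: the path and all values here are examples only.
            IndexWriter writer = new IndexWriter(
                    "/tmp/mailindex", new StandardAnalyzer(), true);

            // Compound format packs each segment into a single .cfs
            // file, so far fewer descriptors are held open at once
            // (this is the 1.9 default anyway).
            writer.setUseCompoundFile(true);

            // A smaller merge factor keeps fewer segments (and hence
            // fewer files) live at a time, at the cost of merging
            // more often.
            writer.setMergeFactor(5);

            // ... addDocument() calls go here ...

            writer.optimize();  // collapse the index to one segment
            writer.close();     // release the writer's file handles
        }
    }

(My guess is the "deleted" entries in lsof hang around because an old
IndexReader or IndexSearcher still holds those files open -- hence the
question above about when they get cleaned up.)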

Thanks,

Nick.

Hannes Carl Meyer wrote:
> Hi Nick,
>
> use 'ulimit' on your *nix system to check if it's set to unlimited.
>
> check:
> http://wwwcgi.rdg.ac.uk:8081/cgi-bin/cgiwrap/wsi14/poplog/man/2/ulimit
>
> You don't have to set it to unlimited; maybe just increasing the
> number will help.
>
> later
>
> Hannes
>
> Nick Atkins schrieb:
>> Thanks Otis, I tried that but I still get the same problem at the
>> ulimit -n point.  I assume you meant I should call
>> IndexWriter.setUseCompoundFile(true).  According to the docs, the
>> compound structure is the default anyway.
>>
>> Any further thoughts?  Anything I can tweak in the OS (Linux), Java
>> (1.5.0) or Lucene (1.9.1)?
>>
>> Many thanks,
>>
>> Nick
>>
>> Otis Gospodnetic wrote:
>>  
>>> The easiest first step to try is to go from multi-file index
>>> structure to the compound one.
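>>>
>>> Something like the following on your IndexWriter, if I remember the
>>> API correctly (untested):
>>>
>>>     writer.setUseCompoundFile(true);  // write compound .cfs segments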
>>>
>>> Otis
>>>
>>> ----- Original Message ----
>>> From: Nick Atkins <[EMAIL PROTECTED]>
>>> To: java-user@lucene.apache.org
>>> Sent: Thursday, March 16, 2006 3:00:59 PM
>>> Subject: Lucene and Tomcat, too many open files
>>>
>>> Hi,
>>>
>>> What's the best way to manage the number of open files used by Lucene
>>> when it's running under Tomcat?  I have an indexing application running
>>> as a web app, and I index a huge number of mail messages (upwards of
>>> 40000 in some cases).  Lucene's merging routine always craps out
>>> eventually with a "too many open files" error, regardless of how large
>>> I set ulimit.  lsof tells me they are all "deleted" but they still seem
>>> to count as open files.  I don't want to set ulimit to some enormous
>>> value just to solve this (because it will never be large enough).
>>> What's the best strategy here?
>>>
>>> I have tried setting various parameters on the IndexWriter, such as
>>> MergeFactor, MaxMergeDocs and MaxBufferedDocs, but they only seem to
>>> affect the merge timing algorithm wrt memory usage.  The number of
>>> files used seems to be unaffected by anything I can set on the
>>> IndexWriter.
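>>>
>>> For concreteness, the calls I've been experimenting with look roughly
>>> like this, given an open IndexWriter "writer" (the values are only
>>> examples I've tried, not ones I believe are right):
>>>
>>>     writer.setMergeFactor(10);       // segments merged at a time
>>>     writer.setMaxMergeDocs(50000);   // cap on docs per merged segment
>>>     writer.setMaxBufferedDocs(100);  // docs buffered in RAM per flush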
>>>
>>> Any hints much appreciated.
>>>
>>> Cheers,
>>>
>>> Nick.
>>>
>>
>
>
