Yes, indexing only right now, although I can issue the odd search to
test that it's being built properly.

My test (indexing 40,000+ messages in a user's mailbox) causes the
BatchUpdater thread to write everything to the index approximately every
15-17 seconds.  The logs say:

[EMAIL PROTECTED] bin]# tail -f ../logs/scalix-sis-indexer.log | grep close
2006-03-16 15:50:30,591 DEBUG [BatchUpdater.processMods:122] Writer
optimized and closed
2006-03-16 15:50:47,049 DEBUG [BatchUpdater.processMods:122] Writer
optimized and closed

and the index grows, which tells me it's working OK.

My indexes are in /tmp/lucene and my parent Java (Tomcat) process is pid
639.  I get this:

[EMAIL PROTECTED] bin]# lsof -p 639 | grep \/tmp\/lucene | wc -l
51422

and growing.
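(As a cross-check on the lsof numbers, counting the entries under
/proc/<pid>/fd directly is a quick Linux-specific alternative.  The
sketch below uses the shell's own pid just to be self-contained; in
practice you'd substitute the Tomcat pid instead.)

```shell
# Quick open-fd count via /proc (Linux only).  Note that lsof also lists
# memory-mapped files and other entries, so its count will normally be
# higher than this one.
pid=$$              # stand-in pid for the demo; use the Tomcat pid (639 here)
ls /proc/"$pid"/fd | wc -l
```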

When it's done I'll try another, smaller mailbox with the default ulimit
and your suggestions, and see if that helps.
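(For the retest, note it's the soft limit that actually triggers the
"too many open files" error; a quick way to see both values from the
Tomcat user's shell, assuming bash or a similar shell that takes the
-S/-H flags:)

```shell
# The soft limit is what a running process actually hits; the hard limit
# is the ceiling a non-root user can raise the soft limit to.
ulimit -Sn   # soft open-files limit
ulimit -Hn   # hard open-files limit
```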

Cheers,

Nick.

Otis Gospodnetic wrote:
> This happens when you are doing indexing only!?  Wow, I've never seen that.
> Try posting your code in the form of a unit test.
>
> Otis
>
> ----- Original Message ----
> From: Nick Atkins <[EMAIL PROTECTED]>
> To: java-user@lucene.apache.org
> Sent: Thursday, March 16, 2006 6:28:52 PM
> Subject: Re: Lucene and Tomcat, too many open files
>
> Hi Doug,
>
> I have experimented with a mergeFactor of 5 and 10 (the default), but
> it didn't help matters once I reached the ulimit.  I understand how the
> mergeFactor affects Lucene's performance.
>
> I am actually not doing any searches with IndexReader right now, just
> indexing.  Yes, I do store and reuse the IndexReader per index, but my
> test right now is purely indexing.
>
> Setting my ulimit to 1M seems to do the trick for now, but I'm sure I
> could do better by tweaking Lucene's API.  I will continue to investigate.
>
> Thanks,
>
> Nick.
>
> Doug Cutting wrote:
>   
>> Are you changing the default mergeFactor or other settings?  If so,
>> how?  Large mergeFactors are generally a bad idea: they don't make
>> things faster in the long run and they chew up file handles.
>>
>> Are all searches reusing a single IndexReader?  They should.  This is
>> the other most common reason folks run out of file handles: they open
>> too many IndexReaders.  The exception may be thrown when merging, but
>> the root cause might be something else.
>>
>> Doug
>>
>> Nick Atkins wrote:
>>     
>>> Hi,
>>>
>>> What's the best way to manage the number of open files used by Lucene
>>> when it's running under Tomcat?  I have an indexing application running
>>> as a web app, and I index a huge number of mail messages (upwards of
>>> 40000 in some cases).  Lucene's merging routine always craps out
>>> eventually with a "too many open files" error, regardless of how large
>>> I set ulimit to.  lsof tells me they are all "deleted" but they still
>>> seem to count as open files.  I don't want to set ulimit to some
>>> enormous value just to solve this (because it will never be large
>>> enough).  What's the best strategy here?
>>>
>>> I have tried setting various parameters on the IndexWriter, such as
>>> MergeFactor, MaxMergeDocs and MaxBufferedDocs, but they seem to only
>>> affect the merge timing algorithm with respect to memory usage.  The
>>> number of files used seems to be unaffected by anything I can set on
>>> the IndexWriter.
>>>
>>> Any hints much appreciated.
>>>
>>> Cheers,
>>>
>>> Nick.
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: [EMAIL PROTECTED]
>>> For additional commands, e-mail: [EMAIL PROTECTED]
>>>
