Thanks for clarifying, Uwe. I will keep the daily optimization turned off. I may be wrong, but if the OOM happens during the forceMerge, then it could presumably also happen during normal index growth, whenever big segments are merged. If so, it might be worth looking into anyway. I suspect it has to do with the way NumericDocValues fields are handled in the merge process, but again, this is just a stab in the dark...

Michael.

On 2013/09/26 12:38 PM, Uwe Schindler wrote:
Hi,

TieredMergePolicy, which has been the default since around Lucene 3.2, prefers
merging segments with many deletions, so forceMerge(1) is not needed.

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


-----Original Message-----
From: Michael van Rooyen [mailto:mich...@loot.co.za]
Sent: Thursday, September 26, 2013 12:26 PM
To: java-user@lucene.apache.org
Cc: Ian Lea
Subject: Re: Lucene 4.4.0 mergeSegments OutOfMemoryError

Yes, it happens as part of the early morning optimize, and yes, it's a
forceMerge(1) which I've disabled for now.

I haven't looked at the persistence mechanism for Lucene since 2.x, but if I
remember correctly, the deleted documents would stay in an index segment
until that segment was eventually merged.  Without forcing a merge
(optimize in old versions), the footprint on disk could be a multiple of the
actual space required for the live documents, and this would have an impact
on performance (the deleted documents would clutter the buffer cache).
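To put a number on that dead space, here is a minimal sketch of the ratio I have in mind. In Lucene the two inputs would come from IndexReader.maxDoc() (all documents, including deleted ones) and IndexReader.numDocs() (live documents only); the class name DeadSpace, the helper names and the threshold are hypothetical and only there for illustration.

```java
/** Hypothetical helper, not part of Lucene: estimates how much of an index is dead space. */
final class DeadSpace {

    /**
     * Fraction of documents that are deleted but still present in the segments.
     * maxDoc counts live plus deleted documents; numDocs counts live documents only.
     */
    static double deletedRatio(int maxDoc, int numDocs) {
        if (maxDoc < 0 || numDocs < 0 || numDocs > maxDoc) {
            throw new IllegalArgumentException("expected 0 <= numDocs <= maxDoc");
        }
        if (maxDoc == 0) {
            return 0.0; // empty index: nothing to reclaim
        }
        return (maxDoc - numDocs) / (double) maxDoc;
    }

    /** True when the deleted fraction exceeds a caller-chosen threshold (an assumption, not a Lucene default). */
    static boolean worthReclaiming(int maxDoc, int numDocs, double threshold) {
        return deletedRatio(maxDoc, numDocs) > threshold;
    }

    public static void main(String[] args) {
        // Illustrative numbers only: 14 million live documents plus 1 million unmerged deletions.
        int numDocs = 14_000_000;
        int maxDoc = 15_000_000;
        System.out.printf("deleted ratio: %.3f%n", deletedRatio(maxDoc, numDocs));
    }
}
```

Tracking this ratio over a few days would show whether the dead space really grows without bound or levels off on its own.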

Is this still the case?  I would have thought it good practice to force the dead
space out of an index periodically, but if the underlying storage mechanism
has changed and the current index files are more efficient at housekeeping,
this may no longer be necessary.

If someone could shed a little light on best practice for indexes where
documents are frequently updated (i.e. deleted and re-added), that would
be great.

Michael.


On 2013/09/26 11:43 AM, Ian Lea wrote:
Is this OOM happening as part of your early morning optimize or at
some other point?  By optimize do you mean IndexWriter.forceMerge(1)?
You really shouldn't have to use that. If the index grows forever
without it then something else is going on which you might wish to
report separately.


--
Ian.


On Wed, Sep 25, 2013 at 12:35 PM, Michael van Rooyen
<mich...@loot.co.za> wrote:
We've recently upgraded to Lucene 4.4.0 and mergeSegments now causes
an OOM error.

As background, our index contains about 14 million documents (growing
slowly) and we process about 1 million updates per day. It's about
8GB on disk.  I'm not sure if the Lucene segments merge the way they
used to in the early versions, but we've always optimized at 3am to
get rid of dead space in the index, or otherwise it grows forever.

The mergeSegments was working under 4.3.1 but the index has grown
somewhat on disk since then, probably due to a couple of added
NumericDocValues fields.  The java process is assigned about 3GB (the
maximum, as it's running on a 32 bit i686 Linux box), and it still goes OOM.
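For a sense of scale, here is a rough sketch of the heap a single per-document numeric array would need at our document count. It assumes one fixed-width value per document, held entirely on the heap; both the width and the on-heap assumption are guesses on my part, not something I have confirmed about Lucene's internals, and the class name HeapEstimate is hypothetical.

```java
/** Hypothetical back-of-envelope helper, not part of Lucene. */
final class HeapEstimate {

    /** Bytes needed to hold one fixed-width value per document (assumed layout). */
    static long bytesForPerDocValues(long docCount, int bytesPerValue) {
        if (docCount < 0 || bytesPerValue <= 0) {
            throw new IllegalArgumentException("docCount must be >= 0 and bytesPerValue > 0");
        }
        return Math.multiplyExact(docCount, (long) bytesPerValue); // fails loudly on overflow
    }

    /** Converts a byte count to mebibytes for easier reading. */
    static double toMiB(long bytes) {
        return bytes / (1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        long docs = 14_000_000L; // document count mentioned above
        // Widths of 1 and 8 bytes are illustrative bounds, not measured values.
        for (int width : new int[] {1, 8}) {
            long bytes = bytesForPerDocValues(docs, width);
            System.out.printf("%d byte(s) per doc: %.1f MiB%n", width, toMiB(bytes));
        }
    }
}
```

The estimate covers one field in one segment only; several fields or several segments loaded at the same time would multiply it.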

Any advice as to the possible cause and how to circumvent it would be
great.
Here's the stack trace:

org.apache.lucene.index.MergePolicy$MergeException: java.lang.OutOfMemoryError: Java heap space
org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:545)
org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:518)
Caused by: java.lang.OutOfMemoryError: Java heap space
org.apache.lucene.codecs.lucene42.Lucene42DocValuesProducer.loadNumeric(Lucene42DocValuesProducer.java:212)
org.apache.lucene.codecs.lucene42.Lucene42DocValuesProducer.getNumeric(Lucene42DocValuesProducer.java:174)
org.apache.lucene.index.SegmentCoreReaders.getNormValues(SegmentCoreReaders.java:301)
org.apache.lucene.index.SegmentReader.getNormValues(SegmentReader.java:253)
org.apache.lucene.index.SegmentMerger.mergeNorms(SegmentMerger.java:215)
org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:119)
org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3772)
org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3376)
org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:405)
org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:482)


Thanks,
Michael.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
