Hi,

TieredMergePolicy, which has been the default since around Lucene 3.2, prefers
merging segments with many deletions, so forceMerge(1) is not needed.
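
To make "prefers merging segments with many deletions" a bit more concrete,
here is a toy sketch in plain Java. It is not Lucene code and not the real
scoring used by TieredMergePolicy: the Segment class and the pickForMerge and
reclaimableDocs methods are hypothetical names invented for this example, and
ranking purely by the ratio of deleted documents is a simplifying assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only: NOT Lucene's TieredMergePolicy. All names are hypothetical. */
final class DeletionAwareMergeSketch {

    /** Hypothetical stand-in for an index segment. */
    static final class Segment {
        final String name;
        final int maxDoc;    // documents written to the segment, live or deleted
        final int delCount;  // documents since marked as deleted

        Segment(String name, int maxDoc, int delCount) {
            this.name = name;
            this.maxDoc = maxDoc;
            this.delCount = delCount;
        }

        /** Fraction of the segment that is dead space, in [0, 1]. */
        double deletedRatio() {
            return maxDoc == 0 ? 0.0 : (double) delCount / maxDoc;
        }
    }

    /** Returns up to maxSegments segments, highest deleted ratio first. */
    static List<Segment> pickForMerge(List<Segment> segments, int maxSegments) {
        List<Segment> sorted = new ArrayList<>(segments);
        // Descending order by deleted ratio (assumed ranking, see lead-in).
        sorted.sort((a, b) -> Double.compare(b.deletedRatio(), a.deletedRatio()));
        return new ArrayList<>(sorted.subList(0, Math.min(maxSegments, sorted.size())));
    }

    /** Deleted documents that would be dropped by rewriting the given segments. */
    static int reclaimableDocs(List<Segment> picked) {
        int total = 0;
        for (Segment s : picked) {
            total += s.delCount;
        }
        return total;
    }

    private DeletionAwareMergeSketch() {}
}
```

In this model, given segments of (1000 docs, 100 deleted), (1000 docs, 600
deleted) and (500 docs, 0 deleted), pickForMerge(segments, 2) selects the
60%-deleted segment first, so the dead space is reclaimed by an ordinary merge
without rewriting the whole index.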

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


> -----Original Message-----
> From: Michael van Rooyen [mailto:mich...@loot.co.za]
> Sent: Thursday, September 26, 2013 12:26 PM
> To: java-user@lucene.apache.org
> Cc: Ian Lea
> Subject: Re: Lucene 4.4.0 mergeSegments OutOfMemoryError
> 
> Yes, it happens as part of the early morning optimize, and yes, it's a
> forceMerge(1) which I've disabled for now.
> 
> I haven't looked at the persistence mechanism for Lucene since 2.x, but if I
> remember correctly, the deleted documents would stay in an index segment
> until that segment was eventually merged.  Without forcing a merge
> (optimize in old versions), the footprint on disk could be a multiple of the
> actual space required for the live documents, and this would have an impact
> on performance (the deleted documents would clutter the buffer cache).
> 
> Is this still the case?  I would have thought it good practice to force the
> dead space out of an index periodically, but if the underlying storage
> mechanism has changed and the current index files are more efficient at
> housekeeping, this may no longer be necessary.
> 
> If someone could shed a little light on best practice for indexes where
> documents are frequently updated (i.e. deleted and re-added), that would
> be great.
> 
> Michael.
> 
> 
> On 2013/09/26 11:43 AM, Ian Lea wrote:
> > Is this OOM happening as part of your early morning optimize or at
> > some other point?  By optimize do you mean IndexWriter.forceMerge(1)?
> > You really shouldn't have to use that. If the index grows forever
> > without it then something else is going on which you might wish to
> > report separately.
> >
> >
> > --
> > Ian.
> >
> >
> > On Wed, Sep 25, 2013 at 12:35 PM, Michael van Rooyen <mich...@loot.co.za> wrote:
> >> We've recently upgraded to Lucene 4.4.0 and mergeSegments now causes
> >> an OOM error.
> >>
> >> As background, our index contains about 14 million documents (growing
> >> slowly) and we process about 1 million updates per day. It's about
> >> 8GB on disk.  I'm not sure if the Lucene segments merge the way they
> >> used to in the early versions, but we've always optimized at 3am to
> >> get rid of dead space in the index, or otherwise it grows forever.
> >>
> >> The mergeSegments was working under 4.3.1 but the index has grown
> >> somewhat on disk since then, probably due to a couple of added
> >> NumericDocValues fields.  The java process is assigned about 3GB (the
> >> maximum, as it's running on a 32-bit i686 Linux box), and it still goes
> >> OOM.
> >>
> >> Any advice as to the possible cause and how to circumvent it would be
> >> great.  Here's the stack trace:
> >>
> >> org.apache.lucene.index.MergePolicy$MergeException: java.lang.OutOfMemoryError: Java heap space
> >> org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:545)
> >> org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:518)
> >> Caused by: java.lang.OutOfMemoryError: Java heap space
> >> org.apache.lucene.codecs.lucene42.Lucene42DocValuesProducer.loadNumeric(Lucene42DocValuesProducer.java:212)
> >> org.apache.lucene.codecs.lucene42.Lucene42DocValuesProducer.getNumeric(Lucene42DocValuesProducer.java:174)
> >> org.apache.lucene.index.SegmentCoreReaders.getNormValues(SegmentCoreReaders.java:301)
> >> org.apache.lucene.index.SegmentReader.getNormValues(SegmentReader.java:253)
> >> org.apache.lucene.index.SegmentMerger.mergeNorms(SegmentMerger.java:215)
> >> org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:119)
> >> org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3772)
> >> org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3376)
> >> org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:405)
> >> org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:482)
> >>
> >>
> >> Thanks,
> >> Michael.
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> >> For additional commands, e-mail: java-user-h...@lucene.apache.org
> >>


