Hi Jack,

Yes, I plan to merge all the 10GB segments into 20GB segments, and not just
two of them. Sorry for the confusion.
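
For reference, the size arithmetic behind this can be sketched as below. It
is a simplified model that assumes a merge is only possible when the combined
size fits under maxMergedSegmentMB. It is not Lucene's actual merge selection
logic, and the function name fits_under_cap is made up for illustration.

```python
def fits_under_cap(segment_sizes_mb, max_merged_segment_mb):
    """Return True if the segments' combined size fits under the cap.

    Simplified model only: a real merge policy also weighs deleted
    documents and tiering before it picks a merge.
    """
    return sum(segment_sizes_mb) <= max_merged_segment_mb


two_10gb = [10240, 10240]  # two 10GB segments, sizes in MB

print(fits_under_cap(two_10gb, 15360))  # 15GB cap: 20480 MB does not fit
print(fits_under_cap(two_10gb, 20480))  # 20GB cap: 20480 MB just fits
```

Whether the merge policy would actually select such a merge is a separate
question; the sketch only checks the size limit.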

I recently increased the system memory from 64GB to 192GB, but as our
index grows (which means more segments are created), I found that the
query speed actually slows down. So we have decided to increase the segment
size and see how it goes, as there will be fewer segments to search.
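
As a rough illustration of the "fewer segments" reasoning, the sketch below
estimates the smallest possible segment count for a given cap. The 300GB
index size is a hypothetical figure, not our real one, and the model assumes
every segment is filled right up to the cap, which a live index will not
achieve.

```python
import math


def min_segment_count(index_size_mb, max_merged_segment_mb):
    """Lower bound on segment count if every segment were full to the cap."""
    return math.ceil(index_size_mb / max_merged_segment_mb)


index_size_mb = 300 * 1024  # hypothetical 300GB index, in MB

for cap_gb in (10, 15, 20):
    count = min_segment_count(index_size_mb, cap_gb * 1024)
    print(f"{cap_gb}GB cap: at least {count} segments")
```

The real count will be higher, since smaller segments on the lower tiers
always exist alongside the large ones.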

Regards,
Edwin


On 1 February 2016 at 01:37, Jack Krupansky <jack.krupan...@gmail.com>
wrote:

> Make sure you fully digest Mike McCandless' blog post on segment merge
> before trying to outguess his code:
>
> http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html
>
> Generally, I don't think you would want to merge just two segments.
> Generally, you should do a bunch at a time, typically 10. IOW, take all the
> segments on a tier and merge them into one segment at the next tier.
>
> There is no documented practical upper limit for how big to make a single
> segment, but very large segments are not likely to be optimized well in
> Lucene, hence the default max merge size of 5GB. If you want to get a lot
> above that, you're in uncharted territory. Besides, if you start pushing
> your index well above the amount of available system memory your query
> performance will suffer. I'd watch for the latter before pushing on the
> former.
>
>
> -- Jack Krupansky
>
> On Sun, Jan 31, 2016 at 10:43 AM, Zheng Lin Edwin Yeo <
> edwinye...@gmail.com>
> wrote:
>
> > Thanks for your reply Shawn and Jack.
> >
> > I wanted to increase the segment size to 15GB, so that there will be fewer
> > segments to search during the query, which should potentially improve
> > the query speed.
> >
> > What if I set the segment size to 20GB? Will all the existing 10GB segments
> > be merged to 20GB, as merging two 10GB segments will now result in a 20GB
> > segment?
> >
> > Regards,
> > Edwin
> >
> >
> > On 31 January 2016 at 12:16, Jack Krupansky <jack.krupan...@gmail.com>
> > wrote:
> >
> > > From the Lucene MergePolicy Javadoc:
> > >
> > > "Whenever the segments in an index have been altered by IndexWriter
> > > <https://lucene.apache.org/core/5_4_0/core/org/apache/lucene/index/IndexWriter.html>,
> > > either the addition of a newly flushed segment, addition of many segments
> > > from addIndexes* calls, or a previous merge that may now need to cascade,
> > > IndexWriter
> > > <https://lucene.apache.org/core/5_4_0/core/org/apache/lucene/index/IndexWriter.html>
> > > invokes findMerges(org.apache.lucene.index.MergeTrigger,
> > > org.apache.lucene.index.SegmentInfos, org.apache.lucene.index.IndexWriter)
> > > <https://lucene.apache.org/core/5_4_0/core/org/apache/lucene/index/MergePolicy.html#findMerges(org.apache.lucene.index.MergeTrigger, org.apache.lucene.index.SegmentInfos, org.apache.lucene.index.IndexWriter)>
> > > to give the MergePolicy a chance to pick merges that are now required. This
> > > method returns a MergePolicy.MergeSpecification
> > > <https://lucene.apache.org/core/5_4_0/core/org/apache/lucene/index/MergePolicy.MergeSpecification.html>
> > > instance describing the set of merges that should be done, or null if no
> > > merges are necessary. When IndexWriter.forceMerge is called, it calls
> > > findForcedMerges(SegmentInfos,int,Map, IndexWriter)
> > > <https://lucene.apache.org/core/5_4_0/core/org/apache/lucene/index/MergePolicy.html#findForcedMerges(org.apache.lucene.index.SegmentInfos, int, java.util.Map, org.apache.lucene.index.IndexWriter)>
> > > and the MergePolicy should then return the necessary merges."
> > >
> > > See:
> > >
> > > https://lucene.apache.org/core/5_4_0/core/org/apache/lucene/index/MergePolicy.html
> > >
> > > IOW, when the next commit occurs that closes and flushes the currently
> > > open segment.
> > >
> > > Nothing will happen to any existing 10GB segments, now or ever in the
> > > future, since merging two 10GB segments would not be possible with a limit
> > > of only 15GB.
> > >
> > > Maybe you could clue us in as to what effect you are trying to achieve. I
> > > mean, why should any app care whether segments are 10GB or 15GB?
> > >
> > >
> > > -- Jack Krupansky
> > >
> > > On Sat, Jan 30, 2016 at 6:28 PM, Shawn Heisey <apa...@elyograg.org>
> > > wrote:
> > >
> > > > On 1/30/2016 7:31 AM, Zheng Lin Edwin Yeo wrote:
> > > > > I would like to find out, when I increase the maxMergedSegmentMB from
> > > > > 10240 (10GB) to 15360 (15GB), will all the 10GB segments that were
> > > > > created previously be automatically merged to 15GB?
> > > >
> > > > Not necessarily.  It will make those 10GB+ segments eligible for further
> > > > merging, whereas they would have been ineligible before the change.
> > > >
> > > > This might mean that one or more of those large segments will be merged
> > > > soon after the change and restart/reload, but I do not know when it
> > > > might happen.  It would probably wait until at least one new segment was
> > > > created, at which time the merge policy would be consulted.
> > > >
> > > > Thanks,
> > > > Shawn
> > > >
> > > >
> > >
> >
>
