Re: solr 4.7.2 mergeFactor/ Merge policy issue

Summer Shire Fri, 13 Mar 2015 22:37:23 -0700

Hi All,

Did anyone get a chance to look at my config and the infoStream file?

I am very curious to see what you think.

thanks,
Summer

> On Mar 6, 2015, at 5:20 PM, Summer Shire <shiresum...@gmail.com> wrote:
> 
> Hi All,
> 
> Here’s an update on where I am with this.
> I enabled infoStream logging and quickly figured out that I need to get rid of
> maxBufferedDocs, so Erick, you were absolutely right about that.
> I increased ramBufferSizeMB to 100
> and reduced both maxMergeAtOnce and segmentsPerTier to 3.
> My config now looks like this:
> 
> <indexConfig>
>    <useCompoundFile>false</useCompoundFile>
>    <ramBufferSizeMB>100</ramBufferSizeMB>
> 
>    <!--<maxMergeSizeForForcedMerge>9223372036854775807</maxMergeSizeForForcedMerge>-->
>    <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
>      <int name="maxMergeAtOnce">3</int>
>      <int name="segmentsPerTier">3</int>
>    </mergePolicy>
>    <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler"/>
>    <infoStream file="/tmp/INFOSTREAM.txt">true</infoStream>
>  </indexConfig>
> 
> I am attaching a sample infoStream log file.
> In the infoStream logs, though, you can see that segments keep being added,
> and it shows (just one example):
> allowedSegmentCount=10 vs count=9 (eligible count=9) tooBigCount=0
> 
> I looked at TieredMergePolicy.java to see how allowedSegmentCount is
> calculated:
> // Compute max allowed segs in the index
>    long levelSize = minSegmentBytes;
>    long bytesLeft = totIndexBytes;
>    double allowedSegCount = 0;
>    while(true) {
>      final double segCountLevel = bytesLeft / (double) levelSize;
>      if (segCountLevel < segsPerTier) {
>        allowedSegCount += Math.ceil(segCountLevel);
>        break;
>      }
>      allowedSegCount += segsPerTier;
>      bytesLeft -= segsPerTier * levelSize;
>      levelSize *= maxMergeAtOnce;
>    }
>    int allowedSegCountInt = (int) allowedSegCount;
> and minSegmentBytes is calculated as follows:
> // Compute total index bytes & print details about the index
>    long totIndexBytes = 0;
>    long minSegmentBytes = Long.MAX_VALUE;
>    for(SegmentInfoPerCommit info : infosSorted) {
>      final long segBytes = size(info);
>      if (verbose()) {
>        String extra = merging.contains(info) ? " [merging]" : "";
>        if (segBytes >= maxMergedSegmentBytes/2.0) {
>          extra += " [skip: too large]";
>        } else if (segBytes < floorSegmentBytes) {
>          extra += " [floored]";
>        }
>        message("  seg=" + writer.get().segString(info) + " size=" +
>            String.format(Locale.ROOT, "%.3f", segBytes/1024/1024.) + " MB" + extra);
>      }
> 
>      minSegmentBytes = Math.min(segBytes, minSegmentBytes);
>      // Accum total byte size
>      totIndexBytes += segBytes;
>    }
> 
> 
> Any input is welcome.
> 
> <myinfoLog.rtf>
> 
> 
> thanks,
> Summer
> 
> 
>> On Mar 5, 2015, at 8:11 AM, Erick Erickson <erickerick...@gmail.com> wrote:
>> 
>> I would, BTW, either just get rid of the <maxBufferedDocs> altogether or
>> make it much higher, e.g. 100000. I don't think this is really your
>> problem, but you're creating a lot of segments here.
>> 
>> But I'm kind of at a loss as to what would be different about your setup.
>> Is there _any_ chance that you have some secondary process looking at
>> your index that's maintaining open searchers? Any custom code that's
>> perhaps failing to close searchers? Is this a Unix or Windows system?
>> 
>> And just to be really clear, you're _only_ seeing more segments being
>> added, right? If you're only counting files in the index directory, it's
>> _possible_ that merging is happening, you're just seeing new files take
>> the place of old ones.
>> 
>> Best,
>> Erick
>> 
>> On Wed, Mar 4, 2015 at 7:12 PM, Shawn Heisey <apa...@elyograg.org> wrote:
>>> On 3/4/2015 4:12 PM, Erick Erickson wrote:
>>>> I _think_, but don't know for sure, that the merging stuff doesn't get
>>>> triggered until you commit, it doesn't "just happen".
>>>> 
>>>> Shot in the dark...
>>> 
>>> I believe that new segments are created when the indexing buffer
>>> (ramBufferSizeMB) fills up, even without commits.  I'm pretty sure that
>>> anytime a new segment is created, the merge policy is checked to see
>>> whether a merge is needed.
>>> 
>>> Thanks,
>>> Shawn
>>> 
>
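
To make the allowedSegmentCount arithmetic from the quoted TieredMergePolicy.java excerpt easier to try out, here is a minimal standalone sketch. It reproduces only the two quoted loops; the class name, method name, and sample segment sizes are made up for illustration, and any adjustment the real policy applies outside the quoted excerpt (such as flooring small segment sizes) is deliberately left out, so treat the numbers as indicative rather than as what Lucene would report.

```java
// Hypothetical helper class: not part of Lucene or Solr.
class AllowedSegmentCount {

    // Reproduces the two quoted loops: find the smallest segment and the
    // total index size, then walk up the tiers. Each tier holds segsPerTier
    // segments of levelSize bytes, and levelSize grows by a factor of
    // maxMergeAtOnce from one tier to the next.
    static int allowedSegmentCount(long[] segmentBytes, double segsPerTier, int maxMergeAtOnce) {
        if (segmentBytes.length == 0) {
            return 0;
        }
        long totIndexBytes = 0;
        long minSegmentBytes = Long.MAX_VALUE;
        for (long segBytes : segmentBytes) {
            if (segBytes <= 0) {
                // Guard added for this sketch only; a zero size would divide by zero below.
                throw new IllegalArgumentException("segment sizes must be positive");
            }
            minSegmentBytes = Math.min(segBytes, minSegmentBytes);
            totIndexBytes += segBytes;
        }

        long levelSize = minSegmentBytes;
        long bytesLeft = totIndexBytes;
        double allowedSegCount = 0;
        while (true) {
            final double segCountLevel = bytesLeft / (double) levelSize;
            if (segCountLevel < segsPerTier) {
                allowedSegCount += Math.ceil(segCountLevel);
                break;
            }
            allowedSegCount += segsPerTier;
            bytesLeft -= segsPerTier * levelSize;
            levelSize *= maxMergeAtOnce;
        }
        return (int) allowedSegCount;
    }

    public static void main(String[] args) {
        // Made-up sizes (arbitrary units) with segmentsPerTier=3, maxMergeAtOnce=3.
        long[] evenTiers = {1, 1, 1, 3, 3, 3, 9, 9, 9};
        long[] oneTinySegment = {1, 100, 100, 100};
        System.out.println(allowedSegmentCount(evenTiers, 3, 3));      // 9
        System.out.println(allowedSegmentCount(oneTinySegment, 3, 3)); // 15
    }
}
```

In this sketch a single very small segment drives minSegmentBytes down, which adds tiers and raises the allowed count well above the actual segment count (15 allowed vs. 4 present in the second sample). That may be one reason a log line such as allowedSegmentCount=10 vs count=9 shows no merge being selected, though the infoStream output is the authority on what the policy really did.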
