Hi All,

Did anyone get a chance to look at my config and the infoStream file?
I am very curious to see what you think.

Thanks,
Summer

> On Mar 6, 2015, at 5:20 PM, Summer Shire <shiresum...@gmail.com> wrote:
>
> Hi All,
>
> Here’s an update on where I am with this.
> I enabled infoStream logging and quickly figured out that I need to get rid of
> maxBufferedDocs. So Erick, you were absolutely right on that.
> I increased my ramBufferSizeMB to 100
> and reduced maxMergeAtOnce to 3 and segmentsPerTier to 3 as well.
> My config looks like this:
>
> <indexConfig>
>   <useCompoundFile>false</useCompoundFile>
>   <ramBufferSizeMB>100</ramBufferSizeMB>
>
>   <!--<maxMergeSizeForForcedMerge>9223372036854775807</maxMergeSizeForForcedMerge>-->
>   <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
>     <int name="maxMergeAtOnce">3</int>
>     <int name="segmentsPerTier">3</int>
>   </mergePolicy>
>   <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler"/>
>   <infoStream file="/tmp/INFOSTREAM.txt">true</infoStream>
> </indexConfig>
>
> I am attaching a sample infostream log file.
> In the infoStream logs, though, you can see how the segments keep getting
> added, and it shows (just an example):
>
>   allowedSegmentCount=10 vs count=9 (eligible count=9) tooBigCount=0
>
> I looked at TieredMergePolicy.java to see how allowedSegmentCount is getting
> calculated:
>
>     // Compute max allowed segs in the index
>     long levelSize = minSegmentBytes;
>     long bytesLeft = totIndexBytes;
>     double allowedSegCount = 0;
>     while(true) {
>       final double segCountLevel = bytesLeft / (double) levelSize;
>       if (segCountLevel < segsPerTier) {
>         allowedSegCount += Math.ceil(segCountLevel);
>         break;
>       }
>       allowedSegCount += segsPerTier;
>       bytesLeft -= segsPerTier * levelSize;
>       levelSize *= maxMergeAtOnce;
>     }
>     int allowedSegCountInt = (int) allowedSegCount;
>
> and the minSegmentBytes is calculated as follows:
>
>     // Compute total index bytes & print details about the index
>     long totIndexBytes = 0;
>     long minSegmentBytes = Long.MAX_VALUE;
>     for(SegmentInfoPerCommit info : infosSorted) {
>       final long segBytes = size(info);
>       if (verbose()) {
>         String extra = merging.contains(info) ? " [merging]" : "";
>         if (segBytes >= maxMergedSegmentBytes/2.0) {
>           extra += " [skip: too large]";
>         } else if (segBytes < floorSegmentBytes) {
>           extra += " [floored]";
>         }
>         message(" seg=" + writer.get().segString(info) + " size=" +
>             String.format(Locale.ROOT, "%.3f", segBytes/1024/1024.) + " MB" + extra);
>       }
>
>       minSegmentBytes = Math.min(segBytes, minSegmentBytes);
>       // Accum total byte size
>       totIndexBytes += segBytes;
>     }
>
> Any input is welcome.
>
> <myinfoLog.rtf>
>
> Thanks,
> Summer
>
>> On Mar 5, 2015, at 8:11 AM, Erick Erickson <erickerick...@gmail.com> wrote:
>>
>> I would, BTW, either just get rid of the <maxBufferedDocs> altogether or
>> make it much higher, i.e. 100000. I don't think this is really your
>> problem, but you're creating a lot of segments here.
>>
>> But I'm kind of at a loss as to what would be different about your setup.
>> Is there _any_ chance that you have some secondary process looking at
>> your index that's maintaining open searchers? Any custom code that's
>> perhaps failing to close searchers? Is this a Unix or Windows system?
>>
>> And just to be really clear, you're _only_ seeing more segments being
>> added, right? If you're only counting files in the index directory, it's
>> _possible_ that merging is happening and you're just seeing new files take
>> the place of old ones.
>>
>> Best,
>> Erick
>>
>> On Wed, Mar 4, 2015 at 7:12 PM, Shawn Heisey <apa...@elyograg.org> wrote:
>>> On 3/4/2015 4:12 PM, Erick Erickson wrote:
>>>> I _think_, but don't know for sure, that the merging stuff doesn't get
>>>> triggered until you commit, it doesn't "just happen".
>>>>
>>>> Shot in the dark...
>>>
>>> I believe that new segments are created when the indexing buffer
>>> (ramBufferSizeMB) fills up, even without commits. I'm pretty sure that
>>> any time a new segment is created, the merge policy is checked to see
>>> whether a merge is needed.
>>>
>>> Thanks,
>>> Shawn
>>>
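
For anyone who wants to play with the numbers, the allowedSegCount loop quoted above from TieredMergePolicy.java can be lifted into a small standalone class. This is only a sketch under a few assumptions: the class name AllowedSegmentCount, the method names, and the segment sizes in main are made up for illustration, and the sizes are used exactly as passed in, so any adjustment Lucene applies to minSegmentBytes outside the quoted excerpt is not reflected here.

```java
// Hypothetical standalone harness around the loop quoted from TieredMergePolicy.java.
// It is not Lucene code; names and sample sizes are illustrative only.
final class AllowedSegmentCount {

    // Same arithmetic as the quoted loop, with all inputs passed in explicitly.
    static int allowedSegCount(long minSegmentBytes, long totIndexBytes,
                               double segsPerTier, int maxMergeAtOnce) {
        // Guard added for this sketch: a non-positive level size or tier width
        // would keep the loop below from terminating.
        if (minSegmentBytes <= 0 || segsPerTier <= 0) {
            throw new IllegalArgumentException("minSegmentBytes and segsPerTier must be > 0");
        }
        long levelSize = minSegmentBytes;
        long bytesLeft = totIndexBytes;
        double allowedSegCount = 0;
        while (true) {
            final double segCountLevel = bytesLeft / (double) levelSize;
            if (segCountLevel < segsPerTier) {
                // Last, partially filled tier: round up and stop.
                allowedSegCount += Math.ceil(segCountLevel);
                break;
            }
            // A full tier: budget segsPerTier segments, then move to the next level.
            allowedSegCount += segsPerTier;
            bytesLeft -= segsPerTier * levelSize;
            levelSize *= maxMergeAtOnce;
        }
        return (int) allowedSegCount;
    }

    // Mirrors the second quoted snippet: the smallest segment and the total size
    // are derived from the per-segment byte counts.
    static int allowedSegCount(long[] segmentBytes, double segsPerTier, int maxMergeAtOnce) {
        long totIndexBytes = 0;
        long minSegmentBytes = Long.MAX_VALUE;
        for (long segBytes : segmentBytes) {
            minSegmentBytes = Math.min(segBytes, minSegmentBytes);
            totIndexBytes += segBytes;
        }
        return allowedSegCount(minSegmentBytes, totIndexBytes, segsPerTier, maxMergeAtOnce);
    }

    public static void main(String[] args) {
        final long MB = 1024L * 1024L;
        // Made-up index: four segments totalling 50 MB, smallest one 1 MB.
        long[] sizes = {30 * MB, 15 * MB, 4 * MB, 1 * MB};
        System.out.println(allowedSegCount(sizes, 3.0, 3)); // prints 10
    }
}
```

With segmentsPerTier=3 and maxMergeAtOnce=3, the made-up 50 MB index budgets 3 + 3 + 3 segments for the 1 MB, 3 MB and 9 MB levels, plus 1 for the remaining 11 MB at the 27 MB level, which gives 10. Note how strongly the result depends on the smallest segment: the smaller minSegmentBytes is relative to the total, the more tiers fit and the larger the allowed count becomes, which may be worth checking against the segment sizes printed in the infoStream log.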