[ 
https://issues.apache.org/jira/browse/LUCENE-7976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16443506#comment-16443506
 ] 

Erick Erickson commented on LUCENE-7976:
----------------------------------------

I'm much more optimistic about this approach. It extracts the scoring loop 
from findMerges into its own method and calls that method from findMerges, 
findForcedDeletesMerges and findForcedMerges. 

Each of those methods builds its own list of segments eligible for merging 
and passes that list (and some other info) to the extracted doFindMerges 
method.

So far it seems to work well.

Generally, large segments, here defined as anything over 
maxMergedSegmentBytes/2 live documents, are ignored unless the new parameter 
indexPctDeletedTarget, which defaults to 20%, is exceeded. This means that if 
(and only if) the percentage of deleted documents across the entire index is > 
20%, then segments with > maxMergedSegmentBytes/2 live docs are _eligible_ for 
merging. Whether they're actually merged depends on whether they are scored 
highest.
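
To make the eligibility rule above concrete, here is a minimal sketch of it. 
It is not the code in the patch: the Segment class, the method names and the 
choice to measure "live" size in bytes are all assumptions made for 
illustration, and scoring of the eligible segments is left out entirely.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a hypothetical model of the rule described above,
// not Lucene's TieredMergePolicy and not the code in the patch.
final class LargeSegmentEligibility {

    // Hypothetical stand-in for a segment (not Lucene's SegmentCommitInfo).
    static final class Segment {
        final String name;
        final long liveBytes; // assumed: size of the live (non-deleted) docs
        final int maxDoc;     // total docs, including deleted ones
        final int delCount;   // deleted docs

        Segment(String name, long liveBytes, int maxDoc, int delCount) {
            this.name = name;
            this.liveBytes = liveBytes;
            this.maxDoc = maxDoc;
            this.delCount = delCount;
        }
    }

    // Percentage of deleted docs across the whole index.
    static double indexPctDeleted(List<Segment> segments) {
        long totalDocs = 0;
        long totalDeleted = 0;
        for (Segment s : segments) {
            totalDocs += s.maxDoc;
            totalDeleted += s.delCount;
        }
        return totalDocs == 0 ? 0.0 : 100.0 * totalDeleted / totalDocs;
    }

    // Small segments are always eligible. Large ones (more than half the max
    // merged size in live data) become eligible only once the index-wide
    // deleted percentage exceeds the target.
    static List<Segment> eligible(List<Segment> segments,
                                  long maxMergedSegmentBytes,
                                  double indexPctDeletedTarget) {
        boolean largeAllowed = indexPctDeleted(segments) > indexPctDeletedTarget;
        List<Segment> result = new ArrayList<>();
        for (Segment s : segments) {
            boolean large = s.liveBytes > maxMergedSegmentBytes / 2;
            if (!large || largeAllowed) {
                result.add(s);
            }
        }
        return result;
    }
}
```

Being eligible only means the segment is handed to the scoring loop; whether 
it is actually merged is still decided by its score.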

On a relatively quick test, setting indexPctDeletedTarget to 20% causes about 
10% more bytes to be written. Setting it to 10% causes 50% more bytes to be 
written. Setting it to 50% (which is kind of the default now) causes the number 
of bytes written to _drop_ by about 10%, but I consider that mostly noise.

forceMerge to 1 segment is still possible, and continued indexing will 
gradually shrink that segment back down: once indexPctDeletedTarget is 
exceeded, these large segments become eligible for merging.

So despite the size of the patch, the actual code differences are not nearly as 
great as it might seem. It's mostly moving some code around.

Comments welcome, I'm going to put this down for a few days. There are still a 
few nocommits and the like, but not many.

I do have one question: When should writer.numDeletesToMerge(info) be preferred 
over info.getDelCount()? The former seems more expensive.

Oh, and I haven't run precommit or test on it yet, just gathered stats on 
indexing to the new and old code.

> Make TieredMergePolicy respect maxSegmentSizeMB and allow singleton merges of 
> very large segments
> -------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-7976
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7976
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Erick Erickson
>            Assignee: Erick Erickson
>            Priority: Major
>         Attachments: LUCENE-7976.patch, LUCENE-7976.patch, LUCENE-7976.patch, 
> LUCENE-7976.patch
>
>
> We're seeing situations "in the wild" where there are very large indexes (on 
> disk) handled quite easily in a single Lucene index. This is particularly 
> true as features like docValues move data into MMapDirectory space. The 
> current TMP algorithm allows on the order of 50% deleted documents as per a 
> dev list conversation with Mike McCandless (and his blog here:  
> https://www.elastic.co/blog/lucenes-handling-of-deleted-documents).
> Especially in the current era of very large indexes in aggregate, (think many 
> TB) solutions like "you need to distribute your collection over more shards" 
> become very costly. Additionally, the tempting "optimize" button exacerbates 
> the issue since once you form, say, a 100G segment (by 
> optimizing/forceMerging) it is not eligible for merging until 97.5G of the 
> docs in it are deleted (current default 5G max segment size).
> The proposal here would be to add a new parameter to TMP, something like 
> <maxAllowedPctDeletedInBigSegments> (no, that's not a serious name, 
> suggestions welcome) which would default to 100 (or the same behavior we have now).
> So if I set this parameter to, say, 20%, and the max segment size stays at 
> 5G, the following would happen when segments were selected for merging:
> > any segment with > 20% deleted documents would be merged or rewritten NO 
> > MATTER HOW LARGE. There are two cases,
> >> the segment has < 5G "live" docs. In that case it would be merged with 
> >> smaller segments to bring the resulting segment up to 5G. If no smaller 
> >> segments exist, it would just be rewritten
> >> The segment has > 5G "live" docs (the result of a forceMerge or optimize). 
> >> It would be rewritten into a single segment removing all deleted docs no 
> >> matter how big it is to start. The 100G example above would be rewritten 
> >> to an 80G segment for instance.
> Of course this would lead to potentially much more I/O which is why the 
> default would be the same behavior we see now. As it stands now, though, 
> there's no way to recover from an optimize/forceMerge except to re-index from 
> scratch. We routinely see 200G-300G Lucene indexes at this point "in the 
> wild" with 10s of shards replicated 3 or more times. And that doesn't even 
> include having these over HDFS.
> Alternatives welcome! Something like the above seems minimally invasive. A 
> new merge policy is certainly an alternative.
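
The two figures in the issue description (97.5G deleted before a 100G segment 
becomes eligible, and 100G rewritten to 80G at 20% deletes) can be checked 
with a few lines of arithmetic. This is only a sketch: the class and method 
names are made up, and it assumes segment size scales linearly with the 
number of live documents, which real indexes only approximate.

```java
// Illustrative arithmetic only; names are hypothetical and sizes are in GB.
final class ForceMergeArithmetic {

    // With the current behavior a segment becomes eligible for merging once
    // its live data drops below half the max segment size, so this much of
    // it has to be deleted first.
    static double deletedBeforeEligible(double segmentGB, double maxSegmentGB) {
        return segmentGB - maxSegmentGB / 2.0;
    }

    // Approximate size after rewriting a segment to drop its deleted docs,
    // assuming size is proportional to the live document count.
    static double sizeAfterRewrite(double segmentGB, double pctDeleted) {
        return segmentGB * (1.0 - pctDeleted / 100.0);
    }
}
```

With a 100G segment and the default 5G max segment size, 
deletedBeforeEligible(100, 5) gives 97.5, and sizeAfterRewrite(100, 20) gives 
roughly 80.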



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

