[ https://issues.apache.org/jira/browse/LUCENE-7976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16457885#comment-16457885 ]

Erick Erickson commented on LUCENE-7976:
----------------------------------------

bq. I think it still makes sense to try to break these changes into a couple 
issues

Works for me, I'll just leave the new parameter stuff commented out and we can 
discuss it in a separate JIRA.

bq. is going to be big enough!

And scary enough. Anyway, I left the new bits about indexPctDeletedTarget 
commented out.

Thanks for the test failures; "it worked for me", but I'll beast this, try the
seeds, and see.

bq. Can we make these ints, and cast to double when we need to divide them?:

Done. Longs are unnecessary since this is within a single core, right?

bq. cutoffSize = (long) ((double) maxMergeSegmentBytesThisMerge * (1.0 - 
(50/100))); divide-by-zero

Oddly it doesn't; 50/100 is integer division and comes out to 0, so the whole
factor is just 1.0 rather than a divide-by-zero, but since it's confusing I'll
use 50.0/100.0.
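For the record, here's a tiny standalone demo of what that expression actually
does (the values are made up for illustration, not from the patch):

{code:java}
public class CutoffSizeDemo {
  public static void main(String[] args) {
    long maxMergeSegmentBytesThisMerge = 5L * 1024 * 1024 * 1024; // illustrative 5G

    // 50/100 is integer division and evaluates to 0 before the subtraction,
    // so the factor is (1.0 - 0) = 1.0: no divide-by-zero, but no 50% cutoff either.
    long confusing = (long) ((double) maxMergeSegmentBytesThisMerge * (1.0 - (50 / 100)));

    // Double literals make the intent obvious and give the expected 50% cutoff.
    long clearer = (long) ((double) maxMergeSegmentBytesThisMerge * (1.0 - (50.0 / 100.0)));

    System.out.println(confusing); // 5368709120 (the full 5G)
    System.out.println(clearer);   // 2684354560 (half of it)
  }
}
{code}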


bq. // First condition is that

Nice English, wasn't it... Removed; it was a remnant of some intermediate
versions.

bq. But do not decrement it when we remove segments from eligible in the loop 
after?

In that case it's a silly variable to have since eligible.size() is the same
thing, so I removed it.

bq. Hmm can you describe why this would happen?

The problem I ran into was that, say, we're merging down to 5 segments. At some
point we might have 9 eligible segments, and it's even possible that _none_ of
them have deletes. The doFindMerges code may return a spec == null because
there aren't enough segments to make a decent merge according to that code,
and/or it would create an out-sized segment etc. So we need to collect segments
5-9 and merge them to get to 5 total segments. We try to choose the smallest
ones.

This is actually similar in spirit to the old code; at the end of
findForcedMerges there's a clause that catches this condition.
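Roughly what I have in mind for that fallback, just as a sketch inside
TieredMergePolicy (the helper name, the pre-computed size map, and the exact
signature are placeholders here, not the actual patch; java.util imports
elided):

{code:java}
// Sketch only: if doFindMerges produced no spec but we still have more
// segments than the forceMerge target, merge the smallest remaining
// segments together so we end up with exactly maxSegmentCount segments.
private MergeSpecification mergeSmallestToTarget(List<SegmentCommitInfo> eligible,
                                                 Map<SegmentCommitInfo, Long> segmentSizes,
                                                 int maxSegmentCount) {
  if (eligible.size() <= maxSegmentCount) {
    return null; // already at or below the target
  }
  List<SegmentCommitInfo> sortedBySize = new ArrayList<>(eligible);
  sortedBySize.sort(Comparator.comparingLong(segmentSizes::get));
  // Merging N segments into one reduces the segment count by N-1, so take
  // the (eligible.size() - maxSegmentCount + 1) smallest segments.
  int numToMerge = eligible.size() - maxSegmentCount + 1;
  MergeSpecification spec = new MergeSpecification();
  spec.add(new OneMerge(sortedBySize.subList(0, numToMerge)));
  return spec;
}
{code}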

All that said, this seems ad hoc, so I'll take another look at it.

I've also imagined at least one edge case that results in very dissimilarly
sized segments at the end of forceMerge, but with the singleton merge they'll
correct themselves, so I don't think it's really worth trying to protect
against.

bq. We also seem to invoke {{getSegmentSizes}} more than once in 
{{findForcedMerges}}?

I was going to take another look at all the usages of size(info, writer) as
well as getting the deleted doc count, so I'll see. I think what I want to do,
rather than manipulate the {{List<SegmentCommitInfo> eligible}}, is get the
sizes up front (and maybe the deleted docs?) and manipulate that instead (also
pass that to doFindMerges).
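Something like this for the up-front sizes, again just a sketch (the exact
signature is a placeholder; size(info, writer) already discounts deleted docs,
so the map may cover the deleted-doc angle too):

{code:java}
// Sketch: compute each segment's size once and hand the map around
// (including to doFindMerges) instead of calling size(info, writer)
// over and over in findForcedMerges.
private Map<SegmentCommitInfo, Long> getSegmentSizes(IndexWriter writer,
                                                     List<SegmentCommitInfo> infos) throws IOException {
  Map<SegmentCommitInfo, Long> sizes = new HashMap<>();
  for (SegmentCommitInfo info : infos) {
    sizes.put(info, size(info, writer));
  }
  return sizes;
}
{code}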

Probably another patch to look at soon.

> Make TieredMergePolicy respect maxSegmentSizeMB and allow singleton merges of 
> very large segments
> -------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-7976
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7976
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Erick Erickson
>            Assignee: Erick Erickson
>            Priority: Major
>         Attachments: LUCENE-7976.patch, LUCENE-7976.patch, LUCENE-7976.patch, 
> LUCENE-7976.patch, LUCENE-7976.patch, LUCENE-7976.patch
>
>
> We're seeing situations "in the wild" where there are very large indexes (on 
> disk) handled quite easily in a single Lucene index. This is particularly 
> true as features like docValues move data into MMapDirectory space. The 
> current TMP algorithm allows on the order of 50% deleted documents as per a 
> dev list conversation with Mike McCandless (and his blog here:  
> https://www.elastic.co/blog/lucenes-handling-of-deleted-documents).
> Especially in the current era of very large indexes in aggregate, (think many 
> TB) solutions like "you need to distribute your collection over more shards" 
> become very costly. Additionally, the tempting "optimize" button exacerbates 
> the issue since once you form, say, a 100G segment (by 
> optimizing/forceMerging) it is not eligible for merging until 97.5G of the 
> docs in it are deleted (current default 5G max segment size).
> The proposal here would be to add a new parameter to TMP, something like 
> <maxAllowedPctDeletedInBigSegments> (no, that's not a serious name, suggestions 
> welcome) which would default to 100 (or the same behavior we have now).
> So if I set this parameter to, say, 20%, and the max segment size stays at 
> 5G, the following would happen when segments were selected for merging:
> > any segment with > 20% deleted documents would be merged or rewritten NO 
> > MATTER HOW LARGE. There are two cases,
> >> the segment has < 5G "live" docs. In that case it would be merged with 
> >> smaller segments to bring the resulting segment up to 5G. If no smaller 
> >> segments exist, it would just be rewritten
> >> The segment has > 5G "live" docs (the result of a forceMerge or optimize). 
> >> It would be rewritten into a single segment removing all deleted docs no 
> >> matter how big it is to start. The 100G example above would be rewritten 
> >> to an 80G segment for instance.
> Of course this would lead to potentially much more I/O which is why the 
> default would be the same behavior we see now. As it stands now, though, 
> there's no way to recover from an optimize/forceMerge except to re-index from 
> scratch. We routinely see 200G-300G Lucene indexes at this point "in the 
> wild" with 10s of  shards replicated 3 or more times. And that doesn't even 
> include having these over HDFS.
> Alternatives welcome! Something like the above seems minimally invasive. A 
> new merge policy is certainly an alternative.


