Thanks Mike! It's great to have other eyes on it (and I'm taking a bit of a break to come back at it with fresh eyes).
It'll be a bit before I can respond in detail. So far the latest patch has successfully run through one full test iteration, which is of course totally inadequate before checking in. I intend to send it through a bunch more before thinking about committing, but any failing cases are most welcome since I can beast them. Again, thanks for taking the time.

On Tue, Apr 24, 2018 at 8:48 AM, Michael McCandless (JIRA) <j...@apache.org> wrote:
>
> [ https://issues.apache.org/jira/browse/LUCENE-7976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16450110#comment-16450110 ]
>
> Michael McCandless commented on LUCENE-7976:
> --------------------------------------------
>
> {quote}
> // We did our best to find the right merges, but through the vagaries of the scoring algorithm etc. we didn't
> // merge down to the required max segment count. So merge the N smallest segments to make it so.
> {quote}
>
> Hmm, can you describe why this would happen? Seems like if you ask the scoring algorithm to find merges down to N segments, it shouldn't ever fail?
>
> We also seem to invoke {{getSegmentSizes}} more than once in {{findForcedMerges}}?
>
>> Make TieredMergePolicy respect maxSegmentSizeMB and allow singleton merges of very large segments
>> -------------------------------------------------------------------------------------------------
>>
>>                 Key: LUCENE-7976
>>                 URL: https://issues.apache.org/jira/browse/LUCENE-7976
>>             Project: Lucene - Core
>>          Issue Type: Improvement
>>            Reporter: Erick Erickson
>>            Assignee: Erick Erickson
>>            Priority: Major
>>       Attachments: LUCENE-7976.patch, LUCENE-7976.patch, LUCENE-7976.patch, LUCENE-7976.patch, LUCENE-7976.patch, LUCENE-7976.patch
>>
>>
>> We're seeing situations "in the wild" where very large indexes (on disk) are handled quite easily in a single Lucene index. This is particularly true as features like docValues move data into MMapDirectory space. The current TMP algorithm allows on the order of 50% deleted documents, as per a dev list conversation with Mike McCandless (and his blog here: https://www.elastic.co/blog/lucenes-handling-of-deleted-documents).
>> Especially in the current era of very large indexes in aggregate (think many TB), solutions like "you need to distribute your collection over more shards" become very costly. Additionally, the tempting "optimize" button exacerbates the issue, since once you form, say, a 100G segment (by optimizing/forceMerging) it is not eligible for merging until 97.5G of the docs in it are deleted (current default 5G max segment size).
>> The proposal here would be to add a new parameter to TMP, something like <maxAllowedPctDeletedInBigSegments> (no, that's not a serious name; suggestions welcome), which would default to 100 (i.e., the same behavior we have now).
>> So if I set this parameter to, say, 20%, and the max segment size stays at 5G, the following would happen when segments were selected for merging:
>> > Any segment with > 20% deleted documents would be merged or rewritten NO MATTER HOW LARGE. There are two cases:
>> >> The segment has < 5G "live" docs. In that case it would be merged with smaller segments to bring the resulting segment up to 5G. If no smaller segments exist, it would just be rewritten.
>> >> The segment has > 5G "live" docs (the result of a forceMerge or optimize). It would be rewritten into a single segment removing all deleted docs no matter how big it is to start.
>> >> The 100G example above would be rewritten to an 80G segment, for instance.
>> Of course this would lead to potentially much more I/O, which is why the default would be the same behavior we see now. As it stands now, though, there's no way to recover from an optimize/forceMerge except to re-index from scratch. We routinely see 200G-300G Lucene indexes at this point "in the wild", with 10s of shards replicated 3 or more times. And that doesn't even include having these over HDFS.
>> Alternatives welcome! Something like the above seems minimally invasive. A new merge policy is certainly an alternative.
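P.S. For anyone skimming the proposal quoted above, here's my attempt to put the eligibility rule in a few lines of Java so we're all talking about the same thing. To be clear, this is a sketch of the idea, not code from the patch, and every name in it is invented:

{code:java}
// Sketch of the proposed rule only -- all names are invented, not from
// the patch. Assumes today's TMP behavior of skipping segments whose
// live data exceeds half the max merged segment size.
public class MergeEligibilitySketch {

  static boolean eligibleForMerge(int maxDoc, int delDocs, long liveBytes,
                                  long maxSegmentBytes, double maxAllowedPctDeleted) {
    double pctDeleted = 100.0 * delDocs / maxDoc;
    if (pctDeleted > maxAllowedPctDeleted) {
      // Over the threshold: eligible NO MATTER HOW LARGE. With more live
      // data than maxSegmentBytes this becomes a singleton rewrite (the
      // 100G segment with 20% deletes rewritten to ~80G); otherwise it
      // can be merged with smaller segments up toward maxSegmentBytes.
      return true;
    }
    // Under the threshold (and always with the proposed default of 100):
    // current behavior, big segments are left alone.
    return liveBytes <= maxSegmentBytes / 2;
  }

  public static void main(String[] args) {
    long G = 1024L * 1024 * 1024;
    // 100G segment, just over 20% deleted, threshold 20: eligible
    // (a singleton rewrite, since 80G of live data > the 5G cap).
    System.out.println(eligibleForMerge(1_000_000, 200_001, 80 * G, 5 * G, 20.0));
    // Same segment with the default threshold of 100: not eligible.
    System.out.println(eligibleForMerge(1_000_000, 200_001, 80 * G, 5 * G, 100.0));
  }
}
{code}

The half-the-max check at the bottom is what makes the 100G example ineligible today: with a 5G cap it can't be picked until its live data drops under 2.5G, i.e. until 97.5G of its docs are deleted.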