[
https://issues.apache.org/jira/browse/LUCENE-7523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15615379#comment-15615379
]
Michael McCandless commented on LUCENE-7523:
--------------------------------------------
UIMP was really designed for one-off usage via the {{IndexUpgrader}} tool, but
I agree it's interesting: maybe it could instead become a merge policy that
passes ordinary merging through to the wrapped policy as well?
It's a somewhat complex problem, though: if the merge policy is presented with
an index that has N old segments and M new ones, and it's in need of merging,
how does it pick? Is it only {{forceMerge}} that would explicitly target the
old segments first? Or would there just be an added bias to favor old ones,
like how {{TieredMergePolicy}} biases toward segments that have more deletions?
Maybe we just fold this behavior into TMP and remove UIMP?
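To make the "added bias" idea concrete, here is a minimal sketch, not Lucene code. It assumes merge candidates are scored with lower being better, and that the score is scaled down in proportion to how many old segments the candidate contains. {{Seg}}, {{biasedScore}} and {{oldBias}} are all hypothetical names.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch; none of these names exist in Lucene. */
final class OldSegmentBiasSketch {

    /** Minimal stand-in for a segment: only whether an older version wrote it. */
    static final class Seg {
        final String name;
        final boolean old;
        Seg(String name, boolean old) { this.name = name; this.old = old; }
    }

    /**
     * Lower scores are better. baseScore is whatever the policy would compute
     * anyway; oldBias in [0, 1) controls how strongly candidates that contain
     * old segments are favored.
     */
    static double biasedScore(double baseScore, List<Seg> candidate, double oldBias) {
        if (candidate.isEmpty()) {
            return baseScore;
        }
        int oldCount = 0;
        for (Seg s : candidate) {
            if (s.old) {
                oldCount++;
            }
        }
        double oldFraction = (double) oldCount / candidate.size();
        return baseScore * (1.0 - oldBias * oldFraction);
    }

    public static void main(String[] args) {
        List<Seg> mixed = Arrays.asList(new Seg("_0", true), new Seg("_1", false));
        List<Seg> fresh = Arrays.asList(new Seg("_2", false), new Seg("_3", false));
        // The candidate containing an old segment gets the lower (better) score.
        System.out.println(biasedScore(1.0, mixed, 0.5));
        System.out.println(biasedScore(1.0, fresh, 0.5));
    }
}
```

With a bias like this, ordinary merging still drives selection, and old segments only win when candidates are otherwise comparable.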
bq. That extra new segment could be quite a large 'monster' segment.
Maybe we could have a {{maxMergedSegmentMB}}, like {{TieredMergePolicy}}? Then
UIMP would send to the wrapped merge policy only a set of segments whose total
size is less than that limit, maybe?
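A rough sketch of that size cap, under assumptions of my own: old segments are taken smallest first until the running total would exceed the limit, and whatever does not fit is simply left out of this round. {{Seg}} and {{selectWithinBudget}} are hypothetical names, not Lucene APIs.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch; none of these names exist in Lucene. */
final class SizeCappedUpgradeSketch {

    /** Minimal stand-in for a segment: name, size in MB, written by an older version or not. */
    static final class Seg {
        final String name;
        final double sizeMB;
        final boolean old;
        Seg(String name, double sizeMB, boolean old) {
            this.name = name;
            this.sizeMB = sizeMB;
            this.old = old;
        }
    }

    /** Picks old segments, smallest first, while their total size stays within maxMergedSegmentMB. */
    static List<Seg> selectWithinBudget(List<Seg> segments, double maxMergedSegmentMB) {
        List<Seg> oldOnes = new ArrayList<>();
        for (Seg s : segments) {
            if (s.old) {
                oldOnes.add(s);
            }
        }
        oldOnes.sort(Comparator.comparingDouble((Seg s) -> s.sizeMB));

        List<Seg> selected = new ArrayList<>();
        double total = 0.0;
        for (Seg s : oldOnes) {
            if (total + s.sizeMB > maxMergedSegmentMB) {
                break; // sorted ascending, so no later segment fits either
            }
            selected.add(s);
            total += s.sizeMB;
        }
        return selected;
    }
}
```

Segments that do not fit would still need upgrading some other way, for example in a later pass or one at a time.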
bq. UIMP.findMerges does not pass the mergeTrigger to the inner/delegate merge
policy.
That seems like a bug to me.
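For illustration only, this is the delegation shape I would expect, modeled with simplified stand-in types rather than the real Lucene classes. {{Trigger}}, {{Policy}}, {{RecordingPolicy}} and {{ForwardingWrapper}} are hypothetical. The wrapper hands the trigger it received to the inner policy unchanged.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; simplified stand-ins, not the Lucene types. */
final class TriggerForwardingSketch {

    /** Simplified stand-in for the reason a merge check was requested. */
    enum Trigger { SEGMENT_FLUSH, FULL_FLUSH, EXPLICIT }

    /** Simplified stand-in for a merge policy: returns the segments it wants merged. */
    interface Policy {
        List<String> findMerges(Trigger trigger, List<String> segments);
    }

    /** Inner policy that records which triggers it was called with. */
    static final class RecordingPolicy implements Policy {
        final List<Trigger> seen = new ArrayList<>();

        @Override
        public List<String> findMerges(Trigger trigger, List<String> segments) {
            seen.add(trigger);
            return new ArrayList<>(segments);
        }
    }

    /** Wrapper that forwards the caller's trigger instead of dropping it. */
    static final class ForwardingWrapper implements Policy {
        private final Policy inner;

        ForwardingWrapper(Policy inner) {
            this.inner = inner;
        }

        @Override
        public List<String> findMerges(Trigger trigger, List<String> segments) {
            return inner.findMerges(trigger, segments);
        }
    }
}
```

A recording inner policy like this also makes the behavior easy to cover with a unit test.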
> UpgradeIndexMergePolicy: beyond one-off use, monster segment avoidance
> ----------------------------------------------------------------------
>
> Key: LUCENE-7523
> URL: https://issues.apache.org/jira/browse/LUCENE-7523
> Project: Lucene - Core
> Issue Type: Task
> Reporter: Christine Poerschke
> Priority: Minor
> Attachments: LUCENE-7523-outline.patch
>
>
> (Was looking at UpgradeIndexMergePolicy as part of SOLR-9648 and came up with
> these possibilities here, what do people think?)
> Currently one probably would not configure use of the
> {{UpgradeIndexMergePolicy}} (UIMP) permanently since
> [findForcedMerges|https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/index/UpgradeIndexMergePolicy.java#L74]
> becomes a no-op after all segments have been upgraded.
> * How about adding an optional {{fallbackToInnerAfterUpgrade}} flag? That way
> UIMP.findForcedMerges could fall back to its inner/delegate merge policy's
> findForcedMerges call after all segments have been upgraded.
> Currently UIMP.findForcedMerges identifies all the segments to be upgraded
> and then asks its inner/delegate merge policy to come up with a
> MergeSpecification for those segments. If the inner/delegate merge policy
> does not supply a merge for all the segments to be upgraded then UIMP merges
> the remaining segments into _one_ new segment. That extra new segment could
> be quite a large 'monster' segment.
> * How about adding an optional {{upgradeUnmergedSegmentsIndividually}} flag?
> That way UIMP.findForcedMerges could upgrade (but not merge) the remaining
> segments.
> * Or indeed should 'upgradeUnmergedSegmentsIndividually' be the default
> behaviour?
> Noticed whilst looking at the code:
> *
> [UIMP.findMerges|https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/index/UpgradeIndexMergePolicy.java#L69]
> does not pass the mergeTrigger to the inner/delegate merge policy.
> ** If we can figure out why that is, let's add a comment explaining it.
> ** Understanding why that is would also be needed before proceeding with
> use of UIMP beyond one-off upgrades.
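The difference between the current behaviour and the proposed {{upgradeUnmergedSegmentsIndividually}} flag in the quoted description can be sketched as follows. This is a simplified model, not the patch: segments are plain names, a "merge" is a list of names, and {{planRemaining}} is a hypothetical helper covering only the segments the inner policy left unmerged.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch; none of these names exist in Lucene. */
final class RemainingSegmentsSketch {

    /**
     * Plans merges for old segments the inner policy did not cover.
     * individually == false: one merge containing all of them (possible monster segment).
     * individually == true: one single-segment merge each, which rewrites
     * (upgrades) every segment without combining it with others.
     */
    static List<List<String>> planRemaining(List<String> remaining, boolean individually) {
        List<List<String>> merges = new ArrayList<>();
        if (remaining.isEmpty()) {
            return merges;
        }
        if (individually) {
            for (String segment : remaining) {
                merges.add(Collections.singletonList(segment));
            }
        } else {
            merges.add(new ArrayList<>(remaining));
        }
        return merges;
    }
}
```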