I'd +1 a change that makes setDiagnostics public. Longer term I wonder if
we should have a more locked down API that _only_ allows setting
diagnostics. There are lots of things in SegmentCommitInfo that merges
should never override like the segment ID, and I can't think of anything
else than diagnostics that a merge policy should ever need to set.

On Fri, Sep 24, 2021 at 10:57 AM Chris Hegarty
<[email protected]> wrote:

> In an effort to prepare Elasticsearch for modularization, we are
> investigating and eliminating split packages. The situation has improved
> through recent refactoring in Lucene 9.0 [1], but a number of split
> packages still remain. This message identifies one such so that it can
> be discussed in isolation, with a view to a potential solution either in
> Lucene or possibly within Elasticsearch itself.
>
> Elasticsearch has a custom merge policy, ShuffleForcedMergePolicy, that
> interleaves eldest and newest segments. SFMP has but a single
> package-private dependency on lucene, namely `SegmentInfo::setDiagnostics`.
> The `setDiagnostics` method is used by SFMP to add a single additional
> ES-specific diagnostic key.
>
> It is possible to recreate the whole SegmentCommitInfo (and the SI
> within, along with the additional ES-specific diagnostic key), by using
> just the accessible parts of the lucene API. This has been prototyped[2],
> but shows that something is not quite right [3]. Resolving this issue is
> likely better done in lucene.
>
> The comment in `MergePolicy::setMergeInfo` - "Sets the SegmentCommitInfo
> of the merged segment. Allows sub-classes to e.g. set diagnostics
> properties"[4] - is telling. To better support the use-cases referred to
> in this comment and to allow for other-package subclasses (which seems
> completely reasonable given other parts of the API), it would be better
> for lucene's SI::setDiagnostics method to be public (rather than
> package-private). Such a change is a general improvement in the lucene
> API, which can be leveraged by any custom merge policy.
>
> -Chris.
>
> [1] https://issues.apache.org/jira/browse/LUCENE-9319
> [2] https://github.com/elastic/elasticsearch/pull/78253
> [3]
> https://github.com/elastic/elasticsearch/pull/78253#issuecomment-925861893
> [4]
> https://github.com/apache/lucene/blob/d5d6dc079395c47cd6d12dcce3bcfdd2c7d9dc63/lucene/core/src/java/org/apache/lucene/index/MergePolicy.java#L271
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>
>

-- 
Adrien

Reply via email to