In an effort to prepare Elasticsearch for modularization, we are investigating and eliminating split packages. The situation has improved through recent refactoring in Lucene 9.0 [1], but a number of split packages still remain. This message identifies one such so that it can be discussed in isolation, with a view to a potential solution either in Lucene or possibly within Elasticsearch itself.
Elasticsearch has a custom merge policy, ShuffleForcedMergePolicy, that interleaves eldest and newest segments. SFMP has but a single package-private dependency on lucene, namely `SegmentInfo::setDiagnostics`. The `setDiagnostics` method is used by SFMP to add a single additional ES-specific diagnostic key. It is possible to recreate the whole SegmentCommitInfo (and the SI within, along with the additional ES-specific diagnostic key), by using just the accessible parts of the lucene API. This has been prototyped[2], but shows that something is not quite right [3]. Resolving this issue is likely better done in lucene. The comment in `MergePolicy::setMergeInfo` - "Sets the SegmentCommitInfo of the merged segment. Allows sub-classes to e.g. set diagnostics properties"[4] - is telling. To better support the use-cases referred to in this comment and to allow for other-package subclasses (which seems completely reasonable given other parts of the API), it would be better for lucene's SI::setDiagnostics method to be public (rather than package-private). Such a change is a general improvement in the lucene API, which can be leveraged by any custom merge policy. -Chris. [1] https://issues.apache.org/jira/browse/LUCENE-9319 [2] https://github.com/elastic/elasticsearch/pull/78253 [3] https://github.com/elastic/elasticsearch/pull/78253#issuecomment-925861893 [4] https://github.com/apache/lucene/blob/d5d6dc079395c47cd6d12dcce3bcfdd2c7d9dc63/lucene/core/src/java/org/apache/lucene/index/MergePolicy.java#L271 --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
