[
https://issues.apache.org/jira/browse/SOLR-17310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17854761#comment-17854761
]
Christine Poerschke commented on SOLR-17310:
--------------------------------------------
Following on from my
[https://github.com/apache/solr/pull/2477/files#r1638276023] comment:
{quote}... Leaf sorting is "between segment sorting" and we also have index
sorting i.e. "within segment sorting" – I wonder if there might be enough
commonality to generalise. ...
{quote}
Perhaps something like
{code:java}
public abstract class IndexSorters {
  public abstract Comparator<LeafReader> getLeafSorter(); // via SOLR-17310 here
  public abstract Sort getIndexSort(); // via SOLR-13681 i.e. not here
}
{code}
looking something like this in configuration (with both elements optional)
{code:xml}
<indexSorters class="org.apache.solr.index.DefaultIndexSorters">
  <str name="betweenSegmentSort">timestamp_i_dvo desc</str>
  <str name="withinSegmentSort">timestamp_i_dvo desc</str>
</indexSorters>
{code}
e.g. similar to
[https://solr.apache.org/guide/solr/latest/configuration-guide/index-segments-merging.html#mergepolicyfactory]
and something like
{code:java}
public class DefaultIndexSorters extends IndexSorters {
  @Override
  public Comparator<LeafReader> getLeafSorter() {
    if (betweenSegmentSort != null) {
      final Sort sort = SortSpecParsing.parseSortSpec(betweenSegmentSort, schema).getSort();
      // check that sort contains only one field and that it's of suitable type
      // construct and return comparator similar to
      // https://github.com/apache/lucene/blob/releases/lucene/9.10.0/lucene/core/src/test/org/apache/lucene/index/TestIndexWriterReader.java#L1217-L1237
    }
    return null;
  }

  @Override
  public Sort getIndexSort() {
    if (withinSegmentSort != null) {
      return SortSpecParsing.parseSortSpec(withinSegmentSort, schema).getSort();
    } else {
      return null;
    }
  }
}
{code}
as a default implementation.
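To make the "check that sort contains only one field" and "construct and return comparator" steps a bit more concrete, here is a minimal, Lucene-free sketch. It rests on several assumptions: {{Segment}} is a hypothetical stand-in for {{LeafReader}} that already carries the maximum value of the sort field, the spec is parsed by hand rather than via {{SortSpecParsing}}, and the field name is not resolved against a schema. It only illustrates the ordering logic, not how Solr would implement it.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

/** Lucene-free sketch of the "betweenSegmentSort" idea; all names here are hypothetical. */
final class BetweenSegmentSortSketch {

  /** Hypothetical stand-in for a LeafReader plus the max value of the sort field in it. */
  static final class Segment {
    final String name;
    final long maxValue;

    Segment(String name, long maxValue) {
      this.name = name;
      this.maxValue = maxValue;
    }
  }

  /**
   * Turns a spec such as "timestamp_i_dvo desc" into a comparator over segments.
   * Returns null when nothing is configured, mirroring the "return null" in the proposal.
   */
  static Comparator<Segment> leafSorter(String spec) {
    if (spec == null || spec.trim().isEmpty()) {
      return null;
    }
    // A leaf sorter compares whole segments, so only a single sort field makes sense.
    if (spec.contains(",")) {
      throw new IllegalArgumentException("expected exactly one sort field: " + spec);
    }
    String[] parts = spec.trim().split("\\s+");
    if (parts.length != 2) {
      throw new IllegalArgumentException("expected '<field> asc|desc': " + spec);
    }
    // parts[0] is the field name; real code would resolve and type-check it against the schema.
    Comparator<Segment> ascending = Comparator.comparingLong(s -> s.maxValue);
    switch (parts[1].toLowerCase(Locale.ROOT)) {
      case "asc":
        return ascending;
      case "desc":
        return ascending.reversed();
      default:
        throw new IllegalArgumentException("unknown sort direction: " + parts[1]);
    }
  }

  /** Returns segment names in the order they would be searched; unsorted if spec is absent. */
  static List<String> searchOrder(List<Segment> segments, String spec) {
    List<Segment> sorted = new ArrayList<>(segments);
    Comparator<Segment> comparator = leafSorter(spec);
    if (comparator != null) {
      sorted.sort(comparator);
    }
    List<String> names = new ArrayList<>();
    for (Segment segment : sorted) {
      names.add(segment.name);
    }
    return names;
  }

  public static void main(String[] args) {
    List<Segment> segments = new ArrayList<>();
    segments.add(new Segment("_a", 100L));
    segments.add(new Segment("_b", 300L));
    segments.add(new Segment("_c", 200L));
    System.out.println(searchOrder(segments, "timestamp_i_dvo desc"));
  }
}
```

With {{desc}}, the segment holding the largest value is visited first, which is the behaviour the early-termination use case in the issue description asks for.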
Or perhaps something other than {{<str
name="betweenSegmentSort">timestamp_i_dvo desc</str>}} would be a more
generally meaningful default implementation?
Whatever the default implementation, the {{<indexSorters
class="org.apache.solr.index.DefaultIndexSorters">}} class attribute would
allow for custom sorters too.
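As a rough illustration of how the class attribute could allow custom sorters, here is a self-contained sketch that instantiates an implementation by class name. Everything in it is an assumption for illustration: the nested {{IndexSorters}} is simplified to use {{String}} in place of {{LeafReader}} and {{Sort}}, {{ReverseNameIndexSorters}} is a made-up custom implementation, and plain reflection stands in for whatever plugin loading Solr would actually use.

```java
import java.util.Comparator;

/** Sketch of picking an IndexSorters implementation by class name; all names are hypothetical. */
final class IndexSortersLoadingSketch {

  /** Simplified stand-in for the proposed abstract class (String replaces LeafReader and Sort). */
  public abstract static class IndexSorters {
    public abstract Comparator<String> getLeafSorter();

    public abstract String getIndexSort();
  }

  /** Default behaviour when nothing is configured: no sorting at either level. */
  public static class DefaultIndexSorters extends IndexSorters {
    @Override
    public Comparator<String> getLeafSorter() {
      return null;
    }

    @Override
    public String getIndexSort() {
      return null;
    }
  }

  /** Made-up custom implementation: visit segments in reverse name order. */
  public static class ReverseNameIndexSorters extends IndexSorters {
    @Override
    public Comparator<String> getLeafSorter() {
      return Comparator.reverseOrder();
    }

    @Override
    public String getIndexSort() {
      return null;
    }
  }

  /** Instantiates the configured class, rejecting anything that is not an IndexSorters. */
  static IndexSorters load(String className) {
    try {
      return Class.forName(className)
          .asSubclass(IndexSorters.class)
          .getDeclaredConstructor()
          .newInstance();
    } catch (ReflectiveOperationException | ClassCastException e) {
      throw new IllegalArgumentException("cannot load IndexSorters: " + className, e);
    }
  }
}
```

The point of the indirection is that the caller only depends on the abstract type, so swapping the configured class name changes the segment ordering without touching the calling code.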
> Configurable LeafSorter to customize segment search order
> ---------------------------------------------------------
>
> Key: SOLR-17310
> URL: https://issues.apache.org/jira/browse/SOLR-17310
> Project: Solr
> Issue Type: New Feature
> Components: search
> Reporter: wei wang
> Priority: Minor
> Time Spent: 1.5h
> Remaining Estimate: 0h
>
> Lucene's IndexWriterConfig provides the option to sort leaf readers when a
> custom LeafSorter is provided.
> [https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/index/IndexWriterConfig.java#L494]
>
> The functionality is currently not directly exposed in Solr. One use case is
> early termination, where we would like to search the most recently updated
> segments first. The SegmentTimeLeafSorter sorts the LeafReaders by
> timestamp, so that recent NRT segments are traversed first. The feature is
> enabled by adding the *segmentSort* config in solrconfig.xml. Without the
> config, no sorting is applied.