benwtrent commented on code in PR #15741:
URL: https://github.com/apache/lucene/pull/15741#discussion_r2834064711


##########
lucene/core/src/test/org/apache/lucene/search/TestSortOptimization.java:
##########
@@ -1071,6 +1071,72 @@ private void testStringSortOptimizationWithMissingValues(
     dir.close();
   }
 
+  public void testStringSortOptimizationFieldMissingInSegmentBasedPostings() 
throws IOException {
+    testStringSortOptimizationFieldMissingInSegment(
+        (field, value) -> new KeywordField(field, value, Field.Store.NO));
+  }
+
+  public void testStringSortOptimizationFieldMissingInSegmentBasedDVSkipper() 
throws IOException {
+    
testStringSortOptimizationFieldMissingInSegment(SortedDocValuesField::indexedField);
+  }
+
+  /**
+   * Test that when a segment doesn't contain the sort field at all (fieldInfo 
== null), the
+   * optimization still works. All docs in such a segment have missing values, 
and when missing
+   * values are non-competitive the entire segment should be skippable.
+   */

Review Comment:
   I think we should also do a test when its `missing_last` but reversed sorted 
and `missing_first` but naturally sorted. These would obviously NOT exit early 
(e.g. they aren't competitive).



##########
lucene/core/src/java/org/apache/lucene/search/comparators/TermOrdValComparator.java:
##########
@@ -511,6 +516,18 @@ private abstract static class CompetitiveState {
     abstract void update(int minOrd, int maxOrd) throws IOException;
   }
 
+  private static class EmptyCompetitiveState extends CompetitiveState {
+
+    EmptyCompetitiveState(LeafReaderContext context) {
+      super(context);
+    }
+
+    @Override
+    void update(int minOrd, int maxOrd) throws IOException {
+      iterator.update(DocIdSetIterator.empty());
+    }

Review Comment:
   So, reading through all the logic, I honestly think the "empty" logic is 
handled directly in the `TermOrdValLeafComparator` and the iterators will see 
`0, 0` in the update. So, this state is only here to indicate that competitive 
iterating is possible and all the "missing" handling logic (e.g. can stop on 
asc, but cannot stop on desc) seems to be done elsewhere. 
   
   This is a nice little clarification and minor optimization even over what 
was done before :)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to