salvatore-campagna opened a new issue, #15740:
URL: https://github.com/apache/lucene/issues/15740

   ### Description
   
   Lucene currently has two ways to retrieve the global min/max value of a 
numeric field across segments:
   
   - `PointValues.getMinPackedValue()` / `PointValues.getMaxPackedValue()`:
return `null` when no points exist for the field.
   - `DocValuesSkipper.globalMinValue()` / `DocValuesSkipper.globalMaxValue()`:
return a sentinel value (`Long.MIN_VALUE` or `Long.MAX_VALUE`) when no data
exists or when the skipper is not available for a leaf reader.
   
   These two APIs have different "no data" semantics. `PointValues` returns 
`null`, which callers can check for and handle cleanly. `DocValuesSkipper` 
returns sentinel values that callers must know about and filter out. 
Specifically:
   
   - `globalMinValue()` returns `Long.MAX_VALUE` when no segments have the 
field, and `Long.MIN_VALUE` when a leaf reader has the field info but no 
skipper.
   - `globalMaxValue()` returns `Long.MIN_VALUE` when no segments have the 
field, and `Long.MAX_VALUE` when a leaf reader has the field info but no 
skipper.
   
   This is error-prone for callers that need a field's min/max values: they
must first determine which data structure is available, then call the matching
API, and then handle that API's "no data" convention. If a caller picks the
wrong API or forgets to filter out the sentinels, invalid values propagate
silently.
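
To make the sentinel conventions concrete, here is a minimal, self-contained model of the `globalMinValue()` behavior described above, plus the guard a caller would need today. It is a sketch built only from the description in this issue, not Lucene's code: `SegmentModel`, `SentinelModel`, `globalMin` and `minOrNull` are hypothetical names, and a segment is reduced to two fields.

```java
import java.util.List;

// Hypothetical stand-in for one index segment; not a Lucene type.
// skipperMin is null when the segment has the field but no skipper.
record SegmentModel(boolean hasField, Long skipperMin) {}

final class SentinelModel {
    // Models the globalMinValue() conventions described in this issue.
    static long globalMin(List<SegmentModel> segments) {
        long min = Long.MAX_VALUE; // remains MAX_VALUE if no segment has the field
        for (SegmentModel segment : segments) {
            if (!segment.hasField()) {
                continue; // segment does not contain the field at all
            }
            if (segment.skipperMin() == null) {
                return Long.MIN_VALUE; // field info present, but no skipper
            }
            long segmentMin = segment.skipperMin();
            min = Math.min(min, segmentMin);
        }
        return min;
    }

    // Caller-side guard: map both sentinels to null.
    // Caveat: a field whose real minimum is Long.MIN_VALUE or Long.MAX_VALUE
    // cannot be told apart from "no data" with this check.
    static Long minOrNull(long globalMin) {
        if (globalMin == Long.MAX_VALUE || globalMin == Long.MIN_VALUE) {
            return null;
        }
        return globalMin;
    }
}
```

In this model, a caller that skips `minOrNull` and feeds `globalMin(...)` straight into a range computation would treat `Long.MAX_VALUE` or `Long.MIN_VALUE` as a real bound, which is the silent propagation described above.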
   
   ### Proposal
   
   Introduce a unified API for retrieving the global min/max value of a numeric 
field, abstracting over the underlying data structure. The API should:
   
   1. Return `null` when no data exists, regardless of whether the field uses 
BKD trees or doc values skippers.
   2. Automatically delegate to whichever data structure is available for the 
field.
   3. Define clear behavior when both structures are available or when neither 
is available (return `null`).
   
   A possible solution:
   
   ```java
   public record MinMax(long min, long max) {}
   
   // Returns null if values cannot be loaded
   public static MinMax getGlobalMinMax(IndexReader reader, String field) 
throws IOException { ... }
   ```
   
   This would eliminate the need for callers to know which underlying data 
structure a field uses and would prevent sentinel values from leaking into 
application logic.
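
The delegation in points 1 to 3 could look roughly like the sketch below. It is illustrative only and makes several assumptions: `MinMaxSource` and `UnifiedMinMax` are hypothetical names, each source stands in for one underlying structure (points or skipper) and already reports "no data" as `null`, and when both structures have data the first source passed in wins. The issue leaves that last choice open. `IOException` handling is omitted to keep the example standalone.

```java
// Same shape as the record proposed above, declared non-public so the
// example fits in a single file.
record MinMax(long min, long max) {}

// Hypothetical abstraction over one data structure; not a Lucene interface.
interface MinMaxSource {
    // Returns null when this structure holds no data for the field.
    MinMax minMax();
}

final class UnifiedMinMax {
    // Asks each source in order and returns the first non-null result.
    // Returns null when no source has data, so no sentinel reaches the caller.
    static MinMax getGlobalMinMax(MinMaxSource... sources) {
        for (MinMaxSource source : sources) {
            MinMax result = source.minMax();
            if (result != null) {
                return result;
            }
        }
        return null;
    }
}
```

With this shape, callers only ever check for `null`; which structure produced the answer is an implementation detail behind `getGlobalMinMax`.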


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

