samuelcolvin opened a new issue, #10400:
URL: https://github.com/apache/datafusion/issues/10400

   ### Describe the bug
   
   Maybe related to #5535, but I couldn't find anything identical, so I created a 
fresh issue.
   
   If this is a known bug and you think the fix might be moderate in scope, I'm 
happy to have a go at fixing it.
   
   ### To Reproduce
   
   I have a custom `TableProvider` and `ExecutionPlan`, where calling `execute` 
is somewhat expensive and I want to avoid calling it if no data will match.
   
   The execution plan can return helpful statistics from `.statistics()`, 
including, for example, for one column:
   
   ```
   ...
   ColumnStatistics {
       null_count: Precision::Exact(0),
       max_value: Precision::Exact(ScalarValue::Int64(Some(4))),
       min_value: Precision::Exact(ScalarValue::Int64(Some(4))),
       distinct_count: Precision::Exact(1),
   },
   ```
   
   I.e. "in this column all values are equal to 4". DataFusion successfully uses 
this if I query `value is null`: the `execute()` function is never 
called.
   
   But if I query `value > 5` or `value < 0`, the statistic is ignored and 
`execute()` is still called.
   
   ### Expected behavior
   
   `min_value` and `max_value` of `ColumnStatistics` should be used for pruning 
and the query plan should not require the "slow" execute method to be called.
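
To make the expectation concrete, here is a minimal sketch of the check I have in mind. It is not DataFusion's API: `ColumnRange`, `Predicate` and `can_skip_execute` are hypothetical names made up for illustration, and the sketch assumes the min/max statistics are exact (as with `Precision::Exact` above) and that the predicate is a simple comparison against an integer literal.

```rust
/// Hypothetical stand-in for the exact min/max part of `ColumnStatistics`.
/// This is not a DataFusion type.
#[derive(Debug, Clone, Copy)]
struct ColumnRange {
    min: i64,
    max: i64,
}

/// Hypothetical predicates of the form `value <op> literal`.
#[derive(Debug, Clone, Copy)]
enum Predicate {
    Gt(i64),
    Lt(i64),
    Eq(i64),
}

/// Returns true when the exact range proves that no row can satisfy the
/// predicate, so the expensive `execute()` call could be skipped.
fn can_skip_execute(range: ColumnRange, predicate: Predicate) -> bool {
    match predicate {
        // `value > lit` cannot match if even the largest value is <= lit.
        Predicate::Gt(lit) => range.max <= lit,
        // `value < lit` cannot match if even the smallest value is >= lit.
        Predicate::Lt(lit) => range.min >= lit,
        // `value = lit` cannot match if lit lies outside [min, max].
        Predicate::Eq(lit) => lit < range.min || lit > range.max,
    }
}
```

With the statistics above (`min = max = 4`), `can_skip_execute` returns true for both `value > 5` and `value < 0`, and false for `value = 4`. If the statistics were only inexact, a check like this would not be safe.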
   
   ### Additional context
   
   I can give a fairly minimal example if required, but I thought it best to 
report the issue and check whether it was well known before going to that effort.
   
   I've tried this on both `main` (as of today) and `37.1.0`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

