adriangb commented on issue #20324:
URL: https://github.com/apache/datafusion/issues/20324#issuecomment-3893794362
Okay yeah, that's a valid hypothesis. I think things should be optimized
enough that the overhead would not be that impactful, but maybe it is. I think
it would be reasonable to add some simplifier / check to discard filters that
are always true.
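To make the "discard filters that are always true" idea concrete, here is a rough sketch in plain Python. It is not DataFusion's API: `Lit`, `Col`, `And`, `simplify` and `discard_always_true` are all made-up names for illustration, and it assumes filters are simple boolean expression trees with no NULL handling.

```python
from dataclasses import dataclass


# Hypothetical, toy expression types (not DataFusion's real Expr).
@dataclass(frozen=True)
class Lit:
    value: bool  # a boolean literal such as `true`


@dataclass(frozen=True)
class Col:
    name: str  # stand-in for any predicate we cannot evaluate up front


@dataclass(frozen=True)
class And:
    left: object
    right: object


TRUE = Lit(True)


def simplify(expr):
    """Fold `true AND x` / `x AND true` down to `x`, recursively."""
    if isinstance(expr, And):
        left, right = simplify(expr.left), simplify(expr.right)
        if left == TRUE:
            return right
        if right == TRUE:
            return left
        return And(left, right)
    return expr


def discard_always_true(filters):
    """Simplify each filter and drop the ones that reduce to literal true."""
    simplified = (simplify(f) for f in filters)
    return [f for f in simplified if f != TRUE]
```

With this, `discard_always_true([Lit(True), And(Lit(True), Col("x"))])` keeps only `Col("x")`, so nothing downstream has to evaluate the trivial filter per batch. Whether a check like this would actually remove the overhead in question is still just the hypothesis above.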
That said, I just tried running Q6:
```sql
set datafusion.execution.parquet.binary_as_string = true;
create external table hits stored as parquet location
'benchmarks/data/hits_partitioned';
explain analyze SELECT MIN("EventDate"), MAX("EventDate") FROM hits;
```
I'm getting:
```
ProjectionExec: expr=[15888 as min(hits.EventDate), 15917 as max(hits.EventDate)]
```
I.e. we don't even scan the data; we resolve it from statistics 🤔. Am I
doing something wrong, or something different from what you did, @notashes?