pgaref commented on a change in pull request #508:
URL: https://github.com/apache/orc/pull/508#discussion_r481942384
##########
File path: java/core/src/java/org/apache/orc/impl/RecordReaderImpl.java
##########
@@ -495,6 +495,14 @@ static TruthValue
evaluatePredicateProto(OrcProto.ColumnStatistics statsProto,
" include ORC-517. Writer version: {}",
predicate.getColumnName(), writerVersion);
return TruthValue.YES_NO_NULL;
+ } else if (category == TypeDescription.Category.DOUBLE) {
+ DoubleColumnStatistics dstas = (DoubleColumnStatistics) cs;
+ if (!Double.isFinite(dstas.getMinimum()) ||
!Double.isFinite(dstas.getMaximum())
Review comment:
Hey @wgtmac, the logic is replicated across the C++ and Java versions.
Regarding the broad check: the way we compare double statistics (without
Double.compare) can also lead to wrong min/max values (see the test case).
This is also something we could improve:
https://github.com/apache/orc/blob/ca33ce64bf1fa8b3696e2e44b32237d842c70df3/java/core/src/java/org/apache/orc/impl/ColumnStatisticsImpl.java#L563
However, we could probably replace the infinity check with a NaN check.
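
To illustrate the concern, here is a minimal standalone sketch, not the actual ORC code. The class `DoubleStatsSketch` and the methods `naiveMinMax`, `totalOrderMinMax`, and `boundsUsable` are hypothetical names made up for this example. It assumes the statistics are updated with plain relational operators, and contrasts that with `Double.compare`, which orders NaN above every other value, including positive infinity.

```java
import java.util.Arrays;

public class DoubleStatsSketch {

    // Naive min/max using < and >. Any comparison involving NaN is false,
    // so the result depends on where the NaN appears in the input.
    // Assumes at least one value is passed.
    public static double[] naiveMinMax(double... values) {
        double min = values[0];
        double max = values[0];
        for (int i = 1; i < values.length; i++) {
            double v = values[i];
            if (v < min) {
                min = v;
            } else if (v > max) {
                max = v;
            }
        }
        return new double[] {min, max};
    }

    // Min/max using Double.compare, which defines a total order in which
    // NaN is greater than every other value, so a NaN always ends up in max.
    // Assumes at least one value is passed.
    public static double[] totalOrderMinMax(double... values) {
        double min = values[0];
        double max = values[0];
        for (int i = 1; i < values.length; i++) {
            double v = values[i];
            if (Double.compare(v, min) < 0) {
                min = v;
            }
            if (Double.compare(v, max) > 0) {
                max = v;
            }
        }
        return new double[] {min, max};
    }

    // Hypothetical guard: treat the bounds as usable for predicate
    // evaluation only when neither is NaN. Infinite bounds are accepted.
    public static boolean boundsUsable(double min, double max) {
        return !Double.isNaN(min) && !Double.isNaN(max);
    }

    public static void main(String[] args) {
        // Same values, different order: the naive result changes.
        System.out.println(Arrays.toString(naiveMinMax(Double.NaN, 1.0, 5.0)));
        System.out.println(Arrays.toString(naiveMinMax(1.0, Double.NaN, 5.0)));
        // With Double.compare the result is the same for both orders.
        System.out.println(Arrays.toString(totalOrderMinMax(Double.NaN, 1.0, 5.0)));
        System.out.println(Arrays.toString(totalOrderMinMax(1.0, Double.NaN, 5.0)));
    }
}
```

With the naive version, a leading NaN sticks in both bounds, while a NaN in the middle is silently dropped. With `Double.compare`, the NaN consistently surfaces in the maximum, so a NaN check on the bounds catches it while still letting infinite bounds through.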