vvysotskyi commented on a change in pull request #1552: DRILL-6865: Query
returns wrong result when filter pruning happens
URL: https://github.com/apache/drill/pull/1552#discussion_r235995661
##########
File path:
exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/AbstractParquetGroupScan.java
##########
@@ -310,13 +311,60 @@ public GroupScan applyFilter(LogicalExpression filterExpr, UdfUtilities udfUtili
AbstractParquetGroupScan cloneGroupScan = cloneWithFileSelection(qualifiedFilePath);
cloneGroupScan.rowGroupInfos = qualifiedRGs;
cloneGroupScan.parquetGroupScanStatistics.collect(cloneGroupScan.rowGroupInfos, cloneGroupScan.parquetTableMetadata);
+ cloneGroupScan.matchAllRowGroups = matchAllRowGroupsLocal;
return cloneGroupScan;
} catch (IOException e) {
logger.warn("Could not apply filter prune due to Exception : {}", e);
return null;
}
}
+
+ /**
+ * Returns parquet filter predicate built from specified {@code filterExpr}.
+ *
+ * @param filterExpr filter expression to build
+ * @param udfUtilities udf utilities
+ * @param functionImplementationRegistry context to find drill function holder
+ * @param optionManager option manager
+ * @param omitUnsupportedExprs whether expressions which cannot be converted
+ * may be omitted from the resulting expression
+ * @return parquet filter predicate
+ */
+ public ParquetFilterPredicate getParquetFilterPredicate(LogicalExpression filterExpr,
Review comment:
The `applyFilter()` method in the previous code returned `null` if the filter
could not be built from the first row group.
I agree with you that a schema change may break filter pushdown, but
currently we cannot predict whether a filter built from one row group will be
suitable for the others.
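
To make the concern concrete, here is a minimal sketch of pruning where the predicate is built separately for each row group instead of once from the first one. It is not Drill's code: `RowGroup`, `buildPredicate`, and `prune` are hypothetical stand-ins, and the single hard-coded filter `a > threshold` is an assumption chosen for illustration. A row group whose schema does not allow the predicate to be built is kept rather than causing the whole pruning step to be abandoned.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class PerRowGroupFilterSketch {

  /** Hypothetical row group: its own schema plus min/max statistics for column "a". */
  public static final class RowGroup {
    final String name;
    final Map<String, String> schema; // column name -> type name
    final long min;
    final long max;

    public RowGroup(String name, Map<String, String> schema, long min, long max) {
      this.name = name;
      this.schema = schema;
      this.min = min;
      this.max = max;
    }
  }

  /** Hypothetical predicate: true means no row in the group can match the filter. */
  interface DropPredicate {
    boolean canDrop(RowGroup rowGroup);
  }

  /**
   * Builds the predicate for "a > threshold" against one row group's schema.
   * Returns empty when column "a" is not BIGINT there (assumed schema change).
   */
  static Optional<DropPredicate> buildPredicate(RowGroup rowGroup, long threshold) {
    if (!"BIGINT".equals(rowGroup.schema.get("a"))) {
      return Optional.empty();
    }
    return Optional.of(candidate -> candidate.max <= threshold);
  }

  /** Keeps every row group that cannot be proven empty for the filter. */
  public static List<String> prune(List<RowGroup> rowGroups, long threshold) {
    List<String> kept = new ArrayList<>();
    for (RowGroup rowGroup : rowGroups) {
      Optional<DropPredicate> predicate = buildPredicate(rowGroup, threshold);
      // No predicate for this schema: we cannot prune, so the row group stays.
      if (!predicate.isPresent() || !predicate.get().canDrop(rowGroup)) {
        kept.add(rowGroup.name);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    List<RowGroup> rowGroups = Arrays.asList(
        new RowGroup("rg0", Collections.singletonMap("a", "BIGINT"), 1, 5),
        new RowGroup("rg1", Collections.singletonMap("a", "VARCHAR"), 0, 0),
        new RowGroup("rg2", Collections.singletonMap("a", "BIGINT"), 10, 20));
    System.out.println(prune(rowGroups, 5)); // prints [rg1, rg2]
  }
}
```

In this sketch `rg0` is dropped because its maximum does not exceed the threshold, `rg1` is kept because no predicate can be built for its schema, and `rg2` is kept because its statistics may match. Whether the first row group yields a predicate has no effect on how the remaining ones are handled.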