[ https://issues.apache.org/jira/browse/SPARK-27698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wenchen Fan resolved SPARK-27698.
---------------------------------
    Resolution: Fixed
 Fix Version/s: 3.0.0

Issue resolved by pull request 24597
[https://github.com/apache/spark/pull/24597]

> Add new method for getting pushed down filters in Parquet file reader
> ---------------------------------------------------------------------
>
>                 Key: SPARK-27698
>                 URL: https://issues.apache.org/jira/browse/SPARK-27698
>             Project: Spark
>          Issue Type: Task
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Gengliang Wang
>            Assignee: Gengliang Wang
>            Priority: Major
>             Fix For: 3.0.0
>
>
> To return accurate pushed filters in the Parquet file scan
> (https://github.com/apache/spark/pull/24327#pullrequestreview-234775673),
> we can process the original data source filters as follows:
> 1. For "And" operators, split the conjunctive predicates and try converting
> each of them. After that:
>   1.1 if partial predicate pushdown is allowed, return the convertible
> results;
>   1.2 otherwise, return the whole predicate if it is convertible, or an
> empty result if it is not.
> 2. Other operators cannot be partially pushed down:
>   2.1 if the entire predicate is convertible, return it;
>   2.2 otherwise, return an empty result.
> This PR also contains a code refactoring. Currently `ParquetFilters.createFilter`
> accepts the parameter `schema: MessageType` and creates a field mapping for
> every input filter. We can make that mapping a class member and avoid
> rebuilding `nameToParquetField` for every input filter.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
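The splitting rules above can be sketched in Scala. This is a minimal illustration, not the actual Spark implementation: it uses a hypothetical simplified `Filter` ADT (standing in for `org.apache.spark.sql.sources.Filter`) and a stub `isConvertible` that treats only `EqualTo` as convertible to a Parquet predicate, so the And-splitting and all-or-nothing branches can be seen in isolation.

```scala
// Hypothetical, simplified Filter hierarchy for illustration only.
sealed trait Filter
case class And(left: Filter, right: Filter) extends Filter
case class EqualTo(attr: String, value: Any) extends Filter
case class StringContains(attr: String, value: String) extends Filter

object PushdownSketch {
  // Stub: pretend only EqualTo (and conjunctions of them) convert to
  // Parquet predicates. The real check depends on the Parquet schema.
  def isConvertible(f: Filter): Boolean = f match {
    case And(l, r)  => isConvertible(l) && isConvertible(r)
    case _: EqualTo => true
    case _          => false
  }

  // Returns the filters that can be reported as pushed down.
  def convertibleFilters(f: Filter, partialPushdown: Boolean): Seq[Filter] = f match {
    // 1.1: with partial pushdown, split the conjunction and keep
    // whichever sides convert.
    case And(l, r) if partialPushdown =>
      convertibleFilters(l, partialPushdown) ++ convertibleFilters(r, partialPushdown)
    // 1.2 / 2.1 / 2.2: all-or-nothing — keep the whole predicate only
    // if it converts in its entirety.
    case other =>
      if (isConvertible(other)) Seq(other) else Seq.empty
  }
}
```

With `And(EqualTo("a", 1), StringContains("b", "x"))`, partial pushdown yields just `EqualTo("a", 1)`, while disallowing partial pushdown yields nothing, since the conjunction as a whole is not convertible.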