[ https://issues.apache.org/jira/browse/PARQUET-425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Liwei Lin updated PARQUET-425:
------------------------------
    Description:
As reported in PARQUET-427, data is filtered out improperly when both of the following hold:
1. predicates - requested schema ≠ ∅
2. predicates - file schema = ∅

For example, given:
  data:
  |a|b|
  |1|1|
  |2|2|
  |3|3|
  file schema: a,b; requested schema: a; predicate: b = 1
we should get:
  |a|
  |1|
but we end up getting nothing, which is wrong.
This issue proposes a fix.

  was:
As reported in PARQUET-427, data is filtered out improperly when both of the following hold:
1. predicates - requested schema ≠ ∅
2. predicates - file schema = ∅

For example, given:
  data:
  |a|b|
  |1|1|
  |2|2|
  |3|3|
  file schema: a,b; requested schema: a; predicate: b = 1
we should get:
  |a|
  |1|
but we end up getting nothing, which is wrong.

> Fix the bug when predicate contains columns not specified in projection, to
> prevent filtering out data improperly
> ------------------------------------------------------------------------------
>
>                 Key: PARQUET-425
>                 URL: https://issues.apache.org/jira/browse/PARQUET-425
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-mr
>    Affects Versions: 1.8.0, 1.8.1
>            Reporter: Liwei Lin
>            Assignee: Liwei Lin
>             Fix For: 1.9.0
>
>
> As reported in PARQUET-427, data is filtered out improperly when both of the
> following hold:
> 1. predicates - requested schema ≠ ∅
> 2. predicates - file schema = ∅
> For example, given:
>   data:
>   |a|b|
>   |1|1|
>   |2|2|
>   |3|3|
>   file schema: a,b; requested schema: a; predicate: b = 1
> we should get:
>   |a|
>   |1|
> but we end up getting nothing, which is wrong.
> This issue proposes a fix.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
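
A toy model of the case described in PARQUET-425 may help make the two conditions concrete. The sketch below uses plain Python dicts and sets, not parquet-mr. The names `is_affected`, `read`, and `read_buggy` are hypothetical, and `read_buggy` only imitates the reported symptom (the predicate column is no longer available once the projection has been applied); it is not the actual parquet-mr code path.

```python
# Toy model only: nothing here is parquet-mr API.
ROWS = [{"a": 1, "b": 1}, {"a": 2, "b": 2}, {"a": 3, "b": 3}]


def is_affected(predicate_cols, requested_cols, file_cols):
    """Return True when both conditions from the description hold."""
    predicate_cols = set(predicate_cols)
    # Condition 1: predicates - requested schema is non-empty.
    outside_projection = predicate_cols - set(requested_cols)
    # Condition 2: predicates - file schema is empty.
    missing_from_file = predicate_cols - set(file_cols)
    return bool(outside_projection) and not missing_from_file


def read(rows, requested_cols, predicate):
    """Expected behaviour: filter on the full record, then project."""
    return [{c: row[c] for c in requested_cols} for row in rows if predicate(row)]


def read_buggy(rows, requested_cols, predicate_col, value):
    """Imitated symptom: project first, so the predicate column is absent."""
    projected = [{c: row[c] for c in requested_cols} for row in rows]
    # A missing column reads as None here and therefore never equals `value`.
    return [r for r in projected if r.get(predicate_col) == value]


if __name__ == "__main__":
    print(is_affected({"b"}, {"a"}, {"a", "b"}))
    print(read(ROWS, ["a"], lambda row: row["b"] == 1))
    print(read_buggy(ROWS, ["a"], "b", 1))
```

With the data from the description, `read` keeps the single row whose `b` equals 1 and returns only its `a` column, while `read_buggy` returns an empty list, which matches the "end up getting nothing" behaviour reported above.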