[ https://issues.apache.org/jira/browse/DRILL-3533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637180#comment-14637180 ]
Jinfeng Ni commented on DRILL-3533:
-----------------------------------

[~parthc], could you please review the pull request for DRILL-3533? Thanks!

> null values in a sub-structure in Parquet returns unexpected/misleading results
> -------------------------------------------------------------------------------
>
>                 Key: DRILL-3533
>                 URL: https://issues.apache.org/jira/browse/DRILL-3533
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 1.1.0
>            Reporter: Stefán Baxter
>            Assignee: Parth Chandra
>            Priority: Critical
>
> With this minimal dataset as /tmp/test.json:
> {"dimensions":{"adults":"A"}}
> select lower(p.dimensions.budgetLevel) as `field1`, lower(p.dimensions.adults) as `field2` from dfs.tmp.`/test.json` as p;
> Returns this:
> +---------+---------+
> | field1  | field2  |
> +---------+---------+
> | null    | a       |
> +---------+---------+
> With the same data as a Parquet file
> CREATE TABLE dfs.tmp.`/test` AS SELECT * FROM dfs.tmp.`/test.json`;
> The same query:
> select lower(p.dimensions.budgetLevel) as `field1`, lower(p.dimensions.adults) as `field2` from dfs.tmp.`/test/0_0_0.parquet` as p;
> Return this:
> +---------+---------+
> | field1  | field2  |
> +---------+---------+
> | a       | null    |
> +---------+---------+
> After some more testing it appears that this has nothing to do with trim. (any non existing nested-value will be pushed aside)
> select p.dimensions.budgetLevel as `field1`, lower(p.dimensions.adults) as `field2` from dfs.tmp.`/test/0_0_0.parquet` as p;
> also returns:
> +---------+---------+
> | field1  | field2  |
> +---------+---------+
> | a       | null    |
> +---------+---------+



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)