[ https://issues.apache.org/jira/browse/SPARK-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14007026#comment-14007026 ]
Cheng Lian commented on SPARK-1913:
-----------------------------------

Attributes that are referenced only in pushed-down filters are not taken into account when the {{ParquetTableScan}} operator is built in {{ParquetOperations}}. I will submit a PR for this.

> column pruning problem of Parquet File
> ---------------------------------------
>
>                 Key: SPARK-1913
>                 URL: https://issues.apache.org/jira/browse/SPARK-1913
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.0.0
>         Environment: mac os 10.9.2
>            Reporter: Chen Chao
>
> case class Person(name: String, age: Int)
> If we use a Parquet file, the following statement throws an exception that says
> java.lang.IllegalArgumentException: Column age does not exist.
> sql("SELECT name FROM parquetFile WHERE age >= 13 AND age <= 19")
> We have to add the age column to the SELECT list to make it work:
> sql("SELECT name, age FROM parquetFile WHERE age >= 13 AND age <= 19")
> The same error also occurs when we use the DSL:
> parquetFile.where('key === 1).select('value as 'a).collect().foreach(println)
> reports that column 'key' cannot be found; we have to work around it like this:
> parquetFile.where('key === 1).select('key, 'value as 'a).collect().foreach(println)
> Obviously, that's not the way we want!
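
The cause named in the comment can be illustrated with a small standalone sketch. This is not Spark code: `Filter`, `columnsToScanBuggy` and `columnsToScanFixed` are hypothetical names made up for illustration, and the sketch only models the idea that the set of columns requested from a scan should include the columns read by pushed-down filters as well as the projected ones. The actual change to {{ParquetOperations}} may look quite different.

```scala
// Simplified, hypothetical model of column pruning; none of these names are Spark APIs.
object ColumnPruningSketch {
  // A pushed-down predicate together with the columns it reads.
  final case class Filter(description: String, references: Set[String])

  // Models the reported behaviour: only projected columns are requested,
  // so a column used solely in a filter (e.g. "age") is missing from the scan.
  def columnsToScanBuggy(projected: Seq[String], pushedDown: Seq[Filter]): Set[String] =
    projected.toSet

  // Models one possible remedy: also request every column that a
  // pushed-down filter references.
  def columnsToScanFixed(projected: Seq[String], pushedDown: Seq[Filter]): Set[String] =
    projected.toSet ++ pushedDown.flatMap(_.references)

  def main(args: Array[String]): Unit = {
    // Corresponds to: SELECT name FROM parquetFile WHERE age >= 13 AND age <= 19
    val projected = Seq("name")
    val filters = Seq(
      Filter("age >= 13", Set("age")),
      Filter("age <= 19", Set("age"))
    )
    println(columnsToScanBuggy(projected, filters)) // "age" is absent
    println(columnsToScanFixed(projected, filters)) // "age" is included
  }
}
```

In this model the query from the report requests only `name` in the buggy variant, which matches the "Column age does not exist" symptom, while the fixed variant requests both `name` and `age` without the user having to add `age` to the SELECT list.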