[ https://issues.apache.org/jira/browse/SPARK-25363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
DB Tsai resolved SPARK-25363.
-----------------------------
       Resolution: Fixed
    Fix Version/s: 2.4.0
                   3.0.0

Issue resolved by pull request 22357
[https://github.com/apache/spark/pull/22357]

> Schema pruning doesn't work if nested column is used in where clause
> --------------------------------------------------------------------
>
>                 Key: SPARK-25363
>                 URL: https://issues.apache.org/jira/browse/SPARK-25363
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.4.0
>            Reporter: Liang-Chi Hsieh
>            Assignee: Liang-Chi Hsieh
>            Priority: Major
>             Fix For: 3.0.0, 2.4.0
>
>
> Schema pruning doesn't work if a nested column is used in the where clause.
> For example,
> {code}
> sql("select name.first from contacts where name.first = 'David'")
> == Physical Plan ==
> *(1) Project [name#19.first AS first#40]
> +- *(1) Filter (isnotnull(name#19) && (name#19.first = David))
>    +- *(1) FileScan parquet [name#19] Batched: false, Format: Parquet,
>       PartitionFilters: [],
>       PushedFilters: [IsNotNull(name)], ReadSchema:
>       struct<name:struct<first:string,middle:string,last:string>>
> {code}
> In the above query plan, the scan node reads the entire schema of the `name` column.
> This issue was reported in:
> https://github.com/apache/spark/pull/21320#issuecomment-419290197

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
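
For readers who want to see what the pruned ReadSchema for the query in this report would look like, here is a minimal conceptual sketch in plain Python. It is not Spark's implementation and does not use any Spark API. The helper names `prune_schema` and `render`, and the extra `address` column, are assumptions made up for illustration; only the `name` struct with `first`, `middle`, and `last` comes from the report.

```python
# Conceptual model only: NOT how Spark implements schema pruning.
# A struct is modelled as a dict, a leaf column as a type-name string.

def prune_schema(schema, paths):
    """Keep only the fields of `schema` reachable through the dotted `paths`."""
    # Group the requested paths by their first component.
    wanted = {}
    for path in paths:
        head, _, rest = path.partition(".")
        wanted.setdefault(head, []).append(rest)

    pruned = {}
    for name, dtype in schema.items():
        if name not in wanted:
            continue  # column is never referenced, drop it
        rests = wanted[name]
        if isinstance(dtype, dict) and all(rests):
            # Every reference goes deeper into the struct: recurse.
            pruned[name] = prune_schema(dtype, rests)
        else:
            # A leaf, or the whole struct is referenced somewhere: keep it all.
            pruned[name] = dtype
    return pruned


def render(schema):
    """Format a schema dict in the struct<...> style used in query plans."""
    fields = ",".join(
        f"{n}:{render(t) if isinstance(t, dict) else t}" for n, t in schema.items()
    )
    return f"struct<{fields}>"


# `name` matches the report; `address` is a hypothetical extra column.
contacts = {
    "name": {"first": "string", "middle": "string", "last": "string"},
    "address": "string",
}

select_paths = ["name.first"]  # select name.first
where_paths = ["name.first"]   # where name.first = 'David'

print(render(prune_schema(contacts, select_paths + where_paths)))
```

Under this model, a where clause that references only `name.first` adds no fields beyond those already selected, so only `first` would need to be read from the `name` struct. A reference to the whole `name` struct would keep all three of its fields.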