[ https://issues.apache.org/jira/browse/SPARK-5451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yin Huai updated SPARK-5451:
----------------------------
    Target Version/s: 1.5.0  (was: 1.4.0)

> And predicates are not properly pushed down
> -------------------------------------------
>
>                 Key: SPARK-5451
>                 URL: https://issues.apache.org/jira/browse/SPARK-5451
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 1.2.0, 1.2.1
>            Reporter: Cheng Lian
>            Priority: Critical
>
> This issue is actually caused by PARQUET-173.
> The following {{spark-shell}} session can be used to reproduce this bug:
> {code}
> import org.apache.spark.sql.SQLContext
>
> val sqlContext = new SQLContext(sc)
> import sc._
> import sqlContext._
>
> case class KeyValue(key: Int, value: String)
>
> parallelize(1 to 1024 * 1024 * 20).
>   flatMap(i => Seq.fill(10)(KeyValue(i, i.toString))).
>   saveAsParquetFile("large.parquet")
>
> parquetFile("large.parquet").registerTempTable("large")
>
> hadoopConfiguration.set("parquet.task.side.metadata", "false")
> sql("SET spark.sql.parquet.filterPushdown=true")
> sql("SELECT value FROM large WHERE 1024 < value AND value < 2048").collect()
> {code}
> From the log we can find:
> {code}
> There were no row groups that could be dropped due to filter predicates
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
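For context on why an {{And}} predicate matters here: statistics-based row-group pruning can drop a row group whenever the column min/max statistics prove the predicate false for every row, and for a conjunction it is enough that *either* conjunct alone rules the group out. A minimal, self-contained Scala sketch of that principle follows; the {{ColumnStats}}/{{Pred}} names are illustrative only, not Parquet's actual filter API.

```scala
// Min/max statistics for one Int column within a row group.
case class ColumnStats(min: Int, max: Int)

// A tiny predicate language over a single Int column (illustrative).
sealed trait Pred
case class Gt(value: Int) extends Pred            // column > value
case class Lt(value: Int) extends Pred            // column < value
case class And(left: Pred, right: Pred) extends Pred

// A row group can be dropped when its statistics prove the predicate
// is false for every row in the group.
def canDrop(p: Pred, stats: ColumnStats): Boolean = p match {
  case Gt(v)     => stats.max <= v   // no row can exceed v
  case Lt(v)     => stats.min >= v   // no row can be below v
  // An AND is satisfiable only if both sides are, so the group is
  // droppable when EITHER side alone rules it out. Requiring both
  // sides to be droppable (&&) would keep droppable groups alive --
  // the symptom reported above.
  case And(l, r) => canDrop(l, stats) || canDrop(r, stats)
}
```

For example, a row group with statistics {{min = 0, max = 100}} is droppable for {{1024 < col AND col < 2048}} because the first conjunct alone excludes it, while a group with {{min = 1500, max = 1600}} must be read.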