Spark version: 2.2.0
Hive version: 1.1.0

There are a lot of small files.

Spark code :

"spark.sql.orc.enabled": "true",
"spark.sql.orc.filterPushdown": "true 

val logs = spark.read.schema(schema)
  .orc("hdfs://test/date=201810")
  .filter("date > 20181003")

Hive:

"spark.sql.orc.enabled": "true",
"spark.sql.orc.filterPushdown": "true 

The test table in Hive points to hdfs://test/ and is partitioned on date.

val sqlStr = s"select * from test where date > 20181001"
val logs = spark.sql(sqlStr)
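The way I check it is the same explain as for the direct read:

// For the Hive-table read I only see a plain Filter above the table scan,
// with no PushedFilters entry, which is why I think pushdown is not happening.
logs.explain(true)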

With the Hive query I don't see the filter pushdown happening. I tried setting
these configs in both hive-site.xml and also via spark.sqlContext.setConf:

"hive.optimize.ppd":"true",
"hive.optimize.ppd.storage":"true" 


