Yes, predicate pushdown is enabled, but the second method still takes longer than
the first method.
BR,
Patcharee
On 08 Oct. 2015 18:43, Zhan Zhang wrote:
Hi Patcharee,
Did you enable the predicate pushdown in the second method?
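For reference, a minimal sketch of how this is usually switched on in Spark 1.5, assuming the same `hiveContext` and table path as in the original message below. `spark.sql.orc.filterPushdown` is the ORC pushdown option and is off by default in that release. Note that it can only help when the query carries a filter, which the two queries below do not:

```scala
// Sketch only: assumes an existing HiveContext named `hiveContext`,
// as used in the original message (for example, inside spark-shell).

// Enable ORC predicate pushdown (disabled by default in Spark 1.5).
hiveContext.setConf("spark.sql.orc.filterPushdown", "true")

// Second method from the thread: read the ORC files directly.
val c = hiveContext.read.format("orc").load("/apps/hive/warehouse/table1")
c.registerTempTable("regTable")

// Print the logical and physical plans to compare with the Hive table query.
hiveContext.sql("select col1, col2 from regTable").explain(true)
```

Running `explain(true)` on both variants of the query is one way to see where the plans differ.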
Thanks.
Zhan Zhang
On Oct 8, 2015, at 1:43 AM, patcharee <patcharee.thong...@uni.no> wrote:
Hi,
I am using Spark SQL 1.5 to query a Hive table stored as partitioned ORC files.
There are about 6000 files in total, and each file is about 245 MB.
What is the difference between these two query methods below:
1. Querying the Hive table directly
hiveContext.sql("select col1, col2 from table1")
2. Reading the ORC files, registering a temp table, and querying the temp table
val c = hiveContext.read.format("orc").load("/apps/hive/warehouse/table1")
c.registerTempTable("regTable")
hiveContext.sql("select col1, col2 from regTable")
When the number of files is large (querying all 6000 files), the
second case is much slower than the first one. Any ideas why?
BR,
---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org