Hi Zhan Zhang,
Could my problem (no ORC predicate is generated from the WHERE clause
even though spark.sql.orc.filterPushdown=true) be related to any of the
factors below?
- orc file version (File Version: 0.12 with HIVE_8732)
- hive version (using Hive 1.2.1.2.3.0.0-2557)
- orc table is
Hi Patcharee,
I am not sure which side is wrong, driver or executor. If it is executor side,
the reason you mentioned may be possible. But if the driver side didn’t set the
predicate at all, then somewhere else is broken.
Can you please file a JIRA with simple steps to reproduce, and let me know
Hi Zhan Zhang,
Here is the issue https://issues.apache.org/jira/browse/SPARK-11087
BR,
Patcharee
On 10/13/2015 06:47 PM, Zhan Zhang wrote:
Hi Patcharee,
I am not sure which side is wrong, driver or executor. If it is
executor side, the reason you mentioned may be possible. But if the
Yes, predicate pushdown is enabled, but it still takes longer than
the first method.
BR,
Patcharee
On 8 Oct 2015 18:43, Zhan Zhang wrote:
Hi Patcharee,
Did you enable the predicate pushdown in the second method?
Thanks.
Zhan Zhang
On Oct 8, 2015, at 1:43 AM, patcharee
Hi Patcharee,
From the query, it looks like only column pruning will be applied.
Partition pruning and predicate pushdown have no effect. Do you see a big
IO difference between the two methods?
One potential reason for the speed difference I can think of is the
different versions of
In your case, you manually set an AND pushdown, and the predicate is correct
for your setting: leaf-0 = (EQUALS x 320)
The right way is to enable the predicate pushdown as follows.
sqlContext.setConf("spark.sql.orc.filterPushdown", "true")
Thanks.
Zhan Zhang
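To illustrate what a pushed-down predicate such as leaf-0 = (EQUALS x 320) is meant to achieve at read time, here is a toy sketch in plain Python. It is not the ORC reader or any Spark API: the `Stripe` class, the `scan_equals_x` function, and the sample data are hypothetical, and the model assumes only that each stripe carries min/max statistics for column x that the reader can consult before reading rows.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Stripe:
    """Toy stand-in for an ORC stripe: rows plus min/max statistics on x."""
    rows: List[Tuple[int, int]]  # hypothetical (x, z) pairs

    @property
    def x_min(self) -> int:
        return min(x for x, _ in self.rows)

    @property
    def x_max(self) -> int:
        return max(x for x, _ in self.rows)


def scan_equals_x(stripes: List[Stripe], value: int, pushdown: bool):
    """Return (matching_rows, stripes_read) for the filter x == value.

    With pushdown, a stripe whose [x_min, x_max] range cannot contain
    `value` is skipped without reading its rows. Without pushdown,
    every stripe is read and filtered row by row.
    """
    matches = []
    stripes_read = 0
    for stripe in stripes:
        if pushdown and not (stripe.x_min <= value <= stripe.x_max):
            continue  # statistics rule this stripe out
        stripes_read += 1
        matches.extend(row for row in stripe.rows if row[0] == value)
    return matches, stripes_read


stripes = [
    Stripe([(100, 1), (150, 2)]),
    Stripe([(300, 3), (320, 4), (320, 5)]),
    Stripe([(400, 6), (500, 7)]),
]

with_pushdown = scan_equals_x(stripes, 320, pushdown=True)
without_pushdown = scan_equals_x(stripes, 320, pushdown=False)
```

Both calls return the same rows; the difference is how many stripes had to be read. In this toy model, a log line like "No ORC pushdown predicate" corresponds to the `pushdown=False` path, where no predicate reaches the reader and nothing can be skipped.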
On Oct 9, 2015, at 9:58
I set hiveContext.setConf("spark.sql.orc.filterPushdown", "true"). But
the log shows "No ORC pushdown predicate" for my query with a WHERE clause.
15/10/09 19:16:01 DEBUG OrcInputFormat: No ORC pushdown predicate
I do not understand what is wrong with this.
BR,
Patcharee
On 9 Oct 2015 19:10,
Hi Zhan Zhang
Actually, my query has a WHERE clause: "select date, month, year, hh,
(u*0.9122461 - v*-0.40964267), (v*0.9122461 + u*-0.40964267), z from 4D
where x = 320 and y = 117 and zone == 2 and year=2009 and z >= 2 and z
<= 8". Columns "x" and "y" are not partition columns; the others are
That is weird. Unfortunately, there is no debug info available on this part.
Can you please open a JIRA to add some debug information on the driver side?
Thanks.
Zhan Zhang
On Oct 9, 2015, at 10:22 AM, patcharee wrote:
I set
Hi Patcharee,
Did you enable the predicate pushdown in the second method?
Thanks.
Zhan Zhang
On Oct 8, 2015, at 1:43 AM, patcharee wrote:
> Hi,
>
> I am using spark sql 1.5 to query a hive table stored as partitioned orc
> file. We have the total files is about