[ 
https://issues.apache.org/jira/browse/SPARK-24859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16550359#comment-16550359
 ] 

Johannes Mayer commented on SPARK-24859:
----------------------------------------

I will provide an example. Could you test it on the master branch?

> Predicates pushdown on outer joins
> ----------------------------------
>
>                 Key: SPARK-24859
>                 URL: https://issues.apache.org/jira/browse/SPARK-24859
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core, SQL
>    Affects Versions: 2.2.0
>         Environment: Cloudera CDH 5.13.1
>            Reporter: Johannes Mayer
>            Priority: Major
>
> I have two AVRO tables in Hive called FAct and DIm. Both are partitioned by a 
> common column called part_col. Now I want to join both tables on their id but 
> only for some of partitions.
> If I use an inner join, everything works well:
>  
> {code:java}
> select *
> from FA f
> join DI d
> on(f.id = d.id and f.part_col = d.part_col)
> where f.part_col = 'xyz'
> {code}
>  
> In the sql explain plan I can see, that the predicate part_col = 'xyz' is 
> also used in the DIm HiveTableScan.
>  
> When I execute the same query using a left join the full dim table is 
> scanned. There are some workarounds for this issue, but I wanted to report 
> this as a bug, since it works on an inner join, and i think the behaviour 
> should be the same for an outer join
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

Reply via email to