[ 
https://issues.apache.org/jira/browse/SPARK-12520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15071695#comment-15071695
 ] 

Xiao Li edited comment on SPARK-12520 at 12/25/15 6:48 PM:
-----------------------------------------------------------

After checking the code, I can confirm that the problem still exists in 1.4 
and 1.5. The code simply ignores the third parameter (the join type) that 
users pass: the join it performs is always inner, even when the user specifies 
another type. Should we submit a PR for 1.4 and 1.5? [~davies]

My PR against 1.6 only updates the description and the use cases. 
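
To make the difference concrete, here is a minimal plain-Python sketch of what the two join types should return for the tables in the issue description. It is not Spark's implementation: the helper `join_on_key` and its `left_cols`/`right_cols` parameters are hypothetical names invented for this illustration, and tables are modeled as lists of dicts.

```python
def join_on_key(left, right, key, how="inner", left_cols=(), right_cols=()):
    """Join two lists of dicts on `key` (illustrative helper, not a Spark API).

    `left_cols` / `right_cols` name the columns to fill with None when a row
    has no partner on the other side of an outer join.
    """
    if how not in ("inner", "outer"):
        raise ValueError("unsupported join type: %r" % how)
    out = []
    matched = set()  # indexes of right rows that found a partner
    for l_row in left:
        hit = False
        for i, r_row in enumerate(right):
            if l_row[key] == r_row[key]:
                out.append({**l_row, **r_row})
                matched.add(i)
                hit = True
        if not hit and how == "outer":
            # Outer join keeps the unmatched left row, null-filling the right side.
            out.append({**dict.fromkeys(right_cols), **l_row})
    if how == "outer":
        for i, r_row in enumerate(right):
            if i not in matched:
                # Outer join also keeps unmatched right rows.
                out.append({**dict.fromkeys(left_cols), **r_row})
    return out


# Data from the issue description: one row on the left, an empty right table.
left_table = [
    {"head_id_left": 1, "tail_id_left": 2, "weight": 1, "joining_column": "1~2"}
]
right_table = []
right_cols = ("head_id_right", "tail_id_right", "joining_column")

inner_rows = join_on_key(left_table, right_table, "joining_column", "inner")
outer_rows = join_on_key(
    left_table, right_table, "joining_column", "outer", right_cols=right_cols
)

print(len(inner_rows))  # 0 rows: what the reporter observed
print(len(outer_rows))  # 1 row: what an outer join should return
```

An empty result for "outer" is therefore consistent with the join type being silently treated as inner.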


was (Author: smilegator):
The problem still exists in 1.4 and 1.5, I think. I did not make a try. Should 
we submit a PR for 1.4 and 1.5? [~davies]

My PR to 1.6 is just for description and use case updates. 

> Python API dataframe join returns wrong results on outer join
> -------------------------------------------------------------
>
>                 Key: SPARK-12520
>                 URL: https://issues.apache.org/jira/browse/SPARK-12520
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark, SQL
>    Affects Versions: 1.4.1
>            Reporter: Aravind  B
>
> Consider the following dataframes:
> """
> left_table:
> +------------+------------+---------+--------------+
> |head_id_left|tail_id_left|   weight|joining_column|
> +------------+------------+---------+--------------+
> |           1|           2|        1|           1~2|
> +------------+------------+---------+--------------+
> right_table:
> +-------------+-------------+--------------+
> |head_id_right|tail_id_right|joining_column|
> +-------------+-------------+--------------+
> +-------------+-------------+--------------+
> """
> The following code returns an empty dataframe:
> """
> joined_table = left_table.join(right_table, "joining_column", "outer")
> """
> joined_table has zero rows. 
> However:
> """
> joined_table = left_table.join(right_table, left_table.joining_column == 
> right_table.joining_column, "outer")
> """
> returns the correct answer with one row.



