Repository: spark
Updated Branches:
  refs/heads/branch-1.5 86161a4f7 -> 42286feb6

[SPARK-12520] [PYSPARK] [1.5] Ensure the join type is `inner` for equi-Join.

This PR is to add `assert` to ensure the join type is `inner` for equi-Join.

JIRA: https://issues.apache.org/jira/browse/SPARK-12520

In the JIRA, users specify the join type `outer` when using the equi-join. However, the result we returned is the `inner` join, which is the only type Spark 1.5 supports. (Note, starting from Spark 1.6, we can support the other types for equi-join).

For example,
```scala
joined_table = left_table.join(right_table, "joining_column", "outer")
```

Should we also back port it to 1.4? davies JoshRosen Thanks!

Author: gatorsmile <gatorsm...@gmail.com>

Closes #10484 from gatorsmile/pythonEquiOuterJoin.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/42286feb
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/42286feb
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/42286feb

Branch: refs/heads/branch-1.5
Commit: 42286feb676f52b366c7be3f9ace4bfde55d72a9
Parents: 86161a4
Author: gatorsmile <gatorsm...@gmail.com>
Authored: Sun Dec 27 23:23:57 2015 -0800
Committer: Davies Liu <davies....@gmail.com>
Committed: Sun Dec 27 23:23:57 2015 -0800

----------------------------------------------------------------------
 python/pyspark/sql/dataframe.py | 1 +
 1 file changed, 1 insertion(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/42286feb/python/pyspark/sql/dataframe.py
----------------------------------------------------------------------
diff --git a/python/pyspark/sql/dataframe.py b/python/pyspark/sql/dataframe.py
index 2b23815..eb2c6e5 100644
--- a/python/pyspark/sql/dataframe.py
+++ b/python/pyspark/sql/dataframe.py
@@ -570,6 +570,7 @@ class DataFrame(object):
         if on is None or len(on) == 0:
             jdf = self._jdf.join(other._jdf)
         elif isinstance(on[0], basestring):
+            assert how is None or how == 'inner', "Equi-join does not support: %s" % how
             jdf = self._jdf.join(other._jdf, self._jseq(on))
         else:
             assert isinstance(on[0], Column), "on should be Column or list of Column"
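
The effect of the added `assert` can be illustrated without a Spark installation. The following is only a minimal sketch of the guard, not the actual `DataFrame.join` code: the helper name `check_equi_join_type` is hypothetical, and `str` stands in for the Python 2 `basestring` used in the patch.

```python
def check_equi_join_type(on, how=None):
    """Hypothetical stand-alone helper mirroring the guard added in this patch.

    Returns the join type that would be used, or raises AssertionError when a
    non-inner type is requested for a join on column names.
    """
    # Assumption: a single column name is wrapped in a list before the check.
    if isinstance(on, str):
        on = [on]
    # Joining on column names (strings) is the equi-join case from the diff.
    if on and isinstance(on[0], str):
        assert how is None or how == 'inner', "Equi-join does not support: %s" % how
    return how or 'inner'


check_equi_join_type("joining_column")           # accepted, defaults to inner
check_equi_join_type("joining_column", "inner")  # accepted

try:
    # The call pattern from the JIRA report now fails loudly instead of
    # silently producing an inner join.
    check_equi_join_type("joining_column", "outer")
except AssertionError as exc:
    print(exc)  # prints "Equi-join does not support: outer"
```

Because the check is a plain `assert`, it is skipped when Python runs with the `-O` flag.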