Spencer created SPARK-27935:
-------------------------------

             Summary: Introduce leftOuterJoinWith and fullOuterJoinWith
                 Key: SPARK-27935
                 URL: https://issues.apache.org/jira/browse/SPARK-27935
             Project: Spark
          Issue Type: New Feature
          Components: SQL
    Affects Versions: 3.0.0
            Reporter: Spencer


 

Currently, calling `Dataset[A].joinWith(Dataset[B], col, "left_outer")` or 
`Dataset[A].joinWith(Dataset[B], col, "full_outer")` requires users to do null 
checks on the elements of the resulting `Dataset[(A, B)]`, because the 
unmatched side of a row is returned as null.
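
A minimal sketch of the null check in question, assuming Spark SQL is on the classpath and a local session is acceptable. The case classes `User` and `Order` and the object `JoinWithNulls` are hypothetical names used only for illustration:

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

// Hypothetical example types; they are not part of Spark.
case class User(id: Long, name: String)
case class Order(userId: Long, item: String)

object JoinWithNulls {
  // Returns each user's name with the ordered item, if any.
  def itemsByUser(spark: SparkSession): Map[String, Option[String]] = {
    import spark.implicits._

    val users  = Seq(User(1L, "ann"), User(2L, "bob")).toDS()
    val orders = Seq(Order(1L, "book")).toDS()

    // The static type says every row holds an Order, but rows without a
    // match carry null on the right side.
    val joined: Dataset[(User, Order)] =
      users.joinWith(orders, users("id") === orders("userId"), "left_outer")

    // Callers have to remember to guard against null themselves.
    joined
      .map { case (user, order) => (user.name, Option(order).map(_.item)) }
      .collect()
      .toMap
  }
}
```

Nothing in the type `Dataset[(User, Order)]` warns the caller that `order` may be null, so forgetting the `Option(order)` wrapper fails only at runtime.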

 

To make the expected result types of outer joins more explicit, I propose a 
couple of new joinWith functions:

 
{noformat}
def leftOuterJoinWith[U](other: Dataset[U], condition: Column): Dataset[(T, Option[U])]

def fullOuterJoinWith[U](other: Dataset[U], condition: Column): Dataset[(Option[T], Option[U])]
{noformat}
 

The return type of `fullOuterJoinWith` is imperfect, since it admits 
`(None, None)`, a case that a full outer join never produces, but it is still 
an improvement on the present interface.
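
For illustration only, the proposed signatures can be approximated today in user code on top of the existing `joinWith`. This is a sketch under assumptions, not the Spark implementation: the object `OuterJoinSyntax`, the implicit class `OuterJoinOps`, and the extra implicit `Encoder` parameters are hypothetical, and the element types are assumed to be reference types such as case classes, so that an unmatched side surfaces as null.

```scala
import org.apache.spark.sql.{Column, Dataset, Encoder}

// Hypothetical user-land helper; not part of the Spark API.
object OuterJoinSyntax {
  implicit class OuterJoinOps[T](private val ds: Dataset[T]) {

    // Left side is always present; the right side becomes None when unmatched.
    def leftOuterJoinWith[U](other: Dataset[U], condition: Column)(
        implicit enc: Encoder[(T, Option[U])]): Dataset[(T, Option[U])] =
      ds.joinWith(other, condition, "left_outer")
        .map { case (left, right) => (left, Option(right)) }

    // Either side may be missing, so both are wrapped in Option.
    def fullOuterJoinWith[U](other: Dataset[U], condition: Column)(
        implicit enc: Encoder[(Option[T], Option[U])]): Dataset[(Option[T], Option[U])] =
      ds.joinWith(other, condition, "full_outer")
        .map { case (left, right) => (Option(left), Option(right)) }
  }
}
```

With `import OuterJoinSyntax._` and `spark.implicits._` in scope, a call such as `users.leftOuterJoinWith(orders, users("id") === orders("userId"))` would type-check as `Dataset[(User, Option[Order])]`. The additional `map` is a cost that a built-in implementation could presumably avoid.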

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
