[ https://issues.apache.org/jira/browse/SPARK-27935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Spencer updated SPARK-27935:
----------------------------
    Description:
Currently, calling *Dataset[A].joinWith(Dataset[B], col, "left_outer")* or *Dataset[A].joinWith(Dataset[B], col, "full_outer")* requires users to do null checks on the resulting *Dataset[(A, B)]*.

To make the expected result types of outer joins more explicit, I propose a couple of new joinWith functions:
{noformat}
def leftOuterJoinWith[U](other: Dataset[U], condition: Column): Dataset[(T, Option[U])]
def fullOuterJoinWith[U](other: Dataset[U], condition: Column): Dataset[(Option[T], Option[U])]
{noformat}
The return type of *fullOuterJoinWith* is imperfect, since *(None, None)* is an invalid case, but it is still an improvement on the present interface.

  was:
Currently, calling `Dataset[A].joinWith(Dataset[B], col, "left_outer")` or `Dataset[A].joinWith(Dataset[B], col, "full_outer")` require users to do null checks on the resulting `Dataset[(A, B)]`

To make the expected result types of outer joins more explicit, I propose a couple of new joinWith functions:
{noformat}
def leftOuterJoinWith[U](other: Dataset[U], condition: Column): Dataset[(T, Option[U])]
def fullOuterJoinWith[U](other: Dataset[U], condition: Column): Dataset[(Option[T], Option[U])]
{noformat}
The return type of `fullOuterJoinWith` is imperfect, since `(None, None)` is an invalid case, but still an improvement on the present interface.
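For illustration only, the proposed signatures could probably be prototyped on top of the existing *joinWith* as extension methods, roughly as sketched below. This is not the implementation proposed for Spark. The names OuterJoinSyntax, OuterJoinOps, liftRight and liftBoth are hypothetical, and the sketch assumes that *joinWith* yields null for the unmatched side of an outer join and that the caller has an Encoder for the Option-wrapped tuple in scope (for example via spark.implicits._ for case classes).

```scala
import org.apache.spark.sql.{Column, Dataset, Encoder}

// Hypothetical helper object; none of these names exist in Spark itself.
object OuterJoinSyntax {

  // Option(null) evaluates to None, so a null right side becomes None.
  def liftRight[T, U](row: (T, U)): (T, Option[U]) =
    (row._1, Option(row._2))

  // Same idea for a full outer join, where either side may be null.
  def liftBoth[T, U](row: (T, U)): (Option[T], Option[U]) =
    (Option(row._1), Option(row._2))

  implicit class OuterJoinOps[T](private val left: Dataset[T]) extends AnyVal {

    // Sketch of the proposed leftOuterJoinWith, built on the existing joinWith.
    def leftOuterJoinWith[U](other: Dataset[U], condition: Column)(
        implicit enc: Encoder[(T, Option[U])]): Dataset[(T, Option[U])] =
      left.joinWith(other, condition, "left_outer").map(row => liftRight(row))

    // Sketch of the proposed fullOuterJoinWith; (None, None) is still
    // representable in the type even though the join should never produce it.
    def fullOuterJoinWith[U](other: Dataset[U], condition: Column)(
        implicit enc: Encoder[(Option[T], Option[U])]): Dataset[(Option[T], Option[U])] =
      left.joinWith(other, condition, "full_outer").map(row => liftBoth(row))
  }
}
```

With import OuterJoinSyntax._ in scope, a call such as dsA.leftOuterJoinWith(dsB, dsA("id") === dsB("id")) would return Dataset[(A, Option[B])], where dsA, dsB and the id column are placeholder names. The extra map is a deserialize/serialize pass that a built-in implementation could likely avoid.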
> Introduce leftOuterJoinWith and fullOuterJoinWith
> -------------------------------------------------
>
>                 Key: SPARK-27935
>                 URL: https://issues.apache.org/jira/browse/SPARK-27935
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Spencer
>            Priority: Minor
>
> Currently, calling *Dataset[A].joinWith(Dataset[B], col, "left_outer")* or
> *Dataset[A].joinWith(Dataset[B], col, "full_outer")* requires users to do null
> checks on the resulting *Dataset[(A, B)]*.
>
> To make the expected result types of outer joins more explicit, I propose a
> couple of new joinWith functions:
> {noformat}
> def leftOuterJoinWith[U](other: Dataset[U], condition: Column): Dataset[(T, Option[U])]
> def fullOuterJoinWith[U](other: Dataset[U], condition: Column): Dataset[(Option[T], Option[U])]
> {noformat}
>
> The return type of *fullOuterJoinWith* is imperfect, since *(None, None)* is
> an invalid case, but it is still an improvement on the present interface.