Which release did you use? I tried your example on the master branch:
scala> val test2 = Seq(Test(2), Test(3), Test(4)).toDS
test2: org.apache.spark.sql.Dataset[Test] = [id: int]

scala> test1.as("t1").joinWith(test2.as("t2"), $"t1.id" === $"t2.id", "left_outer").show
+---+------+
| _1|    _2|
+---+------+
|[1]|[null]|
|[2]|   [2]|
|[3]|   [3]|
+---+------+

On Fri, May 27, 2016 at 1:01 PM, Tim Gautier <tim.gaut...@gmail.com> wrote:
> Is it truly impossible to left join a Dataset[T] on the right if T has any
> non-option fields? It seems Spark tries to create Ts with null values in
> all fields when left joining, which results in null pointer exceptions. In
> fact, I haven't found any other way to get around this issue without making
> all fields in T options. Is there any other way?
>
> Example:
>
> case class Test(id: Int)
> val test1 = Seq(Test(1), Test(2), Test(3)).toDS
> val test2 = Seq(Test(2), Test(3), Test(4)).toDS
> test1.as("t1").joinWith(test2.as("t2"), $"t1.id" === $"t2.id",
> "left_outer").show
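For reference, here is a minimal sketch of the Option-fields workaround Tim
mentions, in case your release still hits the NullPointerException. TestOpt is
just an illustrative name, and this assumes a spark-shell session so that
spark.implicits._ is already in scope:

// Every field is an Option, so Spark can represent an unmatched
// right-side row without forcing null into a non-nullable Int.
case class TestOpt(id: Option[Int])

val t1 = Seq(TestOpt(Some(1)), TestOpt(Some(2)), TestOpt(Some(3))).toDS
val t2 = Seq(TestOpt(Some(2)), TestOpt(Some(3)), TestOpt(Some(4))).toDS

t1.as("t1").joinWith(t2.as("t2"), $"t1.id" === $"t2.id", "left_outer").show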