Re: Inconsistent joinWith behavior?

2016-06-20 Thread Richard Marscher
Hi, thanks for the response. I have created a JIRA ticket: https://issues.apache.org/jira/browse/SPARK-16076

On Mon, Jun 20, 2016 at 2:52 PM, Yin Huai wrote:
> Hello Richard,
>
> Looks like the Dataset is Dataset[(Int, Int)]. I guess for the case of
> "ds.joinWith(other, expr, Outer).map({ case

Re: Inconsistent joinWith behavior?

2016-06-20 Thread Yin Huai
Hello Richard, it looks like the Dataset is Dataset[(Int, Int)]. I guess that in the case of "ds.joinWith(other, expr, Outer).map({ case (t, u) => (Option(t), Option(u)) })" we are trying to use null to create an "(Int, Int)", and it somehow ends up as a Tuple2 holding default values. Can you create a
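
The guess above can be illustrated in plain Scala, without Spark. This is only a sketch of the suspected mechanism, not Spark's actual code path: the object name NullToDefaultSketch and the helper wrap are hypothetical, and the "materialized" tuple is built by hand to stand in for whatever the encoder might produce. It shows that Option(...) turns a truly null side into None, but cannot recover a missing side once it has been turned into a tuple of default values.

```scala
object NullToDefaultSketch {
  // Hypothetical helper mirroring the map call quoted above:
  // wrap both sides of a joined pair in Option.
  def wrap[T, U](pair: (T, U)): (Option[T], Option[U]) = pair match {
    case (t, u) => (Option(t), Option(u))
  }

  def main(args: Array[String]): Unit = {
    // Case 1: the unmatched side is a genuine null reference, so Option gives None.
    val missing: (Int, Int) = null
    println(wrap(((1, 2), missing)))      // (Some((1,2)),None)

    // Casting null to a primitive Int yields the type's default value, 0.
    val defaultInt = null.asInstanceOf[Int]

    // Case 2 (assumption): the unmatched side was built from defaults
    // instead of staying null, so Option sees a real tuple.
    val materialized = (defaultInt, defaultInt)
    println(wrap(((1, 2), materialized))) // (Some((1,2)),Some((0,0)))
  }
}
```

If the second case is what happens inside the join, wrapping in Option after the fact cannot tell an unmatched row apart from a real (0, 0) row.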

Inconsistent joinWith behavior?

2016-06-20 Thread Richard Marscher
I know that outer join was recently changed to preserve actual nulls through the join in https://github.com/apache/spark/pull/13425. However, I am seeing what looks like inconsistent behavior depending on how the join result is used. In one case the default datatype values are still used instead of nulls