[ 
https://issues.apache.org/jira/browse/SPARK-11894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15019199#comment-15019199
 ] 

Xiao Li commented on SPARK-11894:
---------------------------------

The query plans for the Dataset join:

== Parsed Logical Plan ==
Project [struct(_1#2,_2#3) AS _1#10,struct(_1#7,_2#8) AS _2#11]
 Join Inner, Some(true)
  LocalRelation [_1#2,_2#3], [[1,0,1800000001,31],[0,16,1800000001,32]]
  LocalRelation [_1#7,_2#8], [[1,0,1800000001,31],[0,16,1800000001,32]]

== Analyzed Logical Plan ==
_1: struct<_1:int,_2:string>, _2: struct<_1:int,_2:string>
Project [struct(_1#2,_2#3) AS _1#10,struct(_1#7,_2#8) AS _2#11]
 Join Inner, Some(true)
  LocalRelation [_1#2,_2#3], [[1,0,1800000001,31],[0,16,1800000001,32]]
  LocalRelation [_1#7,_2#8], [[1,0,1800000001,31],[0,16,1800000001,32]]

== Optimized Logical Plan ==
Project [struct(_1#2,_2#3) AS _1#10,struct(_1#7,_2#8) AS _2#11]
 Join Inner, None
  LocalRelation [_1#2,_2#3], [[1,0,1800000001,31],[0,16,1800000001,32]]
  LocalRelation [_1#7,_2#8], [[1,0,1800000001,31],[0,16,1800000001,32]]

== Physical Plan ==
Project [struct(_1#2,_2#3) AS _1#10,struct(_1#7,_2#8) AS _2#11]
 BroadcastNestedLoopJoin BuildLeft, Inner, None
  LocalTableScan [_1#2,_2#3], [[1,0,1800000001,31],[0,16,1800000001,32]]
  LocalTableScan [_1#7,_2#8], [[1,0,1800000001,31],[0,16,1800000001,32]]


> Incorrect results are returned when using null
> ----------------------------------------------
>
>                 Key: SPARK-11894
>                 URL: https://issues.apache.org/jira/browse/SPARK-11894
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 1.6.0
>            Reporter: Xiao Li
>
> In the Dataset API, the following two Datasets are treated as the same:
>   Seq((new java.lang.Integer(0), "1"), (new java.lang.Integer(22), "2")).toDS()
>   Seq((null.asInstanceOf[java.lang.Integer], "1"), (new java.lang.Integer(22), "2")).toDS()
> Note: java.lang.Integer is nullable.
> This can produce an incorrect result. For example:
>     val ds1 = Seq((null.asInstanceOf[java.lang.Integer], "1"), (new java.lang.Integer(22), "2")).toDS()
>     val ds2 = Seq((null.asInstanceOf[java.lang.Integer], "1"), (new java.lang.Integer(22), "2")).toDS() // toDF("key", "value").as('df2)
>     val res1 = ds1.joinWith(ds2, lit(true)).collect()
> The expected result is
> ((null,1),(null,1))
> ((22,2),(null,1))
> ((null,1),(22,2))
> ((22,2),(22,2))
> The actual result is 
> ((0,1),(0,1))
> ((22,2),(0,1))
> ((0,1),(22,2))
> ((22,2),(22,2))
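
The expected result quoted above can be reproduced without Spark, which may help when writing a regression test. The following is only a minimal sketch in plain Scala collections: the names NullJoinSketch, Row, rows and crossJoin are hypothetical and are not part of Spark or of the original report. It assumes that joinWith on a constant-true condition behaves like a Cartesian product, and it iterates the right side in the outer loop only so that the rows come out in the same order as the expected listing.

```scala
object NullJoinSketch {
  // Boxed integers are used so that the key can hold null, as in the report.
  // (Hypothetical alias, not a Spark type.)
  type Row = (java.lang.Integer, String)

  val rows: Seq[Row] = Seq(
    (null.asInstanceOf[java.lang.Integer], "1"),
    (java.lang.Integer.valueOf(22), "2")
  )

  // A join on a constant-true condition pairs every left row with every
  // right row. Null keys are carried through unchanged.
  def crossJoin(left: Seq[Row], right: Seq[Row]): Seq[(Row, Row)] =
    for (r <- right; l <- left) yield (l, r)

  def main(args: Array[String]): Unit =
    crossJoin(rows, rows).foreach(println)
}
```

Running main prints the four pairs with the null keys preserved, i.e. the expected result rather than the actual one with 0 in place of null.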



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
