[ 
https://issues.apache.org/jira/browse/SPARK-11894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15019263#comment-15019263
 ] 

Xiao Li commented on SPARK-11894:
---------------------------------

So far, I have not found any obvious bugs in Projection. The logical plan is 
correct, and all the nullability attributes are set to true. 

However, I observed that the following two datasets produce the same local 
relation:
-- Seq((new java.lang.Integer(0), "1"), (new java.lang.Integer(22), "2")).toDS()
-- Seq((null.asInstanceOf[java.lang.Integer], "1"), (new java.lang.Integer(22), "2")).toDS()

LocalRelation _1#2,_2#3, [[1,0,1800000001,31],[0,16,1800000001,32]]
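
For readers unfamiliar with the row notation above, here is a minimal sketch 
of one way to read it. It ASSUMES each row is printed as hexadecimal 8-byte 
words in an UnsafeRow-style layout: one null-bitset word, then one word per 
field, with a variable-length field packing its offset in the upper 32 bits 
and its size in the lower 32 bits. DecodeRow and Decoded are hypothetical 
names, not part of Spark, and the interpretation is only as good as that 
layout assumption.

```scala
// Hypothetical helper, not part of Spark. Decodes one printed row of a
// two-column (integer, string) relation, ASSUMING UnsafeRow-style hex words.
object DecodeRow {
  final case class Decoded(
    firstFieldNullBit: Boolean, // bit 0 of the assumed null-bitset word
    intValue: Long,             // word for field 0, read as a number
    strOffset: Long,            // upper 32 bits of the word for field 1
    strSize: Long,              // lower 32 bits of the word for field 1
    strFirstChar: Char          // lowest byte of the variable-length word
  )

  def decode(printed: String): Decoded = {
    // Each comma-separated token is parsed as a hexadecimal 64-bit word.
    val words = printed.split(",").map(w => java.lang.Long.parseLong(w.trim, 16))
    val packed = words(2)
    Decoded(
      firstFieldNullBit = (words(0) & 1L) != 0L,
      intValue = words(1),
      strOffset = packed >>> 32,
      strSize = packed & 0xFFFFFFFFL,
      strFirstChar = (words(3) & 0xFFL).toChar
    )
  }

  def main(args: Array[String]): Unit = {
    // The two rows shown in the LocalRelation line above.
    Seq("1,0,1800000001,31", "0,16,1800000001,32")
      .foreach(row => println(decode(row)))
  }
}
```

Under this reading, the second word of the second row (hex 16) is the integer 
22 and the trailing words (hex 31 and 32) are the characters "1" and "2", 
which matches the input data.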

DataFrame can still return a correct result. I will spend more time checking 
whether the bug comes from Catalyst. 

Thank you!

> Incorrect results are returned when using null
> ----------------------------------------------
>
>                 Key: SPARK-11894
>                 URL: https://issues.apache.org/jira/browse/SPARK-11894
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 1.6.0
>            Reporter: Xiao Li
>
> In the Dataset API, the following two datasets are treated as the same: 
>   Seq((new java.lang.Integer(0), "1"), (new java.lang.Integer(22), "2")).toDS()
>   Seq((null.asInstanceOf[java.lang.Integer], "1"), (new java.lang.Integer(22), "2")).toDS()
> Note: java.lang.Integer is nullable. 
> This can produce an incorrect result. For example: 
>     val ds1 = Seq((null.asInstanceOf[java.lang.Integer], "1"), (new java.lang.Integer(22), "2")).toDS()
>     val ds2 = Seq((null.asInstanceOf[java.lang.Integer], "1"), (new java.lang.Integer(22), "2")).toDS()//toDF("key", "value").as('df2)
>     val res1 = ds1.joinWith(ds2, lit(true)).collect()
> The expected result should be 
> ((null,1),(null,1))
> ((22,2),(null,1))
> ((null,1),(22,2))
> ((22,2),(22,2))
> The actual result is 
> ((0,1),(0,1))
> ((22,2),(0,1))
> ((0,1),(22,2))
> ((22,2),(22,2))
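
The expected result listed in the issue description can be reproduced with 
plain Scala collections, without Spark. The sketch below is only a reference 
model of an always-true join that keeps nulls intact. ExpectedJoin and 
crossJoin are hypothetical names introduced for illustration, and the pair 
ordering is chosen to mirror the listing above rather than any ordering 
guaranteed by Spark.

```scala
// Hypothetical reference model, not Spark code: a cross join over plain
// Scala sequences, used only to show what a null-preserving join yields.
object ExpectedJoin {
  type Row = (java.lang.Integer, String)

  // Same data as ds1 and ds2 in the issue description.
  val rows: Seq[Row] = Seq(
    (null.asInstanceOf[java.lang.Integer], "1"),
    (java.lang.Integer.valueOf(22), "2")
  )

  // Pair every left row with every right row (join condition always true).
  def crossJoin(left: Seq[Row], right: Seq[Row]): Seq[(Row, Row)] =
    for (r <- right; l <- left) yield (l, r)

  def main(args: Array[String]): Unit =
    crossJoin(rows, rows).foreach(println)
}
```

The boxed java.lang.Integer stays null through the join here, whereas the 
actual result reported above shows 0 in its place.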



