[ https://issues.apache.org/jira/browse/SPARK-21160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16058537#comment-16058537 ]
Hyukjin Kwon edited comment on SPARK-21160 at 6/22/17 12:40 AM:
----------------------------------------------------------------

There is a null-safe equality comparison:

{code}
scala> Seq(Some(1), Some(2), None).toDF("a").where("a != 1").show()
+---+
|  a|
+---+
|  2|
+---+

scala> Seq(Some(1), Some(2), None).toDF("a").where("not(a <=> 1)").show()
+----+
|   a|
+----+
|   2|
|null|
+----+
{code}

I am resolving this. Issuing warnings would clutter the logs, and I believe the RDBMSs and R you tested, as well as their references, do not produce such warnings.

> Filtering rows with "not equal" operator yields unexpected result with null rows
> ---------------------------------------------------------------------------------
>
>             Key: SPARK-21160
>             URL: https://issues.apache.org/jira/browse/SPARK-21160
>         Project: Spark
>      Issue Type: Bug
>      Components: PySpark, Spark Core, SQL
> Affects Versions: 2.0.2
>        Reporter: Edoardo Vivo
>        Priority: Minor
>
> ```
> from pyspark.sql.types import StructType, StructField, DoubleType
>
> schema = StructType([StructField("Test", DoubleType())])
> test2 = spark.createDataFrame([[1.0], [1.0], [2.0], [2.0], [None]], schema=schema)
> test2.where("Test != 1").show()
> ```
> This returns only the rows with the value 2; it does not return the null row. This should not be the expected behavior, IMO.
> Thank you.
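
For the PySpark side, a minimal sketch of the same null-safe comparison, assuming the reporter's schema and data from the description above; the `<=>` operator is reachable from Python through SQL expression strings passed to `where`:

{code}
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, DoubleType

spark = SparkSession.builder.getOrCreate()

schema = StructType([StructField("Test", DoubleType())])
test2 = spark.createDataFrame([[1.0], [1.0], [2.0], [2.0], [None]], schema=schema)

# "Test != 1" follows SQL three-valued logic: NULL != 1 evaluates to
# NULL, which is not true, so the null row is dropped from the result.
test2.where("Test != 1").show()

# The null-safe operator <=> treats NULL as an ordinary comparable
# value, so negating it keeps the null row in the result.
test2.where("NOT (Test <=> 1)").show()
{code}

The second query should return the two 2.0 rows and the null row, matching the Scala output above.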