[jira] [Commented] (SPARK-29626) notEqual() should return true when the one is null, the other is not null
[ https://issues.apache.org/jira/browse/SPARK-29626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17360635#comment-17360635 ]

dc-heros commented on SPARK-29626:
----------------------------------

Should this function be added to PySpark too?

> notEqual() should return true when the one is null, the other is not null
> -------------------------------------------------------------------------
>
> Key: SPARK-29626
> URL: https://issues.apache.org/jira/browse/SPARK-29626
> Project: Spark
> Issue Type: Improvement
> Components: Documentation
> Affects Versions: 3.1.0
> Reporter: zhouhuazheng
> Priority: Minor
>
> When one value is null and the other is not, we expect notEqual() to
> return true.
> e.g.:
> scala> df.show()
> +------+-------+
> |   age|   name|
> +------+-------+
> |  null|Michael|
> |    30|   Andy|
> |    19| Justin|
> |    35|   null|
> |    19| Justin|
> |  null|   null|
> |Justin| Justin|
> |    19|     19|
> +------+-------+
> scala> df.filter(col("age").notEqual(col("name"))).show
> +---+------+
> |age|  name|
> +---+------+
> | 30|  Andy|
> | 19|Justin|
> | 19|Justin|
> +---+------+
> scala> df.filter(col("age").equalTo(col("name"))).show
> +------+------+
> |   age|  name|
> +------+------+
> |  null|  null|
> |Justin|Justin|
> |    19|    19|
> +------+------+

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-29626) notEqual() should return true when the one is null, the other is not null
[ https://issues.apache.org/jira/browse/SPARK-29626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17360632#comment-17360632 ]

dc-heros commented on SPARK-29626:
----------------------------------

OK, thanks. I will create a pull request to add notEqualNullSafe.
[jira] [Commented] (SPARK-29626) notEqual() should return true when the one is null, the other is not null
[ https://issues.apache.org/jira/browse/SPARK-29626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17360071#comment-17360071 ]

Wenchen Fan commented on SPARK-29626:
-------------------------------------

Note: once there is a null value, both `eq` and `notEqual` return null.
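To illustrate the note above, here is a minimal Spark-free sketch of the three-valued comparison it describes. It is only a model, not Spark's implementation: `None` stands in for SQL NULL, and the names `ThreeValuedSketch`, `sqlEqual`, `sqlNotEqual` and `keptByFilter` are hypothetical, chosen for this example.

```scala
// A Spark-free model of SQL three-valued comparison; None stands in for SQL NULL.
// All names here are hypothetical and exist only for illustration.
object ThreeValuedSketch {
  // Standard equality: a NULL on either side makes the result NULL (None).
  def sqlEqual[A](left: Option[A], right: Option[A]): Option[Boolean] =
    for (l <- left; r <- right) yield l == r

  // Standard inequality: the negation of sqlEqual, so NULL stays NULL.
  def sqlNotEqual[A](left: Option[A], right: Option[A]): Option[Boolean] =
    sqlEqual(left, right).map(!_)

  // A filter keeps a row only when the predicate is definitely true.
  def keptByFilter(predicate: Option[Boolean]): Boolean =
    predicate.contains(true)

  def main(args: Array[String]): Unit = {
    // A few (age, name) pairs in the spirit of the report.
    val rows: Seq[(Option[String], Option[String])] = Seq(
      (None, Some("Michael")),
      (Some("30"), Some("Andy")),
      (Some("35"), None),
      (Some("19"), Some("19"))
    )
    rows.foreach { case (age, name) =>
      val result = sqlNotEqual(age, name)
      println(s"$age notEqual $name = $result, kept by filter: ${keptByFilter(result)}")
    }
  }
}
```

In this model, a pair with a null on one side evaluates to `None` under both `sqlEqual` and `sqlNotEqual`, so neither filter keeps it, which matches the rows missing from the reporter's `notEqual` output.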
[jira] [Commented] (SPARK-29626) notEqual() should return true when the one is null, the other is not null
[ https://issues.apache.org/jira/browse/SPARK-29626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17360070#comment-17360070 ]

Wenchen Fan commented on SPARK-29626:
-------------------------------------

Yea this is the expected behavior. Perhaps we should add a `notEqualNullSafe` for this case, which is basically `not($"c1".eqNullSafe($"c2"))`.
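The suggested expression can be tried today without any new API. The sketch below is one possible way to do so, under a few assumptions: `notEqualNullSafe` is written as a hypothetical standalone helper (it is not claimed to be the signature any eventual pull request would use), the data is a small subset of the reporter's rows with real SQL NULLs, and a local SparkSession is created only for the demonstration.

```scala
import org.apache.spark.sql.{Column, SparkSession}
import org.apache.spark.sql.functions.{col, not}

object NotEqualNullSafeSketch {
  // Hypothetical helper for illustration: the negation of null-safe equality,
  // i.e. the not(c1.eqNullSafe(c2)) expression quoted in the comment above.
  def notEqualNullSafe(left: Column, right: Column): Column =
    not(left.eqNullSafe(right))

  def main(args: Array[String]): Unit = {
    // Local session for the demo only (assumption, not from the thread).
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("notEqualNullSafe-sketch")
      .getOrCreate()
    import spark.implicits._

    // A subset of the rows from the report; null is a real SQL NULL here.
    val df = Seq[(String, String)](
      (null, "Michael"),
      ("30", "Andy"),
      ("35", null),
      ("19", "19")
    ).toDF("age", "name")

    // Standard inequality: rows with a NULL on either side are dropped.
    df.filter(col("age").notEqual(col("name"))).show()

    // Null-safe inequality: rows where only one side is NULL are kept as well.
    df.filter(notEqualNullSafe(col("age"), col("name"))).show()

    spark.stop()
  }
}
```

Because `eqNullSafe` never returns null, its negation does not either, so a row with a null on exactly one side passes the second filter while a row with nulls on both sides does not.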
[jira] [Commented] (SPARK-29626) notEqual() should return true when the one is null, the other is not null
[ https://issues.apache.org/jira/browse/SPARK-29626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17359900#comment-17359900 ]

dc-heros commented on SPARK-29626:
----------------------------------

[~cloud_fan] could you give your opinion?
[jira] [Commented] (SPARK-29626) notEqual() should return true when the one is null, the other is not null
[ https://issues.apache.org/jira/browse/SPARK-29626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17359686#comment-17359686 ]

dc-heros commented on SPARK-29626:
----------------------------------

I think this behavior is acceptable, as it is similar to how other databases such as MySQL compare two null values. I don't think we should change it here, since that would affect the existing data distribution.
[jira] [Commented] (SPARK-29626) notEqual() should return true when the one is null, the other is not null
[ https://issues.apache.org/jira/browse/SPARK-29626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17358969#comment-17358969 ]

dc-heros commented on SPARK-29626:
----------------------------------

I would like to work on this
[jira] [Commented] (SPARK-29626) notEqual() should return true when the one is null, the other is not null
[ https://issues.apache.org/jira/browse/SPARK-29626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16961770#comment-16961770 ]

Aman Omer commented on SPARK-29626:
-----------------------------------

Looking into this one.

> notEqual() should return true when the one is null, the other is not null
> -------------------------------------------------------------------------
>
> Key: SPARK-29626
> URL: https://issues.apache.org/jira/browse/SPARK-29626
> Project: Spark
> Issue Type: Improvement
> Components: Documentation
> Affects Versions: 2.4.4
> Reporter: zhouhuazheng
> Priority: Minor