[jira] [Commented] (SPARK-29626) notEqual() should return true when the one is null, the other is not null

2021-06-10 Thread dc-heros (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17360635#comment-17360635
 ] 

dc-heros commented on SPARK-29626:
--

Should this function be added to PySpark too?

 

> notEqual() should return true when the one is null, the other is not null
> -------------------------------------------------------------------------
>
>                 Key: SPARK-29626
>                 URL: https://issues.apache.org/jira/browse/SPARK-29626
>             Project: Spark
>          Issue Type: Improvement
>          Components: Documentation
>    Affects Versions: 3.1.0
>            Reporter: zhouhuazheng
>            Priority: Minor
>
> When one value is null and the other is not, we expect notEqual() to
> return true.
> For example:
> scala> df.show()
> +------+-------+
> |   age|   name|
> +------+-------+
> |  null|Michael|
> |    30|   Andy|
> |    19| Justin|
> |    35|   null|
> |    19| Justin|
> |  null|   null|
> |Justin| Justin|
> |    19|     19|
> +------+-------+
> scala> df.filter(col("age").notEqual(col("name"))).show
> +---+------+
> |age|  name|
> +---+------+
> | 30|  Andy|
> | 19|Justin|
> | 19|Justin|
> +---+------+
> scala> df.filter(col("age").equalTo(col("name"))).show
> +------+------+
> |   age|  name|
> +------+------+
> |  null|  null|
> |Justin|Justin|
> |    19|    19|
> +------+------+



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-29626) notEqual() should return true when the one is null, the other is not null

2021-06-10 Thread dc-heros (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17360632#comment-17360632
 ] 

dc-heros commented on SPARK-29626:
--

OK, thanks. I will create a pull request to add `notEqualNullSafe`.

 




[jira] [Commented] (SPARK-29626) notEqual() should return true when the one is null, the other is not null

2021-06-09 Thread Wenchen Fan (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17360071#comment-17360071
 ] 

Wenchen Fan commented on SPARK-29626:
-

Note: when either operand is null, both `eq` and `notEqual` return null.
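
The null semantics described here can be sketched in plain Scala. This is a hypothetical model, not Spark's implementation: `None` stands in for SQL NULL, and the names `NullComparison`, `sqlEq`, `sqlNotEqual`, and `keep` are illustrative assumptions. It shows why a filter on `notEqual` drops every row with a null operand: the predicate evaluates to null rather than true.

```scala
// Hypothetical model of SQL three-valued comparison; not Spark's code.
object NullComparison {
  // None stands in for SQL NULL. The result is None (null) as soon as
  // either operand is null, otherwise Some(true) or Some(false).
  def sqlEq(a: Option[String], b: Option[String]): Option[Boolean] =
    for (x <- a; y <- b) yield x == y

  // Negating a null result still gives null.
  def sqlNotEqual(a: Option[String], b: Option[String]): Option[Boolean] =
    sqlEq(a, b).map(!_)

  // A filter keeps a row only when the predicate is definitely true,
  // so null and false are both dropped.
  def keep(predicate: Option[Boolean]): Boolean = predicate.contains(true)
}
```

Under this model, a row such as (null, "Michael") is dropped by both an equality filter and an inequality filter, because neither predicate is true for it.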




[jira] [Commented] (SPARK-29626) notEqual() should return true when the one is null, the other is not null

2021-06-09 Thread Wenchen Fan (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17360070#comment-17360070
 ] 

Wenchen Fan commented on SPARK-29626:
-

Yea this is the expected behavior. Perhaps we should add a `notEqualNullSafe` 
for this case, which is basically `not($"c1".eqNullSafe($"c2"))`.
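
A minimal sketch of the proposed semantics, written in plain Scala without a Spark dependency. It is a model under stated assumptions, not the eventual Spark API: `None` stands in for SQL NULL, the object name `NullSafeComparison` and the `rows` and `kept` values are hypothetical, and the sample rows come from the issue's table with its `null` cells read as real nulls.

```scala
// Hypothetical model of null-safe comparison; not Spark's implementation.
object NullSafeComparison {
  // None stands in for SQL NULL (an assumption of this model).
  // Null-safe equality always yields true or false: two nulls are equal,
  // and a null never equals a non-null value.
  def eqNullSafe(a: Option[String], b: Option[String]): Boolean = a == b

  // The proposed notEqualNullSafe, modeled as the negation of eqNullSafe.
  def notEqualNullSafe(a: Option[String], b: Option[String]): Boolean =
    !eqNullSafe(a, b)

  // Rows from the issue's example table, reading "null" as a real null.
  val rows: Seq[(Option[String], Option[String])] = Seq(
    (None, Some("Michael")),
    (Some("30"), Some("Andy")),
    (Some("19"), Some("Justin")),
    (Some("35"), None),
    (Some("19"), Some("Justin")),
    (None, None),
    (Some("Justin"), Some("Justin")),
    (Some("19"), Some("19"))
  )

  // Rows a filter on notEqualNullSafe would keep in this model.
  val kept: Seq[(Option[String], Option[String])] =
    rows.filter { case (age, name) => notEqualNullSafe(age, name) }
}
```

In this model `kept` holds five rows, including the two with exactly one null, which are the rows the reporter expected `notEqual()` to return.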




[jira] [Commented] (SPARK-29626) notEqual() should return true when the one is null, the other is not null

2021-06-09 Thread dc-heros (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17359900#comment-17359900
 ] 

dc-heros commented on SPARK-29626:
--

[~cloud_fan] could you give your opinion?

 




[jira] [Commented] (SPARK-29626) notEqual() should return true when the one is null, the other is not null

2021-06-08 Thread dc-heros (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17359686#comment-17359686
 ] 

dc-heros commented on SPARK-29626:
--

I think this behavior is acceptable, since it matches how other databases 
such as MySQL compare null values. I don't think we should change it here, 
as a change would affect the existing data distribution.




[jira] [Commented] (SPARK-29626) notEqual() should return true when the one is null, the other is not null

2021-06-07 Thread dc-heros (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17358969#comment-17358969
 ] 

dc-heros commented on SPARK-29626:
--

I would like to work on this




[jira] [Commented] (SPARK-29626) notEqual() should return true when the one is null, the other is not null

2019-10-29 Thread Aman Omer (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16961770#comment-16961770
 ] 

Aman Omer commented on SPARK-29626:
---

Looking into this one.

> notEqual() should return true when the one is null, the other is not null
> -------------------------------------------------------------------------
>
>                 Key: SPARK-29626
>                 URL: https://issues.apache.org/jira/browse/SPARK-29626
>             Project: Spark
>          Issue Type: Improvement
>          Components: Documentation
>    Affects Versions: 2.4.4
>            Reporter: zhouhuazheng
>            Priority: Minor
>
> When one value is null and the other is not, we expect notEqual() to
> return true.
> For example:
> scala> df.show()
> +------+-------+
> |   age|   name|
> +------+-------+
> |  null|Michael|
> |    30|   Andy|
> |    19| Justin|
> |    35|   null|
> |    19| Justin|
> |  null|   null|
> |Justin| Justin|
> |    19|     19|
> +------+-------+
> scala> df.filter(col("age").notEqual(col("name"))).show
> +---+------+
> |age|  name|
> +---+------+
> | 30|  Andy|
> | 19|Justin|
> | 19|Justin|
> +---+------+
> scala> df.filter(col("age").equalTo(col("name"))).show
> +------+------+
> |   age|  name|
> +------+------+
> |  null|  null|
> |Justin|Justin|
> |    19|    19|
> +------+------+


