[jira] [Commented] (SPARK-28186) array_contains returns null instead of false when one of the items in the array is null

2019-07-01 Thread Takeshi Yamamuro (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-28186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16876650#comment-16876650
 ] 

Takeshi Yamamuro commented on SPARK-28186:
--

I also think this is the right behavior, as Marco said. If there are no more
comments, I'll close this. Thanks.

> array_contains returns null instead of false when one of the items in the 
> array is null
> ---
>
> Key: SPARK-28186
> URL: https://issues.apache.org/jira/browse/SPARK-28186
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.3.0
> Reporter: Alex Kushnir
> Priority: Major
>
> If an array contains a null item, array_contains returns true when the
> searched item is found, but returns null instead of false when it is not
> found:
> Seq(
>   (1, Seq("a", "b", "c")),
>   (2, Seq("a", "b", null, "c"))
> ).toDF("id", "vals").createOrReplaceTempView("tbl")
> spark.sql("select id, vals, array_contains(vals, 'a') as has_a, " +
>   "array_contains(vals, 'd') as has_d from tbl").show
> +---+----------+-----+-----+
> | id|      vals|has_a|has_d|
> +---+----------+-----+-----+
> |  1| [a, b, c]| true|false|
> |  2|[a, b,, c]| true| null|
> +---+----------+-----+-----+



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28186) array_contains returns null instead of false when one of the items in the array is null

2019-07-01 Thread Marco Gaido (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-28186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16876509#comment-16876509
 ] 

Marco Gaido commented on SPARK-28186:
-

You're right about that. The equivalent in Postgres is {{=ANY}}, which behaves
like current Spark. So I don't see a strong motivation to change the current
Spark behavior.
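
One way to read the behavior discussed here is through SQL three-valued logic: a null element is an unknown value, so a failed search cannot rule out a match. The following is a minimal sketch in plain Scala, not Spark's implementation; the object and method names are hypothetical, and None stands in for SQL NULL.

```scala
// Hypothetical model of three-valued membership; not Spark's actual code.
// None stands in for SQL NULL (an unknown value).
object ThreeValuedContains {
  def arrayContains[A](values: Seq[Option[A]], target: A): Option[Boolean] =
    if (values.contains(Some(target))) Some(true) // a definite match wins
    else if (values.contains(None)) None          // an unknown element might equal target
    else Some(false)                              // all elements known, none equals target
}
```

Under this model, searching Seq(Some("a"), Some("b"), None, Some("c")) for "d" yields None, while the same search over an array without nulls yields Some(false).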




[jira] [Commented] (SPARK-28186) array_contains returns null instead of false when one of the items in the array is null

2019-07-01 Thread Alex Kushnir (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-28186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16876386#comment-16876386
 ] 

Alex Kushnir commented on SPARK-28186:
--

I'm porting a Hive workload to Spark. In Hive it works as expected:

select array_contains(array('a','b',null,'c'),'a'), 
array_contains(array('a','b',null,'c'), 'd')

returns true, false.
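
For a workload that relies on a true/false result, one possible workaround is to wrap the call in coalesce so that a null result becomes false. This is only a sketch under assumptions: the object name, the helper containsOrFalse, and the local SparkSession settings are hypothetical, and whether this matches Hive in every edge case has not been verified here.

```scala
import org.apache.spark.sql.{Column, SparkSession}
import org.apache.spark.sql.functions.{array_contains, coalesce, col, lit}

object HiveStyleContains {
  // Hypothetical helper: maps a null result from array_contains to false.
  def containsOrFalse(arr: Column, value: Any): Column =
    coalesce(array_contains(arr, value), lit(false))

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; settings are assumptions.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("hive-style-contains")
      .getOrCreate()
    import spark.implicits._

    // Same data as in the issue description.
    val df = Seq(
      (1, Seq("a", "b", "c")),
      (2, Seq("a", "b", null, "c"))
    ).toDF("id", "vals")

    df.select(
      col("id"),
      containsOrFalse(col("vals"), "a").as("has_a"),
      containsOrFalse(col("vals"), "d").as("has_d")
    ).show()

    spark.stop()
  }
}
```

Note that this also turns the result for a null array into false, so the distinction between "not found" and "unknown" is lost.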




[jira] [Commented] (SPARK-28186) array_contains returns null instead of false when one of the items in the array is null

2019-07-01 Thread Marco Gaido (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-28186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16876357#comment-16876357
 ] 

Marco Gaido commented on SPARK-28186:
-

Do you know of any SQL DB with the behavior you are suggesting?




[jira] [Commented] (SPARK-28186) array_contains returns null instead of false when one of the items in the array is null

2019-07-01 Thread Alex Kushnir (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-28186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16876304#comment-16876304
 ] 

Alex Kushnir commented on SPARK-28186:
--

Because the array ["a","b",null,"c"] clearly does not contain "d", I would
expect it to return false, not null. Why are you saying that this is the
correct behavior?




[jira] [Commented] (SPARK-28186) array_contains returns null instead of false when one of the items in the array is null

2019-06-29 Thread Marco Gaido (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-28186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16875447#comment-16875447
 ] 

Marco Gaido commented on SPARK-28186:
-

This is the right behavior AFAIK. Why are you saying it is wrong?
