Re: [PR] [SPARK-47296][SQL][COLLATION] Fail unsupported functions for non-binary collations [spark]

2024-03-20 Thread via GitHub
cloud-fan closed pull request #45422: [SPARK-47296][SQL][COLLATION] Fail unsupported functions for non-binary collations URL: https://github.com/apache/spark/pull/45422 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

Re: [PR] [SPARK-47296][SQL][COLLATION] Fail unsupported functions for non-binary collations [spark]

2024-03-20 Thread via GitHub
cloud-fan commented on PR #45422: URL: https://github.com/apache/spark/pull/45422#issuecomment-2009773392 thanks, merging to master!

Re: [PR] [SPARK-47296][SQL][COLLATION] Fail unsupported functions for non-binary collations [spark]

2024-03-19 Thread via GitHub
mihailom-db commented on PR #45422: URL: https://github.com/apache/spark/pull/45422#issuecomment-2008104827 > LGTM > > As a follow up we should revisit error messages. IMO it is weird to expose message with "string_any_collation" type to customer. But I think that we can do that as

Re: [PR] [SPARK-47296][SQL][COLLATION] Fail unsupported functions for non-binary collations [spark]

2024-03-19 Thread via GitHub
mihailom-db commented on code in PR #45422: URL: https://github.com/apache/spark/pull/45422#discussion_r1531040306 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala: ## @@ -994,8 +994,12 @@ object TypeCoercion extends TypeCoercionBase {

Re: [PR] [SPARK-47296][SQL][COLLATION] Fail unsupported functions for non-binary collations [spark]

2024-03-19 Thread via GitHub
mihailom-db commented on code in PR #45422: URL: https://github.com/apache/spark/pull/45422#discussion_r1530992551 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/AnsiTypeCoercion.scala: ## @@ -215,6 +220,10 @@ object AnsiTypeCoercion extends

Re: [PR] [SPARK-47296][SQL][COLLATION] Fail unsupported functions for non-binary collations [spark]

2024-03-19 Thread via GitHub
cloud-fan commented on code in PR #45422: URL: https://github.com/apache/spark/pull/45422#discussion_r1530688780 ## sql/core/src/test/scala/org/apache/spark/sql/CollationStringExpressionsSuite.scala: ## @@ -0,0 +1,73 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] [SPARK-47296][SQL][COLLATION] Fail unsupported functions for non-binary collations [spark]

2024-03-19 Thread via GitHub
cloud-fan commented on code in PR #45422: URL: https://github.com/apache/spark/pull/45422#discussion_r1530687285 ## sql/core/src/test/scala/org/apache/spark/sql/CollationRegexpExpressionsSuite.scala: ## @@ -0,0 +1,438 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] [SPARK-47296][SQL][COLLATION] Fail unsupported functions for non-binary collations [spark]

2024-03-19 Thread via GitHub
cloud-fan commented on code in PR #45422: URL: https://github.com/apache/spark/pull/45422#discussion_r1530684010 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala: ## @@ -994,8 +994,12 @@ object TypeCoercion extends TypeCoercionBase {

Re: [PR] [SPARK-47296][SQL][COLLATION] Fail unsupported functions for non-binary collations [spark]

2024-03-19 Thread via GitHub
cloud-fan commented on code in PR #45422: URL: https://github.com/apache/spark/pull/45422#discussion_r1530682774 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/AnsiTypeCoercion.scala: ## @@ -205,6 +205,11 @@ object AnsiTypeCoercion extends

Re: [PR] [SPARK-47296][SQL][COLLATION] Fail unsupported functions for non-binary collations [spark]

2024-03-19 Thread via GitHub
cloud-fan commented on code in PR #45422: URL: https://github.com/apache/spark/pull/45422#discussion_r1530677086 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala: ## @@ -994,8 +994,12 @@ object TypeCoercion extends TypeCoercionBase {

Re: [PR] [SPARK-47296][SQL][COLLATION] Fail unsupported functions for non-binary collations [spark]

2024-03-19 Thread via GitHub
cloud-fan commented on code in PR #45422: URL: https://github.com/apache/spark/pull/45422#discussion_r1530675927 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/AnsiTypeCoercion.scala: ## @@ -215,6 +220,10 @@ object AnsiTypeCoercion extends

Re: [PR] [SPARK-47296][SQL][COLLATION] Fail unsupported functions for non-binary collations [spark]

2024-03-19 Thread via GitHub
cloud-fan commented on code in PR #45422: URL: https://github.com/apache/spark/pull/45422#discussion_r1530674212 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/AnsiTypeCoercion.scala: ## @@ -205,6 +205,11 @@ object AnsiTypeCoercion extends

Re: [PR] [SPARK-47296][SQL][COLLATION] Fail unsupported functions for non-binary collations [spark]

2024-03-19 Thread via GitHub
cloud-fan commented on code in PR #45422: URL: https://github.com/apache/spark/pull/45422#discussion_r1530673443 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/AnsiTypeCoercion.scala: ## @@ -205,6 +205,11 @@ object AnsiTypeCoercion extends

Re: [PR] [SPARK-47296][SQL][COLLATION] Fail unsupported functions for non-binary collations [spark]

2024-03-19 Thread via GitHub
mihailom-db commented on code in PR #45422: URL: https://github.com/apache/spark/pull/45422#discussion_r1530084248 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala: ## @@ -994,8 +994,11 @@ object TypeCoercion extends TypeCoercionBase {

Re: [PR] [SPARK-47296][SQL][COLLATION] Fail unsupported functions for non-binary collations [spark]

2024-03-19 Thread via GitHub
mihailom-db commented on code in PR #45422: URL: https://github.com/apache/spark/pull/45422#discussion_r1529966896 ## sql/api/src/main/scala/org/apache/spark/sql/types/StringType.scala: ## @@ -40,6 +40,7 @@ class StringType private(val collationId: Int) extends AtomicType with

Re: [PR] [SPARK-47296][SQL][COLLATION] Fail unsupported functions for non-binary collations [spark]

2024-03-15 Thread via GitHub
cloud-fan commented on code in PR #45422: URL: https://github.com/apache/spark/pull/45422#discussion_r1526346919 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala: ## @@ -994,8 +994,11 @@ object TypeCoercion extends TypeCoercionBase {

Re: [PR] [SPARK-47296][SQL][COLLATION] Fail unsupported functions for non-binary collations [spark]

2024-03-15 Thread via GitHub
dbatomic commented on PR #45422: URL: https://github.com/apache/spark/pull/45422#issuecomment-1999648400 LGTM As a follow up we should revisit error messages. IMO it is weird to expose message with "string_any_collation" type to customer. But I think that we can do that as a follow

Re: [PR] [SPARK-47296][SQL][COLLATION] Fail unsupported functions for non-binary collations [spark]

2024-03-15 Thread via GitHub
dbatomic commented on code in PR #45422: URL: https://github.com/apache/spark/pull/45422#discussion_r1526193615 ## sql/api/src/main/scala/org/apache/spark/sql/types/StringType.scala: ## @@ -40,6 +40,7 @@ class StringType private(val collationId: Int) extends AtomicType with

Re: [PR] [SPARK-47296][SQL][COLLATION] Fail unsupported functions for non-binary collations [spark]

2024-03-15 Thread via GitHub
uros-db commented on code in PR #45422: URL: https://github.com/apache/spark/pull/45422#discussion_r1526085933 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala: ## @@ -994,8 +994,10 @@ object TypeCoercion extends TypeCoercionBase {

Re: [PR] [SPARK-47296][SQL][COLLATION] Fail unsupported functions for non-binary collations [spark]

2024-03-15 Thread via GitHub
mihailom-db commented on code in PR #45422: URL: https://github.com/apache/spark/pull/45422#discussion_r1526146584 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala: ## @@ -994,8 +994,10 @@ object TypeCoercion extends TypeCoercionBase {

Re: [PR] [SPARK-47296][SQL][COLLATION] Fail unsupported functions for non-binary collations [spark]

2024-03-15 Thread via GitHub
uros-db commented on code in PR #45422: URL: https://github.com/apache/spark/pull/45422#discussion_r1517236308 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/CollationUtils.scala: ## @@ -0,0 +1,86 @@ +/* + * Licensed to the Apache Software Foundation

Re: [PR] [SPARK-47296][SQL][COLLATION] Fail unsupported functions for non-binary collations [spark]

2024-03-15 Thread via GitHub
uros-db commented on code in PR #45422: URL: https://github.com/apache/spark/pull/45422#discussion_r1526092915 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/AnsiTypeCoercion.scala: ## @@ -205,6 +205,10 @@ object AnsiTypeCoercion extends TypeCoercionBase

Re: [PR] [SPARK-47296][SQL][COLLATION] Fail unsupported functions for non-binary collations [spark]

2024-03-15 Thread via GitHub
cloud-fan commented on code in PR #45422: URL: https://github.com/apache/spark/pull/45422#discussion_r1526084649 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/AnsiTypeCoercion.scala: ## @@ -205,6 +205,10 @@ object AnsiTypeCoercion extends

Re: [PR] [SPARK-47296][SQL][COLLATION] Fail unsupported functions for non-binary collations [spark]

2024-03-15 Thread via GitHub
cloud-fan commented on code in PR #45422: URL: https://github.com/apache/spark/pull/45422#discussion_r1526067004 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala: ## @@ -994,8 +994,10 @@ object TypeCoercion extends TypeCoercionBase {

Re: [PR] [SPARK-47296][SQL][COLLATION] Fail unsupported functions for non-binary collations [spark]

2024-03-15 Thread via GitHub
uros-db commented on code in PR #45422: URL: https://github.com/apache/spark/pull/45422#discussion_r1526059294 ## sql/api/src/main/scala/org/apache/spark/sql/types/StringType.scala: ## @@ -65,9 +64,41 @@ class StringType private(val collationId: Int) extends AtomicType with

Re: [PR] [SPARK-47296][SQL][COLLATION] Fail unsupported functions for non-binary collations [spark]

2024-03-15 Thread via GitHub
dbatomic commented on code in PR #45422: URL: https://github.com/apache/spark/pull/45422#discussion_r1526026385 ## sql/api/src/main/scala/org/apache/spark/sql/types/StringType.scala: ## @@ -65,9 +64,41 @@ class StringType private(val collationId: Int) extends AtomicType with

Re: [PR] [SPARK-47296][SQL][COLLATION] Fail unsupported functions for non-binary collations [spark]

2024-03-15 Thread via GitHub
cloud-fan commented on code in PR #45422: URL: https://github.com/apache/spark/pull/45422#discussion_r1526006045 ## sql/api/src/main/scala/org/apache/spark/sql/types/StringType.scala: ## @@ -65,9 +64,41 @@ class StringType private(val collationId: Int) extends AtomicType with

Re: [PR] [SPARK-47296][SQL][COLLATION] Fail unsupported functions for non-binary collations [spark]

2024-03-15 Thread via GitHub
dbatomic commented on code in PR #45422: URL: https://github.com/apache/spark/pull/45422#discussion_r1525992996 ## sql/api/src/main/scala/org/apache/spark/sql/types/StringType.scala: ## @@ -65,9 +64,41 @@ class StringType private(val collationId: Int) extends AtomicType with

Re: [PR] [SPARK-47296][SQL][COLLATION] Fail unsupported functions for non-binary collations [spark]

2024-03-15 Thread via GitHub
dbatomic commented on code in PR #45422: URL: https://github.com/apache/spark/pull/45422#discussion_r1525990277 ## common/unsafe/src/main/java/org/apache/spark/sql/catalyst/util/CollationFactory.java: ## @@ -69,6 +69,7 @@ public static class Collation { * byte for byte

Re: [PR] [SPARK-47296][SQL][COLLATION] Fail unsupported functions for non-binary collations [spark]

2024-03-15 Thread via GitHub
cloud-fan commented on code in PR #45422: URL: https://github.com/apache/spark/pull/45422#discussion_r1525952969 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala: ## @@ -956,9 +956,19 @@ object TypeCoercion extends TypeCoercionBase {

Re: [PR] [SPARK-47296][SQL][COLLATION] Fail unsupported functions for non-binary collations [spark]

2024-03-15 Thread via GitHub
cloud-fan commented on code in PR #45422: URL: https://github.com/apache/spark/pull/45422#discussion_r1525857828 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala: ## @@ -702,9 +702,13 @@ abstract class TypeCoercionBase {

Re: [PR] [SPARK-47296][SQL][COLLATION] Fail unsupported functions for non-binary collations [spark]

2024-03-15 Thread via GitHub
cloud-fan commented on code in PR #45422: URL: https://github.com/apache/spark/pull/45422#discussion_r1525855043 ## sql/api/src/main/scala/org/apache/spark/sql/types/StringType.scala: ## @@ -65,9 +64,43 @@ class StringType private(val collationId: Int) extends AtomicType with

Re: [PR] [SPARK-47296][SQL][COLLATION] Fail unsupported functions for non-binary collations [spark]

2024-03-13 Thread via GitHub
uros-db commented on PR #45422: URL: https://github.com/apache/spark/pull/45422#issuecomment-1994394143 @cloud-fan that makes a lot of sense, to combat this - now new case classes should handle this. essentially: - `StringType` no longer accepts all collationIds, but only the default

Re: [PR] [SPARK-47296][SQL][COLLATION] Fail unsupported functions for non-binary collations [spark]

2024-03-12 Thread via GitHub
cloud-fan commented on PR #45422: URL: https://github.com/apache/spark/pull/45422#issuecomment-1991702090 I don't think it's safe to only handle expressions in `regexpExpressions.scala`. For example, `Substring` is not there. I don't know how to collect all functions that take

Re: [PR] [SPARK-47296][SQL][COLLATION] Fail unsupported functions for non-binary collations [spark]

2024-03-12 Thread via GitHub
uros-db commented on PR #45422: URL: https://github.com/apache/spark/pull/45422#issuecomment-1991590743 @cloud-fan yes, that is a problem... should we settle only on `string functions` for now? I think these functions that are meant to work with Strings are more sensitive to this error

Re: [PR] [SPARK-47296][SQL][COLLATION] Fail unsupported functions for non-binary collations [spark]

2024-03-12 Thread via GitHub
cloud-fan commented on PR #45422: URL: https://github.com/apache/spark/pull/45422#issuecomment-1991557191 Without updating `StringType.acceptsType`, I'm not confident to find out all functions that expect StringType but do not support collation.
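
Editor's sketch of the concern in the comment above: if the type check a function relies on accepts strings of any collation, a function that only works on the default collation has no way to reject other inputs early. The toy model below is self-contained and is not Spark's code. Every name in it (`CollationSketch`, `ToyString`, `StrictString`, `AnyCollationString`, `checkInput`, and the assumption that id 0 is the default collation) is hypothetical and only mirrors the idea of an `acceptsType`-style check.

```scala
// Toy model only: none of these names are Spark APIs.
object CollationSketch {
  // Assumed id of the default (binary) collation in this toy model.
  val DefaultCollationId: Int = 0

  // Concrete types that a value can have.
  sealed trait ToyType
  final case class ToyString(collationId: Int) extends ToyType
  case object ToyInt extends ToyType

  // An "expected input type" decides which concrete types it accepts.
  trait ToyExpectedType {
    def acceptsType(other: ToyType): Boolean
  }

  // Strict: only strings with the default collation are accepted.
  case object StrictString extends ToyExpectedType {
    def acceptsType(other: ToyType): Boolean = other match {
      case ToyString(id) => id == DefaultCollationId
      case _             => false
    }
  }

  // Permissive: strings with any collation are accepted.
  case object AnyCollationString extends ToyExpectedType {
    def acceptsType(other: ToyType): Boolean = other match {
      case ToyString(_) => true
      case _            => false
    }
  }

  // A function declares what it expects; a mismatch is reported up front
  // instead of silently producing collation-unaware results.
  def checkInput(
      fn: String,
      expected: ToyExpectedType,
      actual: ToyType): Either[String, Unit] =
    if (expected.acceptsType(actual)) Right(())
    else Left(s"$fn does not support input type $actual")
}
```

With this shape, a function that has not been made collation-aware would declare `StrictString` and fail for `ToyString(1)`, while a function that has been updated would declare `AnyCollationString`.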

Re: [PR] [SPARK-47296][SQL][COLLATION] Fail unsupported functions for non-binary collations [spark]

2024-03-08 Thread via GitHub
MaxGekk commented on code in PR #45422: URL: https://github.com/apache/spark/pull/45422#discussion_r1518490602 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/CollationTypeConstraints.scala: ## @@ -0,0 +1,89 @@ +/* + * Licensed to the Apache Software

Re: [PR] [SPARK-47296][SQL][COLLATION] Fail unsupported functions for non-binary collations [spark]

2024-03-07 Thread via GitHub
uros-db commented on code in PR #45422: URL: https://github.com/apache/spark/pull/45422#discussion_r1517236308 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/CollationUtils.scala: ## @@ -0,0 +1,86 @@ +/* + * Licensed to the Apache Software Foundation

Re: [PR] [SPARK-47296][SQL][COLLATION] Fail unsupported functions for non-binary collations [spark]

2024-03-07 Thread via GitHub
HyukjinKwon commented on code in PR #45422: URL: https://github.com/apache/spark/pull/45422#discussion_r1517022572 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/CollationUtils.scala: ## @@ -0,0 +1,86 @@ +/* + * Licensed to the Apache Software

Re: [PR] [SPARK-47296][SQL][COLLATION] Fail unsupported functions for non-binary collations [spark]

2024-03-07 Thread via GitHub
dbatomic commented on code in PR #45422: URL: https://github.com/apache/spark/pull/45422#discussion_r1516510742 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/CollationUtils.scala: ## @@ -0,0 +1,86 @@ +/* + * Licensed to the Apache Software Foundation

[PR] [SPARK-47296][SQL][COLLATION] Fail unsupported functions for non-binary collations [spark]

2024-03-07 Thread via GitHub
uros-db opened a new pull request, #45422: URL: https://github.com/apache/spark/pull/45422 ### What changes were proposed in this pull request? ### Why are the changes needed? Currently, all `StringType` arguments passed to built-in string functions in Spark SQL get