cloud-fan closed pull request #45422: [SPARK-47296][SQL][COLLATION] Fail
unsupported functions for non-binary collations
URL: https://github.com/apache/spark/pull/45422
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
cloud-fan commented on PR #45422:
URL: https://github.com/apache/spark/pull/45422#issuecomment-2009773392
thanks, merging to master!
mihailom-db commented on PR #45422:
URL: https://github.com/apache/spark/pull/45422#issuecomment-2008104827
> LGTM
>
> As a follow-up, we should revisit the error messages. IMO it is weird to expose
a message with the "string_any_collation" type to customers. But I think that we can
do that as a
mihailom-db commented on code in PR #45422:
URL: https://github.com/apache/spark/pull/45422#discussion_r1531040306
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala:
##
@@ -994,8 +994,12 @@ object TypeCoercion extends TypeCoercionBase {
mihailom-db commented on code in PR #45422:
URL: https://github.com/apache/spark/pull/45422#discussion_r1530992551
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/AnsiTypeCoercion.scala:
##
@@ -215,6 +220,10 @@ object AnsiTypeCoercion extends TypeCoercionBa
cloud-fan commented on code in PR #45422:
URL: https://github.com/apache/spark/pull/45422#discussion_r1530688780
##
sql/core/src/test/scala/org/apache/spark/sql/CollationStringExpressionsSuite.scala:
##
@@ -0,0 +1,73 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) un
cloud-fan commented on code in PR #45422:
URL: https://github.com/apache/spark/pull/45422#discussion_r1530687285
##
sql/core/src/test/scala/org/apache/spark/sql/CollationRegexpExpressionsSuite.scala:
##
@@ -0,0 +1,438 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) u
cloud-fan commented on code in PR #45422:
URL: https://github.com/apache/spark/pull/45422#discussion_r1530684010
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala:
##
@@ -994,8 +994,12 @@ object TypeCoercion extends TypeCoercionBase {
cloud-fan commented on code in PR #45422:
URL: https://github.com/apache/spark/pull/45422#discussion_r1530682774
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/AnsiTypeCoercion.scala:
##
@@ -205,6 +205,11 @@ object AnsiTypeCoercion extends TypeCoercionBase
cloud-fan commented on code in PR #45422:
URL: https://github.com/apache/spark/pull/45422#discussion_r1530677086
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala:
##
@@ -994,8 +994,12 @@ object TypeCoercion extends TypeCoercionBase {
cloud-fan commented on code in PR #45422:
URL: https://github.com/apache/spark/pull/45422#discussion_r1530675927
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/AnsiTypeCoercion.scala:
##
@@ -215,6 +220,10 @@ object AnsiTypeCoercion extends TypeCoercionBase
cloud-fan commented on code in PR #45422:
URL: https://github.com/apache/spark/pull/45422#discussion_r1530674212
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/AnsiTypeCoercion.scala:
##
@@ -205,6 +205,11 @@ object AnsiTypeCoercion extends TypeCoercionBase
cloud-fan commented on code in PR #45422:
URL: https://github.com/apache/spark/pull/45422#discussion_r1530673443
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/AnsiTypeCoercion.scala:
##
@@ -205,6 +205,11 @@ object AnsiTypeCoercion extends TypeCoercionBase
mihailom-db commented on code in PR #45422:
URL: https://github.com/apache/spark/pull/45422#discussion_r1530084248
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala:
##
@@ -994,8 +994,11 @@ object TypeCoercion extends TypeCoercionBase {
mihailom-db commented on code in PR #45422:
URL: https://github.com/apache/spark/pull/45422#discussion_r1529966896
##
sql/api/src/main/scala/org/apache/spark/sql/types/StringType.scala:
##
@@ -40,6 +40,7 @@ class StringType private(val collationId: Int) extends
AtomicType with
cloud-fan commented on code in PR #45422:
URL: https://github.com/apache/spark/pull/45422#discussion_r1526346919
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala:
##
@@ -994,8 +994,11 @@ object TypeCoercion extends TypeCoercionBase {
dbatomic commented on PR #45422:
URL: https://github.com/apache/spark/pull/45422#issuecomment-1999648400
LGTM
As a follow-up, we should revisit the error messages. IMO it is weird to expose
a message with the "string_any_collation" type to customers. But I think that we can
do that as a follow u
dbatomic commented on code in PR #45422:
URL: https://github.com/apache/spark/pull/45422#discussion_r1526193615
##
sql/api/src/main/scala/org/apache/spark/sql/types/StringType.scala:
##
@@ -40,6 +40,7 @@ class StringType private(val collationId: Int) extends
AtomicType with Ser
uros-db commented on code in PR #45422:
URL: https://github.com/apache/spark/pull/45422#discussion_r1526085933
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala:
##
@@ -994,8 +994,10 @@ object TypeCoercion extends TypeCoercionBase {
mihailom-db commented on code in PR #45422:
URL: https://github.com/apache/spark/pull/45422#discussion_r1526146584
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala:
##
@@ -994,8 +994,10 @@ object TypeCoercion extends TypeCoercionBase {
uros-db commented on code in PR #45422:
URL: https://github.com/apache/spark/pull/45422#discussion_r1517236308
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/CollationUtils.scala:
##
@@ -0,0 +1,86 @@
+/*
+ * Licensed to the Apache Software Foundation (A
uros-db commented on code in PR #45422:
URL: https://github.com/apache/spark/pull/45422#discussion_r1526092915
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/AnsiTypeCoercion.scala:
##
@@ -205,6 +205,10 @@ object AnsiTypeCoercion extends TypeCoercionBase {
cloud-fan commented on code in PR #45422:
URL: https://github.com/apache/spark/pull/45422#discussion_r1526084649
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/AnsiTypeCoercion.scala:
##
@@ -205,6 +205,10 @@ object AnsiTypeCoercion extends TypeCoercionBase
cloud-fan commented on code in PR #45422:
URL: https://github.com/apache/spark/pull/45422#discussion_r1526067004
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala:
##
@@ -994,8 +994,10 @@ object TypeCoercion extends TypeCoercionBase {
uros-db commented on code in PR #45422:
URL: https://github.com/apache/spark/pull/45422#discussion_r1526059294
##
sql/api/src/main/scala/org/apache/spark/sql/types/StringType.scala:
##
@@ -65,9 +64,41 @@ class StringType private(val collationId: Int) extends
AtomicType with Ser
dbatomic commented on code in PR #45422:
URL: https://github.com/apache/spark/pull/45422#discussion_r1526026385
##
sql/api/src/main/scala/org/apache/spark/sql/types/StringType.scala:
##
@@ -65,9 +64,41 @@ class StringType private(val collationId: Int) extends
AtomicType with Se
cloud-fan commented on code in PR #45422:
URL: https://github.com/apache/spark/pull/45422#discussion_r1526006045
##
sql/api/src/main/scala/org/apache/spark/sql/types/StringType.scala:
##
@@ -65,9 +64,41 @@ class StringType private(val collationId: Int) extends
AtomicType with S
dbatomic commented on code in PR #45422:
URL: https://github.com/apache/spark/pull/45422#discussion_r1525992996
##
sql/api/src/main/scala/org/apache/spark/sql/types/StringType.scala:
##
@@ -65,9 +64,41 @@ class StringType private(val collationId: Int) extends
AtomicType with Se
dbatomic commented on code in PR #45422:
URL: https://github.com/apache/spark/pull/45422#discussion_r1525990277
##
common/unsafe/src/main/java/org/apache/spark/sql/catalyst/util/CollationFactory.java:
##
@@ -69,6 +69,7 @@ public static class Collation {
* byte for byte equ
cloud-fan commented on code in PR #45422:
URL: https://github.com/apache/spark/pull/45422#discussion_r1525952969
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala:
##
@@ -956,9 +956,19 @@ object TypeCoercion extends TypeCoercionBase {
cloud-fan commented on code in PR #45422:
URL: https://github.com/apache/spark/pull/45422#discussion_r1525857828
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala:
##
@@ -702,9 +702,13 @@ abstract class TypeCoercionBase {
}.getOrEl
cloud-fan commented on code in PR #45422:
URL: https://github.com/apache/spark/pull/45422#discussion_r1525855043
##
sql/api/src/main/scala/org/apache/spark/sql/types/StringType.scala:
##
@@ -65,9 +64,43 @@ class StringType private(val collationId: Int) extends
AtomicType with S
uros-db commented on PR #45422:
URL: https://github.com/apache/spark/pull/45422#issuecomment-1994394143
@cloud-fan that makes a lot of sense. To address this, new case classes
should now handle it. Essentially:
- `StringType` no longer accepts all collationIds, but only the default
col
cloud-fan commented on PR #45422:
URL: https://github.com/apache/spark/pull/45422#issuecomment-1991702090
I don't think it's safe to only handle expressions in
`regexpExpressions.scala`. For example, `Substring` is not there. I don't know
how to collect all functions that take `StringType`,
uros-db commented on PR #45422:
URL: https://github.com/apache/spark/pull/45422#issuecomment-1991590743
@cloud-fan yes, that is a problem... should we settle only on `string
functions` for now? I think these functions that are meant to work with Strings
are more sensitive to this error
cloud-fan commented on PR #45422:
URL: https://github.com/apache/spark/pull/45422#issuecomment-1991557191
Without updating `StringType.acceptsType`, I'm not confident we can find all
functions that expect StringType but do not support collation.
MaxGekk commented on code in PR #45422:
URL: https://github.com/apache/spark/pull/45422#discussion_r1518490602
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/CollationTypeConstraints.scala:
##
@@ -0,0 +1,89 @@
+/*
+ * Licensed to the Apache Software Fou
HyukjinKwon commented on code in PR #45422:
URL: https://github.com/apache/spark/pull/45422#discussion_r1517022572
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/CollationUtils.scala:
##
@@ -0,0 +1,86 @@
+/*
+ * Licensed to the Apache Software Foundatio
dbatomic commented on code in PR #45422:
URL: https://github.com/apache/spark/pull/45422#discussion_r1516510742
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/CollationUtils.scala:
##
@@ -0,0 +1,86 @@
+/*
+ * Licensed to the Apache Software Foundation (
uros-db opened a new pull request, #45422:
URL: https://github.com/apache/spark/pull/45422
### What changes were proposed in this pull request?
### Why are the changes needed?
Currently, all `StringType` arguments passed to built-in string functions in
Spark SQL get treated