Re: [PR] [SPARK-47307][SQL] Add a config to optionally chunk base64 strings [spark]

2024-03-11 Thread via GitHub
ted-jenks commented on PR #45408: URL: https://github.com/apache/spark/pull/45408#issuecomment-1988639946 I am having trouble getting the failing test to pass: `13:27:04.051 ERROR org.apache.spark.sql.hive.thriftserver.ThriftServerQueryTestSuite` Giving:

Re: [PR] [SPARK-47323][K8S] Support custom executor log urls [spark]

2024-03-11 Thread via GitHub
pan3793 commented on code in PR #45464: URL: https://github.com/apache/spark/pull/45464#discussion_r1519858538 ## resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/KubernetesExecutorBackend.scala: ## @@ -28,6 +28,46 @@ import

Re: [PR] [WIP][SPARK 46840] Add sql.execution.benchmark.CollationBenchmark.scala Scaffolding [spark]

2024-03-11 Thread via GitHub
MaxGekk commented on PR #45453: URL: https://github.com/apache/spark/pull/45453#issuecomment-1988594133 @dbatomic @stefankandic Could you review this PR, please? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

Re: [PR] [SPARK-47323][K8S] Support custom executor log urls [spark]

2024-03-11 Thread via GitHub
mridulm commented on PR #45464: URL: https://github.com/apache/spark/pull/45464#issuecomment-1988591514 +CC @thejdeep

Re: [PR] [SPARK-47255][SQL] Assign names to the error classes _LEGACY_ERROR_TEMP_323[6-7] and _LEGACY_ERROR_TEMP_324[7-9] [spark]

2024-03-11 Thread via GitHub
MaxGekk closed pull request #45423: [SPARK-47255][SQL] Assign names to the error classes _LEGACY_ERROR_TEMP_323[6-7] and _LEGACY_ERROR_TEMP_324[7-9] URL: https://github.com/apache/spark/pull/45423

Re: [PR] [SPARK-47255][SQL] Assign names to the error classes _LEGACY_ERROR_TEMP_323[6-7] and _LEGACY_ERROR_TEMP_324[7-9] [spark]

2024-03-11 Thread via GitHub
MaxGekk commented on PR #45423: URL: https://github.com/apache/spark/pull/45423#issuecomment-1988584943 +1, LGTM. Merging to master. Thank you, @miland-db and @HyukjinKwon for review.

Re: [PR] [SPARK-47255][SQL] Assign names to the error classes _LEGACY_ERROR_TEMP_323[6-7] and _LEGACY_ERROR_TEMP_324[7-9] [spark]

2024-03-11 Thread via GitHub
MaxGekk commented on PR #45423: URL: https://github.com/apache/spark/pull/45423#issuecomment-1988588006 @miland-db Congratulations on your first contribution to Apache Spark!

Re: [PR] [SPARK-47313][SQL] Added scala.MatchError handling inside QueryExecution.toInternalError [spark]

2024-03-11 Thread via GitHub
MaxGekk commented on PR #45438: URL: https://github.com/apache/spark/pull/45438#issuecomment-1988571851 @stevomitric Congratulations on your first contribution to Apache Spark!

[PR] [SPARK-47323][K8S] Support custom executor log urls [spark]

2024-03-11 Thread via GitHub
EnricoMi opened a new pull request, #45464: URL: https://github.com/apache/spark/pull/45464 ### What changes were proposed in this pull request? Make Kubernetes resource manager support existing config `spark.ui.custom.executor.log.url`. Allow for

Re: [PR] [SPARK-47313][SQL] Added scala.MatchError handling inside QueryExecution.toInternalError [spark]

2024-03-11 Thread via GitHub
MaxGekk closed pull request #45438: [SPARK-47313][SQL] Added scala.MatchError handling inside QueryExecution.toInternalError URL: https://github.com/apache/spark/pull/45438

Re: [PR] [SPARK-47313][SQL] Added scala.MatchError handling inside QueryExecution.toInternalError [spark]

2024-03-11 Thread via GitHub
MaxGekk commented on PR #45438: URL: https://github.com/apache/spark/pull/45438#issuecomment-1988551358 +1, LGTM. Merging to master. Thank you, @stevomitric and @cloud-fan for review.

Re: [PR] [SPARK-47343][SQL] Fix NPE when `sqlString` variable value is null string in execute immediate [spark]

2024-03-11 Thread via GitHub
milastdbx commented on code in PR #45462: URL: https://github.com/apache/spark/pull/45462#discussion_r1519779866 ## common/utils/src/main/resources/error/error-classes.json: ## @@ -3004,6 +3004,12 @@ ], "sqlState" : "2200E" }, +

[PR] [SPARK-45827] Fix for collation [spark]

2024-03-11 Thread via GitHub
cashmand opened a new pull request, #45463: URL: https://github.com/apache/spark/pull/45463 ### What changes were proposed in this pull request? https://github.com/apache/spark/pull/45409 created a default allow-list of types for data sources. The intent was to only prevent

Re: [PR] [SPARK-47210][SQL][COLLATION] Implicit casting on collated expressions [spark]

2024-03-11 Thread via GitHub
mihailom-db commented on code in PR #45383: URL: https://github.com/apache/spark/pull/45383#discussion_r1519766425 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/AnsiTypeCoercion.scala: ## @@ -173,6 +185,8 @@ object AnsiTypeCoercion extends

Re: [PR] [SPARK-47210][SQL][COLLATION] Implicit casting on collated expressions [spark]

2024-03-11 Thread via GitHub
mihailom-db commented on code in PR #45383: URL: https://github.com/apache/spark/pull/45383#discussion_r1519762760 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/AnsiTypeCoercion.scala: ## @@ -138,21 +140,31 @@ object AnsiTypeCoercion extends

[PR] [MINOR] Minor English fixes [spark]

2024-03-11 Thread via GitHub
nchammas opened a new pull request, #45461: URL: https://github.com/apache/spark/pull/45461 ### What changes were proposed in this pull request? Minor English grammar and wording fixes. ### Why are the changes needed? They're not strictly needed, but give the project a

Re: [PR] [SPARK-47210][SQL][COLLATION] Implicit casting on collated expressions [spark]

2024-03-11 Thread via GitHub
mihailom-db commented on code in PR #45383: URL: https://github.com/apache/spark/pull/45383#discussion_r1519754938 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala: ## @@ -764,6 +773,94 @@ abstract class TypeCoercionBase { } } +

Re: [PR] [SPARK-47210][SQL][COLLATION] Implicit casting on collated expressions [spark]

2024-03-11 Thread via GitHub
mihailom-db commented on code in PR #45383: URL: https://github.com/apache/spark/pull/45383#discussion_r1519752158 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala: ## @@ -764,6 +773,94 @@ abstract class TypeCoercionBase { } } +

[PR] [WIP][SPARK-47341][Connect] Replace commands with relations in a few tests in SparkConnectClientSuite [spark]

2024-03-11 Thread via GitHub
xi-db opened a new pull request, #45460: URL: https://github.com/apache/spark/pull/45460 ### What changes were proposed in this pull request? A few

Re: [PR] [SPARK-47210][SQL][COLLATION] Implicit casting on collated expressions [spark]

2024-03-11 Thread via GitHub
dbatomic commented on code in PR #45383: URL: https://github.com/apache/spark/pull/45383#discussion_r1519730371 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/AnsiTypeCoercion.scala: ## @@ -138,21 +140,31 @@ object AnsiTypeCoercion extends

Re: [PR] [SPARK-47210][SQL][COLLATION] Implicit casting on collated expressions [spark]

2024-03-11 Thread via GitHub
mihailom-db commented on code in PR #45383: URL: https://github.com/apache/spark/pull/45383#discussion_r1519714112 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/AnsiTypeCoercion.scala: ## @@ -138,21 +140,31 @@ object AnsiTypeCoercion extends

Re: [PR] [SPARK-47307][SQL] Add a config to optionally chunk base64 strings [spark]

2024-03-11 Thread via GitHub
yaooqinn commented on PR #45408: URL: https://github.com/apache/spark/pull/45408#issuecomment-1988397848 thank you for the explanation @ted-jenks

Re: [PR] [SPARK-47307][SQL] Add a config to optionally chunk base64 strings [spark]

2024-03-11 Thread via GitHub
ted-jenks commented on PR #45408: URL: https://github.com/apache/spark/pull/45408#issuecomment-1988371956 > Do we need to revise unbase64 accordingly? Unbase64 uses the Mime decoder, which can tolerate chunked and unchunked data.
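
The point about the MIME decoder can be illustrated with the JDK's `java.util.Base64` alone. The sketch below is not Spark's implementation; the class and method names are hypothetical, and it only assumes the comment refers to the decoder returned by `Base64.getMimeDecoder()`. It encodes the same bytes with the chunking MIME encoder and with the single-line basic encoder, then decodes both with the MIME decoder.

```java
import java.util.Arrays;
import java.util.Base64;

// Hypothetical demo class; not part of Spark.
final class Base64ChunkingDemo {

    // MIME encoder: emits "\r\n" after every 76 output characters.
    public static String encodeChunked(byte[] data) {
        return Base64.getMimeEncoder().encodeToString(data);
    }

    // Basic encoder: one line, no separators.
    public static String encodeUnchunked(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // MIME decoder: skips line separators and any other characters
    // outside the Base64 alphabet, so it accepts both forms.
    public static byte[] decodeLenient(String text) {
        return Base64.getMimeDecoder().decode(text);
    }

    public static void main(String[] args) {
        // 100 bytes encode to 136 Base64 characters, enough to force a line break.
        byte[] data = new byte[100];
        Arrays.fill(data, (byte) 'a');

        String chunked = encodeChunked(data);
        String unchunked = encodeUnchunked(data);

        System.out.println("chunked has CRLF:   " + chunked.contains("\r\n"));
        System.out.println("unchunked has CRLF: " + unchunked.contains("\r\n"));
        System.out.println("chunked decodes:    " + Arrays.equals(data, decodeLenient(chunked)));
        System.out.println("unchunked decodes:  " + Arrays.equals(data, decodeLenient(unchunked)));
    }
}
```

By contrast, the basic decoder from `Base64.getDecoder()` rejects the chunked form with an `IllegalArgumentException`, which is why a lenient decoder matters if the encoder's chunking becomes configurable.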

Re: [PR] [SPARK-47328][SQL] Rename UCS_BASIC collation to UTF8_BINARY [spark]

2024-03-11 Thread via GitHub
MaxGekk commented on PR #45442: URL: https://github.com/apache/spark/pull/45442#issuecomment-1988350585 +1, LGTM. Merging to master. Thank you, @stefankandic and @srielau for review.

Re: [PR] [SPARK-47328][SQL] Rename UCS_BASIC collation to UTF8_BINARY [spark]

2024-03-11 Thread via GitHub
MaxGekk closed pull request #45442: [SPARK-47328][SQL] Rename UCS_BASIC collation to UTF8_BINARY URL: https://github.com/apache/spark/pull/45442

Re: [PR] [SPARK-47337][SQL][DOCKER] Upgrade DB2 docker image version to 11.5.8.0 [spark]

2024-03-11 Thread via GitHub
yaooqinn commented on PR #45456: URL: https://github.com/apache/spark/pull/45456#issuecomment-1988294658 Thank you @LuciferYang, merged to master

Re: [PR] [SPARK-47337][SQL][DOCKER] Upgrade DB2 docker image version to 11.5.8.0 [spark]

2024-03-11 Thread via GitHub
yaooqinn closed pull request #45456: [SPARK-47337][SQL][DOCKER] Upgrade DB2 docker image version to 11.5.8.0 URL: https://github.com/apache/spark/pull/45456

Re: [PR] [SPARK-47307][SQL] Add a config to optionally chunk base64 strings [spark]

2024-03-11 Thread via GitHub
yaooqinn commented on PR #45408: URL: https://github.com/apache/spark/pull/45408#issuecomment-1988293451 Do we need to revise unbase64 accordingly?

Re: [PR] [SPARK-47210][SQL][COLLATION] Implicit casting on collated expressions [spark]

2024-03-11 Thread via GitHub
dbatomic commented on code in PR #45383: URL: https://github.com/apache/spark/pull/45383#discussion_r1519598303 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala: ## @@ -764,6 +773,94 @@ abstract class TypeCoercionBase { } } +

Re: [PR] [SPARK-47210][SQL][COLLATION] Implicit casting on collated expressions [spark]

2024-03-11 Thread via GitHub
dbatomic commented on code in PR #45383: URL: https://github.com/apache/spark/pull/45383#discussion_r1519594237 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala: ## @@ -764,6 +773,94 @@ abstract class TypeCoercionBase { } } +

Re: [PR] [SPARK-47210][SQL][COLLATION] Implicit casting on collated expressions [spark]

2024-03-11 Thread via GitHub
dbatomic commented on code in PR #45383: URL: https://github.com/apache/spark/pull/45383#discussion_r1519589124 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala: ## @@ -764,6 +773,94 @@ abstract class TypeCoercionBase { } } +

Re: [PR] [SPARK-47210][SQL][COLLATION] Implicit casting on collated expressions [spark]

2024-03-11 Thread via GitHub
dbatomic commented on code in PR #45383: URL: https://github.com/apache/spark/pull/45383#discussion_r1519587302 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala: ## @@ -764,6 +773,94 @@ abstract class TypeCoercionBase { } } +

Re: [PR] [SPARK-47210][SQL][COLLATION] Implicit casting on collated expressions [spark]

2024-03-11 Thread via GitHub
dbatomic commented on code in PR #45383: URL: https://github.com/apache/spark/pull/45383#discussion_r1519581505 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala: ## @@ -764,6 +773,94 @@ abstract class TypeCoercionBase { } } +

Re: [PR] [SPARK-47148][SQL] Avoid to materialize AQE ExchangeQueryStageExec on the cancellation [spark]

2024-03-11 Thread via GitHub
cloud-fan commented on code in PR #45234: URL: https://github.com/apache/spark/pull/45234#discussion_r1519582719 ## sql/core/src/test/scala/org/apache/spark/sql/execution/adaptive/AdaptiveQueryExecSuite.scala: ## @@ -897,6 +897,138 @@ class AdaptiveQueryExecSuite } }

Re: [PR] [SPARK-47210][SQL][COLLATION] Implicit casting on collated expressions [spark]

2024-03-11 Thread via GitHub
dbatomic commented on code in PR #45383: URL: https://github.com/apache/spark/pull/45383#discussion_r1519579806 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala: ## @@ -764,6 +773,94 @@ abstract class TypeCoercionBase { } } +

Re: [PR] [SPARK-47148][SQL] Avoid to materialize AQE ExchangeQueryStageExec on the cancellation [spark]

2024-03-11 Thread via GitHub
cloud-fan commented on code in PR #45234: URL: https://github.com/apache/spark/pull/45234#discussion_r1519578742 ## sql/core/src/test/scala/org/apache/spark/sql/execution/adaptive/AdaptiveQueryExecSuite.scala: ## @@ -897,6 +897,138 @@ class AdaptiveQueryExecSuite } }

Re: [PR] [SPARK-47210][SQL][COLLATION] Implicit casting on collated expressions [spark]

2024-03-11 Thread via GitHub
dbatomic commented on code in PR #45383: URL: https://github.com/apache/spark/pull/45383#discussion_r1519576239 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/AnsiTypeCoercion.scala: ## @@ -173,6 +185,8 @@ object AnsiTypeCoercion extends TypeCoercionBase

Re: [PR] [SPARK-47210][SQL][COLLATION] Implicit casting on collated expressions [spark]

2024-03-11 Thread via GitHub
dbatomic commented on code in PR #45383: URL: https://github.com/apache/spark/pull/45383#discussion_r1519574637 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/AnsiTypeCoercion.scala: ## @@ -138,21 +140,31 @@ object AnsiTypeCoercion extends

Re: [PR] [SPARK-47270][SQL] Dataset.isEmpty projects CommandResults locally [spark]

2024-03-11 Thread via GitHub
cloud-fan closed pull request #45373: [SPARK-47270][SQL] Dataset.isEmpty projects CommandResults locally URL: https://github.com/apache/spark/pull/45373

Re: [PR] [SPARK-47270][SQL] Dataset.isEmpty projects CommandResults locally [spark]

2024-03-11 Thread via GitHub
cloud-fan commented on PR #45373: URL: https://github.com/apache/spark/pull/45373#issuecomment-1988232237 thanks, merging to master!

Re: [PR] [SPARK-47210][SQL][COLLATION] Implicit casting on collated expressions [spark]

2024-03-11 Thread via GitHub
dbatomic commented on code in PR #45383: URL: https://github.com/apache/spark/pull/45383#discussion_r1519567374 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/AnsiTypeCoercion.scala: ## @@ -138,21 +140,31 @@ object AnsiTypeCoercion extends

Re: [PR] [SPARK-47210][SQL][COLLATION] Implicit casting on collated expressions [spark]

2024-03-11 Thread via GitHub
dbatomic commented on code in PR #45383: URL: https://github.com/apache/spark/pull/45383#discussion_r1519561653 ## sql/api/src/main/scala/org/apache/spark/sql/types/StringType.scala: ## @@ -41,14 +41,22 @@ class StringType private(val collationId: Int) extends AtomicType with

Re: [PR] [SPARK-47295][SQL][COLLATION] Added ICU StringSearch for 'startsWith' and 'endsWith' functions [spark]

2024-03-11 Thread via GitHub
cloud-fan commented on code in PR #45421: URL: https://github.com/apache/spark/pull/45421#discussion_r1519546855 ## common/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java: ## @@ -396,7 +389,18 @@ public boolean startsWith(final UTF8String prefix, int

Re: [PR] [MINOR][DOCS][SQL] Fix doc comment for coalescePartitions.parallelismFirst on sql-performance-tuning page [spark]

2024-03-11 Thread via GitHub
yaooqinn commented on PR #45458: URL: https://github.com/apache/spark/pull/45458#issuecomment-1988054085 Thank you, merged to master

Re: [PR] [MINOR][DOCS][SQL] Fix doc comment for coalescePartitions.parallelismFirst on sql-performance-tuning page [spark]

2024-03-11 Thread via GitHub
yaooqinn closed pull request #45458: [MINOR][DOCS][SQL] Fix doc comment for coalescePartitions.parallelismFirst on sql-performance-tuning page URL: https://github.com/apache/spark/pull/45458

Re: [PR] [SPARK-47295][SQL][COLLATION] Added ICU StringSearch for 'startsWith' and 'endsWith' functions [spark]

2024-03-11 Thread via GitHub
dbatomic commented on code in PR #45421: URL: https://github.com/apache/spark/pull/45421#discussion_r1519451854 ## common/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java: ## @@ -396,7 +389,18 @@ public boolean startsWith(final UTF8String prefix, int

Re: [PR] [SPARK-47327][SQL] Fix thread safety issue in ICU Collator [spark]

2024-03-11 Thread via GitHub
dbatomic commented on code in PR #45436: URL: https://github.com/apache/spark/pull/45436#discussion_r1519437339 ## common/unsafe/src/main/java/org/apache/spark/sql/catalyst/util/CollationFactory.java: ## @@ -138,11 +138,13 @@ public Collation( collationTable[2] = new

Re: [PR] [SPARK-45245][CONNECT][TESTS][FOLLOW-UP] Remove unneeded Matchers trait in the test [spark]

2024-03-11 Thread via GitHub
cloud-fan commented on code in PR #45459: URL: https://github.com/apache/spark/pull/45459#discussion_r1519406352 ## core/src/test/scala/org/apache/spark/api/python/PythonWorkerFactorySuite.scala: ## @@ -33,7 +33,7 @@ import org.apache.spark.SparkFunSuite import

Re: [PR] [SPARK-45245][PYTHON][CONNECT] PythonWorkerFactory: Timeout if worker does not connect back. [spark]

2024-03-11 Thread via GitHub
HyukjinKwon commented on code in PR #43023: URL: https://github.com/apache/spark/pull/43023#discussion_r1519396300 ## core/src/test/scala/org/apache/spark/api/python/PythonWorkerFactorySuite.scala: ## @@ -0,0 +1,61 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[PR] [SPARK-45245][CONNECT][TESTS][FOLLOW-UP] Remove unneeded Matchers trait in the test [spark]

2024-03-11 Thread via GitHub
HyukjinKwon opened a new pull request, #45459: URL: https://github.com/apache/spark/pull/45459 ### What changes were proposed in this pull request? This PR is a followup of https://github.com/apache/spark/pull/43023 that addresses a post-review comment. ### Why are the changes

Re: [PR] [SPARK-47327][SQL] Fix thread safety issue in ICU Collator [spark]

2024-03-11 Thread via GitHub
cloud-fan commented on code in PR #45436: URL: https://github.com/apache/spark/pull/45436#discussion_r1519395144 ## common/unsafe/src/main/java/org/apache/spark/sql/catalyst/util/CollationFactory.java: ## @@ -138,11 +138,13 @@ public Collation( collationTable[2] = new

Re: [PR] [SPARK-47295][SQL][COLLATION] Added ICU StringSearch for 'startsWith' and 'endsWith' functions [spark]

2024-03-11 Thread via GitHub
uros-db commented on code in PR #45421: URL: https://github.com/apache/spark/pull/45421#discussion_r1519387286 ## common/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java: ## @@ -378,13 +378,6 @@ public boolean matchAt(final UTF8String s, int pos) { return

Re: [PR] [SPARK-46043][SQL][FOLLOWUP] do not resolve v2 table provider with custom session catalog [spark]

2024-03-11 Thread via GitHub
cloud-fan commented on code in PR #45440: URL: https://github.com/apache/spark/pull/45440#discussion_r1517821667 ## sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/CatalogV2Util.scala: ## @@ -379,10 +379,6 @@ private[sql] object CatalogV2Util { } }

Re: [PR] [SPARK-47295][SQL][COLLATION] Added ICU StringSearch for 'startsWith' and 'endsWith' functions [spark]

2024-03-11 Thread via GitHub
uros-db commented on code in PR #45421: URL: https://github.com/apache/spark/pull/45421#discussion_r1519379776 ## common/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java: ## @@ -396,7 +389,18 @@ public boolean startsWith(final UTF8String prefix, int

Re: [PR] [SPARK-47327][SQL] Fix thread safety issue in ICU Collator [spark]

2024-03-11 Thread via GitHub
stefankandic commented on code in PR #45436: URL: https://github.com/apache/spark/pull/45436#discussion_r1519361380 ## common/unsafe/src/main/java/org/apache/spark/sql/catalyst/util/CollationFactory.java: ## @@ -138,11 +138,13 @@ public Collation( collationTable[2] = new

Re: [PR] [SPARK-47295][SQL][COLLATION] Added ICU StringSearch for 'startsWith' and 'endsWith' functions [spark]

2024-03-11 Thread via GitHub
MaxGekk commented on code in PR #45421: URL: https://github.com/apache/spark/pull/45421#discussion_r1519361693 ## common/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java: ## @@ -378,13 +378,6 @@ public boolean matchAt(final UTF8String s, int pos) { return

Re: [PR] [SPARK-47327][SQL] Fix thread safety issue in ICU Collator [spark]

2024-03-11 Thread via GitHub
stefankandic commented on code in PR #45436: URL: https://github.com/apache/spark/pull/45436#discussion_r1519362819 ## sql/core/src/test/scala/org/apache/spark/sql/CollationSuite.scala: ## @@ -438,6 +439,19 @@ class CollationSuite extends DatasourceV2SQLBase with

Re: [PR] [WIP][SPARK-47169][SQL] Disable bucketing on collated columns [spark]

2024-03-11 Thread via GitHub
mihailom-db commented on code in PR #45260: URL: https://github.com/apache/spark/pull/45260#discussion_r1519359745 ## common/utils/src/main/resources/error/error-classes.json: ## @@ -1752,6 +1752,12 @@ }, "sqlState" : "22003" }, + "INVALID_BUCKET_COLUMN_DATA_TYPE"

Re: [PR] [SPARK-46962][SS][PYTHON] Add interface for python streaming data source API and implement python worker to run python streaming data source [spark]

2024-03-11 Thread via GitHub
HeartSaVioR commented on code in PR #45023: URL: https://github.com/apache/spark/pull/45023#discussion_r1519254087 ## python/pyspark/sql/datasource.py: ## @@ -298,6 +320,133 @@ def read(self, partition: InputPartition) -> Iterator[Union[Tuple, Row]]: ... +class

Re: [PR] [SPARK-47327][SQL] Fix thread safety issue in ICU Collator [spark]

2024-03-11 Thread via GitHub
dbatomic commented on code in PR #45436: URL: https://github.com/apache/spark/pull/45436#discussion_r1519347155 ## sql/core/src/test/scala/org/apache/spark/sql/CollationSuite.scala: ## @@ -438,6 +439,19 @@ class CollationSuite extends DatasourceV2SQLBase with

Re: [PR] [SPARK-47210][SQL][COLLATION] Implicit casting on collated expressions [spark]

2024-03-11 Thread via GitHub
stefankandic commented on code in PR #45383: URL: https://github.com/apache/spark/pull/45383#discussion_r1519324890 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala: ## @@ -764,6 +773,94 @@ abstract class TypeCoercionBase { } }

Re: [PR] [SPARK-47210][SQL][COLLATION] Implicit casting on collated expressions [spark]

2024-03-11 Thread via GitHub
stefankandic commented on code in PR #45383: URL: https://github.com/apache/spark/pull/45383#discussion_r1519323364 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala: ## @@ -764,6 +773,94 @@ abstract class TypeCoercionBase { } }

Re: [PR] [SPARK-45245][PYTHON][CONNECT] PythonWorkerFactory: Timeout if worker does not connect back. [spark]

2024-03-11 Thread via GitHub
cloud-fan commented on code in PR #43023: URL: https://github.com/apache/spark/pull/43023#discussion_r1519318552 ## core/src/test/scala/org/apache/spark/api/python/PythonWorkerFactorySuite.scala: ## @@ -0,0 +1,61 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

Re: [PR] [MINOR][DOCS][SQL] Fix doc comment for coalescePartitions.parallelismFirst [spark]

2024-03-11 Thread via GitHub
eejbyfeldt commented on PR #45437: URL: https://github.com/apache/spark/pull/45437#issuecomment-1987872112 Realised that this doc comment is actually in 2 places: https://github.com/apache/spark/pull/45458

[PR] [MINOR][DOCS][SQL] Followup to fix doc comment for coalescePartitions.parallelismFirst [spark]

2024-03-11 Thread via GitHub
eejbyfeldt opened a new pull request, #45458: URL: https://github.com/apache/spark/pull/45458 We missed that this doc is duplicated while fixing it in https://github.com/apache/spark/pull/45437 ### What changes were proposed in this pull request? Documentation fix.

Re: [PR] [SPARK-45827][SQL] Add variant singleton type for Java [spark]

2024-03-11 Thread via GitHub
MaxGekk closed pull request #45455: [SPARK-45827][SQL] Add variant singleton type for Java URL: https://github.com/apache/spark/pull/45455

Re: [PR] [SPARK-45827][SQL] Add variant singleton type for Java [spark]

2024-03-11 Thread via GitHub
MaxGekk commented on PR #45455: URL: https://github.com/apache/spark/pull/45455#issuecomment-1987868783 The test failure is not related to the changes: [Run / Run Docker integration tests](https://github.com/richardc-db/spark/actions/runs/8228264424/job/22497420892#logs) +1, LGTM.

Re: [PR] [SPARK-47295][SQL][COLLATION] Added ICU StringSearch for 'startsWith' and 'endsWith' functions [spark]

2024-03-11 Thread via GitHub
uros-db commented on code in PR #45421: URL: https://github.com/apache/spark/pull/45421#discussion_r1519302246 ## common/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java: ## @@ -396,7 +389,18 @@ public boolean startsWith(final UTF8String prefix, int

Re: [PR] [SPARK-47295][SQL][COLLATION] Added ICU StringSearch for 'startsWith' and 'endsWith' functions [spark]

2024-03-11 Thread via GitHub
uros-db commented on code in PR #45421: URL: https://github.com/apache/spark/pull/45421#discussion_r1519302618 ## common/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java: ## @@ -410,7 +414,20 @@ public boolean endsWith(final UTF8String suffix, int

Re: [PR] [WIP][SPARK-47194][BUILD] Upgrade log4j to 2.23.1 [spark]

2024-03-11 Thread via GitHub
panbingkun commented on PR #45326: URL: https://github.com/apache/spark/pull/45326#issuecomment-1987848933 @LuciferYang @dongjoon-hyun The version `2.23.1` of log4j2 has been released, which has resolved the issue with the `StatusLogger` in the version `1.13.0`, as follows:

[PR] [WIP][SPARK-47338][SQL] Introduce `_LEGACY_ERROR_UNKNOWN` for default error class [spark]

2024-03-11 Thread via GitHub
itholic opened a new pull request, #45457: URL: https://github.com/apache/spark/pull/45457 ### What changes were proposed in this pull request? This PR proposes to introduce `_LEGACY_ERROR_UNKNOWN` for default error class when error class is not defined. ### Why are

Re: [PR] [SPARK-46654][SQL][Python] Make `to_csv` explicitly indicate that it does not support some types of data [spark]

2024-03-11 Thread via GitHub
panbingkun commented on code in PR #44665: URL: https://github.com/apache/spark/pull/44665#discussion_r1519273344 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/csvExpressions.scala: ## @@ -260,16 +263,33 @@ case class StructsToCsv( child =

Re: [PR] [WIP][SPARK-47274][PYTHON][CONNECT] Provide more useful context for PySpark DataFrame API errors [spark]

2024-03-11 Thread via GitHub
itholic commented on code in PR #45377: URL: https://github.com/apache/spark/pull/45377#discussion_r1519270151 ## python/pyspark/errors/utils.py: ## @@ -15,12 +15,22 @@ # limitations under the License. # +import builtins import re -from typing import Dict, Match +import

Re: [PR] [SPARK-46654][SQL][Python] Make `to_csv` explicitly indicate that it does not support some types of data [spark]

2024-03-11 Thread via GitHub
panbingkun commented on code in PR #44665: URL: https://github.com/apache/spark/pull/44665#discussion_r1519260055 ## sql/core/src/test/scala/org/apache/spark/sql/CsvFunctionsSuite.scala: ## @@ -294,10 +294,19 @@ class CsvFunctionsSuite extends QueryTest with SharedSparkSession

Re: [PR] [SPARK-46654][SQL][Python] Make `to_csv` explicitly indicate that it does not support some types of data [spark]

2024-03-11 Thread via GitHub
panbingkun commented on code in PR #44665: URL: https://github.com/apache/spark/pull/44665#discussion_r1519252452 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/csvExpressions.scala: ## @@ -260,16 +263,36 @@ case class StructsToCsv( child =

Re: [PR] [SPARK-46913][SS] Add support for processing/event time based timers with transformWithState operator [spark]

2024-03-11 Thread via GitHub
HeartSaVioR commented on code in PR #45051: URL: https://github.com/apache/spark/pull/45051#discussion_r1519251919 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/TimerStateImpl.scala: ## @@ -0,0 +1,224 @@ +/* + * Licensed to the Apache Software Foundation

Re: [PR] [SPARK-46654][SQL][Python] Make `to_csv` explicitly indicate that it does not support some types of data [spark]

2024-03-11 Thread via GitHub
panbingkun commented on code in PR #44665: URL: https://github.com/apache/spark/pull/44665#discussion_r1519249360 ## python/pyspark/sql/functions/builtin.py: ## @@ -15534,19 +15532,7 @@ def to_csv(col: "ColumnOrName", options: Optional[Dict[str, str]] = None) -> Col |

Re: [PR] [SPARK-46812][CONNECT][PYTHON] Make mapInPandas / mapInArrow support ResourceProfile [spark]

2024-03-11 Thread via GitHub
wbo4958 commented on code in PR #45232: URL: https://github.com/apache/spark/pull/45232#discussion_r1519241310 ## connector/connect/common/src/main/protobuf/spark/connect/relations.proto: ## @@ -892,6 +893,9 @@ message MapPartitions { // (Optional) Whether to use barrier

Re: [PR] [SPARK-46812][CONNECT][PYTHON] Make mapInPandas / mapInArrow support ResourceProfile [spark]

2024-03-11 Thread via GitHub
wbo4958 commented on code in PR #45232: URL: https://github.com/apache/spark/pull/45232#discussion_r1519235937 ## connector/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectBuildResourceProfileHandler.scala: ## @@ -0,0 +1,75 @@ +/* + * Licensed to

Re: [PR] [SPARK-46812][CONNECT][PYTHON] Make mapInPandas / mapInArrow support ResourceProfile [spark]

2024-03-11 Thread via GitHub
wbo4958 commented on code in PR #45232: URL: https://github.com/apache/spark/pull/45232#discussion_r1519232917 ## connector/connect/common/src/main/protobuf/spark/connect/base.proto: ## @@ -1011,5 +1039,7 @@ service SparkConnectService { // FetchErrorDetails retrieves the

Re: [PR] [SPARK-46812][CONNECT][PYTHON] Make mapInPandas / mapInArrow support ResourceProfile [spark]

2024-03-11 Thread via GitHub
wbo4958 commented on code in PR #45232: URL: https://github.com/apache/spark/pull/45232#discussion_r1519232536 ## connector/connect/common/src/main/protobuf/spark/connect/relations.proto: ## @@ -892,6 +893,9 @@ message MapPartitions { // (Optional) Whether to use barrier

Re: [PR] [WIP][SPARK-47254][SQL] Assign names to the error classes _LEGACY_ERROR_TEMP_325[1-9] [spark]

2024-03-11 Thread via GitHub
stefanbuk-db commented on code in PR #45407: URL: https://github.com/apache/spark/pull/45407#discussion_r1519220632 ## sql/api/src/main/scala/org/apache/spark/sql/catalyst/util/SparkIntervalUtils.scala: ## @@ -229,13 +237,21 @@ trait SparkIntervalUtils { try {

[PR] [SPARK-47337][SQL][DOCKER] Upgrade DB2 docker image version to 11.5.8.0 [spark]

2024-03-11 Thread via GitHub
yaooqinn opened a new pull request, #45456: URL: https://github.com/apache/spark/pull/45456 ### What changes were proposed in this pull request? This PR upgrades the DB2 Docker image version to 11.5.8.0 used by docker-integration tests

Re: [PR] [WIP][SPARK-47254][SQL] Assign names to the error classes _LEGACY_ERROR_TEMP_325[1-9] [spark]

2024-03-11 Thread via GitHub
stefanbuk-db commented on code in PR #45407: URL: https://github.com/apache/spark/pull/45407#discussion_r1519209532 ## common/utils/src/main/resources/error/error-classes.json: ## @@ -2060,6 +2085,74 @@ }, "sqlState" : "42000" }, + "INVALID_INTERVAL_FORMAT" : { +

Re: [PR] [WIP][SPARK-47254][SQL] Assign names to the error classes _LEGACY_ERROR_TEMP_325[1-9] [spark]

2024-03-11 Thread via GitHub
stefanbuk-db commented on code in PR #45407: URL: https://github.com/apache/spark/pull/45407#discussion_r1519208459 ## sql/api/src/main/scala/org/apache/spark/sql/types/StructType.scala: ## @@ -283,10 +283,10 @@ case class StructType(fields: Array[StructField]) extends

Re: [PR] [WIP][SPARK-47254][SQL] Assign names to the error classes _LEGACY_ERROR_TEMP_325[1-9] [spark]

2024-03-11 Thread via GitHub
stefanbuk-db commented on code in PR #45407: URL: https://github.com/apache/spark/pull/45407#discussion_r1519208319 ## common/utils/src/main/resources/error/error-classes.json: ## @@ -1823,6 +1830,24 @@ }, "sqlState" : "HY109" }, + "INVALID_DATETIME_PATTERN" : { +

Re: [PR] [SPARK-46913][SS] Add support for processing/event time based timers with transformWithState operator [spark]

2024-03-11 Thread via GitHub
HeartSaVioR commented on code in PR #45051: URL: https://github.com/apache/spark/pull/45051#discussion_r1519192303 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/TimerStateImpl.scala: ## @@ -0,0 +1,224 @@ +/* + * Licensed to the Apache Software Foundation
