Re: [PR] [SPARK-46812][CONNECT][PYTHON] Make mapInPandas / mapInArrow support ResourceProfile [spark]

2024-03-07 Thread via GitHub
tgravescs commented on code in PR #45232: URL: https://github.com/apache/spark/pull/45232#discussion_r1516375405 ## python/pyspark/resource/tests/test_connect_resources.py: ## @@ -0,0 +1,46 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +#

Re: [PR] [SQL] Bind JDBC dialect to JDBCRDD at construction [spark]

2024-03-07 Thread via GitHub
johnnywalker commented on code in PR #45410: URL: https://github.com/apache/spark/pull/45410#discussion_r1516375276 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRDD.scala: ## @@ -153,12 +153,12 @@ object JDBCRDD extends Logging { */ class

Re: [PR] [SPARK-47302][SQL][Collation] Collate keyword as identifier [spark]

2024-03-07 Thread via GitHub
MaxGekk commented on code in PR #45405: URL: https://github.com/apache/spark/pull/45405#discussion_r1516373801 ## sql/api/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4: ## @@ -1096,7 +1096,7 @@ colPosition ; collateClause -: COLLATE

Re: [PR] [SPARK-47248][SQL][COLLATION] Improved string function support: contains [spark]

2024-03-07 Thread via GitHub
cloud-fan closed pull request #45382: [SPARK-47248][SQL][COLLATION] Improved string function support: contains URL: https://github.com/apache/spark/pull/45382 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

Re: [PR] [SPARK-47248][SQL][COLLATION] Improved string function support: contains [spark]

2024-03-07 Thread via GitHub
cloud-fan commented on PR #45382: URL: https://github.com/apache/spark/pull/45382#issuecomment-1983758947 The GA jobs all passed: https://github.com/uros-db/spark/actions/runs/8186876833/job/22395549669 merging to master, thanks!

Re: [PR] [SPARK-47302][SQL][Collation] Collate keyword as identifier [spark]

2024-03-07 Thread via GitHub
stefankandic commented on code in PR #45405: URL: https://github.com/apache/spark/pull/45405#discussion_r1516357243 ## sql/api/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4: ## @@ -1096,7 +1096,7 @@ colPosition ; collateClause -: COLLATE

Re: [PR] [SPARK-45827][SQL] Move data type checks to CreatableRelationProvider [spark]

2024-03-07 Thread via GitHub
cashmand commented on code in PR #45409: URL: https://github.com/apache/spark/pull/45409#discussion_r1516216789 ## sql/core/src/main/scala/org/apache/spark/sql/sources/interfaces.scala: ## @@ -175,6 +175,25 @@ trait CreatableRelationProvider { mode: SaveMode,

Re: [PR] [SPARK-47314][DOC] Correct the `ExternalSorter#writePartitionedMapOutput` method comment [spark]

2024-03-07 Thread via GitHub
LuciferYang commented on code in PR #45415: URL: https://github.com/apache/spark/pull/45415#discussion_r1516106051 ## core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala: ## @@ -690,7 +690,7 @@ private[spark] class ExternalSorter[K, V, C]( * Write all

[PR] [SPARK-46761][SQL] Quoted strings in a JSON path should support ? characters [spark]

2024-03-07 Thread via GitHub
planga82 opened a new pull request, #45420: URL: https://github.com/apache/spark/pull/45420 ### What changes were proposed in this pull request? If there is a JSON with a ? character in the key like ``` {"?":"QUESTION"} ``` This PR allows adding this character

Re: [PR] [SPARK-47316][SQL] Fix TimestampNTZ in Postgres Array [spark]

2024-03-07 Thread via GitHub
cloud-fan commented on code in PR #45418: URL: https://github.com/apache/spark/pull/45418#discussion_r1516087243 ## sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala: ## @@ -87,17 +87,26 @@ abstract class JdbcDialect extends Serializable with Logging {

Re: [PR] [SPARK-47302][SQL][Collation] Collate keyword as identifier [spark]

2024-03-07 Thread via GitHub
MaxGekk commented on code in PR #45405: URL: https://github.com/apache/spark/pull/45405#discussion_r1516070311 ## sql/api/src/main/scala/org/apache/spark/sql/types/DataType.scala: ## @@ -117,7 +117,7 @@ object DataType { private val FIXED_DECIMAL =

Re: [PR] [SPARK-47302][SQL][Collation] Collate keyword as identifier [spark]

2024-03-07 Thread via GitHub
MaxGekk commented on code in PR #45405: URL: https://github.com/apache/spark/pull/45405#discussion_r1516062859 ## sql/api/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4: ## @@ -1096,7 +1096,7 @@ colPosition ; collateClause -: COLLATE

Re: [PR] [SPARK-47254][SQL] Assign names to the error classes _LEGACY_ERROR_TEMP_325[1-9][WIP] [spark]

2024-03-07 Thread via GitHub
MaxGekk commented on PR #45407: URL: https://github.com/apache/spark/pull/45407#issuecomment-1983367613 @stefanbuk-db If you are still working on the PR, please move the `[WIP]` tag to the beginning of the PR's title (this is a convention)

Re: [PR] [SPARK-47254][SQL] Assign names to the error classes _LEGACY_ERROR_TEMP_325[1-9][WIP] [spark]

2024-03-07 Thread via GitHub
MaxGekk commented on code in PR #45407: URL: https://github.com/apache/spark/pull/45407#discussion_r1515965942 ## sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLParserSuite.scala: ## @@ -455,19 +455,6 @@ class DDLParserSuite extends AnalysisTest with

Re: [PR] [MINOR][DOCS][PYTHON] Fix documentation typo in takeSample method [spark]

2024-03-07 Thread via GitHub
yaooqinn closed pull request #45419: [MINOR][DOCS][PYTHON] Fix documentation typo in takeSample method URL: https://github.com/apache/spark/pull/45419

Re: [PR] [MINOR][DOCS][PYTHON] Fix documentation typo in takeSample method [spark]

2024-03-07 Thread via GitHub
yaooqinn commented on PR #45419: URL: https://github.com/apache/spark/pull/45419#issuecomment-1983320337 Merged to master. Thank you @kimborowicz @zhengruifeng

Re: [PR] [SPARK-47278][BUILD] Upgrade rocksdbjni to 8.11.3 [spark]

2024-03-07 Thread via GitHub
LuciferYang commented on PR #45365: URL: https://github.com/apache/spark/pull/45365#issuecomment-1983298155 Thanks @yaooqinn @dongjoon-hyun

Re: [PR] [SPARK-47278][BUILD] Upgrade rocksdbjni to 8.11.3 [spark]

2024-03-07 Thread via GitHub
yaooqinn closed pull request #45365: [SPARK-47278][BUILD] Upgrade rocksdbjni to 8.11.3 URL: https://github.com/apache/spark/pull/45365

Re: [PR] [SPARK-47278][BUILD] Upgrade rocksdbjni to 8.11.3 [spark]

2024-03-07 Thread via GitHub
yaooqinn commented on PR #45365: URL: https://github.com/apache/spark/pull/45365#issuecomment-1983294203 Merged to master. Thank you @LuciferYang @dongjoon-hyun

Re: [PR] [SPARK-47298][BUILD] Upgrade `mysql-connector-j` to `8.3.0` and `mariadb-java-client` to `2.7.12` [spark]

2024-03-07 Thread via GitHub
yaooqinn commented on PR #45399: URL: https://github.com/apache/spark/pull/45399#issuecomment-1983290627 Merged to master. Thank you @panbingkun @dongjoon-hyun

Re: [PR] [SPARK-47298][BUILD] Upgrade `mysql-connector-j` to `8.3.0` and `mariadb-java-client` to `2.7.12` [spark]

2024-03-07 Thread via GitHub
yaooqinn closed pull request #45399: [SPARK-47298][BUILD] Upgrade `mysql-connector-j` to `8.3.0` and `mariadb-java-client` to `2.7.12` URL: https://github.com/apache/spark/pull/45399

Re: [PR] [SPARK-47302][SQL][Collation] Collate keyword as identifier [spark]

2024-03-07 Thread via GitHub
dbatomic commented on code in PR #45405: URL: https://github.com/apache/spark/pull/45405#discussion_r1515984970 ## python/pyspark/sql/tests/test_types.py: ## @@ -862,15 +862,13 @@ def test_parse_datatype_string(self): if k != "varchar" and k != "char":

Re: [PR] [SPARK-47302][SQL][Collation] Collate keyword as identifier [spark]

2024-03-07 Thread via GitHub
dbatomic commented on code in PR #45405: URL: https://github.com/apache/spark/pull/45405#discussion_r1515983493 ## sql/api/src/main/scala/org/apache/spark/sql/catalyst/parser/DataTypeAstBuilder.scala: ## @@ -218,6 +218,6 @@ class DataTypeAstBuilder extends

Re: [PR] [SPARK-47300][SQL] `quoteIfNeeded` should quote identifier starts with digits [spark]

2024-03-07 Thread via GitHub
yaooqinn commented on PR #45401: URL: https://github.com/apache/spark/pull/45401#issuecomment-1983276998 Merged to master. Thank you @cloud-fan @dongjoon-hyun @HyukjinKwon @MaxGekk

Re: [PR] [SPARK-47300][SQL] `quoteIfNeeded` should quote identifier starts with digits [spark]

2024-03-07 Thread via GitHub
yaooqinn closed pull request #45401: [SPARK-47300][SQL] `quoteIfNeeded` should quote identifier starts with digits URL: https://github.com/apache/spark/pull/45401

Re: [PR] [SPARK-46992]Fix cache consistence [spark]

2024-03-07 Thread via GitHub
dtarima commented on PR #45181: URL: https://github.com/apache/spark/pull/45181#issuecomment-1983260338 All children have to be considered for changes of their persistence state. Currently it only checks the first found child. For clarity there is a test which fails:

Re: [PR] [SPARK-47238][SQL] Reduce executor memory usage by making generated code in WSCG a broadcast variable [spark]

2024-03-07 Thread via GitHub
jwang0306 commented on PR #45348: URL: https://github.com/apache/spark/pull/45348#issuecomment-1983243153 Thank you!

[PR] [DOCS][PYTHON] Fix documentation typo in takeSample method [spark]

2024-03-07 Thread via GitHub
kimborowicz opened a new pull request, #45419: URL: https://github.com/apache/spark/pull/45419 ### What changes were proposed in this pull request? Fixed an error in the docstring documentation for the parameter `withReplacement` of `takeSample` method in `pyspark.RDD`, should be

Re: [PR] [SPARK-43124][SQL] Add ConvertCommandResultToLocalRelation rule [spark]

2024-03-07 Thread via GitHub
wForget commented on PR #45397: URL: https://github.com/apache/spark/pull/45397#issuecomment-1983129621 Close with comment: https://github.com/apache/spark/pull/45397#discussion_r1515557219

Re: [PR] [SPARK-43124][SQL] Add ConvertCommandResultToLocalRelation rule [spark]

2024-03-07 Thread via GitHub
wForget closed pull request #45397: [SPARK-43124][SQL] Add ConvertCommandResultToLocalRelation rule URL: https://github.com/apache/spark/pull/45397

Re: [PR] [SPARK-47316][SQL] Fix TimestampNTZ in Postgres Array [spark]

2024-03-07 Thread via GitHub
yaooqinn commented on PR #45418: URL: https://github.com/apache/spark/pull/45418#issuecomment-1983087260 cc @cloud-fan

Re: [PR] [SPARK-47301][SQL][TESTS] Fix flaky ParquetIOSuite [spark]

2024-03-07 Thread via GitHub
yaooqinn commented on PR #45403: URL: https://github.com/apache/spark/pull/45403#issuecomment-1983075720 Merged to master. Thank you, @panbingkun & @dongjoon-hyun

Re: [PR] [SPARK-47301][SQL][TESTS] Fix flaky ParquetIOSuite [spark]

2024-03-07 Thread via GitHub
yaooqinn closed pull request #45403: [SPARK-47301][SQL][TESTS] Fix flaky ParquetIOSuite URL: https://github.com/apache/spark/pull/45403

Re: [PR] [SPARK-36691][PYTHON] PythonRunner failed should pass error message to ApplicationMaster too [spark]

2024-03-07 Thread via GitHub
cloud-fan commented on code in PR #33934: URL: https://github.com/apache/spark/pull/33934#discussion_r1515813583 ## core/src/main/scala/org/apache/spark/util/Utils.scala: ## @@ -3281,6 +3282,80 @@ private[spark] class RedirectThread( } } +private[spark] class

[PR] [SPARK-36691][PYTHON] PythonRunner failed should pass error message to ApplicationMaster too [spark]

2024-03-07 Thread via GitHub
AngersZh opened a new pull request, #33934: URL: https://github.com/apache/spark/pull/33934 ### What changes were proposed in this pull request? In current pyspark, stderr and stdout are printed together; if the Python script exits, PythonRunner will only throw a `SparkUserAppsException`

Re: [PR] [SPARK-47300][SQL] `quoteIfNeeded` should quote identifier starts with digits [spark]

2024-03-07 Thread via GitHub
cloud-fan commented on code in PR #45401: URL: https://github.com/apache/spark/pull/45401#discussion_r1515795172 ## sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/StringUtilsSuite.scala: ## @@ -129,16 +129,6 @@ class StringUtilsSuite extends SparkFunSuite with

Re: [PR] [SPARK-47241][SQL] Fix rule order issues for ExtractGenerator [spark]

2024-03-07 Thread via GitHub
cloud-fan closed pull request #45350: [SPARK-47241][SQL] Fix rule order issues for ExtractGenerator URL: https://github.com/apache/spark/pull/45350

Re: [PR] [SPARK-47241][SQL] Fix rule order issues for ExtractGenerator [spark]

2024-03-07 Thread via GitHub
cloud-fan commented on PR #45350: URL: https://github.com/apache/spark/pull/45350#issuecomment-1983026959 thanks for the review, merging to master/3.5!

Re: [PR] [SPARK-43124][SQL] Add ConvertCommandResultToLocalRelation rule [spark]

2024-03-07 Thread via GitHub
cloud-fan commented on code in PR #45397: URL: https://github.com/apache/spark/pull/45397#discussion_r1515785048 ## sql/core/src/main/scala/org/apache/spark/sql/catalyst/optimizer/ConvertCommandResultToLocalRelation.scala: ## @@ -0,0 +1,52 @@ +/* + * Licensed to the Apache

Re: [PR] [SPARK-47315][SQL][TEST] Clean up tempView for `createTempView` UT [spark]

2024-03-07 Thread via GitHub
yaooqinn commented on PR #45417: URL: https://github.com/apache/spark/pull/45417#issuecomment-1983000945 Merged to master. Thank you @wForget and @HyukjinKwon.

Re: [PR] [SPARK-47315][SQL][TEST] Clean up tempView for `createTempView` UT [spark]

2024-03-07 Thread via GitHub
yaooqinn closed pull request #45417: [SPARK-47315][SQL][TEST] Clean up tempView for `createTempView` UT URL: https://github.com/apache/spark/pull/45417

Re: [PR] [SPARK-47314][DOC] Correct the `ExternalSorter#writePartitionedMapOutput` method comment [spark]

2024-03-07 Thread via GitHub
yaooqinn commented on code in PR #45415: URL: https://github.com/apache/spark/pull/45415#discussion_r1515746618 ## core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala: ## @@ -690,7 +690,7 @@ private[spark] class ExternalSorter[K, V, C]( * Write all the

[PR] [SPARK-47316][SQL] Fix TimestampNTZ in Postgres Array [spark]

2024-03-07 Thread via GitHub
yaooqinn opened a new pull request, #45418: URL: https://github.com/apache/spark/pull/45418 ### What changes were proposed in this pull request? For Postgres, TimestampNTZ works well for plain TimestampNTZ types but not for nested ones, typically for now: array. This

Re: [PR] [SPARK-47307] Replace RFC 2045 base64 encoder with RFC 4648 encoder [spark]

2024-03-07 Thread via GitHub
ted-jenks commented on PR #45408: URL: https://github.com/apache/spark/pull/45408#issuecomment-1982836579 @dongjoon-hyun > It sounds like you have other systems to read Spark's data. Correct. The issue was that from 3.2 to 3.3 there was a behavior change in the base64 encodings used
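
Editor's illustration (not part of the thread): the PR title above contrasts an RFC 2045 (MIME) base64 encoder with an RFC 4648 one. The sketch below is a rough analogy in Python, not Spark's code. It assumes Python's `base64.encodebytes` as a stand-in for MIME-style output (it wraps after every 76 output characters, using `\n` rather than the CRLF that RFC 2045 specifies) and `base64.b64encode` for unwrapped RFC 4648 output. The `payload` value is made up for the example.

```python
import base64
import binascii

# Hypothetical payload: 100 bytes, long enough that the encoding exceeds one 76-char line.
payload = bytes(range(100))

# MIME-style encoding: a newline is inserted after every 76 output characters.
wrapped = base64.encodebytes(payload)

# RFC 4648 "basic" encoding: a single line with no separators.
unwrapped = base64.b64encode(payload)

# The two outputs differ only in the line separators.
assert wrapped.replace(b"\n", b"") == unwrapped

# A lenient decoder skips the separators; a strict one rejects them.
assert base64.b64decode(wrapped) == payload
try:
    base64.b64decode(wrapped, validate=True)
    strict_accepts_wrapped = True
except binascii.Error:
    strict_accepts_wrapped = False
```

The point of the sketch is only that a reader expecting single-line base64 may reject wrapped output, which is one way a change of encoder can surface as a behavior change for other systems reading the data.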
