Re: [PR] [SPARK-45022][SQL] Provide context for dataset API errors [spark]

2024-03-31 Thread via GitHub
cloud-fan commented on PR #43334: URL: https://github.com/apache/spark/pull/43334#issuecomment-2029265533 I don't think so... We are still waiting for people who are familiar with Spark Connect to pick it up. -- This is an automated message from the Apache Git Service. To respond to the m

[PR] [SPARK-47665][SQL] Use SMALLINT to Write ShortType to MYSQL [spark]

2024-03-31 Thread via GitHub
yaooqinn opened a new pull request, #45789: URL: https://github.com/apache/spark/pull/45789 ### What changes were proposed in this pull request? This PR uses MySQL SMALLINT to write ShortType to MYSQL, as we read SMALLINT as ShortType. ### Why are the changes needed?

Re: [PR] [SPARK-45022][SQL] Provide context for dataset API errors [spark]

2024-03-31 Thread via GitHub
itholic commented on PR #43334: URL: https://github.com/apache/spark/pull/43334#issuecomment-2029250313 > > If not, we probably should disable this feature for spark connect for now since customers may get confused if they see contexts for dataset API errors (likely in the spark connect pla

Re: [PR] [SPARK-47141][CORE] Support enabling migration of shuffle data directly to external storage using config parameter. [spark]

2024-03-31 Thread via GitHub
attilapiros commented on PR #45228: URL: https://github.com/apache/spark/pull/45228#issuecomment-2029191559 > > I am also concerned about the performance. > > I think the best would be if the migration of shuffle data to external storage would only kick in when the scale down is aggressiv

[PR] [SPARK-47664][PYTHON][CONNECT] Validate the column name with cached schema [spark]

2024-03-31 Thread via GitHub
zhengruifeng opened a new pull request, #45788: URL: https://github.com/apache/spark/pull/45788 ### What changes were proposed in this pull request? Improve the column name validation to avoid RPCs where possible. ### Why are the changes needed? existing validation contains

Re: [PR] [SPARK-47645][BUILD][CORE][SQL][YARN] Make Spark build with `-release` instead of `-target` [spark]

2024-03-31 Thread via GitHub
LuciferYang commented on PR #45716: URL: https://github.com/apache/spark/pull/45716#issuecomment-2029152050 Thanks @dongjoon-hyun ~

Re: [PR] [SPARK-47645][BUILD][CORE][SQL][YARN] Make Spark build with `-release` instead of `-target` [spark]

2024-03-31 Thread via GitHub
dongjoon-hyun commented on PR #45716: URL: https://github.com/apache/spark/pull/45716#issuecomment-2029138539 Merged to master. Thank you, @LuciferYang and all.

Re: [PR] [SPARK-47645][BUILD][CORE][SQL][YARN] Make Spark build with `-release` instead of `-target` [spark]

2024-03-31 Thread via GitHub
dongjoon-hyun closed pull request #45716: [SPARK-47645][BUILD][CORE][SQL][YARN] Make Spark build with `-release` instead of `-target` URL: https://github.com/apache/spark/pull/45716

Re: [PR] [SPARK-47662][SQL][DOCS] Add User Document for Mapping Spark SQL Data Types to MySQL [spark]

2024-03-31 Thread via GitHub
yaooqinn commented on PR #45787: URL: https://github.com/apache/spark/pull/45787#issuecomment-2029132133 Thank you very much, @dongjoon-hyun

Re: [PR] [SPARK-47662][SQL][DOCS] Add User Document for Mapping Spark SQL Data Types to MySQL [spark]

2024-03-31 Thread via GitHub
dongjoon-hyun closed pull request #45787: [SPARK-47662][SQL][DOCS] Add User Document for Mapping Spark SQL Data Types to MySQL URL: https://github.com/apache/spark/pull/45787

Re: [PR] [SPARK-47662][SQL][DOCS] Add User Document for Mapping Spark SQL Data Types to MySQL [spark]

2024-03-31 Thread via GitHub
dongjoon-hyun commented on PR #45787: URL: https://github.com/apache/spark/pull/45787#issuecomment-2029131805 Merged to master. Thank you, @yaooqinn !

[PR] [SPARK-47662][SQL] Add User Document for Mapping Spark SQL Data Types to MySQL [spark]

2024-03-31 Thread via GitHub
yaooqinn opened a new pull request, #45787: URL: https://github.com/apache/spark/pull/45787 ### What changes were proposed in this pull request? Following #45736, we add the User Document for Mapping Spark SQL Data Types to MySQL ### Why are the changes needed?

Re: [PR] [SPARK-47141][CORE] Support enabling migration of shuffle data directly to external storage using config parameter. [spark]

2024-03-31 Thread via GitHub
maheshk114 commented on code in PR #45228: URL: https://github.com/apache/spark/pull/45228#discussion_r1545941169 ## core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala: ## @@ -398,8 +410,13 @@ final class ShuffleBlockFetcherIterator( var pushMerg

Re: [PR] [SPARK-47141][CORE] Support enabling migration of shuffle data directly to external storage using config parameter. [spark]

2024-03-31 Thread via GitHub
maheshk114 commented on PR #45228: URL: https://github.com/apache/spark/pull/45228#issuecomment-2029095673 > I am also concerned about the performance. > > I think the best would be if the migration of shuffle data to external storage would only kick in when the scale down is aggressi

Re: [PR] [SPARK-47458][CORE] Fix the problem with calculating the maximum concurrent tasks for the barrier stage [spark]

2024-03-31 Thread via GitHub
WeichenXu123 commented on code in PR #45528: URL: https://github.com/apache/spark/pull/45528#discussion_r1545905575 ## core/src/test/scala/org/apache/spark/scheduler/TaskSchedulerImplSuite.scala: ## @@ -2687,4 +2687,58 @@ class TaskSchedulerImplSuite extends SparkFunSuite with

Re: [PR] [SPARK-47646][SQL][FOLLOWUP][3.4] Replace non-existing try_to_number function with TryToNumber [spark]

2024-03-31 Thread via GitHub
HyukjinKwon commented on PR #45785: URL: https://github.com/apache/spark/pull/45785#issuecomment-2028990639 Thank you.

Re: [PR] [SPARK-47646][SQL] Make try_to_number return NULL for malformed input [spark]

2024-03-31 Thread via GitHub
HyukjinKwon commented on PR #45771: URL: https://github.com/apache/spark/pull/45771#issuecomment-2028990510 Thank you guys!

Re: [PR] [SPARK-46812][CONNECT][PYTHON] Make mapInPandas / mapInArrow support ResourceProfile [spark]

2024-03-31 Thread via GitHub
wbo4958 commented on code in PR #45232: URL: https://github.com/apache/spark/pull/45232#discussion_r1545872296 ## connector/connect/common/src/main/protobuf/spark/connect/base.proto: ## @@ -375,6 +375,9 @@ message ExecutePlanResponse { // Response type informing if the stre

Re: [PR] [SPARK-47646][SQL][FOLLOWUP][3.4] Replace non-existing try_to_number function with TryToNumber [spark]

2024-03-31 Thread via GitHub
viirya commented on PR #45785: URL: https://github.com/apache/spark/pull/45785#issuecomment-2028939879 Thank you @sunchao @dongjoon-hyun

Re: [PR] [SPARK-47646][SQL] Make try_to_number return NULL for malformed input [spark]

2024-03-31 Thread via GitHub
bersprockets commented on PR #45771: URL: https://github.com/apache/spark/pull/45771#issuecomment-2028936223 @HyukjinKwon I should have mentioned that `try_to_number` exists in 3.4.2 as an SQL function but not as a scala function in `functions.scala` (that's why 3.4.2 example had to

Re: [PR] [SPARK-47658][BUILD] Upgrade `tink` to 1.12.0 [spark]

2024-03-31 Thread via GitHub
dongjoon-hyun closed pull request #45783: [SPARK-47658][BUILD] Upgrade `tink` to 1.12.0 URL: https://github.com/apache/spark/pull/45783

Re: [PR] [SPARK-47656][BUILD] Upgrade commons-io to 2.16.0 [spark]

2024-03-31 Thread via GitHub
dongjoon-hyun closed pull request #45781: [SPARK-47656][BUILD] Upgrade commons-io to 2.16.0 URL: https://github.com/apache/spark/pull/45781

Re: [PR] [SPARK-47646][SQL][FOLLOWUP][3.4] Replace non-existing try_to_number function with TryToNumber [spark]

2024-03-31 Thread via GitHub
dongjoon-hyun commented on PR #45785: URL: https://github.com/apache/spark/pull/45785#issuecomment-2028935442 cc @HyukjinKwon , @cloud-fan , @bersprockets from #45771

Re: [PR] [SPARK-47646][SQL] Make try_to_number return NULL for malformed input [spark]

2024-03-31 Thread via GitHub
dongjoon-hyun commented on PR #45771: URL: https://github.com/apache/spark/pull/45771#issuecomment-2028935260 For the record, this broke branch-3.4 and the following PR fixed it. - #45785

Re: [PR] [SPARK-47646][SQL][FOLLOWUP][3.4] Replace non-existing try_to_number function with TryToNumber [spark]

2024-03-31 Thread via GitHub
dongjoon-hyun commented on PR #45785: URL: https://github.com/apache/spark/pull/45785#issuecomment-2028935146 Merged to branch-3.4.

Re: [PR] [SPARK-47646][SQL][FOLLOWUP][3.4] Replace non-existing try_to_number function with TryToNumber [spark]

2024-03-31 Thread via GitHub
dongjoon-hyun closed pull request #45785: [SPARK-47646][SQL][FOLLOWUP][3.4] Replace non-existing try_to_number function with TryToNumber URL: https://github.com/apache/spark/pull/45785

Re: [PR] [SPARK-47646][SQL][FOLLOWUP] Replace non-existing try_to_number function with TryToNumber [spark]

2024-03-31 Thread via GitHub
viirya commented on PR #45785: URL: https://github.com/apache/spark/pull/45785#issuecomment-2028858606 cc @sunchao

Re: [PR] [SPARK-46810][DOCS] Align error class terminology with SQL standard [spark]

2024-03-31 Thread via GitHub
nchammas commented on PR #44902: URL: https://github.com/apache/spark/pull/44902#issuecomment-2028802766 @MaxGekk - Is there something I can do to move this along? Once this is merged in, I am happy to work on SPARK-47429, as I noted in the comments there. (But it's also fine if you'd

Re: [PR] [SPARK-47659][CORE][TESTS] Improve `*LoggingSuite*` [spark]

2024-03-31 Thread via GitHub
panbingkun commented on code in PR #45784: URL: https://github.com/apache/spark/pull/45784#discussion_r1545536535 ## common/utils/src/test/scala/org/apache/spark/util/PatternLoggingSuite.scala: ## @@ -16,24 +16,27 @@ */ package org.apache.spark.util +import org.apache.loggi

Re: [PR] [SPARK-47574][INFRA] Introduce Structured Logging Framework [spark]

2024-03-31 Thread via GitHub
panbingkun commented on code in PR #45729: URL: https://github.com/apache/spark/pull/45729#discussion_r1545654750 ## common/utils/src/test/scala/org/apache/spark/util/StructuredLoggingSuite.scala: ## @@ -0,0 +1,91 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

Re: [PR] [SPARK-47649][SQL] Make the parameter `inputs` of the function `[csv|parquet|orc|json|text|xml](paths: String*)` non empty [spark]

2024-03-31 Thread via GitHub
koedlt commented on code in PR #45776: URL: https://github.com/apache/spark/pull/45776#discussion_r1545643660 ## sql/core/src/test/java/test/org/apache/spark/sql/JavaDataFrameWriterV2Suite.java: ## @@ -26,29 +32,30 @@ import org.apache.spark.sql.connector.catalog.InMemoryTableC

Re: [PR] [SPARK-47649][SQL] Make the parameter `inputs` of the function `[csv|parquet|orc|json|text|xml](paths: String*)` non empty [spark]

2024-03-31 Thread via GitHub
koedlt commented on code in PR #45776: URL: https://github.com/apache/spark/pull/45776#discussion_r1545642868 ## sql/core/src/test/java/test/org/apache/spark/sql/JavaDataFrameWriterV2Suite.java: ## @@ -26,29 +32,30 @@ import org.apache.spark.sql.connector.catalog.InMemoryTableC

Re: [PR] [SPARK-47659][CORE][TESTS] Improve `*LoggingSuite*` [spark]

2024-03-31 Thread via GitHub
panbingkun commented on code in PR #45784: URL: https://github.com/apache/spark/pull/45784#discussion_r1545639407 ## common/utils/src/test/scala/org/apache/spark/util/StructuredLoggingSuite.scala: ## @@ -19,23 +19,28 @@ package org.apache.spark.util import java.io.File import

Re: [PR] [SPARK-47646][SQL][FOLLOWUP] Replace non-existing try_to_number function with TryToNumber [spark]

2024-03-31 Thread via GitHub
viirya commented on PR #45785: URL: https://github.com/apache/spark/pull/45785#issuecomment-2028584493 cc @HyukjinKwon @dongjoon-hyun

[PR] [SPARK-47646][SQL][FOLLOWUP] Replace non-existing try_to_number function with TryToNumber [spark]

2024-03-31 Thread via GitHub
viirya opened a new pull request, #45785: URL: https://github.com/apache/spark/pull/45785 ### What changes were proposed in this pull request? This patch fixes broken CI by replacing non-existing `try_to_number` function in branch-3.4. ### Why are the changes needed