Re: [PR] [SPARK-46402][PYTHON] Add getMessageParameters and getQueryContext support [spark]

2023-12-13 Thread via GitHub
cloud-fan commented on code in PR #44349: URL: https://github.com/apache/spark/pull/44349#discussion_r1426363179 ## python/pyspark/errors/exceptions/base.py: ## @@ -34,6 +34,7 @@ def __init__( message: Optional[str] = None, error_class: Optional[str] = None,

Re: [PR] [SPARK-45796][SQL] Support MODE() WITHIN GROUP (ORDER BY col) [spark]

2023-12-13 Thread via GitHub
beliefer commented on PR #44184: URL: https://github.com/apache/spark/pull/44184#issuecomment-1855352902 The GA failure seems unrelated.
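For context on the syntax this PR adds, a minimal sketch in Scala (table name, column names, and data are invented; it only runs against a Spark build that includes the feature):

```scala
import org.apache.spark.sql.SparkSession

object ModeWithinGroupSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("mode-within-group").master("local[*]").getOrCreate()
    import spark.implicits._

    // Hypothetical data; names are illustrative only.
    Seq(("eng", 100), ("eng", 100), ("eng", 120), ("sales", 90))
      .toDF("dept", "salary")
      .createOrReplaceTempView("employees")

    // MODE() WITHIN GROUP (ORDER BY col): the most frequent salary per department.
    spark.sql(
      """SELECT dept, mode() WITHIN GROUP (ORDER BY salary) AS modal_salary
        |FROM employees
        |GROUP BY dept""".stripMargin).show()

    spark.stop()
  }
}
```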

Re: [PR] [SPARK-45796][SQL] Support MODE() WITHIN GROUP (ORDER BY col) [spark]

2023-12-13 Thread via GitHub
beliefer commented on code in PR #44184: URL: https://github.com/apache/spark/pull/44184#discussion_r1426361627 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala: ## @@ -2332,9 +2332,14 @@ class Analyzer(override val catalogManager:

Re: [PR] [SPARK-46402][PYTHON] Add getMessageParameters and getQueryContext support [spark]

2023-12-13 Thread via GitHub
cloud-fan commented on code in PR #44349: URL: https://github.com/apache/spark/pull/44349#discussion_r1426361362 ## python/pyspark/errors/exceptions/base.py: ## @@ -106,10 +111,26 @@ def getMessage(self) -> str: :meth:`PySparkException.getErrorClass`

Re: [PR] [SPARK-46393][SQL] Classify exceptions in the JDBC table catalog [spark]

2023-12-13 Thread via GitHub
MaxGekk commented on PR #44335: URL: https://github.com/apache/spark/pull/44335#issuecomment-1855346272 > At a minimum we could give: _LEGACY_ERROR_TEMP_3064 a proper name and each of these invocations picks a subclass. There are no tests at all for the cases where JDBC op fails. I

Re: [PR] [SPARK-45593][BUILD] Building a runnable distribution from master code running spark-sql raise error [spark]

2023-12-13 Thread via GitHub
LuciferYang commented on code in PR #43436: URL: https://github.com/apache/spark/pull/43436#discussion_r1426355353 ## connector/connect/client/jvm/pom.xml: ## @@ -124,6 +124,10 @@ io.grpc.** + Review Comment: From

Re: [PR] [SPARK-45593][BUILD] Building a runnable distribution from master code running spark-sql raise error [spark]

2023-12-13 Thread via GitHub
LuciferYang commented on code in PR #43436: URL: https://github.com/apache/spark/pull/43436#discussion_r1426350579 ## connector/connect/client/jvm/pom.xml: ## @@ -124,6 +124,10 @@ io.grpc.** + Review Comment: > The

Re: [PR] [SPARK-46392][SQL] Push down Cast expression in DSv2 [spark]

2023-12-13 Thread via GitHub
nchammas commented on PR #44333: URL: https://github.com/apache/spark/pull/44333#issuecomment-1855319808 My apologies. It looks like I made some mistake trying to run a similar `CREATE TABLE` statement. Your original statement was fine. By the way, if I want to reproduce this log you

Re: [PR] [SPARK-46402][PYTHON] Add getMessageParameters and getQueryContext support [spark]

2023-12-13 Thread via GitHub
HyukjinKwon commented on PR #44349: URL: https://github.com/apache/spark/pull/44349#issuecomment-1855311221 cc @zhengruifeng @itholic @grundprinzip @hvanhovell

Re: [PR] [SPARK-46402][PYTHON] Add getMessageParameters and getQueryContext support [spark]

2023-12-13 Thread via GitHub
HyukjinKwon commented on code in PR #44349: URL: https://github.com/apache/spark/pull/44349#discussion_r1426317058 ## python/pyspark/sql/tests/connect/test_utils.py: ## @@ -14,13 +14,16 @@ # See the License for the specific language governing permissions and # limitations

[PR] [SPARK-46402][PYTHON] Add getMessageParameters and getQueryContext support [spark]

2023-12-13 Thread via GitHub
HyukjinKwon opened a new pull request, #44349: URL: https://github.com/apache/spark/pull/44349 ### What changes were proposed in this pull request? This PR adds new APIs with/without Spark Connect as below. - `getMessageParameters` working fine - `getQueryContext` ###
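The new PySpark accessors mirror the JVM-side `SparkThrowable` interface; a rough Scala sketch of the equivalent calls there, for reference (the failing query and printed values are only illustrative):

```scala
import org.apache.spark.SparkThrowable
import org.apache.spark.sql.SparkSession

object ErrorAccessorsSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").getOrCreate()
    try {
      // Any query that fails analysis will do; this one references a missing column.
      spark.sql("SELECT missing_col FROM VALUES (1) AS t(a)").collect()
    } catch {
      case e: SparkThrowable =>
        println(e.getErrorClass)         // error class name
        println(e.getMessageParameters)  // java.util.Map of parameter name -> value
        e.getQueryContext.foreach(ctx => println(ctx.fragment()))  // offending SQL fragment(s)
    } finally {
      spark.stop()
    }
  }
}
```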

Re: [PR] [SPARK-46043][SQL][FOLLOWUP] Remove the catalog and identifier check in DataSourceV2Relation [spark]

2023-12-13 Thread via GitHub
allisonwang-db commented on PR #44348: URL: https://github.com/apache/spark/pull/44348#issuecomment-1855303711 cc @cloud-fan

[PR] [SPARK-46043][SQL][FOLLOWUP] Remove the catalog and identifier check in DataSourceV2Relation [spark]

2023-12-13 Thread via GitHub
allisonwang-db opened a new pull request, #44348: URL: https://github.com/apache/spark/pull/44348 ### What changes were proposed in this pull request? https://github.com/apache/spark/pull/43949 added a check in the `name` method of `DataSourceV2Relation`, which can be overly

Re: [PR] [SPARK-42307][SQL] Adding in a better name for `_LEGACY_ERROR_TEMP_2232` [spark]

2023-12-13 Thread via GitHub
MaxGekk commented on code in PR #44337: URL: https://github.com/apache/spark/pull/44337#discussion_r1426275065 ## common/utils/src/main/resources/error/error-classes.json: ## @@ -2750,6 +2750,12 @@ ], "sqlState" : "42000" }, + "NULL_INDEX" : { Review Comment:

Re: [PR] [WIP][SPARK-46051][INFRA] Cache python deps for linter and documentation [spark]

2023-12-13 Thread via GitHub
nchammas commented on code in PR #43953: URL: https://github.com/apache/spark/pull/43953#discussion_r1426263880 ## dev/infra/Dockerfile: ## @@ -139,3 +139,60 @@ RUN python3.12 -m pip install 'grpcio==1.59.3' 'grpcio-status==1.59.3' 'protobuf # TODO(SPARK-46078) Use official

Re: [PR] [SPARK-46392][SQL] Push down Cast expression in DSv2 [spark]

2023-12-13 Thread via GitHub
monkeyboy123 commented on PR #44333: URL: https://github.com/apache/spark/pull/44333#issuecomment-1855267099 > Could you update your example with a working reproduction? This create table statement is invalid. It partitions on two columns `dt` and `hour` that do not exist in the column

Re: [PR] [SPARK-46392][SQL] Push down Cast expression in DSv2 [spark]

2023-12-13 Thread via GitHub
nchammas commented on PR #44333: URL: https://github.com/apache/spark/pull/44333#issuecomment-1855242821 > CREATE TABLE `tableA`( >   s string) > PARTITIONED BY (  >   `dt` string COMMENT 'date, MMdd',  >   `hour` string COMMENT 'hour, HH') Could you update your example

Re: [PR] [SPARK-45807][SQL] Add createOrReplaceView(..) / replaceView(..) to ViewCatalog [spark]

2023-12-13 Thread via GitHub
cloud-fan commented on code in PR #43677: URL: https://github.com/apache/spark/pull/43677#discussion_r1426238778 ## sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/ViewCatalog.java: ## @@ -140,6 +140,92 @@ View createView( String[] columnComments,

Re: [PR] [SPARK-46380][SQL] Replace current time/date prior to evaluating inline table expressions. [spark]

2023-12-13 Thread via GitHub
cloud-fan commented on code in PR #44316: URL: https://github.com/apache/spark/pull/44316#discussion_r1426236880 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveInlineTables.scala: ## @@ -33,12 +34,14 @@ import

Re: [PR] [SPARK-46401][CORE] Use `!isEmpty()` on `RoaringBitmap` instead of `getCardinality() > 0` in `RemoteBlockPushResolver` [spark]

2023-12-13 Thread via GitHub
LuciferYang commented on PR #44347: URL: https://github.com/apache/spark/pull/44347#issuecomment-1855234334 > Nice catch @LuciferYang ! Looks good to me. > > Why is this still DRAFT ? :-) Set to ready to review :)

Re: [PR] [SPARK-46379][PS][TESTS][FOLLOWUPS] Deduplicate `test_interpolate_error` [spark]

2023-12-13 Thread via GitHub
zhengruifeng closed pull request #44341: [SPARK-46379][PS][TESTS][FOLLOWUPS] Deduplicate `test_interpolate_error` URL: https://github.com/apache/spark/pull/44341

Re: [PR] [SPARK-46379][PS][TESTS][FOLLOWUPS] Deduplicate `test_interpolate_error` [spark]

2023-12-13 Thread via GitHub
zhengruifeng commented on PR #44341: URL: https://github.com/apache/spark/pull/44341#issuecomment-1855209798 merged to master

Re: [PR] [SPARK-46394][SQL] Fix spark.catalog.listDatabases() issues on schemas with special characters when `spark.sql.legacy.keepCommandOutputSchema` set to true [spark]

2023-12-13 Thread via GitHub
cloud-fan closed pull request #44328: [SPARK-46394][SQL] Fix spark.catalog.listDatabases() issues on schemas with special characters when `spark.sql.legacy.keepCommandOutputSchema` set to true URL: https://github.com/apache/spark/pull/44328

Re: [PR] [SPARK-46394][SQL] Fix spark.catalog.listDatabases() issues on schemas with special characters when `spark.sql.legacy.keepCommandOutputSchema` set to true [spark]

2023-12-13 Thread via GitHub
cloud-fan commented on PR #44328: URL: https://github.com/apache/spark/pull/44328#issuecomment-1855208319 thanks, merging to master/3.5!

Re: [PR] [SPARK-45796][SQL] Support MODE() WITHIN GROUP (ORDER BY col) [spark]

2023-12-13 Thread via GitHub
cloud-fan commented on code in PR #44184: URL: https://github.com/apache/spark/pull/44184#discussion_r1426218462 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala: ## @@ -2332,9 +2332,14 @@ class Analyzer(override val catalogManager:

Re: [PR] [SPARK-46401][CORE] Use `!isEmpty()` on `RoaringBitmap` instead of `getCardinality() > 0` in `RemoteBlockPushResolver` [spark]

2023-12-13 Thread via GitHub
mridulm commented on PR #44347: URL: https://github.com/apache/spark/pull/44347#issuecomment-1855185369 +CC @otterc as well

Re: [PR] [SPARK-46401][CORE] Use `!isEmpty()` on RoaringBitmap instead of `getCardinality() > 0` in `RemoteBlockPushResolver` [spark]

2023-12-13 Thread via GitHub
LuciferYang commented on code in PR #44347: URL: https://github.com/apache/spark/pull/44347#discussion_r1426180105 ## common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java: ## @@ -809,7 +809,7 @@ public MergeStatuses

Re: [PR] [SPARK-46294][SQL] Clean up semantics of init vs zero value [spark]

2023-12-13 Thread via GitHub
cloud-fan commented on code in PR #44222: URL: https://github.com/apache/spark/pull/44222#discussion_r1426175626 ## sql/core/src/main/scala/org/apache/spark/sql/execution/metric/SQLMetrics.scala: ## @@ -37,36 +37,47 @@ import

Re: [PR] [SPARK-46294][SQL] Clean up semantics of init vs zero value [spark]

2023-12-13 Thread via GitHub
cloud-fan commented on code in PR #44222: URL: https://github.com/apache/spark/pull/44222#discussion_r1426174905 ## sql/core/src/main/scala/org/apache/spark/sql/execution/metric/SQLMetrics.scala: ## @@ -37,36 +37,47 @@ import

[PR] [SPARK-46401][CORE] Use `!isEmpty()` on RoaringBitmap instead of `getCardinality() > 0` [spark]

2023-12-13 Thread via GitHub
LuciferYang opened a new pull request, #44347: URL: https://github.com/apache/spark/pull/44347 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this
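The change named in the title is small; a standalone sketch of the equivalence it relies on, using plain RoaringBitmap rather than Spark code:

```scala
import org.roaringbitmap.RoaringBitmap

object RoaringBitmapEmptyCheckSketch {
  def main(args: Array[String]): Unit = {
    val bitmap = new RoaringBitmap()
    // On an empty bitmap both forms agree.
    assert(bitmap.isEmpty && bitmap.getCardinality == 0)

    bitmap.add(42)
    // After adding an element they still agree; !isEmpty() states the intent
    // directly instead of going through the cardinality computation.
    assert(!bitmap.isEmpty)
    assert(bitmap.getCardinality > 0)
  }
}
```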

Re: [PR] [SPARK-46294][SQL] Clean up semantics of init vs zero value [spark]

2023-12-13 Thread via GitHub
cloud-fan commented on code in PR #44222: URL: https://github.com/apache/spark/pull/44222#discussion_r1426171943 ## sql/core/src/main/scala/org/apache/spark/sql/execution/metric/SQLMetrics.scala: ## @@ -37,36 +37,47 @@ import

Re: [PR] [SPARK-46294][SQL] Clean up semantics of init vs zero value [spark]

2023-12-13 Thread via GitHub
cloud-fan commented on code in PR #44222: URL: https://github.com/apache/spark/pull/44222#discussion_r1426171810 ## sql/core/src/main/scala/org/apache/spark/sql/execution/metric/SQLMetrics.scala: ## @@ -37,36 +37,47 @@ import

Re: [PR] [SPARK-46032][CORE] Add jars from default session state into isolated session state for spark connect sessions [spark]

2023-12-13 Thread via GitHub
hvanhovell commented on PR #44240: URL: https://github.com/apache/spark/pull/44240#issuecomment-1855116540 @fhalde can you just add the jars to the ArtifactManager when it is initialized?

Re: [PR] [SPARK-46393][SQL] Classify exceptions in the JDBC table catalog [spark]

2023-12-13 Thread via GitHub
srielau commented on PR #44335: URL: https://github.com/apache/spark/pull/44335#issuecomment-1855113908 > @srielau This seems like a lot of work as we need to understand different errors from different databases. Shall we have a general error class name as the fallback? Each JDBC dialect

Re: [PR] [SPARK-46393][SQL] Classify exceptions in the JDBC table catalog [spark]

2023-12-13 Thread via GitHub
cloud-fan commented on PR #44335: URL: https://github.com/apache/spark/pull/44335#issuecomment-1855106022 @srielau This seems like a lot of work as we need to understand different errors from different databases. Shall we have a general error class name as the fallback? Each JDBC dialect

[PR] [WIP] [SPARK-46384][SS][UI] Fix Operation Duration Stack Chart on Structured Streaming Page [spark]

2023-12-13 Thread via GitHub
yaooqinn opened a new pull request, #44346: URL: https://github.com/apache/spark/pull/44346 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How

Re: [PR] [SPARK-46357] Replace incorrect documentation use of setConf with conf.set [spark]

2023-12-13 Thread via GitHub
nchammas commented on PR #44290: URL: https://github.com/apache/spark/pull/44290#issuecomment-1855084056 Thanks for reviewing, @yaooqinn. I didn't realize you were a committer! You may also be interested in reviewing #44300, which relates to #28274 which you previously worked on back in

Re: [PR] [SPARK-45861][PYTHON][DOCS] Add user guide for dataframe creation [spark]

2023-12-13 Thread via GitHub
panbingkun commented on PR #43897: URL: https://github.com/apache/spark/pull/43897#issuecomment-1855078032 > Will the code in this guide be tested by our Python doc tests, by the way? No, but I have manually tested it.

Re: [PR] [MINOR][CORE] Add `@tailrec` to `HadoopFSUtils#shouldFilterOutPath` [spark]

2023-12-13 Thread via GitHub
LuciferYang commented on code in PR #44345: URL: https://github.com/apache/spark/pull/44345#discussion_r1426138762 ## core/src/main/scala/org/apache/spark/util/HadoopFSUtils.scala: ## @@ -349,6 +349,7 @@ private[spark] object HadoopFSUtils extends Logging { private val

Re: [PR] Revert "[SPARK-46377][INFRA] Upgrade labeler action to v5" [spark]

2023-12-13 Thread via GitHub
LuciferYang closed pull request #44344: Revert "[SPARK-46377][INFRA] Upgrade labeler action to v5" URL: https://github.com/apache/spark/pull/44344

[PR] [MINOR][CORE] Add `@tailrec` to `HadoopFSUtils#shouldFilterOutPath` [spark]

2023-12-13 Thread via GitHub
LuciferYang opened a new pull request, #44345: URL: https://github.com/apache/spark/pull/44345 ### What changes were proposed in this pull request? This pr adds the `@scala.annotation.tailrec` annotation to the `shouldFilterOutPath` function in `HadoopFSUtils`. ### Why are the
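For readers unfamiliar with the annotation, a small self-contained sketch (not the actual `HadoopFSUtils` code) of what `@tailrec` buys: the compiler verifies the recursion is in tail position and can be compiled into a loop, and fails the build otherwise.

```scala
import scala.annotation.tailrec

object TailrecSketch {
  // A simplified stand-in for walking up a path's parents; the real method differs.
  @tailrec
  def shouldFilterOut(path: String): Boolean = {
    val sep = path.lastIndexOf('/')
    val name = path.substring(sep + 1)
    if (name.startsWith("_") || name.startsWith(".")) true
    else if (sep <= 0) false
    else shouldFilterOut(path.substring(0, sep))  // tail call, compiled into a loop
  }

  def main(args: Array[String]): Unit = {
    println(shouldFilterOut("/data/_temporary/part-0000"))  // true
    println(shouldFilterOut("/data/year=2023/part-0000"))   // false
  }
}
```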

[PR] Revert "[SPARK-46377][INFRA] Upgrade labeler action to v5" [spark]

2023-12-13 Thread via GitHub
panbingkun opened a new pull request, #44344: URL: https://github.com/apache/spark/pull/44344 This reverts commit 270da6f9b7b331e455d1f0bd1309fb87bc8740ab. ### What changes were proposed in this pull request? ### Why are the changes needed? ###

Re: [PR] [SPARK-46377][INFRA] Upgrade labeler action to v5 [spark]

2023-12-13 Thread via GitHub
panbingkun commented on PR #44309: URL: https://github.com/apache/spark/pull/44309#issuecomment-1855062230 @yaooqinn please let's revert it first. I found some problems. https://github.com/actions/labeler/issues/710

Re: [PR] [SPARK-45796][SQL] Support MODE() WITHIN GROUP (ORDER BY col) [spark]

2023-12-13 Thread via GitHub
beliefer commented on code in PR #44184: URL: https://github.com/apache/spark/pull/44184#discussion_r1426127448 ## sql/core/src/test/resources/sql-functions/sql-expression-schema.md: ## @@ -254,7 +254,7 @@ | org.apache.spark.sql.catalyst.expressions.RLike | regexp_like |

Re: [PR] [SPARK-46377][INFRA] Upgrade labeler action to v5 [spark]

2023-12-13 Thread via GitHub
yaooqinn commented on PR #44309: URL: https://github.com/apache/spark/pull/44309#issuecomment-1855053418 Thanks @panbingkun . Merged to master

Re: [PR] [SPARK-46377][INFRA] Upgrade labeler action to v5 [spark]

2023-12-13 Thread via GitHub
yaooqinn closed pull request #44309: [SPARK-46377][INFRA] Upgrade labeler action to v5 URL: https://github.com/apache/spark/pull/44309

Re: [PR] [SPARK-46357] Replace incorrect documentation use of setConf with conf.set [spark]

2023-12-13 Thread via GitHub
yaooqinn commented on PR #44290: URL: https://github.com/apache/spark/pull/44290#issuecomment-1855052192 Thanks @nchammas, merged to master

Re: [PR] [SPARK-46357] Replace incorrect documentation use of setConf with conf.set [spark]

2023-12-13 Thread via GitHub
yaooqinn closed pull request #44290: [SPARK-46357] Replace incorrect documentation use of setConf with conf.set URL: https://github.com/apache/spark/pull/44290

[PR] [SPARK-46400][CORE][SQL] When there are corrupted files in the local maven repo, retry to skip this cache [spark]

2023-12-13 Thread via GitHub
panbingkun opened a new pull request, #44343: URL: https://github.com/apache/spark/pull/44343 ### What changes were proposed in this pull request? The PR aims to fix a bug. ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change?

[PR] [MINOR][DOCS] Correct the usage example of dataset in Java. [spark]

2023-12-13 Thread via GitHub
Aiden-Dong opened a new pull request, #44342: URL: https://github.com/apache/spark/pull/44342 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How

Re: [PR] [SPARK-46379][PS][TESTS][FOLLOWUPS] Deduplicate `test_interpolate_error` [spark]

2023-12-13 Thread via GitHub
zhengruifeng commented on PR #44341: URL: https://github.com/apache/spark/pull/44341#issuecomment-1854998594 ci: https://github.com/zhengruifeng/spark/actions/runs/7203517827/job/19623517549

Re: [PR] [SPARK-45862][PYTHON][DOCS] Add user guide for basic dataframe operations [spark]

2023-12-13 Thread via GitHub
nchammas commented on code in PR #43972: URL: https://github.com/apache/spark/pull/43972#discussion_r1426079059 ## python/docs/source/user_guide/basic_dataframe_operations.rst: ## @@ -0,0 +1,169 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +or more

[PR] [SPARK-46379][PS][TESTS][FOLLOWUPS] Deduplicate `test_interpolate_error` [spark]

2023-12-13 Thread via GitHub
zhengruifeng opened a new pull request, #44341: URL: https://github.com/apache/spark/pull/44341 ### What changes were proposed in this pull request? this is a followup of https://github.com/apache/spark/pull/44313/, which happened to duplicate `test_interpolate_error` ###

Re: [PR] [SPARK-46228][SQL] Insert window group limit node for limit outside of window [spark]

2023-12-13 Thread via GitHub
zml1206 commented on code in PR #44145: URL: https://github.com/apache/spark/pull/44145#discussion_r1426081697 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/InferWindowGroupLimit.scala: ## @@ -68,10 +72,55 @@ object InferWindowGroupLimit extends

Re: [PR] [SPARK-45861][PYTHON][DOCS] Add user guide for dataframe creation [spark]

2023-12-13 Thread via GitHub
nchammas commented on PR #43897: URL: https://github.com/apache/spark/pull/43897#issuecomment-1854991274 Will the code in this guide be tested by our Python doc tests, by the way?

Re: [PR] [SPARK-45861][PYTHON][DOCS] Add user guide for dataframe creation [spark]

2023-12-13 Thread via GitHub
nchammas commented on code in PR #43897: URL: https://github.com/apache/spark/pull/43897#discussion_r1426075998 ## python/docs/source/user_guide/sql/creating_dataframes.rst: ## @@ -0,0 +1,223 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +or more

Re: [PR] [SPARK-46399][Core] Add exit status to the Application End event for the use of Spark Listener [spark]

2023-12-13 Thread via GitHub
rezasafi commented on PR #44340: URL: https://github.com/apache/spark/pull/44340#issuecomment-1854890434 @squito @vanzin @attilapiros old friends appreciate your review here :-)

[PR] [SPARK-46399][Core] Add exit status to the Application End event for the use of Spark Listener [spark]

2023-12-13 Thread via GitHub
rezasafi opened a new pull request, #44340: URL: https://github.com/apache/spark/pull/44340 ### What changes were proposed in this pull request? Currently SparkListenerApplicationEnd only has a timestamp value and there is no exit status recorded with it. This change will
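A minimal listener sketch showing where the new information would surface; the exit-status field name is not visible in this snippet, so it is hypothetical:

```scala
import org.apache.spark.scheduler.{SparkListener, SparkListenerApplicationEnd}

// Register with: spark.sparkContext.addSparkListener(new AppEndLogger)
class AppEndLogger extends SparkListener {
  override def onApplicationEnd(applicationEnd: SparkListenerApplicationEnd): Unit = {
    // Today the event carries only a timestamp.
    println(s"Application ended at ${applicationEnd.time}")
    // With this change a listener could additionally read an exit status from the
    // event, e.g. something like applicationEnd.exitCode (hypothetical field name).
  }
}
```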

[PR] [SPARK-46398][PYSPARK][TESTS] Test rangeBetween window function (pyspark.sql.window) [spark]

2023-12-13 Thread via GitHub
xinrong-meng opened a new pull request, #44339: URL: https://github.com/apache/spark/pull/44339 ### What changes were proposed in this pull request? Test rangeBetween window function (pyspark.sql.window). ### Why are the changes needed? Subtasks of

Re: [PR] [SPARK-46394][SQL] Fix spark.catalog.listDatabases() issues on schemas with special characters when `spark.sql.legacy.keepCommandOutputSchema` set to true [spark]

2023-12-13 Thread via GitHub
anchovYu commented on code in PR #44328: URL: https://github.com/apache/spark/pull/44328#discussion_r1425951074 ## sql/core/src/test/scala/org/apache/spark/sql/internal/CatalogSuite.scala: ## @@ -167,6 +167,44 @@ class CatalogSuite extends SharedSparkSession with AnalysisTest

Re: [PR] [SPARK-46396][SQL] Timestamp inference should not throw exception [spark]

2023-12-13 Thread via GitHub
gengliangwang commented on PR #44338: URL: https://github.com/apache/spark/pull/44338#issuecomment-1854772764 cc @Hisoka-X as well

[PR] [SPARK-46396][SQL] Timestamp inference should not throw exception [spark]

2023-12-13 Thread via GitHub
gengliangwang opened a new pull request, #44338: URL: https://github.com/apache/spark/pull/44338 ### What changes were proposed in this pull request? When setting `spark.sql.legacy.timeParserPolicy=LEGACY`, Spark will use the LegacyFastTimestampFormatter to infer potential
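A rough reproduction sketch under assumptions: the inline CSV data is invented, and the graceful fallback on unparsable values is the behavior the PR title describes, not a guarantee of the current code.

```scala
import org.apache.spark.sql.SparkSession

object LegacyTimestampInferenceSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").getOrCreate()
    import spark.implicits._

    // The legacy parser policy routes schema inference through the legacy formatter.
    spark.conf.set("spark.sql.legacy.timeParserPolicy", "LEGACY")

    // Inline CSV with one well-formed and one malformed timestamp value.
    val csv = Seq("ts", "2023-12-13 10:00:00", "not-a-timestamp").toDS()
    val df = spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv(csv)

    // With the fix, inference degrades gracefully (e.g. to StringType) instead of throwing.
    df.printSchema()
    spark.stop()
  }
}
```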

Re: [PR] [SPARK-46394][SQL] Fix spark.catalog.listDatabases() issues on schemas with special characters when `spark.sql.legacy.keepCommandOutputSchema` set to true [spark]

2023-12-13 Thread via GitHub
anchovYu commented on code in PR #44328: URL: https://github.com/apache/spark/pull/44328#discussion_r142506 ## sql/core/src/test/scala/org/apache/spark/sql/internal/CatalogSuite.scala: ## @@ -167,6 +167,44 @@ class CatalogSuite extends SharedSparkSession with AnalysisTest

Re: [PR] [SPARK-46394][SQL] Fix spark.catalog.listDatabases() issues on schemas with special characters when `spark.sql.legacy.keepCommandOutputSchema` set to true [spark]

2023-12-13 Thread via GitHub
anchovYu commented on code in PR #44328: URL: https://github.com/apache/spark/pull/44328#discussion_r1425886427 ## sql/core/src/main/scala/org/apache/spark/sql/internal/CatalogImpl.scala: ## @@ -100,7 +109,9 @@ class CatalogImpl(sparkSession: SparkSession) extends Catalog {
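A hedged reproduction sketch of the scenario in the title (the database name is illustrative):

```scala
import org.apache.spark.sql.SparkSession

object ListDatabasesSpecialCharsSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").getOrCreate()

    // The legacy flag from the title, plus a schema name with a special character.
    spark.conf.set("spark.sql.legacy.keepCommandOutputSchema", "true")
    spark.sql("CREATE DATABASE IF NOT EXISTS `test-db`")

    // With the fix, the name should round-trip unchanged (no quoting artifacts).
    spark.catalog.listDatabases().collect().foreach(db => println(db.name))

    spark.stop()
  }
}
```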

Re: [PR] [SPARK-45597][PYTHON][SQL] Support creating table using a Python data source in SQL (DSv2 exec) [spark]

2023-12-13 Thread via GitHub
HyukjinKwon commented on code in PR #44305: URL: https://github.com/apache/spark/pull/44305#discussion_r1425868807 ## sql/core/src/test/scala/org/apache/spark/sql/execution/python/PythonDataSourceSuite.scala: ## @@ -145,34 +148,6 @@ class PythonDataSourceSuite extends QueryTest

Re: [PR] [SPARK-45597][PYTHON][SQL] Support creating table using a Python data source in SQL (DSv2 exec) [spark]

2023-12-13 Thread via GitHub
HyukjinKwon commented on code in PR #44305: URL: https://github.com/apache/spark/pull/44305#discussion_r1425865126 ## sql/core/src/test/scala/org/apache/spark/sql/execution/python/PythonDataSourceSuite.scala: ## @@ -231,8 +212,9 @@ class PythonDataSourceSuite extends QueryTest

[PR] [WIP][SPARK-42307][SQL] Adding in a better name for `_LEGACY_ERROR_TEMP_2232` [spark]

2023-12-13 Thread via GitHub
hannahkamundson opened a new pull request, #44337: URL: https://github.com/apache/spark/pull/44337 ### What changes were proposed in this pull request? This adds a better name for the error type `_LEGACY_ERROR_TEMP_2232`. ### Why are the changes needed? All that was changed

Re: [PR] [SPARK-45597][PYTHON][SQL] Support creating table using a Python data source in SQL (DSv2 exec) [spark]

2023-12-13 Thread via GitHub
cloud-fan commented on code in PR #44305: URL: https://github.com/apache/spark/pull/44305#discussion_r1425859733 ## sql/core/src/test/scala/org/apache/spark/sql/execution/python/PythonDataSourceSuite.scala: ## @@ -231,8 +212,9 @@ class PythonDataSourceSuite extends QueryTest

Re: [PR] [SPARK-45597][PYTHON][SQL] Support creating table using a Python data source in SQL (DSv2 exec) [spark]

2023-12-13 Thread via GitHub
cloud-fan commented on code in PR #44305: URL: https://github.com/apache/spark/pull/44305#discussion_r1425859297 ## sql/core/src/test/scala/org/apache/spark/sql/execution/python/PythonDataSourceSuite.scala: ## @@ -145,34 +148,6 @@ class PythonDataSourceSuite extends QueryTest

Re: [PR] [SPARK-45597][PYTHON][SQL] Support creating table using a Python data source in SQL (DSv2 exec) [spark]

2023-12-13 Thread via GitHub
HyukjinKwon commented on code in PR #44305: URL: https://github.com/apache/spark/pull/44305#discussion_r1425856795 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala: ## @@ -708,17 +732,18 @@ object DataSource extends Logging { def

Re: [PR] [SPARK-46360][PYTHON][FOLLOWUP][DOCS] Add `getMessage` to API reference [spark]

2023-12-13 Thread via GitHub
HyukjinKwon closed pull request #44308: [SPARK-46360][PYTHON][FOLLOWUP][DOCS] Add `getMessage` to API reference URL: https://github.com/apache/spark/pull/44308

Re: [PR] [SPARK-46393][SQL] Classify exceptions in the JDBC table catalog [spark]

2023-12-13 Thread via GitHub
srielau commented on PR #44335: URL: https://github.com/apache/spark/pull/44335#issuecomment-1854656969 @MaxGekk Is this just doing: /** * Gets a dialect exception, classifies it and wraps it by `AnalysisException`. * @param message The error message to be placed to the

Re: [PR] [SPARK-46360][PYTHON][FOLLOWUP][DOCS] Add `getMessage` to API reference [spark]

2023-12-13 Thread via GitHub
HyukjinKwon commented on PR #44308: URL: https://github.com/apache/spark/pull/44308#issuecomment-1854656696 Merged to master.

Re: [PR] [SPARK-46385][PYTHON][TESTS] Test aggregate functions for groups (pyspark.sql.group) [spark]

2023-12-13 Thread via GitHub
HyukjinKwon closed pull request #44322: [SPARK-46385][PYTHON][TESTS] Test aggregate functions for groups (pyspark.sql.group) URL: https://github.com/apache/spark/pull/44322

Re: [PR] [SPARK-46385][PYTHON][TESTS] Test aggregate functions for groups (pyspark.sql.group) [spark]

2023-12-13 Thread via GitHub
HyukjinKwon commented on PR #44322: URL: https://github.com/apache/spark/pull/44322#issuecomment-1854649472 Merged to master.

Re: [PR] [SPARK-42332][SQL] Changing the require to a SparkException in ComplexTypeMergingExpression [spark]

2023-12-13 Thread via GitHub
hannahkamundson commented on code in PR #44336: URL: https://github.com/apache/spark/pull/44336#discussion_r1425846359 ## common/utils/src/main/scala/org/apache/spark/SparkException.scala: ## @@ -106,6 +106,20 @@ object SparkException { messageParameters = Map("message"

Re: [PR] [SPARK-46387][PS][DOCS] Add an information of `isocalendar` into migration guide [spark]

2023-12-13 Thread via GitHub
HyukjinKwon closed pull request #44325: [SPARK-46387][PS][DOCS] Add an information of `isocalendar` into migration guide URL: https://github.com/apache/spark/pull/44325

Re: [PR] [SPARK-46387][PS][DOCS] Add an information of `isocalendar` into migration guide [spark]

2023-12-13 Thread via GitHub
HyukjinKwon commented on PR #44325: URL: https://github.com/apache/spark/pull/44325#issuecomment-1854646588 Merged to master.

Re: [PR] [SPARK-45807][SQL] Improve ViewCatalog API [spark]

2023-12-13 Thread via GitHub
HyukjinKwon commented on code in PR #44330: URL: https://github.com/apache/spark/pull/44330#discussion_r1425842464 ## sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/ViewInfo.java: ## @@ -0,0 +1,47 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] [SPARK-45597][PYTHON][SQL] Support creating table using a Python data source in SQL (DSv2 exec) [spark]

2023-12-13 Thread via GitHub
cloud-fan commented on code in PR #44305: URL: https://github.com/apache/spark/pull/44305#discussion_r1425842196 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala: ## @@ -708,17 +732,18 @@ object DataSource extends Logging { def

Re: [PR] [SPARK-45798][CONNECT] Followup: add serverSessionId to SessionHolderInfo [spark]

2023-12-13 Thread via GitHub
HyukjinKwon closed pull request #44334: [SPARK-45798][CONNECT] Followup: add serverSessionId to SessionHolderInfo URL: https://github.com/apache/spark/pull/44334

Re: [PR] [SPARK-45798][CONNECT] Followup: add serverSessionId to SessionHolderInfo [spark]

2023-12-13 Thread via GitHub
HyukjinKwon commented on PR #44334: URL: https://github.com/apache/spark/pull/44334#issuecomment-1854640582 Merged to master.

Re: [PR] [SPARK-42332][SQL] Changing the require to a SparkException in ComplexTypeMergingExpression [spark]

2023-12-13 Thread via GitHub
HyukjinKwon commented on code in PR #44336: URL: https://github.com/apache/spark/pull/44336#discussion_r1425839094 ## common/utils/src/main/scala/org/apache/spark/SparkException.scala: ## @@ -106,6 +106,20 @@ object SparkException { messageParameters = Map("message" ->

Re: [PR] [SPARK-46394][SQL] Fix spark.catalog.listDatabases() issues on schemas with special characters when `spark.sql.legacy.keepCommandOutputSchema` set to true [spark]

2023-12-13 Thread via GitHub
cloud-fan commented on code in PR #44328: URL: https://github.com/apache/spark/pull/44328#discussion_r1425836204 ## sql/core/src/test/scala/org/apache/spark/sql/internal/CatalogSuite.scala: ## @@ -167,6 +167,44 @@ class CatalogSuite extends SharedSparkSession with AnalysisTest

Re: [PR] [SPARK-46394][SQL] Fix spark.catalog.listDatabases() issues on schemas with special characters when `spark.sql.legacy.keepCommandOutputSchema` set to true [spark]

2023-12-13 Thread via GitHub
cloud-fan commented on code in PR #44328: URL: https://github.com/apache/spark/pull/44328#discussion_r1425834404 ## sql/core/src/main/scala/org/apache/spark/sql/internal/CatalogImpl.scala: ## @@ -100,7 +109,9 @@ class CatalogImpl(sparkSession: SparkSession) extends Catalog {

Re: [PR] [SPARK-46153][SQL] XML: Add TimestampNTZType support [spark]

2023-12-13 Thread via GitHub
HyukjinKwon closed pull request #44329: [SPARK-46153][SQL] XML: Add TimestampNTZType support URL: https://github.com/apache/spark/pull/44329

Re: [PR] [SPARK-46153][SQL] XML: Add TimestampNTZType support [spark]

2023-12-13 Thread via GitHub
HyukjinKwon commented on PR #44329: URL: https://github.com/apache/spark/pull/44329#issuecomment-1854623125 Merged to master.

Re: [PR] [SPARK-45502][BUILD] Upgrade Kafka to 3.6.1 [spark]

2023-12-13 Thread via GitHub
dongjoon-hyun commented on code in PR #44312: URL: https://github.com/apache/spark/pull/44312#discussion_r1425785574 ## connector/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaMicroBatchSourceSuite.scala: ## @@ -2691,7 +2691,9 @@ class KafkaSourceStressSuite

Re: [PR] [SPARK-42332][SQL] Changing the require to a SparkException in ComplexTypeMergingExpression [spark]

2023-12-13 Thread via GitHub
hannahkamundson commented on code in PR #44336: URL: https://github.com/apache/spark/pull/44336#discussion_r1425757933 ## common/utils/src/main/scala/org/apache/spark/SparkException.scala: ## @@ -106,6 +106,20 @@ object SparkException { messageParameters = Map("message"

Re: [PR] [SPARK-42332][SQL] Changing the require to a SparkException in ComplexTypeMergingExpression [spark]

2023-12-13 Thread via GitHub
MaxGekk commented on code in PR #44336: URL: https://github.com/apache/spark/pull/44336#discussion_r1425719892 ## common/utils/src/main/scala/org/apache/spark/SparkException.scala: ## @@ -106,6 +106,20 @@ object SparkException { messageParameters = Map("message" -> msg),

[PR] [WIP][SPARK-42332][SQL] Changing the require to a SparkException in ComplexTypeMergingExpression [spark]

2023-12-13 Thread via GitHub
hannahkamundson opened a new pull request, #44336: URL: https://github.com/apache/spark/pull/44336 ### What changes were proposed in this pull request? I changed 2 `require`s to `SparkIllegalArgumentException` ### Why are the changes needed? All user facing exceptions should

[PR] [WIP][SQL] Classify exceptions in the JDBC table catalog [spark]

2023-12-13 Thread via GitHub
MaxGekk opened a new pull request, #44335: URL: https://github.com/apache/spark/pull/44335 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this

Re: [PR] [SPARK-45798][CONNECT] Followup: add serverSessionId to SessionHolderInfo [spark]

2023-12-13 Thread via GitHub
juliuszsompolski commented on PR #44334: URL: https://github.com/apache/spark/pull/44334#issuecomment-1854138594 cc @grundprinzip @hvanhovell

Re: [PR] [SPARK-46361][PYTHON][CORE] Spark dataset chunk read api [spark]

2023-12-13 Thread via GitHub
WeichenXu123 commented on code in PR #44294: URL: https://github.com/apache/spark/pull/44294#discussion_r1425410174 ## core/src/main/scala/org/apache/spark/api/python/CachedArrowBatchServer.scala: ## @@ -0,0 +1,104 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] [SPARK-46361][PYTHON][CORE] Spark dataset chunk read api [spark]

2023-12-13 Thread via GitHub
WeichenXu123 commented on code in PR #44294: URL: https://github.com/apache/spark/pull/44294#discussion_r1425407299 ## core/src/main/scala/org/apache/spark/api/python/CachedArrowBatchServer.scala: ## @@ -0,0 +1,104 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] [SPARK-46380][SQL] Replace current time/date prior to evaluating inline table expressions. [spark]

2023-12-13 Thread via GitHub
dbatomic commented on PR #44316: URL: https://github.com/apache/spark/pull/44316#issuecomment-1853979737 > I think we need a discussion. > > ``` > SELECT COUNT(DISTINCT ct) FROM VALUES > (CURRENT_TIMESTAMP()), > (CURRENT_TIMESTAMP()), > (CURRENT_TIMESTAMP()) as data(ct)
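The query under discussion, wrapped so it can be run directly; under the semantics this PR proposes (current time substituted before the inline table is evaluated), all three values would be identical and the count would be 1:

```scala
import org.apache.spark.sql.SparkSession

object InlineTableCurrentTimestampSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").getOrCreate()

    // With current time/date replaced prior to evaluating the inline table,
    // every row sees the same timestamp, so COUNT(DISTINCT ct) should be 1.
    spark.sql(
      """SELECT COUNT(DISTINCT ct) FROM VALUES
        |  (CURRENT_TIMESTAMP()),
        |  (CURRENT_TIMESTAMP()),
        |  (CURRENT_TIMESTAMP()) AS data(ct)""".stripMargin).show()

    spark.stop()
  }
}
```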

Re: [PR] [SPARK-46361][PYTHON][CORE] Spark dataset chunk read api [spark]

2023-12-13 Thread via GitHub
Ngone51 commented on code in PR #44294: URL: https://github.com/apache/spark/pull/44294#discussion_r1425340638 ## core/src/main/scala/org/apache/spark/SparkContext.scala: ## @@ -486,6 +493,12 @@ class SparkContext(config: SparkConf) extends Logging { _env =

Re: [PR] [SPARK-46246] EXECUTE IMMEDIATE SQL support [spark]

2023-12-13 Thread via GitHub
milastdbx commented on code in PR #44093: URL: https://github.com/apache/spark/pull/44093#discussion_r1425342125 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/executeImmediate.scala: ## @@ -0,0 +1,186 @@ +/* + * Licensed to the Apache Software Foundation
