Re: [PR] [SPARK-46216][CORE] Improve `FileSystemPersistenceEngine` to support compressions [spark]

2023-12-02 Thread via GitHub
dongjoon-hyun commented on PR #44129: URL: https://github.com/apache/spark/pull/44129#issuecomment-1837400198 Thank you so much. I addressed your comment via converting `spark.deploy.recoveryCompressionCodec` as `optionalString`. And, the default is no compression. -- This is an automate

Re: [PR] [SPARK-46216][CORE] Improve `FileSystemPersistenceEngine` to support compressions [spark]

2023-12-02 Thread via GitHub
dongjoon-hyun commented on code in PR #44129: URL: https://github.com/apache/spark/pull/44129#discussion_r1413018463 ## core/src/main/scala/org/apache/spark/internal/config/Deploy.scala: ## @@ -39,6 +41,14 @@ private[spark] object Deploy { .checkValues(RecoverySerializer.va

Re: [PR] [SPARK-46216][CORE] Improve `FileSystemPersistenceEngine` to support compressions [spark]

2023-12-02 Thread via GitHub
HyukjinKwon commented on code in PR #44129: URL: https://github.com/apache/spark/pull/44129#discussion_r1413017750 ## core/src/main/scala/org/apache/spark/internal/config/Deploy.scala: ## @@ -39,6 +41,14 @@ private[spark] object Deploy { .checkValues(RecoverySerializer.valu

Re: [PR] [SPARK-46069][SQL] Support unwrap timestamp type to date type [spark]

2023-12-02 Thread via GitHub
HyukjinKwon commented on code in PR #43982: URL: https://github.com/apache/spark/pull/43982#discussion_r1413016701 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/UnwrapCastInBinaryComparison.scala: ## @@ -138,6 +138,11 @@ object UnwrapCastInBinaryComparis

Re: [PR] [SPARK-46069][SQL] Support unwrap timestamp type to date type [spark]

2023-12-02 Thread via GitHub
HyukjinKwon commented on code in PR #43982: URL: https://github.com/apache/spark/pull/43982#discussion_r1413015199 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/UnwrapCastInBinaryComparison.scala: ## @@ -138,6 +138,11 @@ object UnwrapCastInBinaryComparis

Re: [PR] [SPARK-46069][SQL] Support unwrap timestamp type to date type [spark]

2023-12-02 Thread via GitHub
HyukjinKwon commented on code in PR #43982: URL: https://github.com/apache/spark/pull/43982#discussion_r1413014878 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/UnwrapCastInBinaryComparison.scala: ## @@ -138,6 +138,11 @@ object UnwrapCastInBinaryComparis

Re: [PR] [SPARK-46216][CORE] Improve `FileSystemPersistenceEngine` to support compressions [spark]

2023-12-02 Thread via GitHub
dongjoon-hyun commented on PR #44129: URL: https://github.com/apache/spark/pull/44129#issuecomment-1837388900 Could you review this when you have some time, @HyukjinKwon ?

Re: [PR] [SPARK-46208][PS][DOCS] Adding a link for latest Pandas API specifications. [spark]

2023-12-02 Thread via GitHub
HyukjinKwon closed pull request #44115: [SPARK-46208][PS][DOCS] Adding a link for latest Pandas API specifications. URL: https://github.com/apache/spark/pull/44115

Re: [PR] [SPARK-46208][PS][DOCS] Adding a link for latest Pandas API specifications. [spark]

2023-12-02 Thread via GitHub
HyukjinKwon commented on PR #44115: URL: https://github.com/apache/spark/pull/44115#issuecomment-1837387999 Merged to master.

Re: [PR] [SPARK-45786][SQL][3.3] Fix inaccurate Decimal multiplication and division results [spark]

2023-12-02 Thread via GitHub
dongjoon-hyun commented on PR #43705: URL: https://github.com/apache/spark/pull/43705#issuecomment-1837378858 In any case, Apache Spark 3.3 will reach the end of life in two weeks. - https://lists.apache.org/thread/ml25jdzmtcl8r6fhg849zzmqz82qh3jw (`Apache Spark 3.3.4 EOL Release?`)

Re: [PR] [SPARK-46206][PS] Use a narrower scope exception for SQL processor [spark]

2023-12-02 Thread via GitHub
dongjoon-hyun commented on PR #44114: URL: https://github.com/apache/spark/pull/44114#issuecomment-1837378161 Merged to master for Apache Spark 4. Thank you, @itholic .

Re: [PR] [SPARK-46206][PS] Use a narrower scope exception for SQL processor [spark]

2023-12-02 Thread via GitHub
dongjoon-hyun closed pull request #44114: [SPARK-46206][PS] Use a narrower scope exception for SQL processor URL: https://github.com/apache/spark/pull/44114

Re: [PR] [SPARK-46212][CORE][SQL][SS][CONNECT][MLLIB][GRAPHX][DSTREAM][PROTOBUF][EXAMPLES] Use other functions to simplify the code pattern of `s.c.MapOps#view.mapValues` [spark]

2023-12-02 Thread via GitHub
dongjoon-hyun commented on PR #44122: URL: https://github.com/apache/spark/pull/44122#issuecomment-1837377737 Merged to master for Apache Spark 4.

Re: [PR] [SPARK-46212][CORE][SQL][SS][CONNECT][MLLIB][GRAPHX][DSTREAM][PROTOBUF][EXAMPLES] Use other functions to simplify the code pattern of `s.c.MapOps#view.mapValues` [spark]

2023-12-02 Thread via GitHub
dongjoon-hyun closed pull request #44122: [SPARK-46212][CORE][SQL][SS][CONNECT][MLLIB][GRAPHX][DSTREAM][PROTOBUF][EXAMPLES] Use other functions to simplify the code pattern of `s.c.MapOps#view.mapValues` URL: https://github.com/apache/spark/pull/44122

Re: [PR] [SPARK-46069][SQL] Support unwrap timestamp type to date type [spark]

2023-12-02 Thread via GitHub
wangyum commented on PR #43982: URL: https://github.com/apache/spark/pull/43982#issuecomment-1837365342 Merged to master.

Re: [PR] [SPARK-46069][SQL] Support unwrap timestamp type to date type [spark]

2023-12-02 Thread via GitHub
wangyum closed pull request #43982: [SPARK-46069][SQL] Support unwrap timestamp type to date type URL: https://github.com/apache/spark/pull/43982

Re: [PR] [SPARK-46215][CORE] Improve `FileSystemPersistenceEngine` to allow nonexistent parents [spark]

2023-12-02 Thread via GitHub
dongjoon-hyun commented on PR #44127: URL: https://github.com/apache/spark/pull/44127#issuecomment-1837364849 Merged to master for Apache Spark 4.0.0

Re: [PR] [SPARK-46215][CORE] Improve `FileSystemPersistenceEngine` to allow nonexistent parents [spark]

2023-12-02 Thread via GitHub
dongjoon-hyun closed pull request #44127: [SPARK-46215][CORE] Improve `FileSystemPersistenceEngine` to allow nonexistent parents URL: https://github.com/apache/spark/pull/44127

Re: [PR] [SPARK-46215][CORE] Improve `FileSystemPersistenceEngine` to allow nonexistent parents [spark]

2023-12-02 Thread via GitHub
dongjoon-hyun commented on PR #44127: URL: https://github.com/apache/spark/pull/44127#issuecomment-1837364358 Thank you so much, @wangyum !

Re: [PR] [SPARK-46215][CORE] Improve `FileSystemPersistenceEngine` to allow nonexistent parents [spark]

2023-12-02 Thread via GitHub
dongjoon-hyun commented on PR #44127: URL: https://github.com/apache/spark/pull/44127#issuecomment-1837345014 Could you review this PR, @wangyum ?

Re: [PR] [SPARK-43228][SQL] Join keys also match PartitioningCollection in CoalesceBucketsInJoin [spark]

2023-12-02 Thread via GitHub
wangyum commented on PR #44128: URL: https://github.com/apache/spark/pull/44128#issuecomment-1837343886 Thanks. Merged to master.

Re: [PR] [SPARK-43228][SQL] Join keys also match PartitioningCollection in CoalesceBucketsInJoin [spark]

2023-12-02 Thread via GitHub
wangyum closed pull request #44128: [SPARK-43228][SQL] Join keys also match PartitioningCollection in CoalesceBucketsInJoin URL: https://github.com/apache/spark/pull/44128

[PR] [SPARK-46216][CORE] Improve `FileSystemPersistenceEngine` to support compressions [spark]

2023-12-02 Thread via GitHub
dongjoon-hyun opened a new pull request, #44129: URL: https://github.com/apache/spark/pull/44129 … ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change?

Re: [PR] [SPARK-46208][PS][DOCS] Adding a link for latest Pandas API specifications. [spark]

2023-12-02 Thread via GitHub
dongjoon-hyun commented on code in PR #44115: URL: https://github.com/apache/spark/pull/44115#discussion_r1412910434 ## python/docs/source/reference/pyspark.pandas/index.rst: ## @@ -23,7 +23,7 @@ Pandas API on Spark This page gives an overview of all public pandas API on Spark.

Re: [PR] [SPARK-40609][SQL] Unwrap cast in the join condition to unlock bucketed read [spark]

2023-12-02 Thread via GitHub
wankunde closed pull request #44120: [SPARK-40609][SQL] Unwrap cast in the join condition to unlock bucketed read URL: https://github.com/apache/spark/pull/44120

Re: [PR] [SPARK-46206][PS] Use a narrower scope exception for SQL processor [spark]

2023-12-02 Thread via GitHub
itholic commented on PR #44114: URL: https://github.com/apache/spark/pull/44114#issuecomment-1837310856 cc @HyukjinKwon @zhengruifeng

Re: [PR] [SPARK-46208][PS][DOCS] Use specific Pandas version for API specifications. [spark]

2023-12-02 Thread via GitHub
itholic commented on code in PR #44115: URL: https://github.com/apache/spark/pull/44115#discussion_r1412909844 ## python/docs/source/reference/pyspark.pandas/index.rst: ## @@ -23,7 +23,7 @@ Pandas API on Spark This page gives an overview of all public pandas API on Spark. ..

[PR] [SPARK-43228][SQL] Join keys also match PartitioningCollection in CoalesceBucketsInJoin [spark]

2023-12-02 Thread via GitHub
wankunde opened a new pull request, #44128: URL: https://github.com/apache/spark/pull/44128 ### What changes were proposed in this pull request? This PR updates `CoalesceBucketsInJoin.satisfiesOutputPartitioning` to support matching `PartitioningCollection`. A common case is that we a

Re: [PR] [SPARK-46208][PS][DOCS] Use specific Pandas version for API specifications. [spark]

2023-12-02 Thread via GitHub
itholic commented on PR #44115: URL: https://github.com/apache/spark/pull/44115#issuecomment-1837309501 Sounds reasonable to me. Let me just close this ticket.

Re: [PR] [SPARK-46208][PS][DOCS] Use specific Pandas version for API specifications. [spark]

2023-12-02 Thread via GitHub
itholic closed pull request #44115: [SPARK-46208][PS][DOCS] Use specific Pandas version for API specifications. URL: https://github.com/apache/spark/pull/44115

Re: [PR] [SPARK-46213][PYTHON] Introduce `PySparkImportError` for error framework [spark]

2023-12-02 Thread via GitHub
itholic commented on PR #44123: URL: https://github.com/apache/spark/pull/44123#issuecomment-1837309298 > Since I'm not sure, I simply remove the following part from the PR description. Yeah, that was my mistake. Thanks for the correction!

Re: [PR] [SPARK-46215][CORE] Improve `FileSystemPersistenceEngine` to allow nonexistent parents [spark]

2023-12-02 Thread via GitHub
dongjoon-hyun commented on PR #44127: URL: https://github.com/apache/spark/pull/44127#issuecomment-1837308671 Could you review this PR, @HyukjinKwon ?

Re: [PR] [SPARK-46208][PS][DOCS] Use specific Pandas version for API specifications. [spark]

2023-12-02 Thread via GitHub
HyukjinKwon commented on code in PR #44115: URL: https://github.com/apache/spark/pull/44115#discussion_r1412908509 ## python/docs/source/reference/pyspark.pandas/index.rst: ## @@ -23,7 +23,7 @@ Pandas API on Spark This page gives an overview of all public pandas API on Spark.

Re: [PR] [SPARK-43536][CORE] Fixing statsd sink reporter [spark]

2023-12-02 Thread via GitHub
github-actions[bot] closed pull request #41199: [SPARK-43536][CORE] Fixing statsd sink reporter URL: https://github.com/apache/spark/pull/41199

Re: [PR] [WIP][SPARK-44943][SQL] Fix overflow detection logic in conv [spark]

2023-12-02 Thread via GitHub
github-actions[bot] commented on PR #42652: URL: https://github.com/apache/spark/pull/42652#issuecomment-1837289982 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

Re: [PR] [SPARK-43521][SQL] Add `CREATE TABLE LIKE FILE` statement [spark]

2023-12-02 Thread via GitHub
github-actions[bot] commented on PR #41251: URL: https://github.com/apache/spark/pull/41251#issuecomment-1837289989 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

Re: [PR] [SPARK-46215][CORE] Improve `FileSystemPersistenceEngine` to allow non-exist parents [spark]

2023-12-02 Thread via GitHub
dongjoon-hyun commented on PR #44127: URL: https://github.com/apache/spark/pull/44127#issuecomment-1837282489 Could you review this when you have some time, @LuciferYang ?

[PR] [SPARK-46215][CORE] Improve `FileSystemPersistenceEngine` to allow non-exist parents [spark]

2023-12-02 Thread via GitHub
dongjoon-hyun opened a new pull request, #44127: URL: https://github.com/apache/spark/pull/44127 ### What changes were proposed in this pull request? This PR aims to improve `FileSystemPersistenceEngine` to allow non-exist parents ### Why are the changes needed? To preve

Re: [PR] [SPARK-46213][PYTHON] Introduce `PySparkImportError` for error framework [spark]

2023-12-02 Thread via GitHub
dongjoon-hyun closed pull request #44123: [SPARK-46213][PYTHON] Introduce `PySparkImportError` for error framework URL: https://github.com/apache/spark/pull/44123

Re: [PR] [SPARK-46213][PYTHON] Introduce `PySparkImportError` for error framework [spark]

2023-12-02 Thread via GitHub
dongjoon-hyun commented on PR #44123: URL: https://github.com/apache/spark/pull/44123#issuecomment-1837255140 Since I'm not sure, I simply remove this part, `to replace `IndexError` in pyspark.sql.*.` from the PR description. Merged to master~ Thank you, @itholic .

Re: [PR] [MINOR][DOCS] Add `since` tag for `Scan.reportDriverMetrics` [spark]

2023-12-02 Thread via GitHub
dongjoon-hyun closed pull request #44116: [MINOR][DOCS] Add `since` tag for `Scan.reportDriverMetrics` URL: https://github.com/apache/spark/pull/44116

Re: [PR] [SPARK-41532][CONNECT][PYTHON][FOLLOWUP] Expose `SessionNotSameException` as PySpark exceptions [spark]

2023-12-02 Thread via GitHub
dongjoon-hyun commented on PR #44121: URL: https://github.com/apache/spark/pull/44121#issuecomment-1837237412 Merged to master. Thank you, @itholic .

Re: [PR] [SPARK-41532][CONNECT][PYTHON][FOLLOWUP] Expose `SessionNotSameException` as PySpark exceptions [spark]

2023-12-02 Thread via GitHub
dongjoon-hyun closed pull request #44121: [SPARK-41532][CONNECT][PYTHON][FOLLOWUP] Expose `SessionNotSameException` as PySpark exceptions URL: https://github.com/apache/spark/pull/44121

Re: [PR] [SPARK-45975][SQL][TESTS][3.5] Reset storeAssignmentPolicy to original [spark]

2023-12-02 Thread via GitHub
LuciferYang commented on PR #44126: URL: https://github.com/apache/spark/pull/44126#issuecomment-1837235615 Thanks @dongjoon-hyun ~

Re: [PR] [SPARK-46210][K8S][DOCS] Update `YuniKorn` docs with v1.4 [spark]

2023-12-02 Thread via GitHub
LuciferYang commented on PR #44117: URL: https://github.com/apache/spark/pull/44117#issuecomment-1837235510 Seems like I'm one step ahead, hahaha ~

Re: [PR] [SPARK-46210][K8S][DOCS] Update `YuniKorn` docs with v1.4 [spark]

2023-12-02 Thread via GitHub
dongjoon-hyun commented on PR #44117: URL: https://github.com/apache/spark/pull/44117#issuecomment-1837235373 Oh, you are faster than me. Thank you a lot!

Re: [PR] [SPARK-46210][K8S][DOCS] Update `YuniKorn` docs with v1.4 [spark]

2023-12-02 Thread via GitHub
LuciferYang closed pull request #44117: [SPARK-46210][K8S][DOCS] Update `YuniKorn` docs with v1.4 URL: https://github.com/apache/spark/pull/44117

Re: [PR] [SPARK-46210][K8S][DOCS] Update `YuniKorn` docs with v1.4 [spark]

2023-12-02 Thread via GitHub
dongjoon-hyun commented on PR #44117: URL: https://github.com/apache/spark/pull/44117#issuecomment-1837235185 Thank you, @LuciferYang . Merged to master.

Re: [PR] [SPARK-46205][CORE][TESTS][FOLLOWUP] Simplify PersistenceEngineBenchmark [spark]

2023-12-02 Thread via GitHub
dongjoon-hyun commented on PR #44118: URL: https://github.com/apache/spark/pull/44118#issuecomment-1837234980 Thank you, @LuciferYang . Merged to master.

Re: [PR] [SPARK-45975][SQL][TESTS][3.5] Reset storeAssignmentPolicy to original [spark]

2023-12-02 Thread via GitHub
dongjoon-hyun commented on PR #44126: URL: https://github.com/apache/spark/pull/44126#issuecomment-1837234889 Merged to branch-3.5.

Re: [PR] [SPARK-45975][SQL][TESTS][3.5] Reset storeAssignmentPolicy to original [spark]

2023-12-02 Thread via GitHub
dongjoon-hyun closed pull request #44126: [SPARK-45975][SQL][TESTS][3.5] Reset storeAssignmentPolicy to original URL: https://github.com/apache/spark/pull/44126

Re: [PR] [SPARK-46205][CORE][TESTS][FOLLOWUP] Simplify PersistenceEngineBenchmark [spark]

2023-12-02 Thread via GitHub
dongjoon-hyun closed pull request #44118: [SPARK-46205][CORE][TESTS][FOLLOWUP] Simplify PersistenceEngineBenchmark URL: https://github.com/apache/spark/pull/44118

Re: [PR] [SPARK-45943][SQL] Move DetermineTableStats to resolution rules [spark]

2023-12-02 Thread via GitHub
dongjoon-hyun commented on PR #43867: URL: https://github.com/apache/spark/pull/43867#issuecomment-1837234483 Thank you, @LuciferYang .

Re: [PR] [SPARK-46210][K8S][DOCS] Update `YuniKorn` docs with v1.4 [spark]

2023-12-02 Thread via GitHub
dongjoon-hyun commented on PR #44117: URL: https://github.com/apache/spark/pull/44117#issuecomment-1837232986 Could you review this doc PR, @LuciferYang ?

Re: [PR] [SPARK-46205][CORE][TESTS][FOLLOWUP] Simplify PersistenceEngineBenchmark [spark]

2023-12-02 Thread via GitHub
dongjoon-hyun commented on PR #44118: URL: https://github.com/apache/spark/pull/44118#issuecomment-1837232922 Could you review this PR, @LuciferYang ?

Re: [PR] [SPARK-46092][SQL] Don't push down Parquet row group filters that overflow [spark]

2023-12-02 Thread via GitHub
dongjoon-hyun commented on PR #44006: URL: https://github.com/apache/spark/pull/44006#issuecomment-1837232773 Merged to master for Apache Spark 4.0.0. Could you make backporting PRs to make sure that all tests pass in the release branches, @johanl-db ? Thank you, @johanl-db

Re: [PR] [SPARK-45975][SQL][TESTS][3.5] Reset storeAssignmentPolicy to original [spark]

2023-12-02 Thread via GitHub
LuciferYang commented on PR #44126: URL: https://github.com/apache/spark/pull/44126#issuecomment-1837232251 @dongjoon-hyun For the master branch, this PR was merged before https://github.com/apache/spark/pull/43867, so the test failure will not be reproduced. But for branch-3.5, due t

Re: [PR] [SPARK-46092][SQL] Don't push down Parquet row group filters that overflow [spark]

2023-12-02 Thread via GitHub
dongjoon-hyun closed pull request #44006: [SPARK-46092][SQL] Don't push down Parquet row group filters that overflow URL: https://github.com/apache/spark/pull/44006

Re: [PR] [SPARK-46214][BUILD] Upgrade commons-io to 2.15.1 [spark]

2023-12-02 Thread via GitHub
dongjoon-hyun commented on PR #44125: URL: https://github.com/apache/spark/pull/44125#issuecomment-1837231001 Thank you, @panbingkun and @LuciferYang .

Re: [PR] [SPARK-46214][BUILD] Upgrade commons-io to 2.15.1 [spark]

2023-12-02 Thread via GitHub
dongjoon-hyun closed pull request #44125: [SPARK-46214][BUILD] Upgrade commons-io to 2.15.1 URL: https://github.com/apache/spark/pull/44125

Re: [PR] [SPARK-45975][SQL][TESTS][3.5] Reset storeAssignmentPolicy to original [spark]

2023-12-02 Thread via GitHub
dongjoon-hyun commented on PR #44126: URL: https://github.com/apache/spark/pull/44126#issuecomment-1837230461 Ack, @LuciferYang . - Did you see the same failure in `master` branch? - If this doesn't help `branch-3.5`, we can simply close this PR.

Re: [PR] [SPARK-46173] Skipping trimAll call during date parsing [spark]

2023-12-02 Thread via GitHub
MaxGekk commented on PR #44110: URL: https://github.com/apache/spark/pull/44110#issuecomment-1837229888 > The change also includes a small unit benchmark for this particular case. I wonder about other benchmarks. Do you observe perf regressions? I am asking just in case.

Re: [PR] [SPARK-46210][K8S][DOCS] Update `YuniKorn` docs with v1.4 [spark]

2023-12-02 Thread via GitHub
dongjoon-hyun commented on PR #44117: URL: https://github.com/apache/spark/pull/44117#issuecomment-1837213257 Thank you, @itholic !

Re: [PR] [SPARK-45975][TESTS][3.5] Reset storeAssignmentPolicy to original [spark]

2023-12-02 Thread via GitHub
LuciferYang commented on PR #44126: URL: https://github.com/apache/spark/pull/44126#issuecomment-1837143832 - https://github.com/apache/spark/actions/runs/7041075997/job/19163039661 - https://github.com/apache/spark/actions/runs/7054970010/job/19204693143 ``` [info] HiveSourceRow

[PR] [SPARK-45975][TESTS][3.5] Reset storeAssignmentPolicy to original [spark]

2023-12-02 Thread via GitHub
LuciferYang opened a new pull request, #44126: URL: https://github.com/apache/spark/pull/44126 ### What changes were proposed in this pull request? Reset storeAssignmentPolicy to original in HiveCompatibilitySuite. ### Why are the changes needed? STORE_ASSIGNMENT_POLICY was not r

Re: [PR] [SPARK-45943][SQL] Move DetermineTableStats to resolution rules [spark]

2023-12-02 Thread via GitHub
LuciferYang commented on PR #43867: URL: https://github.com/apache/spark/pull/43867#issuecomment-1837142397 seems we should backport SPARK-45975 to https://github.com/apache/spark/pull/43870 @wForget ?

[PR] [SPARK-46214][BUILD] Upgrade commons-io to 2.15.1 [spark]

2023-12-02 Thread via GitHub
panbingkun opened a new pull request, #44125: URL: https://github.com/apache/spark/pull/44125 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How

Re: [PR] [SPARK-46214][BUILD] Upgrade commons-io to 2.15.1 [spark]

2023-12-02 Thread via GitHub
panbingkun closed pull request #44124: [SPARK-46214][BUILD] Upgrade commons-io to 2.15.1 URL: https://github.com/apache/spark/pull/44124

[PR] [SPARK-46214][BUILD] Upgrade commons-io to 2.15.1 [spark]

2023-12-02 Thread via GitHub
panbingkun opened a new pull request, #44124: URL: https://github.com/apache/spark/pull/44124 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How

Re: [PR] [SPARK-46209] Add java 11 only yml for version before 3.5 [spark-docker]

2023-12-02 Thread via GitHub
Yikun commented on PR #58: URL: https://github.com/apache/spark-docker/pull/58#issuecomment-1837138971 @HyukjinKwon Thanks merged. - ghcr 3.4.2 publish (test): https://github.com/apache/spark-docker/actions/runs/7070227446

Re: [PR] [SPARK-46209] Add java 11 only yml for version before 3.5 [spark-docker]

2023-12-02 Thread via GitHub
Yikun closed pull request #58: [SPARK-46209] Add java 11 only yml for version before 3.5 URL: https://github.com/apache/spark-docker/pull/58

[PR] [SPARK-46213][PYTHON] Introduce `PySparkImportError` for error framework [spark]

2023-12-02 Thread via GitHub
itholic opened a new pull request, #44123: URL: https://github.com/apache/spark/pull/44123 ### What changes were proposed in this pull request? This PR proposes to introduce `PySparkImportError` for error framework to replace `IndexError` in pyspark.sql.*. ### Why are t

Re: [PR] [SPARK-45943][SQL] Move DetermineTableStats to resolution rules [spark]

2023-12-02 Thread via GitHub
LuciferYang commented on PR #43867: URL: https://github.com/apache/spark/pull/43867#issuecomment-1837128189 After merging this pr, there are test failures in branch-3.5. Can you take a look at this issue? @wForget https://github.com/apache/spark/actions/runs/7041075997/job/191630

Re: [PR] [SPARK-46212][CORE][SQL][SS][CONNECT][MLLIB][GRAPHX][DSTREAM][PROTOBUF][EXAMPLES] Replace `s.c.MapOps#view.mapValues` with `s.c.MapOps#map/s.c.immutable.MapOps#transform` [spark]

2023-12-02 Thread via GitHub
LuciferYang commented on PR #44122: URL: https://github.com/apache/spark/pull/44122#issuecomment-1837121298 cc @cloud-fan

[PR] [SPARK-46212][CORE][SQL][SS][CONNECT][MLLIB][GRAPHX][DSTREAM][PROTOBUF][EXAMPLES] Replace `s.c.MapOps#view.mapValues` with `s.c.mutable.MapOps.map/s.c.MapOps.map/s.c.immutable.MapOps#transform` [

2023-12-02 Thread via GitHub
LuciferYang opened a new pull request, #44122: URL: https://github.com/apache/spark/pull/44122 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How

Re: [PR] [SPARK-40609][SQL] Unwrap cast in the join condition to unlock bucketed read [spark]

2023-12-02 Thread via GitHub
wankunde commented on PR #44120: URL: https://github.com/apache/spark/pull/44120#issuecomment-1837105605 Retest this please

Re: [PR] [SPARK-46208][PS][DOCS] Use specific Pandas version for API specifications. [spark]

2023-12-02 Thread via GitHub
itholic commented on code in PR #44115: URL: https://github.com/apache/spark/pull/44115#discussion_r1412770204 ## python/docs/source/reference/pyspark.pandas/index.rst: ## @@ -23,7 +23,7 @@ Pandas API on Spark This page gives an overview of all public pandas API on Spark. ..

Re: [PR] [SPARK-46208][PS][DOCS] Use specific Pandas version for API specifications. [spark]

2023-12-02 Thread via GitHub
itholic commented on PR #44115: URL: https://github.com/apache/spark/pull/44115#issuecomment-1837100041 cc @HyukjinKwon @zhengruifeng

[PR] [SPARK-41532][CONNECT][FOLLOWUP] Expose `SessionNotSameException` as PySpark exceptions [spark]

2023-12-02 Thread via GitHub
itholic opened a new pull request, #44121: URL: https://github.com/apache/spark/pull/44121 ### What changes were proposed in this pull request? This PR proposes to expose `SessionNotSameException` as PySpark exceptions. ### Why are the changes needed? `SessionNotSameE

Re: [PR] EXECUTE IMMEDIATE SQL support [spark]

2023-12-02 Thread via GitHub
MaxGekk commented on code in PR #44093: URL: https://github.com/apache/spark/pull/44093#discussion_r1412764471 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/executeImmediate.scala: ## @@ -0,0 +1,130 @@ +/* + * Licensed to the Apache Software Foundation (A