Re: [PR] [SPARK-45934][DOCS] Fix `Spark Standalone` documentation table layout [spark]

2023-11-15 Thread via GitHub
dongjoon-hyun commented on code in PR #43814: URL: https://github.com/apache/spark/pull/43814#discussion_r1394824967 ## docs/running-on-kubernetes.md: ## @@ -1203,17 +1203,17 @@ See the [configuration page](configuration.html) for information on Spark config 3.0.0 -

Re: [PR] [SPARK-45934][DOCS] Fix `Spark Standalone` documentation table layout [spark]

2023-11-15 Thread via GitHub
bjornjorgensen commented on code in PR #43814: URL: https://github.com/apache/spark/pull/43814#discussion_r1394823713 ## docs/running-on-kubernetes.md: ## @@ -1203,17 +1203,17 @@ See the [configuration page](configuration.html) for information on Spark config 3.0.0 -

Re: [PR] [SPARK-45941][PS] Upgrade `pandas` to version 2.1.3 [spark]

2023-11-15 Thread via GitHub
dongjoon-hyun closed pull request #43822: [SPARK-45941][PS] Upgrade `pandas` to version 2.1.3 URL: https://github.com/apache/spark/pull/43822 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

Re: [PR] [SPARK-45941][PS] Upgrade `pandas` to version 2.1.3 [spark]

2023-11-15 Thread via GitHub
dongjoon-hyun commented on PR #43822: URL: https://github.com/apache/spark/pull/43822#issuecomment-1813276250 Ya, it looks like that. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [PR] [SPARK-45941][PS] Upgrade `pandas` to version 2.1.3 [spark]

2023-11-15 Thread via GitHub
bjornjorgensen commented on PR #43822: URL: https://github.com/apache/spark/pull/43822#issuecomment-1813274288 https://github.com/bjornjorgensen/spark/actions/runs/6881177899 I pushed this 3 hours ago and waited until all the tests passed before I opened this PR. -- This is an

Re: [PR] [SPARK-45941][PS] Upgrade `pandas` to version 2.1.3 [spark]

2023-11-15 Thread via GitHub
dongjoon-hyun commented on PR #43822: URL: https://github.com/apache/spark/pull/43822#issuecomment-1813270978 BTW, your image looks like an old one (3 hours ago). This PR was created 18 minutes ago, wasn't it?

Re: [PR] [SPARK-45941][PS] Upgrade `pandas` to version 2.1.3 [spark]

2023-11-15 Thread via GitHub
dongjoon-hyun commented on PR #43822: URL: https://github.com/apache/spark/pull/43822#issuecomment-1813269062 You can provide the link here, @bjornjorgensen ~ That would be enough if the CI is running successfully on your side. -- This is an automated message from the Apache Git Service.

Re: [PR] [SPARK-45941][PS] Upgrade `pandas` to version 2.1.3 [spark]

2023-11-15 Thread via GitHub
bjornjorgensen commented on PR #43822: URL: https://github.com/apache/spark/pull/43822#issuecomment-1813263774 It seems to be happy ![image](https://github.com/apache/spark/assets/47577197/2b9a1588-2337-402b-84c0-4337078d8b20) Is there anything I can do with this then?

Re: [PR] [SPARK-45934][DOCS] Fix `Spark Standalone` documentation table layout [spark]

2023-11-15 Thread via GitHub
dongjoon-hyun commented on PR #43814: URL: https://github.com/apache/spark/pull/43814#issuecomment-1813263099 Could you review this PR, @bjornjorgensen ? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

Re: [PR] [SPARK-45527][CORE] Use fraction to do the resource calculation [spark]

2023-11-15 Thread via GitHub
tgravescs commented on code in PR #43494: URL: https://github.com/apache/spark/pull/43494#discussion_r1384058605 ## core/src/main/scala/org/apache/spark/scheduler/ExecutorResourcesAmounts.scala: ## @@ -0,0 +1,202 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

Re: [PR] [SPARK-43393][SQL] Address sequence expression overflow bug. [spark]

2023-11-15 Thread via GitHub
dongjoon-hyun commented on PR #41072: URL: https://github.com/apache/spark/pull/41072#issuecomment-1813257544 Could you fix the compilation of your PRs, @thepinetree ? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

Re: [PR] [SPARK-45941][PS] Upgrade `pandas` to version 2.1.3 [spark]

2023-11-15 Thread via GitHub
dongjoon-hyun commented on PR #43822: URL: https://github.com/apache/spark/pull/43822#issuecomment-1813254752 Thanks. Could you make CI happy, @bjornjorgensen ? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[PR] [SPARK-45941][PS] Update `pandas` to version 2.1.3 [spark]

2023-11-15 Thread via GitHub
bjornjorgensen opened a new pull request, #43822: URL: https://github.com/apache/spark/pull/43822 ### What changes were proposed in this pull request? Update pandas from 2.1.2 to 2.1.3 ### Why are the changes needed? Fixed infinite recursion from operations that return a new

Re: [PR] [SPARK-45762][CORE] Support shuffle managers defined in user jars by changing startup order [spark]

2023-11-15 Thread via GitHub
abellina commented on PR #43627: URL: https://github.com/apache/spark/pull/43627#issuecomment-1813246792 Thanks @mridulm, yes the commits make sense, it brings back the late initialization in the driver. I tested the change, the main difference from your patch @mridulm is I had to still

Re: [PR] [SPARK-45527][CORE] Use fraction to do the resource calculation [spark]

2023-11-15 Thread via GitHub
tgravescs commented on PR #43494: URL: https://github.com/apache/spark/pull/43494#issuecomment-1813235365 > 23/11/13 09:40:22 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1) (10.51.70.102, executor 0, partition 1, PROCESS_LOCAL, 7823 bytes) taskResourceAssignments Map(gpu

Re: [PR] [SPARK-45927][PYTHON] Update path handling in Python data source [spark]

2023-11-15 Thread via GitHub
allisonwang-db commented on code in PR #43809: URL: https://github.com/apache/spark/pull/43809#discussion_r1394689877 ## sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala: ## @@ -246,7 +246,15 @@ class DataFrameReader private[sql](sparkSession: SparkSession)

Re: [PR] [SPARK-45927][PYTHON] Update path handling in Python data source [spark]

2023-11-15 Thread via GitHub
allisonwang-db commented on code in PR #43809: URL: https://github.com/apache/spark/pull/43809#discussion_r1394685914 ## python/pyspark/sql/datasource.py: ## @@ -45,30 +45,19 @@ class DataSource(ABC): """ @final -def __init__( -self, -paths:

Re: [PR] [SPARK-45810][Python] Create Python UDTF API to stop consuming rows from the input table [spark]

2023-11-15 Thread via GitHub
dtenedor commented on code in PR #43682: URL: https://github.com/apache/spark/pull/43682#discussion_r1394630877 ## python/pyspark/sql/tests/test_udtf.py: ## @@ -2482,6 +2533,7 @@ def tearDownClass(cls): super(UDTFTests, cls).tearDownClass() +''' Review

Re: [PR] [SPARK-43393][SQL][3.5] Address sequence expression overflow bug. [spark]

2023-11-15 Thread via GitHub
dongjoon-hyun commented on PR #43820: URL: https://github.com/apache/spark/pull/43820#issuecomment-1813034968 Could you fix the compilation? ``` [error]

Re: [PR] [SPARK-43393][SQL][3.3] Address sequence expression overflow bug. [spark]

2023-11-15 Thread via GitHub
dongjoon-hyun commented on PR #43821: URL: https://github.com/apache/spark/pull/43821#issuecomment-1813032065 Is GitHub Action triggered on this PR, @thepinetree ? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

Re: [PR] [SPARK-43393][SQL] Address sequence expression overflow bug. [spark]

2023-11-15 Thread via GitHub
dongjoon-hyun commented on PR #41072: URL: https://github.com/apache/spark/pull/41072#issuecomment-1813024170 Thank you so much! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [PR] [SPARK-43393][SQL] Address sequence expression overflow bug. [spark]

2023-11-15 Thread via GitHub
thepinetree commented on PR #41072: URL: https://github.com/apache/spark/pull/41072#issuecomment-1813023418 @dongjoon-hyun @cloud-fan Backport PRs: * 3.3: https://github.com/apache/spark/pull/43821 * 3.4: https://github.com/apache/spark/pull/43819 * 3.5:

[PR] [SPARK-43393][SQL][3.3] Address sequence expression overflow bug. [spark]

2023-11-15 Thread via GitHub
thepinetree opened a new pull request, #43821: URL: https://github.com/apache/spark/pull/43821 ### What changes were proposed in this pull request? Spark has a (long-standing) overflow bug in the `sequence` expression. Consider the following operations: ``` spark.sql("CREATE

[PR] [SPARK-43393][SQL][3.5] Address sequence expression overflow bug. [spark]

2023-11-15 Thread via GitHub
thepinetree opened a new pull request, #43820: URL: https://github.com/apache/spark/pull/43820 ### What changes were proposed in this pull request? Spark has a (long-standing) overflow bug in the `sequence` expression. Consider the following operations: ``` spark.sql("CREATE

[PR] [SPARK-43393][SQL][3.4] Address sequence expression overflow bug. [spark]

2023-11-15 Thread via GitHub
thepinetree opened a new pull request, #43819: URL: https://github.com/apache/spark/pull/43819 ### What changes were proposed in this pull request? Spark has a (long-standing) overflow bug in the `sequence` expression. Consider the following operations: ``` spark.sql("CREATE

Re: [PR] [SPARK-45934][DOCS] Fix `Spark Standalone` documentation table layout [spark]

2023-11-15 Thread via GitHub
dongjoon-hyun commented on code in PR #43814: URL: https://github.com/apache/spark/pull/43814#discussion_r1394491810 ## docs/running-on-kubernetes.md: ## @@ -590,41 +590,41 @@ Some of these include: See the [configuration page](configuration.html) for information on Spark

Re: [PR] [SPARK-43393][SQL] Address sequence expression overflow bug. [spark]

2023-11-15 Thread via GitHub
dongjoon-hyun commented on PR #41072: URL: https://github.com/apache/spark/pull/41072#issuecomment-1812937385 Oh this seems to break branch-3.5. - https://github.com/apache/spark/actions/runs/6873765275 Let me revert this from branch-3.5. Given the situation, we can start backport

Re: [PR] [SPARK-45562][DOCS] Regenerate `docs/sql-error-conditions.md` and add `42KDF` to `SQLSTATE table` in `error/README.md` [spark]

2023-11-15 Thread via GitHub
dongjoon-hyun closed pull request #43817: [SPARK-45562][DOCS] Regenerate `docs/sql-error-conditions.md` and add `42KDF` to `SQLSTATE table` in `error/README.md` URL: https://github.com/apache/spark/pull/43817 -- This is an automated message from the Apache Git Service. To respond to the

Re: [PR] [SPARK-45934][DOCS] Fix `spark-standalone.md` and `running-on-kubernetes.md` table layout [spark]

2023-11-15 Thread via GitHub
dongjoon-hyun commented on code in PR #43814: URL: https://github.com/apache/spark/pull/43814#discussion_r1394491810 ## docs/running-on-kubernetes.md: ## @@ -590,41 +590,41 @@ Some of these include: See the [configuration page](configuration.html) for information on Spark

Re: [PR] [SPARK-45873][CORE][YARN][K8S] Make ExecutorFailureTracker more tolerant when app remains sufficient resources [spark]

2023-11-15 Thread via GitHub
tgravescs commented on PR #43746: URL: https://github.com/apache/spark/pull/43746#issuecomment-1812838085 > > Preemption on yarn shouldn't be going against the number of failed executors. If it is then something has changed and we should fix that. > Yes, you are right What do

Re: [PR] [SPARK-45873][CORE][YARN][K8S] Make ExecutorFailureTracker more tolerant when app remains sufficient resources [spark]

2023-11-15 Thread via GitHub
tgravescs commented on code in PR #43746: URL: https://github.com/apache/spark/pull/43746#discussion_r1394411937 ## core/src/main/scala/org/apache/spark/internal/config/package.scala: ## @@ -2087,6 +2087,17 @@ package object config { .doubleConf .createOptional

Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-15 Thread via GitHub
mridulm commented on code in PR #43502: URL: https://github.com/apache/spark/pull/43502#discussion_r1394409363 ## common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java: ## @@ -182,23 +193,34 @@ public boolean skip(long n) { @Override public

Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-15 Thread via GitHub
zhaomin1423 commented on code in PR #43502: URL: https://github.com/apache/spark/pull/43502#discussion_r1394398171 ## common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDB.java: ## @@ -322,26 +323,15 @@ public void close() throws IOException { } } - /**

Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-15 Thread via GitHub
mridulm commented on code in PR #43502: URL: https://github.com/apache/spark/pull/43502#discussion_r1394375372 ## common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDB.java: ## @@ -322,26 +323,15 @@ public void close() throws IOException { } } - /** -

Re: [PR] [SPARK-45938][INFRA] Add `utils` to the dependencies of the `core` module in `module.py` [spark]

2023-11-15 Thread via GitHub
LuciferYang commented on code in PR #43818: URL: https://github.com/apache/spark/pull/43818#discussion_r1394365136 ## dev/sparktestsupport/modules.py: ## @@ -178,7 +178,7 @@ def __hash__(self): core = Module( name="core", -dependencies=[kvstore, network_common,

[PR] [SPARK-45938][INFRA] Add `utils` to the dependency list of the `core` module in `module.py` [spark]

2023-11-15 Thread via GitHub
LuciferYang opened a new pull request, #43818: URL: https://github.com/apache/spark/pull/43818 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ###

Re: [PR] [SPARK-45747][SS] Use prefix key information in state metadata to handle reading state for session window aggregation [spark]

2023-11-15 Thread via GitHub
HeartSaVioR commented on PR #43788: URL: https://github.com/apache/spark/pull/43788#issuecomment-1812633610 @chaoqin-li1123 Could you please rebase your change onto the latest master branch? The merge script mistakenly treats me as the main author because of my commits listed here. -- This is an

Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-15 Thread via GitHub
zhaomin1423 commented on code in PR #43502: URL: https://github.com/apache/spark/pull/43502#discussion_r1394277068 ## common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDB.java: ## @@ -322,26 +323,15 @@ public void close() throws IOException { } } - /**

Re: [PR] [SPARK-45747][SS] Use prefix key information in state metadata to handle reading state for session window aggregation [spark]

2023-11-15 Thread via GitHub
HeartSaVioR commented on PR #43788: URL: https://github.com/apache/spark/pull/43788#issuecomment-1812628737 Thanks! Merging to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

Re: [PR] [SPARK-45562][DOCS] Regenerate `docs/sql-error-conditions.md` and add `42KDF` to `SQLSTATE table` in `error/README.md` [spark]

2023-11-15 Thread via GitHub
sandip-db commented on PR #43817: URL: https://github.com/apache/spark/pull/43817#issuecomment-1812621901 LGTM. @LuciferYang Thanks for the quick fix. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

Re: [PR] [SPARK-45919][CORE][SQL] Use Java 16 `record` to simplify Java class definition [spark]

2023-11-15 Thread via GitHub
LuciferYang commented on PR #43796: URL: https://github.com/apache/spark/pull/43796#issuecomment-1812528892 wait https://github.com/apache/spark/pull/43817 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

[PR] [SPARK-45562][CORE][DOCS] Regenerate `docs/sql-error-conditions.md` and update `SQLSTATE table` in `error/README.md` [spark]

2023-11-15 Thread via GitHub
LuciferYang opened a new pull request, #43817: URL: https://github.com/apache/spark/pull/43817 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ###

Re: [PR] [SPARK-45562][CORE][DOCS] Regenerate `docs/sql-error-conditions.md` and add `42KDF` to `SQLSTATE table` in `error/README.md` [spark]

2023-11-15 Thread via GitHub
LuciferYang commented on PR #43817: URL: https://github.com/apache/spark/pull/43817#issuecomment-1812524632 cc @HyukjinKwon @beliefer @sandip-db -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

Re: [PR] [SPARK-45562][SQL][FOLLOW-UP] XML: Fix SQLSTATE for missing rowTag error [spark]

2023-11-15 Thread via GitHub
LuciferYang commented on code in PR #43804: URL: https://github.com/apache/spark/pull/43804#discussion_r1394181483 ## docs/sql-error-conditions.md: ## @@ -2375,9 +2375,3 @@ The operation `` requires a ``. But `` is a The `` requires `` parameters but the actual number is ``.

Re: [PR] [SPARK-45562][SQL][FOLLOW-UP] XML: Fix SQLSTATE for missing rowTag error [spark]

2023-11-15 Thread via GitHub
LuciferYang commented on code in PR #43804: URL: https://github.com/apache/spark/pull/43804#discussion_r1394156014 ## docs/sql-error-conditions.md: ## @@ -2375,9 +2375,3 @@ The operation `` requires a ``. But `` is a The `` requires `` parameters but the actual number is ``.

Re: [PR] [SPARK-45905][SQL] Least common type between decimal types should retain integral digits first [spark]

2023-11-15 Thread via GitHub
cloud-fan closed pull request #43781: [SPARK-45905][SQL] Least common type between decimal types should retain integral digits first URL: https://github.com/apache/spark/pull/43781 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

Re: [PR] [SPARK-45905][SQL] Least common type between decimal types should retain integral digits first [spark]

2023-11-15 Thread via GitHub
cloud-fan commented on PR #43781: URL: https://github.com/apache/spark/pull/43781#issuecomment-1812465509 The failure is unrelated ``` Extension error: Could not import extension sphinx_copybutton (exception: No module named 'sphinx_copybutton') make: *** [Makefile:35: html]

Re: [PR] [SPARK-45924][SQL] Fixing the canonicalization of SubqueryAdaptiveBroadcastExec and making it equivalent with SubqueryBroadcastExec [spark]

2023-11-15 Thread via GitHub
beliefer commented on PR #43806: URL: https://github.com/apache/spark/pull/43806#issuecomment-1812439348 > Ideally that should have happened, but what I see is one stage containing subquery adaptive exec and incoming exchange contains subqueryexec. Also this is just 1 of the issues. The

Re: [PR] [SPARK-45934][DOCS] Fix `spark-standalone.md` and `running-on-kubernetes.md` table layout [spark]

2023-11-15 Thread via GitHub
yaooqinn commented on code in PR #43814: URL: https://github.com/apache/spark/pull/43814#discussion_r1394086066 ## docs/running-on-kubernetes.md: ## @@ -590,41 +590,41 @@ Some of these include: See the [configuration page](configuration.html) for information on Spark

Re: [PR] [SPARK-45919][CORE][SQL] Use Java 16 `record` to simplify Java class definition [spark]

2023-11-15 Thread via GitHub
LuciferYang commented on code in PR #43796: URL: https://github.com/apache/spark/pull/43796#discussion_r1394074332 ## common/network-shuffle/src/test/java/org/apache/spark/network/shuffle/RemoteBlockPushResolverSuite.java: ## @@ -1612,19 +1612,8 @@ private void verifyMetrics(

Re: [PR] [SPARK-45919][CORE][SQL] Use Java 16 `record` to simplify Java class definition [spark]

2023-11-15 Thread via GitHub
LuciferYang commented on code in PR #43796: URL: https://github.com/apache/spark/pull/43796#discussion_r1394064308 ## common/network-yarn/src/main/java/org/apache/spark/network/yarn/YarnShuffleServiceMetrics.java: ## @@ -153,24 +153,6 @@ private static ShuffleServiceMetricsInfo

Re: [PR] [SPARK-45919][CORE][SQL] Use Java 16 `record` to simplify Java class definition [spark]

2023-11-15 Thread via GitHub
LuciferYang commented on code in PR #43796: URL: https://github.com/apache/spark/pull/43796#discussion_r1394067701 ## common/network-shuffle/src/test/java/org/apache/spark/network/shuffle/RemoteBlockPushResolverSuite.java: ## @@ -1612,19 +1612,8 @@ private void verifyMetrics(

Re: [PR] [SPARK-45851][CONNECT][SCALA] Support multiple policies in scala client [spark]

2023-11-15 Thread via GitHub
cdkrot commented on PR #43757: URL: https://github.com/apache/spark/pull/43757#issuecomment-1812291366 cc @hvanhovell let's merge when tests pass :). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

Re: [PR] [SPARK-45919][CORE][SQL] Use Java 16 `record` to simplify Java class definition [spark]

2023-11-15 Thread via GitHub
dongjoon-hyun commented on code in PR #43796: URL: https://github.com/apache/spark/pull/43796#discussion_r1394022682 ## common/network-yarn/src/main/java/org/apache/spark/network/yarn/YarnShuffleServiceMetrics.java: ## @@ -153,24 +153,6 @@ private static

Re: [PR] [SPARK-45919][CORE][SQL] Use Java 16 `record` to simplify Java class definition [spark]

2023-11-15 Thread via GitHub
dongjoon-hyun commented on code in PR #43796: URL: https://github.com/apache/spark/pull/43796#discussion_r1394018312 ## common/network-shuffle/src/test/java/org/apache/spark/network/shuffle/RemoteBlockPushResolverSuite.java: ## @@ -1612,19 +1612,8 @@ private void verifyMetrics(

Re: [PR] [SPARK-45934][DOCS] Fix `spark-standalone.md` and `running-on-kubernetes.md` table layout [spark]

2023-11-15 Thread via GitHub
dongjoon-hyun commented on PR #43814: URL: https://github.com/apache/spark/pull/43814#issuecomment-1812232936 Could you review this when you have some time, @yaooqinn ? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

Re: [PR] [SPARK-45915][SQL] Treat decimal(x, 0) the same as IntegralType in `PromoteStrings` [spark]

2023-11-15 Thread via GitHub
dongjoon-hyun commented on PR #43812: URL: https://github.com/apache/spark/pull/43812#issuecomment-1812231907 Merged to master for Apache Spark 4.0.0. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

Re: [PR] [SPARK-45915][SQL] Treat decimal(x, 0) the same as IntegralType in `PromoteStrings` [spark]

2023-11-15 Thread via GitHub
dongjoon-hyun closed pull request #43812: [SPARK-45915][SQL] Treat decimal(x, 0) the same as IntegralType in `PromoteStrings` URL: https://github.com/apache/spark/pull/43812 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[PR] [SPARK-45935][PYTHON][DOCS] Fix RST files link substitutions error [spark]

2023-11-15 Thread via GitHub
panbingkun opened a new pull request, #43815: URL: https://github.com/apache/spark/pull/43815 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How

Re: [PR] [SPARK-45851][CONNECT][SCALA] Support multiple policies in scala client [spark]

2023-11-15 Thread via GitHub
cdkrot commented on code in PR #43757: URL: https://github.com/apache/spark/pull/43757#discussion_r1394006876 ## connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/RetryPolicy.scala: ## @@ -0,0 +1,141 @@ +/* + * Licensed to the Apache Software

[PR] [SPARK-45934][DOCS] Fix `spark-standalone.md` and `running-on-kubernetes.md` table layout [spark]

2023-11-15 Thread via GitHub
dongjoon-hyun opened a new pull request, #43814: URL: https://github.com/apache/spark/pull/43814 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was

Re: [PR] [SPARK-45851][CONNECT][SCALA] Support multiple policies in scala client [spark]

2023-11-15 Thread via GitHub
cdkrot commented on code in PR #43757: URL: https://github.com/apache/spark/pull/43757#discussion_r1394001717 ## connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/RetryPolicy.scala: ## @@ -0,0 +1,140 @@ +/* + * Licensed to the Apache Software

Re: [PR] [SPARK-45919][CORE][SQL] Use Java 16 `record` to simplify Java class definition [spark]

2023-11-15 Thread via GitHub
LuciferYang commented on PR #43796: URL: https://github.com/apache/spark/pull/43796#issuecomment-1812139559 > > @dongjoon-hyun I want to clarify the issue. We don't want to use `record` here because `field` in the original class doesn't provide an Accessor, but since `record` automatically

Re: [PR] [SPARK-44496][SQL][FOLLOW-UP] CalendarIntervalType is also orderable [spark]

2023-11-15 Thread via GitHub
cloud-fan commented on PR #43805: URL: https://github.com/apache/spark/pull/43805#issuecomment-1812104254 So we just need to update the doc -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

Re: [PR] [MINOR] Fix some typo [spark]

2023-11-15 Thread via GitHub
panbingkun commented on code in PR #43724: URL: https://github.com/apache/spark/pull/43724#discussion_r1393903025 ## docs/cloud-integration.md: ## @@ -121,12 +121,12 @@ for talking to cloud infrastructures, in which case this module may not be neede Spark jobs must

Re: [PR] [SPARK-45904][SQL][CONNECT] Mode function should supports sort with order direction [spark]

2023-11-15 Thread via GitHub
cloud-fan commented on code in PR #43786: URL: https://github.com/apache/spark/pull/43786#discussion_r1393889797 ## connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/functions.scala: ## @@ -842,19 +842,20 @@ object functions { * @group agg_funcs * @since

Re: [PR] [SPARK-45915][SQL] Treat decimal(x, 0) the same as IntegralType in `PromoteStrings` [spark]

2023-11-15 Thread via GitHub
wangyum commented on code in PR #43812: URL: https://github.com/apache/spark/pull/43812#discussion_r1393881574 ## sql/core/src/test/resources/sql-tests/analyzer-results/typeCoercion/native/binaryComparison.sql.out: ## @@ -1330,7 +1330,7 @@ Project [NOT (cast(cast(null as

Re: [PR] [SPARK-45924][SQL] Fixing the canonicalization of SubqueryAdaptiveBroadcastExec and making it equivalent with SubqueryBroadcastExec [spark]

2023-11-15 Thread via GitHub
ahshahid commented on PR #43806: URL: https://github.com/apache/spark/pull/43806#issuecomment-1812050963 > AFAIK, the `SubqueryAdaptiveBroadcastExec` is only used for dynamic partition pruning. `SubqueryAdaptiveBroadcastExec` will be replaced with `SubqueryBroadcastExec` and the latter must

Re: [PR] [MINOR][DOCS] Clarify collect_list -> ArrayType [spark]

2023-11-15 Thread via GitHub
landlord-matt commented on PR #43787: URL: https://github.com/apache/spark/pull/43787#issuecomment-1812049206 @HyukjinKwon: I now also did a similar thing for collect_set. Any comments on the proposal? I thought you would appreciate this suggestion :'( -- This is an automated message

Re: [PR] [SPARK-45924][SQL] Fixing the canonicalization of SubqueryAdaptiveBroadcastExec and making it equivalent with SubqueryBroadcastExec [spark]

2023-11-15 Thread via GitHub
beliefer commented on code in PR #43806: URL: https://github.com/apache/spark/pull/43806#discussion_r1393856527 ## sql/core/src/main/scala/org/apache/spark/sql/execution/SubqueryAdaptiveBroadcastExec.scala: ## @@ -46,7 +46,8 @@ case class SubqueryAdaptiveBroadcastExec(

Re: [PR] [SPARK-45924][SQL] Fixing the canonicalization of SubqueryAdaptiveBroadcastExec and making it equivalent with SubqueryBroadcastExec [spark]

2023-11-15 Thread via GitHub
beliefer commented on PR #43806: URL: https://github.com/apache/spark/pull/43806#issuecomment-1812033293 AFAIK, the `SubqueryAdaptiveBroadcastExec` is only used for dynamic partition pruning. `SubqueryAdaptiveBroadcastExec` will be replaced with `SubqueryBroadcastExec` and the latter must

Re: [PR] [SPARK-45764][PYTHON][DOCS] Make code block copyable [spark]

2023-11-15 Thread via GitHub
HyukjinKwon commented on PR #43799: URL: https://github.com/apache/spark/pull/43799#issuecomment-1812018431 @panbingkun would you mind creating a backporting PR? Actually yeah I think it's an important improvement in docs. -- This is an automated message from the Apache Git Service. To

Re: [PR] [SPARK-45919][CORE][SQL] Use Java 16 `record` to simplify Java class definition [spark]

2023-11-15 Thread via GitHub
dongjoon-hyun commented on PR #43796: URL: https://github.com/apache/spark/pull/43796#issuecomment-1812013863 > @dongjoon-hyun I want to clarify the issue. We don't want to use `record` here because `field` in the original class doesn't provide an Accessor, but since `record` automatically

Re: [PR] [SPARK-45562][SQL][FOLLOW-UP] XML: Fix SQLSTATE for missing rowTag error [spark]

2023-11-15 Thread via GitHub
beliefer commented on code in PR #43804: URL: https://github.com/apache/spark/pull/43804#discussion_r1393832007 ## docs/sql-error-conditions.md: ## @@ -2375,9 +2375,3 @@ The operation `` requires a ``. But `` is a The `` requires `` parameters but the actual number is ``.

Re: [PR] [SPARK-45926][SQL] Implementing equals and hashCode which takes into account pushed runtime filters , in InMemoryTable related scans [spark]

2023-11-15 Thread via GitHub
beliefer commented on PR #43808: URL: https://github.com/apache/spark/pull/43808#issuecomment-1812006973 It seems this PR is unrelated to runtime filters. I guess you mean DS V2 filter pushdown. -- This is an automated message from the Apache Git Service. To respond to the message,

Re: [PR] [SPARK-45915][SQL] Treat decimal(x, 0) the same as IntegralType in `PromoteStrings` [spark]

2023-11-15 Thread via GitHub
dongjoon-hyun commented on code in PR #43812: URL: https://github.com/apache/spark/pull/43812#discussion_r1393823245 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala: ## @@ -934,8 +934,8 @@ object TypeCoercion extends TypeCoercionBase {

[PR] [SPARK-45898][SQL] Support groupingSets operation in dataframe api [spark]

2023-11-15 Thread via GitHub
JacobZheng0927 opened a new pull request, #43813: URL: https://github.com/apache/spark/pull/43813 ### What changes were proposed in this pull request? Add groupingSets method in dataset api. `select col1, col2, col3, sum(col4) FROM t GROUP BY col1, col2, col3 GROUPING SETS ((col1,
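
For readers unfamiliar with grouping sets, here is a rough plain-Python sketch of what such a query computes. It is not Spark code and not the API proposed in the PR. The rows, the helper `grouping_sets_sum`, and the particular grouping sets are all hypothetical, since the PR description above is cut off before the sets are listed.

```python
from collections import defaultdict

# Hypothetical rows standing in for table `t`; column names follow the query above.
rows = [
    {"col1": "a", "col2": "x", "col3": 1, "col4": 10},
    {"col1": "a", "col2": "y", "col3": 1, "col4": 20},
    {"col1": "b", "col2": "x", "col3": 2, "col4": 30},
]


def grouping_sets_sum(rows, group_cols, grouping_sets, value_col):
    """Sum value_col once per grouping set (hypothetical helper, not a Spark API)."""
    results = []
    for gset in grouping_sets:
        totals = defaultdict(int)
        for row in rows:
            # Columns outside the current grouping set are reported as None (SQL NULL).
            key = tuple(row[c] if c in gset else None for c in group_cols)
            totals[key] += row[value_col]
        results.extend(totals.items())
    return results


# Assumed grouping sets, for illustration only: (col1, col2), (col1), and the grand total ().
result = grouping_sets_sum(
    rows,
    group_cols=["col1", "col2", "col3"],
    grouping_sets=[("col1", "col2"), ("col1",), ()],
    value_col="col4",
)
for key, total in result:
    print(key, total)
```

Each grouping set behaves like an independent `GROUP BY`, and the results are concatenated, which is why one query can return subtotals at several levels.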

Re: [PR] [SPARK-45915][SQL] Treat decimal(x, 0) the same as IntegralType in `PromoteStrings` [spark]

2023-11-15 Thread via GitHub
dongjoon-hyun commented on code in PR #43812: URL: https://github.com/apache/spark/pull/43812#discussion_r1393820692 ## sql/core/src/test/resources/sql-tests/analyzer-results/typeCoercion/native/binaryComparison.sql.out: ## @@ -1330,7 +1330,7 @@ Project [NOT (cast(cast(null as

Re: [PR] [SPARK-44496][SQL][FOLLOW-UP] CalendarIntervalType is also orderable [spark]

2023-11-15 Thread via GitHub
cloud-fan commented on PR #43805: URL: https://github.com/apache/spark/pull/43805#issuecomment-1811991530 it's not really comparable as we can't compare `30 days` and `1 month`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub
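
The point about `30 days` versus `1 month` can be illustrated with a small plain-Python sketch. It is not Spark code, and `add_months` and `one_month_vs_30_days` are hypothetical helpers. It assumes the two intervals are compared by applying each to the same anchor date, and shows that the answer changes with the anchor.

```python
import calendar
from datetime import date, timedelta


def add_months(start: date, months: int) -> date:
    """Move forward by whole calendar months, clamping the day to the month's end."""
    index = start.year * 12 + (start.month - 1) + months
    year, month = divmod(index, 12)
    month += 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def one_month_vs_30_days(anchor: date) -> str:
    """Compare '1 month' and '30 days' by adding each to the same anchor date."""
    after_month = add_months(anchor, 1)
    after_days = anchor + timedelta(days=30)
    if after_month > after_days:
        return "1 month is longer"
    if after_month < after_days:
        return "30 days is longer"
    return "equal"


for anchor in (date(2023, 1, 1), date(2023, 2, 1), date(2023, 4, 1)):
    print(anchor, one_month_vs_30_days(anchor))
```

Because the result depends on the anchor date, there is no single total order between an interval expressed in months and one expressed in days.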

Re: [PR] [SPARK-43393][SQL] Address sequence expression overflow bug. [spark]

2023-11-15 Thread via GitHub
cloud-fan commented on PR #41072: URL: https://github.com/apache/spark/pull/41072#issuecomment-1811989510 SGTM. @thepinetree can you help to create backport PRs? thanks! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

Re: [PR] [SPARK-45764][PYTHON][DOCS] Make code block copyable [spark]

2023-11-15 Thread via GitHub
zhengruifeng commented on PR #43799: URL: https://github.com/apache/spark/pull/43799#issuecomment-1811986470 shall we backport this to other branches? so that new maintenance releases will benefit from this fix -- This is an automated message from the Apache Git Service. To respond to

Re: [PR] [SPARK-45925][SQL] Making SubqueryBroadcastExec equivalent to SubqueryAdaptiveBroadcastExec [spark]

2023-11-15 Thread via GitHub
ahshahid commented on PR #43807: URL: https://github.com/apache/spark/pull/43807#issuecomment-1811975176 I think if I get clean run on PR [SPARK-45924](https://github.com/apache/spark/pull/43806), then this PR can be closed without merging -- This is an automated message from the Apache

Re: [PR] [SPARK-45924][SQL] Fixing the canonicalization of SubqueryAdaptiveBroadcastExec and making it equivalent with SubqueryBroadcastExec [spark]

2023-11-15 Thread via GitHub
ahshahid commented on PR #43806: URL: https://github.com/apache/spark/pull/43806#issuecomment-1811971501 I have reworked the PR to just canonicalize the SubqueryAdaptiveBroadcastExec as SubqueryBroadcastExec. This also fixes the reuse of exchange issue and seems to be lesser impacting
