Re: [PR] [SPARK-45762][CORE] Support shuffle managers defined in user jars by changing startup order [spark]

2024-02-10 Thread via GitHub
mridulm commented on PR #43627: URL: https://github.com/apache/spark/pull/43627#issuecomment-1936933162 Thanks for understanding @dongjoon-hyun ! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

Re: [PR] [SPARK-45762][CORE] Support shuffle managers defined in user jars by changing startup order [spark]

2024-02-07 Thread via GitHub
dongjoon-hyun commented on PR #43627: URL: https://github.com/apache/spark/pull/43627#issuecomment-1931550906 @mridulm . Of course, it's legit if it's not easy or there is no other way. Also, we have a similar breaking proposal, #45052 , too. While reviewing that PR, I double-checked this

Re: [PR] [SPARK-45762][CORE] Support shuffle managers defined in user jars by changing startup order [spark]

2024-02-06 Thread via GitHub
mridulm commented on PR #43627: URL: https://github.com/apache/spark/pull/43627#issuecomment-1931474172 @dongjoon-hyun, tt is `@DeveloperApi` from point of view of usage - `SparkEnv` is not expected to be created by users, as some of the constructor parameters are not externally visible

Re: [PR] [SPARK-45762][CORE] Support shuffle managers defined in user jars by changing startup order [spark]

2023-11-17 Thread via GitHub
abellina commented on PR #43627: URL: https://github.com/apache/spark/pull/43627#issuecomment-1816533283 Thanks @mridulm @tgravescs and @beliefer for the reviews and rework! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

Re: [PR] [SPARK-45762][CORE] Support shuffle managers defined in user jars by changing startup order [spark]

2023-11-17 Thread via GitHub
tgravescs commented on PR #43627: URL: https://github.com/apache/spark/pull/43627#issuecomment-1816518927 Thanks @mridulm @abellina -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [PR] [SPARK-45762][CORE] Support shuffle managers defined in user jars by changing startup order [spark]

2023-11-16 Thread via GitHub
mridulm commented on PR #43627: URL: https://github.com/apache/spark/pull/43627#issuecomment-1815678388 Merged to master. Thanks for working on this @abellina ! Thanks for the reviews @tgravescs, @beliefer :-) -- This is an automated message from the Apache Git Service. To respond

Re: [PR] [SPARK-45762][CORE] Support shuffle managers defined in user jars by changing startup order [spark]

2023-11-16 Thread via GitHub
mridulm closed pull request #43627: [SPARK-45762][CORE] Support shuffle managers defined in user jars by changing startup order URL: https://github.com/apache/spark/pull/43627 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

Re: [PR] [SPARK-45762][CORE] Support shuffle managers defined in user jars by changing startup order [spark]

2023-11-16 Thread via GitHub
abellina commented on PR #43627: URL: https://github.com/apache/spark/pull/43627#issuecomment-1814730910 There were some CI failures around missing dependencies in the documentation build (all tests are passing otherwise). So I have upmerged. I also tweaked a couple of comments here:

Re: [PR] [SPARK-45762][CORE] Support shuffle managers defined in user jars by changing startup order [spark]

2023-11-15 Thread via GitHub
abellina commented on code in PR #43627: URL: https://github.com/apache/spark/pull/43627#discussion_r1394870890 ## core/src/main/scala/org/apache/spark/SparkEnv.scala: ## @@ -415,6 +418,11 @@ object SparkEnv extends Logging { advertiseAddress, blockManagerPort,

Re: [PR] [SPARK-45762][CORE] Support shuffle managers defined in user jars by changing startup order [spark]

2023-11-15 Thread via GitHub
tgravescs commented on code in PR #43627: URL: https://github.com/apache/spark/pull/43627#discussion_r1394837773 ## core/src/main/scala/org/apache/spark/SparkEnv.scala: ## @@ -415,6 +418,11 @@ object SparkEnv extends Logging { advertiseAddress, blockManagerPort,

Re: [PR] [SPARK-45762][CORE] Support shuffle managers defined in user jars by changing startup order [spark]

2023-11-15 Thread via GitHub
abellina commented on PR #43627: URL: https://github.com/apache/spark/pull/43627#issuecomment-1813246792 Thanks @mridulm, yes the commits make sense, it brings back the late initialization in the driver. I tested the change, the main difference from your patch @mridulm is I had to still

Re: [PR] [SPARK-45762][CORE] Support shuffle managers defined in user jars by changing startup order [spark]

2023-11-10 Thread via GitHub
abellina commented on PR #43627: URL: https://github.com/apache/spark/pull/43627#issuecomment-1806707560 @mridulm updated. Thanks for your comments and patch. I took most of it verbatim and cleaned up a couple of things. -- This is an automated message from the Apache Git Service. To

Re: [PR] [SPARK-45762][CORE] Support shuffle managers defined in user jars by changing startup order [spark]

2023-11-08 Thread via GitHub
mridulm commented on PR #43627: URL: https://github.com/apache/spark/pull/43627#issuecomment-1803110054 > On the SPIP, I did send it to dev as a DISCUSS thread on November 4. I would like to get input on that as well. You are right, I do [see it

Re: [PR] [SPARK-45762][CORE] Support shuffle managers defined in user jars by changing startup order [spark]

2023-11-08 Thread via GitHub
abellina commented on PR #43627: URL: https://github.com/apache/spark/pull/43627#issuecomment-1803024221 @mridulm thanks for the comments. I think they mostly work and I’m testing it at the moment. I’ll push something either tonight or tomorrow am. On the SPIP, I did send it to dev

Re: [PR] [SPARK-45762][CORE] Support shuffle managers defined in user jars by changing startup order [spark]

2023-11-08 Thread via GitHub
mridulm commented on code in PR #43627: URL: https://github.com/apache/spark/pull/43627#discussion_r1386166263 ## core/src/main/scala/org/apache/spark/SparkEnv.scala: ## @@ -277,6 +287,11 @@ object SparkEnv extends Logging { hostname, numCores, ioEncryptionKey, isLocal)

Re: [PR] [SPARK-45762][CORE] Support shuffle managers defined in user jars by changing startup order [spark]

2023-11-08 Thread via GitHub
mridulm commented on code in PR #43627: URL: https://github.com/apache/spark/pull/43627#discussion_r1386016225 ## core/src/main/scala/org/apache/spark/SparkEnv.scala: ## @@ -185,6 +191,10 @@ class SparkEnv ( releasePythonWorker( pythonExec, workerModule,

Re: [PR] [SPARK-45762][CORE] Support shuffle managers defined in user jars by changing startup order [spark]

2023-11-08 Thread via GitHub
mridulm commented on PR #43627: URL: https://github.com/apache/spark/pull/43627#issuecomment-1801296392 @abellina, given [SPARK-45792](https://issues.apache.org/jira/browse/SPARK-45792) is an SPIP, can you please surface in spark-dev@ and initiate a discussion on it ? I dont remember

Re: [PR] [SPARK-45762][CORE] Support shuffle managers defined in user jars by changing startup order [spark]

2023-11-08 Thread via GitHub
mridulm commented on code in PR #43627: URL: https://github.com/apache/spark/pull/43627#discussion_r1386033849 ## core/src/main/scala/org/apache/spark/executor/Executor.scala: ## @@ -329,14 +329,22 @@ private[spark] class Executor( } updateDependencies(initialUserFiles,

Re: [PR] [SPARK-45762][CORE] Support shuffle managers defined in user jars by changing startup order [spark]

2023-11-08 Thread via GitHub
mridulm commented on code in PR #43627: URL: https://github.com/apache/spark/pull/43627#discussion_r1386032325 ## core/src/main/scala/org/apache/spark/SparkEnv.scala: ## @@ -355,16 +370,6 @@ object SparkEnv extends Logging { new MapOutputTrackerMasterEndpoint(

Re: [PR] [SPARK-45762][CORE] Support shuffle managers defined in user jars by changing startup order [spark]

2023-11-08 Thread via GitHub
mridulm commented on code in PR #43627: URL: https://github.com/apache/spark/pull/43627#discussion_r1386166263 ## core/src/main/scala/org/apache/spark/SparkEnv.scala: ## @@ -277,6 +287,11 @@ object SparkEnv extends Logging { hostname, numCores, ioEncryptionKey, isLocal)

Re: [PR] [SPARK-45762][CORE] Support shuffle managers defined in user jars by changing startup order [spark]

2023-11-08 Thread via GitHub
mridulm commented on code in PR #43627: URL: https://github.com/apache/spark/pull/43627#discussion_r1386011282 ## core/src/main/scala/org/apache/spark/SparkContext.scala: ## @@ -627,6 +631,7 @@ class SparkContext(config: SparkConf) extends Logging { }

Re: [PR] [SPARK-45762][CORE] Support shuffle managers defined in user jars by changing startup order [spark]

2023-11-08 Thread via GitHub
mridulm commented on code in PR #43627: URL: https://github.com/apache/spark/pull/43627#discussion_r1386016225 ## core/src/main/scala/org/apache/spark/SparkEnv.scala: ## @@ -185,6 +191,10 @@ class SparkEnv ( releasePythonWorker( pythonExec, workerModule,

Re: [PR] [SPARK-45762][CORE] Support shuffle managers defined in user jars by changing startup order [spark]

2023-11-07 Thread via GitHub
mridulm commented on code in PR #43627: URL: https://github.com/apache/spark/pull/43627#discussion_r1386011282 ## core/src/main/scala/org/apache/spark/SparkContext.scala: ## @@ -627,6 +631,7 @@ class SparkContext(config: SparkConf) extends Logging { }

Re: [PR] [SPARK-45762][CORE] Support shuffle managers defined in user jars by changing startup order [spark]

2023-11-06 Thread via GitHub
mridulm commented on PR #43627: URL: https://github.com/apache/spark/pull/43627#issuecomment-1797888003 @tgravescs The SparkEnv related change is what gave me pause ... I am less concerned about the Executor side of things -- This is an automated message from the Apache Git Service. To

Re: [PR] [SPARK-45762][CORE] Support shuffle managers defined in user jars by changing startup order [spark]

2023-11-06 Thread via GitHub
abellina commented on PR #43627: URL: https://github.com/apache/spark/pull/43627#issuecomment-1796931246 @tgravescs @mridulm @beliefer I made a small tweak where the `executorEnvs` map in the `SparkContext` is populated with the configuration prefix `spark.executorEnv.*` after the driver

Re: [PR] [SPARK-45762][CORE] Support shuffle managers defined in user jars by changing startup order [spark]

2023-11-06 Thread via GitHub
tgravescs commented on PR #43627: URL: https://github.com/apache/spark/pull/43627#issuecomment-1796408368 I agree that ideally we would finish SPARK-25299, I don't see that happening anytime soon. I also don't think it covers the case of people replacing the entire ShuffleManager vs just

Re: [PR] [SPARK-45762][CORE] Support shuffle managers defined in user jars by changing startup order [spark]

2023-11-06 Thread via GitHub
abellina commented on PR #43627: URL: https://github.com/apache/spark/pull/43627#issuecomment-1794940813 @beliefer thanks for the comments, I handled most of your comments in the last commit (except for the one about the function passing, but we can discuss that one more there). -- This

Re: [PR] [SPARK-45762][CORE] Support shuffle managers defined in user jars by changing startup order [spark]

2023-11-06 Thread via GitHub
abellina commented on code in PR #43627: URL: https://github.com/apache/spark/pull/43627#discussion_r1383413842 ## core/src/main/scala/org/apache/spark/SparkEnv.scala: ## @@ -402,7 +405,7 @@ object SparkEnv extends Logging { None }, blockManagerInfo,

Re: [PR] [SPARK-45762][CORE] Support shuffle managers defined in user jars by changing startup order [spark]

2023-11-05 Thread via GitHub
mridulm commented on PR #43627: URL: https://github.com/apache/spark/pull/43627#issuecomment-1793950325 While I am not opposed to a way to create a short name for shuffle manager, if it results in nontrivial changes to Spark, I am not very inclined towards it. IMO this should be

Re: [PR] [SPARK-45762][CORE] Support shuffle managers defined in user jars by changing startup order [spark]

2023-11-05 Thread via GitHub
beliefer commented on code in PR #43627: URL: https://github.com/apache/spark/pull/43627#discussion_r1382560338 ## core/src/main/scala/org/apache/spark/shuffle/ShuffleManager.scala: ## @@ -17,7 +17,30 @@ package org.apache.spark.shuffle -import

Re: [PR] [SPARK-45762][CORE] Support shuffle managers defined in user jars by changing startup order [spark]

2023-11-04 Thread via GitHub
abellina commented on PR #43627: URL: https://github.com/apache/spark/pull/43627#issuecomment-1793562824 @mridulm thanks for the comments. I have published a SPIP here https://issues.apache.org/jira/browse/SPARK-45792 that aims to show the bigger picture. Without the change of

Re: [PR] [SPARK-45762][CORE] Support shuffle managers defined in user jars by changing startup order [spark]

2023-11-04 Thread via GitHub
abellina commented on code in PR #43627: URL: https://github.com/apache/spark/pull/43627#discussion_r1382469931 ## core/src/main/scala/org/apache/spark/SparkEnv.scala: ## @@ -71,6 +69,10 @@ class SparkEnv ( val outputCommitCoordinator: OutputCommitCoordinator, val

Re: [PR] [SPARK-45762][CORE] Support shuffle managers defined in user jars by changing startup order [spark]

2023-11-02 Thread via GitHub
mridulm commented on code in PR #43627: URL: https://github.com/apache/spark/pull/43627#discussion_r1381167046 ## core/src/main/scala/org/apache/spark/SparkEnv.scala: ## @@ -71,6 +69,10 @@ class SparkEnv ( val outputCommitCoordinator: OutputCommitCoordinator, val

Re: [PR] [SPARK-45762][CORE] Support shuffle managers defined in user jars by changing startup order [spark]

2023-11-02 Thread via GitHub
abellina commented on PR #43627: URL: https://github.com/apache/spark/pull/43627#issuecomment-1791035949 @tgravescs thanks for the review. I have handled your comments in this commit: https://github.com/apache/spark/pull/43627/commits/0bd7e990812d23166509ad6585c8d352f78e569f -- This is

Re: [PR] [SPARK-45762][CORE] Support shuffle managers defined in user jars by changing startup order [spark]

2023-11-02 Thread via GitHub
tgravescs commented on code in PR #43627: URL: https://github.com/apache/spark/pull/43627#discussion_r1380262372 ## core/src/main/scala/org/apache/spark/SparkEnv.scala: ## @@ -423,7 +425,6 @@ object SparkEnv extends Logging { conf, memoryManager, Review Comment:

Re: [PR] [SPARK-45762][CORE]: Support shuffle managers defined in user jars by changing startup order [spark]

2023-11-01 Thread via GitHub
abellina commented on PR #43627: URL: https://github.com/apache/spark/pull/43627#issuecomment-1789727698 @tgravescs fyi -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

Re: [PR] [SPARK-45762][CORE]: Support shuffle managers defined in user jars by changing startup order [spark]

2023-11-01 Thread via GitHub
abellina commented on PR #43627: URL: https://github.com/apache/spark/pull/43627#issuecomment-1789508860 Ok I believe given the mima code that I need to add a temporary restriction here: https://github.com/apache/spark/blob/master/project/MimaExcludes.scala#L37. -- This is an automated

Re: [PR] [SPARK-45762][CORE]: Support shuffle managers defined in user jars by changing startup order [spark]

2023-11-01 Thread via GitHub
abellina commented on PR #43627: URL: https://github.com/apache/spark/pull/43627#issuecomment-1789408786 The MIMA tests are failing due to: ``` [error] spark-core: Failed binary compatibility check against org.apache.spark:spark-core_2.13:3.5.0! Found 1 potential problems (filtered

Re: [PR] [SPARK-45762][CORE]: Support shuffle managers defined in user jars by changing startup order [spark]

2023-11-01 Thread via GitHub
abellina commented on PR #43627: URL: https://github.com/apache/spark/pull/43627#issuecomment-1789318235 I'll have to figure out the CI. It seems my fork is running things, but I am getting some failures in this page (`AppVeyor` and the `Notify test workflow`) -- This is an automated