Victsm commented on a change in pull request #30164:
URL: https://github.com/apache/spark/pull/30164#discussion_r523713075



##########
File path: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
##########
@@ -1252,6 +1254,28 @@ private[spark] class DAGScheduler(
     execCores.map(cores => properties.setProperty(EXECUTOR_CORES_LOCAL_PROPERTY, cores))
   }
 
+  /**
+   * If push based shuffle is enabled, set the shuffle services to be used for the given
+   * shuffle map stage. The list of shuffle services is determined based on the list of
+   * active executors tracked by block manager master at the start of the stage.
+   */
+  private def prepareShuffleServicesForShuffleMapStage(stage: ShuffleMapStage) {
+    // TODO: Handle stage reuse/retry cases separately as without finalize changes we cannot
+    // TODO: disable shuffle merge for the retry/reuse cases
+    val mergerLocs = sc.schedulerBackend.getMergerLocations(
+      stage.shuffleDep.partitioner.numPartitions, stage.resourceProfileId)
+    logDebug(s"${stage.shuffleDep.getMergerLocs.map(_.host).mkString(", ")}")
+
+    if (mergerLocs.nonEmpty) {
+      stage.shuffleDep.setMergerLocs(mergerLocs)
+      logInfo("Shuffle merge enabled for %s (%s) with %d merger locations"
+        .format(stage, stage.name, stage.shuffleDep.getMergerLocs.size))
+    } else {
+      stage.shuffleDep.setShuffleMergeEnabled(false)
+      logInfo("Shuffle merge disabled for %s (%s)".format(stage, stage.name))
+    }

Review comment:
   Should we expose the full list of merger locations, or just a count of the
   mergers for a given ShuffleMapStage?
   A long list of merger locations might increase the event size.
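
   To make the trade-off concrete, here is a rough sketch of a bounded summary: the count plus at most a few host names. `MergerLoc` and `MergerSummary.summarize` are hypothetical names used only for illustration; they are not part of Spark or of this PR, and the real merger locations would be whatever `getMergerLocs` returns.

```scala
// Hypothetical stand-in for a merger location; NOT a Spark class.
final case class MergerLoc(host: String, port: Int)

object MergerSummary {
  // Builds a summary whose size is bounded by maxHosts, regardless of how many
  // merger locations were selected: the total count plus a few distinct hosts.
  def summarize(locs: Seq[MergerLoc], maxHosts: Int): String = {
    val hosts = locs.map(_.host).distinct
    val shownHosts = hosts.take(maxHosts)
    val more = hosts.size - shownHosts.size
    val suffix = if (more > 0) s", +$more more" else ""
    s"${locs.size} merger locations [${shownHosts.mkString(", ")}$suffix]"
  }
}
```

   With something along these lines, the payload grows with `maxHosts` rather than with the number of mergers, at the cost of not showing every location.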



