venkata91 commented on a change in pull request #30164: URL: https://github.com/apache/spark/pull/30164#discussion_r516894315
##########
File path: core/src/main/scala/org/apache/spark/storage/BlockManagerMasterEndpoint.scala
##########
@@ -657,6 +679,14 @@ class BlockManagerMasterEndpoint(
     }
   }
 
+  private def getMergerLocations(
+      numMergersNeeded: Int,
+      hostsToFilter: Set[String]): Seq[BlockManagerId] = {
+    // Copying the merger locations to a list so that the original mergerLocations won't be shuffled
+    val mergers = mergerLocations.values.filterNot(x => hostsToFilter.contains(x.host)).toSeq
+    Utils.randomize(mergers).take(numMergersNeeded)

Review comment:
Hm. We haven't run many experiments on this yet, but it is an interesting area to explore further. Another thing we could do is pass the merger locations to `ExecutorAllocationManager` as part of `ShuffleMapStage` creation, so that these executors get some sort of preference when we remove executors between stages.
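For readers following along outside the Spark code base, here is a minimal standalone sketch of the selection logic in the hunk above, plus one possible reading of the "preference" idea in the comment. Everything here is an assumption for illustration: `MergerId` is a hypothetical stand-in for `BlockManagerId`, `scala.util.Random.shuffle` replaces `Utils.randomize`, and `removalOrder` is not an existing `ExecutorAllocationManager` API, just a guess at what preferring merger executors during removal could look like.

```scala
import scala.util.Random

// Hypothetical stand-in for Spark's BlockManagerId; only the fields used here.
final case class MergerId(executorId: String, host: String, port: Int)

object MergerSelection {
  // Pick up to numMergersNeeded mergers at random, skipping excluded hosts.
  def selectMergers(
      mergerLocations: Map[String, MergerId],
      numMergersNeeded: Int,
      hostsToFilter: Set[String],
      rng: Random = new Random()): Seq[MergerId] = {
    // filterNot/toSeq build a new collection, so shuffling leaves the map untouched.
    val candidates =
      mergerLocations.values.filterNot(m => hostsToFilter.contains(m.host)).toSeq
    rng.shuffle(candidates).take(numMergersNeeded)
  }

  // Assumed interpretation of "preference": when choosing idle executors to
  // remove, list those on merger hosts last so they are removed last.
  // sortBy is stable and false sorts before true, so relative order is kept.
  def removalOrder(idle: Seq[MergerId], mergerHosts: Set[String]): Seq[MergerId] =
    idle.sortBy(e => mergerHosts.contains(e.host))
}
```

With three mergers on `host-a`, `host-b`, and `host-c`, calling `selectMergers(locs, 5, Set("host-a"))` returns the two remaining mergers in random order, since `take` simply returns fewer elements when not enough candidates exist.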