gaoyajun02 created SPARK-36964:
----------------------------------

             Summary: Reuse CachedDNSToSwitchMapping for yarn container requests
                 Key: SPARK-36964
                 URL: https://issues.apache.org/jira/browse/SPARK-36964
             Project: Spark
          Issue Type: Improvement
          Components: Spark Core, YARN
    Affects Versions: 3.1.2, 3.0.3
            Reporter: gaoyajun02

Similar to SPARK-13704: in some cases, adding or removing container requests in YarnAllocator can be expensive, because it may call the topology script for rack awareness. When a very large job is submitted to a very large YARN cluster, the topology script may take significant time to run. This blocks the receiving of YarnSchedulerBackend's RequestExecutors RPC calls. These requests come from Spark's dynamic executor allocation thread, so the delay may block the ExecutorAllocationListener and in turn cause a backlog in the executorManagement queue.
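
To illustrate the idea in the summary, here is a minimal sketch of caching host-to-rack lookups so that the expensive topology script runs at most once per host. It is not Hadoop's CachedDNSToSwitchMapping and not the proposed patch: the class name CachingRackResolver, its methods, and the rawResolver function (a stand-in for whatever runs the topology script) are all hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/**
 * Hypothetical caching wrapper around a slow host-to-rack lookup.
 * Illustrative only; this is not Hadoop's CachedDNSToSwitchMapping.
 */
final class CachingRackResolver {
    // Assumed stand-in for the expensive call (e.g. running the topology script).
    private final Function<String, String> rawResolver;
    // Host name -> rack path, filled lazily on first lookup.
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    // Counts how often the expensive resolver was actually invoked.
    private final AtomicInteger rawCalls = new AtomicInteger();

    CachingRackResolver(Function<String, String> rawResolver) {
        this.rawResolver = rawResolver;
    }

    /** Returns the rack for one host, invoking the raw resolver only on a cache miss. */
    String resolve(String host) {
        return cache.computeIfAbsent(host, h -> {
            rawCalls.incrementAndGet();
            return rawResolver.apply(h);
        });
    }

    /** Resolves several hosts, preserving input order; repeated hosts hit the cache. */
    List<String> resolve(List<String> hosts) {
        List<String> racks = new ArrayList<>(hosts.size());
        for (String host : hosts) {
            racks.add(resolve(host));
        }
        return racks;
    }

    /** Number of times the expensive resolver ran. */
    int rawCallCount() {
        return rawCalls.get();
    }
}
```

If a single resolver instance like this is shared across container requests, repeated requests for the same hosts should not pay the script cost again. The trade-off is that cached entries can go stale if the cluster topology changes while the application is running.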