Hi All,

When a Spark job is running, suppose a Spark executor on node A has some partitions cached. In a later stage, the scheduler tries to assign a task to node A to process a cached partition (PROCESS_LOCAL). But meanwhile node A is occupied with other tasks and is busy. The scheduler waits for the spark.locality.wait interval, times out, and picks some other node B that is NODE_LOCAL. The executor on node B then has to fetch the cached partition from node A, which adds network I/O on node A plus some extra CPU for the I/O. Eventually, every node ends up with a task waiting to fetch some cached partition from node A, so the Spark job / cluster is effectively blocked on a single node.
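As a partial workaround (not a fix for the underlying problem), the locality wait described above can be tuned with standard Spark configuration properties. A minimal sketch, assuming a hypothetical application class and jar:

```shell
# Sketch of spark-submit flags (spark.locality.wait* are standard Spark configs;
# the application class and jar names below are placeholders).
# Raising the wait keeps tasks waiting longer for a PROCESS_LOCAL slot on the
# caching node; lowering it makes them fall back to NODE_LOCAL/ANY sooner.
spark-submit \
  --conf spark.locality.wait=10s \
  --conf spark.locality.wait.process=10s \
  --conf spark.locality.wait.node=3s \
  --class com.example.MyApp myapp.jar
```

The trade-off is the one described above: waiting longer preserves data locality but idles tasks behind the busy node, while falling back sooner spreads work out at the cost of fetching cached partitions over the network.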
A Spark JIRA has been created for this: https://issues.apache.org/jira/browse/SPARK-13718

Beginning with Spark 1.2, Spark introduced the External Shuffle Service, which lets executors fetch shuffle files from an external service instead of from each other, offloading that work from the Spark executors. We want to check whether something similar has been implemented for cached partitions, i.e. an external service that serves a cached partition to other executors.

Thanks,
Prabhu Joseph