[ https://issues.apache.org/jira/browse/SPARK-23252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16343288#comment-16343288 ]
Sean Owen commented on SPARK-23252:
-----------------------------------

Blocked how? Waiting for the NodeManager? YARN would know the NM is down shortly.

> When NodeManager and CoarseGrainedExecutorBackend processes are killed, the
> job will be blocked
> -----------------------------------------------------------------------------------------------
>
>                 Key: SPARK-23252
>                 URL: https://issues.apache.org/jira/browse/SPARK-23252
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.2.0
>            Reporter: Bang Xiao
>            Priority: Major
>
> This happens when 'spark.dynamicAllocation.enabled' is set to 'true'. We
> use YARN as our resource manager.
> 1. spark-submit the "JavaWordCount" application in yarn-client mode.
> 2. Kill the NodeManager and CoarseGrainedExecutorBackend processes on one
> node while the job is in stage 0.
> If we kill only the CoarseGrainedExecutorBackend processes on that node, the
> TaskSetManager marks the failed tasks as pending and resubmits them. But if
> the NodeManager and CoarseGrainedExecutorBackend processes are killed
> simultaneously, the whole job is blocked.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
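The reporter's two reproduction steps can be sketched as shell commands. This is a hedged sketch, not the reporter's exact invocation: the example-jar path, input path, and process-matching patterns are placeholders that vary by Spark version and cluster layout, and dynamic allocation on YARN also requires the external shuffle service to be enabled.

```shell
# Hypothetical reproduction sketch for SPARK-23252 (paths are placeholders).
# Step 1: submit JavaWordCount in yarn-client mode with dynamic allocation.
spark-submit \
  --class org.apache.spark.examples.JavaWordCount \
  --master yarn \
  --deploy-mode client \
  --conf spark.dynamicAllocation.enabled=true \
  --conf spark.shuffle.service.enabled=true \
  examples/jars/spark-examples_2.11-2.2.0.jar hdfs:///tmp/input.txt

# Step 2: on one worker node, while the job is still in stage 0.
# Killing only the executors: TaskSetManager resubmits the failed tasks.
pkill -9 -f CoarseGrainedExecutorBackend
# Killing the NodeManager as well: the reporter observes the job hangs.
pkill -9 -f 'yarn.*NodeManager'
```

Running this requires a live YARN cluster, so it is a reproduction recipe rather than a standalone script.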