devinduan created SPARK-25563:
---------------------------------

             Summary: Spark application hangs
                 Key: SPARK-25563
                 URL: https://issues.apache.org/jira/browse/SPARK-25563
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 2.3.1
            Reporter: devinduan


    I ran into an issue: when I start a Spark application in YARN client mode,
the application sometimes hangs.
    According to the application logs, a container was allocated on a lost
NodeManager, but the AM did not retry to start another executor.
    My Spark version is 2.3.1.
    Here is my ApplicationMaster log:
 
2018-09-26 05:21:15 INFO YarnRMClient:54 - Registering the ApplicationMaster
2018-09-26 05:21:15 INFO ConfiguredRMFailoverProxyProvider:100 - Failing over to rm2
2018-09-26 05:21:15 WARN Utils:66 - spark.executor.instances less than spark.dynamicAllocation.minExecutors is invalid, ignoring its setting, please update your configs.
2018-09-26 05:21:15 INFO Utils:54 - Using initial executors = 1, max of spark.dynamicAllocation.initialExecutors, spark.dynamicAllocation.minExecutors and spark.executor.instances
2018-09-26 05:21:15 INFO YarnAllocator:54 - Will request 1 executor container(s), each with 24 core(s) and 20275 MB memory (including 1843 MB of overhead)
2018-09-26 05:21:15 INFO YarnAllocator:54 - Submitted 1 unlocalized container requests.
2018-09-26 05:21:15 INFO ApplicationMaster:54 - Started progress reporter thread with (heartbeat : 3000, initial allocation : 200) intervals
2018-09-26 05:21:27 WARN YarnAllocator:66 - Cannot find executorId for container: container_1532951609168_4721728_01_000002
2018-09-26 05:21:27 INFO YarnAllocator:54 - Completed container container_1532951609168_4721728_01_000002 (state: COMPLETE, exit status: -100)
2018-09-26 05:21:27 WARN YarnAllocator:66 - Container marked as failed: container_1532951609168_4721728_01_000002. Exit status: -100. Diagnostics: Container released on a *lost* node
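
For anyone trying to confirm the same symptom in their own ApplicationMaster log, here is a minimal sketch (not part of the original report) that lists containers completed with exit status -100, the status shown above for the container on the lost node. The function name find_aborted_containers and the regular expression are illustrative assumptions based only on the "Completed container" line format in this log; other Spark or YARN versions may format the line differently.

```python
import re

# Pattern derived from the "Completed container" line in the log above.
# This is an assumption about the log format, not an official Spark API.
COMPLETED = re.compile(
    r"Completed container\s+(?P<cid>container_\S+)\s+"
    r"\(state:\s*(?P<state>\w+),\s*exit status:\s*(?P<status>-?\d+)\)"
)

# Exit status reported in the log for the container released on a lost node.
ABORTED = -100


def find_aborted_containers(log_text):
    """Return IDs of containers that completed with exit status -100."""
    # Collapse all whitespace runs so the match still works when long
    # log lines were hard-wrapped (for example by a mail client).
    flat = " ".join(log_text.split())
    return [
        m.group("cid")
        for m in COMPLETED.finditer(flat)
        if int(m.group("status")) == ABORTED
    ]


if __name__ == "__main__":
    sample = (
        "2018-09-26 05:21:27 INFO YarnAllocator:54 - Completed container \n"
        "container_1532951609168_4721728_01_000002 "
        "(state: COMPLETE, exit status: -100)\n"
    )
    for cid in find_aborted_containers(sample):
        print(cid)
```

If the script reports such a container and no later "Will request ... executor container(s)" line appears in the log, that matches the hang described here.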



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

