[ https://issues.apache.org/jira/browse/YARN-7563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16266987#comment-16266987 ]
lujie edited comment on YARN-7563 at 11/27/17 4:03 PM:
-------------------------------------------------------

I have found the cause by analyzing the code and logs. The figure above shows it: the client submits an application and then sends a kill command. The NM starts the container via ContainerManagerImpl.startContainerInternal, which (1) puts the appID in the context and then (4) sends INIT_APPLICATION. Meanwhile, the NodeManager learns from ResourceTrackerService.nodeHeartbeat that the app needs to be cleaned up, and sends a FINISH_APPS event to ContainerManagerImpl. ContainerManagerImpl first (2) checks whether the appID exists in the context and, if it does, (3) sends FINISH_APPLICATION.

The bug manifests only when two conditions hold: (1) happens before (2), and (3) happens before (4). If either condition is violated, the bug stays hidden.

I need to check the ApplicationImpl code further to determine whether AppFinishTriggeredTransition is needed to fix this bug.

was (Author: xiaoheipangzi):

I have found the cause by analyzing the code and logs.
!YARN-7536.png!
The rest of the earlier comment was identical to the edited text above.
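The ordering described in the comment can be replayed deterministically in a small model. The sketch below is not Hadoop code: the class RaceSketch, the method replay, the FINISHING state, and the transition table are all hypothetical stand-ins, and the only behavior taken from the report is that FINISH_APPLICATION is rejected while the application is still in NEW.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Queue;
import java.util.Set;

// Hypothetical model of the race; none of these types are Hadoop classes.
class RaceSketch {
    enum State { NEW, INITING, FINISHING }
    enum Event { INIT_APPLICATION, FINISH_APPLICATION }

    /** Replays steps (1)-(4) in one fixed order and returns a log of what happened. */
    static List<String> replay(boolean finishQueuedBeforeInit) {
        Set<String> context = new HashSet<>();         // stands in for the NM context's app map
        Queue<Event> dispatcher = new ArrayDeque<>();  // stands in for the async dispatcher queue
        String appId = "app_0001";                     // made-up application id

        context.add(appId);                            // (1) start path registers the app
        if (!finishQueuedBeforeInit) {
            dispatcher.add(Event.INIT_APPLICATION);    // (4) runs before (2) and (3)
        }
        if (context.contains(appId)) {                 // (2) heartbeat path finds the app
            dispatcher.add(Event.FINISH_APPLICATION);  // (3)
        }
        if (finishQueuedBeforeInit) {
            dispatcher.add(Event.INIT_APPLICATION);    // (4) runs after (3)
        }

        // Drain the queue in FIFO order through the assumed state machine.
        State state = State.NEW;
        List<String> log = new ArrayList<>();
        for (Event e : dispatcher) {
            State next = transition(state, e);
            if (next == null) {
                log.add("Invalid event: " + e + " at " + state);
            } else {
                log.add(state + " -> " + next);
                state = next;
            }
        }
        return log;
    }

    // Assumed transition table: FINISH_APPLICATION is not handled at NEW.
    static State transition(State s, Event e) {
        if (s == State.NEW && e == Event.INIT_APPLICATION) return State.INITING;
        if (s == State.INITING && e == Event.FINISH_APPLICATION) return State.FINISHING;
        return null;
    }

    public static void main(String[] args) {
        System.out.println(replay(true));   // (3) before (4): the reported interleaving
        System.out.println(replay(false));  // (4) before (3): the bug stays hidden
    }
}
```

In this model, replay(true) produces the same sequence as the nodemanager log (an invalid FINISH_APPLICATION at NEW, followed by the NEW to INITING transition), while replay(false) passes through without an error. Whether the real fix belongs in the state machine or in the ordering of (1) and (4) is the open question raised in the comment.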
> Invalid event: FINISH_APPLICATION at NEW
> ----------------------------------------
>
>                 Key: YARN-7563
>                 URL: https://issues.apache.org/jira/browse/YARN-7563
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: yarn
>    Affects Versions: 3.0.0-beta1
>            Reporter: lujie
>
> I sent a kill command to the application; the nodemanager log shows:
> {code:java}
> 2017-11-25 19:18:48,126 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: couldn't find container container_1511608703018_0001_01_000001 while processing FINISH_CONTAINERS event
> 2017-11-25 19:18:48,146 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl: Can't handle this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitionException: Invalid event: FINISH_APPLICATION at NEW
>         at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
>         at org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46)
>         at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:627)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:75)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:1508)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:1501)
>         at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:197)
>         at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126)
>         at java.lang.Thread.run(Thread.java:745)
> 2017-11-25 19:18:48,151 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl: Application application_1511608703018_0001 transitioned from NEW to INITING
> {code}