[ https://issues.apache.org/jira/browse/YARN-6555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Haibo Chen updated YARN-6555:
-----------------------------
    Fix Version/s: YARN-5355-branch-2
                   YARN-5355

> Store application flow context in NM state store for work-preserving restart
> ----------------------------------------------------------------------------
>
>                 Key: YARN-6555
>                 URL: https://issues.apache.org/jira/browse/YARN-6555
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>   Affects Versions: YARN-5355, YARN-5355-branch-2, 3.0.0-alpha3
>            Reporter: Vrushali C
>            Assignee: Rohith Sharma K S
>              Labels: yarn-5355-merge-blocker
>             Fix For: YARN-5355, YARN-5355-branch-2, 3.0.0-alpha3
>
>         Attachments: YARN-6555.001.patch, YARN-6555.002.patch,
> YARN-6555.003.patch
>
>
> If timeline service v2 is enabled and the NM is restarted with recovery enabled,
> the NM fails to start and throws the error "flow context cannot be null".
> This happens because the flow context did not exist before, but now that
> timeline service v2 is enabled, ApplicationImpl expects it to exist.
> The same failure would occur even if the flow context had existed before:
> since we are not persisting it or reading it back during
> ContainerManagerImpl#recoverApplication, it does not get passed in to
> ApplicationImpl.
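The failure mode described above, and the direction the issue title suggests, can be illustrated with a minimal, self-contained sketch. It is not the attached patch and does not use Hadoop classes: `FlowContext`, `Application`, `StateStore`, and `recoverApplication` below are hypothetical stand-ins, the flow context fields (name, version, run id) are assumed, and the store is an in-memory map rather than a real on-disk NM state store.

```java
import java.util.HashMap;
import java.util.Map;

public class FlowContextRecoverySketch {

    /** Hypothetical stand-in for the flow context; the three fields are an assumption. */
    static final class FlowContext {
        final String flowName;
        final String flowVersion;
        final long flowRunId;

        FlowContext(String flowName, String flowVersion, long flowRunId) {
            this.flowName = flowName;
            this.flowVersion = flowVersion;
            this.flowRunId = flowRunId;
        }
    }

    /** Hypothetical stand-in for ApplicationImpl: it rejects a null flow context. */
    static final class Application {
        final String appId;
        final FlowContext flowContext;

        Application(String appId, FlowContext flowContext) {
            if (flowContext == null) {
                throw new IllegalArgumentException("flow context cannot be null");
            }
            this.appId = appId;
            this.flowContext = flowContext;
        }
    }

    /** In-memory stand-in for the NM state store; a real store would write to disk. */
    static final class StateStore {
        private final Map<String, String> entries = new HashMap<>();

        /** Persist the flow context alongside the application entry. */
        void storeApplication(String appId, FlowContext ctx) {
            // Simplistic newline-separated encoding, for illustration only.
            entries.put(appId, ctx.flowName + "\n" + ctx.flowVersion + "\n" + ctx.flowRunId);
        }

        /** Read the flow context back; returns null if nothing was stored. */
        FlowContext loadFlowContext(String appId) {
            String raw = entries.get(appId);
            if (raw == null) {
                return null;
            }
            String[] parts = raw.split("\n", 3);
            return new FlowContext(parts[0], parts[1], Long.parseLong(parts[2]));
        }
    }

    /** On restart, rebuild the application from what the store holds. */
    static Application recoverApplication(StateStore store, String appId) {
        return new Application(appId, store.loadFlowContext(appId));
    }

    public static void main(String[] args) {
        StateStore store = new StateStore();
        store.storeApplication("application_1", new FlowContext("wordcount", "v1", 42L));
        Application recovered = recoverApplication(store, "application_1");
        System.out.println(recovered.flowContext.flowName + " run " + recovered.flowContext.flowRunId);
    }
}
```

In this sketch, recovering an application whose flow context was never stored reproduces the `IllegalArgumentException` from the report, while storing it at submission time lets recovery succeed.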
> full stack trace
> {code}
> 2017-05-03 21:51:52,178 FATAL org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting NodeManager
> java.lang.IllegalArgumentException: flow context cannot be null
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.<init>(ApplicationImpl.java:104)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.<init>(ApplicationImpl.java:90)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recoverApplication(ContainerManagerImpl.java:318)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recover(ContainerManagerImpl.java:280)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:267)
>         at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>         at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
>         at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:276)
>         at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>         at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:588)
>         at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:649)
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org