[ https://issues.apache.org/jira/browse/YARN-4373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15068335#comment-15068335 ]

Daniel Templeton commented on YARN-4373:
----------------------------------------

I'm also incredulous. I'm still working to reproduce the issue. It was reported by our testing team. If/when I reproduce it, I'll post the details.

> Jobs can be temporarily forgotten during recovery
> -------------------------------------------------
>
>                 Key: YARN-4373
>                 URL: https://issues.apache.org/jira/browse/YARN-4373
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.7.1
>            Reporter: Daniel Templeton
>            Assignee: Daniel Templeton
>            Priority: Critical
>
> The RM becomes available to service requests before state store recovery is
> started. Before recovery and during the recovery period, it's possible for a
> client to request an application report for a running application, to which
> the RM will respond that the application is unknown.
> I'm seeing this issue with Oozie during an RM failover. Until the active RM
> finishes recovery, it reports erroneous information to Oozie, which doesn't
> have the context to know that it should just try again later.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
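
To illustrate the "try again later" behavior that the issue description says a client such as Oozie lacks, here is a minimal client-side sketch. It is not the fix proposed for YARN-4373 and does not use the real YARN client API. `ApplicationNotFound`, `FakeRM`, `get_report_with_grace`, the grace-period parameters, and the application ID are all hypothetical names chosen for illustration. The idea is to treat an "application is unknown" reply as possibly transient for a bounded window after a failover, rather than as a final answer.

```python
import time


class ApplicationNotFound(Exception):
    """Hypothetical stand-in for an 'application is unknown' reply from the RM."""


class FakeRM:
    """Hypothetical RM that 'forgets' applications until recovery has finished."""

    def __init__(self, calls_until_recovered):
        self.remaining = calls_until_recovered

    def report(self, app_id):
        if self.remaining > 0:
            # Recovery still in progress: the RM does not know the application yet.
            self.remaining -= 1
            raise ApplicationNotFound(app_id)
        return {"id": app_id, "state": "RUNNING"}


def get_report_with_grace(fetch_report, app_id, grace_seconds=30.0, interval=1.0,
                          clock=time.monotonic, sleep=time.sleep):
    """Retry 'unknown application' replies until a grace period expires.

    Assumption: recovery completes within grace_seconds. After that, the
    'unknown' reply is believed and the exception is re-raised to the caller.
    """
    deadline = clock() + grace_seconds
    while True:
        try:
            return fetch_report(app_id)
        except ApplicationNotFound:
            if clock() >= deadline:
                raise
            sleep(interval)


if __name__ == "__main__":
    rm = FakeRM(calls_until_recovered=2)
    report = get_report_with_grace(rm.report, "application_0000000000000_0001",
                                   grace_seconds=5.0, interval=0.1)
    print(report["state"])
```

The trade-off in this sketch is that a query for an application that truly does not exist takes the full grace period to fail, which is one reason a client-side workaround is weaker than having the RM withhold answers until recovery is done.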