[ https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15170376#comment-15170376 ]
Varun Saxena commented on YARN-4700:
------------------------------------

bq. I think we're using the day timestamp for a reason as this table is supposed to be a flow (daily) activity table. And some considerations are given to long running apps that will cross the day boundaries.

Let us assume the RM does not restart. In that case, we will get the start event and the finish event only once each, and the event timestamp will be close to the current timestamp. If those are the only events we get, the issue with long-running apps (extending over more than two days) will be there anyway. For instance, if we get the start event on day 1 and the finish event on day 3, and there is no other app for this flow, this will lead to no activity being recorded on day 2 even if we use the current timestamp.

YARN-4069 was filed for this issue and it is assigned to me. I was thinking of scheduling a global timer in the RM which can emit ATS events for all running apps at a certain point in time. This should resolve the long-running app issue. It is not marked for the 1st milestone though, so no progress has been made yet.

> ATS storage has one extra record each time the RM got restarted
> ---------------------------------------------------------------
>
>                 Key: YARN-4700
>                 URL: https://issues.apache.org/jira/browse/YARN-4700
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Li Lu
>            Assignee: Naganarasimha G R
>              Labels: yarn-2928-1st-milestone
>
> When testing the new web UI for ATS v2, I noticed that we're creating one
> extra record for each finished application (but still held in the RM state
> store) each time the RM gets restarted. It's quite possible that we add the
> cluster start timestamp into the default cluster id; thus each time we're
> creating a new record for one application (cluster id is a part of the row
> key). We need to fix this behavior, probably by having a better default
> cluster id.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
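The day-boundary gap described in the comment can be sketched roughly as follows. This is a hypothetical illustration, not the actual ATS v2 code: the class and method names (`FlowActivityDaySketch`, `topOfDay`) are invented, and it assumes only that the flow activity table keys rows by the event timestamp truncated to the top of its day, so a long-running app emitting only a start event on day 1 and a finish event on day 3 leaves no row for day 2.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the flow (daily) activity table's day-keying.
// Names are illustrative; the real table lives in the ATS v2 HBase schema.
public class FlowActivityDaySketch {
    static final long MILLIS_PER_DAY = TimeUnit.DAYS.toMillis(1);

    // Truncate an event timestamp to the start of its day, mirroring how
    // a daily activity table would derive the day component of its row key.
    static long topOfDay(long eventTsMillis) {
        return eventTsMillis - (eventTsMillis % MILLIS_PER_DAY);
    }

    public static void main(String[] args) {
        long day1Start  = 3 * MILLIS_PER_DAY + 1000; // start event on day 1
        long day3Finish = 5 * MILLIS_PER_DAY + 2000; // finish event on day 3

        // Rows written for this flow: one per distinct event day.
        List<Long> activityDays = new ArrayList<>();
        activityDays.add(topOfDay(day1Start));
        activityDays.add(topOfDay(day3Finish));

        // The intervening day gets no activity row, even though the app ran.
        long day2 = topOfDay(day1Start) + MILLIS_PER_DAY;
        assert !activityDays.contains(day2) : "day 2 unexpectedly recorded";
        System.out.println("distinct activity days = " + activityDays.size());
    }
}
```

The proposed global timer would close this gap by emitting a periodic event for every running app, so `topOfDay` would also be applied to a timestamp falling on day 2.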