[ 
https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15170704#comment-15170704
 ] 

Vrushali C commented on YARN-4700:
----------------------------------

Hi [~sjlee0] 
Let me add some more explanation.

bq. Wait, I think we're using the day timestamp for a reason as this table is 
supposed to be a flow (daily) activity table.

Yes, the flow activity table indicates which apps were running at what time. If 
an event arrives late (or, in this case, a replay causes it to arrive at a later 
time), it still belongs to the day the app ran on. So the entry for that flow 
should go into that older day's row, hence we should use the event timestamp.
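
To make the difference concrete, here is a minimal sketch of how the day row 
would be picked from the event timestamp versus the arrival time. It is not the 
actual timeline service code: the helper names (top_of_day_millis, to_millis), 
the sample dates, and the assumption that day boundaries are aligned to UTC 
midnight are all hypothetical and only for illustration.

```python
from datetime import datetime, timezone

MILLIS_PER_DAY = 24 * 60 * 60 * 1000


def top_of_day_millis(ts_millis: int) -> int:
    """Round an epoch-millisecond timestamp down to UTC midnight of its day."""
    return ts_millis - (ts_millis % MILLIS_PER_DAY)


def to_millis(dt: datetime) -> int:
    """Convert a timezone-aware datetime to epoch milliseconds."""
    return int(dt.timestamp() * 1000)


# Hypothetical scenario: the app ran late on Feb 25, but the event is
# replayed after a restart and reaches the backend on Feb 27.
event_ts = to_millis(datetime(2016, 2, 25, 23, 50, tzinfo=timezone.utc))
arrival_ts = to_millis(datetime(2016, 2, 27, 10, 0, tzinfo=timezone.utc))

# Day row chosen from the event timestamp: the day the app actually ran.
day_by_event = top_of_day_millis(event_ts)

# Day row chosen from the arrival time: the day the replay happened.
day_by_arrival = top_of_day_millis(arrival_ts)

print(datetime.fromtimestamp(day_by_event / 1000, tz=timezone.utc).date())
print(datetime.fromtimestamp(day_by_arrival / 1000, tz=timezone.utc).date())
```

With the event timestamp, the replayed event lands in the Feb 25 row where the 
app really ran; with the arrival time it would create an entry under Feb 27.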

bq.  And some considerations are given to long running apps that will cross the 
day boundaries. 
For long-running apps, we would most likely be making a snapshot entry that 
belongs to the day on which the app was running.

bq. I'd like us to stick with that unless there is a compelling reason not to?
So we are not changing the semantics here by using the event timestamp. We are 
actually making an explicit entry for the actual day on which the app ran, 
rather than relying on when the event reached the backend.

We can chat further on Monday.

> ATS storage has one extra record each time the RM got restarted
> ---------------------------------------------------------------
>
>                 Key: YARN-4700
>                 URL: https://issues.apache.org/jira/browse/YARN-4700
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Li Lu
>            Assignee: Naganarasimha G R
>              Labels: yarn-2928-1st-milestone
>
> When testing the new web UI for ATS v2, I noticed that we're creating one 
> extra record for each finished application (still held in the RM state 
> store) each time the RM is restarted. It's quite possible that we add the 
> cluster start timestamp to the default cluster id, so each restart creates 
> a new record for the same application (the cluster id is part of the row 
> key). We need to fix this behavior, probably by having a better default 
> cluster id. 
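
The suspected mechanism in the description can be sketched as follows. This is 
only an illustration of the hypothesis, not the real ATS v2 schema: the row key 
layout, the "!" separator, the "yarn_cluster" id, and the sample user, flow, 
and application names are all assumptions made up for the example.

```python
def row_key(cluster_id: str, user: str, flow: str, app_id: str) -> str:
    """Build a hypothetical row key in which the cluster id is the first part."""
    return "!".join([cluster_id, user, flow, app_id])


def timestamped_cluster_id(rm_start_millis: int) -> str:
    """Hypothetical default cluster id that embeds the RM start timestamp."""
    return f"yarn_cluster_{rm_start_millis}"


# A stable default cluster id that does not change across restarts (assumed).
STABLE_CLUSTER_ID = "yarn_cluster"

# Three RM starts; the same finished app is recovered from the state store
# and re-published after each one.
rm_starts = [1456300000000, 1456400000000, 1456500000000]
app = ("alice", "wordcount", "application_1456300000000_0001")

keys_with_timestamp = {
    row_key(timestamped_cluster_id(start), *app) for start in rm_starts
}
keys_with_stable_id = {row_key(STABLE_CLUSTER_ID, *app) for _ in rm_starts}

print(len(keys_with_timestamp))  # one distinct row per RM start
print(len(keys_with_stable_id))  # a single row, rewritten on each restart
```

Under these assumptions, a cluster id that changes on every restart yields a 
new row key each time, while a stable id keeps mapping the app to one row.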



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
