[ 
https://issues.apache.org/jira/browse/YARN-3995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15070643#comment-15070643
 ] 

Sangjin Lee commented on YARN-3995:
-----------------------------------

That would be [~vrushalic], not me. :)

It might be a bit better if we can build this "lingering" functionality at the 
per-app collector level. Note that we will have an option of running per-app 
collectors in their own processes. It would be nice if this functionality 
translates to that mode without much work.

Also, note that this linger doesn't need to be too long as we discussed 
offline. I think 1-2 seconds was more than enough?
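
To make the idea concrete, here is a rough sketch of what a linger at the 
per-app collector level could look like. It is only an illustration under 
assumptions: the class LingeringCollectorRegistry, its methods, and the 
injected clock are all hypothetical names, not the existing collector 
manager API. When the AM container finishes, the collector is not removed 
right away; it keeps accepting events until a short linger period passes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

/** Hypothetical sketch only; not the actual YARN timeline collector code. */
final class LingeringCollectorRegistry {
    // Marker deadline for collectors whose AM container is still running.
    private static final long ACTIVE = Long.MAX_VALUE;

    private final long lingerMillis;
    private final LongSupplier clock; // injected so the linger is easy to test
    // appId -> time (ms) after which the collector stops accepting events
    private final Map<String, Long> deadlines = new ConcurrentHashMap<>();

    LingeringCollectorRegistry(long lingerMillis, LongSupplier clock) {
        this.lingerMillis = lingerMillis;
        this.clock = clock;
    }

    /** Registers the per-app collector when the AM container starts. */
    void start(String appId) {
        deadlines.put(appId, ACTIVE);
    }

    /** AM container finished: set a removal deadline instead of removing now. */
    void finish(String appId) {
        deadlines.computeIfPresent(appId, (id, old) ->
            old == ACTIVE ? clock.getAsLong() + lingerMillis : old);
    }

    /** Returns true if an event for this app can still be published. */
    boolean accepts(String appId) {
        Long deadline = deadlines.get(appId);
        if (deadline == null) {
            return false; // never registered, or already removed
        }
        if (clock.getAsLong() < deadline) {
            return true; // still active, or still within the linger window
        }
        deadlines.remove(appId, deadline); // linger elapsed: drop the collector
        return false;
    }
}
```

With a linger of 1500 ms, an event that arrives 1 second after finish() is 
still accepted, while one that arrives after 2 seconds is rejected. Because 
the state lives with the collector itself and not with the NM, the same 
logic should carry over if the per-app collector runs in its own process.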

> Some of the NM events are not getting published due race condition when AM 
> container finishes in NM 
> ----------------------------------------------------------------------------------------------------
>
>                 Key: YARN-3995
>                 URL: https://issues.apache.org/jira/browse/YARN-3995
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager, timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Naganarasimha G R
>            Assignee: Naganarasimha G R
>              Labels: yarn-2928-1st-milestone
>
> As discussed in YARN-3045: while testing with TestDistributedShell, it was 
> found that a few of the container metrics events were failing because of a 
> race condition. When the AM container finishes and removes the collector for 
> the app, it is still possible that events published for the app by the 
> current NM and other NMs are still in the pipeline, 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
