[ 
https://issues.apache.org/jira/browse/YARN-5987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15795781#comment-15795781
 ] 

Miklos Szegedi commented on YARN-5987:
--------------------------------------

Thank you [~templedf] for the info! YARN-2261 is about adding a cleanup 
container for an entire application, whereas this JIRA is about adding a 
cleanup script for every container.
I read the design there, and it raises two points for discussion. One is 
whether we want to run the cleanup callback in a container, and the other is 
whether we want to do retries.
1. If we used a separate container for the callback, it might fail due to 
resource constraints, which would prevent collecting a useful dump file. The 
callback has to run while the original container is alive. One container 
accessing another would also raise container isolation concerns, I think.
2. If we do not run the callback in a container, it cannot be preempted, so I 
think we do not need any retry logic either. We can reconsider this if a usage 
pattern other than collecting a dump emerges in the future. However, the 
callback itself can implement retry logic in the current implementation, if 
necessary.
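
As a rough illustration of the last point, a callback could wrap its own dump 
command in a small retry loop. This is only a sketch: the function name, the 
attempt count, and the delay are assumptions for illustration and are not part 
of the attached patches.

```python
import subprocess
import time


def run_with_retries(command, attempts=3, delay_seconds=1.0):
    """Run `command` until it exits with status 0 or the attempts run out.

    Returns the 1-based number of the attempt that succeeded, or 0 if
    every attempt failed. (Hypothetical helper, not a YARN API.)
    """
    for attempt in range(1, attempts + 1):
        try:
            result = subprocess.run(command, check=False)
            if result.returncode == 0:
                return attempt
        except OSError:
            # The command is missing or not executable; count it as a failure.
            pass
        if attempt < attempts:
            # Pause between attempts; the original container must still be
            # alive for a dump to succeed, so keep the delay short.
            time.sleep(delay_seconds)
    return 0
```

A callback script would call this with whatever dump command the administrator 
configured and exit non-zero when it returns 0, so the retries stay entirely 
inside the callback and the node manager needs no retry logic of its own.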

> NM configured command to collect heap dump of preempted container
> -----------------------------------------------------------------
>
>                 Key: YARN-5987
>                 URL: https://issues.apache.org/jira/browse/YARN-5987
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Miklos Szegedi
>            Assignee: Miklos Szegedi
>         Attachments: YARN-5987.000.patch, YARN-5987.001.patch
>
>
> The node manager can kill a container if it exceeds its assigned memory 
> limits. It would be nice to have a configuration entry to set up a command 
> that can collect additional debug information when needed. The collected 
> information can be used for root cause analysis.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
