[jira] [Commented] (YARN-446) Container killed before hprof dumps profile.out

2013-03-04 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592246#comment-13592246
 ] 

Jason Lowe commented on YARN-446:
---------------------------------

IMO the AM should always allow the task attempt time to exit successfully on 
its own rather than sending it a kill signal that races with the normal 
shutdown of the task attempt.  This is very similar to the race between the AM 
shutting down after unregistering with the RM and the subsequent kill sent by 
the RM, which was mitigated by MAPREDUCE-4157.  It would also help eliminate 
the many confusing "Container killed by ApplicationMaster" messages that 
appear in task attempt diagnostics for tasks that are otherwise operating 
normally.
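
A minimal sketch of the "wait first, kill only as a fallback" behavior 
described above, in plain Java with no Hadoop dependencies. The class, method, 
and enum names are hypothetical, and so is the fixed grace period; this is an 
illustration of the idea, not the MR AM's actual code.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper: give a finished task a grace period to exit on its own
// before falling back to a kill.
public class GracefulContainerStop {

    public enum Outcome { EXITED_ON_ITS_OWN, KILLED_AFTER_GRACE_PERIOD }

    /**
     * @param exit    completes when the task process has exited (assumption:
     *                the caller has some way to observe process exit)
     * @param graceMs how long to wait before giving up and killing
     * @param kill    stand-in for whatever actually stops the container
     */
    public static Outcome awaitExitThenKill(CompletableFuture<?> exit,
                                            long graceMs,
                                            Runnable kill) {
        try {
            // Normal case: the task finishes its own shutdown (including any
            // exit-time work such as a profiler dump) within the grace period.
            exit.get(graceMs, TimeUnit.MILLISECONDS);
            return Outcome.EXITED_ON_ITS_OWN;
        } catch (ExecutionException e) {
            // The future completed exceptionally, so there is nothing to kill.
            return Outcome.EXITED_ON_ITS_OWN;
        } catch (TimeoutException e) {
            // The task did not exit in time; only now send the kill.
            kill.run();
            return Outcome.KILLED_AFTER_GRACE_PERIOD;
        } catch (InterruptedException e) {
            // Design choice in this sketch: if the waiter is interrupted,
            // restore the flag and fall back to the kill.
            Thread.currentThread().interrupt();
            kill.run();
            return Outcome.KILLED_AFTER_GRACE_PERIOD;
        }
    }
}
```

With a real `java.lang.Process` on Java 9 or later, `process.onExit()` and 
`process::destroy` could be passed as the first and last arguments. A 
diagnostic such as "Container killed by ApplicationMaster" would then only be 
recorded for the KILLED_AFTER_GRACE_PERIOD outcome.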

> Container killed before hprof dumps profile.out
> -----------------------------------------------
>
> Key: YARN-446
> URL: https://issues.apache.org/jira/browse/YARN-446
> Project: Hadoop YARN
> Issue Type: Bug
> Components: client
> Affects Versions: 2.0.3-alpha
> Reporter: Radim Kolar
>
> If profiling is enabled for a mapper or reducer, hprof dumps profile.out at 
> process exit, which happens after the task has signaled to the AM that its 
> work is finished.
> The AM kills the container of a finished task without waiting for hprof to 
> finish dumping. If hprof is writing a larger output (for example, depth=4 
> fails while depth=3 works), it cannot finish the dump before being killed, 
> which makes the entire dump unusable because the CPU and heap stats are 
> missing.
> There needs to be a longer delay before the container is killed if profiling 
> is enabled.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-446) Container killed before hprof dumps profile.out

2013-03-03 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13591958#comment-13591958
 ] 

Hitesh Shah commented on YARN-446:
----------------------------------

YARN does not know whether a container is being profiled. Are you suggesting an 
additional API change in YARN to support the notion of a container being 
profiled?

In the scenario mentioned in the description, doesn't it make more sense to 
change the MR application so that the MR task process goes through a normal 
shutdown (after its work is finished) and can dump the profiling information 
itself, instead of having the MR AM kill the MR task/container with a 
SIGTERM/SIGKILL via the ContainerManager protocol?

  



[jira] [Commented] (YARN-446) Container killed before hprof dumps profile.out

2013-03-03 Thread Radim Kolar (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13591937#comment-13591937
 ] 

Radim Kolar commented on YARN-446:
----------------------------------

My original idea was to have a second timeout value for use cases where 
profiling is enabled on a particular container.
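
A rough sketch of what such a second timeout could look like, in plain Java 
over a string map rather than Hadoop's Configuration class. The key 
`example.profiling.kill-delay.ms` is hypothetical; YARN has no such setting. 
The first key and its 250 ms default are what I believe 
YarnConfiguration.NM_SLEEP_DELAY_BEFORE_SIGKILL_MS maps to in 
yarn-default.xml, and `mapreduce.task.profile` is the MapReduce switch for 
task profiling; verify both against your Hadoop version.

```java
import java.util.Map;

// Hypothetical policy: pick a longer kill delay when profiling is enabled.
public class KillDelayPolicy {

    // Believed to be the property behind NM_SLEEP_DELAY_BEFORE_SIGKILL_MS.
    static final String NM_SIGKILL_DELAY_KEY =
        "yarn.nodemanager.sleep-delay-before-sigkill.ms";
    static final long DEFAULT_DELAY_MS = 250L;

    // MapReduce job property that turns task profiling on.
    static final String PROFILE_KEY = "mapreduce.task.profile";

    // Hypothetical key for the proposed second timeout; not a real setting.
    static final String PROFILING_DELAY_KEY = "example.profiling.kill-delay.ms";

    /** Returns the delay in milliseconds to wait before a forced kill. */
    public static long killDelayMs(Map<String, String> conf) {
        long normal = Long.parseLong(
            conf.getOrDefault(NM_SIGKILL_DELAY_KEY, String.valueOf(DEFAULT_DELAY_MS)));

        boolean profiling =
            Boolean.parseBoolean(conf.getOrDefault(PROFILE_KEY, "false"));
        if (!profiling) {
            return normal;
        }

        // With profiling on, use the second timeout, but never go below the
        // normal delay.
        long profilingDelay = Long.parseLong(
            conf.getOrDefault(PROFILING_DELAY_KEY, String.valueOf(normal)));
        return Math.max(normal, profilingDelay);
    }
}
```

For example, a map containing `mapreduce.task.profile=true` and 
`example.profiling.kill-delay.ms=30000` yields 30000, while an empty map 
yields the 250 ms default.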



[jira] [Commented] (YARN-446) Container killed before hprof dumps profile.out

2013-03-03 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13591885#comment-13591885
 ] 

Hitesh Shah commented on YARN-446:
----------------------------------

Is this just a question of setting 
YarnConfiguration.NM_SLEEP_DELAY_BEFORE_SIGKILL_MS to a higher value on the 
cluster that you are testing against?
