[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283335#comment-13283335
 ] 

Arun C Murthy commented on MAPREDUCE-4284:
------------------------------------------

@Tucu:

bq. Still, I would say this is a property to be used in development clusters.

If this is only needed for development clusters, then we can just use the 
global setting and make it very high (e.g. 3 days).
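
For illustration only, a minimal sketch of what such a global setting might 
look like in the NodeManager's yarn-site.xml. The value 259200 is an 
assumption: it is simply 3 days expressed in seconds (3 * 24 * 3600). As the 
issue description notes, changing it requires restarting the NodeManager.

```xml
<!-- yarn-site.xml on each NodeManager (sketch; the value is an example) -->
<property>
  <name>yarn.nodemanager.delete.debug-delay-sec</name>
  <!-- 3 days = 3 * 24 * 3600 seconds -->
  <value>259200</value>
</property>
```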

bq.  Or, in order to make it more production friendly there should be a 
MAX_TIME_TO_KEEP_FILES property in the NM and jobs can set any value up to that 
time.

Then you pretty much have to have limits on file sizes, number of files, etc., 
which leads exactly to MAPREDUCE-1100, something which we've been trying to 
avoid by durably storing logs in HDFS and not on the NM local disk.

----

To recap, if this is just for debugging, we can set the global limit very high 
and not bother with per-job limits.

In any case, we have all task logs on HDFS - so I really don't see the need to 
reinvent MAPREDUCE-1100.

----

@Ahmed - Your proposal doesn't work because the NodeManager doesn't load the 
jobConf of the container... this would require changes to the ContainerManager 
protocol.

                
> Allow setting yarn.nodemanager.delete.debug-delay-sec on a per-job basis
> ------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4284
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4284
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>            Reporter: Ahmed Radwan
>            Assignee: Ahmed Radwan
>
> The yarn.nodemanager.delete.debug-delay-sec property is helpful in debugging 
> jobs (inspecting container logs/local dirs after the job finishes). Currently 
> it is a nodemanager property and changing it requires restarting the 
> nodemanager. In a production cluster this can be a real problem. It would be 
> better to set this property on a per-job basis, without requiring a restart 
> of the nodemanagers. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
