[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208985#comment-13208985
 ] 

Jason Lowe commented on MAPREDUCE-3862:
---------------------------------------

Note that with this patch the DeletionService can leave some scheduled files 
undeleted to avoid long hangs at shutdown.  A couple of alternatives:

* Implement a customized ScheduledThreadPoolExecutor that executes all 
scheduled tasks immediately upon shutdown rather than waiting out the 
delay specified for each task.  This could still lead to long shutdown times 
if directories containing a very large number of files are scheduled for 
deletion.
* Declare the existing behavior "as-intended" and note that NMs can take up to 
{{yarn.nodemanager.delete.debug-delay-sec}} seconds to finish shutting down.  
In that case it would be helpful to log a message explaining the wait.
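
To make the first alternative concrete, here is a rough sketch of one way it could look. It is not part of the attached patch, and {{FlushOnShutdownScheduler}}, {{RunOnce}} and {{shutdownAndFlush()}} are hypothetical names used only for illustration. Rather than subclassing the executor, it tracks pending tasks itself and runs whatever has not started yet on the calling thread at shutdown. It shares the drawback noted above: a large deletion still makes shutdown slow.

```java
import java.util.ArrayList;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical sketch, not the DeletionService code. */
public class FlushOnShutdownScheduler {
    private final ScheduledThreadPoolExecutor exec = new ScheduledThreadPoolExecutor(1);
    // Wrappers for tasks that have been scheduled but have not started yet.
    private final Set<RunOnce> pending = ConcurrentHashMap.newKeySet();

    /** Guarantees the wrapped task runs at most once, whoever reaches it first. */
    private final class RunOnce implements Runnable {
        private final Runnable task;
        private final AtomicBoolean started = new AtomicBoolean(false);

        RunOnce(Runnable task) {
            this.task = task;
        }

        boolean runIfNotStarted() {
            if (!started.compareAndSet(false, true)) {
                return false; // already run by the executor or by the flush
            }
            pending.remove(this);
            task.run();
            return true;
        }

        @Override
        public void run() {
            runIfNotStarted();
        }
    }

    public FlushOnShutdownScheduler() {
        // On shutdown(), drop delayed tasks instead of waiting out their delays.
        exec.setExecuteExistingDelayedTasksAfterShutdownPolicy(false);
    }

    /** Scheduling after shutdownAndFlush() is not handled in this sketch. */
    public void schedule(Runnable task, long delay, TimeUnit unit) {
        RunOnce wrapped = new RunOnce(task);
        pending.add(wrapped);
        exec.schedule(wrapped, delay, unit);
    }

    /** Stops the executor, then runs every not-yet-started task on the calling thread. */
    public int shutdownAndFlush() {
        exec.shutdown();
        int flushed = 0;
        for (RunOnce r : new ArrayList<>(pending)) {
            if (r.runIfNotStarted()) {
                flushed++;
            }
        }
        return flushed;
    }

    public static void main(String[] args) {
        FlushOnShutdownScheduler scheduler = new FlushOnShutdownScheduler();
        // Stand-in for a delayed deletion; a real task would remove files.
        scheduler.schedule(() -> System.out.println("running delayed task now"),
                1, TimeUnit.HOURS);
        System.out.println("flushed " + scheduler.shutdownAndFlush() + " task(s)");
    }
}
```

Running the flush on the caller's thread keeps the sketch short. A real implementation would probably also want to bound the total time spent flushing.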
                
> Nodemanager can appear to hang on shutdown due to lingering DeletionService 
> threads
> -----------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3862
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3862
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2, nodemanager
>    Affects Versions: 0.23.1
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>         Attachments: MAPREDUCE-3862.patch, MAPREDUCE-3862.patch
>
>
> When a nodemanager attempts to shut down cleanly, it can appear to hang due 
> to lingering DeletionService threads.  This can occur when 
> yarn.nodemanager.delete.debug-delay-sec is set to a relatively large value 
> and one or more containers execute on the node shortly before the shutdown.
> The DeletionService never calls 
> {{setExecuteExistingDelayedTasksAfterShutdownPolicy()}} on its 
> ScheduledThreadPoolExecutor, which by default waits for all scheduled 
> tasks to complete before exiting.
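
The executor behavior described in the issue can be reproduced outside Hadoop with a small standalone sketch. It uses only the JDK's {{ScheduledThreadPoolExecutor}}; the class name {{DelayedShutdownDemo}}, the helper {{terminatesPromptly()}}, the one-hour delay and the 200 ms wait are illustrative assumptions, not values taken from the DeletionService.

```java
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Illustrative demo only; names and timings are assumptions. */
public class DelayedShutdownDemo {

    /** Returns true if the executor terminates within waitMillis after shutdown(). */
    public static boolean terminatesPromptly(boolean runDelayedAfterShutdown, long waitMillis) {
        ScheduledThreadPoolExecutor exec = new ScheduledThreadPoolExecutor(1);
        exec.setExecuteExistingDelayedTasksAfterShutdownPolicy(runDelayedAfterShutdown);
        // Stand-in for a deletion scheduled far in the future.
        exec.schedule(() -> { }, 1, TimeUnit.HOURS);
        exec.shutdown();
        try {
            return exec.awaitTermination(waitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            // Always clean up so the demo itself does not leave a thread behind.
            exec.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // With the default policy (true) the pending task keeps the executor alive.
        System.out.println("policy=true  terminates promptly: " + terminatesPromptly(true, 200));
        // With the policy set to false, shutdown() discards the pending delayed task.
        System.out.println("policy=false terminates promptly: " + terminatesPromptly(false, 200));
    }
}
```

Setting the policy to {{false}} is what lets shutdown return quickly, at the cost of leaving the scheduled work undone.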

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
