[ https://issues.apache.org/jira/browse/YARN-3793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14597817#comment-14597817 ]
Brahma Reddy Battula commented on YARN-3793:
--------------------------------------------

[~kasha] One possible scenario: when a disk went bad and the NM stopped, I saw this NPE (where the good dirs are null).

{noformat}
2015-06-19 03:09:10,528 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl: Uploading logs for container container_1434452428753_0522_01_000162. Current good log dirs are
2015-06-19 03:09:10,528 ERROR org.apache.hadoop.yarn.server.nodemanager.DeletionService: Exception during execution of task in DeletionService
java.lang.NullPointerException
    at org.apache.hadoop.fs.FileContext.fixRelativePart(FileContext.java:274)
    at org.apache.hadoop.fs.FileContext.delete(FileContext.java:761)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.deleteAsUser(DefaultContainerExecutor.java:458)
    at org.apache.hadoop.yarn.server.nodemanager.DeletionService$FileDeletionTask.run(DeletionService.java:293)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
{noformat}

> Several NPEs when deleting local files on NM recovery
> -----------------------------------------------------
>
>                 Key: YARN-3793
>                 URL: https://issues.apache.org/jira/browse/YARN-3793
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.6.0
>            Reporter: Karthik Kambatla
>            Assignee: Karthik Kambatla
>
> When NM work-preserving restart is enabled, we see several NPEs on recovery.
> These seem to correspond to sub-directories that need to be deleted. I wonder
> if null pointers here mean incorrect tracking of these resources and a
> potential leak. This JIRA is to investigate and fix anything required.
> Logs show:
> {noformat}
> 2015-05-18 07:06:10,225 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting absolute path : null
> 2015-05-18 07:06:10,224 ERROR org.apache.hadoop.yarn.server.nodemanager.DeletionService: Exception during execution of task in DeletionService
> java.lang.NullPointerException
>     at org.apache.hadoop.fs.FileContext.fixRelativePart(FileContext.java:274)
>     at org.apache.hadoop.fs.FileContext.delete(FileContext.java:755)
>     at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.deleteAsUser(DefaultContainerExecutor.java:458)
>     at org.apache.hadoop.yarn.server.nodemanager.DeletionService$FileDeletionTask.run(DeletionService.java:293)
> {noformat}
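
Both traces fail in FileContext.fixRelativePart, reached from FileContext.delete, and the quoted log reports "Deleting absolute path : null". As an illustration only, here is a minimal sketch of filtering null or empty entries out of a list of paths before any delete call is attempted. The class and method names (NullSafeDeletion, filterDeletablePaths) and the sample path are hypothetical; this is plain Java with no Hadoop dependency, and it is not the fix made for this issue. Skipping a null entry only avoids the exception; it would not address the possible tracking problem or leak raised in the description.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Hadoop: drops entries that would
// trigger an NPE if handed to a delete call that dereferences the path.
class NullSafeDeletion {

    // Returns only the non-null, non-empty candidates, preserving order.
    // A null list is treated as "nothing to delete".
    static List<String> filterDeletablePaths(List<String> candidates) {
        List<String> deletable = new ArrayList<>();
        if (candidates == null) {
            return deletable;
        }
        for (String path : candidates) {
            if (path == null || path.isEmpty()) {
                // Log and skip instead of letting the delete call throw.
                System.err.println("Skipping null or empty path in deletion task");
                continue;
            }
            deletable.add(path);
        }
        return deletable;
    }

    public static void main(String[] args) {
        // Sample input (assumed): one usable dir, one null, one empty string.
        List<String> dirs = Arrays.asList("/data/1/nm-local-dir", null, "");
        System.out.println(filterDeletablePaths(dirs));
    }
}
```

A caller would run the filter first and pass only the surviving entries to the real delete routine, logging the skipped ones so that missing directories remain visible.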