[jira] [Updated] (YARN-4530) LocalizedResource trigger a NPE Cause the NodeManager exit

2015-12-30 Thread tangshangwen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

tangshangwen updated YARN-4530:
---
Attachment: YARN-4530.1.patch

I found that 2.7.1 has the same problem, so I submitted a patch.

> LocalizedResource trigger a NPE Cause the NodeManager exit
> --
>
> Key: YARN-4530
> URL: https://issues.apache.org/jira/browse/YARN-4530
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.2.0, 2.7.1
>Reporter: tangshangwen
> Attachments: YARN-4530.1.patch
>
>
> In our cluster, I found that a failed LocalizedResource download triggers an 
> NPE that causes the NodeManager to shut down.
> {noformat}
> 2015-12-29 17:18:33,706 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
>  Resource 
> hdfs://ns3:8020/user/username/projects/user_insight/lookalike/oozie/workflow/conf/hive-site.xml
>  transitioned from DOWNLOADING to FAILED
> 2015-12-29 17:18:33,708 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
>  Downloading public rsrc:{ 
> hdfs://ns3/user/username/projects/user_insight/lookalike/oozie/workflow/lib/user_insight_pig_udf-0.0.1-SNAPSHOT-jar-with-dependencies.jar,
>  1451380519635, FILE, null }
> 2015-12-29 17:18:33,710 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
>  Failed to download rsrc { { 
> hdfs://ns3/user/username/projects/user_insight/lookalike/oozie/workflow/lib/unilever_support_udf-0.0.1-SNAPSHOT.jar,
>  1451380519452, FILE, null 
> },pending,[(container_1451039893865_261670_01_000578)],42332661980495938,DOWNLOADING}
> java.io.IOException: Resource 
> hdfs://ns3/user/username/projects/user_insight/lookalike/oozie/workflow/lib/unilever_support_udf-0.0.1-SNAPSHOT.jar
>  changed on src filesystem (expected 1451380519452, was 1451380611793
>   at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:176)
>   at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:276)
>   at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:50)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> 2015-12-29 17:18:33,710 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
>  Resource 
> hdfs://ns3/user/username/projects/user_insight/lookalike/oozie/workflow/lib/unilever_support_udf-0.0.1-SNAPSHOT.jar
>  transitioned from DOWNLOADING to FAILED
> 2015-12-29 17:18:33,710 FATAL 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
>  Error: Shutting down
> java.lang.NullPointerException at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$PublicLocalizer.run(ResourceLocalizationService.java:712)
> 2015-12-29 17:18:33,710 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
>  Public cache exiting
> {noformat}
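
The log above shows a download failing with an IOException, immediately followed by a NullPointerException in PublicLocalizer.run that shuts the service down. One plausible reading, and it is only an assumption because the source at ResourceLocalizationService.java:712 is not quoted here, is that the loop dereferences a bookkeeping entry for the completed download that is no longer present. The standalone Java sketch below illustrates guarding against that case. It uses no Hadoop classes and is not the attached patch; PublicLocalizerSketch, submit, drainOne and the pending map are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PublicLocalizerSketch {
    // Hypothetical stand-in for the map of in-flight downloads:
    // future of the download -> name of the resource being fetched.
    private final Map<Future<String>, String> pending = new ConcurrentHashMap<>();
    private final ExecutorService pool = Executors.newFixedThreadPool(2);
    private final ExecutorCompletionService<String> queue =
            new ExecutorCompletionService<>(pool);

    /** Starts a download; track=false simulates an entry missing from the map. */
    public void submit(String resource, Callable<String> download, boolean track) {
        Future<String> f = queue.submit(download);
        if (track) {
            pending.put(f, resource);
        }
    }

    /** Handles one finished download and reports what happened instead of throwing. */
    public String drainOne() throws InterruptedException {
        Future<String> completed = queue.take();
        String assoc = pending.remove(completed); // may be null
        try {
            String localPath = completed.get();
            if (assoc == null) {
                return "ignored: unknown resource localized to " + localPath;
            }
            return "localized: " + assoc + " -> " + localPath;
        } catch (ExecutionException e) {
            // The failure path needs the same null check as the success path;
            // without it, assoc would be dereferenced here.
            if (assoc == null) {
                return "ignored: untracked download failed: " + e.getCause().getMessage();
            }
            return "failed: " + assoc + ": " + e.getCause().getMessage();
        }
    }

    public void shutdown() {
        pool.shutdownNow();
    }

    public static void main(String[] args) throws InterruptedException {
        PublicLocalizerSketch localizer = new PublicLocalizerSketch();
        // A failing download with no bookkeeping entry: the loop survives it.
        localizer.submit("unilever_support_udf-0.0.1-SNAPSHOT.jar",
                () -> { throw new IOException("changed on src filesystem"); },
                false);
        System.out.println(localizer.drainOne());
        localizer.shutdown();
    }
}
```

The point of the sketch is only that a missing entry is logged and skipped in both branches, so a single failed download cannot take the whole loop down with it.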



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4530) LocalizedResource trigger a NPE Cause the NodeManager exit

2015-12-30 Thread tangshangwen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

tangshangwen updated YARN-4530:
---
Affects Version/s: 2.7.1



