[ 
https://issues.apache.org/jira/browse/YARN-980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13721231#comment-13721231
 ] 

Raghu C Doppalapudi commented on YARN-980:
------------------------------------------

Also, this incident does not happen every time; it is infrequent. Omkar, the 
entire stack trace follows:


2013-07-25 23:06:09,851 INFO  containermanager.ContainerManagerImpl - Start request for container_1374637086174_0533_01_000002 by user testuser
2013-07-25 23:06:09,851 INFO  containermanager.ContainerManagerImpl - Creating a new application reference for app application_1374637086174_0533
2013-07-25 23:06:09,851 INFO  nodemanager.NMAuditLogger - USER=testuser      IP=10.224.111.21        OPERATION=Start Container Request       TARGET=ContainerManageImpl      RESULT=SUCCESS  APPID=application_1374637086174_0533    CONTAINERID=container_1374637086174_0533_01_000002
2013-07-25 23:06:09,852 INFO  application.Application - Application application_1374637086174_0533 transitioned from NEW to INITING
2013-07-25 23:06:09,853 INFO  application.Application - Adding container_1374637086174_0533_01_000002 to application application_1374637086174_0533
2013-07-25 23:06:09,944 INFO  application.Application - Application application_1374637086174_0533 transitioned from INITING to RUNNING
2013-07-25 23:06:09,948 INFO  container.Container - Container container_1374637086174_0533_01_000002 transitioned from NEW to LOCALIZING
2013-07-25 23:06:09,948 INFO  containermanager.AuxServices - Got event APPLICATION_INIT for appId application_1374637086174_0533
2013-07-25 23:06:09,948 INFO  containermanager.AuxServices - Got APPLICATION_INIT for service mapreduce.shuffle
2013-07-25 23:06:09,948 INFO  mapred.ShuffleHandler - Added token for job_1374637086174_0533
2013-07-25 23:06:09,948 INFO  localizer.LocalizedResource - Resource hdfs://internal-EMPTY-gfist1/testuser-wsl/test_case/PreDriverTest/testJobWithFakeQueries/10383925454001385/sfdc_lib/10381068948677006/locale-sh-0.0.3.jar transitioned from INIT to DOWNLOADING
2013-07-25 23:06:09,948 INFO  localizer.LocalizedResource - Resource hdfs://internal-EMPTY-gfist1/testuser-wsl/test_case/PreDriverTest/testJobWithFakeQueries/10383925454001385/sfdc_lib/10381068948677006/hadoop-mapreduce-client-app-2.0.0-cdh4.2.1.jar transitioned from INIT to DOWNLOADING
2013-07-25 23:06:09,948 INFO  localizer.LocalizedResource - Resource hdfs://internal-EMPTY-gfist1/testuser-wsl/test_case/PreDriverTest/testJobWithFakeQueries/10383925454001385/sfdc_lib/10381068948677006/mahout-utils-0.5.jar transitioned from INIT to DOWNLOADING
2013-07-25 23:06:09,948 INFO  localizer.LocalizedResource - Resource hdfs://internal-EMPTY-gfist1/testuser-wsl/test_case/PreDriverTest/testJobWithFakeQueries/10383925454001385/sfdc_lib/10381068948677006/avro-1.3.0-rc1-sfdc-patch1.jar transitioned from INIT to DOWNLOADING
2013-07-25 23:06:09,948 INFO  localizer.LocalizedResource - Resource hdfs://internal-EMPTY-gfist1/testuser-wsl/test_case/PreDriverTest/testJobWithFakeQueries/10383925454001385/sfdc_lib/10381068948677006/jung-samples-2.0.1.jar transitioned from INIT to DOWNLOADING
2013-07-25 23:06:09,948 INFO  localizer.LocalizedResource - Resource hdfs://internal-EMPTY-gfist1/testuser-wsl/test_case/PreDriverTest/testJobWithFakeQueries/10383925454001385/sfdc_lib/10381068948677006/commons-math-2.1.jar transitioned from INIT to DOWNLOADING
2013-07-25 23:06:09,948 INFO  localizer.LocalizedResource - Resource hdfs://internal-EMPTY-gfist1/testuser-wsl/test_case/PreDriverTest/testJobWithFakeQueries/10383925454001385/sfdc_lib/10381068948677006/vtd-xml-2.6.jar transitioned from INIT to DOWNLOADING

... (around 30 jars are in the DOWNLOADING state)

2013-07-25 23:06:09,957 INFO  localizer.ResourceLocalizationService - Downloading public rsrc:{ hdfs://internal-EMPTY-gfist1/testuser-wsl/test_case/PreDriverTest/testJobWithFakeQueries/10383925454001385/sfdc_lib/10381068948677006/locale-sh-0.0.3.jar, 1374793533752, FILE, null }
2013-07-25 23:06:09,957 FATAL event.AsyncDispatcher - Error in dispatcher thread
java.util.concurrent.RejectedExecutionException
        at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:1768)
        at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767)
        at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:658)
        at java.util.concurrent.ExecutorCompletionService.submit(ExecutorCompletionService.java:152)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$PublicLocalizer.addResource(ResourceLocalizationService.java:621)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.handle(ResourceLocalizationService.java:516)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.handle(ResourceLocalizationService.java:458)
        at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:128)
        at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:77)
        at java.lang.Thread.run(Thread.java:662)
2013-07-25 23:06:09,958 INFO  event.AsyncDispatcher - Exiting, bbye..
2013-07-25 23:06:09,959 INFO  service.AbstractService - Service:Dispatcher is stopped.
2013-07-25 23:06:09,986 INFO  mortbay.log - Stopped [email protected]:8042
2013-07-25 23:06:10,086 INFO  service.AbstractService - Service:org.apache.hadoop.yarn.server.nodemanager.webapp.WebServer is stopped.
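
For readers unfamiliar with this exception: the trace above shows 
ExecutorCompletionService.submit() reaching ThreadPoolExecutor's AbortPolicy, 
which throws RejectedExecutionException when the pool will not accept a task. 
One common way to get that (an assumption for illustration, not a confirmed 
diagnosis of this issue) is submitting to a pool that has already been shut 
down. The class and method names below are hypothetical and this is not the 
NodeManager code, only a minimal standalone sketch of the JDK behaviour:

```java
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

public class RejectedSubmitDemo {

    /** Returns true if a submit to an already shut-down pool is rejected. */
    static boolean submitAfterShutdownIsRejected() {
        // newFixedThreadPool builds a ThreadPoolExecutor that uses the
        // default rejection handler, AbortPolicy.
        ExecutorService pool = Executors.newFixedThreadPool(4);
        ExecutorCompletionService<String> queue =
            new ExecutorCompletionService<>(pool);

        // Hypothetical trigger: the pool is shut down before the submit.
        pool.shutdown();

        try {
            queue.submit(() -> "downloaded");
            return false; // the task was accepted
        } catch (RejectedExecutionException e) {
            return true;  // AbortPolicy rejected the task
        }
    }

    public static void main(String[] args) {
        System.out.println("rejected = " + submitAfterShutdownIsRejected());
    }
}
```

If the caller does not catch the exception, it propagates up the calling 
thread, which in the trace above is the dispatcher thread.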


                
> Nodemanager is shutting down while executing a mapreduce job
> ------------------------------------------------------------
>
>                 Key: YARN-980
>                 URL: https://issues.apache.org/jira/browse/YARN-980
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>            Reporter: Raghu C Doppalapudi
>            Assignee: Vinod Kumar Vavilapalli
>            Priority: Critical
>
> 2013-07-24 11:00:26,582 FATAL event.AsyncDispatcher - Error in dispatcher thread
> java.util.concurrent.RejectedExecutionException
> at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:1768)
> at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767)
> at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:658)
> at java.util.concurrent.ExecutorCompletionService.submit(ExecutorCompletionService.java:152)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$PublicLocalizer.addResource(ResourceLocalizationService.java:621)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.handle(ResourceLocalizationService.java:516)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.handle(ResourceLocalizationService.java:458)
> at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:128)
> at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:77)
> at java.lang.Thread.run(Thread.java:662)
> 2013-07-24 11:00:26,582 INFO event.AsyncDispatcher - Exiting, bbye..
> 2013-07-24 11:00:26,583 INFO service.AbstractService - Service:Dispatcher is stopped.
> 2013-07-24 11:00:26,585 INFO mortbay.log - Stopped [email protected]:8042
> 2013-07-24 11:00:26,686 INFO service.AbstractService - Service:org.apache.hadoop.yarn.server.nodemanager.webapp.WebServer is stopped.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
