[
https://issues.apache.org/jira/browse/YARN-980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13721231#comment-13721231
]
Raghu C Doppalapudi commented on YARN-980:
------------------------------------------
Also, this incident does not happen every time; it is infrequent. Omkar, the
entire log and stack trace follow:
2013-07-25 23:06:09,851 INFO containermanager.ContainerManagerImpl - Start
request for container_1374637086174_0533_01_000002 by user testuser
2013-07-25 23:06:09,851 INFO containermanager.ContainerManagerImpl - Creating
a new application reference for app application_1374637086174_0533
2013-07-25 23:06:09,851 INFO nodemanager.NMAuditLogger - USER=testuser
IP=10.224.111.21 OPERATION=Start Container Request
TARGET=ContainerManageImpl RESULT=SUCCESS
APPID=application_1374637086174_0533
CONTAINERID=container_1374637086174_0533_01_000002
2013-07-25 23:06:09,852 INFO application.Application - Application
application_1374637086174_0533 transitioned from NEW to INITING
2013-07-25 23:06:09,853 INFO application.Application - Adding
container_1374637086174_0533_01_000002 to application
application_1374637086174_0533
2013-07-25 23:06:09,944 INFO application.Application - Application
application_1374637086174_0533 transitioned from INITING to RUNNING
2013-07-25 23:06:09,948 INFO container.Container - Container
container_1374637086174_0533_01_000002 transitioned from NEW to LOCALIZING
2013-07-25 23:06:09,948 INFO containermanager.AuxServices - Got event
APPLICATION_INIT for appId application_1374637086174_0533
2013-07-25 23:06:09,948 INFO containermanager.AuxServices - Got
APPLICATION_INIT for service mapreduce.shuffle
2013-07-25 23:06:09,948 INFO mapred.ShuffleHandler - Added token for
job_1374637086174_0533
2013-07-25 23:06:09,948 INFO localizer.LocalizedResource - Resource
hdfs://internal-EMPTY-gfist1/testuser-wsl/test_case/PreDriverTest/testJobWithFakeQueries/10383925454001385/sfdc_lib/10381068948677006/locale-sh-0.0.3.jar
transitioned from INIT to DOWNLOADING
2013-07-25 23:06:09,948 INFO localizer.LocalizedResource - Resource
hdfs://internal-EMPTY-gfist1/testuser-wsl/test_case/PreDriverTest/testJobWithFakeQueries/10383925454001385/sfdc_lib/10381068948677006/hadoop-mapreduce-client-app-2.0.0-cdh4.2.1.jar
transitioned from INIT to DOWNLOADING
2013-07-25 23:06:09,948 INFO localizer.LocalizedResource - Resource
hdfs://internal-EMPTY-gfist1/testuser-wsl/test_case/PreDriverTest/testJobWithFakeQueries/10383925454001385/sfdc_lib/10381068948677006/mahout-utils-0.5.jar
transitioned from INIT to DOWNLOADING
2013-07-25 23:06:09,948 INFO localizer.LocalizedResource - Resource
hdfs://internal-EMPTY-gfist1/testuser-wsl/test_case/PreDriverTest/testJobWithFakeQueries/10383925454001385/sfdc_lib/10381068948677006/avro-1.3.0-rc1-sfdc-patch1.jar
transitioned from INIT to DOWNLOADING
2013-07-25 23:06:09,948 INFO localizer.LocalizedResource - Resource
hdfs://internal-EMPTY-gfist1/testuser-wsl/test_case/PreDriverTest/testJobWithFakeQueries/10383925454001385/sfdc_lib/10381068948677006/jung-samples-2.0.1.jar
transitioned from INIT to DOWNLOADING
2013-07-25 23:06:09,948 INFO localizer.LocalizedResource - Resource
hdfs://internal-EMPTY-gfist1/testuser-wsl/test_case/PreDriverTest/testJobWithFakeQueries/10383925454001385/sfdc_lib/10381068948677006/commons-math-2.1.jar
transitioned from INIT to DOWNLOADING
2013-07-25 23:06:09,948 INFO localizer.LocalizedResource - Resource
hdfs://internal-EMPTY-gfist1/testuser-wsl/test_case/PreDriverTest/testJobWithFakeQueries/10383925454001385/sfdc_lib/10381068948677006/vtd-xml-2.6.jar
transitioned from INIT to DOWNLOADING
... around 30 jars are in the DOWNLOADING state.
2013-07-25 23:06:09,957 INFO localizer.ResourceLocalizationService -
Downloading public rsrc:{
hdfs://internal-EMPTY-gfist1/testuser-wsl/test_case/PreDriverTest/testJobWithFakeQueries/10383925454001385/sfdc_lib/10381068948677006/locale-sh-0.0.3.jar,
1374793533752, FILE, null }
2013-07-25 23:06:09,957 FATAL event.AsyncDispatcher - Error in dispatcher thread
java.util.concurrent.RejectedExecutionException
at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:1768)
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767)
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:658)
at java.util.concurrent.ExecutorCompletionService.submit(ExecutorCompletionService.java:152)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$PublicLocalizer.addResource(ResourceLocalizationService.java:621)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.handle(ResourceLocalizationService.java:516)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.handle(ResourceLocalizationService.java:458)
at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:128)
at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:77)
at java.lang.Thread.run(Thread.java:662)
2013-07-25 23:06:09,958 INFO event.AsyncDispatcher - Exiting, bbye..
2013-07-25 23:06:09,959 INFO service.AbstractService - Service:Dispatcher is
stopped.
2013-07-25 23:06:09,986 INFO mortbay.log - Stopped
[email protected]:8042
2013-07-25 23:06:10,086 INFO service.AbstractService -
Service:org.apache.hadoop.yarn.server.nodemanager.webapp.WebServer is stopped.
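
For reference, the top frames of the trace (ExecutorCompletionService.submit
-> ThreadPoolExecutor.execute -> AbortPolicy.rejectedExecution) can be
reproduced outside of Hadoop. The sketch below is only an illustration of how
the JDK raises this exception, not a statement about the root cause of this
issue: it assumes the pool was already shut down when the task was submitted,
which is one of the conditions under which the default AbortPolicy rejects
work (a saturated bounded pool is another). The class and method names are
hypothetical, and it uses Java 8 syntax for brevity.

```java
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

// Hypothetical stand-alone demo; not part of the NodeManager code base.
public class RejectionDemo {

    /** Returns true if a submit to an already shut-down pool is rejected. */
    static boolean rejectedAfterShutdown() {
        // newFixedThreadPool returns a ThreadPoolExecutor that uses the
        // default AbortPolicy rejection handler.
        ExecutorService pool = Executors.newFixedThreadPool(2);
        ExecutorCompletionService<String> queue =
                new ExecutorCompletionService<>(pool);

        // Assumption for this sketch: the pool is shut down before the submit.
        pool.shutdown();

        try {
            queue.submit(() -> "downloaded");
            return false; // not reached: the submit above is rejected
        } catch (RejectedExecutionException e) {
            // If a caller does not catch this, it propagates up the calling
            // thread, as it does through the dispatcher thread in the trace.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("rejected after shutdown: " + rejectedAfterShutdown());
    }
}
```

Running main prints "rejected after shutdown: true".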
> Nodemanager is shutting down while executing a mapreduce job
> ------------------------------------------------------------
>
> Key: YARN-980
> URL: https://issues.apache.org/jira/browse/YARN-980
> Project: Hadoop YARN
> Issue Type: Bug
> Components: nodemanager
> Reporter: Raghu C Doppalapudi
> Assignee: Vinod Kumar Vavilapalli
> Priority: Critical
>
> 2013-07-24 11:00:26,582 FATAL event.AsyncDispatcher - Error in dispatcher
> thread
> java.util.concurrent.RejectedExecutionException
> at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:1768)
> at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767)
> at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:658)
> at java.util.concurrent.ExecutorCompletionService.submit(ExecutorCompletionService.java:152)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$PublicLocalizer.addResource(ResourceLocalizationService.java:621)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.handle(ResourceLocalizationService.java:516)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.handle(ResourceLocalizationService.java:458)
> at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:128)
> at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:77)
> at java.lang.Thread.run(Thread.java:662)
> 2013-07-24 11:00:26,582 INFO event.AsyncDispatcher - Exiting, bbye..
> 2013-07-24 11:00:26,583 INFO service.AbstractService - Service:Dispatcher is
> stopped.
> 2013-07-24 11:00:26,585 INFO mortbay.log - Stopped
> [email protected]:8042
> 2013-07-24 11:00:26,686 INFO service.AbstractService -
> Service:org.apache.hadoop.yarn.server.nodemanager.webapp.WebServer is stopped.