I was able to run the example by submitting the job as the yarn user. For example, the following completes successfully:

    sudo -u yarn yarn org.apache.hadoop.examples.RandomWriter /tmp/random-out

Whereas this fails with the local user rpaulk:

    yarn org.apache.hadoop.examples.RandomWriter /tmp/random-out

On Wed, Jul 31, 2013 at 2:28 PM, Rod Paulk <rmang...@gmail.com> wrote:
> I am having an issue running 2.0.5-alpha (BigTop-0.6.0) YARN-MapReduce on
> the local filesystem instead of HDFS. The appTokens file that the error
> states is missing, does exist after the job fails. I saw other 'similar'
> issues noted in YARN-917, YARN-513, YARN-993. When I switch to HDFS, the
> jobs run fine.
>
> In core-site.xml
> <property>
>   <name>fs.defaultFS</name>
>   <value>file:///</value>
> </property>
>
> In mapred-site.xml
> <property>
>   <name>mapreduce.framework.name</name>
>   <value>yarn</value>
> </property>
>
> 2013-07-29 16:13:06,549 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Start request for container_1375138534137_0003_01_000001 by user rpaulk
>
> 2013-07-29 16:13:06,549 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Creating a new application reference for app application_1375138534137_0003
>
> 2013-07-29 16:13:06,549 INFO org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=rpaulk IP=172.20.130.215 OPERATION=Start Container Request TARGET=ContainerManageImpl RESULT=SUCCESS APPID=application_1375138534137_0003 CONTAINERID=container_1375138534137_0003_01_000001
>
> 2013-07-29 16:13:06,551 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Application application_1375138534137_0003 transitioned from NEW to INITING
>
> 2013-07-29 16:13:06,551 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Adding container_1375138534137_0003_01_000001 to application application_1375138534137_0003
>
> 2013-07-29 16:13:06,554 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Application application_1375138534137_0003 transitioned from INITING to
> RUNNING
>
> 2013-07-29 16:13:06,555 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1375138534137_0003_01_000001 transitioned from NEW to LOCALIZING
>
> *2013-07-29 16:13:06,555 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource file:/nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging/job_1375138534137_0003/appTokens transitioned from INIT to DOWNLOADING*
>
> 2013-07-29 16:13:06,556 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource file:/nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging/job_1375138534137_0003/job.jar transitioned from INIT to DOWNLOADING
>
> 2013-07-29 16:13:06,556 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource file:/nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging/job_1375138534137_0003/job.splitmetainfo transitioned from INIT to DOWNLOADING
>
> 2013-07-29 16:13:06,556 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource file:/nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging/job_1375138534137_0003/job.split transitioned from INIT to DOWNLOADING
>
> 2013-07-29 16:13:06,556 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource file:/nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging/job_1375138534137_0003/job.xml transitioned from INIT to DOWNLOADING
>
> 2013-07-29 16:13:06,556 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Created localizer for container_1375138534137_0003_01_000001
>
> 2013-07-29 16:13:06,559 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Writing credentials to the nmPrivate file
> /var/lib/hadoop-yarn/cache/yarn/nm-local-dir/nmPrivate/container_1375138534137_0003_01_000001.tokens. Credentials list:
>
> 2013-07-29 16:13:06,560 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Initializing user rpaulk
>
> 2013-07-29 16:13:06,564 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Copying from /var/lib/hadoop-yarn/cache/yarn/nm-local-dir/nmPrivate/container_1375138534137_0003_01_000001.tokens to /var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/rpaulk/appcache/application_1375138534137_0003/container_1375138534137_0003_01_000001.tokens
>
> 2013-07-29 16:13:06,564 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: CWD set to /var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/rpaulk/appcache/application_1375138534137_0003 = file:/var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/rpaulk/appcache/application_1375138534137_0003
>
> *2013-07-29 16:13:06,646 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:rpaulk (auth:SIMPLE) cause:java.io.FileNotFoundException: File file:/nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging/job_1375138534137_0003/appTokens does not exist*
>
> 2013-07-29 16:13:06,648 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: DEBUG: FAILED { file:/nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging/job_1375138534137_0003/appTokens, 1375139459000, FILE, null }
>
> RemoteTrace:
> java.io.FileNotFoundException: File file:/nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging/job_1375138534137_0003/appTokens does not exist
>     at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:492)
>     at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:395)
>     at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:176)
>     at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:51)
>     at org.apache.hadoop.yarn.util.FSDownload$1.run(FSDownload.java:284)
>     at org.apache.hadoop.yarn.util.FSDownload$1.run(FSDownload.java:282)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1478)
>     at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:280)
>     at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:51)
>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>     at java.lang.Thread.run(Thread.java:662)
>     at LocalTrace:
> org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl: File file:/nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging/job_1375138534137_0003/appTokens does not exist
>     at org.apache.hadoop.yarn.server.nodemanager.api.protocolrecords.impl.pb.LocalResourceStatusPBImpl.convertFromProtoFormat(LocalResourceStatusPBImpl.java:217)
>     at org.apache.hadoop.yarn.server.nodemanager.api.protocolrecords.impl.pb.LocalResourceStatusPBImpl.getException(LocalResourceStatusPBImpl.java:147)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.update(ResourceLocalizationService.java:819)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.processHeartbeat(ResourceLocalizationService.java:491)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.heartbeat(ResourceLocalizationService.java:218)
>     at org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.service.LocalizationProtocolPBServiceImpl.heartbeat(LocalizationProtocolPBServiceImpl.java:46)
>     at org.apache.hadoop.yarn.proto.LocalizationProtocol$LocalizationProtocolService$2.callBlockingMethod(LocalizationProtocol.java:57)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:454)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1014)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1741)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1737)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1478)
>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1735)
>
> 2013-07-29 16:13:06,650 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1375138534137_0003_01_000001 transitioned from LOCALIZING to LOCALIZATION_FAILED
>
> *2013-07-29 16:13:06,650 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource file:/nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging/job_1375138534137_0003/appTokens transitioned from DOWNLOADING to INIT*
>
> 2013-07-29 16:13:06,650 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource file:/nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging/job_1375138534137_0003/job.jar transitioned from DOWNLOADING to INIT
>
> 2013-07-29 16:13:06,650 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource file:/nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging/job_1375138534137_0003/job.splitmetainfo transitioned from DOWNLOADING to INIT
>
> 2013-07-29 16:13:06,650 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource file:/nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging/job_1375138534137_0003/job.split transitioned from DOWNLOADING to INIT
>
> 2013-07-29 16:13:06,650 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource file:/nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging/job_1375138534137_0003/job.xml transitioned from DOWNLOADING to INIT
>
> 2013-07-29 16:13:06,652 WARN org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=rpaulk OPERATION=Container Finished - Failed TARGET=ContainerImpl RESULT=FAILURE DESCRIPTION=Container failed with state: LOCALIZATION_FAILED APPID=application_1375138534137_0003 CONTAINERID=container_1375138534137_0003_01_000001
>
> 2013-07-29 16:13:06,652 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1375138534137_0003_01_000001 transitioned from LOCALIZATION_FAILED to DONE
>
> 2013-07-29 16:13:06,652 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Removing container_1375138534137_0003_01_000001 from application application_1375138534137_0003
>
> 2013-07-29 16:13:06,652 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl: Considering container container_1375138534137_0003_01_000001 for log-aggregation
>
> 2013-07-29 16:13:06,652 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting absolute path : /var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/rpaulk/appcache/application_1375138534137_0003/container_1375138534137_0003_01_000001
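
Since the job completes as yarn but fails as rpaulk, and the appTokens file is present after the failure, one thing that may be worth ruling out is file permissions on the staging path. This is only a guess; nothing in the log above confirms it. Below is a minimal Python sketch for checking that, assuming the NodeManager runs as an OS user that is neither the owner of the staging files nor in their group. The helper names `ancestors` and `blocked_for_others` are made up for illustration and are not part of Hadoop.

```python
import os
import stat
import sys


def ancestors(path):
    """Return path and all of its parent directories, root first."""
    chain = [os.path.abspath(path)]
    while os.path.dirname(chain[-1]) != chain[-1]:
        chain.append(os.path.dirname(chain[-1]))
    return list(reversed(chain))


def blocked_for_others(path):
    """List (component, mode) pairs that lack the 'other' permission bit.

    Directories need the other-execute bit to be traversed; the final file
    needs the other-read bit. Assumption: the reader is neither the owner
    nor a member of the owning group, so only the 'other' bits apply.
    """
    blocked = []
    for component in ancestors(path):
        mode = os.stat(component).st_mode  # raises if the caller cannot stat it
        needed = stat.S_IXOTH if stat.S_ISDIR(mode) else stat.S_IROTH
        if not mode & needed:
            blocked.append((component, stat.filemode(mode)))
    return blocked


if __name__ == "__main__" and len(sys.argv) > 1:
    # Print every path component that another user could not get through.
    for component, mode in blocked_for_others(sys.argv[1]):
        print(mode, component)
```

Run as rpaulk right after a failed job, for instance against the appTokens path printed in the error, it lists any directory or file along that path that other users cannot traverse or read. An empty result would suggest the cause lies elsewhere.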