Hi,

When you construct the YarnTwillRunnerService instance, did you provide it with a YarnConfiguration that has fs.default.name pointing to the HDFS name node of your cluster, or did you provide a LocationFactory when creating the instance? Twill depends on that information in order to be able to use HDFS.
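A quick way to confirm the symptom is to look at the scheme of the resource URI that the NodeManager tries to localize. The sketch below uses only the JDK and is not part of Twill or Hadoop; the class name ResourceSchemeCheck, the method isLocalOnly, and the hdfs://namenode:8020 address are hypothetical placeholders. A "file:" scheme, as in the quoted log, suggests the files were written to the local file system of the submitting machine, so other nodes cannot fetch them.

```java
import java.net.URI;

// Hypothetical helper for illustration only; not part of Twill or Hadoop.
final class ResourceSchemeCheck {

    // Returns true when the resource URI refers to the local file system of
    // a single machine, which remote NodeManagers cannot read from.
    static boolean isLocalOnly(String resource) {
        String scheme = URI.create(resource).getScheme();
        // A URI without a scheme is treated as a plain local path.
        return scheme == null || scheme.equalsIgnoreCase("file");
    }

    public static void main(String[] args) {
        // Resource URI copied from the NodeManager log in the quoted message.
        String fromLog = "file:/home/ubuntu/DrillbitRunnable/"
                + "c134a771-c65f-4595-b51a-dc6d282cd4ad/"
                + "logback-template.7c14a388-e69b-4133-bfb8-6102445c8098.xml";
        // Placeholder HDFS URI; the host, port and path are assumptions.
        String onHdfs = "hdfs://namenode:8020/twill/logback-template.xml";

        System.out.println("resource from log is local only: " + isLocalOnly(fromLog));
        System.out.println("HDFS resource is local only: " + isLocalOnly(onHdfs));
    }
}
```

If the resources in your log all start with "file:", that points to the configuration question above rather than to a problem on the remote nodes.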
Terence

Sent from my iPhone

> On Jun 21, 2014, at 3:48 AM, Peter Grman <[email protected]> wrote:
>
> Hi,
>
> When I run my application only on the local nodemanager, everything works
> fine. But when I try to start it on multiple nodes, it fails. Looking in
> the nodemanager logs I could find a possible cause for the error:
>
> 2014-06-21 10:24:05,118 WARN
> org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException
> as:ubuntu (auth:SIMPLE) cause:java.io.FileNotFoundException: File
> file:/home/ubuntu/DrillbitRunnable/c134a771-c65f-4595-b51a-dc6d282cd4ad/logback-template.7c14a388-e69b-4133-bfb8-6102445c8098.xml
> does not exist
>
> This path is only valid for the local machine where Twill generates the
> folder and files, but not for the remote machine. Why isn't that file
> copied to HDFS for distribution? Is there anything I could change, so the
> file gets delivered together with everything else? - I'm using Apache Twill
> 0.3.0-snapshot. And Hadoop 2.3.0 as well as Hadoop 2.3.0 libraries in my
> application
>
> Thank you
>
> The complete log file can be found formatted here:
> https://gist.github.com/pgrm/68d07084b1e2cb9e2ce4
> And is also below here:
>
> 2014-06-21 10:24:03,729 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl:
> Stopping resource-monitoring for container_1403345039835_0001_01_000009
> 2014-06-21 10:24:04,961 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl:
> Start request for container_1403345039835_0001_01_000010 by user ubuntu
> 2014-06-21 10:24:04,961 INFO
> org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=ubuntu
> IP=10.216.60.23 OPERATION=Start Container Request
> TARGET=ContainerManageImpl RESULT=SUCCESS
> APPID=application_1403345039835_0001
> CONTAINERID=container_1403345039835_0001_01_000010
> 2014-06-21 10:24:04,961 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
> Adding container_1403345039835_0001_01_000010 to application
> application_1403345039835_0001
> 2014-06-21 10:24:04,966 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
> Container container_1403345039835_0001_01_000010 transitioned from NEW to
> LOCALIZING
> 2014-06-21 10:24:04,966 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got
> event CONTAINER_INIT for appId application_1403345039835_0001
> 2014-06-21 10:24:04,966 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
> Resource
> file:/home/ubuntu/DrillbitRunnable/c134a771-c65f-4595-b51a-dc6d282cd4ad/logback-template.7c14a388-e69b-4133-bfb8-6102445c8098.xml
> transitioned from INIT to DOWNLOADING
> 2014-06-21 10:24:04,967 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
> Created localizer for container_1403345039835_0001_01_000010
> 2014-06-21 10:24:04,977 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
> Writing credentials to the nmPrivate file
> /tmp/hadoop-ubuntu/nm-local-dir/nmPrivate/container_1403345039835_0001_01_000010.tokens.
> Credentials list:
> 2014-06-21 10:24:04,980 INFO
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor:
> Initializing user ubuntu
> 2014-06-21 10:24:05,050 INFO
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Copying
> from
> /tmp/hadoop-ubuntu/nm-local-dir/nmPrivate/container_1403345039835_0001_01_000010.tokens
> to
> /tmp/hadoop-ubuntu/nm-local-dir/usercache/ubuntu/appcache/application_1403345039835_0001/container_1403345039835_0001_01_000010.tokens
> 2014-06-21 10:24:05,050 INFO
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: CWD set
> to
> /tmp/hadoop-ubuntu/nm-local-dir/usercache/ubuntu/appcache/application_1403345039835_0001
> =
> file:/tmp/hadoop-ubuntu/nm-local-dir/usercache/ubuntu/appcache/application_1403345039835_0001
> 2014-06-21 10:24:05,118 WARN
> org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException
> as:ubuntu (auth:SIMPLE) cause:java.io.FileNotFoundException: File
> file:/home/ubuntu/DrillbitRunnable/c134a771-c65f-4595-b51a-dc6d282cd4ad/logback-template.7c14a388-e69b-4133-bfb8-6102445c8098.xml
> does not exist
> 2014-06-21 10:24:05,120 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
> DEBUG: FAILED {
> file:/home/ubuntu/DrillbitRunnable/c134a771-c65f-4595-b51a-dc6d282cd4ad/logback-template.7c14a388-e69b-4133-bfb8-6102445c8098.xml,
> 1403346214000, FILE, null }, File
> file:/home/ubuntu/DrillbitRunnable/c134a771-c65f-4595-b51a-dc6d282cd4ad/logback-template.7c14a388-e69b-4133-bfb8-6102445c8098.xml
> does not exist
> 2014-06-21 10:24:05,121 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
> Resource
> file:/home/ubuntu/DrillbitRunnable/c134a771-c65f-4595-b51a-dc6d282cd4ad/logback-template.7c14a388-e69b-4133-bfb8-6102445c8098.xml
> transitioned from DOWNLOADING to FAILED
> 2014-06-21 10:24:05,121 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
> Container container_1403345039835_0001_01_000010 transitioned from
> LOCALIZING to LOCALIZATION_FAILED
> 2014-06-21 10:24:05,121 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl:
> Container container_1403345039835_0001_01_000010 sent RELEASE event on a
> resource request {
> file:/home/ubuntu/DrillbitRunnable/c134a771-c65f-4595-b51a-dc6d282cd4ad/logback-template.7c14a388-e69b-4133-bfb8-6102445c8098.xml,
> 1403346214000, FILE, null } not present in cache.
> 2014-06-21 10:24:05,122 WARN org.apache.hadoop.ipc.Client: interrupted
> waiting to send rpc request to server
> java.lang.InterruptedException
> at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:400)
> at java.util.concurrent.FutureTask.get(FutureTask.java:187)
> at
> org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1025)
> at org.apache.hadoop.ipc.Client.call(Client.java:1379)
> at org.apache.hadoop.ipc.Client.call(Client.java:1359)
> at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> at com.sun.proxy.$Proxy25.heartbeat(Unknown Source)
> at
> org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:62)
> at
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:255)
> at
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:169)
> at
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:107)
> at
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:981)
> 2014-06-21 10:24:05,122 WARN
> org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=ubuntu
> OPERATION=Container Finished - Failed TARGET=ContainerImpl
> RESULT=FAILURE DESCRIPTION=Container failed with state:
> LOCALIZATION_FAILED APPID=application_1403345039835_0001
> CONTAINERID=container_1403345039835_0001_01_000010
> 2014-06-21 10:24:05,122 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
> Unknown localizer with localizerId container_1403345039835_0001_01_000010
> is sending heartbeat. Ordering it to DIE
> 2014-06-21 10:24:05,122 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
> Container container_1403345039835_0001_01_000010 transitioned from
> LOCALIZATION_FAILED to DONE
> 2014-06-21 10:24:05,122 INFO
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor:
> Deleting absolute path :
> /tmp/hadoop-ubuntu/nm-local-dir/usercache/ubuntu/appcache/application_1403345039835_0001/container_1403345039835_0001_01_000010
> 2014-06-21 10:24:05,123 WARN
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: delete
> returned false for path:
> [/tmp/hadoop-ubuntu/nm-local-dir/usercache/ubuntu/appcache/application_1403345039835_0001/container_1403345039835_0001_01_000010]
> 2014-06-21 10:24:05,123 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
> Removing container_1403345039835_0001_01_000010 from application
> application_1403345039835_0001
> 2014-06-21 10:24:05,123 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got
> event CONTAINER_STOP for appId application_1403345039835_0001
> 2014-06-21 10:24:05,968 INFO
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Removed
> completed container container_1403345039835_0001_01_000010
> 2014-06-21 10:24:06,730 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl:
> Stopping resource-monitoring for container_1403345039835_0001_01_000010
