Hi,

When I run my application only on the local NodeManager, everything works
fine. But when I try to start it on multiple nodes, it fails. In the
NodeManager logs I found a possible cause of the error:

2014-06-21 10:24:05,118 WARN
org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException
as:ubuntu (auth:SIMPLE) cause:java.io.FileNotFoundException: File
file:/home/ubuntu/DrillbitRunnable/c134a771-c65f-4595-b51a-dc6d282cd4ad/logback-template.7c14a388-e69b-4133-bfb8-6102445c8098.xml
does not exist

This path is only valid on the local machine, where Twill generates the
folder and files, but not on the remote machines. Why isn't that file
copied to HDFS for distribution? Is there anything I could change so that
the file gets delivered together with everything else? I'm using Apache
Twill 0.3.0-snapshot with Hadoop 2.3.0, and my application also uses the
Hadoop 2.3.0 libraries.
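
To make the problem concrete, here is a minimal sketch of the check I have in
mind. It is not Twill or YARN code; the class name, the method name and the
hdfs:// URI are made up for illustration. It only shows that the resource from
the log uses the file: scheme, so every NodeManager resolves it against its
own disk, whereas a resource on a shared file system would use a scheme such
as hdfs:.

```java
import java.net.URI;

public class ResourceSchemeCheck {

    // Hypothetical helper: returns true when the resource URI has no scheme
    // or the "file" scheme, i.e. it refers to the disk of whichever node
    // resolves it rather than to a shared file system.
    static boolean isNodeLocal(String resource) {
        String scheme = URI.create(resource).getScheme();
        return scheme == null || scheme.equalsIgnoreCase("file");
    }

    public static void main(String[] args) {
        // Resource URI exactly as it appears in the NodeManager log.
        String fromLog = "file:/home/ubuntu/DrillbitRunnable/"
                + "c134a771-c65f-4595-b51a-dc6d282cd4ad/"
                + "logback-template.7c14a388-e69b-4133-bfb8-6102445c8098.xml";

        // Assumed example of a shared location; host, port and path are invented.
        String shared = "hdfs://namenode:8020/twill/logback-template.xml";

        System.out.println("log resource node-local: " + isNodeLocal(fromLog));
        System.out.println("hdfs resource node-local: " + isNodeLocal(shared));
    }
}
```

If the resource were registered with a shared URI instead, I would expect the
remote NodeManagers to be able to localize it.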

Thank you

The complete log file is available, formatted, here:
https://gist.github.com/pgrm/68d07084b1e2cb9e2ce4
It is also included below:

2014-06-21 10:24:03,729 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl:
Stopping resource-monitoring for container_1403345039835_0001_01_000009
2014-06-21 10:24:04,961 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl:
Start request for container_1403345039835_0001_01_000010 by user ubuntu
2014-06-21 10:24:04,961 INFO
org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=ubuntu
IP=10.216.60.23 OPERATION=Start Container Request
TARGET=ContainerManageImpl      RESULT=SUCCESS
 APPID=application_1403345039835_0001
 CONTAINERID=container_1403345039835_0001_01_000010
2014-06-21 10:24:04,961 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
Adding container_1403345039835_0001_01_000010 to application
application_1403345039835_0001
2014-06-21 10:24:04,966 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
Container container_1403345039835_0001_01_000010 transitioned from NEW to
LOCALIZING
2014-06-21 10:24:04,966 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got
event CONTAINER_INIT for appId application_1403345039835_0001
2014-06-21 10:24:04,966 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
Resource
file:/home/ubuntu/DrillbitRunnable/c134a771-c65f-4595-b51a-dc6d282cd4ad/logback-template.7c14a388-e69b-4133-bfb8-6102445c8098.xml
transitioned from INIT to DOWNLOADING
2014-06-21 10:24:04,967 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
Created localizer for container_1403345039835_0001_01_000010
2014-06-21 10:24:04,977 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
Writing credentials to the nmPrivate file
/tmp/hadoop-ubuntu/nm-local-dir/nmPrivate/container_1403345039835_0001_01_000010.tokens.
Credentials list:
2014-06-21 10:24:04,980 INFO
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor:
Initializing user ubuntu
2014-06-21 10:24:05,050 INFO
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Copying
from
/tmp/hadoop-ubuntu/nm-local-dir/nmPrivate/container_1403345039835_0001_01_000010.tokens
to
/tmp/hadoop-ubuntu/nm-local-dir/usercache/ubuntu/appcache/application_1403345039835_0001/container_1403345039835_0001_01_000010.tokens
2014-06-21 10:24:05,050 INFO
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: CWD set
to
/tmp/hadoop-ubuntu/nm-local-dir/usercache/ubuntu/appcache/application_1403345039835_0001
=
file:/tmp/hadoop-ubuntu/nm-local-dir/usercache/ubuntu/appcache/application_1403345039835_0001
2014-06-21 10:24:05,118 WARN
org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException
as:ubuntu (auth:SIMPLE) cause:java.io.FileNotFoundException: File
file:/home/ubuntu/DrillbitRunnable/c134a771-c65f-4595-b51a-dc6d282cd4ad/logback-template.7c14a388-e69b-4133-bfb8-6102445c8098.xml
does not exist
2014-06-21 10:24:05,120 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
DEBUG: FAILED {
file:/home/ubuntu/DrillbitRunnable/c134a771-c65f-4595-b51a-dc6d282cd4ad/logback-template.7c14a388-e69b-4133-bfb8-6102445c8098.xml,
1403346214000, FILE, null }, File
file:/home/ubuntu/DrillbitRunnable/c134a771-c65f-4595-b51a-dc6d282cd4ad/logback-template.7c14a388-e69b-4133-bfb8-6102445c8098.xml
does not exist
2014-06-21 10:24:05,121 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
Resource
file:/home/ubuntu/DrillbitRunnable/c134a771-c65f-4595-b51a-dc6d282cd4ad/logback-template.7c14a388-e69b-4133-bfb8-6102445c8098.xml
transitioned from DOWNLOADING to FAILED
2014-06-21 10:24:05,121 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
Container container_1403345039835_0001_01_000010 transitioned from
LOCALIZING to LOCALIZATION_FAILED
2014-06-21 10:24:05,121 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl:
Container container_1403345039835_0001_01_000010 sent RELEASE event on a
resource request {
file:/home/ubuntu/DrillbitRunnable/c134a771-c65f-4595-b51a-dc6d282cd4ad/logback-template.7c14a388-e69b-4133-bfb8-6102445c8098.xml,
1403346214000, FILE, null } not present in cache.
2014-06-21 10:24:05,122 WARN org.apache.hadoop.ipc.Client: interrupted
waiting to send rpc request to server
java.lang.InterruptedException
        at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:400)
        at java.util.concurrent.FutureTask.get(FutureTask.java:187)
        at
org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1025)
        at org.apache.hadoop.ipc.Client.call(Client.java:1379)
        at org.apache.hadoop.ipc.Client.call(Client.java:1359)
        at
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
        at com.sun.proxy.$Proxy25.heartbeat(Unknown Source)
        at
org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:62)
        at
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:255)
        at
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:169)
        at
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:107)
        at
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:981)
2014-06-21 10:24:05,122 WARN
org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=ubuntu
OPERATION=Container Finished - Failed   TARGET=ContainerImpl
 RESULT=FAILURE  DESCRIPTION=Container failed with state:
LOCALIZATION_FAILED    APPID=application_1403345039835_0001
 CONTAINERID=container_1403345039835_0001_01_000010
2014-06-21 10:24:05,122 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
Unknown localizer with localizerId container_1403345039835_0001_01_000010
is sending heartbeat. Ordering it to DIE
2014-06-21 10:24:05,122 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
Container container_1403345039835_0001_01_000010 transitioned from
LOCALIZATION_FAILED to DONE
2014-06-21 10:24:05,122 INFO
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor:
Deleting absolute path :
/tmp/hadoop-ubuntu/nm-local-dir/usercache/ubuntu/appcache/application_1403345039835_0001/container_1403345039835_0001_01_000010
2014-06-21 10:24:05,123 WARN
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: delete
returned false for path:
[/tmp/hadoop-ubuntu/nm-local-dir/usercache/ubuntu/appcache/application_1403345039835_0001/container_1403345039835_0001_01_000010]
2014-06-21 10:24:05,123 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
Removing container_1403345039835_0001_01_000010 from application
application_1403345039835_0001
2014-06-21 10:24:05,123 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got
event CONTAINER_STOP for appId application_1403345039835_0001
2014-06-21 10:24:05,968 INFO
org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Removed
completed container container_1403345039835_0001_01_000010
2014-06-21 10:24:06,730 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl:
Stopping resource-monitoring for container_1403345039835_0001_01_000010
