Hi,
Both 2.2.0 and 2.3.0 gave me the same container log.
Here are some more details.
I am trying to use an external Java client that submits the job.
Some lines from the Maven pom.xml file:
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.3.0</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-core</artifactId>
    <version>1.2.1</version>
</dependency>
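Since this pom pulls in both hadoop-client 2.3.0 and hadoop-core 1.2.1, it may help to check which jar each class is actually loaded from. Below is a minimal sketch, not anything from Hadoop itself: the class name WhichJar and the method locate are made up for illustration, and the two default class names are simply the ones that appear in the container log quoted further down.

```java
import java.security.CodeSource;

public class WhichJar {
    /** Returns the jar or directory a class was loaded from, or a note if that is unknown. */
    static String locate(String className) {
        try {
            // Load without initializing, using the same class loader as this class.
            Class<?> c = Class.forName(className, false, WhichJar.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            return src == null ? "(bootstrap or unknown source)" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError also covers VerifyError raised while the class is being defined.
            return "(not loadable: " + e + ")";
        }
    }

    public static void main(String[] args) {
        // Defaults are assumptions taken from the container log; pass other names as arguments.
        String[] names = args.length > 0 ? args : new String[] {
            "com.google.protobuf.UnknownFieldSet",
            "org.apache.hadoop.yarn.proto.YarnProtos$ApplicationIdProto"
        };
        for (String n : names) {
            System.out.println(n + " -> " + locate(n));
        }
    }
}
```

Run with the same classpath as the client, this should print one location per class. A location pointing at an unexpected jar might hint at a version mix; `mvn dependency:tree` should show the same picture from the Maven side.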
Lines from the external client:
...
2014-03-03 17:36:01 INFO FileInputFormat:287 - Total input paths to
process : 1
2014-03-03 17:36:02 INFO JobSubmitter:396 - number of splits:1
2014-03-03 17:36:03 INFO JobSubmitter:479 - Submitting tokens for job:
job_1393848686226_0018
2014-03-03 17:36:04 INFO YarnClientImpl:166 - Submitted application
application_1393848686226_0018
2014-03-03 17:36:04 INFO Job:1289 - The url to track the job:
http://vm38.dbweb.ee:8088/proxy/application_1393848686226_0018/
2014-03-03 17:36:04 INFO Job:1334 - Running job: job_1393848686226_0018
2014-03-03 17:36:10 INFO Job:1355 - Job job_1393848686226_0018 running
in uber mode : false
2014-03-03 17:36:10 INFO Job:1362 - map 0% reduce 0%
2014-03-03 17:36:10 INFO Job:1375 - Job job_1393848686226_0018 failed
with state FAILED due to: Application application_1393848686226_0018
failed 2 times due to AM Container for
appattempt_1393848686226_0018_000002 exited with exitCode: 1 due to:
Exception from container-launch:
org.apache.hadoop.util.Shell$ExitCodeException:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
    at org.apache.hadoop.util.Shell.run(Shell.java:379)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
...
Lines from namenode:
...
14/03/03 19:12:42 INFO namenode.FSEditLog: Number of transactions: 900
Total time for transactions(ms): 69 Number of transactions batched in
Syncs: 0 Number of syncs: 542 SyncTimes(ms): 9783
14/03/03 19:12:42 INFO BlockStateChange: BLOCK* addToInvalidates:
blk_1073742050_1226 90.190.106.33:50010
14/03/03 19:12:42 INFO hdfs.StateChange: BLOCK* allocateBlock:
/user/hduser/input/data666.noheader.data.
BP-802201089-90.190.106.33-1393506052071
blk_1073742056_1232{blockUCState=UNDER_CONSTRUCTION,
primaryNodeIndex=-1,
replicas=[ReplicaUnderConstruction[90.190.106.33:50010|RBW]]}
14/03/03 19:12:44 INFO hdfs.StateChange: BLOCK* InvalidateBlocks: ask
90.190.106.33:50010 to delete [blk_1073742050_1226]
14/03/03 19:12:53 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap
updated: 90.190.106.33:50010 is added to
blk_1073742056_1232{blockUCState=UNDER_CONSTRUCTION,
primaryNodeIndex=-1,
replicas=[ReplicaUnderConstruction[90.190.106.33:50010|RBW]]} size 0
14/03/03 19:12:53 INFO hdfs.StateChange: DIR* completeFile:
/user/hduser/input/data666.noheader.data is closed by
DFSClient_NONMAPREDUCE_-915999412_15
14/03/03 19:12:54 INFO BlockStateChange: BLOCK* addToInvalidates:
blk_1073742051_1227 90.190.106.33:50010
14/03/03 19:12:54 INFO hdfs.StateChange: BLOCK* allocateBlock:
/user/hduser/input/data666.noheader.data.info.
BP-802201089-90.190.106.33-1393506052071
blk_1073742057_1233{blockUCState=UNDER_CONSTRUCTION,
primaryNodeIndex=-1,
replicas=[ReplicaUnderConstruction[90.190.106.33:50010|RBW]]}
14/03/03 19:12:54 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap
updated: 90.190.106.33:50010 is added to
blk_1073742057_1233{blockUCState=UNDER_CONSTRUCTION,
primaryNodeIndex=-1,
replicas=[ReplicaUnderConstruction[90.190.106.33:50010|RBW]]} size 0
14/03/03 19:12:54 INFO hdfs.StateChange: DIR* completeFile:
/user/hduser/input/data666.noheader.data.info is closed by
DFSClient_NONMAPREDUCE_-915999412_15
14/03/03 19:12:55 INFO hdfs.StateChange: BLOCK* allocateBlock:
/user/hduser/.staging/job_1393848686226_0019/job.jar.
BP-802201089-90.190.106.33-1393506052071
blk_1073742058_1234{blockUCState=UNDER_CONSTRUCTION,
primaryNodeIndex=-1,
replicas=[ReplicaUnderConstruction[90.190.106.33:50010|RBW]]}
14/03/03 19:12:56 INFO hdfs.StateChange: BLOCK* InvalidateBlocks: ask
90.190.106.33:50010 to delete [blk_1073742051_1227]
14/03/03 19:13:12 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap
updated: 90.190.106.33:50010 is added to
blk_1073742058_1234{blockUCState=UNDER_CONSTRUCTION,
primaryNodeIndex=-1,
replicas=[ReplicaUnderConstruction[90.190.106.33:50010|RBW]]} size 0
14/03/03 19:13:12 INFO hdfs.StateChange: DIR* completeFile:
/user/hduser/.staging/job_1393848686226_0019/job.jar is closed by
DFSClient_NONMAPREDUCE_-915999412_15
14/03/03 19:13:12 INFO blockmanagement.BlockManager: Increasing
replication from 3 to 10 for
/user/hduser/.staging/job_1393848686226_0019/job.jar
14/03/03 19:13:12 INFO blockmanagement.BlockManager: Increasing
replication from 3 to 10 for
/user/hduser/.staging/job_1393848686226_0019/job.split
14/03/03 19:13:12 INFO hdfs.StateChange: BLOCK* allocateBlock:
/user/hduser/.staging/job_1393848686226_0019/job.split.
BP-802201089-90.190.106.33-1393506052071
blk_1073742059_1235{blockUCState=UNDER_CONSTRUCTION,
primaryNodeIndex=-1,
replicas=[ReplicaUnderConstruction[90.190.106.33:50010|RBW]]}
14/03/03 19:13:12 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap
updated: 90.190.106.33:50010 is added to
blk_1073742059_1235{blockUCState=UNDER_CONSTRUCTION,
primaryNodeIndex=-1,
replicas=[ReplicaUnderConstruction[90.190.106.33:50010|RBW]]} size 0
14/03/03 19:13:12 INFO hdfs.StateChange: DIR* completeFile:
/user/hduser/.staging/job_1393848686226_0019/job.split is closed by
DFSClient_NONMAPREDUCE_-915999412_15
14/03/03 19:13:12 INFO hdfs.StateChange: BLOCK* allocateBlock:
/user/hduser/.staging/job_1393848686226_0019/job.splitmetainfo.
BP-802201089-90.190.106.33-1393506052071
blk_1073742060_1236{blockUCState=UNDER_CONSTRUCTION,
primaryNodeIndex=-1,
replicas=[ReplicaUnderConstruction[90.190.106.33:50010|RBW]]}
14/03/03 19:13:12 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap
updated: 90.190.106.33:50010 is added to
blk_1073742060_1236{blockUCState=UNDER_CONSTRUCTION,
primaryNodeIndex=-1,
replicas=[ReplicaUnderConstruction[90.190.106.33:50010|RBW]]} size 0
14/03/03 19:13:12 INFO hdfs.StateChange: DIR* completeFile:
/user/hduser/.staging/job_1393848686226_0019/job.splitmetainfo is closed
by DFSClient_NONMAPREDUCE_-915999412_15
14/03/03 19:13:12 INFO hdfs.StateChange: BLOCK* allocateBlock:
/user/hduser/.staging/job_1393848686226_0019/job.xml.
BP-802201089-90.190.106.33-1393506052071
blk_1073742061_1237{blockUCState=UNDER_CONSTRUCTION,
primaryNodeIndex=-1,
replicas=[ReplicaUnderConstruction[90.190.106.33:50010|RBW]]}
14/03/03 19:13:13 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap
updated: 90.190.106.33:50010 is added to
blk_1073742061_1237{blockUCState=UNDER_CONSTRUCTION,
primaryNodeIndex=-1,
replicas=[ReplicaUnderConstruction[90.190.106.33:50010|RBW]]} size 0
14/03/03 19:13:13 INFO hdfs.StateChange: DIR* completeFile:
/user/hduser/.staging/job_1393848686226_0019/job.xml is closed by
DFSClient_NONMAPREDUCE_-915999412_15
...
Lines from the nodemanager log:
...
2014-03-03 19:13:19,473 WARN
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Exit
code from container container_1393848686226_0019_02_000001 is : 1
2014-03-03 19:13:19,474 WARN
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor:
Exception from container-launch with container ID:
container_1393848686226_0019_02_000001 and exit code: 1
org.apache.hadoop.util.Shell$ExitCodeException:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
    at org.apache.hadoop.util.Shell.run(Shell.java:379)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
2014-03-03 19:13:19,474 INFO
org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor:
2014-03-03 19:13:19,474 WARN
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch:
Container exited with a non-zero exit code 1
2014-03-03 19:13:19,475 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
Container container_1393848686226_0019_02_000001 transitioned from
RUNNING to EXITED_WITH_FAILURE
2014-03-03 19:13:19,475 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch:
Cleaning up container container_1393848686226_0019_02_000001
2014-03-03 19:13:19,496 INFO
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor:
Deleting absolute path :
/tmp/hadoop-hdfs/nm-local-dir/usercache/hduser/appcache/application_1393848686226_0019/container_1393848686226_0019_02_000001
2014-03-03 19:13:19,498 WARN
org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger:
USER=hduser OPERATION=Container Finished - Failed
TARGET=ContainerImpl RESULT=FAILURE DESCRIPTION=Container
failed with state: EXITED_WITH_FAILURE
APPID=application_1393848686226_0019
CONTAINERID=container_1393848686226_0019_02_000001
2014-03-03 19:13:19,498 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
Container container_1393848686226_0019_02_000001 transitioned from
EXITED_WITH_FAILURE to DONE
2014-03-03 19:13:19,498 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
Removing container_1393848686226_0019_02_000001 from application
application_1393848686226_0019
2014-03-03 19:13:19,499 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices:
Got event CONTAINER_STOP for appId application_1393848686226_0019
2014-03-03 19:13:20,160 INFO
org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Sending
out status for container: container_id { app_attempt_id { application_id
{ id: 19 cluster_timestamp: 1393848686226 } attemptId: 2 } id: 1 }
state: C_COMPLETE diagnostics: "Exception from container-launch:
\norg.apache.hadoop.util.Shell$ExitCodeException: \n\tat
org.apache.hadoop.util.Shell.runCommand(Shell.java:464)\n\tat
org.apache.hadoop.util.Shell.run(Shell.java:379)\n\tat
org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)\n\tat
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)\n\tat
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)\n\tat
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)\n\tat
java.util.concurrent.FutureTask.run(FutureTask.java:262)\n\tat
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)\n\tat
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)\n\tat
java.lang.Thread.run(Thread.java:744)\n\n\n" exit_status: 1
2014-03-03 19:13:20,161 INFO
org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Removed
completed container container_1393848686226_0019_02_000001
2014-03-03 19:13:20,542 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl:
Starting resource-monitoring for container_1393848686226_0019_02_000001
2014-03-03 19:13:20,543 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl:
Stopping resource-monitoring for container_1393848686226_0019_02_000001
2014-03-03 19:13:21,164 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
Application application_1393848686226_0019 transitioned from RUNNING to
APPLICATION_RESOURCES_CLEANINGUP
2014-03-03 19:13:21,164 INFO
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor:
Deleting absolute path :
/tmp/hadoop-hdfs/nm-local-dir/usercache/hduser/appcache/application_1393848686226_0019
2014-03-03 19:13:21,165 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices:
Got event APPLICATION_STOP for appId application_1393848686226_0019
2014-03-03 19:13:21,165 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
Application application_1393848686226_0019 transitioned from
APPLICATION_RESOURCES_CLEANINGUP to FINISHED
2014-03-03 19:13:21,165 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.loghandler.NonAggregatingLogHandler:
Scheduling Log Deletion for application: application_1393848686226_0019,
with delay of 10800 seconds
...
Best regards, Margus (Margusja) Roo
+372 51 48 780
http://margus.roo.ee
http://ee.linkedin.com/in/margusroo
skype: margusja
ldapsearch -x -h ldap.sk.ee -b c=EE "(serialNumber=37303140314)"
-----BEGIN PUBLIC KEY-----
MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQCvbeg7LwEC2SCpAEewwpC3ajxE
5ZsRMCB77L8bae9G7TslgLkoIzo9yOjPdx2NN6DllKbV65UjTay43uUDyql9g3tl
RhiJIcoAExkSTykWqAIPR88LfilLy1JlQ+0RD8OXiWOVVQfhOHpQ0R/jcAkM2lZa
BjM8j36yJvoBVsfOHQIDAQAB
-----END PUBLIC KEY-----
On 03/03/14 19:05, Ted Yu wrote:
Can you tell us which Hadoop release you're using?
It seems there is an inconsistency in the protobuf library.
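The VerifyError quoted below says a generated class overrides a final method named getUnknownFields, so one rough way to look into this is to ask the protobuf classes on the classpath whether that method is declared final. This is only a sketch under assumptions: FinalMethodCheck and isFinal are illustrative names, and com.google.protobuf.GeneratedMessage is my guess at the superclass involved; pass another class name if that guess is wrong.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

public class FinalMethodCheck {
    /** Returns true if the named public no-argument method is final on the loaded class. */
    static boolean isFinal(String className, String methodName) {
        try {
            Class<?> c = Class.forName(className);
            Method m = c.getMethod(methodName);
            return Modifier.isFinal(m.getModifiers());
        } catch (ReflectiveOperationException e) {
            // Class or method not found on the current classpath.
            throw new IllegalStateException("Cannot inspect " + className + "." + methodName, e);
        }
    }

    public static void main(String[] args) {
        // Default names are assumptions; override them on the command line if needed.
        String cls = args.length > 0 ? args[0] : "com.google.protobuf.GeneratedMessage";
        String method = args.length > 1 ? args[1] : "getUnknownFields";
        System.out.println(cls + "." + method + " final? " + isFinal(cls, method));
    }
}
```

If this prints true on the classpath the container uses, the protobuf jar found there is probably older than the one the Hadoop classes were generated against.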
On Mon, Mar 3, 2014 at 8:01 AM, Margusja <mar...@roo.ee> wrote:
Hi,
I don't even know what information to provide, but my container log is:
2014-03-03 17:36:05,311 FATAL [main]
org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting
MRAppMaster
java.lang.VerifyError: class
org.apache.hadoop.yarn.proto.YarnProtos$ApplicationIdProto
overrides final method
getUnknownFields.()Lcom/google/protobuf/UnknownFieldSet;
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    at java.lang.Class.getDeclaredConstructors0(Native Method)
    at java.lang.Class.privateGetDeclaredConstructors(Class.java:2493)
    at java.lang.Class.getConstructor0(Class.java:2803)
    at java.lang.Class.getConstructor(Class.java:1718)
    at org.apache.hadoop.yarn.factories.impl.pb.RecordFactoryPBImpl.newRecordInstance(RecordFactoryPBImpl.java:62)
    at org.apache.hadoop.yarn.util.Records.newRecord(Records.java:36)
    at org.apache.hadoop.yarn.api.records.ApplicationId.newInstance(ApplicationId.java:49)
    at org.apache.hadoop.yarn.util.ConverterUtils.toApplicationAttemptId(ConverterUtils.java:137)
    at org.apache.hadoop.yarn.util.ConverterUtils.toContainerId(ConverterUtils.java:177)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1343)
Where to start digging?