Hi all,
I have a problem while building a cube at step 2.
The following error appears in the YARN log:
2017-06-14 11:21:08,793 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
Application application_1497364689294_0018 transitioned from NEW to INITING
2017-06-14 11:21:08,793 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
Adding container_1497364689294_0018_01_000001 to application
application_1497364689294_0018
2017-06-14 11:21:08,793 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
Application application_1497364689294_0018 transitioned from INITING to RUNNING
2017-06-14 11:21:08,794 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
Container container_1497364689294_0018_01_000001 transitioned from NEW to
LOCALIZING
2017-06-14 11:21:08,794 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got
event CONTAINER_INIT for appId application_1497364689294_0018
2017-06-14 11:21:08,794 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
Resource
file:/tmp/hadoop-yarn/staging/hadoop/.staging/job_1497364689294_0018/job.jar
transitioned from INIT to DOWNLOADING
2017-06-14 11:21:08,794 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
Resource
file:/tmp/hadoop-yarn/staging/hadoop/.staging/job_1497364689294_0018/job.splitmetainfo
transitioned from INIT to DOWNLOADING
2017-06-14 11:21:08,794 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
Resource
file:/tmp/hadoop-yarn/staging/hadoop/.staging/job_1497364689294_0018/job.split
transitioned from INIT to DOWNLOADING
2017-06-14 11:21:08,794 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
Resource
file:/tmp/hadoop-yarn/staging/hadoop/.staging/job_1497364689294_0018/job.xml
transitioned from INIT to DOWNLOADING
2017-06-14 11:21:08,794 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
Resource
file:/home/q/hadoop/kylin/tomcat/temp/kylin_job_meta3892468167792432608/meta
transitioned from INIT to DOWNLOADING
2017-06-14 11:21:08,794 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
Created localizer for container_1497364689294_0018_01_000001
2017-06-14 11:21:08,794 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
Downloading public rsrc:{
file:/home/q/hadoop/kylin/tomcat/temp/kylin_job_meta3892468167792432608/meta,
1497410467000, FILE, null }
2017-06-14 11:21:08,796 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
Writing credentials to the nmPrivate file
/home/q/hadoop/hadoop/tmp/nm-local-dir/nmPrivate/container_1497364689294_0018_01_000001.tokens.
Credentials list:
2017-06-14 11:21:08,796 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
Failed to download rsrc { {
file:/home/q/hadoop/kylin/tomcat/temp/kylin_job_meta3892468167792432608/meta,
1497410467000, FILE, null
},pending,[(container_1497364689294_0018_01_000001)],781495827608056,DOWNLOADING}
java.io.FileNotFoundException: File
file:/home/q/hadoop/kylin/tomcat/temp/kylin_job_meta3892468167792432608/meta
does not exist
at
org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:524)
at
org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:737)
at
org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:514)
at
org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:397)
at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:250)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:353)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:59)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
2017-06-14 11:21:08,796 INFO
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor:
Initializing user hadoop
2017-06-14 11:21:08,797 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
Resource
file:/home/q/hadoop/kylin/tomcat/temp/kylin_job_meta3892468167792432608/meta(->/home/q/hadoop/hadoop/tmp/nm-local-dir/filecache/18/meta)
transitioned from DOWNLOADING to FAILED
2017-06-14 11:21:08,797 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
Container container_1497364689294_0018_01_000001 transitioned from LOCALIZING
to LOCALIZATION_FAILED
2017-06-14 11:21:08,797 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl:
Container container_1497364689294_0018_01_000001 sent RELEASE event on a
resource request {
file:/home/q/hadoop/kylin/tomcat/temp/kylin_job_meta3892468167792432608/meta,
1497410467000, FILE, null } not present in cache.
2017-06-14 11:21:08,797 WARN
org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=hadoop
OPERATION=Container Finished - Failed TARGET=ContainerImpl RESULT=FAILURE
DESCRIPTION=Container failed with state: LOCALIZATION_FAILED
APPID=application_1497364689294_0018
CONTAINERID=container_1497364689294_0018_01_000001
2017-06-14 11:21:08,797 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
Container container_1497364689294_0018_01_000001 transitioned from
LOCALIZATION_FAILED to DONE
This error appears in the yarn-nodemanager logs of machines B and D. Shortly
before it, I found the following in the yarn-nodemanager log of machine C
(Kylin is installed only on machine A):
2017-06-14 11:21:01,131 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
Container container_1497364689294_0017_01_000002 transitioned from LOCALIZING
to LOCALIZED
2017-06-14 11:21:01,146 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
Container container_1497364689294_0017_01_000002 transitioned from LOCALIZED to
RUNNING
2017-06-14 11:21:01,146 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl:
Neither virutal-memory nor physical-memory monitoring is needed. Not running
the monitor-thread
2017-06-14 11:21:01,149 INFO
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor:
launchContainer: [nice, -n, 0, bash,
/home/q/hadoop/hadoop/tmp/nm-local-dir/usercache/hadoop/appcache/application_1497364689294_0017/container_1497364689294_0017_01_000002/default_container_executor.sh]
2017-06-14 11:21:05,024 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl:
Stopping container with container Id: container_1497364689294_0017_01_000002
2017-06-14 11:21:05,025 INFO
org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=hadoop
IP=10.90.181.160 OPERATION=Stop Container Request
TARGET=ContainerManageImpl RESULT=SUCCESS
APPID=application_1497364689294_0017
CONTAINERID=container_1497364689294_0017_01_000002
2017-06-14 11:21:05,025 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
Container container_1497364689294_0017_01_000002 transitioned from RUNNING to
KILLING
2017-06-14 11:21:05,025 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch:
Cleaning up container container_1497364689294_0017_01_000002
2017-06-14 11:21:05,028 WARN
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Exit code
from container container_1497364689294_0017_01_000002 is : 143
2017-06-14 11:21:05,040 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
Container container_1497364689294_0017_01_000002 transitioned from KILLING to
CONTAINER_CLEANEDUP_AFTER_KILL
2017-06-14 11:21:05,041 INFO
org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=hadoop
OPERATION=Container Finished - Killed TARGET=ContainerImpl RESULT=SUCCESS
APPID=application_1497364689294_0017
CONTAINERID=container_1497364689294_0017_01_000002
2017-06-14 11:21:05,041 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
Container container_1497364689294_0017_01_000002 transitioned from
CONTAINER_CLEANEDUP_AFTER_KILL to DONE
What puzzles me is why, in step 2, containers running on other nodes try to
load a file that exists only on machine A's local filesystem. How can I solve this?
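One thing I am checking (a guess on my part, not a confirmed diagnosis): the failing path is a file:/ URI under Kylin's Tomcat temp directory, which suggests the job metadata was registered in the distributed cache without ever being uploaded to HDFS. That can happen if fs.defaultFS in core-site.xml on the Kylin node resolves to the local filesystem. For illustration, the setting should look something like this, where "mycluster" is a placeholder for the actual HDFS nameservice name:

```
<!-- core-site.xml on the Kylin node (illustrative; "mycluster" is a placeholder) -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://mycluster</value>
</property>
```

If this value is file:/// or missing, local paths would be handed to the other NodeManagers as-is, which would match the FileNotFoundException above.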
Here is some additional information that may be helpful for analyzing
the problem:
The cluster has 4 machines: A, B, C, and D.
Hadoop 2.5.0, built with Snappy support
  NameNode: A (standby), B (active)
  DataNode: all machines
Hive 0.13.1, recompiled for Hadoop 2
HBase 0.98.6, recompiled for Hadoop 2.5.0
  Master: A (active) and B
When I set "hbase.rootdir" in hbase-site.xml to the explicit IP
address of the active NameNode, step 2 succeeds, but the build then fails at a
later step. So I changed the setting to the cluster (nameservice) name, and
there is no problem in the HBase logs.
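For completeness, this is the form of the setting I ended up with. It is illustrative only: "mycluster" stands for our actual HDFS nameservice name, and as I understand it HBase also needs hdfs-site.xml on its classpath so the nameservice can be resolved to the active NameNode:

```
<!-- hbase-site.xml (illustrative; "mycluster" is a placeholder nameservice) -->
<property>
  <name>hbase.rootdir</name>
  <value>hdfs://mycluster/hbase</value>
</property>
```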
Thank you
Best regards