Have you looked at: http://sourceforge.net/p/myhadoop/mailman/?source=navbar
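Also, two things stand out in your first log: the WordCount job was submitted while the JobTracker was still initializing ("JobTracker is not yet RUNNING"), and the shutdown shows "no datanode to stop", which suggests the datanodes never registered. When the whole run is driven by one batch script, it may help to wait for HDFS to leave safe mode and for at least one datanode to appear before submitting the job. A rough sketch, assuming the Hadoop 1.x CLI and a POSIX shell (the helper names are mine, not part of Hadoop or myHadoop):

```shell
#!/bin/sh
# Sketch, not a drop-in fix: before submitting a job from a batch script,
# wait until at least one datanode has registered with the namenode.
# Helper names are mine; the HADOOP_HOME path is the one from your log.

# Count datanodes listed in `hadoop dfsadmin -report` output (read on stdin).
count_live_datanodes() {
    grep -c '^Name:'
}

# wait_for_datanodes REPORT_CMD MAX_TRIES
# Retries REPORT_CMD once per second until it lists a datanode.
wait_for_datanodes() {
    _cmd=$1 _max=$2 _n=0
    while [ "$_n" -lt "$_max" ]; do
        if [ "$($_cmd | count_live_datanodes)" -ge 1 ]; then
            return 0
        fi
        sleep 1
        _n=$((_n + 1))
    done
    return 1
}

# Intended use on the cluster (not run here):
#   "$HADOOP_HOME/bin/hadoop" dfsadmin -safemode wait
#   wait_for_datanodes "$HADOOP_HOME/bin/hadoop dfsadmin -report" 30 \
#       || { echo "no datanodes registered" >&2; exit 1; }
```

If the datanode count never reaches 1, the datanodes are crashing on startup and their logs are the place to look, not the job submission.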
Cheers

On Fri, Jan 16, 2015 at 12:55 PM, Ruhua Jiang <ruhua.ji...@gmail.com> wrote:
> Hello
>
> I am quite new to Hadoop. I am trying to run Hadoop on top of an HPC
> infrastructure using a solution called “myHadoop”. Basically, it
> dynamically allocates some nodes from the HPC cluster and runs Hadoop on
> them (one as NameNode, the others as DataNodes). If anybody is familiar
> with it, that would be perfect, but I think my problem is mostly on the
> Hadoop side. I am using Hadoop 1.2.1 due to the limited myHadoop support
> for newer versions.
>
> Here is the log:
> ===
> myHadoop: Using HADOOP_HOME=/home/hpc-ruhua/hadoop-stack/hadoop-1.2.1
> myHadoop: Using MH_SCRATCH_DIR=/tmp/hpc-ruhua/4128
> myHadoop: Using JAVA_HOME=/usr
> myHadoop: Generating Hadoop configuration in directory in /home/hpc-ruhua/hadoop/conf/hadoop-conf.4128...
> myHadoop: Using directory /home/hpc-ruhua/hadoop/hdfs for persisting HDFS state...
> myHadoop: Designating cn53 as master node (namenode, secondary namenode, and jobtracker)
> myHadoop: The following nodes will be slaves (datanode, tasktracer):
> cn53
> cn54
> cn55
> cn56
> Linking /home/hpc-ruhua/hadoop/hdfs/0 to /tmp/hpc-ruhua/4128/hdfs_data on cn53
> Linking /home/hpc-ruhua/hadoop/hdfs/1 to /tmp/hpc-ruhua/4128/hdfs_data on cn54
> Linking /home/hpc-ruhua/hadoop/hdfs/2 to /tmp/hpc-ruhua/4128/hdfs_data on cn55
> Warning: Permanently added 'cn55,192.168.100.55' (RSA) to the list of known hosts.
> Linking /home/hpc-ruhua/hadoop/hdfs/3 to /tmp/hpc-ruhua/4128/hdfs_data on cn56
> Warning: Permanently added 'cn56,192.168.100.56' (RSA) to the list of known hosts.
> starting namenode, logging to /tmp/hpc-ruhua/4128/logs/hadoop-hpc-ruhua-namenode-cn53.out
> cn53: starting datanode, logging to /tmp/hpc-ruhua/4128/logs/hadoop-hpc-ruhua-datanode-cn53.out
> cn54: starting datanode, logging to /tmp/hpc-ruhua/4128/logs/hadoop-hpc-ruhua-datanode-cn54.out
> cn55: starting datanode, logging to /tmp/hpc-ruhua/4128/logs/hadoop-hpc-ruhua-datanode-cn55.out
> cn56: starting datanode, logging to /tmp/hpc-ruhua/4128/logs/hadoop-hpc-ruhua-datanode-cn56.out
> cn53: starting secondarynamenode, logging to /tmp/hpc-ruhua/4128/logs/hadoop-hpc-ruhua-secondarynamenode-cn53.out
> starting jobtracker, logging to /tmp/hpc-ruhua/4128/logs/hadoop-hpc-ruhua-jobtracker-cn53.out
> cn53: starting tasktracker, logging to /tmp/hpc-ruhua/4128/logs/hadoop-hpc-ruhua-tasktracker-cn53.out
> cn56: starting tasktracker, logging to /tmp/hpc-ruhua/4128/logs/hadoop-hpc-ruhua-tasktracker-cn56.out
> cn55: starting tasktracker, logging to /tmp/hpc-ruhua/4128/logs/hadoop-hpc-ruhua-tasktracker-cn55.out
> cn54: starting tasktracker, logging to /tmp/hpc-ruhua/4128/logs/hadoop-hpc-ruhua-tasktracker-cn54.out
> mkdir: cannot create directory data: File exists
> put: Target data/pg2701.txt already exists
> Found 1 items
> -rw-r--r--   3 hpc-ruhua supergroup   0 2015-01-07 00:09 /user/hpc-ruhua/data/pg2701.txt
> 15/01/14 12:21:08 ERROR security.UserGroupInformation: PriviledgedActionException as:hpc-ruhua cause:org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.mapred.JobTrackerNotYetInitializedException: JobTracker is not yet RUNNING
>     at org.apache.hadoop.mapred.JobTracker.checkJobTrackerState(JobTracker.java:5199)
>     at org.apache.hadoop.mapred.JobTracker.getNewJobId(JobTracker.java:3543)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1432)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1428)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1426)
>
> org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.mapred.JobTrackerNotYetInitializedException: JobTracker is not yet RUNNING
>     at org.apache.hadoop.mapred.JobTracker.checkJobTrackerState(JobTracker.java:5199)
>     at org.apache.hadoop.mapred.JobTracker.getNewJobId(JobTracker.java:3543)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1432)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1428)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1426)
>
>     at org.apache.hadoop.ipc.Client.call(Client.java:1113)
>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
>     at org.apache.hadoop.mapred.$Proxy2.getNewJobId(Unknown Source)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:85)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:62)
>     at org.apache.hadoop.mapred.$Proxy2.getNewJobId(Unknown Source)
>     at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:944)
>     at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
>     at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936)
>     at org.apache.hadoop.mapreduce.Job.submit(Job.java:550)
>     at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:580)
>     at org.apache.hadoop.examples.WordCount.main(WordCount.java:82)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>     at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>     at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
> ls: Cannot access wordcount-output: No such file or directory.
> get: null
> stopping jobtracker
> cn54: stopping tasktracker
> cn55: stopping tasktracker
> cn53: stopping tasktracker
> cn56: stopping tasktracker
> stopping namenode
> cn53: no datanode to stop
> cn54: no datanode to stop
> cn56: no datanode to stop
> cn55: no datanode to stop
> ===
> The error is “ERROR security.UserGroupInformation: PriviledgedActionException as:hpc-ruhua cause:org.apache.hadoop.ipc.RemoteException:”. Does anybody have an idea of what the problem might be?
> That’s the result of using “$HADOOP_HOME/bin/start-all.sh”.
>
> I tried to split the start phase into:
> $HADOOP_HOME/bin/hadoop namenode
> $HADOOP_HOME/bin/hadoop datanode
>
> Below is the log:
> myHadoop: Using HADOOP_HOME=/home/hpc-ruhua/hadoop-stack/hadoop-1.2.1
> myHadoop: Using MH_SCRATCH_DIR=/tmp/hpc-ruhua/4178
> myHadoop: Using JAVA_HOME=/usr
> myHadoop: Generating Hadoop configuration in directory in /home/hpc-ruhua/hadoop/conf/hadoop-conf.4178...
> myHadoop: Using directory /home/hpc-ruhua/hadoop/hdfs for persisting HDFS state...
> myHadoop: Designating cn53 as master node (namenode, secondary namenode, and jobtracker)
> myHadoop: The following nodes will be slaves (datanode, tasktracer):
> cn53
> cn54
> cn55
> cn56
> Linking /home/hpc-ruhua/hadoop/hdfs/0 to /tmp/hpc-ruhua/4178/hdfs_data on cn53
> Linking /home/hpc-ruhua/hadoop/hdfs/1 to /tmp/hpc-ruhua/4178/hdfs_data on cn54
> Linking /home/hpc-ruhua/hadoop/hdfs/2 to /tmp/hpc-ruhua/4178/hdfs_data on cn55
> Linking /home/hpc-ruhua/hadoop/hdfs/3 to /tmp/hpc-ruhua/4178/hdfs_data on cn56
> 15/01/16 15:35:14 INFO namenode.NameNode: STARTUP_MSG:
> /************************************************************
> STARTUP_MSG: Starting NameNode
> STARTUP_MSG:   host = cn53/192.168.100.53
> STARTUP_MSG:   args = []
> STARTUP_MSG:   version = 1.2.1
> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152; compiled by 'mattf' on Mon Jul 22 15:23:09 PDT 2013
> STARTUP_MSG:   java = 1.7.0_71
> ************************************************************/
> 15/01/16 15:35:14 INFO impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
> 15/01/16 15:35:14 INFO impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
> 15/01/16 15:35:14 INFO impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
> 15/01/16 15:35:14 INFO impl.MetricsSystemImpl: NameNode metrics system started
> 15/01/16 15:35:14 INFO impl.MetricsSourceAdapter: MBean for source ugi registered.
> 15/01/16 15:35:14 INFO impl.MetricsSourceAdapter: MBean for source jvm registered.
> 15/01/16 15:35:14 INFO impl.MetricsSourceAdapter: MBean for source NameNode registered.
> 15/01/16 15:35:14 INFO util.GSet: Computing capacity for map BlocksMap
> 15/01/16 15:35:14 INFO util.GSet: VM type = 64-bit
> 15/01/16 15:35:14 INFO util.GSet: 2.0% max memory = 932184064
> 15/01/16 15:35:14 INFO util.GSet: capacity = 2^21 = 2097152 entries
> 15/01/16 15:35:14 INFO util.GSet: recommended=2097152, actual=2097152
> 15/01/16 15:35:15 INFO namenode.FSNamesystem: fsOwner=hpc-ruhua
> 15/01/16 15:35:15 INFO namenode.FSNamesystem: supergroup=supergroup
> 15/01/16 15:35:15 INFO namenode.FSNamesystem: isPermissionEnabled=true
> 15/01/16 15:35:15 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
> 15/01/16 15:35:15 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
> 15/01/16 15:35:15 INFO namenode.FSNamesystem: Registered FSNamesystemStateMBean and NameNodeMXBean
> 15/01/16 15:35:15 INFO namenode.FSEditLog: dfs.namenode.edits.toleration.length = 0
> 15/01/16 15:35:15 INFO namenode.NameNode: Caching file names occuring more than 10 times
> 15/01/16 15:35:15 INFO common.Storage: Start loading image file /tmp/hpc-ruhua/4178/namenode_data/current/fsimage
> 15/01/16 15:35:15 INFO common.Storage: Number of files = 28
> 15/01/16 15:35:15 INFO common.Storage: Number of files under construction = 1
> 15/01/16 15:35:15 INFO common.Storage: Image file /tmp/hpc-ruhua/4178/namenode_data/current/fsimage of size 2996 bytes loaded in 0 seconds.
> 15/01/16 15:35:15 INFO namenode.FSEditLog: Start loading edits file /tmp/hpc-ruhua/4178/namenode_data/current/edits
> 15/01/16 15:35:15 INFO namenode.FSEditLog: Invalid opcode, reached end of edit log Number of transactions found: 32. Bytes read: 2579
> 15/01/16 15:35:15 INFO namenode.FSEditLog: Start checking end of edit log (/tmp/hpc-ruhua/4178/namenode_data/current/edits) ...
> 15/01/16 15:35:15 INFO namenode.FSEditLog: Checked the bytes after the end of edit log (/tmp/hpc-ruhua/4178/namenode_data/current/edits):
> 15/01/16 15:35:15 INFO namenode.FSEditLog:   Padding position = 2579 (-1 means padding not found)
> 15/01/16 15:35:15 INFO namenode.FSEditLog:   Edit log length = 1048580
> 15/01/16 15:35:15 INFO namenode.FSEditLog:   Read length = 2579
> 15/01/16 15:35:15 INFO namenode.FSEditLog:   Corruption length = 0
> 15/01/16 15:35:15 INFO namenode.FSEditLog:   Toleration length = 0 (= dfs.namenode.edits.toleration.length)
> 15/01/16 15:35:15 INFO namenode.FSEditLog: Summary: |---------- Read=2579 ----------|-- Corrupt=0 --|-- Pad=1046001 --|
> 15/01/16 15:35:15 INFO namenode.FSEditLog: Edits file /tmp/hpc-ruhua/4178/namenode_data/current/edits of size 1048580 edits # 32 loaded in 0 seconds.
> 15/01/16 15:35:15 INFO common.Storage: Image file /tmp/hpc-ruhua/4178/namenode_data/current/fsimage of size 3745 bytes saved in 0 seconds.
> 15/01/16 15:35:15 INFO namenode.FSEditLog: closing edit log: position=4, editlog=/tmp/hpc-ruhua/4178/namenode_data/current/edits
> 15/01/16 15:35:15 INFO namenode.FSEditLog: close success: truncate to 4, editlog=/tmp/hpc-ruhua/4178/namenode_data/current/edits
> 15/01/16 15:35:16 INFO namenode.NameCache: initialized with 0 entries 0 lookups
> 15/01/16 15:35:16 INFO namenode.FSNamesystem: Finished loading FSImage in 1162 msecs
> 15/01/16 15:35:16 INFO namenode.FSNamesystem: dfs.safemode.threshold.pct = 0.9990000128746033
> 15/01/16 15:35:16 INFO namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes = 0
> 15/01/16 15:35:16 INFO namenode.FSNamesystem: dfs.safemode.extension = 30000
> 15/01/16 15:35:16 INFO namenode.FSNamesystem: Number of blocks excluded by safe block count: 0 total blocks: 0 and thus the safe blocks: 0
> 15/01/16 15:35:16 INFO namenode.FSNamesystem: Total number of blocks = 0
> 15/01/16 15:35:16 INFO namenode.FSNamesystem: Number of invalid blocks = 0
> 15/01/16 15:35:16 INFO namenode.FSNamesystem: Number of under-replicated blocks = 0
> 15/01/16 15:35:16 INFO namenode.FSNamesystem: Number of over-replicated blocks = 0
> 15/01/16 15:35:16 INFO hdfs.StateChange: STATE* Safe mode termination scan for invalid, over- and under-replicated blocks completed in 7 msec
> 15/01/16 15:35:16 INFO hdfs.StateChange: STATE* Leaving safe mode after 1 secs
> 15/01/16 15:35:16 INFO hdfs.StateChange: STATE* Network topology has 0 racks and 0 datanodes
> 15/01/16 15:35:16 INFO hdfs.StateChange: STATE* UnderReplicatedBlocks has 0 blocks
> 15/01/16 15:35:16 INFO util.HostsFileReader: Refreshing hosts (include/exclude) list
> 15/01/16 15:35:16 INFO namenode.FSNamesystem: ReplicateQueue QueueProcessingStatistics: First cycle completed 0 blocks in 0 msec
> 15/01/16 15:35:16 INFO namenode.FSNamesystem: ReplicateQueue QueueProcessingStatistics: Queue flush completed 0 blocks in 0 msec processing time, 0 msec clock time, 1 cycles
> 15/01/16 15:35:16 INFO namenode.FSNamesystem: InvalidateQueue QueueProcessingStatistics: First cycle completed 0 blocks in 0 msec
> 15/01/16 15:35:16 INFO namenode.FSNamesystem: InvalidateQueue QueueProcessingStatistics: Queue flush completed 0 blocks in 0 msec processing time, 0 msec clock time, 1 cycles
> 15/01/16 15:35:16 INFO impl.MetricsSourceAdapter: MBean for source FSNamesystemMetrics registered.
> 15/01/16 15:35:16 INFO ipc.Server: Starting SocketReader
> 15/01/16 15:35:16 INFO impl.MetricsSourceAdapter: MBean for source RpcDetailedActivityForPort54310 registered.
> 15/01/16 15:35:16 INFO impl.MetricsSourceAdapter: MBean for source RpcActivityForPort54310 registered.
> 15/01/16 15:35:16 INFO namenode.NameNode: Namenode up at: cn53/192.168.100.53:54310
> 15/01/16 15:35:16 INFO mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
> 15/01/16 15:35:16 INFO http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
> 15/01/16 15:35:16 INFO http.HttpServer: dfs.webhdfs.enabled = false
> 15/01/16 15:35:16 INFO http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50070
> 15/01/16 15:35:16 INFO http.HttpServer: listener.getLocalPort() returned 50070 webServer.getConnectors()[0].getLocalPort() returned 50070
> 15/01/16 15:35:16 INFO http.HttpServer: Jetty bound to port 50070
> 15/01/16 15:35:16 INFO mortbay.log: jetty-6.1.26
> 15/01/16 15:35:16 INFO mortbay.log: Started SelectChannelConnector@0.0.0.0:50070
> 15/01/16 15:35:16 INFO namenode.NameNode: Web-server up at: 0.0.0.0:50070
> 15/01/16 15:35:16 INFO ipc.Server: IPC Server Responder: starting
> 15/01/16 15:35:16 INFO ipc.Server: IPC Server listener on 54310: starting
> 15/01/16 15:35:16 INFO ipc.Server: IPC Server handler 0 on 54310: starting
> 15/01/16 15:35:16 INFO ipc.Server: IPC Server handler 1 on 54310: starting
> 15/01/16 15:35:16 INFO ipc.Server: IPC Server handler 2 on 54310: starting
> 15/01/16 15:35:16 INFO ipc.Server: IPC Server handler 3 on 54310: starting
> 15/01/16 15:35:16 INFO ipc.Server: IPC Server handler 4 on 54310: starting
> 15/01/16 15:35:16 INFO ipc.Server: IPC Server handler 5 on 54310: starting
> 15/01/16 15:35:16 INFO ipc.Server: IPC Server handler 6 on 54310: starting
> 15/01/16 15:35:16 INFO ipc.Server: IPC Server handler 8 on 54310: starting
> 15/01/16 15:35:16 INFO ipc.Server: IPC Server handler 7 on 54310: starting
> 15/01/16 15:35:16 INFO ipc.Server: IPC Server handler 9 on 54310: starting
>
> ==
> I can also provide the script that runs myHadoop, or other system information, if that helps. I have been struggling with this problem for quite a long time. Could anyone help?
>
> Best,
> Ruhua
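One more observation on the first run: the "no datanode to stop" lines at shutdown usually mean the datanodes exited right after launch. With Hadoop 1.x, a frequent cause is an "Incompatible namespaceIDs" error, which appears when the namenode is reformatted between runs while old datanode state (here, the persisted /home/hpc-ruhua/hadoop/hdfs directories) is kept. A small sketch for checking the datanode logs named in the myHadoop output for that signature (the helper name is mine, not a Hadoop tool):

```shell
#!/bin/sh
# Sketch: scan a Hadoop 1.x datanode log for the classic startup failure
# caused by stale HDFS state after a namenode reformat. The helper name
# is hypothetical; the error string is Hadoop's own.

# Reads a datanode log on stdin; prints matching lines, exits 0 if found.
datanode_ns_mismatch() {
    grep -i 'Incompatible namespaceIDs'
}

# Intended use, with the *.out / *.log files under /tmp/hpc-ruhua/4128/logs
# named in the startup output above, e.g.:
#   datanode_ns_mismatch < /tmp/hpc-ruhua/4128/logs/hadoop-hpc-ruhua-datanode-cn54.out
```

If that string shows up, clearing the persisted datanode directories (or not reformatting the namenode between runs) is the usual remedy; if not, whatever exception the datanode log does show is the next lead.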