Have you looked at:
http://sourceforge.net/p/myhadoop/mailman/?source=navbar

Cheers

On Fri, Jan 16, 2015 at 12:55 PM, Ruhua Jiang <ruhua.ji...@gmail.com> wrote:

> Hello
>
> I am quite new to Hadoop. I am trying to run Hadoop on top of an HPC
> infrastructure using a solution called “myHadoop”. Basically, it
> dynamically allocates some nodes from the HPC cluster and runs Hadoop on
> them (one node as the NameNode, the others as DataNodes). If anybody is
> familiar with it, that would be perfect, but I think my problem is mostly
> on the Hadoop side.
> I am using Hadoop 1.2.1 due to myHadoop’s limited support for newer
> versions.
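>
> For context, after myHadoop generates the per-job configuration, my job
> script drives the cluster roughly like this (a simplified sketch; the
> paths and the $JOBID placeholder are illustrative, not my exact script):
>
> #!/bin/bash
> export HADOOP_HOME=/home/hpc-ruhua/hadoop-stack/hadoop-1.2.1
> export HADOOP_CONF_DIR=/home/hpc-ruhua/hadoop/conf/hadoop-conf.$JOBID
>
> # Start NameNode/DataNodes and JobTracker/TaskTrackers on the allocated nodes
> $HADOOP_HOME/bin/start-all.sh
>
> # Stage the input and run the wordcount example
> $HADOOP_HOME/bin/hadoop fs -mkdir data
> $HADOOP_HOME/bin/hadoop fs -put pg2701.txt data/
> $HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/hadoop-examples-1.2.1.jar \
>     wordcount data wordcount-output
>
> # Fetch the results and tear the cluster down
> $HADOOP_HOME/bin/hadoop fs -ls wordcount-output
> $HADOOP_HOME/bin/hadoop fs -get wordcount-output wordcount-output
> $HADOOP_HOME/bin/stop-all.sh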
>
> Here is the log:
> ===
> myHadoop: Using HADOOP_HOME=/home/hpc-ruhua/hadoop-stack/hadoop-1.2.1
> myHadoop: Using MH_SCRATCH_DIR=/tmp/hpc-ruhua/4128
> myHadoop: Using JAVA_HOME=/usr
> myHadoop: Generating Hadoop configuration in directory in
> /home/hpc-ruhua/hadoop/conf/hadoop-conf.4128...
> myHadoop: Using directory /home/hpc-ruhua/hadoop/hdfs for persisting HDFS
> state...
> myHadoop: Designating cn53 as master node (namenode, secondary namenode,
> and jobtracker)
> myHadoop: The following nodes will be slaves (datanode, tasktracer):
> cn53
> cn54
> cn55
> cn56
> Linking /home/hpc-ruhua/hadoop/hdfs/0 to /tmp/hpc-ruhua/4128/hdfs_data on
> cn53
> Linking /home/hpc-ruhua/hadoop/hdfs/1 to /tmp/hpc-ruhua/4128/hdfs_data on
> cn54
> Linking /home/hpc-ruhua/hadoop/hdfs/2 to /tmp/hpc-ruhua/4128/hdfs_data on
> cn55
> Warning: Permanently added 'cn55,192.168.100.55' (RSA) to the list of
> known hosts.
> Linking /home/hpc-ruhua/hadoop/hdfs/3 to /tmp/hpc-ruhua/4128/hdfs_data on
> cn56
> Warning: Permanently added 'cn56,192.168.100.56' (RSA) to the list of
> known hosts.
> starting namenode, logging to
> /tmp/hpc-ruhua/4128/logs/hadoop-hpc-ruhua-namenode-cn53.out
> cn53: starting datanode, logging to
> /tmp/hpc-ruhua/4128/logs/hadoop-hpc-ruhua-datanode-cn53.out
> cn54: starting datanode, logging to
> /tmp/hpc-ruhua/4128/logs/hadoop-hpc-ruhua-datanode-cn54.out
> cn55: starting datanode, logging to
> /tmp/hpc-ruhua/4128/logs/hadoop-hpc-ruhua-datanode-cn55.out
> cn56: starting datanode, logging to
> /tmp/hpc-ruhua/4128/logs/hadoop-hpc-ruhua-datanode-cn56.out
> cn53: starting secondarynamenode, logging to
> /tmp/hpc-ruhua/4128/logs/hadoop-hpc-ruhua-secondarynamenode-cn53.out
> starting jobtracker, logging to
> /tmp/hpc-ruhua/4128/logs/hadoop-hpc-ruhua-jobtracker-cn53.out
> cn53: starting tasktracker, logging to
> /tmp/hpc-ruhua/4128/logs/hadoop-hpc-ruhua-tasktracker-cn53.out
> cn56: starting tasktracker, logging to
> /tmp/hpc-ruhua/4128/logs/hadoop-hpc-ruhua-tasktracker-cn56.out
> cn55: starting tasktracker, logging to
> /tmp/hpc-ruhua/4128/logs/hadoop-hpc-ruhua-tasktracker-cn55.out
> cn54: starting tasktracker, logging to
> /tmp/hpc-ruhua/4128/logs/hadoop-hpc-ruhua-tasktracker-cn54.out
> mkdir: cannot create directory data: File exists
> put: Target data/pg2701.txt already exists
> Found 1 items
> -rw-r--r--   3 hpc-ruhua supergroup          0 2015-01-07 00:09
> /user/hpc-ruhua/data/pg2701.txt
> 15/01/14 12:21:08 ERROR security.UserGroupInformation:
> PriviledgedActionException as:hpc-ruhua
> cause:org.apache.hadoop.ipc.RemoteException:
> org.apache.hadoop.mapred.JobTrackerNotYetInitializedException: JobTracker
> is not yet RUNNING
> at
> org.apache.hadoop.mapred.JobTracker.checkJobTrackerState(JobTracker.java:5199)
> at org.apache.hadoop.mapred.JobTracker.getNewJobId(JobTracker.java:3543)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1432)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1428)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1426)
>
> org.apache.hadoop.ipc.RemoteException:
> org.apache.hadoop.mapred.JobTrackerNotYetInitializedException: JobTracker
> is not yet RUNNING
> at
> org.apache.hadoop.mapred.JobTracker.checkJobTrackerState(JobTracker.java:5199)
> at org.apache.hadoop.mapred.JobTracker.getNewJobId(JobTracker.java:3543)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1432)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1428)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1426)
>
> at org.apache.hadoop.ipc.Client.call(Client.java:1113)
> at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
> at org.apache.hadoop.mapred.$Proxy2.getNewJobId(Unknown Source)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:85)
> at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:62)
> at org.apache.hadoop.mapred.$Proxy2.getNewJobId(Unknown Source)
> at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:944)
> at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
> at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936)
> at org.apache.hadoop.mapreduce.Job.submit(Job.java:550)
> at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:580)
> at org.apache.hadoop.examples.WordCount.main(WordCount.java:82)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
> at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
> at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
> ls: Cannot access wordcount-output: No such file or directory.
> get: null
> stopping jobtracker
> cn54: stopping tasktracker
> cn55: stopping tasktracker
> cn53: stopping tasktracker
> cn56: stopping tasktracker
> stopping namenode
> cn53: no datanode to stop
> cn54: no datanode to stop
> cn56: no datanode to stop
> cn55: no datanode to stop
> ===
> The error is “ERROR security.UserGroupInformation:
> PriviledgedActionException as:hpc-ruhua
> cause:org.apache.hadoop.ipc.RemoteException:
> org.apache.hadoop.mapred.JobTrackerNotYetInitializedException: JobTracker
> is not yet RUNNING”. Does anybody have an idea of what the problem might
> be? That is the result of using “$HADOOP_HOME/bin/start-all.sh”.
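>
> My current suspicion is a race: the wordcount job is submitted as soon
> as start-all.sh returns, before the JobTracker has finished
> initializing. A wait loop like the following sketch might avoid that
> (my assumption, not verified yet):
>
> # Block until the JobTracker answers RPC; “hadoop job -list” keeps
> # failing while the JobTracker is still initializing.
> until $HADOOP_HOME/bin/hadoop job -list > /dev/null 2>&1; do
>     sleep 5
> done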
>
> I tried splitting the start phase into separate commands:
> $HADOOP_HOME/bin/hadoop namenode
> $HADOOP_HOME/bin/hadoop datanode
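>
> (Note: as far as I understand, “hadoop namenode” runs the daemon in the
> foreground. If the goal is the daemonized equivalent of start-all.sh, I
> believe the per-daemon Hadoop 1.x scripts look roughly like this:
>
> # On the master node:
> $HADOOP_HOME/bin/hadoop-daemon.sh start namenode
> $HADOOP_HOME/bin/hadoop-daemon.sh start jobtracker
> # On the slave nodes, driven by the slaves file:
> $HADOOP_HOME/bin/hadoop-daemons.sh start datanode
> $HADOOP_HOME/bin/hadoop-daemons.sh start tasktracker
> )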
>
> Below is the log:
> ===
> myHadoop: Using HADOOP_HOME=/home/hpc-ruhua/hadoop-stack/hadoop-1.2.1
> myHadoop: Using MH_SCRATCH_DIR=/tmp/hpc-ruhua/4178
> myHadoop: Using JAVA_HOME=/usr
> myHadoop: Generating Hadoop configuration in directory in
> /home/hpc-ruhua/hadoop/conf/hadoop-conf.4178...
> myHadoop: Using directory /home/hpc-ruhua/hadoop/hdfs for persisting HDFS
> state...
> myHadoop: Designating cn53 as master node (namenode, secondary namenode,
> and jobtracker)
> myHadoop: The following nodes will be slaves (datanode, tasktracer):
> cn53
> cn54
> cn55
> cn56
> Linking /home/hpc-ruhua/hadoop/hdfs/0 to /tmp/hpc-ruhua/4178/hdfs_data on
> cn53
> Linking /home/hpc-ruhua/hadoop/hdfs/1 to /tmp/hpc-ruhua/4178/hdfs_data on
> cn54
> Linking /home/hpc-ruhua/hadoop/hdfs/2 to /tmp/hpc-ruhua/4178/hdfs_data on
> cn55
> Linking /home/hpc-ruhua/hadoop/hdfs/3 to /tmp/hpc-ruhua/4178/hdfs_data on
> cn56
> 15/01/16 15:35:14 INFO namenode.NameNode: STARTUP_MSG:
> /************************************************************
> STARTUP_MSG: Starting NameNode
> STARTUP_MSG:   host = cn53/192.168.100.53
> STARTUP_MSG:   args = []
> STARTUP_MSG:   version = 1.2.1
> STARTUP_MSG:   build =
> https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r
> 1503152; compiled by 'mattf' on Mon Jul 22 15:23:09 PDT 2013
> STARTUP_MSG:   java = 1.7.0_71
> ************************************************************/
> 15/01/16 15:35:14 INFO impl.MetricsConfig: loaded properties from
> hadoop-metrics2.properties
> 15/01/16 15:35:14 INFO impl.MetricsSourceAdapter: MBean for source
> MetricsSystem,sub=Stats registered.
> 15/01/16 15:35:14 INFO impl.MetricsSystemImpl: Scheduled snapshot period
> at 10 second(s).
> 15/01/16 15:35:14 INFO impl.MetricsSystemImpl: NameNode metrics system
> started
> 15/01/16 15:35:14 INFO impl.MetricsSourceAdapter: MBean for source ugi
> registered.
> 15/01/16 15:35:14 INFO impl.MetricsSourceAdapter: MBean for source jvm
> registered.
> 15/01/16 15:35:14 INFO impl.MetricsSourceAdapter: MBean for source
> NameNode registered.
> 15/01/16 15:35:14 INFO util.GSet: Computing capacity for map BlocksMap
> 15/01/16 15:35:14 INFO util.GSet: VM type       = 64-bit
> 15/01/16 15:35:14 INFO util.GSet: 2.0% max memory = 932184064
> 15/01/16 15:35:14 INFO util.GSet: capacity      = 2^21 = 2097152 entries
> 15/01/16 15:35:14 INFO util.GSet: recommended=2097152, actual=2097152
> 15/01/16 15:35:15 INFO namenode.FSNamesystem: fsOwner=hpc-ruhua
> 15/01/16 15:35:15 INFO namenode.FSNamesystem: supergroup=supergroup
> 15/01/16 15:35:15 INFO namenode.FSNamesystem: isPermissionEnabled=true
> 15/01/16 15:35:15 INFO namenode.FSNamesystem:
> dfs.block.invalidate.limit=100
> 15/01/16 15:35:15 INFO namenode.FSNamesystem: isAccessTokenEnabled=false
> accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
> 15/01/16 15:35:15 INFO namenode.FSNamesystem: Registered
> FSNamesystemStateMBean and NameNodeMXBean
> 15/01/16 15:35:15 INFO namenode.FSEditLog:
> dfs.namenode.edits.toleration.length = 0
> 15/01/16 15:35:15 INFO namenode.NameNode: Caching file names occuring more
> than 10 times
> 15/01/16 15:35:15 INFO common.Storage: Start loading image file
> /tmp/hpc-ruhua/4178/namenode_data/current/fsimage
> 15/01/16 15:35:15 INFO common.Storage: Number of files = 28
> 15/01/16 15:35:15 INFO common.Storage: Number of files under construction
> = 1
> 15/01/16 15:35:15 INFO common.Storage: Image file
> /tmp/hpc-ruhua/4178/namenode_data/current/fsimage of size 2996 bytes loaded
> in 0 seconds.
> 15/01/16 15:35:15 INFO namenode.FSEditLog: Start loading edits file
> /tmp/hpc-ruhua/4178/namenode_data/current/edits
> 15/01/16 15:35:15 INFO namenode.FSEditLog: Invalid opcode, reached end of
> edit log Number of transactions found: 32.  Bytes read: 2579
> 15/01/16 15:35:15 INFO namenode.FSEditLog: Start checking end of edit log
> (/tmp/hpc-ruhua/4178/namenode_data/current/edits) ...
> 15/01/16 15:35:15 INFO namenode.FSEditLog: Checked the bytes after the end
> of edit log (/tmp/hpc-ruhua/4178/namenode_data/current/edits):
> 15/01/16 15:35:15 INFO namenode.FSEditLog:   Padding position  = 2579 (-1
> means padding not found)
> 15/01/16 15:35:15 INFO namenode.FSEditLog:   Edit log length   = 1048580
> 15/01/16 15:35:15 INFO namenode.FSEditLog:   Read length       = 2579
> 15/01/16 15:35:15 INFO namenode.FSEditLog:   Corruption length = 0
> 15/01/16 15:35:15 INFO namenode.FSEditLog:   Toleration length = 0 (=
> dfs.namenode.edits.toleration.length)
> 15/01/16 15:35:15 INFO namenode.FSEditLog: Summary: |---------- Read=2579
> ----------|-- Corrupt=0 --|-- Pad=1046001 --|
> 15/01/16 15:35:15 INFO namenode.FSEditLog: Edits file
> /tmp/hpc-ruhua/4178/namenode_data/current/edits of size 1048580 edits # 32
> loaded in 0 seconds.
> 15/01/16 15:35:15 INFO common.Storage: Image file
> /tmp/hpc-ruhua/4178/namenode_data/current/fsimage of size 3745 bytes saved
> in 0 seconds.
> 15/01/16 15:35:15 INFO namenode.FSEditLog: closing edit log: position=4,
> editlog=/tmp/hpc-ruhua/4178/namenode_data/current/edits
> 15/01/16 15:35:15 INFO namenode.FSEditLog: close success: truncate to 4,
> editlog=/tmp/hpc-ruhua/4178/namenode_data/current/edits
> 15/01/16 15:35:16 INFO namenode.NameCache: initialized with 0 entries 0
> lookups
> 15/01/16 15:35:16 INFO namenode.FSNamesystem: Finished loading FSImage in
> 1162 msecs
> 15/01/16 15:35:16 INFO namenode.FSNamesystem: dfs.safemode.threshold.pct
>         = 0.9990000128746033
> 15/01/16 15:35:16 INFO namenode.FSNamesystem:
> dfs.namenode.safemode.min.datanodes = 0
> 15/01/16 15:35:16 INFO namenode.FSNamesystem: dfs.safemode.extension
>         = 30000
> 15/01/16 15:35:16 INFO namenode.FSNamesystem: Number of blocks excluded by
> safe block count: 0 total blocks: 0 and thus the safe blocks: 0
> 15/01/16 15:35:16 INFO namenode.FSNamesystem: Total number of blocks = 0
> 15/01/16 15:35:16 INFO namenode.FSNamesystem: Number of invalid blocks = 0
> 15/01/16 15:35:16 INFO namenode.FSNamesystem: Number of under-replicated
> blocks = 0
> 15/01/16 15:35:16 INFO namenode.FSNamesystem: Number of  over-replicated
> blocks = 0
> 15/01/16 15:35:16 INFO hdfs.StateChange: STATE* Safe mode termination scan
> for invalid, over- and under-replicated blocks completed in 7 msec
> 15/01/16 15:35:16 INFO hdfs.StateChange: STATE* Leaving safe mode after 1
> secs
> 15/01/16 15:35:16 INFO hdfs.StateChange: STATE* Network topology has 0
> racks and 0 datanodes
> 15/01/16 15:35:16 INFO hdfs.StateChange: STATE* UnderReplicatedBlocks has
> 0 blocks
> 15/01/16 15:35:16 INFO util.HostsFileReader: Refreshing hosts
> (include/exclude) list
> 15/01/16 15:35:16 INFO namenode.FSNamesystem: ReplicateQueue
> QueueProcessingStatistics: First cycle completed 0 blocks in 0 msec
> 15/01/16 15:35:16 INFO namenode.FSNamesystem: ReplicateQueue
> QueueProcessingStatistics: Queue flush completed 0 blocks in 0 msec
> processing time, 0 msec clock time, 1 cycles
> 15/01/16 15:35:16 INFO namenode.FSNamesystem: InvalidateQueue
> QueueProcessingStatistics: First cycle completed 0 blocks in 0 msec
> 15/01/16 15:35:16 INFO namenode.FSNamesystem: InvalidateQueue
> QueueProcessingStatistics: Queue flush completed 0 blocks in 0 msec
> processing time, 0 msec clock time, 1 cycles
> 15/01/16 15:35:16 INFO impl.MetricsSourceAdapter: MBean for source
> FSNamesystemMetrics registered.
> 15/01/16 15:35:16 INFO ipc.Server: Starting SocketReader
> 15/01/16 15:35:16 INFO impl.MetricsSourceAdapter: MBean for source
> RpcDetailedActivityForPort54310 registered.
> 15/01/16 15:35:16 INFO impl.MetricsSourceAdapter: MBean for source
> RpcActivityForPort54310 registered.
> 15/01/16 15:35:16 INFO namenode.NameNode: Namenode up at: cn53/
> 192.168.100.53:54310
> 15/01/16 15:35:16 INFO mortbay.log: Logging to
> org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via
> org.mortbay.log.Slf4jLog
> 15/01/16 15:35:16 INFO http.HttpServer: Added global filtersafety
> (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
> 15/01/16 15:35:16 INFO http.HttpServer: dfs.webhdfs.enabled = false
> 15/01/16 15:35:16 INFO http.HttpServer: Port returned by
> webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening
> the listener on 50070
> 15/01/16 15:35:16 INFO http.HttpServer: listener.getLocalPort() returned
> 50070 webServer.getConnectors()[0].getLocalPort() returned 50070
> 15/01/16 15:35:16 INFO http.HttpServer: Jetty bound to port 50070
> 15/01/16 15:35:16 INFO mortbay.log: jetty-6.1.26
> 15/01/16 15:35:16 INFO mortbay.log: Started SelectChannelConnector@0.0.0.0
> :50070
> 15/01/16 15:35:16 INFO namenode.NameNode: Web-server up at: 0.0.0.0:50070
> 15/01/16 15:35:16 INFO ipc.Server: IPC Server Responder: starting
> 15/01/16 15:35:16 INFO ipc.Server: IPC Server listener on 54310: starting
> 15/01/16 15:35:16 INFO ipc.Server: IPC Server handler 0 on 54310: starting
> 15/01/16 15:35:16 INFO ipc.Server: IPC Server handler 1 on 54310: starting
> 15/01/16 15:35:16 INFO ipc.Server: IPC Server handler 2 on 54310: starting
> 15/01/16 15:35:16 INFO ipc.Server: IPC Server handler 3 on 54310: starting
> 15/01/16 15:35:16 INFO ipc.Server: IPC Server handler 4 on 54310: starting
> 15/01/16 15:35:16 INFO ipc.Server: IPC Server handler 5 on 54310: starting
> 15/01/16 15:35:16 INFO ipc.Server: IPC Server handler 6 on 54310: starting
> 15/01/16 15:35:16 INFO ipc.Server: IPC Server handler 8 on 54310: starting
> 15/01/16 15:35:16 INFO ipc.Server: IPC Server handler 7 on 54310: starting
> 15/01/16 15:35:16 INFO ipc.Server: IPC Server handler 9 on 54310: starting
>
> ===
> I can also provide the script that runs myHadoop, or other system
> information, if that helps. I have been struggling with this problem for
> quite a long time. Could anyone help?
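>
> (One more data point: the “no datanode to stop” lines in the first run
> make me think the datanodes die shortly after starting. I assume the
> next step is to inspect their logs on the compute nodes, e.g. something
> like
>
> ssh cn54 tail -n 50 /tmp/hpc-ruhua/4128/logs/hadoop-hpc-ruhua-datanode-cn54.log
>
> though the scratch directory may already be gone once the job ends.)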
>
> Best,
> Ruhua
>