Is this the first time you've started the cluster? In my experience, if you
start the cluster once, then change the conf (say, add another slave) and
restart, it sometimes generates IPC issues (like the timeout in your
namenode log). Such a change can leave the filesystem stuck in safe mode,
which is where the exceptions come from.
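You can check the safe-mode state from the command line and, if you are sure the block reports are simply slow to arrive, leave it manually. A minimal sketch, assuming the `hadoop` script from your 0.20 install is on your PATH:

```shell
# Report whether the NameNode is currently in safe mode
hadoop dfsadmin -safemode get

# Block until the NameNode leaves safe mode on its own
hadoop dfsadmin -safemode wait

# Force the NameNode out of safe mode (use with care --
# blocks may still be under-replicated at this point)
hadoop dfsadmin -safemode leave
```

Normally it is safer to just wait: the NameNode leaves safe mode by itself once enough DataNodes have reported their blocks, as your log shows.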

The exception in your NN log occurs when the cluster tries to register a
new node but hits a timeout. If you have changed your cluster conf, you can
run "hadoop namenode -format" to re-initialize the filesystem and then
restart; after that your cluster should come up cleanly. Note that
formatting erases all existing HDFS data, so only do this if you can afford
to lose it.
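For reference, the full sequence I mean is roughly the following (paths assume you run it from your Hadoop install directory; again, -format destroys all HDFS metadata, so back up anything you need first):

```shell
# Stop HDFS on all nodes
bin/stop-dfs.sh

# Re-initialize the NameNode's storage directory.
# WARNING: this destroys all existing HDFS metadata and data references.
bin/hadoop namenode -format

# Start HDFS again; DataNodes will re-register with the fresh NameNode
bin/start-dfs.sh
```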

Hope this helps.

Best regards,
Starry

/* Tomorrow is another day. So is today. */


On Sat, Sep 26, 2009 at 09:03, Zheng Lv <lvzheng19800...@gmail.com> wrote:

> No one can give some suggestions?
>
> 2009/9/25 Zheng Lv <lvzheng19800...@gmail.com>
>
> > Hello Everyone,
> >     We have a cluster with one namenode and three datanodes. And we got
> > these logs when starting hadoop0.20. Is it normal?
> >     2009-09-25 10:45:00,616 INFO
> > org.apache.hadoop.hdfs.server.namenode.NameNode: STARTUP_MSG:
> > /************************************************************
> > STARTUP_MSG: Starting NameNode
> > STARTUP_MSG:   host = ubuntu6/192.168.33.7
> > STARTUP_MSG:   args = []
> > STARTUP_MSG:   version = 0.20.1-dev
> > STARTUP_MSG:   build =  -r ; compiled by 'cyd' on Fri Aug 07 15:20:16
> > CST 2009
> > ************************************************************/
> > 2009-09-25 10:45:00,721 INFO org.apache.hadoop.ipc.metrics.RpcMetrics:
> > Initializing RPC Metrics with hostName=NameNode, port=9000
> > 2009-09-25 10:45:00,767 INFO
> > org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at: ubuntu6/
> > 192.168.33.7:9000
> > 2009-09-25 10:45:00,769 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
> > Initializing JVM Metrics with processName=NameNode, sessionId=null
> > 2009-09-25 10:45:00,771 INFO
> > org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics:
> Initializing
> > NameNodeMeterics using context
> > object:org.apache.hadoop.metrics.spi.NullContext
> > 2009-09-25 10:45:00,835 INFO
> > org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
> > fsOwner=cyd,cyd,adm,dialout,cdrom,plugdev,lpadmin,sambashare,admin
> > 2009-09-25 10:45:00,835 INFO
> > org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
> supergroup=supergroup
> > 2009-09-25 10:45:00,835 INFO
> > org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
> > isPermissionEnabled=true
> > 2009-09-25 10:45:00,842 INFO
> > org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics:
> > Initializing FSNamesystemMetrics using context
> > object:org.apache.hadoop.metrics.spi.NullContext
> > 2009-09-25 10:45:00,843 INFO
> > org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered
> > FSNamesystemStatusMBean
> > 2009-09-25 10:45:00,871 INFO
> org.apache.hadoop.hdfs.server.common.Storage:
> > Number of files = 2834
> > 2009-09-25 10:45:01,412 INFO
> org.apache.hadoop.hdfs.server.common.Storage:
> > Number of files under construction = 1
> > 2009-09-25 10:45:01,416 INFO
> org.apache.hadoop.hdfs.server.common.Storage:
> > Image file of size 329240 loaded in 0 seconds.
> > 2009-09-25 10:45:01,417 INFO
> org.apache.hadoop.hdfs.server.common.Storage:
> > Edits file /home/cyd/hdfs2/name/current/edits of size 4 edits # 0 loaded
> in
> > 0 seconds.
> > 2009-09-25 10:45:01,683 INFO
> org.apache.hadoop.hdfs.server.common.Storage:
> > Image file of size 329240 saved in 0 seconds.
> > 2009-09-25 10:45:01,805 INFO
> > org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Finished loading
> > FSImage in 1007 msecs
> > 2009-09-25 10:45:01,821 INFO org.apache.hadoop.hdfs.StateChange: STATE*
> > Safe mode ON.
> > The ratio of reported blocks 0.0000 has not reached the threshold 0.9990.
> > Safe mode will be turned off automatically.
> > 2009-09-25 10:45:07,162 INFO org.mortbay.log: Logging to
> > org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via
> > org.mortbay.log.Slf4jLog
> > 2009-09-25 10:45:07,228 INFO org.apache.hadoop.http.HttpServer: Jetty
> bound
> > to port 50070
> > 2009-09-25 10:45:07,229 INFO org.mortbay.log: jetty-6.1.14
> > 2009-09-25 10:47:34,305 INFO org.mortbay.log: Started
> > selectchannelconnec...@0.0.0.0:50070
> > 2009-09-25 10:47:34,319 INFO
> > org.apache.hadoop.hdfs.server.namenode.NameNode: Web-server up at:
> > 0.0.0.0:50070
> > 2009-09-25 10:47:34,320 INFO org.apache.hadoop.ipc.Server: IPC Server
> > Responder: starting
> > 2009-09-25 10:47:34,321 INFO org.apache.hadoop.ipc.Server: IPC Server
> > listener on 9000: starting
> > 2009-09-25 10:47:34,355 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 0 on 9000: starting
> > 2009-09-25 10:47:34,355 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 1 on 9000: starting
> > 2009-09-25 10:47:34,356 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 2 on 9000: starting
> > 2009-09-25 10:47:34,356 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 3 on 9000: starting
> > 2009-09-25 10:47:34,384 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 5 on 9000: starting
> > 2009-09-25 10:47:34,397 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 4 on 9000: starting
> > 2009-09-25 10:47:34,398 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 6 on 9000: starting
> > 2009-09-25 10:47:34,400 WARN org.apache.hadoop.ipc.Server: IPC Server
> > Responder, call
> > getProtocolVersion(org.apache.hadoop.hdfs.protocol.ClientProtocol, 41)
> from
> > 192.168.33.7:54977: output error
> > 2009-09-25 10:47:34,401 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 7 on 9000: starting
> > 2009-09-25 10:47:34,402 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 8 on 9000: starting
> > 2009-09-25 10:47:34,404 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 9 on 9000: starting
> > 2009-09-25 10:47:34,433 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 1 on 9000 caught: java.nio.channels.ClosedChannelException
> >         at
> > sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:126)
> >         at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324)
> >         at org.apache.hadoop.ipc.Server.channelWrite(Server.java:1195)
> >         at org.apache.hadoop.ipc.Server.access$1900(Server.java:77)
> >         at
> > org.apache.hadoop.ipc.Server$Responder.processResponse(Server.java:613)
> >         at
> > org.apache.hadoop.ipc.Server$Responder.doRespond(Server.java:677)
> >         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:981)
> > 2009-09-25 10:48:09,304 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
> > NameSystem.registerDatanode: node registration from 192.168.33.5:50010
> > storage DS-1782592326-192.168.33.5-50010-1253784633678
> > 2009-09-25 10:48:09,311 INFO org.apache.hadoop.net.NetworkTopology:
> Adding
> > a new node: /default-rack/192.168.33.5:50010
> > 2009-09-25 10:48:09,920 INFO org.apache.hadoop.hdfs.StateChange: STATE*
> > Safe mode ON.
> > The ratio of reported blocks 0.0005 has not reached the threshold 0.9990.
> > Safe mode will be turned off automatically.
> > 2009-09-25 10:48:09,985 INFO org.apache.hadoop.hdfs.StateChange: STATE*
> > Safe mode extension entered.
> > The ratio of reported blocks 0.9990 has reached the threshold 0.9990.
> Safe
> > mode will be turned off automatically in 29 seconds.
> > 2009-09-25 10:48:29,989 INFO org.apache.hadoop.hdfs.StateChange: STATE*
> > Safe mode ON.
> > The ratio of reported blocks 1.0000 has reached the threshold 0.9990.
> Safe
> > mode will be turned off automatically in 9 seconds.
> > 2009-09-25 10:48:32,412 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
> > NameSystem.registerDatanode: node registration from 192.168.33.6:50010
> > storage DS-119886468-192.168.33.6-50010-1253784635966
> > 2009-09-25 10:48:32,412 INFO org.apache.hadoop.net.NetworkTopology:
> Adding
> > a new node: /default-rack/192.168.33.6:50010
> > 2009-09-25 10:48:40,147 INFO
> > org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Total number of
> blocks
> > = 2052
> > 2009-09-25 10:48:40,147 INFO
> > org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of invalid
> > blocks = 0
> > 2009-09-25 10:48:40,147 INFO
> > org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of
> > under-replicated blocks = 2051
> > 2009-09-25 10:48:40,147 INFO
> > org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of
> > over-replicated blocks = 0
> > 2009-09-25 10:48:40,147 INFO org.apache.hadoop.hdfs.StateChange: STATE*
> > Leaving safe mode after 219 secs.
> > 2009-09-25 10:48:40,147 INFO org.apache.hadoop.hdfs.StateChange: STATE*
> > Safe mode is OFF.
> > 2009-09-25 10:48:40,148 INFO org.apache.hadoop.hdfs.StateChange: STATE*
> > Network topology has 1 racks and 2 datanodes
> > 2009-09-25 10:48:40,148 INFO org.apache.hadoop.hdfs.StateChange: STATE*
> > UnderReplicatedBlocks has 2051 blocks
> >
> >
> >     I noticed the exceptions and "org.apache.hadoop.hdfs.StateChange:
> > STATE* Network topology has 1 racks and 2 datanodes". Sometimes it
> > reports 1 datanode, sometimes 2, but we have 3. Please give some
> > suggestions, thank you.
> >
> >
> >
> >     Following is logs in one of datanodes:
> >     2009-09-25 10:47:22,110 INFO
> > org.apache.hadoop.hdfs.server.datanode.DataNode: Registered
> > FSDatasetStatusMBean
> > 2009-09-25 10:47:22,112 INFO
> > org.apache.hadoop.hdfs.server.datanode.DataNode: Opened info server at
> 50010
> > 2009-09-25 10:47:22,114 INFO
> > org.apache.hadoop.hdfs.server.datanode.DataNode: Balancing bandwith is
> > 1048576 bytes/s
> > 2009-09-25 10:47:27,257 INFO org.mortbay.log: Logging to
> > org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via
> > org.mortbay.log.Slf4jLog
> > 2009-09-25 10:47:27,434 INFO org.apache.hadoop.http.HttpServer: Jetty
> bound
> > to port 50075
> > 2009-09-25 10:47:27,434 INFO org.mortbay.log: jetty-6.1.14
> > 2009-09-25 10:48:14,324 INFO org.mortbay.log: Started
> > selectchannelconnec...@0.0.0.0:50075
> > 2009-09-25 10:48:14,334 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
> > Initializing JVM Metrics with processName=DataNode, sessionId=null
> > 2009-09-25 10:48:19,369 INFO org.apache.hadoop.ipc.metrics.RpcMetrics:
> > Initializing RPC Metrics with hostName=DataNode, port=50020
> > 2009-09-25 10:48:19,376 INFO org.apache.hadoop.ipc.Server: IPC Server
> > Responder: starting
> > 2009-09-25 10:48:19,376 INFO org.apache.hadoop.ipc.Server: IPC Server
> > listener on 50020: starting
> > 2009-09-25 10:48:19,378 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 0 on 50020: starting
> > 2009-09-25 10:48:19,380 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 1 on 50020: starting
> > 2009-09-25 10:48:19,380 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 2 on 50020: starting
> > 2009-09-25 10:48:19,381 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 3 on 50020: starting
> > 2009-09-25 10:48:19,381 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 4 on 50020: starting
> > 2009-09-25 10:48:19,381 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 5 on 50020: starting
> > 2009-09-25 10:48:19,382 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 6 on 50020: starting
> > 2009-09-25 10:48:19,382 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 7 on 50020: starting
> > 2009-09-25 10:48:19,382 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 8 on 50020: starting
> > 2009-09-25 10:48:19,383 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 9 on 50020: starting
> > 2009-09-25 10:48:19,383 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 10 on 50020: starting
> > 2009-09-25 10:48:19,383 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 11 on 50020: starting
> > 2009-09-25 10:48:19,384 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 12 on 50020: starting
> > 2009-09-25 10:48:19,384 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 13 on 50020: starting
> > 2009-09-25 10:48:19,385 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 14 on 50020: starting
> > 2009-09-25 10:48:19,385 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 15 on 50020: starting
> > 2009-09-25 10:48:19,385 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 16 on 50020: starting
> > 2009-09-25 10:48:19,386 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 17 on 50020: starting
> > 2009-09-25 10:48:19,386 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 18 on 50020: starting
> > 2009-09-25 10:48:19,386 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 19 on 50020: starting
> > 2009-09-25 10:48:19,387 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 20 on 50020: starting
> > 2009-09-25 10:48:19,387 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 21 on 50020: starting
> > 2009-09-25 10:48:19,388 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 22 on 50020: starting
> > 2009-09-25 10:48:19,388 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 23 on 50020: starting
> > 2009-09-25 10:48:19,388 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 24 on 50020: starting
> > 2009-09-25 10:48:19,388 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 25 on 50020: starting
> > 2009-09-25 10:48:19,389 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 26 on 50020: starting
> > 2009-09-25 10:48:19,389 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 27 on 50020: starting
> > 2009-09-25 10:48:19,390 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 28 on 50020: starting
> > 2009-09-25 10:48:19,390 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 29 on 50020: starting
> > 2009-09-25 10:48:19,390 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 30 on 50020: starting
> > 2009-09-25 10:48:19,391 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 31 on 50020: starting
> > 2009-09-25 10:48:19,391 INFO
> > org.apache.hadoop.hdfs.server.datanode.DataNode: dnRegistration =
> > DatanodeRegistration(ubuntu7:50010,
> > storageID=DS-119886468-192.168.33.6-50010-1253784635966, infoPort=50075,
> > ipcPort=50020)
> > 2009-09-25 10:48:19,434 INFO
> > org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(
> > 192.168.33.6:50010,
> > storageID=DS-119886468-192.168.33.6-50010-1253784635966, infoPort=50075,
> > ipcPort=50020)In DataNode.run, data =
> > FSDataset{dirpath='/home/cyd/hdfs2/data/current'}
> > 2009-09-25 10:48:19,456 INFO
> > org.apache.hadoop.hdfs.server.datanode.DataNode: using
> BLOCKREPORT_INTERVAL
> > of 3600000msec Initial delay: 0msec
> > 2009-09-25 10:48:19,950 INFO
> > org.apache.hadoop.hdfs.server.datanode.DataNode: BlockReport of 2052
> blocks
> > got processed in 458 msecs
> > 2009-09-25 10:48:19,951 INFO
> > org.apache.hadoop.hdfs.server.datanode.DataNode: Starting Periodic block
> > scanner.
> > 2009-09-25 10:48:20,601 INFO
> > org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
> > succeeded for blk_2990263655324635865_9556
> > 2009-09-25 10:48:58,692 INFO
> > org.apache.hadoop.hdfs.server.datanode.DataNode: BlockReport of 2052
> blocks
> > got processed in 224 msecs
> > 2009-09-25 10:51:07,955 INFO
> > org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block
> > blk_-1955286602120745744_18039 src: /192.168.33.5:43889 dest: /
> > 192.168.33.6:50010
> > 2009-09-25 10:51:07,982 INFO
> > org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
> > 192.168.33.5:43889, dest: /192.168.33.6:50010, bytes: 4, op: HDFS_WRITE,
> > cliID: DFSClient_2050853952, srvID:
> > DS-119886468-192.168.33.6-50010-1253784635966, blockid:
> > blk_-1955286602120745744_18039
> > 2009-09-25 10:51:07,983 INFO
> > org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for
> block
> > blk_-1955286602120745744_18039 terminating
> > 2009-09-25 10:51:10,742 INFO
> > org.apache.hadoop.hdfs.server.datanode.DataNode: Deleting block
> > blk_-6733538557527582040_18038 file
> > /home/cyd/hdfs2/data/current/blk_-6733538557527582040
> > 2009-09-25 10:56:45,523 INFO
> > org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
> > succeeded for blk_-8204684875565980176_3413
> > 2009-09-25 11:03:06,572 INFO
> > org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
> > succeeded for blk_8093352211012570299_17220
> > 2009-09-25 11:10:28,649 INFO
> > org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
> > succeeded for blk_-6978381366409301329_8279
> >
> > Thanks a lot,
> > LvZheng
> >
>