Yes, I have a SecondaryNamenode running. Here is the log for the
SecondaryNamenode:

2011-07-21 16:02:47,908 INFO org.apache.hadoop.hdfs.server.common.Storage:
Edits file /home/hadoop/tmp/dfs/namesecondary/current/edits of size 12751835
edits # 138217 loaded in 1581 seconds.
2011-07-21 16:03:21,925 INFO org.apache.hadoop.hdfs.server.common.Storage:
Image file of size 2045516451 saved in 29 seconds.
2011-07-21 16:03:24,974 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of transactions:
0 Total time for transactions(ms): 0Number of transactions batched in Syncs:
0 Number of syncs: 0 SyncTimes(ms): 0
2011-07-21 16:03:25,545 INFO
org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Posted URL
xx.xx.xx.xx:50070putimage=1&port=50090&machine=xx.xx.xx.xx&token=-18:1554828842:0:1311242583000:1311240481442
2011-07-21 16:29:24,356 ERROR
org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Exception in
doCheckpoint:
2011-07-21 16:29:24,358 ERROR
org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode:
java.io.IOException: Call to xx.xx.xx.xx:9000 failed on local exception:
java.io.IOException: Connection reset by peer
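
For what it's worth, here is a rough sketch (plain Python, not part of Hadoop) that pulls the edit-replay figures out of an "Edits file ... loaded in N seconds" line like the one above. The regex and the function name edits_replay_rate are my own assumptions, based only on the log lines in this thread; other Hadoop versions may word the message differently. For the line above it works out to roughly 87 edits per second.

```python
import re

# Pattern assumed from the "Edits file" log line shown in this thread;
# this is a guess at the format, not an official Hadoop log specification.
EDITS_RE = re.compile(
    r"Edits file (?P<path>\S+) of size (?P<size>\d+)\s+"
    r"edits # (?P<count>\d+)\s+loaded in (?P<secs>\d+) seconds"
)

def edits_replay_rate(log_text):
    """Return (path, edit count, seconds, edits per second) for each match."""
    # The log lines are wrapped in the mail, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    results = []
    for m in EDITS_RE.finditer(flat):
        count = int(m.group("count"))
        secs = int(m.group("secs"))
        rate = count / secs if secs else float("inf")
        results.append((m.group("path"), count, secs, rate))
    return results

sample = """
2011-07-21 16:02:47,908 INFO org.apache.hadoop.hdfs.server.common.Storage:
Edits file /home/hadoop/tmp/dfs/namesecondary/current/edits of size 12751835
edits # 138217 loaded in 1581 seconds.
"""

for path, count, secs, rate in edits_replay_rate(sample):
    print(f"{path}: {count} edits in {secs} s ({rate:.1f} edits/s)")
```

Quoted log excerpts would need their ">>" prefixes stripped before being passed to this function.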

Regards,
Rahul

On Fri, Jul 22, 2011 at 5:40 PM, Joey Echeverria <j...@cloudera.com> wrote:

> Do you have an instance of the SecondaryNamenode in your cluster?
>
> -Joey
>
>
> On Fri, Jul 22, 2011 at 3:15 AM, Rahul Das <rahul.h...@gmail.com> wrote:
>
>> Hi,
>>
>> I am running a Hadoop cluster with 20 data nodes. Yesterday I found that
>> the Namenode was not responding (no reads or writes to HDFS were happening).
>> It was stuck for a few hours, so I shut down the Namenode and found the
>> following error in the Namenode log.
>>
>> 2011-07-21 16:15:31,500 WARN org.apache.hadoop.ipc.Server: IPC Server
>> Responder, call
>> getProtocolVersion(org.apache.hadoop.hdfs.protocol.ClientProtocol, 41) from
>> xx.xx.xx.xx:13568: output error
>>
>> This error appeared for every data node, and the data nodes were not able
>> to communicate with the Namenode.
>>
>> After I restarted the Namenode, it logged the following:
>>
>> 2011-07-21 16:31:54,110 INFO
>> org.apache.hadoop.hdfs.server.namenode.NameNode: STARTUP_MSG:
>> 2011-07-21 16:31:54,216 INFO org.apache.hadoop.ipc.metrics.RpcMetrics:
>> Initializing RPC Metrics with hostName=NameNode, port=9000
>> 2011-07-21 16:31:54,223 INFO
>> org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at:
>> xx.xx.xx.xx:9000
>> 2011-07-21 16:31:54,225 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
>> Initializing JVM Metrics with processName=NameNode, sessionId=null
>> 2011-07-21 16:31:54,226 INFO
>> org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics: Initializing
>> NameNodeMeterics using context
>> object:org.apache.hadoop.metrics.spi.NullContext
>> 2011-07-21 16:31:54,280 INFO
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner=hadoop,hadoop
>> 2011-07-21 16:31:54,280 INFO
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup=supergroup
>> 2011-07-21 16:31:54,280 INFO
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>> isPermissionEnabled=false
>> 2011-07-21 16:31:54,287 INFO
>> org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics:
>> Initializing FSNamesystemMetrics using context
>> object:org.apache.hadoop.metrics.spi.NullContext
>> 2011-07-21 16:31:54,289 INFO
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered
>> FSNamesystemStatusMBean
>> 2011-07-21 16:31:54,880 INFO org.apache.hadoop.hdfs.server.common.Storage:
>> Number of files = 15817482
>> 2011-07-21 16:34:38,463 INFO org.apache.hadoop.hdfs.server.common.Storage:
>> Number of files under construction = 82
>> 2011-07-21 16:34:41,177 INFO org.apache.hadoop.hdfs.server.common.Storage:
>> Image file of size 2042701824 loaded in 166 seconds.
>> 2011-07-21 16:58:07,624 INFO org.apache.hadoop.hdfs.server.common.Storage:
>> Edits file /home/hadoop/current/edits of size 12751835 edits # 138217 loaded
>> in 1406 seconds.
>>
>> At that point it stalled for a long time. After about an hour it started
>> working again.
>>
>> My question is: when does the "IPC Server Responder" error occur, and is
>> there a way to deal with it?
>> Also, if my Namenode is busy doing something, how can I find out what it
>> is doing?
>>
>> Regards,
>> Rahul
>
>
>
>
> --
> Joseph Echeverria
> Cloudera, Inc.
> 443.305.9434
>
>
