Hi,

I am seeing strange behavior in my cluster. At random times the datanodes
stop sending any information (no logs are written), so the namenode thinks
they are down. After some time (approx. 30 minutes) the datanodes come back
up and start behaving properly. I tried to find an error log, but the
datanodes do not write any error message during this time.

The namenode shows warnings similar to:

2011-07-28 20:59:35,275 WARN
org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
PendingReplicationMonitor timed out block blk_8370263993564715002_23947922

I have checked that this is not caused by a network outage or by some other
process eating up the CPU.

Please help me with this.
--
Rahul
