Jean-Adrien wrote:
...
Stack, you asked me if my hard disks were full. I said one is. Why did you
link the above problem with that? Because of the du problem noticed in
HADOOP-3232? I don't think I'm affected by that problem; my BlockReport
process duration is less than a second.
We were seeing HADOOP-3831 on our cluster (Hadoop 0.18.0 and HBase 0.18.1RC1). After a rebalance of the HDFS content, prompted by the observation that loading was lopsided, the issue went away. The thought -- not proven -- is that the lopsidedness was causing disks to fill, which eventually led to HADOOP-3831.

...
Another question by the way:
We saw that hadoop-default.xml is used by the hbase client, and it overrides
the replication factor; ok. But could it override the dfs.datanode.du.reserved /
dfs.datanode.pct properties? (These sound like datanode policy rather
than client policy.) I said that my settings don't seem to affect the
behaviour of the datanodes.
I could be wrong, but I don't see how. You are running start-dfs.sh over in HADOOP_HOME, not in HBASE_HOME. Unless you somehow have CLASSPATHs intermingled, datanode startup should not be picking up content of HBASE_HOME/conf.
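For what it's worth, a way to sanity-check which config the datanodes actually see: datanode-side properties belong in HADOOP_HOME/conf/hadoop-site.xml, which start-dfs.sh picks up at daemon startup. A minimal sketch of such an override (the 1 GB value here is purely illustrative, not a recommendation from this thread):

```xml
<!-- HADOOP_HOME/conf/hadoop-site.xml: read by the datanode at startup,
     not by hbase clients -->
<configuration>
  <property>
    <name>dfs.datanode.du.reserved</name>
    <!-- bytes of disk space to leave free per volume; 1 GB as an example -->
    <value>1073741824</value>
  </property>
</configuration>
```

If a setting like this has no visible effect, it is worth confirming that the datanodes were restarted after the edit and that no other hadoop-site.xml is earlier on their CLASSPATH.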

I owe you other answers/support. In particular, I need to try running with dfs.datanode.socket.write.timeout = 0 to see if I get the same problem as you. Let me know if there's anything else you'd have me try.
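For reference, the experiment above would be expressed as a property override in hadoop-site.xml; a sketch of just that fragment (0 disables the write timeout, as discussed, and is an experiment here rather than a recommended setting):

```xml
<!-- hadoop-site.xml fragment: disable the datanode socket write timeout -->
<property>
  <name>dfs.datanode.socket.write.timeout</name>
  <value>0</value>
</property>
```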

Thanks for all the excellent diagnosis.
St.Ack
