We are running cdh3u2 with hbase 0.90.4-cdh3u2. I'm guessing the logs were left over from a previous crash. I also found some errant /etc/hosts entries that I'm guessing are the cause of some of the weirdness. So let's just say my problems are due to a misconfiguration and some previous crashes that showed up during the restart.

However, I'm thinking the null pointer is a real problem that shouldn't
make the master die.

~Jeff

On 11/23/2011 12:29 AM, Stack wrote:
On Tue, Nov 22, 2011 at 11:04 PM, Jeff Whiting <je...@qualtrics.com> wrote:
If I run /etc/init.d/hadoop-hbase-master restart, what would the expected
behavior be?  Are the region servers supposed to stay up until the master
is back?  Are they supposed to shut down?  What is _supposed_ to happen?

My understanding was that the region servers should continue to serve
without problems until the master comes back up.

Yes.  That is what is supposed to happen.
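
If you want to sanity-check that from the client side, a loop like this
(just a sketch; the table name "probe" and row key "r1" are placeholders
for whatever you have) should keep returning results while the master is
bounced:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;

public class MasterRestartProbe {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "probe");
    // Gets go straight to the region servers, not through the master,
    // so this loop should keep succeeding while the master is down.
    for (int i = 0; i < 60; i++) {
      Result r = table.get(new Get("r1".getBytes()));
      System.out.println(i + ": " + (r.isEmpty() ? "miss" : "hit"));
      Thread.sleep(1000);
    }
    table.close();
  }
}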


Instead, the master came up and started splitting logs (which I thought it
would only do if a region server had failed).

That is correct.

Nothing had failed at the time.  Were these old logs left over from a
previous crash?
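
You can see what the master saw by listing the WAL directories it
considers for splitting at startup.  A quick sketch, assuming the
default hbase.rootdir of /hbase (adjust the path if yours differs):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ListLeftoverWals {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    // In 0.90 each region server writes its WALs under
    // <hbase.rootdir>/.logs/<servername>.  Directories belonging to
    // servers that are no longer alive are what the master splits.
    FileStatus[] dirs = fs.listStatus(new Path("/hbase/.logs"));
    if (dirs == null) {
      System.out.println("no .logs directory found");
      return;
    }
    for (FileStatus dir : dirs) {
      System.out.println(dir.getPath().getName() + "  (modified "
          + new java.util.Date(dir.getModificationTime()) + ")");
    }
  }
}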



11/11/22 23:45:37 FATAL master.HMaster: Unhandled exception. Starting
shutdown.
java.lang.NullPointerException
    at
org.apache.hadoop.hbase.master.AssignmentManager.regionOnline(AssignmentManager.java:731)
    at

Which version of hbase?   I don't see a fix for this one in 0.90 branch.
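
For illustration only (this is not the actual AssignmentManager code),
a guard along these lines is the kind of fix that would let the master
log the bad record and keep going instead of dying:

public class RegionOnlineGuard {
  // Hypothetical sketch; parameter types are simplified to keep it
  // self-contained.  The point is the null check, not the bookkeeping.
  static void regionOnline(Object regionInfo, Object serverInfo) {
    if (regionInfo == null || serverInfo == null) {
      System.err.println("WARN: regionOnline called with null argument:"
          + " region=" + regionInfo + ", server=" + serverInfo);
      return;  // skip the record rather than let an NPE kill HMaster
    }
    // ... normal region/server bookkeeping would go here ...
  }
}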



Region server logs around that time:
11/11/22 23:45:46 WARN hdfs.DFSClient: Error Recovery for block
blk_6275991319376720851_37314 failed  because recovery from primary datanode
10.1.37.4:50010 failed 6 times.  Pipeline was 10.1.37.4:50010. Aborting...
11/11/22 23:45:46 WARN hdfs.DFSClient: Error while syncing
java.io.IOException: Error Recovery for block blk_6275991319376720851_37314
failed  because recovery from primary datanode 10.1.37.4:50010 failed 6
times.  Pipeline was 10.1.37.4:50010. Aborting...
    at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2833)
    at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:2305)
    at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2477)
11/11/22 23:45:46 FATAL wal.HLog: Could not append. Requesting close of hlog
java.io.IOException: Reflection
    at


This looks unrelated.  HDFS went away around this time?  What version of hdfs?
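
One quick way to answer that is a write-and-sync probe against the same
HDFS (a sketch; the path is a placeholder, and it needs the cluster's
configuration on the classpath):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsSyncProbe {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    Path p = new Path("/tmp/hdfs-sync-probe");
    FSDataOutputStream out = fs.create(p, true);
    out.write("probe".getBytes());
    // On CDH3 (0.20-append) sync() waits for the datanode pipeline to
    // ack the write, the same call path that aborted in the log above.
    out.sync();
    out.close();
    fs.delete(p, true);
    System.out.println("HDFS pipeline looks healthy");
  }
}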

St.Ack

--
Jeff Whiting
Qualtrics Senior Software Engineer
je...@qualtrics.com
