version of CDH HBase?
Best regards,
- Andy
Problems worthy of attack prove their worth by hitting back. - Piet Hein (via
Tom White)
- Original Message -
From: Ted Yu
To: user@hbase.apache.org
Cc:
Sent: Tuesday, February 7, 2012 3:45 AM
Subject: Re: Should a data node restart cause a region server to go down?
>
> In your case Error Recovery wasn't successful because of:
> All datanodes 10.49.29.92:50010 are bad. Aborting...
>
> On Mon, Feb 6, 2012 at 10:28 AM, Jeff Whiting wrote:
So I restart one of the data nodes and everything continues to work just fine even though the local
one is no longer valid. Additionally, I can restart n-1 nodes without any problem and HBase
continues to work. However, as soon as I restart the last data node, RSs start dying. hbck and fsck
say
This is the normal behavior of the sync API (when the first DN in the
pipeline fails, the whole op fails); correct me if I am wrong.
The rule here, I think, was that you do not want RSes to switch over to
writing to a remote DN because the first one in the pipeline (always the
local one) failed. He
What would "hadoop fsck /" show for that type of problem if there really were no nodes with that data? The
worst I've seen is: Target Replicas is 4 but found 3 replica(s).
~Jeff
On 2/6/2012 12:45 PM, Ted Yu wrote:
In your case Error Recovery wasn't successful because of:
All datanodes 10.49.29.92:500
I've been able to reproduce this on multiple clusters. I'm basically doing a rolling restart of
data nodes, with 1 every 5-10+ minutes. However, the region servers will just die. "hadoop fsck /"
shows it is healthy, the web interface says all the data nodes are up, and the region server logs seem
q
In your case Error Recovery wasn't successful because of:
All datanodes 10.49.29.92:50010 are bad. Aborting...
On Mon, Feb 6, 2012 at 10:28 AM, Jeff Whiting wrote:
> I was increasing the storage on some of my data nodes and thus had to do a
> restart of the data node. I use cdh3u2 and ran "/etc
I was increasing the storage on some of my data nodes and thus had to do a restart of the data
node. I use cdh3u2 and ran "/etc/init.d/hadoop-0.20-datanode restart" (I don't think this is a cdh
problem). Unfortunately doing the restart caused region servers to go offline. Is this expected
beha