Fixing a bad HD

2011-04-25 Thread Mayuran Yogarajah

Hello,

One of our nodes has a bad hard disk which needs to be replaced.  I'm 
planning on doing the following:

1) Decommission the node
2) Replace the disk
3) Bring the node back into the cluster

Is there a quicker/better way to address this? Please advise.

thanks,
M
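
The decommission step in the plan above can be sketched as follows. This is a minimal sketch, assuming the NameNode is configured with an exclude file through the dfs.hosts.exclude property; the hostname is hypothetical and the helper names are made up for illustration. `hadoop dfsadmin -refreshNodes` is the command that asks the NameNode to re-read that file.

```python
# Sketch only: helper names and the hostname are hypothetical.
# Assumes the NameNode reads an exclude file named by dfs.hosts.exclude.

# Command that tells the NameNode to re-read its include/exclude files.
REFRESH_NODES = ["hadoop", "dfsadmin", "-refreshNodes"]


def add_to_excludes(exclude_lines, hostname):
    """Return the exclude-file entries with hostname present exactly once."""
    hosts = [line.strip() for line in exclude_lines if line.strip()]
    if hostname not in hosts:
        hosts.append(hostname)
    return hosts


def remove_from_excludes(exclude_lines, hostname):
    """Return the exclude-file entries without hostname (step 3 of the plan)."""
    hosts = [line.strip() for line in exclude_lines if line.strip()]
    return [host for host in hosts if host != hostname]


if __name__ == "__main__":
    current = ["node07.example.com\n"]
    # Step 1: list the node, write the file back, then run REFRESH_NODES.
    print(add_to_excludes(current, "node12.example.com"))
    print(" ".join(REFRESH_NODES))
```

After the disk is replaced, the same file is edited in the other direction and the refresh command is run again so the node can rejoin.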


Re: Fixing a bad HD

2011-04-25 Thread James Seigel
Quicker:

Shut off power
Throw hard drive out, put new one in
Turn power back on.

Sent from my mobile. Please excuse the typos.



Re: Fixing a bad HD

2011-04-25 Thread Brian Bockelman
Much quicker, but less safe: data might become inaccessible between boots if 
you simultaneously lose another node.  Probably not an issue at 3 replicas, but 
definitely an issue at 2.

Brian
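
The arithmetic behind the 2-versus-3 replica point can be written out. This is a worst-case sketch, not HDFS code: it assumes a block had one replica on the node that is powered off and that the second node to fail also held a replica of the same block. The function names are hypothetical.

```python
# Worst-case sketch; function names are hypothetical, not part of Hadoop.

def copies_left(replication, holders_down):
    """Replicas of one block still reachable when holders_down of its hosts are offline."""
    return max(replication - holders_down, 0)


def survives_second_failure(replication):
    """True if a block stays readable when the node under repair and one
    more node holding the same block are both offline."""
    return copies_left(replication, 2) > 0


if __name__ == "__main__":
    for replication in (2, 3):
        print(replication, survives_second_failure(replication))
```

With replication 3 one copy remains after the second failure; with replication 2 none does, which is why the quick power-off approach is riskier at 2.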






Re: Fixing a bad HD

2011-04-25 Thread James Seigel
Good point. Advice without details can be tough.

Additional notes:  make sure you have three replicas and the blocks
are replicated. :)

Sent from my mobile. Please excuse the typos.
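
One way to check that blocks are fully replicated before pulling the disk is to run `hadoop fsck /` and look at its summary. The sketch below only parses text; it assumes the summary contains a line beginning "Under-replicated blocks:", and the sample string is a hand-written stand-in rather than captured output. The function name is hypothetical.

```python
import re

# Command whose summary output is being inspected.
FSCK_COMMAND = ["hadoop", "fsck", "/"]


def under_replicated_count(fsck_output):
    """Return the under-replicated block count from fsck text, or None if
    the expected summary line is not present."""
    match = re.search(r"Under-replicated blocks:\s*(\d+)", fsck_output)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    # Hand-written stand-in for one summary line, not real cluster output.
    sample = " Under-replicated blocks:\t0 (0.0 %)\n"
    count = under_replicated_count(sample)
    print("safe to proceed" if count == 0 else "wait for replication")
```

A count of zero before the swap means no block was already short of replicas; anything else suggests waiting.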



RE: Fixing a bad HD

2011-04-25 Thread Jones, Nick
Several SATA controllers support hot-swapping in Linux, but you're still at the 
whim of replication.

Nick Jones





Re: Fixing a bad HD

2011-04-25 Thread Bharath Mundlapudi
Right, if you have hardware which supports hot-swappable disks, this might be 
the easiest option. But you will still need to restart the datanode to detect 
the new disk. There is an open Jira on this.

-Bharath
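
The restart after a hot swap might look like the sketch below. It assumes the datanode's storage directories are the ones listed in dfs.data.dir and that the daemon is managed with hadoop-daemon.sh; the check function and the example paths are hypothetical. The idea is to confirm the new disk is mounted and writable before restarting.

```python
import os

# Stop/start pair for the datanode daemon, run on the affected node.
RESTART_COMMANDS = [
    ["hadoop-daemon.sh", "stop", "datanode"],
    ["hadoop-daemon.sh", "start", "datanode"],
]


def unusable_data_dirs(data_dirs):
    """Return the configured directories that are missing or not writable."""
    return [
        path for path in data_dirs
        if not (os.path.isdir(path) and os.access(path, os.W_OK))
    ]


if __name__ == "__main__":
    # Hypothetical dfs.data.dir entries; substitute the real ones.
    configured = ["/data/1/dfs/dn", "/data/2/dfs/dn"]
    bad = unusable_data_dirs(configured)
    if bad:
        print("fix before restarting:", bad)
    else:
        for command in RESTART_COMMANDS:
            print(" ".join(command))
```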





Re: Fixing a bad HD

2011-04-26 Thread Steve Loughran

On 26/04/11 05:20, Bharath Mundlapudi wrote:

> Right, if you have hardware which supports hot-swappable disks, this might be 
> the easiest option. But you will still need to restart the datanode to detect 
> the new disk. There is an open Jira on this.
>
> -Bharath



That'll be HDFS-664
 https://issues.apache.org/jira/browse/HDFS-664

Nobody is working on this; all contributions are welcome.


Re: Fixing a bad HD

2011-04-26 Thread Steve Loughran

On 26/04/11 05:20, Bharath Mundlapudi wrote:

> Right, if you have hardware which supports hot-swappable disks, this might be 
> the easiest option. But you will still need to restart the datanode to detect 
> the new disk. There is an open Jira on this.
>
> -Bharath


Correction: there is a patch up there now. If you want to get involved 
in the coding of Hadoop to meet your specific needs, this might be the 
place to start.