Two years running, I had a RAID drive blow out while I was on vacation
and a second one blow out before I got back, meaning I had to rebuild
the whole $%#%!!! thing from scratch both times.

The HP hot-swap drives I worked with were packaged such that pressing
the drive's quick-release lever would power it down cleanly. Of course,
since the drive was supposedly kaput, that was a lesser concern, but
it's always possible to press the wrong lever (whoops!).

My bigger problem was finding exact-match replacement disks after a
year or two of service. You'd think that in a company of over 1200
people, someone might have a spare. Unfortunately, most of them weren't
into 10K RPM hot-swap servers. To say nothing of tape library units.

At least with Linux, heterogeneous RAID isn't an issue.
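
For what mixing sizes costs you, here is a back-of-the-envelope sketch in
Python. It assumes Linux md RAID-5/6, where every member is sized down to
the smallest device in the array, and it ignores metadata overhead. The
function name and the disk sizes are made up for illustration.

```python
def md_usable_gb(member_sizes_gb, level):
    """Approximate usable capacity of an md RAID-5/6 array with mixed disks.

    Assumption: each member contributes only as much space as the
    smallest member; superblock/metadata overhead is ignored.
    """
    parity = {5: 1, 6: 2}[level]          # parity members per stripe
    n = len(member_sizes_gb)
    if n < parity + 2:                    # RAID-5 needs 3 disks, RAID-6 needs 4
        raise ValueError("not enough members for RAID-%d" % level)
    smallest = min(member_sizes_gb)
    return smallest * (n - parity)


# Hypothetical mix: two original 146 GB disks plus whatever was on the shelf.
mixed = [146, 300, 146, 160]
raid5_gb = md_usable_gb(mixed, 5)
raid6_gb = md_usable_gb(mixed, 6)
```

So the bigger replacement works fine; you just don't get to use the extra
space on it.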

William, if you have 2 spares, you ought to consider RAID-6. It
survives two drives failing at once, which is exactly the case I kept
hitting.
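
A rough sketch of the trade-off, in Python. The 8-disk shelf is a made-up
number (I don't know how many bays William has), and the function name is
mine. It only counts disks: a hot spare helps after a rebuild finishes,
not while two drives are dead at the same moment.

```python
def layout(total_disks, level, hot_spares):
    """Count data disks and concurrent failures tolerated for RAID-5/6."""
    parity = {5: 1, 6: 2}[level]          # parity members per stripe
    active = total_disks - hot_spares     # disks actually in the array
    if active < parity + 2:               # RAID-5 needs 3 active, RAID-6 needs 4
        raise ValueError("too few active disks for RAID-%d" % level)
    return {
        "data_disks": active - parity,
        "concurrent_failures_tolerated": parity,
    }


# Hypothetical 8-bay shelf.
raid5_two_spares = layout(8, 5, 2)   # the current setup
raid6_one_spare = layout(8, 6, 1)    # same data disks, one spare traded for parity
```

With those assumptions both layouts leave 5 disks' worth of data space,
but the RAID-6 one keeps running if a second drive dies before the first
rebuild is done.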

  Tim

On Tue, 2011-02-01 at 16:02 -0500, William L. Thomson Jr. wrote:
> On Tue, 2011-02-01 at 15:50 -0500, robert mckennon wrote:
> > believe it or not, I have never actually swapped out a hard-drive in a
> > raided system.
> 
> We all have our first time with something at some point. No one was born
> knowing or having done it all before ;)
> 
> > I have an HP ProLiant ML350 G4p running RHEL 5.3 with a bad drive in a
> > RAID-5.
> 
> I'm assuming you have an HP Smart Array controller or some other RAID
> controller.
> 
> > It says it's hot-swappable....  but is that the best practice, or
> > should I shut down and then replace the drive?
> 
> If the machine is still running, the RAID controller will usually have
> already taken the drive out of use. If that's the case, the drive in
> question will show a red light instead of green (or possibly another
> color), and you can safely remove it and add a new one. With HP Smart
> Arrays you can use hpacucli to check status, make changes, etc. to the
> array while the machine is up and running.
> 
> You can take it down if you want, though you might find yourself with
> more than one bad drive going that route. I would just swap out the
> drive now with a known good one and let it do its thing. Once it's back
> in use and you have a spare again, if you want to be really cautious and
> can take down the server, then a reboot can't hurt. It will
> re-initialize the controller and run a test on most of the drives.
> 
> Last time I lost a drive, for some reason it caused the server to lock
> up, though that was before any attempt to replace the drive. A reboot
> brought the machine back up, but the drive was still dead, and luckily
> just that one. After that I replaced the drive, and all has been well
> ever since. So at times freakish things can happen, even though they
> technically should not. I am running RAID 5 with 2 spares, so it should
> not have affected the running server, but it did.
> 



