Re: raid5 failure

2000-07-24 Thread Bill Carlson

On Fri, 21 Jul 2000, Seth Vidal wrote:

 Hi,
  We've been using the software RAID 5 support in Linux for about 2-3 months now.
 We've had good luck with it.
 
 Until this week.
 
 In this one week we've lost two drives on a 3-drive array, completely
 eliminating the array. We have good backups, made every night, so the data
 is safe. The problem is this: what could have caused these dual drive
 failures?
 
 One went out on Saturday, the next on the following Friday. Complete death.
 
 One drive won't detect anywhere anymore and it's been RMA'd; the other
 detects, and I'm currently running mke2fs -c on it.

Hey Seth,

Sorry to hear about your drive failures. To me, this is something that
most people ignore about RAID 5: lose more than one drive and everything is
toast. Good reason to have a drive set up as a hot spare, not to mention an
extra drive lying on the shelf. And hold your breath while the array is
rebuilding.
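
For what it's worth, a hot spare is just one more stanza in /etc/raidtab. A
rough sketch, assuming the 0.90 raidtools and made-up device names:

raiddev /dev/md0
    raid-level            5
    nr-raid-disks         3
    nr-spare-disks        1
    persistent-superblock 1
    chunk-size            32
    parity-algorithm      left-symmetric
    device                /dev/sda1
    raid-disk             0
    device                /dev/sdb1
    raid-disk             1
    device                /dev/sdc1
    raid-disk             2
    device                /dev/sdd1
    spare-disk            0

With that in place the md driver starts rebuilding onto the spare as soon as
a member drops out, which at least shortens the window in which a second
failure takes everything down.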

Bill Carlson

Systems Programmer    [EMAIL PROTECTED]    |  Opinions are mine,
Virtual Hospital      http://www.vh.org/   |  not my employer's.
University of Iowa Hospitals and Clinics   |





Re: Swap on RAID

2000-06-02 Thread Bill Carlson

On Thu, 1 Jun 2000, Henry J. Cobb wrote:

 Does anybody really want to wait while their swap data is duplicated out to
 multiple disks by a CPU that is working to free up memory to run
 applications?
 
 Isn't swapping slow enough already?
 
 Why not simply swap on multiple disks, get hardware RAID-5 for swap, or buy
 RAM?


Linux uses swap intelligently: if areas of memory don't change, they get
swapped out to disk, making more physical RAM available for file caching
and the like. Having swap is good even if you have oodles of RAM, for that
reason alone.
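
As for spreading swap over several disks: the kernel will already stripe
across swap areas that share the same priority, no RAID required. A sketch
for /etc/fstab, with made-up partition names:

/dev/sda2   swap   swap   defaults,pri=1   0 0
/dev/sdb2   swap   swap   defaults,pri=1   0 0

Equal priorities get you roughly what RAID 0 would. Mirroring swap (RAID 1)
only buys you anything if you want to survive a disk dying without killing
whatever happened to be swapped out on it.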

Bill Carlson

Systems Programmer    [EMAIL PROTECTED]    |  Opinions are mine,
Virtual Hospital      http://www.vh.org/   |  not my employer's.
University of Iowa Hospitals and Clinics   |





Re: Raid5 with two failed disks?

2000-03-30 Thread Bill Carlson

On Thu, 30 Mar 2000, Martin Bene wrote:

 At 02:16 30.03.00, you wrote:
 Hi... I have a RAID 5 array using 4 IDE HDs. A few days ago the system
 hung: no reaction except ping from the host, nothing to see on the
 monitor. I rebooted the system and it told me 2 out of 4 disks were out
 of sync. Two disks have an event counter of 0062, the other two
 0064. I hope there is a way to fix this. I searched through the
 mailing list and found one thread, but it did not help me.
 
 Yes I do. Check Jakob's RAID HOWTO, section "recovering from multiple failures".
 
 You can recreate the superblocks of the RAID disks using mkraid; if you 
 explicitly mark one disk as failed in the raidtab, no automatic resync is 
 started, so you get to check whether everything works, perhaps change something, and 
 retry.
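
(For the archives: a minimal raidtab sketch of what Martin describes, with
made-up device names. Mark the disk you trust least as failed so no resync
starts, then re-run mkraid; recent raidtools want an explicit force flag,
--really-force last I checked.)

raiddev /dev/md0
    raid-level            5
    nr-raid-disks         4
    persistent-superblock 1
    chunk-size            32
    parity-algorithm      left-symmetric
    device                /dev/hda1
    raid-disk             0
    device                /dev/hdb1
    raid-disk             1
    device                /dev/hdc1
    raid-disk             2
    device                /dev/hdd1
    failed-disk           3

mkraid --really-force /dev/md0

The raid level, chunk size, and device order have to match what the array was
originally created with, or mkraid will cheerfully write superblocks over
your data.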


Hey all,

I've been thinking about this for a different project: how bad would it be
to set up RAID 5 to allow for 2 (or more) failures in an array? Or is this
handled under a different class of RAID (ignoring things like RAID 5 over
mirrored disks and such)?

Three words: Net block device 

Bill Carlson

Systems Programmer    [EMAIL PROTECTED]    |  Opinions are mine,
Virtual Hospital      http://www.vh.org/   |  not my employer's.
University of Iowa Hospitals and Clinics   |





Re: Raid5 with two failed disks?

2000-03-30 Thread Bill Carlson

On Thu, 30 Mar 2000, Theo Van Dinter wrote:

 On Thu, Mar 30, 2000 at 08:36:52AM -0600, Bill Carlson wrote:
  I've been thinking about this for a different project: how bad would it be
  to set up RAID 5 to allow for 2 (or more) failures in an array? Or is this
  handled under a different class of RAID (ignoring things like RAID 5 over
  mirrored disks and such)?
 
 You just can't do that with RAID5.  I seem to remember that there's a RAID 6
 or 7 that handles 2 disk failures (multiple parity devices or something like
 that).
 
 You can optionally do RAID 5+1, where you mirror partitions and then stripe
 across them a la RAID 0+1.  You'd have to lose 4 disks minimally before the
 array goes offline.

1+5 would still fail on 2 drives if those 2 drives were both from the
same RAID 1 set. The wasted space becomes more than N/2, but it might be
worth it for the HA aspect. RAID 6 looks cleaner, but that would require
someone to write an implementation, whereas you could do RAID 15 (51?)
now.
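
For the curious, stacking md devices ought to let you build 1+5 with the
current tools. A rough, untested sketch with made-up device names (you would
have to raidstart the mirrors before the RAID 5 on top of them):

raiddev /dev/md0
    raid-level            1
    nr-raid-disks         2
    persistent-superblock 1
    device                /dev/sda1
    raid-disk             0
    device                /dev/sdb1
    raid-disk             1

(define /dev/md1 and /dev/md2 the same way over the other two pairs)

raiddev /dev/md3
    raid-level            5
    nr-raid-disks         3
    persistent-superblock 1
    chunk-size            32
    parity-algorithm      left-symmetric
    device                /dev/md0
    raid-disk             0
    device                /dev/md1
    raid-disk             1
    device                /dev/md2
    raid-disk             2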

My thought here is leading to a distributed file system that is server
independent; it seems something like that would solve a lot of problems
that things like NFS and Coda don't handle. From what I've read, GFS is
supposed to do this; never hurts to attack a thing from a couple of
directions.

Use the net block device, RAID 15 and go. Very tempting...:)

Bill Carlson

Systems Programmer    [EMAIL PROTECTED]    |  Opinions are mine,
Virtual Hospital      http://www.vh.org/   |  not my employer's.
University of Iowa Hospitals and Clinics   |





Re: Raid5 with two failed disks?

2000-03-30 Thread Bill Carlson

On Thu, 30 Mar 2000, Theo Van Dinter wrote:

 On Thu, Mar 30, 2000 at 02:21:45PM -0600, Bill Carlson wrote:
  1+5 would still fail on 2 drives if those 2 drives were both from the 
  same RAID 1 set. The wasted space becomes more than N/2, but it might be
  worth it for the HA aspect. RAID 6 looks cleaner, but that would require
  someone to write an implementation, whereas you could do RAID 15 (51?)
  now.
 
 2 drives failing in either RAID 1+5 or 5+1 results in a still available
 array:

Doh, you're right. Thanks for drawing me a picture...:)

Bill Carlson

Systems Programmer    [EMAIL PROTECTED]    |  Opinions are mine,
Virtual Hospital      http://www.vh.org/   |  not my employer's.
University of Iowa Hospitals and Clinics   |





Re: mkraid did not work !!

2000-02-07 Thread Bill Carlson

On Mon, 7 Feb 2000 [EMAIL PROTECTED] wrote:

 Hi!
   I installed the RAID as described at
 http://ostenfeld.dk/~jakob/Software-RAID.HOWTO/ with the patch and the latest
 raid-tools. After editing /etc/raidtab like the raid1 sample

Hi Peter,

Run dmesg and look for a line like 'autodetecting RAID arrays'. If it's
not there, you're not running the patched code.

Something like this would work:

dmesg | grep -i raid 

to show you any raid output from boot.
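
Another quick check, assuming /proc is mounted:

cat /proc/mdstat

That file only exists when the md driver is compiled in (or loaded as a
module), and its first line lists the RAID personalities the kernel knows
about.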

HTH,

Bill Carlson

Systems Programmer    [EMAIL PROTECTED]    |  Opinions are mine,
Virtual Hospital      http://www.vh.org/   |  not my employer's.
University of Iowa Hospitals and Clinics   |




RE: Compaq smartSCSI doesn't recognise 2nd drive

2000-01-31 Thread Bill Carlson

On Tue, 1 Feb 2000, Newman Chakerian wrote:

 Thanks Bill for your comments 
 
 I actually had the entries in fstab right when I tried it. (Sorry for the
 manual typos)
 
 I was wondering... The Compaq ProLiant has an internal SCSI controller
 (NCRXX ??). The Red Hat install picks this up no problem. Am I better off
 using the internal controller than the Smart SCSI 2, seeing as I don't want
 to use mirroring? I read somewhere that the Smart SCSI 2 driver is not yet
 certified - it's a 'use at own risk' type of thing.


I'm assuming that your SMART is one of the hardware RAID controllers.
Which to use is ultimately up to you and greatly depends on what you
require for your data. If your data is very important, I'd run with the
SMART, mirror, and have a good backup plan. If speed is important, I'd run
with the SMART and RAID 0. If stability is important, I might stick with
the internal SCSI; I haven't run the SMART under Linux that much.

For most of my uses, I'd use both...:) 

Bill Carlson

Systems Programmer    [EMAIL PROTECTED]    |  Opinions are mine,
Virtual Hospital      http://www.vh.org/   |  not my employer's.
University of Iowa Hospitals and Clinics   |




Re: Compaq smartSCSI doesn't recognise 2nd drive

2000-01-31 Thread Bill Carlson

On Mon, 31 Jan 2000, Newman Chakerian wrote:

 here is my fstab file: the ones I'm trying to add are commented out (I added
 these entries myself):
 
 /dev/ida/c0d0p5    /             ext2      defaults          1 1
 /dev/ida/c0d0p1    /boot         ext2      defaults          1 2
 /dev/cdrom         /mnt/cdrom    iso9660   noauto,owner,ro   0 0
 /dev/ida/c0d0p6    swap          swap      defaults          0 0
 /dev/fd0           /mnt/floppy   ext2      noauto,owner      0 0
 none               /proc         proc      defaults          0 0
 none               /dev/pts      devpts    gid=5,mode=620    0 0
 #/dev/ida/c0d1p5   /opt          ext2      default           1 3
 #/dev/ida/c0d1/p1  /mnt/dsk2     ext2      defaults          1 1
 
 

I'm hoping this is a cut and paste of the file; you have typos in both of
the lines you added. Also, make sure each entry is all on one line in the real file.

First entry: defaults instead of default

Second entry: the device is incorrect; I think you want /dev/ida/c0d1p1
instead of /dev/ida/c0d1/p1.

Might double check and make sure those partitions exist as well.
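
Putting it together, the two entries would look something like this, each on
one line, assuming those partitions exist:

/dev/ida/c0d1p5   /opt        ext2   defaults   1 3
/dev/ida/c0d1p1   /mnt/dsk2   ext2   defaults   1 1

(The fsck pass numbers are a matter of taste; 2 is the usual choice for
anything other than the root filesystem.)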

HTH,

Bill Carlson

Systems Programmer    [EMAIL PROTECTED]    |  Opinions are mine,
Virtual Hospital      http://www.vh.org/   |  not my employer's.
University of Iowa Hospitals and Clinics   |




RE: mkraid aborts, no info?

1999-10-27 Thread Bill Carlson

On Wed, 27 Oct 1999, Fernandez, Richard wrote:

 Mandrake doesn't have RAID support built into the kernel AFAIK.
 I was trying to do the same thing you're doing using Mandrake 6.0.
 
 Below is an e-mail I received from Mandrake...
 
 dear Richard Fernandez,
 
 you should recompile the kernel with raid support or use the RedHat
 compiled kernel which already has that.
 
 sincerely,
 
 -- 
 Florin Grad
 (Technical Support Team)
 [EMAIL PROTECTED]
 


I'm thinking Florin means the kernel is not compiled with RAID support by
default.

From the info on raidtools:

This package includes the tools you need to set up and maintain a software RAID
device under Linux. It only works with Linux 2.2 kernels and later, or 2.0
kernel specifically patched with newer raid support.

To me that implies a 2.2.x kernel does not need a patch. On Mandrake 6.1,
the required RAID modules were already in place after installation. 
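
If in doubt, it's easy to check what a given kernel actually has; something
like the following, though the module path is from memory and may differ by
distribution:

dmesg | grep -i raid                  # the patched code announces "autodetecting RAID arrays"
cat /proc/mdstat                      # present whenever the md driver is available
ls /lib/modules/`uname -r`/block/     # raid0.o, raid1.o, raid5.o show up here if built as modules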

Bill Carlson

Systems Programmer    [EMAIL PROTECTED]    |  Opinions are mine,
Virtual Hospital      http://www.vh.org/   |  not my employer's.
University of Iowa Hospitals and Clinics   |




Re: mkraid aborts, no info?

1999-10-27 Thread Bill Carlson

On Wed, 27 Oct 1999, David A. Cooley wrote:

 Hi Bill,
 You need to get the latest raid kernel patch (ignore the errors it gives... 
 one hunk is included in the 2.2.12/2.2.13 kernel) and the latest raidtools 
 (0.90).
 

Ah, I see. I'll try applying the patch to 2.2.13 now.

Thanks David, Luca.

Bill Carlson

Systems Programmer    [EMAIL PROTECTED]    |  Opinions are mine,
Virtual Hospital      http://www.vh.org/   |  not my employer's.
University of Iowa Hospitals and Clinics   |





raid0145 patch

1999-10-27 Thread Bill Carlson

Hey,

The patch did the trick, just like it was supposed to.

cd /usr/src/linux
patch -p1 < raid0145-19990824-2.2.11

There was the one error, which I ignored, as I was patching against
2.2.12. Does the same patch apply against 2.2.13? I'm guessing that Mandrake's
sources are what caused the errors that led me to go with a fresh 2.2.12
source tree.

Recompile, reboot and the magic messages started. :)

2 minutes later I had me an 8 GB array.
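
For anyone keeping notes, the rebuild itself was the usual 2.2 routine,
roughly as follows (the image name and LILO setup are just examples):

cd /usr/src/linux
make menuconfig          # enable "Multiple devices driver support" and RAID-5 under Block devices
make dep
make bzImage
make modules
make modules_install
cp arch/i386/boot/bzImage /boot/vmlinuz-2.2.12-raid
# add a lilo.conf entry for the new image, then
lilo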

Thanks a lot everyone!

Bill Carlson

Systems Programmer    [EMAIL PROTECTED]    |  Opinions are mine,
Virtual Hospital      http://www.vh.org/   |  not my employer's.
University of Iowa Hospitals and Clinics   |