Re: raid5 failure

2000-07-24 Thread Bill Carlson

On Fri, 21 Jul 2000, Seth Vidal wrote:

> Hi,
>  We've been using the sw raid 5 support in linux for about 2-3 months now.
> We've had good luck with it.
> 
> Until this week.
> 
> In this one week we've lost two drives on a 3-drive array, completely
> eliminating the array. We have good backups, made every night, so the data
> is safe. The problem is this: what could have caused these dual drive
> failures?
> 
> One went out on Saturday, the next on the following Friday. Complete death.
> 
> One drive won't detect anywhere anymore and it's been RMA'd; the other
> detects and I'm currently running mke2fs -c on it.

Hey Seth,

Sorry to hear about your drive failures. To me, this is something that
most people ignore about RAID5: lose more than one drive and everything is
toast. Good reason to have a drive set up as a hot spare, not to mention an
extra drive lying on the shelf. And hold your breath while the array is
rebuilding.
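
For what it's worth, a hot spare is just one extra entry in the raidtab.
A rough sketch in raidtools 0.90 syntax (device names are placeholders):

raiddev /dev/md0
    raid-level              5
    nr-raid-disks           3
    nr-spare-disks          1
    persistent-superblock   1
    chunk-size              32
    device                  /dev/sda1
    raid-disk               0
    device                  /dev/sdb1
    raid-disk               1
    device                  /dev/sdc1
    raid-disk               2
    device                  /dev/sdd1
    spare-disk              0

When a raid-disk dies, the md driver should start rebuilding onto the spare
on its own; you still want to get the dead drive replaced promptly.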

Bill Carlson

Systems Programmer   [EMAIL PROTECTED]     |  Opinions are mine,
Virtual Hospital     http://www.vh.org/    |  not my employer's.
University of Iowa Hospitals and Clinics   |





Re: Swap on RAID

2000-06-02 Thread Bill Carlson

On Thu, 1 Jun 2000, Henry J. Cobb wrote:

> Does anybody really want to wait while their swap data is duplicated out to
> multiple disks by a CPU that is working to free up memory to run
> applications?
> 
> Isn't Swapping slow enough already?
> 
> Why not simply swap on multiple disks, get Hardware RAID-5 for swap or buy
> RAM?
>

Linux uses swap intelligently: if areas of memory don't change, they get
swapped out to disk, making more physical RAM available for file caching
and so on. Having swap is good even if you have oodles of RAM, for that
reason alone.
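
As an aside, the "swap on multiple disks" idea in the quoted mail needs no
RAID at all: give several swap partitions the same priority and the kernel
interleaves pages across them. A rough sketch (device names are
placeholders):

# /etc/fstab
/dev/sda2   none   swap   sw,pri=1   0 0
/dev/sdb2   none   swap   sw,pri=1   0 0

That gets RAID 0 style behaviour for swap, with the usual caveat that a
dead disk takes its swapped-out pages with it; swap on RAID 1 is the
answer if that matters to you.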

Bill Carlson

Systems Programmer   [EMAIL PROTECTED]     |  Opinions are mine,
Virtual Hospital     http://www.vh.org/    |  not my employer's.
University of Iowa Hospitals and Clinics   |





Re: Raid5 with two failed disks?

2000-03-30 Thread Bill Carlson

On Thu, 30 Mar 2000, Theo Van Dinter wrote:

> On Thu, Mar 30, 2000 at 02:21:45PM -0600, Bill Carlson wrote:
> > 1+5 would still fail on 2 drives if those 2 drives were both from the
> > same RAID 1 set. The wasted space becomes more than N/2, but it might be
> > worth it for the HA aspect. RAID 6 looks cleaner, but that would require
> > someone to write an implementation, whereas you could do RAID 15 (51?)
> > now. 
> 
> 2 drives failing in either RAID 1+5 or 5+1 results in a still available
> array:

Doh, you're right. Thanks for drawing me a picture...:)

Bill Carlson

Systems Programmer   [EMAIL PROTECTED]     |  Opinions are mine,
Virtual Hospital     http://www.vh.org/    |  not my employer's.
University of Iowa Hospitals and Clinics   |





Re: Raid5 with two failed disks?

2000-03-30 Thread Bill Carlson

On Thu, 30 Mar 2000, Theo Van Dinter wrote:

> On Thu, Mar 30, 2000 at 08:36:52AM -0600, Bill Carlson wrote:
> > I've been thinking about this for a different project: how bad would it be
> > to set up RAID 5 to allow for 2 (or more) failures in an array? Or is this
> > handled under a different class of RAID (ignoring things like RAID 5 over
> > mirrored disks and such)?
> 
> You just can't do that with RAID5.  I seem to remember that there's a RAID 6
> or 7 that handles 2 disk failures (multiple parity devices or something like
> that.)
> 
> You can optionally do RAID 5+1 where you mirror partitions and then stripe
> across them ala RAID 0+1.  You'd have to lose 4 disks minimally before the
> array goes offline.

1+5 would still fail on 2 drives if those 2 drives were both from the
same RAID 1 set. The wasted space becomes more than N/2, but it might be
worth it for the HA aspect. RAID 6 looks cleaner, but that would require
someone to write an implementation, whereas you could do RAID 15 (51?)
now. 
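
(To put a number on the wasted space: six disks arranged as three RAID 1
pairs under RAID 5 leave only two disks' worth usable, a third of the raw
capacity.) Purely as a sketch of what a "RAID 15" raidtab might look like
in 0.90 raidtools syntax; device names are made up, and whether the tools
of the day handle md-on-md cleanly I haven't verified:

raiddev /dev/md0
    raid-level              1
    nr-raid-disks           2
    persistent-superblock   1
    device                  /dev/sdb1
    raid-disk               0
    device                  /dev/sdc1
    raid-disk               1

# md1 and md2 built the same way over two more pairs of drives, then:

raiddev /dev/md3
    raid-level              5
    nr-raid-disks           3
    persistent-superblock   1
    chunk-size              32
    device                  /dev/md0
    raid-disk               0
    device                  /dev/md1
    raid-disk               1
    device                  /dev/md2
    raid-disk               2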

My thought here is leading toward a distributed file system that is server
independent; it seems something like that would solve a lot of problems
that NFS and Coda don't handle. From what I've read, GFS is supposed to do
this, but it never hurts to attack a thing from a couple of directions.

Use the net block device, RAID 15 and go. Very tempting...:)

Bill Carlson

Systems Programmer   [EMAIL PROTECTED]     |  Opinions are mine,
Virtual Hospital     http://www.vh.org/    |  not my employer's.
University of Iowa Hospitals and Clinics   |





Re: Raid5 with two failed disks?

2000-03-30 Thread Bill Carlson

On Thu, 30 Mar 2000, Martin Bene wrote:

> At 02:16 30.03.00, you wrote:
> >Hi... I have a Raid5 Array, using 4 IDE HDs. A few days ago, the system
> >hung, no reaction, except ping from the host, nothing to see on the
> >monitor. I rebooted the system and it told me, 2 out of 4 disks were out
> >of sync. 2 Disks have an event counter of 0062, the two others
> >0064. I hope, that there is a way to fix this. I searched through the
> >mailing-list and found one thread, but it did not help me.
> 
> Yes I do. Check Jakob's RAID HOWTO, section "recovering from multiple failures".
> 
> You can recreate the superblocks of the raid disks using mkraid; if you 
> explicitly mark one disk as failed in the raidtab, no automatic resync is 
> started, so you get to check if all works and perhaps change something and 
> retry.
>

Hey all,

I've been thinking about this for a different project: how bad would it be
to set up RAID 5 to allow for 2 (or more) failures in an array? Or is this
handled under a different class of RAID (ignoring things like RAID 5 over
mirrored disks and such)?

Three words: Net block device 
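
Roughly what I have in mind, strictly as a sketch; the nbd tool syntax and
device names below are from memory and may not match whatever version you
have, and serverA/serverB/serverC are made-up hostnames:

# on each remote box, export a spare partition over TCP
nbd-server 2000 /dev/sdb1

# on the raid host, attach each export to a local nbd device
nbd-client serverA 2000 /dev/nbd0
nbd-client serverB 2000 /dev/nbd1
nbd-client serverC 2000 /dev/nbd2

# then list /dev/nbd0, /dev/nbd1 and /dev/nbd2 as the devices in raidtab
# and mkraid /dev/md0 as usual; the array members just happen to live on
# other machines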

Bill Carlson

Systems Programmer   [EMAIL PROTECTED]     |  Opinions are mine,
Virtual Hospital     http://www.vh.org/    |  not my employer's.
University of Iowa Hospitals and Clinics   |





Re: mkraid did not work !!

2000-02-07 Thread Bill Carlson

On Mon, 7 Feb 2000 [EMAIL PROTECTED] wrote:

> Hi !
>   I installed the RAID as described at
> http://ostenfeld.dk/~jakob/Software-RAID.HOWTO/ with the patch and the
> latest raid-tools. After editing /etc/raidtab like the raid1 sample

Hi Peter,

Run dmesg and look for a line like 'autodetecting RAID arrays'. If it's
not there, you're not running the patched code.

Something like this would work:

dmesg | grep -i raid 

to show you any raid output from boot.
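
Another quick check: if the md driver is compiled in (or loaded as a
module), /proc/mdstat will exist, and its Personalities line shows which
raid levels the kernel actually knows about.

cat /proc/mdstat

If the file is missing entirely, the running kernel has no md support at
all.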

HTH,

Bill Carlson

Systems Programmer   [EMAIL PROTECTED]     |  Opinions are mine,
Virtual Hospital     http://www.vh.org/    |  not my employer's.
University of Iowa Hospitals and Clinics   |




RE: Compaq smartSCSI doesn't recognise 2nd drive

2000-01-31 Thread Bill Carlson

On Tue, 1 Feb 2000, Newman Chakerian wrote:

> Thanks Bill for your comments 
> 
> I actually had the entries in fstab right when I tried it. (Sorry for the
> manual typos)
> 
> I was wondering... The Compaq Proliant has an internal SCSI controller
> (NCRXX ??). The RedHat install picks this up no problem. Am I better off
> using the internal controller rather than the Smart SCSI 2, seeing as I
> don't want to use mirroring? I read somewhere that the Smart SCSI 2 driver
> is not yet certified, a 'use at own risk' type of thing. 
>

I'm assuming that your SMART is one of the hardware RAID controllers.
Which to use is ultimately up to you and greatly depends on what you
require for your data. If your data is very important, I'd run with the
SMART, mirror, and have a good backup plan. If speed is important, I'd run
with the SMART and RAID 0. If stability is important, I might stick with
the internal SCSI; I haven't run the SMART under Linux that much.

For most of my uses, I'd use both...:) 

Bill Carlson

Systems Programmer   [EMAIL PROTECTED]     |  Opinions are mine,
Virtual Hospital     http://www.vh.org/    |  not my employer's.
University of Iowa Hospitals and Clinics   |




Re: Compaq smartSCSI doesn't recognise 2nd drive

2000-01-31 Thread Bill Carlson

On Mon, 31 Jan 2000, Newman Chakerian wrote:

> here is my fstab file: the ones I'm trying to add are commented out (I added
> these entries myself):
> 
> /dev/ida/c0d0p5 /   ext2  defaults
> 1 1
> /dev/ida/c0d0p1 /boot   ext2  defaults
> 1 2
> /dev/cdrom  /mnt/cdrom  iso9660   noauto,owner,ro 0 0
> /dev/ida/c0d0p6 swapswap  defaults
> 0 0
> /dev/fd0/mnt/floppy ext2  noauto,owner
> 0 0
> none/proc   proc  defaults
> 0 0
> none/dev/ptsdevpts
> gid=5,mode=6200 0
> #/dev/ida/c0d1p5  /optext2
> default   1 3
> #/dev/ida/c0d1/p1 /mnt/dsk2   ext2
> defaults  1 1
> 
> 

I'm hoping this is a cut and paste of the file; you have typos in both
lines. Also, make sure each entry is all on one line.

First entry: use defaults instead of default.

Second entry: the device name is incorrect; I think you want /dev/ida/c0d1p1
instead of /dev/ida/c0d1/p1.

Might double check and make sure those partitions exist as well.
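
Put together, the two commented-out entries would end up looking like this,
each on a single line (keeping the fsck pass numbers you already had):

/dev/ida/c0d1p5   /opt        ext2   defaults   1 3
/dev/ida/c0d1p1   /mnt/dsk2   ext2   defaults   1 1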

HTH,

Bill Carlson

Systems Programmer   [EMAIL PROTECTED]     |  Opinions are mine,
Virtual Hospital     http://www.vh.org/    |  not my employer's.
University of Iowa Hospitals and Clinics   |




raid0145 patch

1999-10-27 Thread Bill Carlson

Hey,

The patch did the trick, just like it was supposed to.

cd /usr/src/linux
patch -p1 < raid0145-19990824-2.2.11

There was the one error, which I ignored, as I was patching against
2.2.12. Does the same patch apply against 2.2.13? I'm guessing that
Mandrake's sources are what caused the errors that led me to go with a
fresh 2.2.12 source tree.

Recompile, reboot and the magic messages started. :)

2 minutes later I had me an 8 GB array.

Thanks a lot everyone!

Bill Carlson

Systems Programmer   [EMAIL PROTECTED]     |  Opinions are mine,
Virtual Hospital     http://www.vh.org/    |  not my employer's.
University of Iowa Hospitals and Clinics   |




Re: mkraid aborts, no info?

1999-10-27 Thread Bill Carlson

On Wed, 27 Oct 1999, David A. Cooley wrote:

> Hi Bill,
> You need to get the latest raid kernel patch (ignore the errors it gives... 
> one hunk is included in the 2.2.12/2.2.13 kernel) and the latest raidtools 
> (0.90).
> 

Ah, I see now. I'll try applying the patch to the 2.2.13 now.

Thanks David, Luca.

Bill Carlson

Systems Programmer   [EMAIL PROTECTED]     |  Opinions are mine,
Virtual Hospital     http://www.vh.org/    |  not my employer's.
University of Iowa Hospitals and Clinics   |





RE: mkraid aborts, no info?

1999-10-27 Thread Bill Carlson

On Wed, 27 Oct 1999, Fernandez, Richard wrote:

> Mandrake doesn't have RAID support built into the kernel AFAIK.
> I was trying to do the same thing you're doing using Mandrake 6.0.
> 
> Below is an e-mail I received from Mandrake...
> 
> dear Richard Fernandez,
> 
> you should recompile the kernel with raid support or use the RedHat
> compiled kernel which already has that.
> 
> sincerely,
> 
> -- 
> Florin Grad
> (Technical Support Team)
> [EMAIL PROTECTED]
> 
>

I'm thinking Florin means the kernel is not compiled with support by
default.

From the info on raidtools:

This package includes the tools you need to set up and maintain a software
RAID
device under Linux. It only works with Linux 2.2 kernels and later, or 2.0
kernel specifically patched with newer raid support.

To me that implies a 2.2.x kernel does not need a patch. On Mandrake 6.1,
the required RAID modules were already in place after installation. 
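
A quick way to see whether a stock kernel already carries the RAID
personalities; this assumes a modular 2.2 kernel with the usual module
names, so treat it as a guess rather than gospel:

modprobe raid0
lsmod | grep raid
cat /proc/mdstat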

Bill Carlson

Systems Programmer   [EMAIL PROTECTED]     |  Opinions are mine,
Virtual Hospital     http://www.vh.org/    |  not my employer's.
University of Iowa Hospitals and Clinics   |




mkraid aborts, no info?

1999-10-27 Thread Bill Carlson

Hello,

I'm trying to set up a basic RAID0 software RAID and running into some
problems.

Details:

Mandrake 6.1 (not RedHat!)
Kernel 2.2.13 and 2.2.12
2 SCSI drives, completely separate from root, 4 GB each

My test raidtab (/etc/raidtab.test)

raiddev /dev/md0
raid-level  0
nr-raid-disks   2
persistent-superblock   1
chunk-size  8

device  /dev/sdb1
raid-disk   0

device  /dev/sdc1
raid-disk   1




When I run mkraid I get the following:

[root@washu /root]# mkraid -c /etc/raidtab.test /dev/md0
handling MD device /dev/md0
analyzing super-block
disk 0: /dev/sdb1, 4192933kB, raid superblock at 4192832kB
disk 1: /dev/sdc1, 4192933kB, raid superblock at 4192832kB
mkraid: aborted, see the syslog and /proc/mdstat for potential clues.

Output from /proc/mdstat:

[root@washu /root]# chkraid
Personalities : [2 raid0]
read_ahead not set
md0 : inactive
md1 : inactive
md2 : inactive
md3 : inactive
[root@washu /root]#

Nothing of note in syslogd (*.* >> /dev/tty12)

Any ideas on what I am doing wrong?

I just joined the list, are there any searchable archives?
Is a kernel patch still required for the 2.2.1x series?

Thanks in advance,


Bill Carlson

Systems Programmer   [EMAIL PROTECTED]     |  Opinions are mine,
Virtual Hospital     http://www.vh.org/    |  not my employer's.
University of Iowa Hospitals and Clinics   |





Re: DPT ``V'' series (Millenium, Century, Decade)

1999-03-19 Thread Bill Carlson

On Fri, 12 Mar 1999, Josh Fishman wrote:

> Hi all,
> 
> We're considering buying a RAID controller from DPT. The techie we
> spoke with there says they will be releasing a Linux driver on Monday
> for the V series (their new controllers). Does anyone here have
> experience with this driver? (beta test or what have you.)
> 
> Are their drivers typically Open Source(R)? All I've heard about them
> in the past has been pretty good.
> 
> Does anyone have a reason why we might hesitate in purchasing one
> of these controllers?
> 
> Thank you,
> Josh Fishman
> NYU / RLab
>
I purchased a PM3334UW for a NetWare server and was very unhappy with it
and with DPT.

1) You have to use DPT hot-swap bays or cabinets in order to be able to
autosync an array (i.e. put in a new drive and have the rebuild start by
itself).

2) The management piece for NetWare was pathetic; it required a
DOS/Windows station to run (meaning no array functions from the server).

3) As far as I know, the Linux driver is "unsupported" and does not
include a management piece, requiring one to boot DOS to rebuild the
array. One nice "feature" was that once the array starts rebuilding,
it keeps going independent of the OS. I seem to remember there was work
being done on the management functions under Linux, but it was still a
third-party effort that was tolerated by DPT.

4) DPT was not very responsive over email and generally unimpressive.


I'd look at ICP/Vortex if I were you. Sounds like a lot of people are
happy with those cards.
 

Bill Carlson                |   Opinions expressed are my own
KINZE Manufacturing, Inc.   |   not my employer's.