Re: RAID 5 Array fails if first disk is missing

1999-11-16 Thread Marc Haber

I am an idiot. Sent the first version of that article to
owner-linux-raid :-(

This is a later version.

On Sat, 13 Nov 1999 15:59:50 GMT, I wrote:
For a test, I disconnected sda while system power was off and expected
the system to come up on the remaining disks. However, the RAID array
wasn't detected:

|autodetecting RAID arrays
|autorun...
|   ... autorun DONE.

/proc/mdstat shows no active RAID devices, and raidstart /dev/md0 gives:
|blkdev_open() failed: -6
|md: could not lock sdp15, zero-size? Marking faulty.
|could not import sdp15!
|autostart sdp14 failed!
|/dev/md0: Invalid argument

When I plug the first disk back in, everything works again.

I suspect that there is something wrong with the persistent
superblocks on the second and/or the third disk. Can I rewrite the
persistent superblock?

For the record, I did a backup - mkraid - mke2fs - restore routine to get my
RAID back, and I have been able to reproduce the problem.
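Roughly, that cycle was the following (mount points and backup paths are just
examples here, and mkraid may insist on --really-force before it will
overwrite the old superblocks):

    mount /dev/md0 /mnt/raid
    tar czf /backup/md0.tar.gz -C /mnt/raid .    # 1. back up while the array still assembles
    umount /mnt/raid
    raidstop /dev/md0
    mkraid /dev/md0                              # 2. rewrite the superblocks as per /etc/raidtab
    mke2fs /dev/md0                              # 3. fresh filesystem on the array
    mount /dev/md0 /mnt/raid
    tar xzf /backup/md0.tar.gz -C /mnt/raid      # 4. restore the data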

This is what seems to have happened:
- RAID is running.
- system shutdown.
- system reboot with missing first disk.
- RAID code notices this, removes first disk from array, pulls in
  spare disk, starts recovery.
- system shutdown. It does not matter whether recovery had finished in the
  meantime or not; the same thing happens.
- system reboot with first disk still missing.
- => BOOM, "could not import sdp15".
- system shutdown again.
- system reboot with first disk present.
- system comes up and continues recovery on the spare disk. First
  disk is ignored.

It looks like the array's persistent superblock really is only updated on
the array's first disk, or somewhere in the root fs (which is also on the
first disk). So a shutdown after the first disk's failure has been detected
leaves the array in an inconsistent state, and from there it seems there is
no way to get the array back short of a backup - mkraid - mke2fs - restore.
I don't know what will happen if the first disk is really dead, but I
suspect that the array won't come up with a blank first disk, putting all
data at risk.

I think this is a bug that should be fixed. Otherwise, there is no
redundancy against a failure of the first disk: first disk dead, data gone.
That shouldn't happen on a system that always has at least three working
disks when only two are needed to maintain data integrity.

If I can do any tracing to pin down the bug, please get in touch with
me.

Greetings
Marc

-- 
-- !! No courtesy copies, please !! -
Marc Haber  |   " Questions are the | Mailadresse im Header
Karlsruhe, Germany  | Beginning of Wisdom " | Fon: *49 721 966 32 15
Nordisch by Nature  | Lt. Worf, TNG "Rightful Heir" | Fax: *49 721 966 31 29



Help - Bug in Raid-1 - Second failure

1999-11-16 Thread Kent A. Ziebell

Software levels: Redhat 6.0, kernel 2.2.5-15, raidtools-0.90

Late last week I reported to the list the following kernel errors from one of
my production email POP servers (each of my five servers has ~7000 user POP
mailboxes).  Late yesterday, I got the exact same RAID-1 failure on a
different server.  I have yet to fix the first server, as I have to wait
until this Saturday to schedule a four-hour outage in which to attempt the
repair.  Does anybody have a clue about what is causing this, how to fix the
raid (do I have to tar it up, then rebuild the raid from scratch and restore
the data?), and what can be done to prevent this kind of failure?

Btw, even though the Raid-1 has failed, it appears to be reading and writing
correctly to the first disk in the raid in both cases.  This is not a
hardware problem. 


Kernel errors follow:
Nov  8 14:03:27 pop-3 kernel: attempt to access beyond end of device 
Nov  8 14:03:27 pop-3 kernel: 08:51: rw=0, want=892027448, limit=8956206 
Nov  8 14:03:27 pop-3 kernel: raid1: Disk failure on sdf1, disabling device.
Nov  8 14:03:27 pop-3 kernel:  Operation continuing on 1 devices 
Nov  8 14:03:27 pop-3 kernel: raid1: md1: rescheduling block 892027447 
Nov  8 14:03:27 pop-3 kernel: attempt to access beyond end of device 
Nov  8 14:03:27 pop-3 kernel: 08:41: rw=0, want=1295592304, limit=8956206 
Nov  8 14:03:27 pop-3 kernel: raid1: only one disk left and IO error. 
Nov  8 14:03:27 pop-3 kernel: raid1: md1: rescheduling block 1295592303 
Nov  8 14:03:27 pop-3 kernel: md: recovery thread got woken up ... 
Nov  8 14:03:27 pop-3 kernel: md1: no spare disk to reconstruct array! -- continuing in degraded mode 
Nov  8 14:03:27 pop-3 kernel: md: recovery thread finished ... 
Nov  8 14:03:27 pop-3 kernel: dirty sb detected, updating. 
Nov  8 14:03:27 pop-3 kernel: md: updating md1 RAID superblock on device 
Nov  8 14:03:27 pop-3 kernel: (skipping faulty sdf1 ) 
Nov  8 14:03:27 pop-3 kernel: (skipping faulty sde1 ) 
Nov  8 14:03:27 pop-3 kernel: . 
Nov  8 14:03:27 pop-3 kernel: raid1: md1: unrecoverable I/O read error for block 1295592303 
Nov  8 14:03:27 pop-3 kernel: raid1: md1: redirecting sector 892027447 to another mirror 
Nov  8 14:03:27 pop-3 kernel: attempt to access beyond end of device 
Nov  8 14:03:27 pop-3 kernel: 08:41: rw=0, want=892027448, limit=8956206 
Nov  8 14:03:27 pop-3 kernel: raid1: only one disk left and IO error. 
Nov  8 14:03:27 pop-3 kernel: raid1: md1: rescheduling block 892027447 
Nov  8 14:03:27 pop-3 kernel: raid1: md1: unrecoverable I/O read error for block 892027447 
Nov  8 14:03:27 pop-3 kernel: md: recovery thread got woken up ... 
Nov  8 14:03:27 pop-3 kernel: md1: no spare disk to reconstruct array! -- continuing in degraded mode 
Nov  8 14:03:27 pop-3 kernel: md: recovery thread finished ... 
Nov  8 14:03:27 pop-3 kernel: attempt to access beyond end of device 
Nov  8 14:03:27 pop-3 kernel: 08:41: rw=0, want=892027448, limit=8956206 
Nov  8 14:03:27 pop-3 kernel: raid1: only one disk left and IO error. 
Nov  8 14:03:27 pop-3 kernel: raid1: md1: rescheduling block 892027447 
Nov  8 14:03:27 pop-3 kernel: attempt to access beyond end of device 
Nov  8 14:03:27 pop-3 kernel: 08:41: rw=0, want=1295592304, limit=8956206 
Nov  8 14:03:27 pop-3 kernel: raid1: only one disk left and IO error. 
Nov  8 14:03:27 pop-3 kernel: raid1: md1: rescheduling block 1295592303 
Nov  8 14:03:27 pop-3 kernel: md: recovery thread got woken up ... 
Nov  8 14:03:27 pop-3 kernel: md1: no spare disk to reconstruct array! -- continuing in degraded mode 
Nov  8 14:03:27 pop-3 kernel: md: recovery thread finished ... 
Nov  8 14:03:27 pop-3 kernel: raid1: md1: unrecoverable I/O read error for block 1295592303 
Nov  8 14:03:27 pop-3 kernel: raid1: md1: unrecoverable I/O read error for block 892027447 

The disk drives involved are for our pop servers, so most likely the software
error was due to a corrupt map file.  

My /proc/mdstat currently looks like:

Personalities : [raid1] 
read_ahead 1024 sectors
md1 : active raid1 sdf1[1](F) sde1[0](F) 8956096 blocks [2/1] [U_]
unused devices: <none>

The first thing I tried to do was:
raidhotremove /dev/md1 /dev/sde1
but it said the device was busy.  

From looking at /proc/scsi/aic7xxx/0, I have determined that reads and writes
to /dev/md1 are going to device 4, which is /dev/sde1 .
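
For reference, this is roughly how I checked (the layout of the aic7xxx /proc
file varies with the driver version, so the exact fields may differ):

    cat /proc/mdstat             # which mirror md still considers active
    cat /proc/scsi/aic7xxx/0     # per-target transfer statistics; watch whose counters move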

Now the question is, what do I do to recover from this situation?  My
instinct is to tar up /dev/sde1 as a backup before I mess with the raid
stuff, then reboot into single-user mode after removing /dev/md1 from
/etc/fstab so that the raid does not attempt to start, and then run an fsck
on /dev/sde1.  Now if I restart the raid at this point, what assures me that
it will resync in the right direction?  That is, that it uses /dev/sde1 as
the master and not /dev/sdf1.  Does it use the raid superblocks to determine
which way to sync up?
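
Spelled out, the plan looks roughly like this (backup paths are just
examples, and whether raidhotadd is the right way to bring sdf1 back
afterwards is part of what I am asking):

    # 1. back up the still-readable mirror before touching anything
    dd if=/dev/sde1 of=/backup/sde1.img bs=64k   # or mount it read-only and tar it
    # 2. comment /dev/md1 out of /etc/fstab and reboot to single-user mode
    # 3. check the filesystem on the surviving mirror
    e2fsck -f /dev/sde1
    # 4. restart the array degraded and (maybe) re-add the other half
    raidstart /dev/md1
    raidhotadd /dev/md1 /dev/sdf1                # which way does the resync go?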

Comments???

My /etc/raidtab looks like:

 # raid-1 configuration
raiddev 

RE: Root RAID and unmounting /boot

1999-11-16 Thread Dirk Lutzebaeck

Theo Van Dinter writes:
  On Wed, 27 Oct 1999, Bruno Prior wrote:
  
  BP (a) Getting the latest patched lilo from RedHat or applying the lilo.raid1 patch
  BP and rebuilding it yourself (if /dev/md0 is RAID-1)
  BP (b) Providing lilo with the geometry of one of the devices in the array (again
  BP if /dev/md0 is RAID-1)
  BP (c) Using a slightly-adapted grub instead of lilo (again if /dev/md0 is RAID-1)
  BP (d) Making sure the files to which these lines point are not on a software-RAID
  BP array.
  
  Just a note:  I set up root RAID1 over the weekend on my RH61 box. The
  configuration file is really simple, you run lilo as normal, it writes the
  boot block to all disks in the array and you're done:
  
  boot=/dev/md0
  map=/boot/map
  install=/boot/boot.b
  timeout=50
  default=linux
  
  image=/boot/vmlinuz
  label=linux
  read-only
  root=/dev/md0

Hello, I just tried this with RH6.1 but couldn't get it to work:

- floppy boot on /dev/md0 works
- calling lilo gives:

boot = /dev/sdb, map = /boot/map.0811
Added linux2210r *
boot = /dev/sda, map = /boot/map.0801
Added linux2210r
open /tmp/dev.0: No such device

I can't figure out where this /tmp/dev.0 message comes from.
strace says:

stat("/tmp/dev.0", 0xbfffe1d0)  = -1 ENOENT (No such file or directory)
mknod("/tmp/dev.0", S_IFBLK|0600, makedev(0, 0)) = 0
stat("/tmp/dev.0", {st_mode=S_IFBLK|0600, st_rdev=makedev(0, 0), ...}) = 0
open("/tmp/dev.0", 0x4) = -1 ENODEV (No such device)


When booting from /dev/md0, lilo gives an Error 0x80, which means:

   0x80   "Disk timeout". The disk or the drive isn't ready. Either the 
media is bad or the disk isn't spinning. If you're booting from a 
floppy, you might not have closed the drive door. Otherwise, trying to 
boot again might help. 

When I boot from the floppy with root=/dev/md0, it works.

I'm using lilo-0.21-10 from RH6.1

Greetings,

Dirk

lilo.conf:

boot=/dev/md0
map=/boot/map
install=/boot/boot.b
prompt
timeout=50

image=/boot/vmlinuz-2.2.10raid
label=linux2210r
read-only
root=/dev/md0

raidtab:

raiddev /dev/md0
raid-level  1
nr-raid-disks   2
nr-spare-disks  0
persistent-superblock   1
chunk-size  32
device  /dev/sdb1
raid-disk   0
device  /dev/sda1
raid-disk   1



SCSI - RAID - data loss? Adding disks to RAID - data balancing?

1999-11-16 Thread Otis Gospodnetic

Hi,

I have 2 questions here...

1.
I'm getting some SCSI disks that I may decide/need to put into a RAID
later on.
I am wondering whether I will be able to preserve all the data on the SCSI
disks when I convert them to RAID, or will conversion to RAID mean that I'll
have to lose data on those disks and put it back on the RAID after the
conversion?

2.
If I get 2 disks and put them into RAID-0 (striping) what happens when I
decide to add another (3rd) disk to the setup?
Does the data automatically migrate from disks 1 and 2 onto 3 so that it is
evenly spread across all 3 disks, or would I have to do that manually
somehow, or is this a bad thing to do and I should just buy all disks that I
think I will need right away?
Any advice?

Thanks!

Otis



Re: errors on boot

1999-11-16 Thread Luca Berra

On Tue, Nov 16, 1999 at 03:14:42PM -0500, Michel Pelletier wrote:
 It looks to me like the [2/1] means only one of my two 'low level'
 partitions below /dev/md2 is working.  I don't know what the [U_] means,
it means the second partition is gone:

U=ok
_=fubar
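
e.g. (device names and block counts made up) a healthy mirror vs. one
running on a single disk:

    md2 : active raid1 sdb3[1] sda3[0] 1024000 blocks [2/2] [UU]
    md2 : active raid1 sda3[0] 1024000 blocks [2/1] [U_]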


-- 
Luca Berra -- [EMAIL PROTECTED]
Communications Media  Services S.r.l.



RE: errors on boot

1999-11-16 Thread Michel Pelletier

 -Original Message-
 From: Luca Berra [mailto:[EMAIL PROTECTED]]
 Sent: Tuesday, November 16, 1999 2:45 PM
 To: '[EMAIL PROTECTED]'
 Subject: Re: errors on boot
 
 
 On Tue, Nov 16, 1999 at 03:14:42PM -0500, Michel Pelletier wrote:
  It looks to me like the [2/1] means only one of my two 'low level'
  partitions below /dev/md2 is working.  I don't know what 
 the [U_] means,
 it means the second partition is gone:
 
 U=ok
 _=fubar

Well, I figured that.  So how do I find out whether this is a logical error
or a physical one?  The other two partitions on the same drive work fine in
their RAID arrays; it looks like just this one partition on the drive is bad.
How, in a nutshell, do I fix this sucker?

Thanks for your help,

-Michel



zero-D raid chassis

1999-11-16 Thread Seth Vidal

Has anyone on this list used or had any dealings with the Zero-D UDMA-internal,
SCSI-external RAID arrays?
This is the URL (the 400 model specifically):
 http://www.zero-d.com/ide2.html

I'm interested in using them with Linux and/or Solaris, and I'd love to hear
any impressions or experiences.
They look like a good deal in that the replacement cost on these units is low
(due to the IDE drives), so I'd like to hear opinions on them.

Thanks

-sv






module loading issues with RAID1.o

1999-11-16 Thread David McMillan

Dear RAID developers-

This email has been sent to Red Hat a few times with no luck; I hope that
you can help.  I have followed the relevant suggestions for similar problems
posted to the software-raid newsgroup, also with no luck.  I know there is
some step that I am missing but cannot find it, nor could I find good
documentation on the kernel module loading procedure and troubleshooting.

The particulars:
RedHat 6.0 default install.
New Kernel compiled with RAID1 built in. Using default 6.0 kernel
source.
Latest raidtools, 0.90-5, as an RPM for i386.

The problem:
  I am attempting to mount two secondary disks as a RAID-1 array; these
devices are sdb2 and sdc2. I am booting off sda1, which is not RAID. I have
compiled a new kernel with RAID-1 support built in, so no module should need
to load. On boot, dmesg shows the RAID-1 assembled by the kernel and then
torn down by a module loader(???). I am using the kernel source included
with RH6.0 and a recent version of raidtools, 0.90-5. I did not update fdisk
from the RPM, as its fellow tools in the Red Hat distribution required a
newer kernel; this means I just set the raid autodetect partition type "fd"
by hand, which this fdisk reports as "unknown".

 Included are my dmesg, /var/log/messages, fstab, raidtab.

 /var/log/messages
Attempts to issue /sbin/modprobe raid1 result in:

Nov  9 23:02:55 fat-man insmod: /lib/modules/2.2.5-15/block/raid1.o:
unresolved symbol __global_cli
Nov  9 23:02:55 fat-man insmod: /lib/modules/2.2.5-15/block/raid1.o:
unresolved symbol __global_save_flags
Nov  9 23:02:55 fat-man insmod: /lib/modules/2.2.5-15/block/raid1.o:
unresolved symbol __global_restore_flags

Dmesg:
Linux version 2.2.5-15 ([EMAIL PROTECTED]) (gcc version
egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)) #2 SMP Thu Nov 4
23:13:56 PST 1999
Intel MultiProcessor Specification v1.1
snip
RAM disk driver initialized:  16 RAM disks of 4096K size
PIIX4: IDE controller snip
md driver 0.90.0 MAX_MD_DEVS=256, MAX_REAL=12
(scsi0) Adaptec AHA-294X Ultra SCSI host adapter found at PCI 17/0
(scsi0) Wide Channel, SCSI ID=7, 16/255 SCBs
(scsi0) Warning - detected auto-termination
(scsi0) Please verify driver detected settings are correct.
(scsi0) If not, then please properly set the device termination
(scsi0) in the Adaptec SCSI BIOS by hitting CTRL-A when prompted
(scsi0) during machine bootup.
(scsi0) Cables present (Int-50 NO, Int-68 YES, Ext-68 NO)
(scsi0) Downloading sequencer code... 413 instructions downloaded
scsi0 : Adaptec AHA274x/284x/294x (EISA/VLB/PCI-Fast SCSI) 5.1.15/3.2.4
   Adaptec AHA-294X Ultra SCSI host adapter
scsi : 1 host.
(scsi0:0:0:0) Synchronous at 40.0 Mbyte/sec, offset 8.
  Vendor: SEAGATE   Model: ST39175LW Rev: 0001
  Type:   Direct-Access  ANSI SCSI revision: 02
Detected scsi disk sda at scsi0, channel 0, id 0, lun 0
(scsi0:0:1:0) Synchronous at 40.0 Mbyte/sec, offset 8.
  Vendor: SEAGATE   Model: ST39175LW Rev: 0001
  Type:   Direct-Access  ANSI SCSI revision: 02
Detected scsi disk sdb at scsi0, channel 0, id 1, lun 0
(scsi0:0:2:0) Synchronous at 40.0 Mbyte/sec, offset 8.
  Vendor: SEAGATE   Model: ST39175LW Rev: 0001
  Type:   Direct-Access  ANSI SCSI revision: 02
Detected scsi disk sdc at scsi0, channel 0, id 2, lun 0
scsi : detected 3 SCSI disks total.
SCSI device sda: hdwr sector= 512 bytes. Sectors= 17783240 [8683 MB]
[8.7 GB]
SCSI device sdb: hdwr sector= 512 bytes. Sectors= 17783240 [8683 MB]
[8.7 GB]
SCSI device sdc: hdwr sector= 512 bytes. Sectors= 17783240 [8683 MB]
[8.7 GB]
  eth0: snip
Partition check:
 sda: sda1 sda2 < sda5 sda6 sda7 sda8 sda9 >
 sdb: sdb1 sdb2
 sdc: sdc1 sdc2
md.c: sizeof(mdp_super_t) = 4096
autodetecting RAID arrays
(read) sdb2's sb offset: 8201088 [events: 0002]
(read) sdc2's sb offset: 8201088 [events: 0002]
autorun ...
considering sdc2 ...
  adding sdc2 ...
  adding sdb2 ...
created md0
bind<sdb2,1>
bind<sdc2,2>
running: <sdc2><sdb2>
now!
sdc2's event counter: 0002
sdb2's event counter: 0002
request_module[md-personality-3]: Root fs not mounted
do_md_run() returned -22
unbind<sdc2,1>
export_rdev(sdc2)
unbind<sdb2,0>
export_rdev(sdb2)
md0 stopped.
... autorun DONE.
VFS: Mounted root (ext2 filesystem) readonly.
Freeing unused kernel memory: 68k freed
Adding Swap: 682724k swap-space (priority 1)
Adding Swap: 682724k swap-space (priority 1)
END:

/var/log/messages:
Nov 10 18:45:12 fat-man kernel: autodetecting RAID arrays
Nov 10 18:45:12 fat-man kernel: (read) sdb2's sb offset: 8201088 [events: 0002]
Nov 10 18:45:12 fat-man kernel: (read) sdc2's sb offset: 8201088 [events: 0002]
Nov 10 18:45:12 fat-man kernel: autorun ...
Nov 10 18:45:12 fat-man kernel: considering sdc2 ...
Nov 10 18:45:12 fat-man kernel:   adding sdc2 ...
Nov 10 18:45:12 fat-man kernel:   adding sdb2 ...
Nov 10 18:45:12 fat-man kernel: created md0
Nov 10 18:45:12 fat-man kernel: bind<sdb2,1>
Nov 10 18:45:12 fat-man kernel: bind<sdc2,2>
Nov 10 18:45:12 fat-man kernel: running: <sdc2><sdb2>
Nov 10 18:45:12 fat-man kernel: now!
Nov 10 

Re: SCSI - RAID - data loss? Adding disks to RAID - data balancing?

1999-11-16 Thread Jakob Østergaard

On Tue, Nov 16, 1999 at 03:09:38PM -0500, Otis Gospodnetic wrote:
 Hi,
 
 I have 2 questions here...
 
 1.
 I'm getting some SCSI disks that I may decide/need to put them in a RAID
 later on.
 I am wondering whether I will be able to preserve all the data on the SCSI
 disks when I convert them to RAID, or will conversion to RAID mean that I'll
 have to lose data on those disks and put it back on the RAID after the
 conversion?

I'm developing a raidreconf utility to allow this.  Currently it can ``convert''
a single data disk and a number of empty disks into a RAID-0 while preserving
data.

HOWEVER(!) This has not been tested on large disks (!)  I guess it works, and
it's tested on 60-90 MB disks (loopback devices).

When I get a little more spare time I'll add RAID-5 capability.

 2.
 If I get 2 disks and put them into RAID-0 (striping) what happens when I
 decide to add another (3rd) disk to the setup?
 Does the data automatically migrate from disks 1 and 2 onto 3 so that it is
 evenly spread across all 3 disks, or would I have to do that manually
 somehow, or is this a bad thing to do and I should just buy all disks that I
 think I will need right away?
 Any advice?

The raidreconf utility can do this with RAID-0 arrays.  I've tested it on
a 50+GB RAID, and successfully expanded it to 100+GB.  After expanding the
RAID, I used ext2resize to resize the filesystem.  Everything went just
fine.
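
Roughly, the procedure was the following (the option spelling below is from
memory, so check raidreconf --help before trusting it; /etc/raidtab.new is
simply the old raidtab with the extra disk added):

    raidstop /dev/md0                              # the array must be offline
    raidreconf -o /etc/raidtab -n /etc/raidtab.new -m /dev/md0
    cp /etc/raidtab.new /etc/raidtab
    raidstart /dev/md0
    e2fsck -f /dev/md0                             # always fsck before resizing
    ext2resize /dev/md0                            # grow the fs (it may want the new size as an argument)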

raidreconf is slow as molasses, but it works - at least for me.  Until it has
seen some more testing, I'm reluctant to say that it really works.  It does
perform quite a few paranoia checks before actually moving the data, so it
_should_ catch most internal errors, if there are any.

The current algorithm is very clean, i.e. it moves data in a way that is simple
and therefore likely to be correct.  Therefore it is also very slow.  The 
expansion of the large array mentioned above took almost 24 hours (!).  I'll
optimize this when I get some spare time.

You can get the raidreconf utility as a patch to the raidtools package from
 http://ostenfeld.dk/~jakob/Software-RAID.HOWTO/

-- 

: [EMAIL PROTECTED]  : And I see the elder races, :
:.: putrid forms of man:
:   Jakob Østergaard  : See him rise and claim the earth,  :
:OZ9ABN   : his downfall is at hand.   :
:.:{Konkhra}...: