On 18 Aug 1999, Ziber wrote:

> But the problem is that RAID isn't working in this manner. It tries hard to access the
> faulty device, and after giving some errors the system just hangs. In my case I'm
> using Samba to access the RAID, and if I'm copying data to the RAID and pull the
> power cable, after some period my Samba connection is lost and the Linux box halts.
> RAID is supposed to isolate the drive once its old buffer is flushed, but in my
> case it isn't.
> This is very basic RAID functionality, isn't it?
> 
> Is there any other procedure to test that the RAID is working correctly?

Even if we do not survive the failure, the redundancy that RAID provides
is still valuable. On the next boot after the failure, we will be able
to continue working in degraded mode with the working drives.

Surviving the failure and continuing to work is not so simple, since in
addition to the RAID layer, the low-level disk drivers, the bus controller
and the devices themselves all have to survive the failure.
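As a sketch of what that next boot looks like in practice (the mdstat line and device names below are assumptions for illustration, not taken from this thread): with the 0.90 raid patches, /proc/mdstat marks a degraded raid1 with an underscore in the member map, e.g. "[2/1] [U_]", and a script can key off that before raidhotadd-ing the repaired partition:

```shell
#!/bin/sh
# Hypothetical check for degraded mode. In /proc/mdstat, "[2/1] [U_]"
# means 2 disks configured, 1 up, second member missing.

is_degraded() {
    # $1 is one mdstat status line; an underscore inside the [..]
    # member map flags a missing member.
    echo "$1" | grep -q '\[U*_[U_]*\]'
}

# Hypothetical mdstat status line for a degraded md0:
line='498944 blocks [2/1] [U_]'
if is_degraded "$line"; then
    echo "md0 degraded -- raidhotadd /dev/md0 /dev/sdb5 once sdb is back"
fi
```

In real use you would feed it the status line from /proc/mdstat rather than a literal string.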

Gadi

> Gadi Oxman <[EMAIL PROTECTED]> wrote:
> This behavior can be improved, but most probably there is no bug here.
> 
> On the first detected error, we have switched to degraded mode and
> *new requests* to the RAID device will not be redirected to the failed
> disk drive.
> 
> However, in the current architecture, *old requests* which had
> been submitted to the failed disk prior to the failure can't be
> aborted while they are already in the low-level drivers' queues.
> 
> The period in which the failed disk's queue is flushed can take quite
> a bit of time, as the low-level drivers attempt to do everything they
> possibly can to service the queued requests, including slow bus resets,
> etc. (and rightly so, as they don't currently know that the device is
> actually part of a redundant disk setup and that nothing bad will
> happen if they don't try as hard to service the requests). After this
> period, though, the failed disk should be idle.
> 
> Gadi
> 
> On Tue, 17 Aug 1999, James Manning wrote:
> 
> > > I am facing a problem with RAID.
> > > I have created RAID-1 on 500 MB partitions on sda5 and sdb5. For testing
> > > purposes, I disconnected power from sdb while copying data to the RAID. But
> > > the RAID is still trying to access sdb and then giving errors in an
> > > infinite loop.
> > 
> > Since it looks like each new chunk is generating the raid1 attempt at
> > recovery, it does seem like sdb5 should be removed from md0 automatically
> > and the recovery thread should only get woken up again when you get the
> > drive back into a workable state and "raidhotadd" the partition again.
> > 
> > Worst case, it seems like you should be able to "raidhotremove" the
> > partition in the mean time.
> > 
> > Is this a raid/raid1 bug? Or desired behavior?
> > 
> > James
> > 
> > > ........
> > > Aug 17 17:34:58 client7 kernel: scsi0 channel 0 : resetting for second half of retries.
> > > Aug 17 17:34:58 client7 kernel: SCSI bus is being reset for host 0 channel 0.
> > >
> > > Aug 17 17:35:01 client7 kernel: (scsi0:0:0:0) Synchronous at 20.0 Mbyte/sec, offset 8.
> > > Aug 17 17:35:01 client7 kernel: SCSI disk error : host 0 channel 0 id 1 lun 0 return code = 26030000
> > > Aug 17 17:35:01 client7 kernel: scsidisk I/O error: dev 08:15, sector 557856
> > > Aug 17 17:35:01 client7 kernel: md: recovery thread got woken up ...
> > > Aug 17 17:35:01 client7 kernel: md0: no spare disk to reconstruct array! -- continuing in degraded mode
> > > Aug 17 17:35:01 client7 kernel: md: recovery thread finished ...
> > > Aug 17 17:35:01 client7 kernel: SCSI disk error : host 0 channel 0 id 1 lun 0 return code = 26030000
> > > Aug 17 17:35:01 client7 kernel: scsidisk I/O error: dev 08:15, sector 557730
> > > Aug 17 17:35:01 client7 kernel: md: recovery thread got woken up ...
> > > Aug 17 17:35:01 client7 kernel: md0: no spare disk to reconstruct array! -- continuing in degraded mode
> > > Aug 17 17:35:01 client7 kernel: md: recovery thread finished ...
> > > Aug 17 17:35:01 client7 kernel: SCSI disk error : host 0 channel 0 id 1 lun 0 return code = 26030000
> > > Aug 17 17:35:01 client7 kernel: scsidisk I/O error: dev 08:15, sector 557602
> > > Aug 17 17:35:01 client7 kernel: md: recovery thread got woken up ...
> > > Aug 17 17:35:01 client7 kernel: md0: no spare disk to reconstruct array! -- continuing in degraded mode
> > > Aug 17 17:35:01 client7 kernel: md: recovery thread finished ...
> > > Aug 17 17:35:02 client7 kernel: scsi0 channel 0 : resetting for second half of retries.
> > > Aug 17 17:35:02 client7 kernel: SCSI bus is being reset for host 0 channel 0.
> > >
> > > Aug 17 17:35:05 client7 kernel: (scsi0:0:0:0) Synchronous at 20.0 Mbyte/sec, offset 8.
> > > Aug 17 17:35:05 client7 kernel: SCSI disk error : host 0 channel 0 id 1 lun 0 return code = 26030000
> > > Aug 17 17:35:05 client7 kernel: scsidisk I/O error: dev 08:15, sector 557858
> > > Aug 17 17:35:05 client7 kernel: md: recovery thread got woken up ...
> > > Aug 17 17:35:05 client7 kernel: md0: no spare disk to reconstruct array! -- continuing in degraded mode
> > > Aug 17 17:35:05 client7 kernel: md: recovery thread finished ...
> > > Aug 17 17:35:05 client7 kernel: SCSI disk error : host 0 channel 0 id 1 lun 0 return code = 26030000
> > > Aug 17 17:35:05 client7 kernel: scsidisk I/O error: dev 08:15, sector 557604
> > > Aug 17 17:35:05 client7 kernel: md: recovery thread got woken up ...
> > > Aug 17 17:35:05 client7 kernel: md0: no spare disk to reconstruct array! -- continuing in degraded mode
> > > Aug 17 17:35:05 client7 kernel: md: recovery thread finished ...
> > > Aug 17 17:35:06 client7 kernel: SCSI disk error : host 0 channel 0 id 1 lun 0 return code = 26030000
> > > .........
> > > 
> > > 
> > > 
> > > I'm using an Adaptec AHA-2955 SCSI card with a 2.2.10 kernel patched with
> > > raid0145-19990724.
> > > My raidtab file is:
> > > 
> > > 
> > > raiddev   /dev/md0
> > >   raid-level      1
> > >   nr-raid-disks   2
> > >   nr-spare-disks  0
> > >   persistent-superblock   1
> > >   chunk-size      32
> > > 
> > >           device  /dev/sda5
> > >           raid-disk       0
> > > 
> > >           device  /dev/sdb5
> > >           raid-disk       1
> > 
