I had a simple RAID1 system up and running for the first time yesterday on i386
3.8, but after testing a failure mode I can't get things unwedged.  It's
quite possible I wedged it worse while messing with it after the failure, so
I'm OK with starting the build over (no data on it yet) - but I can't get it to
that state either.  Any pointers appreciated!


Initial running status:

bash-3.00# raidctl -s raid0
raid0 Components:
           /dev/wd2a: optimal
           /dev/wd1a: optimal
No spares.
Parity status: clean
Reconstruction is 100% complete.
Parity Re-write is 100% complete.
Copyback is 100% complete.

Mounted it ok too:

/dev/raid0a  19239324         6  18277352     0%    /home
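
For reference, the set was built the standard way.  I'm reconstructing the
config file here from the component label shown below (sectPerSU 32, fifo
queue depth 100, serial number 100), so it may not be character-for-character
what I used, but it was essentially:

/etc/raid0.conf:

   START array
   # numRow numCol numSpare
   1 2 0

   START disks
   /dev/wd1a
   /dev/wd2a

   START layout
   # sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level
   32 1 1 1

   START queue
   fifo 100

followed by the usual first-time bring-up:

   raidctl -C /etc/raid0.conf raid0    # force initial configuration (no labels yet)
   raidctl -I 100 raid0                # write component labels, serial 100
   raidctl -iv raid0                   # initialize parity
   disklabel -E raid0                  # carve out raid0a
   newfs /dev/rraid0a
   mount /dev/raid0a /home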

Then I tried a simple 'raidctl -f /dev/wd1a raid0', which marked the component
failed as expected, but on 'raidctl -R /dev/wd1a raid0' the kernel panicked,
and I have not been able to resurrect it since.

Here's what's been changed in the kernel config (not a lot).  I don't
understand why it would panic on a simple reconstruct command.

bash-3.00# diff GENERIC GENERIC_RAID
16,18c16,18
< option                I386_CPU        # CPU classes; at least one is REQUIRED
< option                I486_CPU
< option                I586_CPU
---
> #option               I386_CPU        # CPU classes; at least one is REQUIRED
> #option               I486_CPU
> #option               I586_CPU
21a22,24
> option        RAID_AUTOCONFIG
> option        RAIDDEBUG
>
23a27
>
607c611
< #pseudo-device        raid            4       # RAIDframe disk driver
---
> pseudo-device raid            4       # RAIDframe disk driver
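
(One thing I notice now: the kernel has RAID_AUTOCONFIG compiled in, but the
component labels below say "Autoconfig: No", so the set was never actually
flagged for autoconfiguration.  I believe that would have been a separate
step, something like:

   raidctl -A yes raid0

Not sure whether that matters for the panic, but mentioning it for
completeness.)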


Here's the current status of the RAID:

bash-3.00# raidctl -s raid0
raid0 Components:
           /dev/wd2a: optimal
           /dev/wd1a: failed
No spares.
Parity status: DIRTY
Reconstruction is 100% complete.
Parity Re-write is 100% complete.
Copyback is 100% complete.

I can't fix the parity now either; it fails on both a -i and a -P attempt.
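
That is, neither of these gets anywhere:

   raidctl -i raid0    # re-write the parity
   raidctl -P raid0    # check parity status and re-write it if needed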

wd2a's component label looks fine:

bash-3.00# raidctl -g /dev/wd2a raid0
Component label for /dev/wd2a:
   Row: 0, Column: 0, Num Rows: 1, Num Columns: 2
   Version: 2, Serial Number: 100, Mod Counter: 192
   Clean: No, Status: 0
   sectPerSU: 32, SUsPerPU: 1, SUsPerRU: 1
   Queue size: 100, blocksize: 512, numBlocks: 39102080
   RAID Level: 1
   Autoconfig: No
   Root partition: No
   Last configured as: raid0

wd1a's does not... pretty much every field that could be changed has been,
somehow.  I suspect this is the root of why raid0 can't deal with it, but I
can't seem to get it to reinitialize correctly either.

bash-3.00# raidctl -g /dev/wd1a raid0
Component label for /dev/wd1a:
   Row: 16, Column: 24, Num Rows: 1312, Num Columns: 16
   Version: 0, Serial Number: 0, Mod Counter: 8
   Clean: Yes, Status: 1133920558
   sectPerSU: 9775536, SUsPerPU: 9620351, SUsPerRU: 119
   Queue size: 2048, blocksize: 8, numBlocks: 5
   RAID Level:
   Autoconfig: Yes
   Root partition: Yes
   Last configured as: raid256
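
My guess is that the bogus label could be cleared by zeroing the start of
wd1a (as far as I understand it, the label lives in the protected area - the
"protectedSectors is 64" in the dmesg below), e.g.:

   dd if=/dev/zero of=/dev/rwd1a bs=512 count=64

but I didn't want to make things worse without a sanity check from the list.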


After a reboot, dmesg says:

wd1 at pciide0 channel 0 drive 1: <IBM-DPTA-372050>
wd1: 16-sector PIO, LBA, 19574MB, 40088160 sectors
wd1(pciide0:0:1): using PIO mode 4, Ultra-DMA mode 2

wd2 at pciide0 channel 1 drive 1: <WDC WD200EB-00CSF0>
wd2: 16-sector PIO, LBA, 19092MB, 39102336 sectors
wd2(pciide0:1:1): using PIO mode 4, Ultra-DMA mode 2

Kernelized RAIDframe activated
Searching for raid components...
cd0(atapiscsi0:0:0): Check Condition (error 0x70) on opcode 0x0
    SENSE KEY: Not Ready
     ASC/ASCQ: Medium Not Present

        [ why does cd0 show up here at all? ]

Component on: wd2a: 39102147
   Row: 0 Column: 0 Num Rows: 1 Num Columns: 2
   Version: 2 Serial Number: 100 Mod Counter: 183
   Clean: No Status: 0
   sectPerSU: 32 SUsPerPU: 1 SUsPerRU: 1
   RAID Level: 1  blocksize: 512 numBlocks: 39102080
   Autoconfig: No
   Contains root partition: No
   Last configured as: raid0
Found: wd2a at 0,0

RAIDFRAME: protectedSectors is 64.
Hosed component: /dev/wd1a.
Hosed component: /dev/wd1a.
raid0: Component /dev/wd2a being configured at row: 0 col: 0
         Row: 0 Column: 0 Num Rows: 1 Num Columns: 2
         Version: 2 Serial Number: 100 Mod Counter: 183
         Clean: No Status: 0
/dev/wd2a is not clean !
raid0: Ignoring /dev/wd1a.
RAIDFRAME(RAID Level 1): Using 6 floating recon bufs with no head sep limit.
raid0 (root)raid0: Error re-writing parity!


I agree it's 'hosed', just looking at the component label!  Nice message. :)

If I umount the partition and try to -u(nconfigure) raid0, it kernel panics.
If I -R(econstruct), it kernel panics.  I've messed it up but good.  How do I
reinitialize wd1a and/or raid0, or just start over completely?
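
My best guess at a complete start-over, assuming I can get an unconfigure (or
a boot with the set unconfigured) to happen without panicking, would be
something like:

   raidctl -u raid0                                  # unconfigure the set
   dd if=/dev/zero of=/dev/rwd1a bs=512 count=64     # wipe the stale label on wd1a
   dd if=/dev/zero of=/dev/rwd2a bs=512 count=64     # and wd2a, for a truly clean slate
   raidctl -C /etc/raid0.conf raid0                  # force configuration of the fresh set
   raidctl -I 100 raid0                              # write new component labels
   raidctl -iv raid0                                 # initialize parity

but I'd appreciate a sanity check (or a better recipe) before I flail any
further.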

-Dave Diller
