Hmmm, the day I had hoped would never arrive has...
Aug 2 07:38:27 chrome kernel: raid1: Disk failure on hdg1, disabling device.
Aug 2 07:38:27 chrome kernel: raid1: md0: rescheduling block 8434238
Aug 2 07:38:27 chrome kernel: md0: no spare disk to reconstruct array! -- continuing in degraded mode
Aug 2 07:38:27 chrome kernel: raid1: md0: redirecting sector 8434238 to another mirror
My setup is a two-disk (40 GB each) RAID1 configuration: hde1 and hdg1. I didn't have anything in place to notify me of such an event, so I didn't notice it until I looked at the console today and saw it there...
I ran raidhotremove /dev/md0 /dev/hdg1 and then raidhotadd /dev/md0
/dev/hdg1 and it seemed to begin reconstruction:
Aug 5 09:23:45 chrome login[97]: ROOT LOGIN on `tty1'
Aug 5 09:28:45 chrome kernel: md: updating md0 RAID superblock on device
Aug 5 09:29:00 chrome kernel: md: updating md0 RAID superblock on device
Aug 5 09:29:00 chrome kernel: md: recovery thread got woken up ...
Aug 5 09:29:00 chrome kernel: md0: resyncing spare disk hdg1 to replace failed disk
Aug 5 09:29:00 chrome kernel: md: syncing RAID array md0
Aug 5 09:29:00 chrome kernel: md: minimum _guaranteed_ reconstruction speed: 100 KB/sec.
Aug 5 09:29:00 chrome kernel: md: using maximum available idle IO bandwith for reconstruction.
Aug 5 09:29:00 chrome kernel: md: using 128k window.
Aug 5 09:31:04 chrome kernel: md: recovery thread finished ...
Aug 5 09:31:04 chrome kernel: md: updating md0 RAID superblock on device
Aug 5 09:31:04 chrome kernel: md0 stopped.
Aug 5 09:33:51 chrome kernel: md0: max total readahead window set to 128k
Aug 5 09:33:51 chrome kernel: md0: 1 data-disks, max readahead per data-disk: 128k
Aug 5 09:33:51 chrome kernel: raid1: spare disk hdg1
Aug 5 09:33:51 chrome kernel: raid1: device hde1 operational as mirror 0
Aug 5 09:33:51 chrome kernel: raid1: raid set md0 active with 1 out of 2 mirrors
chrome:~# cat /proc/mdstat
Personalities : [raid1]
read_ahead 1024 sectors
md0 : active raid1 hdg1[2] hde1[0] 39061952 blocks [2/1] [U_]
recovery=0% finish=112.2min
unused devices: <none>
chrome:~#
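(For reference, the "[2/1] [U_]" part of that mdstat line is the thing to watch: two mirrors configured, one currently in sync, with the underscore marking the missing one. Below is a minimal sketch of pulling those counters out of such a line; the sample line is copied from the output above, and on a live system you would read /proc/mdstat instead.)

```shell
# Pull the device counters out of an mdstat status line.
# Sample line copied from the /proc/mdstat output above; on a live
# system you could use something like: line=$(grep '^md0' /proc/mdstat)
line='md0 : active raid1 hdg1[2] hde1[0] 39061952 blocks [2/1] [U_]'

# [total/active]: mirrors configured vs. mirrors currently in sync.
total=$(echo "$line" | sed 's/.*\[\([0-9][0-9]*\)\/\([0-9][0-9]*\)\].*/\1/')
active=$(echo "$line" | sed 's/.*\[\([0-9][0-9]*\)\/\([0-9][0-9]*\)\].*/\2/')

# A degraded array has fewer in-sync mirrors than configured mirrors.
if [ "$active" -lt "$total" ]; then
    echo "md0 degraded: $active of $total mirrors in sync"
fi
```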
(This used to be hde1[0] and hdg1[1] before, but I raidhotremoved and raidhotadded it... I'm not terribly worried about the numbering, as long as it works...)
But I got scared and decided to stop it, so now it's sitting idle, unmounted, and spun down (both disks), awaiting professional advice rather than me stumbling around in the dark until I hose my data. Both disks are less than two weeks old, although I have heard of people having similar problems with this brand and model (disks failing when less than a month out of the factory). I would like to get the drives back to the way they were before the system decided that the disk had failed (what causes it to think that, anyway?) and see if it continues to work, as I find it hard to believe that the drive would have died so quickly. What is the proper course of action?
--
--------------------------------------------------
[EMAIL PROTECTED] - 0xCD91A427
9907 3747 3CE9 11C5 2B1C F141 D09F 488C CD91 A427
Note: key id 0x299450B6 is lost and inactive.
--------------------------------------------------
Copyright 2000 Jeffrey Paul.