I have a pair of Western Digital RE4-GP (WD2002FYPS) drives in
a software RAID1 configuration using Linux 2.6.30.3 on an
LSISAS1068E controller.  Within hours one of the drives was
kicked out of the array with:

[ 4907.485324] end_request: I/O error, dev sdb, sector 3907028974
[ 4907.485543] md: super_written gets error=-5, uptodate=0
[ 4907.485546] raid1: Disk failure on sdb2, disabling device.
[ 4907.485547] raid1: Operation continuing on 1 devices.
[ 4907.499157] RAID1 conf printout:
[ 4907.499159]  --- wd:1 rd:2
[ 4907.499162]  disk 0, wo:0, o:1, dev:sda2
[ 4907.499164]  disk 1, wo:1, o:0, dev:sdb2
[ 4907.503037] RAID1 conf printout:
[ 4907.503039]  --- wd:1 rd:2
[ 4907.503041]  disk 0, wo:0, o:1, dev:sda2
[ 6705.292961] sd 4:0:1:0: [sdb] Sense Key : Recovered Error [current] [descriptor]
[ 6705.292967] Descriptor sense data with sense descriptors (in hex):
[ 6705.292970]         72 01 00 1d 00 00 00 0e 09 0c 00 00 00 00 00 00
[ 6705.292978]         00 4f 00 c2 00 50
[ 6705.292983] sd 4:0:1:0: [sdb] Add. Sense: ATA pass through information available
[ 6705.359497] sd 4:0:1:0: [sdb] Sense Key : Recovered Error [current] [descriptor]
...

Subsequently, I disabled NCQ with:

# echo 1 > /sys/block/sda/device/queue_depth
# echo 1 > /sys/block/sdb/device/queue_depth

which rendered the system stable.  Is there a better way of
implementing this work-around than an rcS.d or rc2.d script?
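
A minimal sketch of such a boot-time script, assuming the affected
disks are still named sda and sdb at boot and that the script runs as
root.  The SYSFS_ROOT variable and the set_queue_depth function name
are only illustrative, not part of any existing tool:

```shell
#!/bin/sh
# Sketch only: write a queue depth to each named disk's sysfs entry.
# A depth of 1 is the setting used above to disable NCQ.
# SYSFS_ROOT is a hypothetical override so the script can be tried
# against a scratch directory instead of the real /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

set_queue_depth() {
    depth="$1"
    shift
    for disk in "$@"; do
        f="$SYSFS_ROOT/block/$disk/device/queue_depth"
        if [ -w "$f" ]; then
            echo "$depth" > "$f"
        else
            # Disk missing or not writable (e.g. not running as root).
            echo "cannot write $f" >&2
        fi
    done
}

# From an rc script, as root:
#   set_queue_depth 1 sda sdb
```

The device names are passed as arguments because sda/sdb may be
assigned differently after a reboot or a hardware change.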

What is the right place to report this problem with NCQ?  
[email protected] generated no response.


/Allan
-- 
Allan Wind
Life Integrity, LLC
<http://lifeintegrity.com>

