Re: 2850 PERC 4e/Di drive errors

2010-05-11 Thread Matthew Lenz
Jefferson Ogata wrote:
> On 2010-05-11 14:54, Matthew Lenz wrote:
>   
>> $ megactl -H
>> a0   PERC 4e/Di   chan:2 ldrv:1  batt:good
>> a0c0t0 279GiB  a0d0  online   errs: media:6  other:1
>>  write errors: corr:  0  delay:  0  rewrit:  0  tot/corr:  0  tot/uncorr:  0
>>   read errors: corr:  4Mi  delay: 58  reread:  0  tot/corr:  0  tot/uncorr:  0
>> verify errors: corr:  0  delay:  0  revrfy:  6  tot/corr:  0  tot/uncorr:  6
>> temperature: current:28C threshold:0C
>>
>> This is the only system with this showing up (of several x850 RAID 
>> setups).  These systems are still under warranty; should I request a 
>> replacement of this drive?  I really don't know how long this drive has 
>> been erroring.
>> 
>
> Try running a long self-test on the drive (megactl -T long a0c0t0). If
> it fails, it will be worth replacing.
>
> I'm assuming that drive is part of a redundant RAID. Be aware that
> if the disk is failing, a long self-test may turn up enough problems for
> the PERC to knock the disk offline. If you don't have redundancy, back
> the system up first...
>   
Yeah, there is a hot spare in all these RAID-5 systems.  I'll try 
running it during off-hours to see.  My primary concern is that the 
warranties on all these machines expire in the next 4-5 weeks, and we 
aren't planning on renewing since only the PE2850s are renewable 
(Dell won't renew the PE1850 and PC5324).  We were just planning on 
buying a couple of each used as spares.

If this drive is on its way out I'd like to get it replaced.  From what 
I've seen, those 300GB 10K U320 drives aren't cheap, even refurbished.

I'll give the long self-test a try.

___
Linux-PowerEdge mailing list
Linux-PowerEdge@dell.com
https://lists.us.dell.com/mailman/listinfo/linux-poweredge
Please read the FAQ at http://lists.us.dell.com/faq


Re: 2850 PERC 4e/Di drive errors

2010-05-11 Thread Jefferson Ogata
On 2010-05-11 14:54, Matthew Lenz wrote:
> $ megactl -H
> a0   PERC 4e/Di   chan:2 ldrv:1  batt:good
> a0c0t0 279GiB  a0d0  online   errs: media:6  other:1
>  write errors: corr:  0  delay:  0  rewrit:  0  tot/corr:  0  tot/uncorr:  0
>   read errors: corr:  4Mi  delay: 58  reread:  0  tot/corr:  0  tot/uncorr:  0
> verify errors: corr:  0  delay:  0  revrfy:  6  tot/corr:  0  tot/uncorr:  6
> temperature: current:28C threshold:0C
> 
> This is the only system with this showing up (of several x850 RAID 
> setups).  These systems are still under warranty; should I request a 
> replacement of this drive?  I really don't know how long this drive has 
> been erroring.

Try running a long self-test on the drive (megactl -T long a0c0t0). If
it fails, it will be worth replacing.

I'm assuming that drive is part of a redundant RAID. Be aware that
if the disk is failing, a long self-test may turn up enough problems for
the PERC to knock the disk offline. If you don't have redundancy, back
the system up first...
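
For anyone who wants to check these counters across several machines, here
is a minimal Python sketch that parses `megactl -H` text and lists drives
with a nonzero tot/uncorr count. It assumes one error row per line, laid out
as in the output quoted above, and it assumes the Ki/Mi/Gi suffixes are
binary multipliers. The names parse_health and uncorrected are made up for
this example and are not part of megactl.

```python
import re

# Counter fields look like "corr:  4Mi" or "tot/uncorr:  6".
FIELD = re.compile(r"([a-z/]+):\s*(\d+)(Ki|Mi|Gi)?")
# Error rows start with the operation name, e.g. " write errors: ...".
ROW = re.compile(r"^\s*(write|read|verify) errors:(.*)$")
# Drive rows start with an adapter/channel/target id such as "a0c0t0".
DRIVE = re.compile(r"^(a\d+c\d+t\d+)\s")
# Assumption: suffixes are binary multipliers; no suffix means a plain count.
SCALE = {"": 1, "Ki": 2**10, "Mi": 2**20, "Gi": 2**30}


def parse_health(text):
    """Return {drive_id: {operation: {counter: int}}} from `megactl -H` text."""
    drives = {}
    current = None
    for line in text.splitlines():
        drive = DRIVE.match(line)
        if drive:
            current = drives.setdefault(drive.group(1), {})
            continue
        row = ROW.match(line)
        if row and current is not None:
            current[row.group(1)] = {
                key: int(num) * SCALE[suffix]
                for key, num, suffix in FIELD.findall(row.group(2))
            }
    return drives


def uncorrected(drives):
    """List (drive, operation, count) for every nonzero tot/uncorr counter."""
    return [
        (drive, op, counters["tot/uncorr"])
        for drive, ops in drives.items()
        for op, counters in ops.items()
        if counters.get("tot/uncorr", 0) > 0
    ]


# Sample taken from the output posted in this thread.
SAMPLE = """\
a0   PERC 4e/Di   chan:2 ldrv:1  batt:good
a0c0t0 279GiB  a0d0  online   errs: media:6  other:1
 write errors: corr:  0  delay:  0  rewrit:  0  tot/corr:  0  tot/uncorr:  0
  read errors: corr:  4Mi  delay: 58  reread:  0  tot/corr:  0  tot/uncorr:  0
verify errors: corr:  0  delay:  0  revrfy:  6  tot/corr:  0  tot/uncorr:  6
temperature: current:28C threshold:0C
"""

if __name__ == "__main__":
    for drive, op, count in uncorrected(parse_health(SAMPLE)):
        print(f"{drive}: {count} uncorrected {op} errors")
```

On the sample above, the only entry reported is the verify row for a0c0t0.
To check a live system, the text passed to parse_health would come from
running `megactl -H` instead of from SAMPLE.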



2850 PERC 4e/Di drive errors

2010-05-11 Thread Matthew Lenz
$ megactl -H
a0   PERC 4e/Di   chan:2 ldrv:1  batt:good
a0c0t0 279GiB  a0d0  online   errs: media:6  other:1
 write errors: corr:  0  delay:  0  rewrit:  0  tot/corr:  0  tot/uncorr:  0
  read errors: corr:  4Mi  delay: 58  reread:  0  tot/corr:  0  tot/uncorr:  0
verify errors: corr:  0  delay:  0  revrfy:  6  tot/corr:  0  tot/uncorr:  6
temperature: current:28C threshold:0C

This is the only system with this showing up (of several x850 RAID 
setups).  These systems are still under warranty; should I request a 
replacement of this drive?  I really don't know how long this drive has 
been erroring.

TIA,

-Matt
