Re: [CentOS] Re: 3ware 9650 issues

2008-06-23 Thread Gary Richardson

1500W should be plenty, but the card may not be getting enough power.

On a much smaller system (3 drives, one 3ware card), I had power problems. I
used a 400W power supply, and its +5V rail was only delivering 3.9V. I kept
losing drives. This was an 'expensive' Antec power supply.

I switched to a budget 300W power supply just to see what would happen. The
unit delivered a much cleaner ~4.8V. It's worked great ever since.
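
For anyone wanting to sanity-check readings like these, here is a rough
sketch. It assumes the usual ATX guideline of +/-5% on the +5V rail; that
tolerance figure and the function name are my assumptions, not something
measured or stated in this thread.

```python
# Assumption: the +5V rail should stay within +/-5% of nominal,
# per the ATX power supply design guide (not stated in this thread).
NOMINAL_V = 5.0
TOLERANCE = 0.05


def rail_in_spec(measured_v, nominal_v=NOMINAL_V, tolerance=TOLERANCE):
    """Return True if the measured voltage is within tolerance of nominal."""
    low = nominal_v * (1 - tolerance)
    high = nominal_v * (1 + tolerance)
    return low <= measured_v <= high


# The two readings mentioned above: the 400W unit and the 300W unit.
for reading in (3.9, 4.8):
    status = "in spec" if rail_in_spec(reading) else "OUT OF SPEC"
    print(f"{reading:.1f}V: {status}")
```

Under that assumption the 3.9V reading is far outside the window, while
~4.8V sits just inside it.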

On Mon, Jun 23, 2008 at 7:45 AM, Joshua Baker-LePain <[EMAIL PROTECTED]> wrote:

> On Sun, 22 Jun 2008 at 10:23am, Scott Silva wrote
>
>> on 6-21-2008 9:04 PM Joshua Baker-LePain spake the following:
>>
>>> This of course leads to a several hour downtime as the system has to be
>>> powered down (not just rebooted) and then the volume needs to be fscked.
>>> I've been back and forth with both the vendor and (via the vendor) 3ware
>>> with this.  The card has been replaced, as well as the whole system.  I'm
>>> running the latest firmware and drivers from 3ware.
>>
>> That looks like either drive, cabling, or power problems.
>
> I'd agree, except for a) all the hardware has been swapped out and b) 1500W
> should be plenty.
>
> It's starting to sound like this may be a somewhat known issue with a
> *long* overdue fix coming from 3ware.  *sigh*
>
> Thanks all.
>
>
> --
> Joshua Baker-LePain
> QB3 Shared Cluster Sysadmin
> UCSF
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Re: 3ware 9650 issues

2008-06-23 Thread Joshua Baker-LePain

On Sun, 22 Jun 2008 at 10:23am, Scott Silva wrote

> on 6-21-2008 9:04 PM Joshua Baker-LePain spake the following:
>
>> This of course leads to a several hour downtime as the system has to be
>> powered down (not just rebooted) and then the volume needs to be fscked.
>> I've been back and forth with both the vendor and (via the vendor) 3ware
>> with this.  The card has been replaced, as well as the whole system.  I'm
>> running the latest firmware and drivers from 3ware.
>
> That looks like either drive, cabling, or power problems.

I'd agree, except for a) all the hardware has been swapped out and b)
1500W should be plenty.

It's starting to sound like this may be a somewhat known issue with a
*long* overdue fix coming from 3ware.  *sigh*

Thanks all.

--
Joshua Baker-LePain
QB3 Shared Cluster Sysadmin
UCSF


[CentOS] Re: 3ware 9650 issues

2008-06-22 Thread Scott Silva

on 6-21-2008 9:04 PM Joshua Baker-LePain spake the following:
> I've been having no end of issues with a 3ware 9650SE-24M8 in a server
> that's coming on a year old.  I've got 24 WDC WD5001ABYS drives (500GB)
> hooked to it, running as a single RAID6 w/ a hot spare.  These issues
> boil down to the card periodically throwing errors like the following:
>
> sd 1:0:0:0: WARNING: (0x06:0x002C): Command (0x8a) timed out, resetting card.
>
> Usually when this happens, it's followed by:
>
> 3w-9xxx: scsi1: AEN: INFO (0x04:0x005E): Cache synchronization completed:unit=0.
>
> On the less pleasant occasions, it's followed by:
>
> scsi1: ERROR: (0x06:0x0036): Response queue (large) empty failed during reset sequence.
> 3w-9xxx: scsi1: ERROR: (0x06:0x002B): Controller reset failed during scsi host reset.
>
> sd 1:0:0:0: scsi: Device offlined - not ready after error recovery
>
> This of course leads to a several hour downtime as the system has to be
> powered down (not just rebooted) and then the volume needs to be fscked.
> I've been back and forth with both the vendor and (via the vendor) 3ware
> with this.  The card has been replaced, as well as the whole system.
> I'm running the latest firmware and drivers from 3ware.
>
> Have other folks had good luck with this card?  What sorts of configs
> are you running?  I'm in the position of needing more storage, and I'm a
> bit gun shy on 3ware at the moment...



That looks like either drive, cabling, or power problems.
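
If it helps with tracking how often each of these shows up, here is a
minimal sketch that pulls the severity and the hex code pair out of the
messages quoted above. The pattern is inferred only from those lines, so it
is an assumption about the format rather than the driver's documented
layout, and `parse_line` is just an illustrative name.

```python
import re
from collections import Counter

# Assumption: severity keyword, optional colon, then "(0xNN:0xNNNN): text",
# as seen in the messages quoted in this thread.
PATTERN = re.compile(
    r"(INFO|WARNING|ERROR):? \((0x[0-9A-Fa-f]+:0x[0-9A-Fa-f]+)\): (.*)"
)

# Sample lines copied from the thread, with wrapped lines rejoined.
LOG_LINES = [
    "sd 1:0:0:0: WARNING: (0x06:0x002C): Command (0x8a) timed out, resetting card.",
    "3w-9xxx: scsi1: AEN: INFO (0x04:0x005E): Cache synchronization completed:unit=0.",
    "scsi1: ERROR: (0x06:0x0036): Response queue (large) empty failed during reset sequence.",
    "3w-9xxx: scsi1: ERROR: (0x06:0x002B): Controller reset failed during scsi host reset.",
    "sd 1:0:0:0: scsi: Device offlined - not ready after error recovery",
]


def parse_line(line):
    """Return (severity, code, message), or None if the line does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    return match.group(1), match.group(2), match.group(3)


# Tally how many lines of each severity were seen; unmatched lines are skipped.
parsed = [p for p in (parse_line(line) for line in LOG_LINES) if p is not None]
severity_counts = Counter(severity for severity, _code, _message in parsed)

for severity, code, message in parsed:
    print(f"{severity:8} {code}  {message}")
print(dict(severity_counts))
```

Pointing the same function at the system log instead of the sample list
would show whether the timeouts cluster around particular times or loads.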

--
MailScanner is like deodorant...
You hope everybody uses it, and
you notice quickly if they don't


