I don't think the hardware has any problems; it only started having
errors when I upgraded OpenSolaris.
It's working fine again now after a reboot.  Actually, I reread one of
your earlier messages.  When you said "non-Sun JBOD", I didn't realize
at first that this didn't describe my setup (with regard to the msi=0
fix), because I didn't know JBOD was shorthand for an external expander
device.  Since I'm just using bare metal and passive backplanes, I think
the msi=0 fix should apply to me, based on what you wrote earlier.
Anyway, I've now put
        set mpt:mpt_enable_msi = 0
in /etc/system and rebooted, as was suggested earlier.  I've resumed my
rsync, and so far there have been no errors, but it's only been 20
minutes or so.  I should have a good idea by tomorrow whether this
definitely fixed the problem (since even when the machine was not
crashing, it was tallying up iostat errors fairly rapidly).

Thanks again for your help.  Sorry for wasting your time if the
previously posted workaround fixes things.  I'll let you know tomorrow
either way.

Chad

On Tue, Dec 01, 2009 at 05:57:28PM +1000, James C. McPherson wrote:
> Chad Cantwell wrote:
> >After another crash I checked the syslog and there were some different 
> >errors than the ones
> >I saw previously during operation:
> ...
> 
> >Nov 30 20:59:13 the-vault       LSI PCI device (1000,ffff) not supported.
> ...
> >Nov 30 20:59:13 the-vault       mpt_config_space_init failed
> ...
> >Nov 30 20:59:15 the-vault       mpt_restart_ioc failed
> ....
> 
> >Nov 30 21:33:02 the-vault fmd: [ID 377184 daemon.error] SUNW-MSG-ID: 
> >PCIEX-8000-8R, TYPE: Fault, VER: 1, SEVERITY: Major
> >Nov 30 21:33:02 the-vault EVENT-TIME: Mon Nov 30 21:33:02 PST 2009
> >Nov 30 21:33:02 the-vault PLATFORM: System-Product-Name, CSN: 
> >System-Serial-Number, HOSTNAME: the-vault
> >Nov 30 21:33:02 the-vault SOURCE: eft, REV: 1.16
> >Nov 30 21:33:02 the-vault EVENT-ID: 7886cc0d-4760-60b2-e06a-8158c3334f63
> >Nov 30 21:33:02 the-vault DESC: The transmitting device sent an invalid 
> >request.
> >Nov 30 21:33:02 the-vault   Refer to http://sun.com/msg/PCIEX-8000-8R for 
> >more information.
> >Nov 30 21:33:02 the-vault AUTO-RESPONSE: One or more device instances may be 
> >disabled
> >Nov 30 21:33:02 the-vault IMPACT: Loss of services provided by the device 
> >instances associated with this fault
> >Nov 30 21:33:02 the-vault REC-ACTION: Ensure that the latest drivers and 
> >patches are installed. Otherwise schedule a repair procedure to replace the 
> >affected device(s).  Use fmadm faulty to identify the devices or contact 
> >Sun for support.
> 
> 
> Sorry to have to tell you, but that HBA is dead. Or at
> least dying horribly. If you can't init the config space
> (that's the pci bus config space), then you've got about
> 1/2 the nails in the coffin hammered in. Then the failure
> to restart the IOC (io controller unit) == the rest of
> the lid hammered down.
> 
> 
> best regards,
> James C. McPherson
> --
> Senior Kernel Software Engineer, Solaris
> Sun Microsystems
> http://blogs.sun.com/jmcp     http://www.jmcp.homeunix.com/blog
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
