I've had a server up for about a week, and something strange happened while 
copying a batch of files over the network. After about three hours of copying, 
I lost connectivity to the server, although it still answered pings and my 
xVMs were up. 
The machine only showed a black screen with some error text, so I rebooted. 
Everything worked afterwards, except that my main NIC, skge0, was unavailable 
and marked as "missing:[driver unavailable]".
Take a look: http://pastebin.com/f52d34233 (I rebooted the machine at 18:52.)

Basically, a lot of:
May 19 18:07:49 srv01st scsi: [ID 243001 kern.warning] WARNING: /p...@0,0/pci8086,2...@1/pci15d9,a...@0 (mpt0):
May 19 18:07:49 srv01st         mpt_handle_event_sync: IOCStatus=0x8000, IOCLogInfo=0x31120200
May 19 18:07:49 srv01st scsi: [ID 243001 kern.warning] WARNING: /p...@0,0/pci8086,2...@1/pci15d9,a...@0 (mpt0):
May 19 18:07:49 srv01st         mpt_handle_event: IOCStatus=0x8000, IOCLogInfo=0x31120200
May 19 18:07:49 srv01st scsi: [ID 243001 kern.warning] WARNING: /p...@0,0/pci8086,2...@1/pci15d9,a...@0 (mpt0):
May 19 18:07:49 srv01st         mpt_handle_event_sync: IOCStatus=0x8000, IOCLogInfo=0x31120403
May 19 18:07:49 srv01st scsi: [ID 243001 kern.warning] WARNING: /p...@0,0/pci8086,2...@1/pci15d9,a...@0 (mpt0):
May 19 18:07:49 srv01st         mpt_handle_event: IOCStatus=0x8000, IOCLogInfo=0x31120403
May 19 18:07:52 srv01st scsi: [ID 365881 kern.info] /p...@0,0/pci8086,2...@1/pci15d9,a...@0 (mpt0):
May 19 18:07:52 srv01st         Log info 0x31120200 received for target 5.
May 19 18:07:52 srv01st         scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
May 19 18:07:52 srv01st scsi: [ID 365881 kern.info] /p...@0,0/pci8086,2...@1/pci15d9,a...@0 (mpt0):
May 19 18:07:52 srv01st         Log info 0x31120200 received for target 5.
May 19 18:07:52 srv01st         scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
May 19 18:07:52 srv01st scsi: [ID 365881 kern.info] /p...@0,0/pci8086,2...@1/pci15d9,a...@0 (mpt0):
May 19 18:07:52 srv01st         Log info 0x31120200 received for target 5.
May 19 18:07:52 srv01st         scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
May 19 18:07:52 srv01st scsi: [ID 365881 kern.info] /p...@0,0/pci8086,2...@1/pci15d9,a...@0 (mpt0):
May 19 18:07:52 srv01st         Log info 0x31120200 received for target 5.
May 19 18:07:52 srv01st         scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
May 19 18:07:52 srv01st scsi: [ID 365881 kern.info] /p...@0,0/pci8086,2...@1/pci15d9,a...@0 (mpt0):
May 19 18:07:52 srv01st         Log info 0x31120200 received for target 5.
May 19 18:07:52 srv01st         scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
May 19 18:07:52 srv01st scsi: [ID 365881 kern.info] /p...@0,0/pci8086,2...@1/pci15d9,a...@0 (mpt0):

May 19 18:08:33 srv01st fmd: [ID 441519 daemon.error] SUNW-MSG-ID: ZFS-8000-FD, TYPE: Fault, VER: 1, SEVERITY: Major
May 19 18:08:33 srv01st EVENT-TIME: Tue May 19 18:08:33 CEST 2009
May 19 18:08:33 srv01st PLATFORM: System Product Name, CSN: System Serial Number, HOSTNAME: srv01st
May 19 18:08:33 srv01st SOURCE: zfs-diagnosis, REV: 1.0
May 19 18:08:33 srv01st EVENT-ID: ad3d3ce2-0110-cc51-f2e1-befc0cd3f0ba
May 19 18:08:33 srv01st DESC: The number of I/O errors associated with a ZFS device exceeded
May 19 18:08:33 srv01st              acceptable levels.  Refer to http://sun.com/msg/ZFS-8000-FD for more information.
May 19 18:08:33 srv01st AUTO-RESPONSE: The device has been offlined and marked as faulted.  An attempt
May 19 18:08:33 srv01st              will be made to activate a hot spare if available.
May 19 18:08:33 srv01st IMPACT: Fault tolerance of the pool may be compromised.
May 19 18:08:33 srv01st REC-ACTION: Run 'zpool status -x' and replace the bad device.
May 19 18:52:08 srv01st genunix: [ID 540533 kern.notice] ^MSunOS Release 5.11 Version snv_101b 64-bit
May 19 18:52:08 srv01st genunix: [ID 172908 kern.notice] Copyright 1983-2008 Sun Microsystems, Inc.  All rights reserved.
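
To see at a glance which targets the mpt messages point at, the log can be 
tallied with a small script. This is only a rough sketch: the function names 
tally_log_info and split_log_info are made up for illustration, and the field 
layout used to split the log info word is an assumption borrowed from the 
Linux mptsas driver, not something confirmed for the Solaris mpt driver.

```python
import re
from collections import Counter

# Matches lines such as "Log info 0x31120200 received for target 5."
LOG_INFO_RE = re.compile(r"Log info (0x[0-9a-fA-F]+) received for target (\d+)")


def tally_log_info(lines):
    """Count (target, log info code) pairs found in syslog lines."""
    counts = Counter()
    for line in lines:
        match = LOG_INFO_RE.search(line)
        if match:
            code, target = match.groups()
            counts[(int(target), code.lower())] += 1
    return counts


def split_log_info(value):
    """Split a 32-bit log info word into its fields.

    Assumed layout (taken from the Linux mptsas driver, unverified for
    Solaris mpt): bus type in bits 31-28, originator in bits 27-24,
    code in bits 23-16, sub-code in bits 15-0.
    """
    return {
        "bus_type": (value >> 28) & 0xF,
        "originator": (value >> 24) & 0xF,
        "code": (value >> 16) & 0xFF,
        "subcode": value & 0xFFFF,
    }


if __name__ == "__main__":
    # Two lines copied from the log above; a real run would read the log file.
    sample = [
        "May 19 18:07:52 srv01st         Log info 0x31120200 received for target 5.",
        "May 19 18:07:52 srv01st         Log info 0x31120200 received for target 5.",
    ]
    for (target, code), count in sorted(tally_log_info(sample).items()):
        fields = split_log_info(int(code, 16))
        print(f"target {target}: {code} x{count} -> {fields}")
```

Passing an open file instead of the sample list (for example 
/var/adm/messages, assuming that is where the messages above were logged) 
would tally the whole log. In the excerpt above, every "Log info" line refers 
to target 5.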


I'm also getting a lot of these:
May 19 18:52:17 srv01st unix: [ID 954099 kern.info] NOTICE: IRQ19 is being shared by drivers with different interrupt levels.
May 19 18:52:17 srv01st This may result in reduced system performance.

I've got two virtual win2k3 machines up for testing, which access the network 
via a dedicated 100 Mbit NIC. That NIC is now also serving as my primary 
connection, and I managed to transfer the rest of the batch over it without 
any further problems.
The NIC that dropped out is skge0.

Help is very much appreciated!
-- 
This message posted from opensolaris.org
_______________________________________________
opensolaris-discuss mailing list