I've had a server up for about a week, and something strange happened while copying a batch of files over the network. After about three hours of copying, I lost connectivity to the server, although it still answered pings and my xVMs were up. The machine's console showed only a black screen with some error text, so I rebooted at 18:52. Everything came back up, except that my main NIC, skge0, was unavailable and marked as "missing:[driver unavailable]". Take a look: http://pastebin.com/f52d34233
Basically, a lot of:

# May 19 18:07:49 srv01st scsi: [ID 243001 kern.warning] WARNING: /p...@0,0/pci8086,2...@1/pci15d9,a...@0 (mpt0):
# May 19 18:07:49 srv01st mpt_handle_event_sync: IOCStatus=0x8000, IOCLogInfo=0x31120200
# May 19 18:07:49 srv01st scsi: [ID 243001 kern.warning] WARNING: /p...@0,0/pci8086,2...@1/pci15d9,a...@0 (mpt0):
# May 19 18:07:49 srv01st mpt_handle_event: IOCStatus=0x8000, IOCLogInfo=0x31120200
# May 19 18:07:49 srv01st scsi: [ID 243001 kern.warning] WARNING: /p...@0,0/pci8086,2...@1/pci15d9,a...@0 (mpt0):
# May 19 18:07:49 srv01st mpt_handle_event_sync: IOCStatus=0x8000, IOCLogInfo=0x31120403
# May 19 18:07:49 srv01st scsi: [ID 243001 kern.warning] WARNING: /p...@0,0/pci8086,2...@1/pci15d9,a...@0 (mpt0):
# May 19 18:07:49 srv01st mpt_handle_event: IOCStatus=0x8000, IOCLogInfo=0x31120403
# May 19 18:07:52 srv01st scsi: [ID 365881 kern.info] /p...@0,0/pci8086,2...@1/pci15d9,a...@0 (mpt0):
# May 19 18:07:52 srv01st Log info 0x31120200 received for target 5.
# May 19 18:07:52 srv01st scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
# May 19 18:07:52 srv01st scsi: [ID 365881 kern.info] /p...@0,0/pci8086,2...@1/pci15d9,a...@0 (mpt0):
# May 19 18:07:52 srv01st Log info 0x31120200 received for target 5.
# May 19 18:07:52 srv01st scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
# May 19 18:07:52 srv01st scsi: [ID 365881 kern.info] /p...@0,0/pci8086,2...@1/pci15d9,a...@0 (mpt0):
# May 19 18:07:52 srv01st Log info 0x31120200 received for target 5.
# May 19 18:07:52 srv01st scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
# May 19 18:07:52 srv01st scsi: [ID 365881 kern.info] /p...@0,0/pci8086,2...@1/pci15d9,a...@0 (mpt0):
# May 19 18:07:52 srv01st Log info 0x31120200 received for target 5.
# May 19 18:07:52 srv01st scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
# May 19 18:07:52 srv01st scsi: [ID 365881 kern.info] /p...@0,0/pci8086,2...@1/pci15d9,a...@0 (mpt0):
# May 19 18:07:52 srv01st Log info 0x31120200 received for target 5.
# May 19 18:07:52 srv01st scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
# May 19 18:07:52 srv01st scsi: [ID 365881 kern.info] /p...@0,0/pci8086,2...@1/pci15d9,a...@0 (mpt0):
# May 19 18:08:33 srv01st fmd: [ID 441519 daemon.error] SUNW-MSG-ID: ZFS-8000-FD, TYPE: Fault, VER: 1, SEVERITY: Major
# May 19 18:08:33 srv01st EVENT-TIME: Tue May 19 18:08:33 CEST 2009
# May 19 18:08:33 srv01st PLATFORM: System Product Name, CSN: System Serial Number, HOSTNAME: srv01st
# May 19 18:08:33 srv01st SOURCE: zfs-diagnosis, REV: 1.0
# May 19 18:08:33 srv01st EVENT-ID: ad3d3ce2-0110-cc51-f2e1-befc0cd3f0ba
# May 19 18:08:33 srv01st DESC: The number of I/O errors associated with a ZFS device exceeded
# May 19 18:08:33 srv01st acceptable levels. Refer to http://sun.com/msg/ZFS-8000-FD for more information.
# May 19 18:08:33 srv01st AUTO-RESPONSE: The device has been offlined and marked as faulted. An attempt
# May 19 18:08:33 srv01st will be made to activate a hot spare if available.
# May 19 18:08:33 srv01st IMPACT: Fault tolerance of the pool may be compromised.
# May 19 18:08:33 srv01st REC-ACTION: Run 'zpool status -x' and replace the bad device.
# May 19 18:52:08 srv01st genunix: [ID 540533 kern.notice] ^MSunOS Release 5.11 Version snv_101b 64-bit
# May 19 18:52:08 srv01st genunix: [ID 172908 kern.notice] Copyright 1983-2008 Sun Microsystems, Inc. All rights reserved.

I'm also getting a lot of these:

# May 19 18:52:17 srv01st unix: [ID 954099 kern.info] NOTICE: IRQ19 is being shared by drivers with different interrupt levels.
# May 19 18:52:17 srv01st This may result in reduced system performance.

I've got two virtual win2k3 machines up for testing. They access the network via a dedicated 100 Mbit NIC, which I am now using as the primary connection. With that, I managed to transfer the rest of the batch without any further problems. The NIC that dropped out is called skge0.

Help is very much appreciated!
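P.S. Since the mpt messages repeat so much, here is a rough sketch of how the log could be tallied to see which IOCLogInfo codes and which targets show up most often. It only matches the two message shapes visible in the excerpt above ("mpt_handle_event[_sync]: IOCStatus=..., IOCLogInfo=..." and "Log info ... received for target N."). The function names `summarize` and `summarize_file` are my own, and the default path /var/adm/messages is an assumption about where the log lives on this box.

```python
import re
from collections import Counter

# Matches e.g. "mpt_handle_event_sync: IOCStatus=0x8000, IOCLogInfo=0x31120200"
EVENT_RE = re.compile(
    r"mpt_handle_event(?:_sync)?: IOCStatus=(0x[0-9a-fA-F]+), IOCLogInfo=(0x[0-9a-fA-F]+)"
)
# Matches e.g. "Log info 0x31120200 received for target 5."
TARGET_RE = re.compile(r"Log info (0x[0-9a-fA-F]+) received for target (\d+)")


def summarize(lines):
    """Count mpt events by (IOCStatus, IOCLogInfo) and log-info hits by (target, code)."""
    events = Counter()
    targets = Counter()
    for line in lines:
        m = EVENT_RE.search(line)
        if m:
            events[(m.group(1), m.group(2))] += 1
            continue
        m = TARGET_RE.search(line)
        if m:
            targets[(int(m.group(2)), m.group(1))] += 1
    return events, targets


def summarize_file(path="/var/adm/messages"):
    """Convenience wrapper; the default path is an assumption, adjust as needed."""
    with open(path, errors="replace") as fh:
        return summarize(fh)


# Example use (not run automatically):
#   events, targets = summarize_file()
#   for (status, code), n in events.most_common():
#       print(status, code, n)
#   for (target, code), n in targets.most_common():
#       print("target", target, code, n)
```

This only counts messages; it does not interpret the codes, so it would at most show whether the errors are confined to target 5 or spread across several devices.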
--
This message posted from opensolaris.org
