Ragnar Sundblad wrote:
Hello,
I wonder if anyone could help me with a PCIe problem.
I have an X4150 running snv_134. It was shipped with a "STK RAID INT"
Adaptec/Intel/StorageTek/Sun SAS HBA. The machine also has an
LSI SAS card in another slot, though I don't know if that is
significant in any way.
It might help troubleshooting.
You can try putting the disks behind the LSI SAS HBA and see if you
still get errors. That will at least tell you whether the two errors are
manifestations of the same problem or separate issues.
You might still have issues with the fabric. You can then remove the
HBA that is throwing errors (the STK RAID) and put the LSI SAS HBA in
the slot the STK RAID occupied earlier, and check the behaviour.
That may point at the culprit. If the fabric errors continue no matter
which card sits in the suspect slot (if it is the slot at all), it is
more likely that the issue is with the fabric itself. A quick tally of
ereports per detector path, like the sketch below, makes the before/after
comparison easier.
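To be clear, this is just my own quick sketch, not official FMA tooling;
it assumes fmdump(1M) is in the path and a stock /usr/bin/python, and it
simply counts ereport.io.pci.fabric events per detector device-path so you
can see whether the errors follow the card or stay with the slot:

#!/usr/bin/python
# Count ereport.io.pci.fabric events per detector device-path.
# Run once before and once after swapping the cards/slots, then compare
# which path the errors are charged to.
import re
import subprocess

# fmdump -e reads the FMA error log; -c filters on the ereport class.
p = subprocess.Popen(["fmdump", "-eV", "-c", "ereport.io.pci.fabric"],
                     stdout=subprocess.PIPE)
out = p.communicate()[0].decode("utf-8", "replace")

counts = {}
for line in out.splitlines():
    m = re.search(r"device-path\s*=\s*(\S+)", line)
    if m:
        path = m.group(1)
        counts[path] = counts.get(path, 0) + 1

for path in sorted(counts, key=counts.get, reverse=True):
    print("%6d  %s" % (counts[path], path))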
Pavan
It logs some errors, as shown with "fmdump -e(V)" below.
Most often it is a PCI bridge error (I think), about five to ten
times an hour. Occasionally there is also a problem with accessing a
mode page on the disks behind the STK RAID controller, used for
enabling/disabling the disks' write caches; that gives one error for
each disk, about every three hours. I don't believe the two have to be
related.
I am especially interested in understanding the ereport.io.pci.fabric
report.
I haven't seen this problem on other more or less identical
machines running sol10.
Is this a known software problem, or do I have faulty hardware?
Thanks!
/ragge
--------------
% fmdump -e
...
Apr 04 01:21:53.2244 ereport.io.pci.fabric
Apr 04 01:30:00.6999 ereport.io.pci.fabric
Apr 04 01:30:23.4647 ereport.io.scsi.cmd.disk.dev.uderr
Apr 04 01:30:23.4651 ereport.io.scsi.cmd.disk.dev.uderr
...
% fmdump -eV
Apr 04 2010 01:21:53.224492765 ereport.io.pci.fabric
nvlist version: 0
class = ereport.io.pci.fabric
ena = 0xd6a00a43be800c01
detector = (embedded nvlist)
nvlist version: 0
version = 0x0
scheme = dev
device-path = /p...@0,0/pci8086,2...@4
(end detector)
bdf = 0x20
device_id = 0x25f8
vendor_id = 0x8086
rev_id = 0xb1
dev_type = 0x40
pcie_off = 0x6c
pcix_off = 0x0
aer_off = 0x100
ecc_ver = 0x0
pci_status = 0x10
pci_command = 0x147
pci_bdg_sec_status = 0x0
pci_bdg_ctrl = 0x3
pcie_status = 0x0
pcie_command = 0x2027
pcie_dev_cap = 0xfc1
pcie_adv_ctl = 0x0
pcie_ue_status = 0x0
pcie_ue_mask = 0x100000
pcie_ue_sev = 0x62031
pcie_ue_hdr0 = 0x0
pcie_ue_hdr1 = 0x0
pcie_ue_hdr2 = 0x0
pcie_ue_hdr3 = 0x0
pcie_ce_status = 0x0
pcie_ce_mask = 0x0
pcie_rp_status = 0x0
pcie_rp_control = 0x7
pcie_adv_rp_status = 0x0
pcie_adv_rp_command = 0x7
pcie_adv_rp_ce_src_id = 0x0
pcie_adv_rp_ue_src_id = 0x0
remainder = 0x0
severity = 0x1
__ttl = 0x1
__tod = 0x4bb7cd91 0xd617cdd
...
Apr 04 2010 01:30:23.464768275 ereport.io.scsi.cmd.disk.dev.uderr
nvlist version: 0
class = ereport.io.scsi.cmd.disk.dev.uderr
ena = 0xde0cd54f84201c01
detector = (embedded nvlist)
nvlist version: 0
version = 0x0
scheme = dev
device-path = /p...@0,0/pci8086,2...@4/pci108e,2...@0/d...@5,0
devid = id1,s...@tsun_____stk_raid_int____ea4b6f24
(end detector)
driver-assessment = fail
op-code = 0x1a
cdb = 0x1a 0x0 0x8 0x0 0x18 0x0
pkt-reason = 0x0
pkt-state = 0x1f
pkt-stats = 0x0
stat-code = 0x0
un-decode-info = sd_get_write_cache_enabled: Mode Sense caching page code mismatch 0
un-decode-value =
__ttl = 0x1
__tod = 0x4bb7cf8f 0x1bb3cd13
...
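A note on reading the fabric ereport, since that is what you asked about:
the detector is the device whose config-space registers were captured, here
a root port (dev_type = 0x40 is the root-port value, if I am reading the
pcie.h defines right), and the pcie_ue_*/pcie_ce_* fields are the AER
registers read from it. In the event above, pcie_ue_status and
pcie_ce_status are both 0x0, so this port is not reporting an AER error of
its own; pci_status = 0x10 is just the capabilities-list bit. Here is a tiny
decoder for the bitmask fields, assuming the standard PCIe AER bit layout
(a reading aid only, not official FMA tooling):

#!/usr/bin/python
# Decode the AER-style bitmask fields of an ereport.io.pci.fabric event.
# Bit positions follow the PCIe Advanced Error Reporting Uncorrectable
# Error Status/Mask/Severity register layout.
UE_BITS = {
    0:  "Undefined (legacy Training Error)",
    4:  "Data Link Protocol Error",
    5:  "Surprise Down Error",
    12: "Poisoned TLP",
    13: "Flow Control Protocol Error",
    14: "Completion Timeout",
    15: "Completer Abort",
    16: "Unexpected Completion",
    17: "Receiver Overflow",
    18: "Malformed TLP",
    19: "ECRC Error",
    20: "Unsupported Request",
}

def decode(value, bits=UE_BITS):
    return [name for bit, name in sorted(bits.items()) if value & (1 << bit)]

# Values from the ereport above:
print("pcie_ue_status 0x0      -> %s" % decode(0x0))       # nothing latched on this port
print("pcie_ue_mask   0x100000 -> %s" % decode(0x100000))  # Unsupported Request is masked
print("pcie_ue_sev    0x62031  -> %s" % decode(0x62031))   # which UE types are rated fatal

The uderr events look like a separate issue: the CDB 0x1a 0x0 0x8 0x0 0x18 0x0
is a MODE SENSE(6) for the caching mode page (0x08), which sd issues to learn
the write-cache setting, and "Mode Sense caching page code mismatch 0" suggests
the STK RAID firmware answered with page code 0 instead of 0x08 for its
virtual disks; annoying, but not obviously connected to the fabric ereports.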
_______________________________________________
driver-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/driver-discuss