On Fri, Mar 6, 2015 at 10:48 AM, Robert Mustacchi <[email protected]> wrote:

> On 3/6/15 8:43 , Schweiss, Chip via illumos-discuss wrote:
> > I have two fairly new Haswell-based servers running OmniOS.  Both systems
> > have logged several faults that I don't understand and don't know what to
> > do about.
> >
> > I am not seeing any issues related to these faults.
> >
> > Can anyone clarify what they are and what to do about them?
>
> We've received error reports that the system doesn't understand how to
> diagnose. Here, getting the actual ereports that were generated on the
> system and looking at them will shed more light on the problem and will
> allow us to better understand what's happening on the systems.
>
>
I'm not familiar with ereports.  After some googling, I'm assuming you mean
the output from 'fmdump -eV'.

Here are the reports that correspond to the first event.  If this is what you
were asking for, I'll dig out the rest of them.

Feb 27 2015 18:11:17.068478684 ereport.io.pci.fabric
nvlist version: 0
        class = ereport.io.pci.fabric
        ena = 0xe97c1b9f5a501401
        detector = (embedded nvlist)
        nvlist version: 0
                version = 0x0
                scheme = dev
                device-path = /pci@0,0/pci8086,2f06@2,2
        (end detector)

        bdf = 0x12
        device_id = 0x2f06
        vendor_id = 0x8086
        rev_id = 0x2
        dev_type = 0x40
        pcie_off = 0x90
        pcix_off = 0x0
        aer_off = 0x148
        ecc_ver = 0x0
        pci_status = 0x10
        pci_command = 0x47
        pci_bdg_sec_status = 0x2000
        pci_bdg_ctrl = 0x3
        pcie_status = 0x0
        pcie_command = 0x27
        pcie_dev_cap = 0x8001
        pcie_adv_ctl = 0x0
        pcie_ue_status = 0x0
        pcie_ue_mask = 0x100000
        pcie_ue_sev = 0x62030
        pcie_ue_hdr0 = 0x0
        pcie_ue_hdr1 = 0x0
        pcie_ue_hdr2 = 0x0
        pcie_ue_hdr3 = 0x0
        pcie_ce_status = 0x0
        pcie_ce_mask = 0x0
        pcie_rp_status = 0x0
        pcie_rp_control = 0x0
        pcie_adv_rp_status = 0x1
        pcie_adv_rp_command = 0x7
        pcie_adv_rp_ce_src_id = 0x600
        pcie_adv_rp_ue_src_id = 0x0
        remainder = 0x3
        severity = 0x1
        __ttl = 0x1
        __tod = 0x54f107a5 0x414e6dc
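
A note on reading the record above: the __tod pair looks like seconds and
nanoseconds since the Unix epoch, and decoding it that way reproduces the
timestamp in the record header.  A minimal Python sketch, where decode_tod
is a name made up for illustration and not part of any FMA tool:

```python
from datetime import datetime, timezone


def decode_tod(sec_hex, nsec_hex):
    """Interpret a __tod pair as (UTC datetime, nanoseconds).

    Assumption: the first word is seconds since the Unix epoch and the
    second word is the nanosecond remainder.
    """
    seconds = int(sec_hex, 16)
    nanos = int(nsec_hex, 16)
    return datetime.fromtimestamp(seconds, tz=timezone.utc), nanos


when, nanos = decode_tod("0x54f107a5", "0x414e6dc")
print(when.isoformat(), nanos)
# 2015-02-28T00:11:17+00:00 68478684
```

That is six hours ahead of the 18:11:17.068478684 in the header, which is
consistent with fmdump printing local time on a host at UTC-6.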

Feb 27 2015 18:11:17.068509897 ereport.io.pci.fabric
nvlist version: 0
        class = ereport.io.pci.fabric
        ena = 0xe97c1ba6ebb01401
        detector = (embedded nvlist)
        nvlist version: 0
                version = 0x0
                scheme = dev
                device-path = /pci@0,0/pci8086,2f06@2,2/pci10b5,8724@0
        (end detector)

        bdf = 0x400
        device_id = 0x8724
        vendor_id = 0x10b5
        rev_id = 0xca
        dev_type = 0x50
        pcie_off = 0x68
        pcix_off = 0x0
        aer_off = 0xfb4
        ecc_ver = 0x0
        pci_status = 0x10
        pci_command = 0x147
        pci_bdg_sec_status = 0x0
        pci_bdg_ctrl = 0x3
        pcie_status = 0x9
        pcie_command = 0x37
        pcie_dev_cap = 0x8004
        pcie_adv_ctl = 0xbf
        pcie_ue_status = 0x100000
        pcie_ue_mask = 0x180000
        pcie_ue_sev = 0x62030
        pcie_ue_hdr0 = 0x0
        pcie_ue_hdr1 = 0x0
        pcie_ue_hdr2 = 0x0
        pcie_ue_hdr3 = 0x0
        pcie_ce_status = 0x2000
        pcie_ce_mask = 0x0
        remainder = 0x2
        severity = 0x3
        __ttl = 0x1
        __tod = 0x54f107a5 0x41560c9

Feb 27 2015 18:11:17.068526093 ereport.io.pci.fabric
nvlist version: 0
        class = ereport.io.pci.fabric
        ena = 0xe97c1baaee901401
        detector = (embedded nvlist)
        nvlist version: 0
                version = 0x0
                scheme = dev
                device-path = /pci@0,0/pci8086,2f06@2,2/pci10b5,8724@0/pci10b5,8724@1
        (end detector)

        bdf = 0x508
        device_id = 0x8724
        vendor_id = 0x10b5
        rev_id = 0xca
        dev_type = 0x60
        pcie_off = 0x68
        pcix_off = 0x0
        aer_off = 0xfb4
        ecc_ver = 0x0
        pci_status = 0x10
        pci_command = 0x147
        pci_bdg_sec_status = 0x0
        pci_bdg_ctrl = 0x3
        pcie_status = 0x0
        pcie_command = 0x37
        pcie_dev_cap = 0x8004
        pcie_adv_ctl = 0xbf
        pcie_ue_status = 0x0
        pcie_ue_mask = 0x180000
        pcie_ue_sev = 0x462030
        pcie_ue_hdr0 = 0x0
        pcie_ue_hdr1 = 0x0
        pcie_ue_hdr2 = 0x0
        pcie_ue_hdr3 = 0x0
        pcie_ce_status = 0x0
        pcie_ce_mask = 0x0
        remainder = 0x1
        severity = 0x1
        __ttl = 0x1
        __tod = 0x54f107a5 0x415a00d

Feb 27 2015 18:11:17.068541905 ereport.io.pci.fabric
nvlist version: 0
        class = ereport.io.pci.fabric
        ena = 0xe97c1baedbc01401
        detector = (embedded nvlist)
        nvlist version: 0
                version = 0x0
                scheme = dev
                device-path = /pci@0,0/pci8086,2f06@2,2/pci10b5,8724@0/pci10b5,8724@1/pci1000,3070@0
        (end detector)

        bdf = 0x600
        device_id = 0x87
        vendor_id = 0x1000
        rev_id = 0x5
        dev_type = 0x0
        pcie_off = 0x68
        pcix_off = 0x0
        aer_off = 0x100
        ecc_ver = 0x0
        pci_status = 0x10
        pci_command = 0x146
        pcie_status = 0x1
        pcie_command = 0x2037
        pcie_dev_cap = 0x10008025
        pcie_adv_ctl = 0x0
        pcie_ue_status = 0x0
        pcie_ue_mask = 0x180000
        pcie_ue_sev = 0x462031
        pcie_ue_hdr0 = 0x4000001
        pcie_ue_hdr1 = 0x122003
        pcie_ue_hdr2 = 0x6010000
        pcie_ue_hdr3 = 0xb70d8120
        pcie_ce_status = 0x1
        pcie_ce_mask = 0x0
        remainder = 0x0
        severity = 0x3
        __ttl = 0x1
        __tod = 0x54f107a5 0x415ddd1
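
For anyone else reading these, the pcie_ce_status and pcie_ue_status values
in the four fabric records above can be decoded against the PCIe Advanced
Error Reporting (AER) status registers.  The sketch below is only a reading
aid: the bit tables are a subset of the AER bits, and the decode helper is
a made-up name, not an illumos interface.

```python
# Subset of the AER Correctable Error Status register bits.
CE_BITS = {
    0: "Receiver Error",
    6: "Bad TLP",
    7: "Bad DLLP",
    8: "REPLAY_NUM Rollover",
    12: "Replay Timer Timeout",
    13: "Advisory Non-Fatal Error",
}

# Subset of the AER Uncorrectable Error Status register bits.
UE_BITS = {
    4: "Data Link Protocol Error",
    12: "Poisoned TLP",
    13: "Flow Control Protocol Error",
    14: "Completion Timeout",
    15: "Completer Abort",
    16: "Unexpected Completion",
    17: "Receiver Overflow",
    18: "Malformed TLP",
    19: "ECRC Error",
    20: "Unsupported Request Error",
}


def decode(value, names):
    """Return the names of all bits set in a 32-bit status value."""
    return [names.get(bit, f"bit {bit}")
            for bit in range(32) if (value >> bit) & 1]


print(decode(0x2000, CE_BITS))    # pcie_ce_status at pci10b5,8724@0
print(decode(0x100000, UE_BITS))  # pcie_ue_status at pci10b5,8724@0
print(decode(0x1, CE_BITS))       # pcie_ce_status at pci1000,3070@0
```

Read this way, the upstream port at pci10b5,8724@0 logged an advisory
non-fatal error alongside an unsupported request, and the pci1000,3070@0
device logged a receiver error.  That lines up with the
ereport.io.pciex.a-nonfatal and ereport.io.pciex.pl.re classes further down.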

Feb 27 2015 18:11:17.068478684 ereport.io.pciex.rc.ce-msg
nvlist version: 0
        ena = 0xe97c1b9f5a501401
        detector = (embedded nvlist)
        nvlist version: 0
                version = 0x0
                scheme = dev
                device-path = /pci@0,0/pci8086,2f06@2,2
        (end detector)

        class = ereport.io.pciex.rc.ce-msg
        rc-status = 0x1
        source-id = 0x600
        source-valid = 1
        __ttl = 0x1
        __tod = 0x54f107a5 0x414e6dc

Feb 27 2015 18:11:17.068509897 ereport.io.pciex.a-nonfatal
nvlist version: 0
        ena = 0xe97c1ba6ebb01401
        detector = (embedded nvlist)
        nvlist version: 0
                version = 0x0
                scheme = dev
                device-path = /pci@0,0/pci8086,2f06@2,2/pci10b5,8724@0
        (end detector)

        class = ereport.io.pciex.a-nonfatal
        dev-status = 0x9
        ce-status = 0x2000
        __ttl = 0x1
        __tod = 0x54f107a5 0x41560c9

Feb 27 2015 18:11:17.068509897 ereport.io.pciex.rc.ce-msg
nvlist version: 0
        ena = 0xe97c1ba6ebb01401
        detector = (embedded nvlist)
        nvlist version: 0
                version = 0x0
                scheme = dev
                device-path = /pci@0,0
        (end detector)

        class = ereport.io.pciex.rc.ce-msg
        rc-status = 0x1
        source-id = 0x400
        source-valid = 1
        __ttl = 0x1
        __tod = 0x54f107a5 0x41560c9

Feb 27 2015 18:11:17.068541905 ereport.io.pciex.pl.re
nvlist version: 0
        ena = 0xe97c1baedbc01401
        detector = (embedded nvlist)
        nvlist version: 0
                version = 0x0
                scheme = dev
                device-path = /pci@0,0/pci8086,2f06@2,2/pci10b5,8724@0/pci10b5,8724@1/pci1000,3070@0
        (end detector)

        class = ereport.io.pciex.pl.re
        dev-status = 0x1
        ce-status = 0x1
        __ttl = 0x1
        __tod = 0x54f107a5 0x415ddd1

Feb 27 2015 18:11:17.068541905 ereport.io.pciex.rc.ce-msg
nvlist version: 0
        ena = 0xe97c1baedbc01401
        detector = (embedded nvlist)
        nvlist version: 0
                version = 0x0
                scheme = dev
                device-path = /pci@0,0
        (end detector)

        class = ereport.io.pciex.rc.ce-msg
        rc-status = 0x1
        source-id = 0x600
        source-valid = 1
        __ttl = 0x1
        __tod = 0x54f107a5 0x415ddd1
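
The bdf and source-id values in these reports appear to be standard PCI
bus/device/function numbers (bus in bits 15:8, device in bits 7:3, function
in bits 2:0).  A small Python sketch to unpack them, with decode_bdf being
an illustrative name only:

```python
def decode_bdf(value):
    """Split a 16-bit PCI requester ID into (bus, device, function)."""
    bus = (value >> 8) & 0xFF
    device = (value >> 3) & 0x1F
    function = value & 0x7
    return bus, device, function


for raw in (0x12, 0x400, 0x508, 0x600):
    bus, dev, fn = decode_bdf(raw)
    print(f"{raw:#06x} -> {bus:02x}:{dev:02x}.{fn}")
# 0x0012 -> 00:02.2
# 0x0400 -> 04:00.0
# 0x0508 -> 05:01.0
# 0x0600 -> 06:00.0
```

The device and function numbers agree with the unit addresses in the
device paths (@2,2, @0, @1, @0).  By the same reading, source-id = 0x600 in
the rc.ce-msg reports points at the device with bdf = 0x600
(pci1000,3070@0), and source-id = 0x400 points at the upstream port
pci10b5,8724@0.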






> Robert
>
> > From host #1:
> >
> > --------------- ------------------------------------  -------------- ---------
> > TIME            EVENT-ID                              MSG-ID         SEVERITY
> > --------------- ------------------------------------  -------------- ---------
> > Feb 27 18:11:19 3951b062-71f1-cccc-9fea-bbdc354f2603  SUNOS-8000-J0  Major
> >
> > Host        : mir-zfs01
> > Platform    : SYS-6028U-TR4+    Chassis_id  : S16512424A07095
> > Product_sn  :
> >
> > Fault class : defect.sunos.eft.unexpected_telemetry 50%
> >               fault.sunos.eft.unexpected_telemetry 50%
> > Problem in  : dev:////pci@0,0
> >                   faulted and taken out of service
> >
> > Description : The diagnosis engine encountered telemetry from the listed
> >               devices for which it was unable to perform a diagnosis -
> >               Refer to http://illumos.org/msg/SUNOS-8000-J0 for more
> >               information.  Refer to http://illumos.org/msg/SUNOS-8000-J0
> >               for more information.
> >
> > Response    : Error reports have been logged for examination by Sun.
> >
> > Impact      : Automated diagnosis and response for these events will not occur.
> >
> > Action      : Ensure that the latest Solaris Kernel and Predictive Self-Healing
> >               (PSH) patches are installed.
> >
> > --------------- ------------------------------------  -------------- ---------
> > TIME            EVENT-ID                              MSG-ID         SEVERITY
> > --------------- ------------------------------------  -------------- ---------
> > Jan 15 21:53:07 2cb9f0e0-dd7f-c912-dd22-bbaa7a4ebf6c  SUNOS-8000-J0  Major
> >
> > Host        : mir-zfs01
> > Platform    : SYS-6028U-TR4+    Chassis_id  : S16512424A07095
> > Product_sn  :
> >
> > Fault class : defect.sunos.eft.unexpected_telemetry max 25%
> >               fault.sunos.eft.unexpected_telemetry max 25%
> > Affects     : cpu:///cpuid=6
> >               cpu:///cpuid=16
> >                   faulted but still in service
> > FRU         : hc://:product-id=SYS-6028U-TR4+:server-id=mir-zfs01:chassis-id=S16512424A07095/motherboard=0/chip=0 25%
> >               hc://:product-id=SYS-6028U-TR4+:server-id=mir-zfs01:chassis-id=S16512424A07095/motherboard=0/chip=1 25%
> >                   faulty
> >
> > Description : The diagnosis engine encountered telemetry from the listed
> >               devices for which it was unable to perform a diagnosis -
> >               Refer to http://illumos.org/msg/SUNOS-8000-J0 for more
> >               information.  Refer to http://illumos.org/msg/SUNOS-8000-J0
> >               for more information.
> >
> > Response    : Error reports have been logged for examination by Sun.
> >
> > Impact      : Automated diagnosis and response for these events will not occur.
> >
> > Action      : Ensure that the latest Solaris Kernel and Predictive Self-Healing
> >               (PSH) patches are installed.
> >
> >
> > From host #2:
> >
> > --------------- ------------------------------------  -------------- ---------
> > TIME            EVENT-ID                              MSG-ID         SEVERITY
> > --------------- ------------------------------------  -------------- ---------
> > Jan 31 12:45:54 0efc914b-7cc5-c4df-fd11-9be172d4931a  SUNOS-8000-J0  Major
> >
> > Host        : mir-zfs02
> > Platform    : SYS-6028U-TR4+    Chassis_id  : S16512424A07109
> > Product_sn  :
> >
> > Fault class : defect.sunos.eft.unexpected_telemetry 50%
> >               fault.sunos.eft.unexpected_telemetry 50%
> > Problem in  : dev:////pci@74,0
> >                   faulted and taken out of service
> >
> > Description : The diagnosis engine encountered telemetry from the listed
> >               devices for which it was unable to perform a diagnosis -
> >               Refer to http://illumos.org/msg/SUNOS-8000-J0 for more
> >               information.  Refer to http://illumos.org/msg/SUNOS-8000-J0
> >               for more information.
> >
> > Response    : Error reports have been logged for examination by Sun.
> >
> > Impact      : Automated diagnosis and response for these events will not occur.
> >
> > Action      : Ensure that the latest Solaris Kernel and Predictive Self-Healing
> >               (PSH) patches are installed.
> > --------------- ------------------------------------  -------------- ---------
> > TIME            EVENT-ID                              MSG-ID         SEVERITY
> > --------------- ------------------------------------  -------------- ---------
> > Dec 04 15:22:09 6020baed-5ab6-cdb0-95c0-ed3f9fde1172  SUNOS-8000-J0  Major
> >
> > Host        : mir-zfs02
> > Platform    : SYS-6028U-TR4+    Chassis_id  : S16512424A07109
> > Product_sn  :
> >
> > Fault class : fault.sunos.eft.unexpected_telemetry max 25%
> >               defect.sunos.eft.unexpected_telemetry max 25%
> > Affects     : cpu:///cpuid=41
> >                   ok and in service
> >               cpu:///cpuid=26
> >                   faulted but still in service
> > FRU         : hc://:product-id=SYS-6028U-TR4+:server-id=mir-zfs02:chassis-id=S16512424A07109/motherboard=0/chip=1 25%
> >                   acquitted
> >               hc://:product-id=SYS-6028U-TR4+:server-id=mir-zfs02:chassis-id=S16512424A07109/motherboard=0/chip=0 25%
> >                   faulty
> >
> > Description : The diagnosis engine encountered telemetry from the listed
> >               devices for which it was unable to perform a diagnosis -
> >               Refer to http://illumos.org/msg/SUNOS-8000-J0 for more
> >               information.  Refer to http://illumos.org/msg/SUNOS-8000-J0
> >               for more information.
> >
> > Response    : Error reports have been logged for examination by Sun.
> >
> > Impact      : Automated diagnosis and response for these events will not occur.
> >
> > Action      : Ensure that the latest Solaris Kernel and Predictive Self-Healing
> >               (PSH) patches are installed.
> >
> > --------------- ------------------------------------  -------------- ---------
> > TIME            EVENT-ID                              MSG-ID         SEVERITY
> > --------------- ------------------------------------  -------------- ---------
> > Dec 04 18:55:38 eadd4984-7c7a-490b-f6e1-b0f936b09ab7  SUNOS-8000-J0  Major
> >
> > Host        : mir-zfs02
> > Platform    : SYS-6028U-TR4+    Chassis_id  : S16512424A07109
> > Product_sn  :
> >
> > Fault class : fault.sunos.eft.unexpected_telemetry max 25%
> >               defect.sunos.eft.unexpected_telemetry max 25%
> > Affects     : cpu:///cpuid=6
> >               cpu:///cpuid=18
> >                   faulted but still in service
> > FRU         : hc://:product-id=SYS-6028U-TR4+:server-id=mir-zfs02:chassis-id=S16512424A07109/motherboard=0/chip=0 25%
> >               hc://:product-id=SYS-6028U-TR4+:server-id=mir-zfs02:chassis-id=S16512424A07109/motherboard=0/chip=1 25%
> >                   faulty
> >
> > Description : The diagnosis engine encountered telemetry from the listed
> >               devices for which it was unable to perform a diagnosis -
> >               Refer to http://illumos.org/msg/SUNOS-8000-J0 for more
> >               information.  Refer to http://illumos.org/msg/SUNOS-8000-J0
> >               for more information.
> >
> > Response    : Error reports have been logged for examination by Sun.
> >
> > Impact      : Automated diagnosis and response for these events will not occur.
> >
> > Action      : Ensure that the latest Solaris Kernel and Predictive Self-Healing
> >               (PSH) patches are installed.
> >
> >
> >
> > -------------------------------------------
> > illumos-discuss
> > Archives: https://www.listbox.com/member/archive/182180/=now
> > RSS Feed: https://www.listbox.com/member/archive/rss/182180/21175748-6cf9d6b5
> > Modify Your Subscription: https://www.listbox.com/member/?&;
> > Powered by Listbox: http://www.listbox.com
> >
>
>


