------- Comment From sudeeshj...@in.ibm.com 2017-12-01 03:46 EDT-------
(In reply to comment #16)
> The trace reported is no more seen ; But I see some other trace in dmesg;
>
> root@ltc84-pkvm1:~# echo 10000 > /sys/kernel/debug/powerpc/eeh_max_freezes
> root@ltc84-pkvm1:~# echo 1 > /sys/class/cxl/card0/perst_reloads_same_image
> root@ltc84-pkvm1:~#
> root@ltc84-pkvm1:~#
> root@ltc84-pkvm1:~# lspci | grep acc
> 0001:01:00.0 Processing accelerators: IBM Device 0477 (rev 01)
> 0002:00:00.0 Processing accelerators: IBM Device 4350 (rev 0a)
> root@ltc84-pkvm1:~# echo 0x8000000000000000 > /sys/kernel/debug/powerpc/PCI0001/err_injct_outbound
> root@ltc84-pkvm1:~#
> root@ltc84-pkvm1:~# uname -a
> Linux ltc84-pkvm1 4.10.0-40-generic #44-Ubuntu SMP Thu Nov 9 14:48:23 UTC 2017 ppc64le ppc64le ppc64le GNU/Linux
> root@ltc84-pkvm1:~#
>
> root@ltc84-pkvm1:~# dmesg
> <snip>
> [ 123.426172] ip6_tables: (C) 2000-2006 Netfilter Core Team
> [ 123.573736] Ebtables v2.0 registered
> [ 123.964678] virbr0: port 1(virbr0-nic) entered blocking state
> [ 123.964682] virbr0: port 1(virbr0-nic) entered disabled state
> [ 123.964870] device virbr0-nic entered promiscuous mode
> [ 124.298173] virbr0: port 1(virbr0-nic) entered blocking state
> [ 124.298176] virbr0: port 1(virbr0-nic) entered listening state
> [ 124.372069] virbr0: port 1(virbr0-nic) entered disabled state
> [ 171.671205] Harmless Hypervisor Maintenance interrupt [Recovered]
> [ 171.671211] Error detail: Unknown
> [ 171.671214] HMER: 8040000000000000
> [ 171.671218] Harmless Hypervisor Maintenance interrupt [Recovered]
> [ 171.671220] Error detail: Unknown
> [ 171.671223] HMER: 8040000000000000
> [ 171.671382] EEH: Fenced PHB#1 detected, location: N/A
> [ 171.672512] EEH: This PCI device has failed 1 times in the last hour
> [ 171.672513] EEH: Notify device drivers to shutdown
> [ 171.672522] cxl afu0.0: Deactivating AFU directed mode
> [ 171.672660] cxl afu0.0: PSL Purge called with link down, ignoring
> [ 171.673304] EEH: Collect temporary log
> [ 171.673306] PHB3 PHB#1 Diag-data (Version: 1)
> [ 171.673307] brdgCtl: 0000ffff
> [ 171.673309] UtlSts: 00200000 00000000 00000000
> [ 171.673311] RootSts: ffffffff ffffffff ffffffff ffffffff 0000ffff
> [ 171.673312] RootErrSts: ffffffff ffffffff ffffffff
> [ 171.673313] RootErrLog: ffffffff ffffffff ffffffff ffffffff
> [ 171.673314] RootErrLog1: ffffffff 0000000000000000 0000000000000000
> [ 171.673316] nFir: 0000809000000000 0030006e00000000 0000800000000000
> [ 171.673317] PhbSts: 0000001800000000 0000001800000000
> [ 171.673318] Lem: 8000020000800000 40018e2400022482 8000000000000000
> [ 171.673320] OutErr: 8000002000000000 8000000000000000 1210026000020003 0000400000000000
> [ 171.673321] InBErr: 0000000040000000 0000000040000000 0000080000000000 000c104010010000
> [ 171.673323] EEH: Reset without hotplug activity
> [ 176.174078] EEH: Notify device drivers the completion of reset
> [ 176.174089] cxl-pci 0001:01:00.0: enabling device (0140 -> 0142)
> [ 176.174404] pci 0001:01 : [PE# 00] Switching PHB to CXL
> [ 176.174505] pci 0001:01 : [PE# 00] Switching PHB to CXL
> [ 176.186032] Adapter context unlocked with 0 active contexts
> [ 176.186109] ------------[ cut here ]------------
> [ 176.186120] WARNING: CPU: 10 PID: 971 at /build/linux-W3h2EL/linux-4.10.0/drivers/misc/cxl/main.c:317 cxl_adapter_context_unlock+0x68/0x90 [cxl]
> [ 176.186121] Modules linked in: xt_conntrack ipt_REJECT nf_reject_ipv4 ebtable_filter ebtables ip6table_filter ip6_tables xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack xt_tcpudp bridge stp llc iptable_filter kvm_hv kvm joydev input_leds mac_hid at24 nvmem_core ofpart ipmi_powernv ipmi_devintf ipmi_msghandler cmdlinepart powernv_flash uio_pdrv_genirq uio mtd powernv_rng vmx_crypto opal_prd ibmpowernv nfsd auth_rpcgss nfs_acl lockd grace ib_iser rdma_cm iw_cm ib_cm sunrpc ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic
> [ 176.186175] usbhid hid uas usb_storage nouveau crc32c_vpmsum i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops tg3 cxl drm ahci libahci pnv_php
> [ 176.186191] CPU: 10 PID: 971 Comm: eehd Not tainted 4.10.0-40-generic #44-Ubuntu
> [ 176.186193] task: c0000027d32c5800 task.stack: c0000027d3348000
> [ 176.186194] NIP: d000000038db0c50 LR: d000000038db0c4c CTR: c0000000006119a0
> [ 176.186196] REGS: c0000027d334b6a0 TRAP: 0700 Not tainted (4.10.0-40-generic)
> [ 176.186197] MSR: 900000000282b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>
> [ 176.186205] CR: 28008282 XER: 20000000
> [ 176.186206] CFAR: c000000000b60b34 SOFTE: 1
> GPR00: d000000038db0c4c c0000027d334b920 d000000038de5c40 000000000000002f
> GPR04: 0000000000000001 0000000000000464 0000000063206576 c0000000015fc900
> GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000001
> GPR12: 0000000000008800 c00000000fb85a00 c0000000001103b8 c0000063b13d2000
> GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> GPR20: 0000000000000000 0000000000000000 0000000000000000 c000000000d64d78
> GPR24: c000000000d64d50 c0000000014c4330 c0000063aecebc00 c0000063a74fdfc0
> GPR28: c0000063b1347098 0000000000000000 c0000063aecebc00 0000000000000006
> [ 176.186237] NIP [d000000038db0c50] cxl_adapter_context_unlock+0x68/0x90 [cxl]
> [ 176.186243] LR [d000000038db0c4c] cxl_adapter_context_unlock+0x64/0x90 [cxl]
> [ 176.186244] Call Trace:
> [ 176.186251] [c0000027d334b920] [d000000038db0c4c] cxl_adapter_context_unlock+0x64/0x90 [cxl] (unreliable)
> [ 176.186260] [c0000027d334b980] [d000000038dc1634] cxl_configure_adapter+0xa6c/0xab0 [cxl]
> [ 176.186268] [c0000027d334ba30] [d000000038dc16d0] cxl_pci_slot_reset+0x58/0x250 [cxl]
> [ 176.186272] [c0000027d334bae0] [c00000000003bb14] eeh_report_reset+0x154/0x190
> [ 176.186276] [c0000027d334bb20] [c000000000039e68] eeh_pe_dev_traverse+0x98/0x170
> [ 176.186279] [c0000027d334bbb0] [c00000000003c25c] eeh_handle_normal_event+0x3ec/0x540
> [ 176.186281] [c0000027d334bc60] [c00000000003c614] eeh_handle_event+0x174/0x350
> [ 176.186284] [c0000027d334bd10] [c00000000003c9d8] eeh_event_handler+0x1e8/0x1f0
> [ 176.186287] [c0000027d334bdc0] [c000000000110514] kthread+0x164/0x1b0
> [ 176.186291] [c0000027d334be30] [c00000000000b4e8] ret_from_kernel_thread+0x5c/0x74
> [ 176.186292] Instruction dump:
> [ 176.186294] 2f84ffff 4d9e0020 7c0802a6 f8010010 f821ffa1 39200000 7c8407b4 912303d0
> [ 176.186300] 3c620000 e8638070 48021941 e8410018 <0fe00000> 38210060 e8010010 7c0803a6
> [ 176.186307] ---[ end trace 8780457e503acb38 ]---
> [ 176.186350] cxl afu0.0: Activating AFU directed mode
> [ 176.186481] EEH: Notify device driver to resume
> root@ltc84-pkvm1:~#

Please ignore this comment.

--
https://bugs.launchpad.net/bugs/1694485

Title:
  Ubuntu17.04: CAPI: call trace seen while error injection to the CAPI card.