[Kernel-packages] [Bug 1630304] [NEW] Ubuntu 16.10 KVM: Issue doing hotplug detach to SRIOV VF

2016-10-04 Thread bugproxy
Public bug reported:

---Problem Description---
I cannot get hotplug attach to work in Ubuntu, but if I try to detach a CX4 VF
from a guest I run into some issues, as in this case:
[  474.393308] vfio-pci 0001:01:00.3: No device request channel registered, blocked until released by user
[  474.393543] pci 0001:01: 0.3: [PE# 006] Removing DMA window #0
[  474.393553] pci 0001:01: 0.3: [PE# 006] Removing DMA window #1
[  474.393906] mlx5_core 0001:01:00.3: enabling device ( -> 0002)
[  474.393939] mlx5_core 0001:01:00.3: Using 32-bit DMA via iommu
[  474.400360] pci 0001:01: 0.3: [PE# 006] Setting up window#0 0..7fff pg=1000
[  474.400380] mlx5_core 0001:01:00.3: firmware version: 12.17.226
[  474.401341] pci 0001:01: 0.3: [PE# 006] Enabling 64-bit DMA bypass
[  474.402284] EEH: Frozen PE#6 on PHB#1 detected
[  474.402475] EEH: PE location: Slot4, PHB location: N/A
[  474.403699] EEH: This PCI device has failed 1 times in the last hour
[  474.403700] EEH: Notify device drivers to shutdown
[  474.403707] mlx5_core 0001:01:00.3: mlx5_pci_err_detected was called
[  474.403711] mlx5_core 0001:01:00.3: 0001:01:00.3:mlx5_enter_error_state:115:(pid 779): start
[  474.403870] mlx5_core 0001:01:00.3: 0001:01:00.3:mlx5_enter_error_state:120:(pid 779): end


One time I saw:
Sep 13 09:41:32 ltc-fire1 kernel: [70437.943722] vfio-pci 0001:01:00.3: No 
device request channel registered, blocked until released by user
Sep 13 09:41:32 ltc-fire1 kernel: [70437.944076] mlx5_core 0001:01:00.3: 
enabling device ( -> 0002)
Sep 13 09:41:32 ltc-fire1 kernel: [70437.944110] mlx5_core 0001:01:00.3: Using 
32-bit DMA via iommu
Sep 13 09:41:32 ltc-fire1 kernel: [70437.944145] pci 0001:01: 0.3: [PE# 006] 
Removing DMA window #0
Sep 13 09:41:32 ltc-fire1 kernel: [70437.944152] pci 0001:01: 0.3: [PE# 006] 
Removing DMA window #1
Sep 13 09:41:32 ltc-fire1 kernel: [70437.944195] mlx5_core 0001:01:00.3: 
firmware version: 12.17.226
Sep 13 09:41:32 ltc-fire1 kernel: [70437.944260] Unable to handle kernel paging 
request for data at address 0x
Sep 13 09:41:32 ltc-fire1 kernel: [70437.944533] Faulting instruction address: 
0xc05b37e0
Sep 13 09:41:32 ltc-fire1 kernel: [70437.944592] Oops: Kernel access of bad 
area, sig: 11 [#1]
Sep 13 09:41:32 ltc-fire1 kernel: [70437.944636] SMP NR_CPUS=2048 NUMA PowerNV
Sep 13 09:41:32 ltc-fire1 kernel: [70437.944851] Modules linked in: vfio_pci 
irqbypass vfio_iommu_spapr_tce vfio_virqfd vfio vfio_spapr_eeh xt_CHECKSUM 
iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 
nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT 
nf_reject_ipv4 xt_tcpudp kvm_hv kvm_pr kvm ebtable_filter ebtables 
ip6table_filter ip6_tables iptable_filter rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) 
iw_cm(OE) configfs ib_ipoib(OE) ib_cm(OE) ib_uverbs(OE) ib_umad(OE) mlx5_ib(OE) 
mlx5_core(OE) mlx4_ib(OE) ib_sa(OE) ib_mad(OE) ib_core(OE) mlx4_en(OE) 
ib_addr(OE) ib_netlink(OE) mlx4_core(OE) mlx_compat(OE) bridge stp llc joydev 
input_leds mac_hid ofpart at24 cmdlinepart powernv_flash ipmi_powernv 
nvmem_core uio_pdrv_genirq opal_prd mtd ipmi_msghandler uio ibmpowernv 
powernv_rng binfmt_misc dm_multipath knem(OE) ip_tables x_tables autofs4 
hid_generic usbhid hid uas usb_storage ast i2c_algo_bit ttm drm_kms_helper 
syscopyarea sysfillrect sysimgblt fb_sys_fops drm ahci devlink libahci
[last unloaded: mlx4_core]
Sep 13 09:41:32 ltc-fire1 kernel: [70437.946007] CPU: 40 PID: 12501 Comm: 
libvirtd Tainted: G   OE   4.7.0unofficial #5
Sep 13 09:41:32 ltc-fire1 kernel: [70437.946074] task: c00ec319a200 ti: 
c00ec324c000 task.ti: c00ec324c000
Sep 13 09:41:32 ltc-fire1 kernel: [70437.946140] NIP: c05b37e0 LR: 
c05ad070 CTR: 
Sep 13 09:41:32 ltc-fire1 kernel: [70437.946208] REGS: c00ec324f100 TRAP: 
0300   Tainted: G   OE(4.7.0unofficial)
Sep 13 09:41:32 ltc-fire1 kernel: [70437.946286] MSR: 90010280b033 
  CR: 84028844  XER: 2000
Sep 13 09:41:32 ltc-fire1 kernel: [70437.946533] CFAR: c0008468 DAR: 
 DSISR: 4000 SOFTE: 0
Sep 13 09:41:32 ltc-fire1 kernel: [70437.946533] GPR00: c05d19c8 
c00ec324f380 c13bef00 
Sep 13 09:41:32 ltc-fire1 kernel: [70437.946533] GPR04:  
  
Sep 13 09:41:32 ltc-fire1 kernel: [70437.946533] GPR08:  
  f3803140
Sep 13 09:41:32 ltc-fire1 kernel: [70437.946533] GPR12: 24048840 
cfb96800 c00ec0def080 c00ec0def000
Sep 13 09:41:32 ltc-fire1 kernel: [70437.946533] GPR16: 3fff93ef 
0001  c00ec0def100
Sep 13 09:41:32 ltc-fire1 kernel: [70437.946533] GPR20: c00ec0def118 
 0001 
Sep 13 09:41:32 ltc-fire1 kernel: [70437.946533] GPR24: 

[Kernel-packages] [Bug 1630304] [NEW] Ubuntu 16.10 KVM: Issue doing hotplug detach to SRIOV VF

2016-10-04 Thread Launchpad Bug Tracker
You have been subscribed to a public bug:
