Hi Joseph, thanks for your answer.

Currently I have two different hardware configurations:
- HPE ProLiant m710p Server Cartridge (does not have this problem)
- HPE ProLiant m710x Server Cartridge (has this problem)

> Did this issue start happening after an update/upgrade?
> Was there a prior kernel version where you were not having this particular
> problem?

Well, I use a debootstrap script to install all the needed software
automatically and build an image with the base system. I then use this
image to boot my nodes via PXE, so on every boot the system is installed
from scratch.

I've tested the following kernels:
- Ubuntu 16.04 with stock kernel: 4.4.0-116-generic
- Ubuntu 16.04 with hwe kernel: 4.13.0-36-generic
- Ubuntu 16.04 with pve kernel: 4.13.13-6-pve
- Debian 9 with pve kernel: 4.13.13-6-pve
- Debian 9 with stock kernel: 4.9.0-6-amd64

All of them have this problem, although the stock kernels may only crash
after some time. (The only case where I did not see this error was Debian
with 4.9.0-6-amd64, but I presume it exists there too, because I have not
tested it properly.)

Another thing: if I run these steps AFTER the system has booted:

  rmmod mlx4_en mlx4_ib mlx4_core
  modprobe mlx4_core num_vfs=1 port_type_array=2,2 probe_vf=1
  systemctl restart networking

everything starts working fine.

> Please test the latest v4.16 kernel[0].

OK, I'll do this.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1755268

Title:
  Kernel panic when using KVM and Mellanox OFED driver (bonding and
  sriov enabled)

Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Xenial:
  Incomplete
Status in linux source package in Artful:
  Incomplete

Bug description:
  ##### System information #####

  # uname -a
  Linux m5c37 4.13.0-36-generic #40~16.04.1-Ubuntu SMP Fri Feb 16 23:25:58 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

  # cat /etc/os-release
  NAME="Ubuntu"
  VERSION="16.04.4 LTS (Xenial Xerus)"
  ID=ubuntu
  ID_LIKE=debian
  PRETTY_NAME="Ubuntu 16.04.4 LTS"
  VERSION_ID="16.04"
  HOME_URL="http://www.ubuntu.com/"
  SUPPORT_URL="http://help.ubuntu.com/"
  BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
  VERSION_CODENAME=xenial
  UBUNTU_CODENAME=xenial

  # ethtool -i eno1
  driver: mlx4_en
  version: 4.3-1.0.1
  firmware-version: 2.42.5004
  expansion-rom-version:
  bus-info: 0000:11:00.0
  supports-statistics: yes
  supports-test: yes
  supports-eeprom-access: no
  supports-register-dump: no
  supports-priv-flags: yes

  # ethtool -i bond0
  driver: bonding
  version: 3.7.1
  firmware-version: 2
  expansion-rom-version:
  bus-info:
  supports-statistics: no
  supports-test: no
  supports-eeprom-access: no
  supports-register-dump: no
  supports-priv-flags: no

  # ethtool -i vmbr0
  driver: bridge
  version: 2.3
  firmware-version: N/A
  expansion-rom-version:
  bus-info: N/A
  supports-statistics: no
  supports-test: no
  supports-eeprom-access: no
  supports-register-dump: no
  supports-priv-flags: no

  The Mellanox driver was installed from
  http://content.mellanox.com/ofed/MLNX_OFED-4.3-1.0.1.0/MLNX_OFED_LINUX-4.3-1.0.1.0-ubuntu16.04-x86_64.tgz

  ./mlnxofedinstall --kernel 4.13.0-36-generic --without-dkms --add-kernel-support

  ##### Steps to reproduce #####

  This is my /etc/network/interfaces file:

  auto lo
  iface lo inet loopback

  auto openibd
  iface openibd inet manual
      pre-up /etc/init.d/openibd start
      pre-down /etc/init.d/openibd force-stop

  auto bond0
  iface bond0 inet manual
      pre-up ip link add bond0 type bond || true
      pre-up ip link set bond0 down
      pre-up ip link set bond0 type bond mode active-backup arp_interval 2000 arp_ip_target 10.36.0.1 arp_validate 3 primary eno1
      pre-up ip link set eno1 down
      pre-up ip link set eno1d1 down
      pre-up ip link set eno1 master bond0
      pre-up ip link set eno1d1 master bond0
      pre-up ip link set bond0 up
      pre-down ip link del bond0

  auto vmbr0
  iface vmbr0 inet static
      address 10.36.128.217
      netmask 255.255.0.0
      gateway 10.36.0.1
      bridge_ports bond0
      bridge_stp off
      bridge_fd 0

  I execute these commands:

  wget http://dl-cdn.alpinelinux.org/alpine/v3.7/releases/x86_64/alpine-virt-3.7.0-x86_64.iso -O alpine.iso
  qemu-system-x86_64 -boot d -cdrom alpine.iso -m 512 -nographic -device e1000,netdev=net0 -netdev tap,id=net0

  After a few moments the kernel hangs, with these messages on the
  console:

  [74390.187908] mlx4_core 0000:11:00.0: bond for multifunction failed
  [74390.486476] mlx4_en: eno1d1: Fail to bond device
  [74390.750758] cache_from_obj: Wrong slab cache. kmalloc-256 but object is from kmalloc-192
  [74391.152326] general protection fault: 0000 [#1] SMP PTI
  [74391.410424] cache_from_obj: Wrong slab cache. kmalloc-256 but object is from kmalloc-192

  The kernel trace log is in the attachment.

  ProblemType: Bug
  DistroRelease: Ubuntu 16.04
  Package: linux-image-4.13.0-36-generic 4.13.0-36.40~16.04.1
  ProcVersionSignature: Ubuntu 4.13.0-36.40~16.04.1-generic 4.13.13
  Uname: Linux 4.13.0-36-generic x86_64
  ApportVersion: 2.20.1-0ubuntu2.15
  Architecture: amd64
  Date: Mon Mar 12 19:59:16 2018
  ProcEnviron:
   TERM=xterm-256color
   PATH=(custom, no user)
   LANG=C
   SHELL=/bin/bash
  SourcePackage: linux-hwe
  UpgradeStatus: No upgrade log present (probably fresh install)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1755268/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
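
For reference, the post-boot workaround described in the reply (unload the mlx4 modules, reload mlx4_core with the SR-IOV parameters, restart networking) could be wrapped in a small script. This is only a sketch: the function names run and reload_mlx4 and the DRY_RUN variable are hypothetical and not part of the report, and it assumes the three commands are run as root, in the order given in the reply. It defaults to a dry run, so nothing is unloaded unless DRY_RUN=0 is set explicitly.

```shell
#!/bin/sh
# Sketch of the post-boot workaround from the reply.
# run, reload_mlx4 and DRY_RUN are hypothetical names chosen for illustration.

# Print the command when DRY_RUN is 1 (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Repeat the three steps quoted in the reply, stopping at the first failure.
reload_mlx4() {
    # Unload the Ethernet and InfiniBand drivers together with the core module.
    run rmmod mlx4_en mlx4_ib mlx4_core || return 1
    # Reload the core module with the parameters used in the report.
    run modprobe mlx4_core num_vfs=1 port_type_array=2,2 probe_vf=1 || return 1
    # Bring bond0 and vmbr0 back up through ifupdown.
    run systemctl restart networking
}
```

After sourcing the file, calling reload_mlx4 prints the three commands without running them; DRY_RUN=0 reload_mlx4 executes them for real. Note that this interrupts networking on the node, so it is best run from a local or serial console.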