Hi Joseph, thanks for your answer.

Currently I have two different hardware configurations:

- HPE ProLiant m710p Server Cartridge (does not have this problem)
- HPE ProLiant m710x Server Cartridge (has this problem)

> Did this issue start happening after an update/upgrade?
> Was there a prior kernel version where you were not having this particular 
> problem?

Well, I use a debootstrap script to install all the needed software 
automatically and build an image of the base system.
After that I use this image to boot my nodes via PXE, so on each boot I have 
a system installed from scratch.
I've tested the following kernels:
- Ubuntu 16.04 with stock kernel: 4.4.0-116-generic
- Ubuntu 16.04 with hwe kernel: 4.13.0-36-generic
- Ubuntu 16.04 with pve kernel: 4.13.13-6-pve
- Debian 9 with pve kernel: 4.13.13-6-pve
- Debian 9 with stock kernel: 4.9.0-6-amd64

All of them have this problem, though with the stock kernels it can take some 
time before the system goes down.
(The only one where I did not see this error was Debian with 4.9.0-6-amd64, 
but I presume it exists there too, since I have not tested it properly.)

Another thing: if I run these steps AFTER the system has booted:

    rmmod mlx4_en mlx4_ib mlx4_core
    modprobe mlx4_core num_vfs=1 port_type_array=2,2 probe_vf=1
    systemctl restart networking

Everything starts working fine.
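
For anyone who wants to repeat the steps above on several nodes, they can be wrapped in a small script. This is only a sketch: the names `run`, `reload_mlx4` and the `DRY_RUN` variable are made up for illustration and are not part of any existing tool. The module names and parameters are exactly the ones from the commands above.

```shell
#!/bin/sh
# Sketch: reload the mlx4 modules with the same options as the manual steps.
# DRY_RUN, run and reload_mlx4 are illustrative names (assumptions), not an existing tool.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Unload the drivers, load mlx4_core with the SR-IOV options, restart networking.
# Stops at the first step that fails.
reload_mlx4() {
    run rmmod mlx4_en mlx4_ib mlx4_core &&
    run modprobe mlx4_core num_vfs=1 port_type_array=2,2 probe_vf=1 &&
    run systemctl restart networking
}
```

Sourcing the file and calling `reload_mlx4` with `DRY_RUN=1` set only prints the three commands; without it they are executed and need root.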

> Please test the latest v4.16 kernel[0].

OK, I'll do this.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1755268

Title:
  Kernel panic when using KVM and Mellanox OFED driver (bonding and
  sriov enabled)

Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Xenial:
  Incomplete
Status in linux source package in Artful:
  Incomplete

Bug description:
  ##### System information #####

      # uname -a
      Linux m5c37 4.13.0-36-generic #40~16.04.1-Ubuntu SMP Fri Feb 16 23:25:58 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

      # cat /etc/os-release
      NAME="Ubuntu"
      VERSION="16.04.4 LTS (Xenial Xerus)"
      ID=ubuntu
      ID_LIKE=debian
      PRETTY_NAME="Ubuntu 16.04.4 LTS"
      VERSION_ID="16.04"
      HOME_URL="http://www.ubuntu.com/"
      SUPPORT_URL="http://help.ubuntu.com/"
      BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
      VERSION_CODENAME=xenial
      UBUNTU_CODENAME=xenial

      # ethtool -i eno1
      driver: mlx4_en
      version: 4.3-1.0.1
      firmware-version: 2.42.5004
      expansion-rom-version:
      bus-info: 0000:11:00.0
      supports-statistics: yes
      supports-test: yes
      supports-eeprom-access: no
      supports-register-dump: no
      supports-priv-flags: yes

      # ethtool -i bond0
      driver: bonding
      version: 3.7.1
      firmware-version: 2
      expansion-rom-version:
      bus-info:
      supports-statistics: no
      supports-test: no
      supports-eeprom-access: no
      supports-register-dump: no
      supports-priv-flags: no

      # ethtool -i vmbr0
      driver: bridge
      version: 2.3
      firmware-version: N/A
      expansion-rom-version:
      bus-info: N/A
      supports-statistics: no
      supports-test: no
      supports-eeprom-access: no
      supports-register-dump: no
      supports-priv-flags: no

  The Mellanox driver was installed from
  http://content.mellanox.com/ofed/MLNX_OFED-4.3-1.0.1.0/MLNX_OFED_LINUX-4.3-1.0.1.0-ubuntu16.04-x86_64.tgz

      ./mlnxofedinstall --kernel 4.13.0-36-generic --without-dkms --add-kernel-support

  ##### Steps to reproduce #####

  This is my /etc/network/interfaces file:

  
      auto lo
      iface lo inet loopback

      auto openibd
      iface openibd inet manual
              pre-up /etc/init.d/openibd start
              pre-down /etc/init.d/openibd force-stop

      auto bond0
      iface bond0 inet manual
              pre-up ip link add bond0 type bond || true
              pre-up ip link set bond0 down
              pre-up ip link set bond0 type bond mode active-backup arp_interval 2000 arp_ip_target 10.36.0.1 arp_validate 3 primary eno1
              pre-up ip link set eno1 down
              pre-up ip link set eno1d1 down
              pre-up ip link set eno1 master bond0
              pre-up ip link set eno1d1 master bond0
              pre-up ip link set bond0 up
              pre-down ip link del bond0

      auto vmbr0
      iface vmbr0 inet static
              address 10.36.128.217
              netmask 255.255.0.0
              gateway 10.36.0.1
              bridge_ports bond0
              bridge_stp off
              bridge_fd 0

  I execute these commands:

      wget http://dl-cdn.alpinelinux.org/alpine/v3.7/releases/x86_64/alpine-virt-3.7.0-x86_64.iso -O alpine.iso
      qemu-system-x86_64 -boot d -cdrom alpine.iso -m 512 -nographic -device e1000,netdev=net0 -netdev tap,id=net0

  After a few moments the kernel hangs, with these messages on the
  console:

      [74390.187908] mlx4_core 0000:11:00.0: bond for multifunction failed
      [74390.486476] mlx4_en: eno1d1: Fail to bond device
      [74390.750758] cache_from_obj: Wrong slab cache. kmalloc-256 but object is from kmalloc-192
      [74391.152326] general protection fault: 0000 [#1] SMP PTI
      [74391.410424] cache_from_obj: Wrong slab cache. kmalloc-256 but object is from kmalloc-192

  The kernel trace log is attached.

  ProblemType: Bug
  DistroRelease: Ubuntu 16.04
  Package: linux-image-4.13.0-36-generic 4.13.0-36.40~16.04.1
  ProcVersionSignature: Ubuntu 4.13.0-36.40~16.04.1-generic 4.13.13
  Uname: Linux 4.13.0-36-generic x86_64
  ApportVersion: 2.20.1-0ubuntu2.15
  Architecture: amd64
  Date: Mon Mar 12 19:59:16 2018
  ProcEnviron:
   TERM=xterm-256color
   PATH=(custom, no user)
   LANG=C
   SHELL=/bin/bash
  SourcePackage: linux-hwe
  UpgradeStatus: No upgrade log present (probably fresh install)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1755268/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
