[Kernel-packages] [Bug 1975616] Re: [UBUNTU 20.04] Several "failed assertion in udev" messages causing unexpected errors in the system

2022-07-24 Thread Launchpad Bug Tracker
[Expired for Ubuntu on IBM z Systems because there has been no activity
for 60 days.]

** Changed in: ubuntu-z-systems
   Status: Incomplete => Expired

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1975616

Title:
  [UBUNTU 20.04] Several "failed assertion in udev" messages causing
  unexpected errors in the system

Status in Ubuntu on IBM z Systems:
  Expired
Status in linux package in Ubuntu:
  Incomplete

Bug description:
  ---Problem Description---
  Several "failed assertion in udev" messages are causing unexpected errors in
  the system, which in turn generate multiple concurrent dumps.
  This system is an s390x cloud appliance LPAR hosting multiple VMs.
  The issue appears to occur when a VM (Virtual Server Instance) is being
  spawned, during the attempt to rename a network interface derived from an
  SR-IOV interface of a Mellanox CX5 NIC.
  journalctl shows multiple messages like the ones below, which led to the
  concurrent dumps. The pattern always repeats while a virtual-function-based
  interface is being renamed.

  We would like to know what is causing the assertion and the subsequent dumps.
  Please let us know if dump information is needed (this is a customer machine,
  so I need to follow a process to obtain it).

  
  Mar  3 06:53:18 tok1-qz1-sr5-rk361-s04 kernel: [6545462.208754] I6 NDP3 
578b7bb 0024/361-s04 IN=IMGMT.53 OUT= 
MAC=33:33:00:00:00:01:00:10:6f:24:e5:8c:86:dd 
SRC=fe80::::0210:6fff:fe24:e58c 
DST=ff02:::::::0001 LEN=552 TC=0 HOPLIMIT=1 FLOWLBL=0 
PROTO=UDP SPT=52245 DPT=9901 LEN=512
  Mar  3 06:53:18 tok1-qz1-sr5-rk361-s04 kernel: [6545462.213811] IPv4: martian 
source 192.168.255.255 from 192.168.104.119, on dev IMGMT.53
  Mar  3 06:53:18 tok1-qz1-sr5-rk361-s04 kernel: [6545462.213813] ll header: 
: ff ff ff ff ff ff 00 10 6f 24 e5 8c 08 00
  Mar  3 06:53:18 tok1-qz1-sr5-rk361-s04 kernel: [6545462.213911] IPv4: martian 
source 192.168.255.255 from 192.168.104.119, on dev IMGMT.53
  Mar  3 06:53:18 tok1-qz1-sr5-rk361-s04 kernel: [6545462.213913] ll header: 
: ff ff ff ff ff ff 00 10 6f 24 e5 8c 08 00
  Mar  3 06:53:18 tok1-qz1-sr5-rk361-s04 systemd-udevd[1716040]: Using default 
interface naming scheme 'v245'.
  Mar  3 06:53:18 tok1-qz1-sr5-rk361-s04 kernel: mlx5_core 0001:00:06.7 p0v53: 
renamed from if02089C284C2C
  Mar  3 06:53:18 tok1-qz1-sr5-rk361-s04 kernel: I6 IDP 578b7bb 0024/361-s04 
IN=vlan1.53 OUT= MAC=33:33:00:00:00:01:00:10:6f:24:e5:8c:86:dd:60:00:00:00 
SRC=fe80::::0210:6fff:fe24:e58c 
DST=ff02:::::::0001 LEN=552 TC=0 HOPLIMIT=1 FLOWLBL=0 
PROTO=UDP SPT=52245 DPT=9901 LEN=512
  Mar  3 06:53:18 tok1-qz1-sr5-rk361-s04 kernel: I6 NDP3 578b7bb 0024/361-s04 
IN=IMGMT.53 OUT= MAC=33:33:00:00:00:01:00:10:6f:24:e5:8c:86:dd 
SRC=fe80::::0210:6fff:fe24:e58c 
DST=ff02:::::::0001 LEN=552 TC=0 HOPLIMIT=1 FLOWLBL=0 
PROTO=UDP SPT=52245 DPT=9901 LEN=512
  Mar  3 06:53:18 tok1-qz1-sr5-rk361-s04 kernel: I6 NDP3 578b7bb 0024/361-s04 
IN=IMGMT.53 OUT= MAC=33:33:00:00:00:01:00:10:6f:24:e5:8c:86:dd 
SRC=fe80::::0210:6fff:fe24:e58c 
DST=ff02:::::::0001 LEN=552 TC=0 HOPLIMIT=1 FLOWLBL=0 
PROTO=UDP SPT=52245 DPT=9901 LEN=512
  Mar  3 06:53:18 tok1-qz1-sr5-rk361-s04 kernel: IPv4: martian source 
192.168.255.255 from 192.168.104.119, on dev IMGMT.53
  Mar  3 06:53:18 tok1-qz1-sr5-rk361-s04 kernel: ll header: : ff ff ff 
ff ff ff 00 10 6f 24 e5 8c 08 00
  Mar  3 06:53:18 tok1-qz1-sr5-rk361-s04 kernel: IPv4: martian source 
192.168.255.255 from 192.168.104.119, on dev IMGMT.53
  Mar  3 06:53:18 tok1-qz1-sr5-rk361-s04 kernel: ll header: : ff ff ff 
ff ff ff 00 10 6f 24 e5 8c 08 00
  Mar  3 06:53:18 tok1-qz1-sr5-rk361-s04 NetworkManager[19870]:   
[1646290398.8433] device (if02089C284C2C): interface index 208 renamed iface 
from 'if02089C284C2C' to 'p0v53'
  Mar  3 06:53:19 tok1-qz1-sr5-rk361-s04 systemd-udevd[1716040]: ethtool: 
autonegotiation is unset or enabled, the speed and duplex are not writable.
  Mar  3 06:53:19 tok1-qz1-sr5-rk361-s04 systemd-udevd[1716040]: Assertion 
'size > 0' failed at src/udev/udev-event.c:449, function 
udev_event_apply_format(). Aborting.
  Mar  3 06:53:19 tok1-qz1-sr5-rk361-s04 systemd-udevd[19533]: Worker [1716040] 
terminated by signal 6 (ABRT)
  Mar  3 06:53:19 tok1-qz1-sr5-rk361-s04 systemd-udevd[19533]: p0v53: Worker 
[1716040] failed
  Mar  3 06:53:19 tok1-qz1-sr5-rk361-s04 /usr/sbin/fpcStealConsole[19295]: 
/usr/sbin/event-daemonCoreDump.sh Wrote 
/var/crash/userfiles/core.systemd-udevd.1716040.6
  Mar  3 06:53:19 tok1-qz1-sr5-rk361-s04 /usr/sbin/fpcStealConsole[19295]: 
/usr/sbin/event-daemonCoreDump.sh: Sent log 2A5A0092: Process 
systemd-udevd.1716040 core dumped (6).
  Mar  3 06:53:19 tok1-qz1-sr5-rk361-s04 kernel: I4 IDP2 578b7bb 0024/361-s04 
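
  The sequence in the excerpt above (assertion, worker abort, "Worker failed"
  with the interface name) can be pulled out of a longer journal mechanically.
  The following is a minimal sketch, not a tool used in this report. It assumes
  one journal entry per line (the hard-wrapped lines in this mail would need to
  be rejoined first), and the function name find_udev_assertions is made up for
  illustration. The regular expressions are modeled only on the lines quoted
  here and may need adjusting for other systemd versions.

```python
import re

# Patterns modeled on the journal lines quoted in this report (assumption:
# other systemd versions may word these messages differently).
ASSERT_RE = re.compile(
    r"systemd-udevd\[(?P<pid>\d+)\]: Assertion '(?P<expr>[^']+)' failed at "
    r"(?P<file>\S+):(?P<line>\d+), function (?P<func>\w+)\(\)"
)
FAILED_RE = re.compile(
    r"systemd-udevd\[\d+\]: (?P<iface>\S+): Worker \[(?P<pid>\d+)\] failed"
)


def find_udev_assertions(lines):
    """Return one dict per failed assertion, with the interface if logged."""
    failures = {}
    for line in lines:
        m = ASSERT_RE.search(line)
        if m:
            # Keyed by worker PID so the later "Worker failed" line can be matched.
            failures[m["pid"]] = {**m.groupdict(), "iface": None}
            continue
        m = FAILED_RE.search(line)
        if m and m["pid"] in failures:
            failures[m["pid"]]["iface"] = m["iface"]
    return list(failures.values())


# Three entries from the excerpt above, rejoined to one entry per line.
SAMPLE = [
    "Mar  3 06:53:19 tok1-qz1-sr5-rk361-s04 systemd-udevd[1716040]: "
    "Assertion 'size > 0' failed at src/udev/udev-event.c:449, function "
    "udev_event_apply_format(). Aborting.",
    "Mar  3 06:53:19 tok1-qz1-sr5-rk361-s04 systemd-udevd[19533]: "
    "Worker [1716040] terminated by signal 6 (ABRT)",
    "Mar  3 06:53:19 tok1-qz1-sr5-rk361-s04 systemd-udevd[19533]: "
    "p0v53: Worker [1716040] failed",
]

for failure in find_udev_assertions(SAMPLE):
    print(failure)
```

  Grouping the results by interface name or by assertion location would show
  whether every abort really coincides with a virtual function rename, as
  suspected in the description.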

[Kernel-packages] [Bug 1975616] Re: [UBUNTU 20.04] Several "failed assertion in udev" messages causing unexpected errors in the system

2022-05-25 Thread Frank Heimes
Thanks for raising this.

Please let me mention a few observations:
- The kernel in use is significantly outdated: 5.4.0-80-generic
  (Jul 9 17:41:33 UTC 2021); about 20 kernel updates have been missed.
- The system as a whole is outdated as well, since it reports "20.04.2 LTS".
  Please note that every new point release replaces the previous one,
  which then becomes obsolete and is no longer supported.
  We are currently at 20.04.4 (20.04.5 will come in August).
  On top of that, a system needs to receive regular (functional and security)
  updates, for example with the help of:
  sudo apt update
  sudo apt upgrade
  An up-to-date system will include lots of mlx5 kernel fixes
  as well as systemd, udev and other userland patches.
- So please reproduce this with the latest levels, on an up-to-date system.
  Otherwise both parties might waste time and effort hunting down bugs
  that are potentially already fixed.
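
For checking many appliances at once, the point-release comparison described
above could be scripted. This is only a rough sketch: the helper names
point_release and is_outdated are hypothetical, the sample text imitates the
VERSION line of /etc/os-release rather than being copied from the affected
machine, and the reference level (20, 4, 4) is simply the current point
release named in this mail.

```python
import platform
import re

# Current point release according to this thread (May 2022); adjust as needed.
CURRENT_POINT_RELEASE = (20, 4, 4)


def point_release(os_release_text):
    """Extract e.g. (20, 4, 2) from the VERSION= line of os-release content."""
    for line in os_release_text.splitlines():
        key, _, value = line.partition("=")
        if key == "VERSION":
            m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", value.strip('"'))
            if m:
                # A missing third component (e.g. "20.04") counts as 0.
                return tuple(int(g) for g in m.groups(default="0"))
    return None


def is_outdated(os_release_text, current=CURRENT_POINT_RELEASE):
    """True if the reported point release is older than `current`."""
    found = point_release(os_release_text)
    return found is not None and found < current


# Illustrative content only, shaped like /etc/os-release on a 20.04.2 system.
SAMPLE = 'NAME="Ubuntu"\nVERSION="20.04.2 LTS (Focal Fossa)"\nID=ubuntu\n'

print("running kernel:", platform.release())
print("point release:", point_release(SAMPLE))
print("outdated:", is_outdated(SAMPLE))
```

On a real system the text would come from reading /etc/os-release instead of
the SAMPLE string.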

I also read that "this is a customer machine". In such a case it is
preferable to open a Salesforce ticket (for support cases) rather than a
Launchpad bug (for development tasks).

Without crash/dump files, one can only roughly guess at what is
happening.

A first brief investigation of some of the messages seems to point to a
potential problem with the hardware-firmware/driver combination.
So please also double-check that the firmware of the RoCE/Mellanox adapter is
at the very latest level.
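
One way to collect the driver and firmware level per interface is to parse the
output of "ethtool -i". The sketch below assumes ethtool is installed. The
helper names are made up, and the firmware string in SAMPLE is a placeholder,
not a value taken from the affected system (only the driver name and bus
address appear in the log).

```python
import subprocess


def parse_ethtool_info(output):
    """Turn the 'key: value' lines printed by `ethtool -i` into a dict."""
    info = {}
    for line in output.splitlines():
        # Split on the first colon only; bus addresses contain colons too.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def driver_info(iface):
    """Run `ethtool -i IFACE` and return the parsed fields."""
    result = subprocess.run(
        ["ethtool", "-i", iface], capture_output=True, text=True, check=True
    )
    return parse_ethtool_info(result.stdout)


# Placeholder output for illustration; the firmware value is invented.
SAMPLE = """\
driver: mlx5_core
version: 5.0-0
firmware-version: 16.00.0000 (XX_0000000000)
bus-info: 0001:00:06.7
"""

info = parse_ethtool_info(SAMPLE)
print(info["driver"], info["firmware-version"], info["bus-info"])
```

On the host, driver_info("p0v53") would query the renamed virtual function from
the log. The reported firmware-version can then be compared against the latest
level published for the adapter.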

** Changed in: linux (Ubuntu)
   Status: New => Incomplete

** Changed in: ubuntu-z-systems
   Status: New => Incomplete


Status in Ubuntu on IBM z Systems:
  Incomplete
Status in linux package in Ubuntu:
  Incomplete


[Kernel-packages] [Bug 1975616] Re: [UBUNTU 20.04] Several "failed assertion in udev" messages causing unexpected errors in the system

2022-05-25 Thread Frank Heimes
** Also affects: ubuntu-z-systems
   Importance: Undecided
   Status: New


Status in Ubuntu on IBM z Systems:
  New
Status in linux package in Ubuntu:
  New
