Update: I've contacted the email addresses from Kai-Heng's post. Michael Chan (from Broadcom) replied that they've seen similar issues on other AMD systems and that they were working with AMD to resolve this. The plan was to establish contact between me and AMD, but unfortunately this never happened. My attempt to contact AMD through the official channel (tech support) failed because I could not answer AMD's questions without feedback from Broadcom, who by then had also stopped replying.
Workaround: Luckily, the information that came out of the conversation with Broadcom gave me at least a rough idea of where to look, so I was able to troubleshoot a bit myself. Setting Advanced -> NB Configuration -> IOMMU to "disabled" (default is "Auto") in the Supermicro BIOS appears to prevent the problem. Since then the whole topic has been "stuck". This is only a workaround and not really a fix, but at least the servers are now running stably for me. Since I don't know where the actual problem lies (AMD hardware, BIOS, kernel, or something else), I can't say whether this bug report can be marked as closed.

-- 
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1931106

Title:
  bnxt_en NIC driver crashes IO_PAGE_FAULT

Status in linux package in Ubuntu:
  Incomplete

Bug description:
  Hi all,

  We received a bunch of new servers with a Supermicro H12SSL-NT
  mainboard that has an embedded Broadcom BCM57416 NIC. On all those
  servers we observe crashes of the NIC driver (bnxt_en) from time to
  time. We're not able to reproduce this issue manually; it just occurs
  at some point. Our monitoring also does not show any irregularities
  (high traffic flow or something like this).

  Syslog: https://pastebin.com/yDAyjHvF

  All servers are running with up-to-date packages:

  $ lsb_release -rd
  Description: Ubuntu 20.04.2 LTS
  Release: 20.04

  $ uname -r
  5.4.0-73-generic

  It also happens on older kernel versions (tested 5.4.0-66) as well as
  the HWE kernel (tested 5.8.0-55).

  Thanks in advance.
  ~ Roman
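
To check on a given host whether the IOMMU workaround from the update has taken effect, and whether the fault has shown up again, something like the following could be used. This is a minimal sketch, not a verified procedure: the function names are hypothetical, and it assumes that the kernel exposes active IOMMU groups under /sys/kernel/iommu_groups, that kernel messages end up in /var/log/syslog (the Ubuntu default), and that affected log lines contain both "bnxt_en" and "IO_PAGE_FAULT" as in the bug title.

```shell
#!/bin/sh
# Sketch only: function names and default paths are illustrative assumptions.

# Print the number of IOMMU groups the kernel exposes.
# Zero entries suggests the IOMMU is disabled or not in use.
count_iommu_groups() {
    dir="${1:-/sys/kernel/iommu_groups}"
    if [ ! -d "$dir" ]; then
        echo 0
        return 0
    fi
    # tr strips the padding that some wc implementations add
    ls -1 "$dir" | wc -l | tr -d ' '
}

# Print the number of log lines that mention both the driver (bnxt_en)
# and the fault type (IO_PAGE_FAULT) named in the bug title.
count_bnxt_page_faults() {
    log="${1:-/var/log/syslog}"
    # grep -c exits non-zero when the count is 0; "|| true" hides that
    grep 'bnxt_en' "$log" | grep -c 'IO_PAGE_FAULT' || true
}

# Example usage (uncomment to run on the affected host):
# echo "IOMMU groups: $(count_iommu_groups)"
# echo "bnxt_en IO_PAGE_FAULT lines: $(count_bnxt_page_faults)"
```

With the BIOS option set to "disabled", the first function would be expected to print 0. A non-zero count from the second function only means matching lines exist in the current log file; rotated logs are not covered.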