[Bug 1672521] Re: ThunderX: soft lockup on 4.8+ kernels

2017-03-21 Thread Alexandru Avadanii
Hi Dann, First of all, I think the bug title is misleading, as this issue happens on all kernels we tested (4.4.0-45..66, 4.8.0-x, 4.10.0-x etc). To be fair, we haven't seen this exact bug in practice (or at least I don't think we did), i.e. without running stress-ng, 4.4.0-x never crashed.

[Bug 1672521] Re: ThunderX: soft lockup on 4.8+ kernels

2017-03-21 Thread dann frazier
@Alexandru: do you have a console log of a system hitting the issue w/ the VM use case? Soft lockups are a fairly generic failure mode, and it would not surprise me if stress-ng was triggering a different issue than the VM case, but both emitting soft lockups.
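One way to check whether two console logs show the same failure is to compare which tasks the soft-lockup reports name. The following is only a minimal sketch: it assumes the log contains the kernel watchdog's usual `BUG: soft lockup - CPU#N stuck for Ns! [comm:pid]` lines, and the function name `lockup_tasks` and the sample lines are made up for illustration, not taken from this bug's attachments.

```python
import re
from collections import Counter

# Matches the watchdog report line; the task name may itself contain ':'
# (e.g. "kworker/0:1"), so the name group is greedy up to the last ':'.
LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(\d+) stuck for (\d+)s! \[(.+):(\d+)\]"
)


def lockup_tasks(log_text):
    """Count soft-lockup reports per task name in a console log (hypothetical helper)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LOCKUP_RE.search(line)
        if match:
            counts[match.group(3)] += 1
    return counts


if __name__ == "__main__":
    # Invented sample lines, only to show the expected input shape.
    sample = (
        "[ 100.0] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [stress-ng:4242]\n"
        "[ 128.0] NMI watchdog: BUG: soft lockup - CPU#7 stuck for 23s! [kworker/0:1:17]\n"
    )
    for task, n in lockup_tasks(sample).items():
        print(task, n)
```

If the stress-ng log and the VM log name entirely different tasks, that would be consistent with two separate issues, though it would not prove it.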

[Bug 1672521] Re: ThunderX: soft lockup on 4.8+ kernels

2017-03-14 Thread Alexandru Avadanii
4.11-rc1 console log attached. Board firmware is latest available on Gigabyte's site (T31).
1. Install 4.11-rc1 (`make modules_install install`) and reboot
2. Observe networking driver issues in boot log
   Dmesg: 4.11-rc1_dmesg_on_clean_boot.log [3]
3. Try `ping google.com`, obviously not

[Bug 1672521] Re: ThunderX: soft lockup on 4.8+ kernels

2017-03-14 Thread Ciprian Barbu
Hi, The same bug happened again on a similar board with T27 firmware, but this time running kernel 4.4.0-45-generic. I'm attaching the serial console log (with debug info from the FW). I can't attach more because the kernel hung. So far 4.4.0-45-generic was stable in our lab, this happened

[Bug 1672521] Re: ThunderX: soft lockup on 4.8+ kernels

2017-03-14 Thread Ciprian Barbu
Just one addition: the previous log contains dmesg output too. The task that hung was systemd; it might be related to some VMs from the previous boot being restarted automatically, but that still doesn't explain the crash. Rebooting the node again with 4.4 did not result in a kernel crash.

[Bug 1672521] Re: ThunderX: soft lockup on 4.8+ kernels

2017-03-14 Thread Alexandru Avadanii
Hi, I tried out 4.11-rc1 a few days ago. Unfortunately, I could not get the board to boot properly in the first place, since the ThunderX networking drivers failed to allocate MSI-X/MSI interrupts, and polling on some registers also failed ... So, with 4.11-rc1, at least one networking interface was

[Bug 1672521] Re: ThunderX: soft lockup on 4.8+ kernels

2017-03-14 Thread Joseph Salisbury
If the mainline kernel still exhibits the bug, we can perform a kernel bisect to identify what commit introduced the regression. If the mainline kernel fixes the bug, we can perform a "Reverse" bisect to identify the fix.

** Also affects: linux (Ubuntu Zesty) Importance: High Status:
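For readers unfamiliar with the term, a bisect is a binary search over an ordered list of commits or kernel builds. The sketch below only illustrates that idea and is not the tooling used on this bug: the function `first_bad`, the `is_bad` callback and the build labels are hypothetical, and it assumes the oldest entry is known good, the newest is known bad, and that the bug stays present once it has appeared.

```python
from typing import Callable, Sequence


def first_bad(builds: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest build for which is_bad() is true.

    Assumes builds are ordered oldest to newest, builds[0] is good and
    builds[-1] is bad. Each is_bad() call stands for one boot-and-test cycle.
    """
    good, bad = 0, len(builds) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(builds[mid]):
            bad = mid    # regression is at mid or earlier
        else:
            good = mid   # regression is after mid
    return builds[bad]


if __name__ == "__main__":
    # Hypothetical labels; pretend the regression landed in "build-6".
    builds = [f"build-{i}" for i in range(10)]
    print(first_bad(builds, lambda b: int(b.split("-")[1]) >= 6))
```

A "reverse" bisect uses the same search with the test inverted: treat builds that still show the bug as "good" and builds where it is gone as "bad", so the result is the first build containing the fix.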

[Bug 1672521] Re: ThunderX: soft lockup on 4.8+ kernels

2017-03-14 Thread Joseph Salisbury
Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.11 kernel[0]. If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'. If the mainline kernel does not fix

[Bug 1672521] Re: ThunderX: soft lockup on 4.8+ kernels

2017-03-13 Thread Alexandru Avadanii
apport information

** Tags added: apport-collected xenial

** Description changed:

I have been trying to easily reproduce this for days. We initially observed it in OPNFV Armband, when we tried to upgrade our Ubuntu Xenial installation kernel to linux-image-generic-hwe-16.04 (4.8). In