[Kernel-packages] [Bug 2052663] Re: fabric-manager-535 setup fails during install on Grace/Hopper arm64 system running noble

2024-04-24 Thread Mitchell Augustin
This bug no longer appears to be reproducible on noble with the 6.8 generic kernels, so I have marked it as resolved. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/2052663 Title:

[Kernel-packages] [Bug 2052663] Re: fabric-manager-535 setup fails during install on Grace/Hopper arm64 system running noble

2024-04-24 Thread Mitchell Augustin
** Changed in: fabric-manager-535 (Ubuntu)
   Assignee: (unassigned) => Mitchell Augustin (mitchellaugustin)
** Changed in: linux (Ubuntu)
   Assignee: (unassigned) => Mitchell Augustin (mitchellaugustin)
** Changed in: nvidia-graphics-drivers-535-server (Ubuntu)
   Assignee: (unas

[Kernel-packages] [Bug 2058557] Re: Kernel panic during checkbox stress_ng_test on Grace running noble 6.8 (arm64+largemem) kernel

2024-04-09 Thread Mitchell Augustin
Fix has landed upstream: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/fs/aio.c?h=v6.9-rc3&id=caeb4b0a11b3393e43f7fa8e0a5a18462acc66bd

[Kernel-packages] [Bug 2058557] Re: Kernel panic during checkbox stress_ng_test on Grace running noble 6.8 (arm64+largemem) kernel

2024-04-01 Thread Mitchell Augustin
A fix has been applied to vfs.fixes upstream and should land soon. I have tested this patch and verified that the panic no longer occurs.
** Changed in: linux (Ubuntu)
   Status: New => Fix Committed

[Kernel-packages] [Bug 2058557] Re: Kernel panic during checkbox stress_ng_test on Grace running noble 6.8 (arm64+largemem) kernel

2024-03-28 Thread Mitchell Augustin
This issue is still present upstream, so I reported it to the original committer of the patch. https://bugs.launchpad.net/bugs/2058557

[Kernel-packages] [Bug 2058557] Re: Kernel panic during checkbox stress_ng_test on Grace running noble 6.8 (arm64+largemem) kernel

2024-03-28 Thread Mitchell Augustin
I have isolated the cause of this bug to this commit: https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/noble/commit/?h=Ubuntu-6.8.0-20.20&id=71eb6b6b0ba93b1467bccff57b5de746b09113d2 All versions that I tested before this commit during my bisect passed the aiol test at least 15

[Kernel-packages] [Bug 2058557] Re: Kernel panic during checkbox stress_ng_test on Grace running noble 6.8 (arm64+largemem) kernel

2024-03-26 Thread Mitchell Augustin
It turns out that this issue does not appear with *every* run of the aiol test on affected kernels, so multiple runs of that test may be necessary for the panic to occur.
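
Because the panic is intermittent, a single clean run does not show that a kernel is unaffected. A minimal sketch of repeating the aiol stressor is shown below. The helper name `first_failing_run`, the worker count, and the run limit are illustrative assumptions, not part of the original report; only the 240-second duration comes from the thread. On an affected kernel the machine is expected to panic rather than return a non-zero exit code, so the loop mainly serves to keep applying the load.

```python
import subprocess
from typing import Callable, Optional, Sequence

# Assumed direct stress-ng invocation (the report runs it via checkbox's
# stress_ng_test.py instead): one aiol worker for 240 seconds.
AIOL_CMD = ["stress-ng", "--aiol", "1", "--timeout", "240s"]


def first_failing_run(
    cmd: Sequence[str],
    max_runs: int,
    run: Callable = subprocess.run,
) -> Optional[int]:
    """Run cmd up to max_runs times.

    Returns the 1-based index of the first run that exits non-zero,
    or None if every run exits cleanly.
    """
    for attempt in range(1, max_runs + 1):
        result = run(list(cmd), check=False)
        if result.returncode != 0:
            return attempt
    return None


# Example (requires stress-ng and root; may panic an affected kernel):
#   first_failing_run(AIOL_CMD, max_runs=15)
```

The `run` parameter exists only so the loop can be exercised with a stand-in for `subprocess.run` on a machine that should not be stressed.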

[Kernel-packages] [Bug 2058557] Re: Kernel panic during checkbox stress_ng_test on Grace running noble 6.8 (arm64+largemem) kernel

2024-03-25 Thread Mitchell Augustin
I did some more version testing, and I have not been able to reproduce this bug with the "aiol" stressor on either upstream 6.5 or Ubuntu 6.5.0-26-generic-64k, so it was evidently introduced after that version.

[Kernel-packages] [Bug 2058557] Re: Kernel panic during checkbox stress_ng_test on Grace running noble 6.8 (arm64+largemem) kernel

2024-03-22 Thread Mitchell Augustin
Earlier, I said that the device mapper observation did not seem to be a hard line. However, further testing now indicates that the situations where I observed panics when stressing nvme0n1 were due to an unrelated bug that is present in the latest 6.5 mainline tree, but *not* the latest 6.5

[Kernel-packages] [Bug 2058557] Re: Kernel panic during checkbox stress_ng_test on Grace running noble 6.8 (arm64+largemem) kernel

2024-03-22 Thread Mitchell Augustin
I did not observe this issue with any other stress_ng disk tests on linux-image-6.8.0-11-generic-64k after 1 full run of the suite with the "aiol" test disabled. (When running the "aiol" test alone, it panicked reliably each time.)

[Kernel-packages] [Bug 2058557] Re: Kernel panic during checkbox stress_ng_test on Grace running noble 6.8 (arm64+largemem) kernel

2024-03-21 Thread Mitchell Augustin
Upon further investigation, the device mapper observation does not seem to be a hard line, as I was able to observe panics when stressing both dm-0 and nvme0n1 under different circumstances. At the moment, it also seems like the specific part of stress_ng_test that is the culprit is the

[Kernel-packages] [Bug 2058557] Re: Kernel panic during checkbox stress_ng_test on Grace running noble 6.8 (arm64+largemem) kernel

2024-03-21 Thread Mitchell Augustin
I have observed that this panic does not seem to happen when stressing non-device-mapper devices (e.g., it panics when running /usr/lib/checkbox-provider-base/bin/stress_ng_test.py disk --device dm-0 --base-time 240, but completes successfully when running /usr/lib/checkbox-provider-

[Kernel-packages] [Bug 2058557] Re: Kernel panic during checkbox stress_ng_test on Grace running noble 6.8 (arm64+largemem) kernel

2024-03-20 Thread Mitchell Augustin
This is also reproducible on the latest mainline version (https://kernel.ubuntu.com/mainline/v6.8/arm64/, retrieved 20 Mar 2024 @ 5 PM):
20 Mar 22:54: Running stress-ng aiol stressor for 240 seconds...
[ 354.451450] Unable to handle kernel paging request at virtual address 17be9b4aa3e187be
[

[Kernel-packages] [Bug 2058557] [NEW] Kernel panic during checkbox stress_ng_test on Grace running noble 6.8 (arm64+largemem) kernel

2024-03-20 Thread Mitchell Augustin
Public bug reported:
A kernel oops and panic occurred during 22.04 SoC certification on Gunyolk (Grace/Grace) with 6.8 kernel, arm64+largemem variant
Steps to reproduce:
Run (as root) the following commands:
add-apt-repository -y ppa:checkbox-dev/stable
apt-add-repository -y

[Kernel-packages] [Bug 2029934] Re: arm64 AWS host hangs during modprobe nvidia on lunar and mantic

2024-02-07 Thread Mitchell Augustin
I identified a similar bug today when installing nvidia-fabricmanager-535 on a noble dev build for arm64 that may be related: https://bugs.launchpad.net/ubuntu/+source/fabric-manager-535/+bug/2052663