On Mon, 27 Apr 2026, Quentin Thébault wrote:

Hello,

I have been running 15-STABLE for a few weeks and since around three weeks ago 
I have had to come back to an older BE (stable/15-n283067-e9d3512bb587) because 
I get an amdgpu-related panic on boot.

Which stable/15 hashes have you tried since e9d3512bb587?
e9d3512bb587 is from April 21 so not really "a few weeks old"?

There were a lot of MFCs in the last week, so having a more precise hash which 
crashed would help.
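
For reference, a version string such as the stable/15-n283067-e9d3512bb587
quoted above already carries the abbreviated hash. A minimal sketch of
splitting it apart, assuming the layout is branch, then "n" plus a commit
count, then the short hash; the function name is hypothetical and only for
illustration:

```python
import re

# Assumed layout: <branch>-n<commit count>-<abbreviated hash>
_VERSION_RE = re.compile(
    r"^(?P<branch>.+)-n(?P<count>\d+)-(?P<hash>[0-9a-f]{7,40})$"
)


def parse_git_version(text):
    """Split a string like 'stable/15-n283067-e9d3512bb587' into its parts."""
    m = _VERSION_RE.match(text.strip())
    if m is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    return {
        "branch": m["branch"],
        "count": int(m["count"]),
        "hash": m["hash"],
    }


if __name__ == "__main__":
    print(parse_git_version("stable/15-n283067-e9d3512bb587"))
```

Running the same helper over the string of each boot environment that was
tried would give the list of hashes asked for here.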


You will find attached two crashinfos:
- core.txt.8 is with drm-66-kmod built from the ports tree,
KDB: stack backtrace:
#0 0xffffffff80bcf71d at kdb_backtrace+0x5d
#1 0xffffffff80b80316 at vpanic+0x136
#2 0xffffffff80b801d3 at panic+0x43
#3 0xffffffff81091b88 at trap_fatal+0x68
#4 0xffffffff81067b78 at calltrap+0x8
#5 0xffffffff80e0832a at lkpi_devm_device_add_group+0x3a

That function itself hasn't been changed in ages, so I would assume
that your src.git and drm-kmod are out of sync somehow.  Was the ports
tree updated before?

The only thing which changed related to this function is the binattr change:
main: April 1st      5bb0f63020669bd3675c651ba7745fc4356edc1a
stable/15: April 21  901aec0a855b5c69b327e423558fbcc001781805.

The matching drm-kmod master change was committed and backed out:

drm-kmod master:
Feb  12 d99f16276e06736ceff9ea3ae5a1ac09135bcfdc add initially
March 5 f287ee46b250e8b59805e1c0953c4aba2c0866c9 revert
April 1 c7b061f5a322994292e93bc1c09418abbe915097 re-add

but it never made it to drm-6.6 so should not have had any effect there.

It would be good (a) to have the panic message, and (b) if you could look up
the line numbers (on main this is in sysfs.h, which indicates my initial guess
was right).
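
As a rough aid for the line-number lookup, each frame's symbol and offset can
be turned into an expression for a debugger that has the matching kernel and
module debug symbols loaded. A sketch, assuming the frame format seen in this
backtrace and gdb-style "list *(symbol+offset)" expressions; the helper names
are hypothetical:

```python
import re

# One frame looks like:
#   "#5 0xffffffff80e0832a at lkpi_devm_device_add_group+0x3a"
_FRAME_RE = re.compile(
    r"^#(?P<idx>\d+)\s+0x(?P<addr>[0-9a-f]+)\s+at\s+"
    r"(?P<sym>[\w.]+)\+0x(?P<off>[0-9a-f]+)$"
)


def parse_frames(backtrace):
    """Return (index, address, symbol, offset) for every frame line found."""
    frames = []
    for line in backtrace.splitlines():
        m = _FRAME_RE.match(line.strip())
        if m:
            frames.append(
                (int(m["idx"]), int(m["addr"], 16), m["sym"], int(m["off"], 16))
            )
    return frames


def gdb_list_expr(sym, off):
    """Build a gdb-style expression for the source line at symbol+offset."""
    return f"list *({sym}+{off:#x})"


if __name__ == "__main__":
    trace = """\
KDB: stack backtrace:
#5 0xffffffff80e0832a at lkpi_devm_device_add_group+0x3a
#6 0xffffffff84615ab2 at amdgpu_device_init+0x1cf2
"""
    for idx, addr, sym, off in parse_frames(trace):
        print(idx, gdb_list_expr(sym, off))
```

The output is only meaningful when the debug symbols match the exact kernel
and drm-kmod build that panicked.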


#6 0xffffffff84615ab2 at amdgpu_device_init+0x1cf2
#7 0xffffffff84636fc6 at amdgpu_driver_load_kms+0x16
#8 0xffffffff846268df at amdgpu_pci_probe+0x29f
#9 0xffffffff80e10dcc at linux_pci_attach_device+0x57c
#10 0xffffffff80bbe5cd at device_attach+0x43d
#11 0xffffffff80bc0323 at bus_generic_driver_added+0x73
#12 0xffffffff80bbbb89 at devclass_driver_added+0x29
#13 0xffffffff80bbbb1e at devclass_add_driver+0x11e
#14 0xffffffff80e11e2c at _linux_pci_register_driver+0xcc
#15 0xffffffff84626633 at amdgpu_evh+0x73
#16 0xffffffff80b5a5a5 at module_register_init+0x85
#17 0xffffffff80b4b12f at linker_load_module+0xc0f

- core.txt.9 is with drm-latest-kmod from pkg.
KDB: stack backtrace:
#0 0xffffffff80bcf71d at kdb_backtrace+0x5d
#1 0xffffffff80b80316 at vpanic+0x136
#2 0xffffffff80b801d3 at panic+0x43
#3 0xffffffff81091fcd at trap_pfault+0x37d
#4 0xffffffff81067b78 at calltrap+0x8
#5 0xffffffff80e1f997 at xa_load+0x77

This one is even stranger: the last changes were in 2023/2024, the only
others made arguments const in the radix tree, and some of that code goes
back untouched to 2011.

#6 0xffffffff84536966 at drm_sched_entity_pop_job+0x86
#7 0xffffffff84535445 at drm_sched_run_job_work+0x215
#8 0xffffffff80e1ece4 at linux_work_fn+0xe4
#9 0xffffffff80be6212 at taskqueue_run_locked+0x182
#10 0xffffffff80be73e2 at taskqueue_thread_loop+0xc2
#11 0xffffffff80b36b5b at fork_exit+0x7b
#12 0xffffffff81068b9e at fork_trampoline+0xe

Is this a known bug? Is there something I can do?

Can you provide the two core.txt files (happily privately)?

Alternatively, I'd suggest trying 15.1-BETA1, which should come out any moment
(if it hasn't already), as that would give us a clear sync point.


--
Bjoern A. Zeeb                                                     r15:7
