On Tue, Jul 01, 2025 at 06:18:37PM +0200, Jeremie Courreges-Anglas wrote:
> On Tue, Jun 24, 2025 at 05:21:56PM +0200, Jeremie Courreges-Anglas wrote:
> >
> > I think it's uvm_purge(), as far as I can see it happens when building
> > rust with cvs up -D2025/06/04 in /sys, not with -D2025/06/03. Maybe I
> > missed lang/rust when testing the diff.
> >
> > This is with additional MP_LOCKDEBUG support for mutexes, and
> > __mp_lock_spinout = 50L * INT_MAX.
> >
> > Suggested by claudio: tr /t 0t269515 fails.
>
> Should be fixed, at least for CPUs that ddb actually managed to
> stop...
>
> > WITNESS doesn't flag an obvious lock ordering issue. I'm not even
> > sure there is one. It also happen with CPU_MAX_BUSY_CYCLES == 64.
> >
> > Maybe we're still hammering too much the locks? Input and ideas to
> > test welcome. Right now I'm running with just uvm_purge() reverted.
>
> Reverting uvm_purge() did not help. I've been able to reproduce the
> hangs up to cvs up -D2025/05/18, backporting the arm64 ddb trace fix,
> mpi's mutex backoff diff and the mtx_enter MP_LOCKDEBUG diff.
> Currently trying to reproduce with cvs up -D2025/05/07 as suggested by
> dlg.
cvs up -D2025/05/07 also did not help. -current still locks up, but
the db_mutex fix helps get proper stacktraces:
--8<--
login: mtf_fnxe0:00nf2ffmrte8 0m21xt2 8xlef_ cf2nktft esl8r_u0f k00t sxor2f0
0tf
8oeff
ftff
efStopped at mtx_enter+0x13c: ldr x26, [x25,#2376]
ddb{8}> mach cpuinfo
0: stopped
1: stopped
2: stopped
3: stopped
4: stopped
5: stopped
6: stopped
7: stopped
* 8: ddb
9: stopped
ddb{8}> mach ddb 0
Stopped at aplintc_fiq_handler+0x6c: b ffffff8000be553c <aplin
tc_fiq_handler+0x7c>
ddb{0}> tr
db_enter() at aplintc_fiq_handler+0x68
aplintc_fiq_handler() at arm_cpu_fiq+0x44
arm_cpu_fiq() at handle_el1h_fiq+0x68
handle_el1h_fiq() at mtx_enter+0x10c
mtx_enter() at uvm_pmr_getpages+0x16c
uvm_pmr_getpages() at uvm_pmr_cache_alloc+0x74
uvm_pmr_cache_alloc() at uvm_pmr_cache_get+0x114
uvm_pmr_cache_get() at uvm_pagealloc+0xc4
uvm_pagealloc() at uvmfault_promote+0xac
uvmfault_promote() at uvm_fault_lower+0x2e4
uvm_fault_lower() at uvm_fault+0x158
uvm_fault() at udata_abort+0x128
udata_abort() at do_el0_sync+0x100
do_el0_sync() at handle_el0_sync+0x70
handle_el0_sync() at 0x4c7b735f4
--- trap ---
end of kernel
ddb{0}> mach ddb 1
Stopped at aplintc_fiq_handler+0x6c: b ffffff8000be553c <aplin
tc_fiq_handler+0x7c>
ddb{1}> tr
db_enter() at aplintc_fiq_handler+0x68
aplintc_fiq_handler() at arm_cpu_fiq+0x44
arm_cpu_fiq() at handle_el1h_fiq+0x68
handle_el1h_fiq() at sys_sched_yield+0x3c
sys_sched_yield() at svc_handler+0x450
svc_handler() at do_el0_sync+0x1cc
do_el0_sync() at handle_el0_sync+0x70
handle_el0_sync() at 0xb05ca7fdc
--- trap ---
end of kernel
ddb{1}> mach ddb 2
Stopped at aplintc_fiq_handler+0x6c: b ffffff8000be553c <aplin
tc_fiq_handler+0x7c>
ddb{2}> tr
db_enter() at aplintc_fiq_handler+0x68
aplintc_fiq_handler() at arm_cpu_fiq+0x44
arm_cpu_fiq() at handle_el1h_fiq+0x68
handle_el1h_fiq() at remrunqueue+0xfc
remrunqueue() at sched_chooseproc+0x110
sched_chooseproc() at mi_switch+0xf4
mi_switch() at sys_sched_yield+0x88
sys_sched_yield() at svc_handler+0x450
svc_handler() at do_el0_sync+0x1cc
do_el0_sync() at handle_el0_sync+0x70
handle_el0_sync() at 0xb05ca7fdc
--- trap ---
end of kernel
ddb{2}> mach ddb 3
Stopped at aplintc_fiq_handler+0x6c: b ffffff8000be553c <aplin
tc_fiq_handler+0x7c>
ddb{3}> tr
db_enter() at aplintc_fiq_handler+0x68
aplintc_fiq_handler() at arm_cpu_fiq+0x44
arm_cpu_fiq() at handle_el1h_fiq+0x68
handle_el1h_fiq() at mtx_enter+0x10c
mtx_enter() at sys_sched_yield+0x68
sys_sched_yield() at svc_handler+0x450
svc_handler() at do_el0_sync+0x1cc
do_el0_sync() at handle_el0_sync+0x70
handle_el0_sync() at 0xb05ca7fdc
--- trap ---
end of kernel
ddb{3}> mach ddb 4
Stopped at aplintc_fiq_handler+0x6c: b ffffff8000be553c <aplin
tc_fiq_handler+0x7c>
ddb{4}> tr
db_enter() at aplintc_fiq_handler+0x68
aplintc_fiq_handler() at arm_cpu_fiq+0x44
arm_cpu_fiq() at handle_el1h_fiq+0x68
handle_el1h_fiq() at mtx_enter+0x10c
mtx_enter() at wakeup_n+0x44
wakeup_n() at uvm_pmr_freepages+0x1f0
uvm_pmr_freepages() at uvm_anfree_list+0xec
uvm_anfree_list() at amap_wipeout+0xfc
amap_wipeout() at uvm_unmap_detach+0x94
uvm_unmap_detach() at sys_munmap+0x15c
sys_munmap() at svc_handler+0x450
svc_handler() at do_el0_sync+0x1cc
do_el0_sync() at handle_el0_sync+0x70
handle_el0_sync() at 0x4277ecffc
--- trap ---
end of kernel
ddb{4}> mach ddb 5
Stopped at aplintc_fiq_handler+0x6c: b ffffff8000be553c <aplin
tc_fiq_handler+0x7c>
ddb{5}> tr
db_enter() at aplintc_fiq_handler+0x68
aplintc_fiq_handler() at arm_cpu_fiq+0x44
arm_cpu_fiq() at handle_el1h_fiq+0x68
handle_el1h_fiq() at mtx_enter+0x10c
mtx_enter() at preempt+0x20
preempt() at ast+0x98
ast() at handle_el0_sync+0x9c
handle_el0_sync() at 0x16e30386bc
--- trap ---
end of kernel
ddb{5}> mach ddb 6
Stopped at aplintc_fiq_handler+0x6c: b ffffff8000be553c <aplin
tc_fiq_handler+0x7c>
ddb{6}> tr
db_enter() at aplintc_fiq_handler+0x68
aplintc_fiq_handler() at arm_cpu_fiq+0x44
arm_cpu_fiq() at handle_el1h_fiq+0x68
handle_el1h_fiq() at mtx_enter+0x10c
mtx_enter() at preempt+0x20
preempt() at ast+0x98
ast() at handle_el0_sync+0x9c
handle_el0_sync() at 0xb05cb3094
--- trap ---
end of kernel
ddb{6}> mach ddb 7
Stopped at aplintc_fiq_handler+0x6c: b ffffff8000be553c <aplin
tc_fiq_handler+0x7c>
ddb{7}> tr
db_enter() at aplintc_fiq_handler+0x68
aplintc_fiq_handler() at arm_cpu_fiq+0x44
arm_cpu_fiq() at handle_el1h_fiq+0x68
handle_el1h_fiq() at exuartcnputc+0x44
exuartcnputc() at cnputc+0x40
cnputc() at db_putchar+0x2bc
db_putchar() at kprintf+0xdd0
kprintf() at db_printf+0x78
db_printf() at mtx_enter+0x134
mtx_enter() at uvm_pmr_freepages+0xc8
uvm_pmr_freepages() at uvm_anfree_list+0xec
uvm_anfree_list() at amap_wiperange_chunk+0x104
amap_wiperange_chunk() at amap_wiperange+0x180
amap_wiperange() at amap_pp_adjref+0x25c
amap_pp_adjref() at amap_adjref_anons+0xf0
amap_adjref_anons() at uvm_unmap_detach+0x94
uvm_unmap_detach() at sys_munmap+0x15c
sys_munmap() at svc_handler+0x450
svc_handler() at do_el0_sync+0x1cc
do_el0_sync() at handle_el0_sync+0x70
handle_el0_sync() at 0x4ebf95ffc
--- trap ---
end of kernel
ddb{7}> mach ddb 8
Stopped at mtx_enter+0x13c: ldr x26, [x25,#2376]
ddb{8}> tr
db_enter() at mtx_enter+0x138
mtx_enter() at uvm_pmr_getpages+0x16c
uvm_pmr_getpages() at uvm_pmr_cache_alloc+0x74
uvm_pmr_cache_alloc() at uvm_pmr_cache_get+0x114
uvm_pmr_cache_get() at uvm_pagealloc+0xc4
uvm_pagealloc() at uvmfault_promote+0xac
uvmfault_promote() at uvm_fault_lower+0x2e4
uvm_fault_lower() at uvm_fault+0x158
uvm_fault() at udata_abort+0x128
udata_abort() at do_el0_sync+0x100
do_el0_sync() at handle_el0_sync+0x70
handle_el0_sync() at __ALIGN_SIZE+0x48d8044
--- trap ---
end of kernel
ddb{8}> mach ddb 9
Stopped at aplintc_fiq_handler+0x6c: b ffffff8000be553c <aplin
tc_fiq_handler+0x7c>
ddb{9}> tr
db_enter() at aplintc_fiq_handler+0x68
aplintc_fiq_handler() at arm_cpu_fiq+0x44
arm_cpu_fiq() at handle_el1h_fiq+0x68
handle_el1h_fiq() at exuartcnputc+0x44
exuartcnputc() at cnputc+0x40
cnputc() at db_putchar+0x2bc
db_putchar() at kprintf+0xdd0
kprintf() at db_printf+0x78
db_printf() at mtx_enter+0x134
mtx_enter() at uvm_pmr_getpages+0x16c
uvm_pmr_getpages() at uvm_pmr_cache_alloc+0x74
uvm_pmr_cache_alloc() at uvm_pmr_cache_get+0x114
uvm_pmr_cache_get() at uvm_pagealloc+0xc4
uvm_pagealloc() at uvmfault_promote+0xac
uvmfault_promote() at uvm_fault_lower+0x2e4
uvm_fault_lower() at uvm_fault+0x158
uvm_fault() at udata_abort+0x128
udata_abort() at do_el0_sync+0x100
do_el0_sync() at handle_el0_sync+0x70
handle_el0_sync() at __ALIGN_SIZE+0x37af29c
--- trap ---
end of kernel
ddb{9}> show all locks
CPU 4:
exclusive mutex &uvm.fpageqlock r = 0 (0xffffff80012f8e38)
CPU 2:
exclusive mutex &sched_lock r = 0 (0xffffff80012615f8)
CPU 1:
exclusive mutex &pr->ps_mtx r = 0 (0xffffff8119fede98)
Process 63925 (c++) thread 0xffffff81118dc088 (139279)
exclusive rwlock amaplk r = 0 (0xffffff81186ea920)
shared rwlock vmmaplk r = 0 (0xffffff8055613c40)
Process 45702 (c++) thread 0xffffff80ffcaef78 (100747)
exclusive rwlock amaplk r = 0 (0xffffff81137b5410)
Process 56051 (c++) thread 0xffffff81118dc820 (416882)
exclusive rwlock amaplk r = 0 (0xffffff8113c8ead0)
shared rwlock vmmaplk r = 0 (0xffffff810f232e88)
Process 1137 (c++) thread 0xffffff81029762d8 (371891)
exclusive rwlock amaplk r = 0 (0xffffff80540e6cc0)
shared rwlock vmmaplk r = 0 (0xffffff80556132e0)
Process 95468 (c++) thread 0xffffff81118dd240 (227740)
exclusive rwlock amaplk r = 0 (0xffffff811ab40078)
ddb{9}> show witness /b
Number of known direct relationships is 442
Lock order reversal between "&mp->mnt_lock"(rwlock) and "&ip->i_lock"(rrwlock)!
witness: incomplete path, depth 4
ddb{9}> ps /o
TID PID UID PRFLAGS PFLAGS CPU COMMAND
139279 63925 55 0x3 0 0 c++
57313 41142 55 0x2000003 0x4000000 5 rustc
100747 45702 55 0x3 0 4 c++
416882 56051 55 0x3 0 8 c++
*371891 1137 55 0x3 0 9 c++
227740 95468 55 0x3 0 7 c++
344959 21226 55 0x2000003 0x4000000 3 rustc
418483 21226 55 0x2000003 0x4000000 1 rustc
69626 21226 55 0x2000003 0x4000000 6 rustc
ddb{9}> ps
PID TID PPID UID S FLAGS WAIT COMMAND
63925 139279 79490 55 7 0x3 c++
79490 171954 36022 55 3 0x10008b sigsusp sh
41142 134365 40336 55 3 0x2000083 fsleep rustc
41142 302975 40336 55 3 0x6000083 piperd rustc
41142 57313 40336 55 7 0x6000003 rustc
40336 347395 15169 55 3 0x2000083 wait rustc
45702 100747 64449 55 7 0x3 c++
64449 250642 36022 55 3 0x10008b sigsusp sh
56051 416882 47805 55 7 0x3 c++
47805 1643 36022 55 3 0x10008b sigsusp sh
* 1137 371891 29510 55 7 0x3 c++
29510 131639 36022 55 3 0x10008b sigsusp sh
95468 227740 67080 55 7 0x3 c++
67080 286525 36022 55 3 0x10008b sigsusp sh
21226 108617 3494 55 3 0x2000083 fsleep rustc
21226 510541 3494 55 3 0x6000083 piperd rustc
21226 489915 3494 55 3 0x6000083 fsleep rustc
21226 67149 3494 55 3 0x6000083 piperd rustc
21226 48866 3494 55 3 0x6000083 fsleep rustc
21226 344959 3494 55 7 0x6000003 rustc
21226 418483 3494 55 7 0x6000003 rustc
21226 380395 3494 55 2 0x6000003 rustc
21226 69626 3494 55 7 0x6000003 rustc
3494 361736 15169 55 3 0x2000083 wait rustc
15169 47593 56577 55 3 0x2000083 fsleep cargo
15169 55128 56577 55 3 0x6000083 piperd cargo
15169 456809 56577 55 3 0x6000083 kqread cargo
15169 45734 56577 55 3 0x6000083 kqread cargo
56577 460218 94706 55 3 0x2000083 piperd bootstrap
94706 144930 19294 55 3 0x2200083 wait python3.12
19294 227744 46195 55 3 0x10008b sigsusp make
36022 189017 46616 55 3 0x8b kqread ninja
46616 494547 5179 55 3 0x83 kqread cmake
5179 41695 49137 55 3 0x10008b sigsusp make
46195 139722 78463 1000 3 0x10008b sigsusp make
78463 339800 44112 1000 3 0x10008b sigsusp sh
49137 166591 32884 1000 3 0x10008b sigsusp make
32884 130499 19324 1000 3 0x10008b sigsusp sh
19324 2171 13719 1000 3 0x10008b sigsusp make
44112 82636 13750 1000 3 0x10008b sigsusp make
13719 280625 37164 1000 3 0x10008b sigsusp ksh
13750 311794 37164 1000 3 0x10008b sigsusp ksh
37164 271851 1 1000 3 0x100080 kqread tmux
70208 65461 38741 1000 3 0x100083 kqread tmux
72473 190992 1 1000 3 0x1000b8 kqread ssh-agent
38741 398367 4112 1000 3 0x10008b sigsusp ksh
4112 49829 24648 1000 3 0x98 kqread sshd-session
24648 276484 70750 0 3 0x92 kqread sshd-session
64792 38488 1 0 3 0x100083 ttyin getty
13658 311859 1 0 3 0x100083 ttyin getty
93601 440231 1 0 3 0x100083 ttyin getty
63470 459539 1 0 3 0x100083 ttyin getty
96609 469760 1 0 3 0x100083 ttyin getty
30529 126857 1 0 3 0x100083 ttyin getty
25133 315686 1 0 3 0x100098 kqread cron
63695 355503 1 99 3 0x1100090 kqread sndiod
21489 342535 1 110 3 0x100090 kqread sndiod
2773 26068 87190 95 3 0x1100092 kqread smtpd
50414 65728 87190 103 3 0x1100092 kqread smtpd
26666 492754 87190 95 3 0x1100092 kqread smtpd
15809 384184 87190 95 3 0x100092 kqread smtpd
6296 50132 87190 95 3 0x1100092 kqread smtpd
62965 453772 87190 95 3 0x1100092 kqread smtpd
87190 36758 1 0 3 0x100080 kqread smtpd
70750 520713 1 0 3 0x88 kqread sshd
88177 217785 64329 0 3 0x80 nfsd nfsd
19091 209499 64329 0 3 0x80 nfsd nfsd
16547 436525 64329 0 3 0x80 nfsd nfsd
78468 310707 64329 0 3 0x80 nfsd nfsd
64329 34190 1 0 3 0x80 netacc nfsd
44169 309817 7798 0 3 0x80 kqread mountd
7798 3741 1 0 3 0x100080 kqread mountd
41132 247300 1 28 3 0x1100090 kqread portmap
41102 15370 59220 83 3 0x100092 kqread ntpd
59220 80027 79319 83 3 0x1100092 kqread ntpd
79319 186570 1 0 3 0x100080 kqread ntpd
72869 395225 46771 74 3 0x1100092 bpf pflogd
46771 417638 1 0 3 0x80 sbwait pflogd
94369 233003 28011 73 3 0x1100090 kqread syslogd
28011 381388 1 0 3 0x100082 sbwait syslogd
81963 219622 1 0 3 0x100080 kqread resolvd
88462 439074 21502 77 3 0x100092 kqread dhcpleased
59766 25595 21502 77 3 0x100092 kqread dhcpleased
21502 369741 1 0 3 0x80 kqread dhcpleased
556 135670 52238 115 3 0x100092 kqread slaacd
73789 109490 52238 115 3 0x100092 kqread slaacd
52238 190284 1 0 3 0x100080 kqread slaacd
41668 73399 0 0 3 0x14200 bored wsdisplay0
40970 505250 0 0 3 0x14200 bored apple-dcp-afkep%02x
97767 350383 0 0 3 0x14200 bored apple-dcp-afkep%02x
47220 373849 0 0 3 0x14200 bored apple-dcp-afkep%02x
88568 200274 0 0 3 0x14200 bored drmrtk
6787 522175 0 0 3 0x14200 bored smr
40522 9485 0 0 3 0x14200 pgzero zerothread
16498 109981 0 0 3 0x14200 aiodoned aiodoned
59611 103710 0 0 3 0x14200 syncer update
56156 385464 0 0 3 0x14200 cleaner cleaner
70501 345835 0 0 3 0x14200 reaper reaper
46109 83168 0 0 3 0x14200 pgdaemon pagedaemon
22925 161998 0 0 3 0x14200 usbtsk usbtask
63513 64083 0 0 3 0x14200 usbatsk usbatsk
18728 349413 0 0 3 0x14200 bored drmtskl
98853 157550 0 0 3 0x14200 bored drmlwq
39410 237152 0 0 3 0x14200 bored drmlwq
92541 76851 0 0 3 0x14200 bored drmlwq
47164 197379 0 0 3 0x14200 bored drmlwq
9853 127153 0 0 3 0x14200 bored drmubwq
18545 398258 0 0 3 0x14200 bored drmubwq
38466 331523 0 0 3 0x14200 bored drmubwq
16049 277430 0 0 3 0x14200 bored drmubwq
50444 316366 0 0 3 0x14200 bored drmhpwq
94598 67921 0 0 3 0x14200 bored drmhpwq
77503 379929 0 0 3 0x14200 bored drmhpwq
8860 448340 0 0 3 0x14200 bored drmhpwq
52476 388513 0 0 3 0x14200 bored drmwq
185 26590 0 0 3 0x14200 bored drmwq
45860 442452 0 0 3 0x14200 bored drmwq
64786 112562 0 0 3 0x14200 bored drmwq
58525 179335 0 0 3 0x14200 bored sensors
68906 400714 0 0 3 0x14200 bored sleep
92770 116939 0 0 3 0x40014200 idle9
54583 422906 0 0 3 0x40014200 idle8
93235 264602 0 0 3 0x40014200 idle7
23847 130999 0 0 3 0x40014200 idle6
87396 374260 0 0 3 0x40014200 idle5
67773 146451 0 0 3 0x40014200 idle4
664 465524 0 0 3 0x40014200 idle3
8634 372189 0 0 3 0x40014200 idle2
85417 214509 0 0 3 0x40014200 idle1
8301 414624 0 0 3 0x14200 bored softnet3
25868 484464 0 0 3 0x14200 bored softnet2
75194 231167 0 0 3 0x14200 bored softnet1
90163 153430 0 0 3 0x14200 bored softnet0
13057 294652 0 0 3 0x14200 bored systqmp
61843 256983 0 0 3 0x14200 bored systq
44957 446138 0 0 3 0x14200 tmoslp softclockmp
79423 156384 0 0 3 0x40014200 tmoslp softclock
40932 356300 0 0 3 0x40014200 idle0
80938 484135 0 0 3 0x14200 kmalloc kmthread
1 506573 0 0 3 0x82 wait init
0 0 -1 0 3 0x10200 scheduler swapper
ddb{9}>
-->8--
--
jca