Vitaliy Makkoveev writes:
> > On 17 May 2024, at 12:06, Stuart Henderson <[email protected]> wrote:
> >
> > There are problems with wg(4) that people with some workloads have been
> > seeing after upgrading past 7.3, though looking at this thread from when
> > it last came up https://marc.info/?t=170940892700001&r=1&w=2 I'm not
> > sure if we'd be expecting to see trouble on non-MP…
> >
>
> We do. The problem is not MP related.
>
> Antony, does the diff [1] help?
>
> 1. https://marc.info/?l=openbsd-bugs&m=170980835807159&w=2
Crashes continue to occur with the same frequency after patching.
Here are three more crashes from running with the patch. I've seen
identical traces with and without the patch, but these were not in
my last email.
kernel: page fault trap, code=0
Stopped at schedclock+0x8a: movzbl 0x344(%rax),%r13d
ddb> show panic
the kernel did not panic
ddb> trace
schedclock(ffff8000fffeaa68) at schedclock+0x8a
statclock(ffffffff82529bf8,ffff80001ca32a20,0) at statclock+0x129
clockintr_dispatch(ffff80001ca32a20) at clockintr_dispatch+0x30d
clockintr(ffff80001ca32a20) at clockintr+0x59
intr_handler(ffff80001ca32a20,ffff8000000e6000) at intr_handler+0x3c
Xintr_legacy0_untramp() at Xintr_legacy0_untramp+0x1a3
memset() at memset+0x5c
end trace frame: 0x0, count: -7
ddb> ps
PID TID PPID UID S FLAGS WAIT COMMAND
panic: pr_find_pagehead: mbufpl: incorrect page
Stopped at db_enter+0x14: popq %rbp
TID PID UID PRFLAGS PFLAGS CPU COMMAND
db_enter() at db_enter+0x14
panic(ffffffff82161d70) at panic+0xb5
pool_do_put(ffffffff8260b3c0,fffffd8028dbf600) at pool_do_put+0x27a
pool_put(ffffffff8260b3c0,fffffd8028dbf600) at pool_put+0x53
m_free(fffffd8028dbf600) at m_free+0xa6
m_freem(fffffd8028dbf600) at m_freem+0x38
vio_txeof(ffff800000064118) at vio_txeof+0x12d
vio_tx_intr(ffff800000064118) at vio_tx_intr+0x31
virtio_check_vqs(ffff800000024800) at virtio_check_vqs+0x102
virtio_pci_legacy_intr(ffff800000024800) at virtio_pci_legacy_intr+0x65
intr_handler(ffff80001ca7e7f0,ffff800000073e00) at intr_handler+0x3c
Xintr_legacy5_untramp() at Xintr_legacy5_untramp+0x1a3
memset() at memset+0x5c
wg_encap_worker(ffff8000007ed000) at wg_encap_worker+0x79
end trace frame: 0xffff80001ca7e9f0, count: 0
https://www.openbsd.org/ddb.html describes the minimum info required in bug
reports. Insufficient info makes it difficult to find and fix bugs.
ddb> trace
db_enter() at db_enter+0x14
panic(ffffffff82161d70) at panic+0xb5
pool_do_put(ffffffff8260b3c0,fffffd8028dbf600) at pool_do_put+0x27a
pool_put(ffffffff8260b3c0,fffffd8028dbf600) at pool_put+0x53
m_free(fffffd8028dbf600) at m_free+0xa6
m_freem(fffffd8028dbf600) at m_freem+0x38
vio_txeof(ffff800000064118) at vio_txeof+0x12d
vio_tx_intr(ffff800000064118) at vio_tx_intr+0x31
virtio_check_vqs(ffff800000024800) at virtio_check_vqs+0x102
virtio_pci_legacy_intr(ffff800000024800) at virtio_pci_legacy_intr+0x65
intr_handler(ffff80001ca7e7f0,ffff800000073e00) at intr_handler+0x3c
Xintr_legacy5_untramp() at Xintr_legacy5_untramp+0x1a3
memset() at memset+0x5c
wg_encap_worker(ffff8000007ed000) at wg_encap_worker+0x79
taskq_thread(ffff80000088ac00) at taskq_thread+0xf0
end trace frame: 0x0, count: -15
ddb> show panic
*cpu0: pr_find_pagehead: mbufpl: incorrect page
ddb> ps
PID TID PPID UID S FLAGS WAIT COMMAND
56587 470184 85475 0 3 0x18000083 dtread btrace
58952 222967 0 89 3 0x19100092 kqread relayd
83190 101464 0 89 3 0x19100092 kqread relayd
ddb> show registers
rdi 0x4
rsi 0x14
rbp 0xffff80001ca7e4a0
rbx 0xfffffd8028dbf600
rdx 0x3fd
rcx 0x4800000000000111
rax 0x30
r8 0x101010101010101
r9 0
r10 0x582c2a7821cc399f
r11 0xf4834d1e02cdca10
r12 0xfffffd8028dbf600
r13 0xffff800000024800
r14 0
r15 0xffffffff82161d70 pp_r600_decoded_lanes+0xc8aa
rip 0xffffffff81fa1d44 db_enter+0x14
cs 0x8
rflags 0x282
rsp 0xffff80001ca7e4a0
ss 0x10
db_enter+0x14: popq %rbp
panic: pr_find_pagehead: mbufpl: incorrect page
Stopped at db_enter+0x14: popq %rbp
TID PID UID PRFLAGS PFLAGS CPU COMMAND
*225925 73351 0 0x14000 0x200 0 wg_crypt
db_enter() at db_enter+0x14
panic(ffffffff82161d70) at panic+0xb5
pool_do_put(ffffffff8260b3c0,fffffd8035fd9400) at pool_do_put+0x27a
pool_put(ffffffff8260b3c0,fffffd8035fd9400) at pool_put+0x53
m_free(fffffd8035fd9400) at m_free+0xa6
m_freem(fffffd8035fd9400) at m_freem+0x38
vio_txeof(ffff800000064118) at vio_txeof+0x12d
vio_tx_intr(ffff800000064118) at vio_tx_intr+0x31
virtio_check_vqs(ffff800000024800) at virtio_check_vqs+0x102
virtio_pci_legacy_intr(ffff800000024800) at virtio_pci_legacy_intr+0x65
intr_handler(ffff80001c922500,ffff800000073e00) at intr_handler+0x3c
Xintr_legacy5_untramp() at Xintr_legacy5_untramp+0x1a3
memset() at memset+0x5c
wg_encap_worker(ffff8000007ef000) at wg_encap_worker+0x79
end trace frame: 0xffff80001c922700, count: 0
https://www.openbsd.org/ddb.html describes the minimum info required in bug
reports. Insufficient info makes it difficult to find and fix bugs.
ddb> show panic
*cpu0: pr_find_pagehead: mbufpl: incorrect page
ddb> trace
db_enter() at db_enter+0x14
panic(ffffffff82161d70) at panic+0xb5
pool_do_put(ffffffff8260b3c0,fffffd8035fd9400) at pool_do_put+0x27a
pool_put(ffffffff8260b3c0,fffffd8035fd9400) at pool_put+0x53
m_free(fffffd8035fd9400) at m_free+0xa6
m_freem(fffffd8035fd9400) at m_freem+0x38
vio_txeof(ffff800000064118) at vio_txeof+0x12d
vio_tx_intr(ffff800000064118) at vio_tx_intr+0x31
virtio_check_vqs(ffff800000024800) at virtio_check_vqs+0x102
virtio_pci_legacy_intr(ffff800000024800) at virtio_pci_legacy_intr+0x65
intr_handler(ffff80001c922500,ffff800000073e00) at intr_handler+0x3c
Xintr_legacy5_untramp() at Xintr_legacy5_untramp+0x1a3
memset() at memset+0x5c
wg_encap_worker(ffff8000007ef000) at wg_encap_worker+0x79
taskq_thread(ffff800000889080) at taskq_thread+0xf0
end trace frame: 0x0, count: -15
ddb> ps
PID TID PPID UID S FLAGS WAIT COMMAND
51969 144614 37729 0 2 0x18000003 btrace
40841 474945 76353 1000 3 0x810008b sigsusp ksh
76353 455143 78366 1000 3 0x18000098 kqread sshd-session
78366 500790 60748 0 3 0x18000092 kqread sshd-session
1661 483333 93900 89 3 0x19100092 kqread relayd
20971 454162 93900 89 3 0x19100092 kqread relayd
66174 90602 93900 89 3 0x19100092 kqread relayd
48738 445549 93900 89 3 0x19100092 kqread relayd
88711 54303 93900 89 3 0x19100092 kqread relayd
33085 157864 93900 89 2 0x19100012 relayd
36613 263398 93900 89 3 0x19100092 relayd
93900 61929 1 0 3 0x18000080 kqread relayd
58569 410836 1 0 3 0x8100083 ttyin ksh
30102 428727 1 0 3 0x18100098 kqread cron
*73351 225925 0 0 7 0x14200 wg_crypt
25707 237828 0 0 3 0x14200 bored wg_handshake
75251 422241 0 0 3 0x14200 bored wg_handshake
89402 219146 1 110 3 0x18100090 kqread sndiod
1652 116066 1 99 3 0x19100090 kqread sndiod
41636 131173 47944 95 3 0x19100092 kqread smtpd
56159 435661 47944 103 3 0x19100092 kqread smtpd
30864 263446 47944 95 3 0x18100092 kqread smtpd
64861 75991 47944 95 3 0x19100092 kqread smtpd
74399 157341 47944 95 3 0x19100092 kqread smtpd
47944 325461 1 0 3 0x18100080 kqread smtpd
60748 251840 1 0 3 0x18000088 kqread sshd
93282 26115 1 0 3 0x18100080 kqread ntpd
12262 492605 81276 83 3 0x18100092 kqread ntpd
81276 343918 1 83 2 0x19100492 ntpd
24416 419389 95291 74 3 0x19100092 bpf pflogd
95291 58348 1 0 3 0x18000080 sbwait pflogd
99456 71886 56811 73 3 0x19100090 kqread syslogd
57202 274926 82913 77 3 0x18100092 kqread dhcpleased
93609 415070 82913 77 3 0x18100092 kqread dhcpleased
82913 38615 1 0 3 0x18000080 kqread dhcpleased
39413 85502 22242 115 3 0x18100092 kqread slaacd
84235 356871 22242 115 3 0x18100092 kqread slaacd
22242 283359 1 0 3 0x18100080 kqread slaacd
53776 372278 0 0 3 0x14200 bored smr
16202 188026 0 0 3 0x14200 pgzero zerothread
40368 204141 0 0 3 0x14200 aiodoned aiodoned
18183 419428 0 0 3 0x14200 syncer update
79669 281449 0 0 3 0x14200 cleaner cleaner
80971 55573 0 0 3 0x14200 reaper reaper
88433 220842 0 0 3 0x14200 pgdaemon pagedaemon
34834 242944 0 0 3 0x14200 bored softnet3
28119 493362 0 0 3 0x14200 bored softnet2
41877 463150 0 0 3 0x14200 bored softnet1
16167 354819 0 0 3 0x14200 bored softnet0
93717 296304 0 0 3 0x14200 bored systqmp
45065 39416 0 0 3 0x14200 bored systq
46106 21722 0 0 3 0x40014200 tmoslp softclock
25869 146461 0 0 3 0x40014200 idle0
1 357659 0 0 3 0x8000082 wait init
0 0 -1 0 3 0x10200 scheduler swapper
ddb> show registers
rdi 0x4
rsi 0x14
rbp 0xffff80001c9221b0
rbx 0xfffffd8035fd9400
rdx 0x3fd
rcx 0x4800000000000111
rax 0x30
r8 0x101010101010101
r9 0
r10 0x8dd14be7a93050dc
r11 0xe3e5f94705a0c9e7
r12 0xfffffd8035fd9400
r13 0xffff800000024800
r14 0
r15 0xffffffff82161d70 pp_r600_decoded_lanes+0xc8aa
rip 0xffffffff81fa1d44 db_enter+0x14
cs 0x8
rflags 0x286
rsp 0xffff80001c9221b0
ss 0x10
db_enter+0x14: popq %rbp