Hello,
after 2 days uptime there was another crash.

OpenBSD 7.2-beta (GENERIC.MP) #712: Mon Aug 29 12:35:51 MDT 2022
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP

ddb{2}> show panic
*cpu2: uvm_fault(0xffffffff823e0440, 0x0, 0, 2) -> e
ddb{2}> trace
amdgpu_vram_mgr_reserve_range(ffff8000226d3738,14,9) at amdgpu_vram_mgr_reserve_range+0x101
esp46_input(ffff8000226d3738,ffff8000226d3744,32,2) at esp46_input+0xee
ip_deliver(ffff8000226d3738,ffff8000226d3744,32,2) at ip_deliver+0x137
ipintr() at ipintr+0x69
if_netisr(0) at if_netisr+0xea
taskq_thread(ffff80000002c080) at taskq_thread+0x100
end trace frame: 0x0, count: -6
ddb{2}> show register
rdi     0xffff800000cf4478
rsi     0
rbp     0xffff8000226d3630
rbx     0x4
rdx     0xd0ec    __ALIGN_SIZE+0xc0ec
rcx     0x5
rax     0
r8      0x10
r9      0x40bc0e0c79f31f6e
r10     0
r11     0x9a636d5cd973370a
r12     0x32
r13     0x14
r14     0xffff8000226d3738
r15     0xffff800000cf4448
rip     0xffffffff81968321    amdgpu_vram_mgr_reserve_range+0x101
cs      0x8
rflags  0x10246    __ALIGN_SIZE+0xf246
rsp     0xffff8000226d3598
ss      0x10
amdgpu_vram_mgr_reserve_range+0x101: addb %al,0(%rax)
ddb{2}> ps
   PID     TID   PPID    UID  S       FLAGS  WAIT          COMMAND
 86171  333411  63955     74  3   0x1100092  bpf           pflogd
 63955  461469      1      0  3        0x80  netio         pflogd
 44926  303198      1      0  3    0x100083  ttyin         getty
 91823  366301      1      0  3    0x100098  kqread        cron
 56167  242659      1      0  3        0x80  nanoslp       apcupsd
 56167  147809      1      0  3   0x4000088  sigwait       apcupsd
 56167  171020      1      0  3   0x4000080  nanoslp       apcupsd
 47439   46683      1     99  3   0x1100090  kqread        sndiod
 51482  395614      1    110  3    0x100090  kqread        sndiod
  5163  480478  99386     95  3   0x1100092  kqread        smtpd
 76226  349104  99386    103  3   0x1100092  kqread        smtpd
 53394   15461  99386     95  3   0x1100092  kqread        smtpd
  7344  376375  99386     95  3    0x100092  kqread        smtpd
 63235  309137  99386     95  3   0x1100092  kqread        smtpd
 48160   64945  99386     95  3   0x1100092  kqread        smtpd
 99386   38247      1      0  3    0x100080  kqread        smtpd
 41420  333984      1     77  3   0x1100090  kqread        dhcpd
 17481  420686      1      0  3        0x88  kqread        sshd
 18611  160721  77165     68  3   0x1000090  kqread        isakmpd
 77165  391744      1      0  3        0x80  netio         isakmpd
 79254   85499      1      0  3    0x100080  kqread        ntpd
 10257   99703  79620     83  3    0x100092  kqread        ntpd
 79620  463938      1     83  3   0x1100092  kqread        ntpd
 91894  184130  24465     73  3   0x1100090  kqread        syslogd
 24465  244414      1      0  3    0x100082  netio         syslogd
 62105  141098      1      0  3    0x100080  kqread        resolvd
 19257  376757  61629     77  3    0x100092  kqread        dhcpleased
 76652  204506  61629     77  3    0x100092  kqread        dhcpleased
 61629  486499      1      0  3        0x80  kqread        dhcpleased
 90626  362267  95555    115  3    0x100092  kqread        slaacd
 93187   97889  95555    115  3    0x100092  kqread        slaacd
 95555  477868      1      0  3    0x100080  kqread        slaacd
 11780   22955      0      0  3     0x14200  bored         smr
 98305  221595      0      0  3     0x14200  pgzero        zerothread
 96593  391889      0      0  3     0x14200  aiodoned      aiodoned
 30232  412444      0      0  3     0x14200  syncer        update
 45741  353942      0      0  3     0x14200  cleaner       cleaner
 39902  310884      0      0  3     0x14200  reaper        reaper
 65354  212624      0      0  3     0x14200  pgdaemon      pagedaemon
 33348  495407      0      0  3     0x14200  mmctsk        sdmmc0
 74730  412089      0      0  3     0x14200  usbtsk        usbtask
 86868  405536      0      0  3     0x14200  usbatsk       usbatsk
 20735  139086      0      0  3  0x40014200  acpi0         acpi0
  6266   77841      0      0  3  0x40014200                idle3
 61610  247494      0      0  3  0x40014200                idle2
 55796  232481      0      0  7  0x40014200                idle1
 91772  351464      0      0  3     0x14200  bored         sensors
 89632  253333      0      0  3     0x14200  bored         softnet
  4662  418517      0      0  3     0x14200  bored         softnet
 22887  421630      0      0  7     0x14200                softnet
*56210  145960      0      0  7     0x14200                softnet
 45994  274986      0      0  3     0x14200  bored         systqmp
 23497  183217      0      0  3     0x14200  bored         systq
 99138  346429      0      0  3  0x40014200  bored         softclock
 83442  336559      0      0  7  0x40014200                idle0
     1  436237      0      0  3        0x82  wait          init
     0       0     -1      0  3     0x10200  scheduler     swapper
ddb{2}> mach ddbcpu 0
Stopped at x86_ipi_db+0x12: leave
x86_ipi_db(ffffffff822d4ff0) at x86_ipi_db+0x12
x86_ipi_handler() at x86_ipi_handler+0x80
Xresume_lapic_ipi() at Xresume_lapic_ipi+0x23
_kernel_lock() at _kernel_lock+0xa6
softintr_dispatch(0) at softintr_dispatch+0x49
Xsoftclock() at Xsoftclock+0x1f
acpicpu_idle() at acpicpu_idle+0x11f
sched_idle(ffffffff822d4ff0) at sched_idle+0x280
end trace frame: 0x0, count: 7
ddb{0}> mach ddbcpu 1
Stopped at x86_ipi_db+0x12: leave
x86_ipi_db(ffff800022508ff0) at x86_ipi_db+0x12
x86_ipi_handler() at x86_ipi_handler+0x80
Xresume_lapic_ipi() at Xresume_lapic_ipi+0x23
acpicpu_idle() at acpicpu_idle+0x11f
sched_idle(ffff800022508ff0) at sched_idle+0x280
end trace frame: 0x0, count: 10
ddb{1}> mach ddbcpu 2
Stopped at amdgpu_vram_mgr_reserve_range+0x101: addb %al,0(%rax)
amdgpu_vram_mgr_reserve_range(ffff8000226d3738,14,9) at amdgpu_vram_mgr_reserve_range+0x101
esp46_input(ffff8000226d3738,ffff8000226d3744,32,2) at esp46_input+0xee
ip_deliver(ffff8000226d3738,ffff8000226d3744,32,2) at ip_deliver+0x137
ipintr() at ipintr+0x69
if_netisr(0) at if_netisr+0xea
taskq_thread(ffff80000002c080) at taskq_thread+0x100
end trace frame: 0x0, count: 9
ddb{2}> mach ddbcpu 3
Stopped at x86_ipi_db+0x12: leave
x86_ipi_db(ffff80002251aff0) at x86_ipi_db+0x12
x86_ipi_handler() at x86_ipi_handler+0x80
Xresume_lapic_ipi() at Xresume_lapic_ipi+0x23
__mp_acquire_count(ffff80000002c100,ffff80000002c118) at __mp_acquire_count
taskq_next_work(ffff80000002c100,ffff8000226d93d0) at taskq_next_work+0x61
taskq_thread(ffff80000002c100) at taskq_thread+0xeb
end trace frame: 0x0, count: 9
ddb{3}>


On Wed, 31 Aug 2022 22:07:45 +0200
Radek <r...@int.pl> wrote:

> Hello Alexandr, hello Alexander,
>
> > does your box run also diff committed [1] by bluhm@ ~week ago?
> No, I didn't. I missed that diff. I upgraded to a new snapshot yesterday. It
> works fine so far.
> OpenBSD 7.2-beta (GENERIC.MP) #712: Mon Aug 29 12:35:51 MDT 2022
> dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
>
> Thank you Alexander for your extensive explanation of the proper ddb commands
> order.
>
> Radek
>
>
> On Mon, 29 Aug 2022 12:30:31 +0200
> Alexander Bluhm <alexander.bl...@gmx.net> wrote:
>
> > On Mon, Aug 29, 2022 at 04:42:45AM +0200, Radek wrote:
> > > the same problem occurs on -current.
> >
> > It is not the same problem.  Traces are different.  But I guess
> > your setup triggers some sort of race.
> >
> > Previous crashes with 7.1 were in route and IPsec, now it is in pf.
> > Unfortunately you missed my pf fragment fix by a couple of hours.
> > Please try a newer snapshot.
> >
> > OpenBSD 7.2-beta (GENERIC.MP) #705: Mon Aug 22 12:25:07 MDT 2022
> > Changes by: bl...@cvs.openbsd.org 2022/08/22 14:35:39
> >
> > I could not figure out what is wrong with 7.1-stable crashes.  The
> > register and ps output are not from the CPU where the crash happened.
> > You have to run show register and ps before switching CPU with mach
> > ddbcpu.
> >
> > So first run show panic.  Then trace, show register, ps.
> > Finally inspect the other CPU with mach ddbcpu.
> >
> > The number in ddb{2}> prompt shows the CPU you are currently on.
> > If "show panic" mentions more than one CPU, the one with the * is
> > the interesting one.  Usually ddb drops to that initially.  Traces
> > from other CPU help to see if something was running concurrently.
> >
> > bluhm
> >
> > > Radek
> Radek
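
The command order bluhm describes (show panic, trace, show register and ps first, mach ddbcpu only afterwards) can be checked mechanically on a saved console log before sending a report. The following is a minimal Python sketch and not part of the original thread. The function name check_ddb_order and the keys it returns are hypothetical, and it assumes the log has one prompt or output line per text line, with prompts of the form ddb{N}> and the panicking CPU marked as *cpuN: in the show panic output. It only verifies that the four commands appear before the first CPU switch, not their relative order.

```python
import re

PROMPT = re.compile(r"^ddb\{(\d+)\}>\s*(.*)$")   # e.g. "ddb{2}> show panic"
PANIC_CPU = re.compile(r"^\*cpu(\d+):")          # e.g. "*cpu2: uvm_fault(...)"
# Command prefixes that should be typed before the first "mach ddbcpu".
REQUIRED = ("show panic", "trace", "show reg", "ps")


def check_ddb_order(log: str) -> dict:
    """Summarise a ddb console log (hypothetical helper, see assumptions above)."""
    panic_cpu = None
    commands = []  # (CPU number from the prompt, command text), in typed order
    for raw in log.splitlines():
        line = raw.strip()
        prompt = PROMPT.match(line)
        if prompt:
            if prompt.group(2):  # ignore an empty trailing prompt
                commands.append((int(prompt.group(1)), prompt.group(2).strip()))
            continue
        starred = PANIC_CPU.match(line)
        if starred and panic_cpu is None:
            panic_cpu = int(starred.group(1))

    # Everything typed before the first "mach ..." ran on the initial CPU.
    before_switch = []
    for _cpu, cmd in commands:
        if cmd.startswith("mach"):
            break
        before_switch.append(cmd)

    missing = [r for r in REQUIRED
               if not any(c.startswith(r) for c in before_switch)]
    return {
        "panic_cpu": panic_cpu,
        "initial_cpu": commands[0][0] if commands else None,
        "missing_before_switch": missing,
    }


log = """\
ddb{2}> show panic
*cpu2: uvm_fault(0xffffffff823e0440, 0x0, 0, 2) -> e
ddb{2}> trace
ddb{2}> show register
ddb{2}> ps
ddb{2}> mach ddbcpu 0
"""
print(check_ddb_order(log))
```

An empty missing_before_switch list, together with initial_cpu equal to panic_cpu, suggests the register and ps output came from the CPU that panicked.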