Re: i386 pmap_remove_ptes_pae uvm_fault

2025-10-08 Thread Martin Pieuchot
On 12/09/25(Fri) 18:45, Alexander Bluhm wrote: > On Thu, Sep 11, 2025 at 06:31:42PM +0200, Martin Pieuchot wrote: > > On 08/09/25(Mon) 18:53, Martin Pieuchot wrote: > > > On 29/08/25(Fri) 19:12, Alexander Bluhm wrote: > > > > Hi, > > > > > > >

Re: landisk pmrwait (was: Re: i386 pmap_remove_ptes_pae uvm_fault)

2025-10-02 Thread Martin Pieuchot
On 26/09/25(Fri) 07:46, Miod Vallat wrote: > > Diff below includes a fix for a missing wakeup. It might correspond to > > the bug you're seeing. > > > > Problem is that there are two sleep channels and two mechanisms for OOM > > situations. The IOdone daemon was waking only one of the two. > >

Re: sparc64 panic during make build

2025-09-21 Thread Martin Pieuchot
On 20/09/25(Sat) 22:59, Claudio Jeker wrote: > The M10-1 hits this panic in roughly 24h of running make -j 32 build in a > loop. First time it exploded inside the reaper for me. So maybe this is > closer to the truth. This smells like a sparc64 pmap bug (or bugs)... What's interesting is that all

Re: i386 pmap_remove_ptes_pae uvm_fault

2025-09-19 Thread Martin Pieuchot
On 11/09/25(Thu) 17:12, Miod Vallat wrote: > > Note that NetBSD also calls pmap_kenter_pa(9) in this case. So maybe > > there's a fix for landisk out there. Anyone care about landisk? > > I'm not sure of this, but to begin with, it appears we lack that fix, > which I am testing at the moment (wit

Re: i386 pmap_remove_ptes_pae uvm_fault

2025-09-17 Thread Martin Pieuchot
On 29/08/25(Fri) 19:12, Alexander Bluhm wrote: > Hi, > > One of my i386 test machines crashed during make build. Kernel is > GENERIC.MP built from current sources. > > panic: uvm_fault(0xd59b2424, 0xcf80, 0, 1) -> e > Stopped at db_enter+0x4: popl%ebp > TIDPIDUID P

Re: i386 pmap_remove_ptes_pae uvm_fault

2025-09-11 Thread Martin Pieuchot
On 08/09/25(Mon) 18:53, Martin Pieuchot wrote: > On 29/08/25(Fri) 19:12, Alexander Bluhm wrote: > > Hi, > > > > One of my i386 test machines crashed during make build. Kernel is > > GENERIC.MP built from current sources. > > > > panic: uvm_fault(0xd59b2424

Re: errors in dmesg with qwx but network work

2025-09-03 Thread Martin Pieuchot
On 02/09/25(Tue) 11:24, Theo de Raadt wrote: > > Perhaps pre-allocation is the better strategy in this case? > > I think so. I agree. Interfaces that do not pre-allocate such "constrained" memory tend to die during heavy swapping. iwx(4) also has this problem. We should try to pre-allocate as

Re: kernel diagnostic assertion in uvm_map.c stopping vmd vm

2025-08-23 Thread Martin Pieuchot
On 21/08/25(Thu) 19:21, Dave Voutila wrote: > Added subject. > > o...@disroot.org writes: > > >>Synopsis: Kernel crash when shutting down hypervised VM > >>Category: kernel > >>Environment: > > System : OpenBSD 7.7 > > Details : OpenBSD 7.7 (GENERIC) #619: Sun Apr 13 08:19:34

Re: ISSET(bp->b_flags, B_BC) failed: file "/usr/src/sys/kern/vfs_bio.c", line 391

2025-08-17 Thread Martin Pieuchot
On 16/08/25(Sat) 21:56, Kirill A. Korinsky wrote: > On Fri, 15 Aug 2025 14:44:51 +0200, > Martin Pieuchot wrote: > > > > On 07/08/25(Thu) 17:17, Kirill A. Korinsky wrote: > > > On Wed, 06 Aug 2025 14:41:51 +0200, > > > Kirill A. Korinsky wrote: > >

Re: uvm_map_protect(9) & amd64 pmap bug?

2025-08-16 Thread Martin Pieuchot
On 16/08/25(Sat) 14:41, Mark Kettenis wrote: > > Date: Fri, 15 Aug 2025 15:53:25 +0200 > > From: Martin Pieuchot > > > > On 15/08/25(Fri) 15:45, Mark Kettenis wrote: > > > > Date: Fri, 15 Aug 2025 14:06:06 +0200 > > > > From: Martin Pieucho

Re: uvm_map_protect(9) & amd64 pmap bug?

2025-08-15 Thread Martin Pieuchot
On 15/08/25(Fri) 15:45, Mark Kettenis wrote: > > Date: Fri, 15 Aug 2025 14:06:06 +0200 > > From: Martin Pieuchot > > > > On 15/08/25(Fri) 13:17, Mark Kettenis wrote: > > > > Date: Fri, 15 Aug 2025 11:51:18 +0200 > > > > From: Martin Pieuchot >

Re: ISSET(bp->b_flags, B_BC) failed: file "/usr/src/sys/kern/vfs_bio.c", line 391

2025-08-15 Thread Martin Pieuchot
On 07/08/25(Thu) 17:17, Kirill A. Korinsky wrote: > On Wed, 06 Aug 2025 14:41:51 +0200, > Kirill A. Korinsky wrote: > > > > On Wed, 06 Aug 2025 11:34:45 +0200, > > Kirill A. Korinsky wrote: > > > > > > >How-To-Repeat: > > > Not sure, can guess that the next build will crash as well. > > > >

Re: uvm_map_protect(9) & amd64 pmap bug?

2025-08-15 Thread Martin Pieuchot
On 15/08/25(Fri) 13:17, Mark Kettenis wrote: > > Date: Fri, 15 Aug 2025 11:51:18 +0200 > > From: Martin Pieuchot > > > > On 15/08/25(Fri) 09:39, Miod Vallat wrote: > > > > Please don't. Keeping that page read-only is important for security. > > >

Re: uvm_map_protect(9) & amd64 pmap bug?

2025-08-15 Thread Martin Pieuchot
On 15/08/25(Fri) 09:39, Miod Vallat wrote: > > Please don't. Keeping that page read-only is important for security. > > Maybe if nobody cares about the amd64 and i386 pmaps we should just > > delete those architectures? > > But remember, because the end argument was wrong (sz instead of va + > sz

Re: uvm_map_protect(9) & amd64 pmap bug?

2025-08-15 Thread Martin Pieuchot
On 30/03/25(Sun) 15:38, Martin Pieuchot wrote: > On 28/03/25(Fri) 19:53, Miod Vallat wrote: > > > If this code has never been tested on pmap_kernel() then it is dead code > > > and I'd rather remove it. Whoever wants to reduce the permission of the > > >

CyberTAN NU361-HS 802.11a/b/g/n/ac Wifi + Bluetooth Combo Kernel crash

2025-07-28 Thread Martin
Additional info from CPU 4 ddb{4}> machine ddbcpu 4 Stopped at mtx_enter_try+0x42:movl0x8(%rdi),%edi mtx_enter_try(18) at mtx_enter_try+042 mtx_enter(18) at mtx_enter+0x35 taskq_do_barrier(0) at taskq_do_barrier+0x69 bwfm_detach(80d22000,1) at bwfm_detach+0x5c bwfm_usb_detach(f

CyberTAN NU361-HS 802.11a/b/g/n/ac Wifi + Bluetooth Combo Kernel crash

2025-07-28 Thread Martin
Hi! I tried to connect a CyberTAN HU361-HS WiFi + Bluetooth combo module to an OpenBSD host. It should be supported by the bwfm driver (at least the chip in this module is listed as supported in man bwfm), but after connecting, it is recognized by the bwfm driver and then produces lots of messages like: ... bwfm0: could not

Re: mac mini m2 hangs while building lang/rust

2025-07-07 Thread Martin Pieuchot
On 07/07/25(Mon) 14:11, Jeremie Courreges-Anglas wrote: > Thanks folks for your replies, > > On Mon, Jul 07, 2025 at 12:26:21PM +0200, Mark Kettenis wrote: > > > Date: Mon, 7 Jul 2025 08:17:37 +0200 > > > From: Martin Pieuchot > > > > > > On 06/07/

Re: mac mini m2 hangs while building lang/rust

2025-07-06 Thread Martin Pieuchot
On 06/07/25(Sun) 21:15, Jeremie Courreges-Anglas wrote: > On Tue, Jul 01, 2025 at 06:18:37PM +0200, Jeremie Courreges-Anglas wrote: > > On Tue, Jun 24, 2025 at 05:21:56PM +0200, Jeremie Courreges-Anglas wrote: > > > > > > I think it's uvm_purge(), as far as I can see it happens when building > > >

Re: server freezes under heavy CPU usage

2025-07-01 Thread Martin Pieuchot
On 01/07/25(Tue) 14:29, K R wrote: > On Tue, Jul 1, 2025 at 9:07 AM Claudio Jeker wrote: > > > > On Tue, Jul 01, 2025 at 04:57:14AM -0300, K R wrote: > > > On Mon, Jun 30, 2025 at 2:39 AM K R wrote: > > > > > > > > On Fri, Jun 27,

Re: server freezes under heavy CPU usage

2025-06-26 Thread Martin Pieuchot
On 26/06/25(Thu) 11:02, K R wrote: > On Wed, Jun 25, 2025 at 1:30 PM K R wrote: > > > > [...] > > > Hi Alexander, > > > > > > The good news: I can consistently reproduce the hang problem. But the > > > bad news is that even with a WITNESS kernel and kern.witness.watch=2 > > > (or even 3) I don't

Re: kernel panic with protectli vp2430

2025-05-27 Thread Martin Pieuchot
Thanks for the report. On 27/05/25(Tue) 11:17, Tim Kuijsten wrote: > >Synopsis:kernel panic while ssh(1) with ~14 MByte/sec > >Category:system kernel amd64 > >Environment: > System : OpenBSD 7.7 > Details : OpenBSD 7.7 (GENERIC.MP) #0: Sun May 4 11:23:50 MDT 2025 >

Re: witness: userret: locks held: exclusive rwlock vmmaplk

2025-05-20 Thread Martin Pieuchot
On 20/05/25(Tue) 12:03, Alexander Bluhm wrote: > Hi, > > According to my regress statistics kernel crashes reliably since > May 16th. /usr/src/regress/misc/posixtestsuite/ triggers it. I just reverted my May 16th UVM commit. Please let me know if that was the problem. Thanks. > > With a witn

Re: fault w/ vmctl stop/vmmci/prsignal on 7.7 vmm

2025-05-18 Thread Martin Pieuchot
On 18/05/25(Sun) 23:43, Mark Kettenis wrote: > > From: Dave Voutila > > Date: Sun, 18 May 2025 15:21:28 -0400 > > > > Claudio Jeker writes: > > > > > On Thu, May 15, 2025 at 01:15:43PM +0200, Martin Pieuchot wrote: > > >> Fault below

fault w/ vmctl stop/vmmci/prsignal on 7.7 vmm

2025-05-15 Thread Martin Pieuchot
Fault below has been triggered by doing 'vmctl $myvm stop' while rebooting a 7.7 amd64 VM: OpenBSD 7.7 (GENERIC) #619: Sun Apr 13 08:19:34 MDT 2025 dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC real mem = 117428224 (111MB) avail mem = 87916544 (83MB) random: good seed from b

Re: uvm_map_protect(9) & amd64 pmap bug?

2025-03-30 Thread Martin Pieuchot
On 28/03/25(Fri) 19:53, Miod Vallat wrote: > > If this code has never been tested on pmap_kernel() then it is dead code > > and I'd rather remove it. Whoever wants to reduce the permission of the > > mapping will have to check on all architectures that this is supported. > > Well it is obvious th

Re: uvm_map_protect(9) & amd64 pmap bug?

2025-03-27 Thread Martin Pieuchot
On 26/03/25(Wed) 13:33, Mark Kettenis wrote: > > Date: Tue, 25 Mar 2025 18:59:46 + > > From: Miod Vallat > [...] > > This STRONGLY hints that this routine has never been used on > > pmap_kernel() addresses until now. > > I guess we stopped swapping out kernel stacks long before amd64 was a >

uvm_map_protect(9) & amd64 pmap bug?

2025-03-25 Thread Martin Pieuchot
David Higgs recently reported an incorrect usage of uvm_map_protect(9): https://marc.info/?l=openbsd-tech&m=174001772620750&w=2 It turns out there's another one in exec_sigcode_map(), fixed by the diff below. Currently the uvm_map_protect(9) calls has no effect and returns EINVAL. However, on

Re: High CPU load for pagedaemon and mouse freezing

2025-03-11 Thread Martin Pieuchot
On 07/03/25(Fri) 17:30, Julian Smith wrote: > On Sun, 2 Mar 2025 18:06:17 +0100 > Martin Pieuchot wrote: > > > Hello Julian, > > > > Thanks for the report. > > > > On 01/03/25(Sat) 17:57, Julian Smith wrote: > > > >Synopsis:

Re: High CPU load for pagedaemon and mouse freezing

2025-03-02 Thread Martin Pieuchot
On 01/03/25(Sat) 22:27, Peter Hessler wrote: > On 2025 Mar 01 (Sat) at 17:57:38 + (+), Julian Smith wrote: > : Memory: Real: 8593M/15G act/tot Free: 769M Cache: 498M > : Swap: 7889M/16G > > You are hitting swap, and thus are screwed. Use less memory. Please do not spread FUD

Re: High CPU load for pagedaemon and mouse freezing

2025-03-02 Thread Martin Pieuchot
Hello Julian, Thanks for the report. On 01/03/25(Sat) 17:57, Julian Smith wrote: > >Synopsis:High CPU load for pagedaemon and mouse freezing > >Category:kernel > >Environment: > System : OpenBSD 7.6 > Details : OpenBSD 7.6-current (GENERIC.MP) #477: Thu Dec 12 > 21:4

Re: panic: kernel diagnostic assertion "anon->an_lock == NULL || rw_write_held(anon->an_lock)" failed

2025-01-25 Thread Martin Pieuchot
On 24/01/25(Fri) 17:44, K R wrote: > On Fri, Jan 24, 2025 at 2:54 PM K R wrote: > > > > On Fri, Jan 24, 2025 at 1:09 PM Martin Pieuchot wrote: > > > > > > Hello, > > > > > > Thanks for your report. > > > > > > On 24/01/

Re: panic: kernel diagnostic assertion "anon->an_lock == NULL || rw_write_held(anon->an_lock)" failed

2025-01-24 Thread Martin Pieuchot
re specific commands at the ddb prompt. I don't need more information. Please find a fix below. The panic is due to an incorrect unlock in an error path. Can you confirm the fix works for you? Thanks, Martin Index: uvm/uvm_pdaemon.c ===

Re: uvm issue noticed on arm64

2025-01-18 Thread Martin Pieuchot
On 18/01/25(Sat) 13:44, Mark Kettenis wrote: > > Date: Fri, 17 Jan 2025 16:39:12 +0100 > > From: Martin Pieuchot > > > > Thanks for your answer Mark! > > > > On 17/01/25(Fri) 15:12, Mark Kettenis wrote: > > > > [...] > > > >

Re: uvm issue noticed on arm64

2025-01-17 Thread Martin Pieuchot
work with 8K > > allocations on arm64? > > Yes, but it still wouldn't solve the fundamental problem and we'd > probably waste resources. > > > - Or can we change the arm64 pmap to not do allocations bigger than a > > page? > > Yes, but it still wouldn't solve the fundamental problem. > > > Any of these two would reduce the differences between non-direct > > and direct map pmaps. If we could ensure uvm_map() is not called from > > within pmap_enter() that would help me sleep better at night. > > I don't think that is achievable. Well it is if we use `uvm_km_pages' or limit pmap_enter() allocations to page-sized ones. > > Which other pmaps could suffer from a shortage of `uvm_km_pages'? > > armv7, arm64, powerpc64, riscv64, sparc64 So I now believe your fix is what we want. One thing we did not consider is if uvm_map() failed because of some contention. But I believe we can wait until this becomes a bottleneck. I'd go even further, remove the #ifdef and replace uvm_swapisfull() + uvm_wait() by pmap_populate(). Thanks for the fix and sharing your knowledge with me. Cheers, Martin
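The retry idea discussed in the message above can be sketched with a toy model. This is only an illustration written in Python, not kernel code: the class `ToyPmap`, its 512-entry table size, and the function `handle_fault` are all hypothetical names invented here. The assumption is that an `enter` call may fail because a page-table page is missing, and that a separate `populate` step pre-allocates it so the retried `enter` can succeed without allocating in the fault path.

```python
class ToyPmap:
    """Toy stand-in for a pmap (hypothetical, for illustration only)."""

    # Assumed number of mappings covered by one page-table page.
    ENTRIES_PER_TABLE = 512

    def __init__(self):
        # table index -> {virtual page number: physical address}
        self.tables = {}

    def enter(self, va_page, pa):
        """Install a mapping; fail if the page-table page is missing.

        Models an enter operation that is allowed to fail instead of
        allocating memory itself.
        """
        table = self.tables.get(va_page // self.ENTRIES_PER_TABLE)
        if table is None:
            return False
        table[va_page] = pa
        return True

    def populate(self, va_page):
        """Pre-allocate the page-table page covering va_page."""
        self.tables.setdefault(va_page // self.ENTRIES_PER_TABLE, {})


def handle_fault(pmap, va_page, pa, max_retries=3):
    """Try to enter a mapping, populating and retrying on failure.

    Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_retries + 1):
        if pmap.enter(va_page, pa):
            return attempt
        # The allocation happens here, outside of enter().
        pmap.populate(va_page)
    raise RuntimeError("mapping still failed after populate")
```

In this model the first fault on a region takes two attempts (fail, populate, succeed), and later faults in the same region succeed on the first attempt because the page-table page already exists.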

Re: uvm issue noticed on arm64

2025-01-17 Thread Martin Pieuchot
On 10/01/25(Fri) 21:04, Mark Kettenis wrote: > > Date: Fri, 10 Jan 2025 09:07:44 +0100 > > From: Martin Pieuchot > > > > Hello, > > > > On 09/01/25(Thu) 22:49, Mark Kettenis wrote: > > > > > > [...] > > > > > > Any proposa

Re: uvm issue noticed on arm64

2025-01-10 Thread Martin Pieuchot
Hello, On 09/01/25(Thu) 22:49, Mark Kettenis wrote: > > > > [...] > > > > Any proposal on how we could proceed to find a solution for this issue? > > > > > > The following hack fixes the issue for me. I don't think this is a > > > proper solution, but it may be a starting point. Or a temporary

Re: powerpc64/pmap.c trouble report

2024-12-24 Thread Martin Pieuchot
Hello Eric, On 23/12/24(Mon) 17:05, Eric Grosse wrote: > On Thu, Dec 12, 2024 at 11:30 AM Martin Pieuchot wrote: > ... > > That sounds like a memory corruption of some sort. It might be that > > recent changes hide it. I'd be glad if you could test George's chang

Re: powerpc64/pmap.c trouble report

2024-12-12 Thread Martin Pieuchot
Hello, Thanks for your report. On 12/12/24(Thu) 11:58, Eric Grosse wrote: > I have not had a chance to test the > pte += (idx ^ (PTED_HID(pted) ? pmap_ptab_mask : 0)) * 8; > change yet, because with the improvements Martin Pieuchot has > committed, my machines have been crash-f

Re: pagedaemon causing crash?

2024-11-28 Thread Martin Pieuchot
Hello Laurence, On 01/11/24(Fri) 18:48, Martin Pieuchot wrote: > On 01/11/24(Fri) 17:22, Laurence Tratt wrote: > > While doing some normal stuff on my amd64 desktop (-current as of Wed), I > > noticed the mouse starting to lag; I switched to a virtual desktop that > > happen

Re: deattach broken webcam leads to crash

2024-11-24 Thread Martin Pieuchot
On 24/11/24(Sun) 16:51, Kirill A. Korinsky wrote: > On Sun, 24 Nov 2024 15:34:30 +0100, > Martin Pieuchot wrote: > > > > On 23/11/24(Sat) 21:45, Kirill A. Korinsky wrote: > > > I dug a bit further and set up a breakpoint at usbd_ref_decr. >

Re: deattach broken webcam leads to crash

2024-11-24 Thread Martin Pieuchot
On 23/11/24(Sat) 21:45, Kirill A. Korinsky wrote: > I dug a bit further and set up a breakpoint at usbd_ref_decr. > > It was called from uvideo_vs_start_bulk_thread, but it still crashes: Could you check if uvideo_vs_close() is called multiple times? > > Breakpoint at usbd_ref_decr

Re: deattach broken webcam leads to crash

2024-11-23 Thread Martin Pieuchot
Hello Kirill, On 22/11/24(Fri) 15:31, Kirill A. Korinsky wrote: > >Synopsis:deattach broken webcam leads to crash What do you mean by broken? > >Category:uvideo > >Environment: > System : OpenBSD 7.6 > Details : OpenBSD 7.6-current (GENERIC.MP) #44: Fri Nov 22 15:03:

Re: pagedaemon causing crash?

2024-11-01 Thread Martin Pieuchot
On 01/11/24(Fri) 17:22, Laurence Tratt wrote: > While doing some normal stuff on my amd64 desktop (-current as of Wed), I > noticed the mouse starting to lag; I switched to a virtual desktop that > happened to be running `top -S -s 1` and then the machine hung solid. > > The attached photo of `top

Re: Snapshot kernel continuous crashes (4 times now)

2024-10-21 Thread Martin Pieuchot
time. It's at a > rather bad time, because I need my system for work. > > There are more photos than Gmail allows me to upload. So I might have to > create a gallery in google photos. This is due to a mistake that has been reverted. Please wait for the next snapshot or build a kernel from sources. Cheers, Martin

Re: smartmontools package /etc/smartd_warning.sh has incorrect file permissions

2024-10-18 Thread Martin Ziemer
Am Thu, Oct 17, 2024 at 11:21:08AM -0500 schrieb m...@techautonomy.net: > >Synopsis:smartmontools package /etc/smartd_warning.sh has incorrect file > >permissions > >Environment: > System : OpenBSD 7.6 > Architecture: OpenBSD.amd64 > Machine : amd64 > > >Description:

Re: gdb broken on arm64/MT

2024-09-06 Thread Martin Pieuchot
On 26/07/24(Fri) 08:36, Claudio Jeker wrote: > On Thu, Jul 25, 2024 at 08:20:32PM +0200, Martin Pieuchot wrote: > > On 25/07/24(Thu) 17:33, Claudio Jeker wrote: > > > On Thu, Jul 25, 2024 at 05:15:32PM +0200, Martin Pieuchot wrote: > > > > On 25/07/24(Thu) 14:51, Cla

Re: panic - mutex? - AMD64/7.4 to current

2024-08-25 Thread Martin Pieuchot
On 24/08/24(Sat) 13:14, Hugh Graham wrote: > On Sat, Aug 24, 2024 at 09:31:43PM +0200, Martin Pieuchot wrote: > > On 24/08/24(Sat) 12:09, Hugh Graham wrote: > > > The machine that slowly received the ports tree crashed upon reboot, > > > and I have included the traces. I

Re: panic - mutex? - AMD64/7.4 to current

2024-08-24 Thread Martin Pieuchot
6:leave > x86_ipi_db(8000489eaff0) at x86_ipi_db+0x16 > x86_ipi_handler() at x86_ipi_handler+0x80 > Xresume_lapic_ipi() at Xresume_lapic_ipi+0x27 > acpicpu_idle() at acpicpu_idle+0x11f > sched_idle(8000489eaff0) at sched_idle+0x282 > end trace frame: 0x0, coun

Re: panic - ffs_write - AMD64/7.4 to current

2024-08-24 Thread Martin Pieuchot
Hugh, If you can reproduce this easily, please send a new panic with the outputs of: - show uvm - show bcstats - And the traces of all running processes... In the two reports below we only have the trace of pax(1) which is running on CPU2. The two panics are due to corruptions of two different g

Re: gdb broken on arm64/MT

2024-07-25 Thread Martin Pieuchot
On 25/07/24(Thu) 17:33, Claudio Jeker wrote: > On Thu, Jul 25, 2024 at 05:15:32PM +0200, Martin Pieuchot wrote: > > On 25/07/24(Thu) 14:51, Claudio Jeker wrote: > > > On Thu, Jul 25, 2024 at 11:09:44AM +0200, Martin Pieuchot wrote: > > > [...] >

Re: gdb broken on arm64/MT

2024-07-25 Thread Martin Pieuchot
On 25/07/24(Thu) 14:51, Claudio Jeker wrote: > On Thu, Jul 25, 2024 at 11:09:44AM +0200, Martin Pieuchot wrote: > [...] > > > Index: kern/kern_synch.c > > > === > > > RCS file: /cvs/src/sys/ker

Re: gdb broken on arm64/MT

2024-07-25 Thread Martin Pieuchot
Thanks a lot for figuring that out. This is awesome! On 24/07/24(Wed) 16:19, Claudio Jeker wrote: > On Fri, Jun 21, 2024 at 01:24:27PM +0200, Martin Pieuchot wrote: > > So I'm trying to see where the remaining sched_yield() are coming from > > ld(1): > > >

Re: kernel diagnostic assertion "p->p_kq->kq_refcnt.r_refs == 1" failed

2024-06-27 Thread Martin Pieuchot
Thanks, On 27/06/24(Thu) 11:24, kir...@korins.ky wrote: > [...] > > panic: kernel diagnostic assertion "p->p_kq->kq_refcnt.r_refs == 1" failed: > file "/usr/src/sys/kern/kern_event.c", line 894 > Stopped at db_enter+0x14: movq%rbp > TID PID UID PRFLAGSPFLAGS

Re: Mac Studio hangs; locking problems on WITNESS/MP_LOCKDEBUG kernels

2024-06-25 Thread Martin Pieuchot
On 24/06/24(Mon) 22:32, Dana Koch wrote: > Dana Koch schrieb am So., 23. Juni 2024, 19:50: > > > > Could you try the diff below? Stuart confirmed it prevents the hang on > > > his machine. > > > > This also seems to be working well for me so far. > > > > Okay, I've got an actual panic now, with

Re: Mac Studio hangs; locking problems on WITNESS/MP_LOCKDEBUG kernels

2024-06-23 Thread Martin Pieuchot
Hello Dana, On 20/06/24(Thu) 17:16, Dana Koch wrote: > On Thu, Jun 20, 2024 at 3:33 PM Martin Pieuchot wrote: > > > > Hello Dana, > > > > Thanks again for your report. > > > > On 19/06/24(Wed) 09:37, Dana Koch wrote: > > > On Wed, Jun 19, 2024 at 6

gdb broken on arm64/MT

2024-06-21 Thread Martin Pieuchot
So I'm trying to see where the remaining sched_yield() are coming from ld(1): $ cd /sys/arch/arm64/compile/GENERIC.MP $ LD="egdb --args ld" make -j32 Then I add a breakpoint on sched_yield & hit run. As soon as the first thread is stopped, I can see the trace as usual, however the process is now

Re: Mac Studio hangs; locking problems on WITNESS/MP_LOCKDEBUG kernels

2024-06-20 Thread Martin Pieuchot
Hello Dana, Thanks again for your report. On 19/06/24(Wed) 09:37, Dana Koch wrote: > On Wed, Jun 19, 2024 at 6:58 AM Martin Pieuchot wrote: > > This is a lock order reversal reported by WITNESS. Thankfully claudio@ > > already committed a fix for this on the 16th. So please,

Re: Mac Studio hangs; locking problems on WITNESS/MP_LOCKDEBUG kernels

2024-06-19 Thread Martin Pieuchot
On 18/06/24(Tue) 23:34, Dana Koch wrote: > >Synopsis: Mac Studio hangs; locking problems on WITNESS/MP_LOCKDEBUG kernels > >Category: kernel > >Environment: > System : OpenBSD 7.5 > Details : OpenBSD 7.5-current (GENERIC.MP) #69: Wed Jun 12 04:43:28 MDT > 2024 > dera...@arm64.openbsd.org:

Re: panic: pool_do_get: mcl2k free list modified

2024-06-17 Thread Martin Pieuchot
On 16/06/24(Sun) 20:37, Daniel Jakots wrote: > On Sat, 15 Jun 2024 18:56:14 +0200, Jan Klemkow wrote: > > > Does it also happen if you disable LRO? > > > > try: > > > > ifconfig vio0 -tcplro > > Thanks for the cue, it doesn't happen indeed. This is a/the wg(4) race.

arc4random lock order issue

2024-06-03 Thread Martin Pieuchot
Now that the SCHED_LOCK() is a mutex I see the following WITNESS report on arm64. witness: lock order reversal: 1st 0xff80012486e8 /usr/src/sys/dev/rnd.c:321 (/usr/src/sys/dev/rnd.c:321) 2nd 0xff800120afb0 /usr/src/sys/kern/kern_timeout.c:57 (/usr/src/sys/kern/kern_timeout.c:57) lock o

Re: powerpc64/pmap.c trouble report

2024-05-31 Thread Martin Pieuchot
On 30/05/24(Thu) 13:11, Eric Grosse wrote: > And, fairly quickly, another one. The load depends on what's in the Go > team build queue, which is not under my control.To avoid further > spamming the list I won't report any more of these until I can get > something reproducible under my control. Of c

Re: WireGuard(?) issues

2024-05-20 Thread Martin Pieuchot
On 19/05/24(Sun) 23:50, Vitaliy Makkoveev wrote: > > > > On 19 May 2024, at 22:05, Anthony J. Bentley wrote: > > > > Vitaliy Makkoveev writes: > >>> On 17 May 2024, at 12:06, Stuart Henderson = > >> wrote: > >>> =20 > >>> There are problems with wg(4) that people with some workloads have = > >

Re: lock order reversal in soreceive and NFS

2024-04-30 Thread Martin Pieuchot
On 27/04/24(Sat) 13:44, Visa Hankala wrote: > On Tue, Apr 23, 2024 at 02:48:32PM +0200, Martin Pieuchot wrote: > > [...] > > I agree. Now I'd be very grateful if someone could dig into WITNESS to > > figure out why we see such reports. Are these false positive or are we

Re: lock order reversal in soreceive and NFS

2024-04-23 Thread Martin Pieuchot
On 22/04/24(Mon) 16:18, Mark Kettenis wrote: > > Date: Mon, 22 Apr 2024 15:39:55 +0200 > > From: Alexander Bluhm > > > > Hi, > > > > I see a witness lock order reversal warning with soreceive. It > > happens during NFS regress tests. In /var/log/messages is more > > context from regress. > >

Re: protection fault in amap_wipeout

2024-04-13 Thread Martin Pieuchot
On 30/03/24(Sat) 18:38, Martin Pieuchot wrote: > Hello Alexander, > > Thanks for the report. > > On 01/03/24(Fri) 16:39, Alexander Bluhm wrote: > > Hi, > > > > An OpenBSD 7.4 machine on KVM running postgress and pagedaemon > > crashed in amap_wi

Re: protection fault in amap_wipeout

2024-03-30 Thread Martin Pieuchot
Hello Alexander, Thanks for the report. On 01/03/24(Fri) 16:39, Alexander Bluhm wrote: > Hi, > > An OpenBSD 7.4 machine on KVM running postgress and pagedaemon > crashed in amap_wipeout(). > > bluhm > > kernel: protection fault trap, code=0 > Stopped at amap_wipeout+0x76: movq%rc

Re: panic: "wakeup: p_stat is 2" using btrace(8) & vmd(8)

2024-03-24 Thread Martin Pieuchot
On 22/02/24(Thu) 17:24, Claudio Jeker wrote: > On Thu, Feb 22, 2024 at 04:16:57PM +0100, Martin Pieuchot wrote: > > On 21/02/24(Wed) 13:05, Claudio Jeker wrote: > > > On Tue, Feb 20, 2024 at 09:34:12PM +0100, Martin Pieuchot wrote: > > > > On 28/10/21(Thu) 05:45, Vi

Re: panic: kernel diagnostic assertion "p->p_wchan == NULL" failed

2024-02-28 Thread Martin Pieuchot
On 28/02/24(Wed) 16:39, Vitaliy Makkoveev wrote: > On Wed, Feb 28, 2024 at 02:22:31PM +0100, Mark Kettenis wrote: > > > Date: Wed, 28 Feb 2024 16:16:09 +0300 > > > From: Vitaliy Makkoveev > > > > > > On Wed, Feb 28, 2024 at 12:36:26PM +0100, Claudio Jeker wrote: > > > > On Wed, Feb 28, 2024 at 12

Re: panic: kernel diagnostic assertion "p->p_wchan == NULL" failed

2024-02-28 Thread Martin Pieuchot
On 28/02/24(Wed) 12:36, Claudio Jeker wrote: > On Wed, Feb 28, 2024 at 12:26:43PM +0100, Marko Cupać wrote: > > Hi, > > > > thank you for looking into it, and for the advice. > > > > On Wed, 28 Feb 2024 10:13:06 + > > Stuart Henderson wrote: > > > > > Please try to re-type at least the most

Re: panic: "wakeup: p_stat is 2" using btrace(8) & vmd(8)

2024-02-22 Thread Martin Pieuchot
On 21/02/24(Wed) 13:05, Claudio Jeker wrote: > On Tue, Feb 20, 2024 at 09:34:12PM +0100, Martin Pieuchot wrote: > > On 28/10/21(Thu) 05:45, Visa Hankala wrote: > > > On Wed, Oct 27, 2021 at 09:02:08PM -0400, Dave Voutila wrote: > > > > Dave Voutila writes: > >

Re: panic: "wakeup: p_stat is 2" using btrace(8) & vmd(8)

2024-02-20 Thread Martin Pieuchot
On 28/10/21(Thu) 05:45, Visa Hankala wrote: > On Wed, Oct 27, 2021 at 09:02:08PM -0400, Dave Voutila wrote: > > > > Dave Voutila writes: > > > > > Was tinkering on a bt(5) script for trying to debug an issue in vmm(4) > > > when I managed to start hitting a panic "wakeup: p_stat is 2" being > >

Re: Sparc64 rthreads Instablilty

2024-02-16 Thread Martin Pieuchot
On 15/02/24(Thu) 20:06, Kurt Miller wrote: > On Feb 15, 2024, at 3:01 PM, Miod Vallat wrote: > > > >> Has been running for the last few hours without any issue. > >> OK claudio@ on that diff. > > > > But it's your diff! I only polished it a bit. > > > > I have also been testing various version

Re: Sparc64 livelock/system freeze w/cpu traces

2023-09-02 Thread Martin Pieuchot
On 28/06/23(Wed) 20:07, Kurt Miller wrote: > On Jun 28, 2023, at 7:16 AM, Martin Pieuchot wrote: > > > > On 28/06/23(Wed) 08:58, Claudio Jeker wrote: > >> > >> I doubt this is a missing wakeup. It is more the system is thrashing and > >> not making p

Re: Sparc64 rthreads Instablilty

2023-09-02 Thread Martin Pieuchot
On 13/08/23(Sun) 22:59, Kurt Miller wrote: > I’ve been hunting an intermittent jdk crash on sparc64 for some time now. > Since egdb has not been up to the task, I created a small c program which > reproduces the problem. This partially mimics the jdk startup where a number > of detached threads are

Re: resume failures/lockups

2023-09-02 Thread Martin Pieuchot
Hello Ross, On 27/08/23(Sun) 15:16, Ross L Richardson wrote: > For the past several weeks (using -current), I've had problems with > resume on an amd64 desktop. It's intermittent (but if anything > becoming increasingly frequent). If you can still reproduce the issue, please try enabling WITNESS

Re: panic: rw_enter: vmmaplk locking agaist myself

2023-06-29 Thread Martin Pieuchot
On 29/06/23(Thu) 11:17, Stefan Sperling wrote: > On Thu, Jun 29, 2023 at 10:59:32AM +0200, Martin Pieuchot wrote: > > On 28/06/23(Wed) 15:47, Moritz Buhl wrote: > > > Dear bugs@, > > > > > > with the following snapshot I had two panics on my x270 recen

Re: panic: rw_enter: vmmaplk locking agaist myself

2023-06-29 Thread Martin Pieuchot
On 28/06/23(Wed) 15:47, Moritz Buhl wrote: > Dear bugs@, > > with the following snapshot I had two panics on my x270 recently. This is a bug in iwm(4) suggesting a missing SPL protection. > sysctl kern.version > kern.version=OpenBSD 7.3-current (GENERIC.MP) #1256: Thu Jun 22 10:53:02 MDT > 2023

Re: Sparc64 livelock/system freeze w/cpu traces

2023-06-28 Thread Martin Pieuchot
On 28/06/23(Wed) 08:58, Claudio Jeker wrote: > On Tue, Jun 27, 2023 at 08:18:15PM -0400, Kurt Miller wrote: > > On Jun 27, 2023, at 1:52 PM, Kurt Miller wrote: > > > > > > On Jun 14, 2023, at 12:51 PM, Vitaliy Makkoveev wrote: > > >> > > >

Re: Sparc64 livelock/system freeze w/cpu traces

2023-05-30 Thread Martin Pieuchot
On 25/05/23(Thu) 16:33, Kurt Miller wrote: > On May 22, 2023, at 2:27 AM, Claudio Jeker wrote: > > I have seen these WITNESS warnings on other systems as well. I doubt this > > is the problem. IIRC this warning is because sys_mount() is doing it wrong > > but it is not really an issue since sys_mo

Re: Sparc64 livelock/system freeze w/cpu traces

2023-05-12 Thread Martin Pieuchot
On 09/05/23(Tue) 20:02, Kurt Miller wrote: > While building devel/jdk/1.8 on May 3rd snapshot I noticed the build freezing > and processes getting stuck like ps. After enabling ddb.console I was able to > reproduce the livelock and capture cpu traces. Dmesg at the end. > Let me know if more informa

Re: Repeated crashes with OpenBSD 7.2 on Raspberry Pi 4 (arm64)

2023-02-20 Thread Martin Pieuchot
u can still trigger the panic with a -current snapshot on your machine, that would motivate me to look at it. Cheers, Martin

Re: bbolt can freeze 7.2 from userspace

2023-02-20 Thread Martin Pieuchot
On 20/02/23(Mon) 03:59, Renato Aguiar wrote: > [...] > I can't reproduce it anymore with this patch on 7.2-stable :) Thanks a lot for testing! Here's a better fix from Chuck Silvers. That's what I believe we should commit. The idea is to prevent sibling from modifying the vm_map by marking it a

Re: Repeated crashes with OpenBSD 7.2 on Raspberry Pi 4 (arm64)

2023-02-19 Thread Martin Pieuchot
ed the git repo, downloaded the latest build-farm.X.tgz, the client and data. I installed gmake, bison and flex and I'm now reading the conf file I need to edit but I'm not sure how to glue everything together. Any example of setup on OpenBSD would be appreciated. Thanks, Martin On 05/12/

Re: bbolt can freeze 7.2 from userspace

2023-02-18 Thread Martin Pieuchot
On 24/01/23(Tue) 04:40, Renato Aguiar wrote: > Hi Martin, > > "David Hill" writes: > > > > > Yes, same result as before. This patch does not seem to help. > > > > I could also reproduce it with patched 'current' :( Here's anot

Re: bbolt can freeze 7.2 from userspace

2023-01-29 Thread Martin Pieuchot
On 29/01/23(Sun) 14:36, Mark Kettenis wrote: > > Date: Sun, 29 Jan 2023 12:31:22 +0100 > > From: Martin Pieuchot > > > > On 23/01/23(Mon) 22:57, David Hill wrote: > > > On 1/20/23 09:02, Martin Pieuchot wrote: > > > > > [...] > > > >

Re: bbolt can freeze 7.2 from userspace

2023-01-29 Thread Martin Pieuchot
On 23/01/23(Mon) 22:57, David Hill wrote: > On 1/20/23 09:02, Martin Pieuchot wrote: > > > [...] > > > Ran it 20 times and all completed and passed. I was also able to > > > interrupt > > > it as well. no issues. > > > > > > Excellen

Re: bbolt can freeze 7.2 from userspace

2023-01-20 Thread Martin Pieuchot
Hello David, On 21/12/22(Wed) 11:37, David Hill wrote: > On 12/21/22 11:23, Martin Pieuchot wrote: > > On 21/12/22(Wed) 09:20, David Hill wrote: > > > On 12/21/22 07:08, David Hill wrote: > > > > On 12/21/22 05:33, Martin Pieuchot wrote: > > > > >

Re: bbolt can freeze 7.2 from userspace

2022-12-21 Thread Martin Pieuchot
On 21/12/22(Wed) 09:20, David Hill wrote: > > > On 12/21/22 07:08, David Hill wrote: > > > > > > On 12/21/22 05:33, Martin Pieuchot wrote: > > > On 18/12/22(Sun) 20:55, Martin Pieuchot wrote: > > > > On 17/12/22(Sat) 14:15, David Hill wrote:

Re: bbolt can freeze 7.2 from userspace

2022-12-21 Thread Martin Pieuchot
On 18/12/22(Sun) 20:55, Martin Pieuchot wrote: > On 17/12/22(Sat) 14:15, David Hill wrote: > > > > > > On 10/28/22 03:46, Renato Aguiar wrote: > > > Use of bbolt Go library causes 7.2 to freeze. I suspect it is triggering > > > some > > > sort

Re: bbolt can freeze 7.2 from userspace

2022-12-18 Thread Martin Pieuchot
On 17/12/22(Sat) 14:15, David Hill wrote: > > > On 10/28/22 03:46, Renato Aguiar wrote: > > Use of bbolt Go library causes 7.2 to freeze. I suspect it is triggering > > some > > sort of deadlock in mmap because threads get stuck at vmmaplk. > > > > I managed to reproduce it consistently in a la

Re: macppc panic: vref used where vget required

2022-11-09 Thread Martin Pieuchot
On 09/09/22(Fri) 14:41, Martin Pieuchot wrote: > On 09/09/22(Fri) 12:25, Theo Buehler wrote: > > > Yesterday gnezdo@ fixed a race in uvn_attach() that led to the same > > > assert. Here's a rebased diff for the bug discussed in this thread, > > > could you

Re: bse(4) media/link bug

2022-11-07 Thread Martin Pieuchot
On 07/11/22(Mon) 13:20, Martin Pieuchot wrote: > On a raspberry pi4, with the following configuration : > > $ cat /etc/hostname.bse0 > dhcp > > ...and with the cable directly connected to my laptop (amd64 w/ em(4)) I > have to

bse(4) media/link bug

2022-11-07 Thread Martin Pieuchot
On a raspberry pi4, with the following configuration : $ cat /etc/hostname.bse0 dhcp ...and with the cable directly connected to my laptop (amd64 w/ em(4)) I have to force the media type, with the command below, to make it work. # ifconfig bse0 media

arm64 (rockpro64) regression

2022-09-18 Thread Martin Pieuchot
The rockpro64 no longer boots in multi-user on -current. It hangs after displaying the following lines: rkiis0 at mainbus0 rkiis1 at mainbus0 The 8/09 snapshot works, the next one from 11/09 doesn't. bsd.rd still boots. Dmesg below. OpenBSD 7.2-beta (GENERIC.MP) #1815: Thu Sep 8 13:20:08 MDT

Swap on sdhc(4) and dwmmc(4) is broken

2022-09-10 Thread Martin Pieuchot
On the rockpro64 as well as on the rpi4, if too much swapping occurs, biowait() returns an error (B_ERROR). In both cases it seems to come from sdmmc_complete_xs(). I see the following: sdmmc_complete_xs: write error = 35 sdmmc_complete_xs: read error = 35 c++: B_ERROR after biowait() c++: error 4 f

Re: macppc panic: vref used where vget required

2022-09-09 Thread Martin Pieuchot
On 09/09/22(Fri) 12:25, Theo Buehler wrote: > > Yesterday gnezdo@ fixed a race in uvn_attach() that led to the same > > assert. Here's a rebased diff for the bug discussed in this thread, > > could you try again and let us know? Thanks! > > This seems to be stable now. It's been running for ne

Re: macppc panic: vref used where vget required

2022-09-01 Thread Martin Pieuchot
On 29/07/22(Fri) 14:22, Theo Buehler wrote: > On Mon, Jul 11, 2022 at 01:05:19PM +0200, Martin Pieuchot wrote: > > On 11/07/22(Mon) 07:50, Theo Buehler wrote: > > > On Fri, Jun 03, 2022 at 03:02:36PM +0200, Theo Buehler wrote: > > > > > Please do note that this

Re: macppc panic: vref used where vget required

2022-07-11 Thread Martin Pieuchot
On 11/07/22(Mon) 07:50, Theo Buehler wrote: > On Fri, Jun 03, 2022 at 03:02:36PM +0200, Theo Buehler wrote: > > > Please do note that this change can introduce/expose other issues. > > > > It seems that this diff causes occasional hangs when building snapshots > > on my mac M1 mini. This happened

Re: System frequently hangs, found commit that probably causes it

2022-07-06 Thread Martin Pieuchot
On 01/07/22(Fri) 07:13, Sebastien Marie wrote: > On Mon, Jun 27, 2022 at 06:29:55PM +0200, Martin Pieuchot wrote: > > On 27/06/22(Mon) 18:04, Caspar Schutijser wrote: > > > On Sun, Jun 26, 2022 at 10:03:59PM +0200, Martin Pieuchot wrote: > > > > On 26/06/22(Sun)
