Re: Mac Studio hangs; locking problems on WITNESS/MP_LOCKDEBUG kernels

2024-06-19 Thread Mark Kettenis
> From: Dana Koch 
> Date: Tue, 18 Jun 2024 23:34:07 -0400

Hi Dana,

Thanks for the report.  I have an M2 Pro Mac Mini that is very
reliable.  And I believe there are folks using machines with M2 Max
without issues as well.  So these issues are likely specific to the M2
Ultra SoC.

The fact that "mach ddbcpu X" doesn't work for X > 17 makes me wonder
if there is something subtly wrong with interrupts on the M2 Ultra.
I'll need to see if I can find out more.

One thing that would help me investigate further is "eeprom -p" output
for this machine.

Thanks,

Mark

> >Synopsis: Mac Studio hangs; locking problems on WITNESS/MP_LOCKDEBUG kernels
> >Category: kernel
> >Environment:
> System  : OpenBSD 7.5
> Details : OpenBSD 7.5-current (GENERIC.MP) #69: Wed Jun 12 04:43:28 MDT 2024
> dera...@arm64.openbsd.org:/usr/src/sys/arch/arm64/compile/GENERIC.MP
> 
> Architecture: OpenBSD.arm64
> Machine : arm64
> >Description:
> System can hang and be unresponsive on a Mac Studio (M2, Ultra),
> either soon after boot passes to userland during/after "starting
> network", or under load. When on a kernel with MP_LOCKDEBUG and
> WITNESS options turned on, these points will trigger locking-related
> panics
> 
> When trying to bisect and boot onto different kernel binaries built
> from different points of time, the system may successfully pass the
> "starting network" point at boot without panic'ing, but may instead
> panic at some other seemingly random point under load. There appeared
> to be no good correlation between commits at different points of time
> and the reliability of these locking-related panics happening. (FWIW,
> I did not bisect far back enough such that I would need to completely
> wipe and downgrade userland.)
> 
> See below for ddb session fragments.
> 
> >How-To-Repeat:
> * Build a recent kernel with MP_LOCKDEBUG and WITNESS options turned on.
> * Disable apldrm(4), since display output is currently not working
> with this device enabled (separate problem).
> * Boot on this new kernel.
> * If the system does not panic after "starting network", building a
> kernel with `make -j24` will often trigger a similar locking-related
> panic instead.
> 
> >Fix:
> Workarounds:
> * use a single-processor kernel;
> * non-WITNESS/MP_LOCKDEBUG kernels will obviously not panic, but can still hang
> 
> ddb fragments:
> 1. during "make -j24" (oddly, mach ddbcpu X did not seem to give
> information for X > 17 when there are 24 processors)
> ddb{16}> show panic
>  cpu0: kernel diagnostic assertion "((flags & PGO_LOCKED) != 0 && rw_lock_held(uobj->vmobjlock)) || (flags & PGO_LOCKED) == 0" failed: file "/home/dana/src/openbsd/openbsd-src/sys/uvm/uvm_vnode.c", line 953
>  cpu22: kernel diagnostic assertion "((flags & PGO_LOCKED) != 0 && rw_lock_held(uobj->vmobjlock)) || (flags & PGO_LOCKED) == 0" failed: file "/home/dana/src/openbsd/openbsd-src/sys/uvm/uvm_vnode.c", line 953
>  cpu20: kernel diagnostic assertion "((flags & PGO_LOCKED) != 0 && rw_lock_held(uobj->vmobjlock)) || (flags & PGO_LOCKED) == 0" failed: file "/home/dana/src/openbsd/openbsd-src/sys/uvm/uvm_vnode.c", line 953
>  cpu19: kernel diagnostic assertion "((flags & PGO_LOCKED) != 0 && rw_lock_held(uobj->vmobjlock)) || (flags & PGO_LOCKED) == 0" failed: file "/home/dana/src/openbsd/openbsd-src/sys/uvm/uvm_vnode.c", line 953
>  cpu17: kernel diagnostic assertion "((flags & PGO_LOCKED) != 0 && rw_lock_held(uobj->vmobjlock)) || (flags & PGO_LOCKED) == 0" failed: file "/home/dana/src/openbsd/openbsd-src/sys/uvm/uvm_vnode.c", line 953
> *cpu16: acquiring blockable sleep lock with spinlock or critical section held (kernel_lock) &kernel_lock
>  cpu15: kernel diagnostic assertion "((flags & PGO_LOCKED) != 0 && rw_lock_held(uobj->vmobjlock)) || (flags & PGO_LOCKED) == 0" failed: file "/home/dana/src/openbsd/openbsd-src/sys/uvm/uvm_vnode.c", line 953
>  cpu14: kernel diagnostic assertion "((flags & PGO_LOCKED) != 0 && rw_lock_held(uobj->vmobjlock)) || (flags & PGO_LOCKED) == 0" failed: file "/home/dana/src/openbsd/openbsd-src/sys/uvm/uvm_vnode.c", line 953
>  cpu11: kernel diagnostic assertion "((flags & PGO_LOCKED) != 0 && rw_lock_held(uobj->vmobjlock)) || (flags & PGO_LOCKED) == 0" failed: file "/home/dana/src/openbsd/openbsd-src/sys/uvm/uvm_vnode.c", line 953
>  cpu9: kernel diagnostic assertion "((flags & PGO_LOCKED) != 0 && rw_lock_held(uobj->vmobjlock)) || (flags & PGO_LOCKED) == 0" failed: file "/home/dana/src/openbsd/openbsd-src/sys/uvm/uvm_vnode.c", line 953
>  cpu4: kernel diagnostic assertion "((flags & PGO_LOCKED) != 0 && rw_lock_held(uobj->vmobjlock)) || (flags & PGO_LOCKED) == 0" failed: file "/home/dana/src/openbsd/openbsd-src/sys/uvm/uvm_vnode.c", line 953
>  cpu2: kernel diagnostic assertion "anon == NULL || anon->an_lock == NULL || rw_write_held(anon->an_lock)" failed: file "/home/dana/src/openbsd/openbsd-s

Re: Mac Studio hangs; locking problems on WITNESS/MP_LOCKDEBUG kernels

2024-06-19 Thread Claudio Jeker
On Wed, Jun 19, 2024 at 12:08:19PM +0200, Mark Kettenis wrote:
> > From: Dana Koch 
> > Date: Tue, 18 Jun 2024 23:34:07 -0400
> 
> Hi Dana,
> 
> Thanks for the report.  I have an M2 Pro Mac Mini that is very
> reliable.  And I believe there are folks using machines with M2 Max
> without issues as well.  So these issues are likely specific to the M2
> Ultra SoC.
> 
> The fact that "mach ddbcpu X" doesn't work for X > 17 makes me wonder
> if there is something subtly wrong with interrupts on the M2 Ultra.
> I'll need to see if I can find out more.

Be careful with ddb. By default the input is in hex, so 17 is actually 23,
which is the last CPU in the system. Use 0t17 to force it to decimal.

I hate this bit about ddb a lot. I shoot myself in the foot with this over
and over again.
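
The radix pitfall described above can be sketched in a few lines of C. This is only a toy model, assuming a parser that reads numbers as hexadecimal unless they carry the "0t" decimal prefix; the function name parse_ddb_style is hypothetical and this is not ddb's actual parsing code.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Toy model of a debugger-style number parser: hexadecimal by default,
 * with a "0t" prefix selecting decimal. Hypothetical helper, not the
 * parser ddb really uses.
 */
long
parse_ddb_style(const char *s)
{
    if (strncmp(s, "0t", 2) == 0)
        return strtol(s + 2, NULL, 10);   /* forced decimal */
    return strtol(s, NULL, 16);           /* default radix: hex */
}
```

Under this model, "17" parses to 23 and "0t17" parses to 17, so on a 24-CPU machine (CPUs 0 through 23) any X above 17 typed without the prefix names a CPU that does not exist.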
 

Re: Mac Studio hangs; locking problems on WITNESS/MP_LOCKDEBUG kernels

2024-06-19 Thread Martin Pieuchot
On 18/06/24(Tue) 23:34, Dana Koch wrote:
> >Synopsis: Mac Studio hangs; locking problems on WITNESS/MP_LOCKDEBUG kernels
> >Category: kernel
> >Environment:
> System  : OpenBSD 7.5
> Details : OpenBSD 7.5-current (GENERIC.MP) #69: Wed Jun 12 04:43:28 MDT 2024
> dera...@arm64.openbsd.org:/usr/src/sys/arch/arm64/compile/GENERIC.MP
> 
> Architecture: OpenBSD.arm64
> Machine : arm64
> >Description:
> System can hang and be unresponsive on a Mac Studio (M2, Ultra),
> either soon after boot passes to userland during/after "starting
> network", or under load. When on a kernel with MP_LOCKDEBUG and
> WITNESS options turned on, these points will trigger locking-related
> panics

The panic is the following:

> *cpu16: acquiring blockable sleep lock with spinlock or critical section held (kernel_lock) &kernel_lock

Which corresponds to the following trace (note 0x10 is 16):

> ddb{9}> mach ddbcpu 10
> ddb{16}> trace
> db_enter() at panic+0x148
> panic() at witness_checkorder+0x84c
> witness_checkorder() at __mp_lock+0x64
> __mp_lock() at selwakeup+0x14
> selwakeup() at ptsstart+0x74
> ptsstart() at tputchar+0x84
> tputchar() at kputchar+0x7c
> kputchar() at kprintf+0x614
> kprintf() at printf+0x88
> printf() at witness_checkorder+0x518
> witness_checkorder() at mtx_enter+0x50
> mtx_enter() at timeout_del+0x2c
> timeout_del() at dequeue_randomness+0x38
> dequeue_randomness() at extract_entropy+0x90
> extract_entropy() at _rs_stir+0x28
> _rs_stir() at arc4random+0xf4
> arc4random() at uvm_map_hint+0x5c
> uvm_map_hint() at uaddr_rnd_select+0xf8
> uaddr_rnd_select() at uvm_addr_invoke+0xc0
> uvm_addr_invoke() at uvm_map_findspace+0x78
> uvm_map_findspace() at uvm_mapanon+0x228
> uvm_mapanon() at uvm_mmapanon+0xd0
> uvm_mmapanon() at sys_mmap+0x330
> sys_mmap() at svc_handler+0x480
> svc_handler() at do_el0_sync+0xc8
> do_el0_sync() at handle_el0_sync+0x70
> handle_el0_sync() at 0x4d20f2bf8
> --- trap ---
> end of kernel

This is a lock order reversal reported by WITNESS.  Thankfully claudio@
already committed a fix for this on the 16th.  So please try with
up-to-date sources.
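
The kind of check behind the quoted panic message can be illustrated with a toy model in C. This is a sketch under stated assumptions, not WITNESS itself: the lock classes, the struct and the function names are all hypothetical, and the model only captures the rule named in the message, that a blockable (sleep) lock must not be acquired while a spinlock is held.

```c
/* Hypothetical lock classes for this toy model. */
enum lock_class { LC_SPIN, LC_SLEEP };

/* Per-CPU bookkeeping, reduced to a single counter. */
struct held_locks {
    int spin_held;    /* number of spin-class locks currently held */
};

/*
 * Returns 0 if acquiring a lock of class lc is allowed, or -1 if it
 * would take a sleepable lock while a spin lock is held.
 */
int
toy_acquire(struct held_locks *h, enum lock_class lc)
{
    if (lc == LC_SLEEP && h->spin_held > 0)
        return -1;
    if (lc == LC_SPIN)
        h->spin_held++;
    return 0;
}

/* Drop one spin-class lock, if any is held. */
void
toy_release_spin(struct held_locks *h)
{
    if (h->spin_held > 0)
        h->spin_held--;
}
```

In the trace, mtx_enter() appears lower in the call chain than __mp_lock(); in the model that corresponds to an LC_SPIN acquire followed by an LC_SLEEP acquire, which toy_acquire() rejects.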



Re: strmode should take a mode_t instead of int.

2024-06-19 Thread Otto Moerbeek
On Tue, Jun 18, 2024 at 10:00:20PM -0700, Collin Funk wrote:

> Hi,
> 
> I noticed that strmode(3) says that the first argument should be
> mode_t. OpenBSD declares it with int which is not compatible since
> mode_t appears to be unsigned, from what I can tell.
> 
> NetBSD fixed this a long time ago and FreeBSD did the same before the
> 14.0 release.
> 
> Apologies for the lack of diff, I don't have access to an OpenBSD
> machine at the moment. I think something like this would work though:
> 
> In sys/_types.h:

I think this snippet should be in sys/types.h.

> 
> #ifndef _MODE_T_DEFINED_
> #define _MODE_T_DEFINED_
> typedef __mode_t  mode_t
> #endif
> 
> and then in string.h:

This part is not going to work as string.h includes machine/_types.h
but not sys/_types.h (or sys/types.h for that matter). FreeBSD
modified it to include sys/_types.h.

> #ifndef _MODE_T_DEFINED_
> #define _MODE_T_DEFINED_
> typedef __mode_t  mode_t
> #endif
> void   strmode(mode_t, char *);
> 
> Thanks,
> Collin
> 

Additionally, the implementation in src/lib/libc/string/strmode.c
needs to start using mode_t.

Building base now with the diff below. So far so good.

But this is more tricky than you would think. Modifying string.h to
include more could have unwanted side effects for applications.

-Otto

Index: include/string.h
===
RCS file: /home/cvs/src/include/string.h,v
diff -u -p -r1.32 string.h
--- include/string.h5 Sep 2017 03:16:13 -   1.32
+++ include/string.h19 Jun 2024 13:11:42 -
@@ -37,7 +37,7 @@
 
 #include 
 #include 
-#include <machine/_types.h>
+#include <sys/_types.h>
 
 /*
  * POSIX mandates that certain string functions not present in ISO C
@@ -128,7 +128,11 @@ size_t  strlcat(char *, const char *, si
__attribute__ ((__bounded__(__string__,1,3)));
 size_t  strlcpy(char *, const char *, size_t)
__attribute__ ((__bounded__(__string__,1,3)));
-void    strmode(int, char *);
+#ifndef _MODE_T_DEFINED_
+#define _MODE_T_DEFINED_
+typedef __mode_t   mode_t;
+#endif
+void    strmode(mode_t, char *);
 char   *strsep(char **, const char *);
 int timingsafe_bcmp(const void *, const void *, size_t);
 int timingsafe_memcmp(const void *, const void *, size_t);
Index: lib/libc/string/strmode.c
===
RCS file: /home/cvs/src/lib/libc/string/strmode.c,v
diff -u -p -r1.8 strmode.c
--- lib/libc/string/strmode.c   31 Aug 2015 02:53:57 -  1.8
+++ lib/libc/string/strmode.c   19 Jun 2024 13:11:42 -
@@ -32,10 +32,8 @@
 #include 
 #include 
 
-/* XXX mode should be mode_t */
-
 void
-strmode(int mode, char *p)
+strmode(mode_t mode, char *p)
 {
 /* print type */
switch (mode & S_IFMT) {
Index: sys/sys/types.h
===
RCS file: /home/cvs/src/sys/sys/types.h,v
diff -u -p -r1.49 types.h
--- sys/sys/types.h 6 Aug 2022 13:31:13 -   1.49
+++ sys/sys/types.h 19 Jun 2024 13:11:43 -
@@ -140,7 +140,10 @@ typedef __gid_t gid_t;  /* group id */
 typedef __id_t      id_t;       /* may contain pid, uid or gid */
 typedef __ino_t     ino_t;      /* inode number */
 typedef __key_t     key_t;      /* IPC key (for Sys V IPC) */
+#ifndef _MODE_T_DEFINED_
+#define _MODE_T_DEFINED_
 typedef __mode_t    mode_t;     /* permissions */
+#endif
 typedef __nlink_t   nlink_t;    /* link count */
 typedef __rlim_t    rlim_t;     /* resource limit */
 typedef __segsz_t   segsz_t;    /* segment size */



Re: Mac Studio hangs; locking problems on WITNESS/MP_LOCKDEBUG kernels

2024-06-19 Thread Dana Koch
On Wed, Jun 19, 2024 at 6:58 AM Martin Pieuchot  wrote:
> This is a lock order reversal reported by WITNESS.  Thankfully claudio@
> already committed a fix for this on the 16th.  So please, try with
> up-to-date sources

Just to be paranoid, I built a kernel with recent sources and
MP_LOCKDEBUG and WITNESS. I experienced both the "lock spun out" error
after "starting network" -- but not on serial console, unfortunately
-- and from `make -j24` as mentioned which I did capture.

Here is the ddb session -- properly capturing each `mach ddbcpu` and
`trace` with hex this time:

__mp_lock_spin: 0xff8001292800 lock spun out
Stopped at  __mp_lock+0x138:ldr w8, [x23,#712]
ddb{16}> __mp_lock_spin: 0xff8001292800 lock spun out

ddb{16}> show panic__mp_lock_spin: 0xff8001292800 lock spun out

the kernel did not panic
ddb{16}> trace
db_enter() at __mp_lock+0x134
__mp_lock() at svc_handler+0x42c
svc_handler() at do_el0_sync+0xc8
do_el0_sync() at handle_el0_sync+0x70
handle_el0_sync() at 0x479a436a0
--- trap ---
end of kernel
ddb{16}> mach ddbcpu 0
Stopped at  __mp_lock+0x138:ldr w8, [x23,#712]
ddb{0}> trace
db_enter() at __mp_lock+0x134
__mp_lock() at aplintc_irq_handler+0x158
aplintc_irq_handler() at arm_cpu_irq+0x34
arm_cpu_irq() at handle_el1h_irq+0x68
handle_el1h_irq() at db_enter_ddb+0x268
db_enter_ddb() at kdb_trap+0x64
kdb_trap() at db_trapper+0x30
db_trapper() at handle_el1h_sync+0x68
handle_el1h_sync() at db_enter+0x14
db_enter() at aplintc_fiq_handler+0x6c
aplintc_fiq_handler() at arm_cpu_fiq+0x34
arm_cpu_fiq() at handle_el1h_fiq+0x68
handle_el1h_fiq() at aq_lookup+0x80
aq_intr() at arm_cpu_irq+0x34
arm_cpu_irq() at handle_el1h_irq+0x68
handle_el1h_irq() at do_el1h_sync+0x40
do_el1h_sync() at handle_el1h_sync+0x68
handle_el1h_sync() at db_enter+0x14
db_enter() at __mp_lock+0x134
__mp_lock() at softintr_biglock_wrap+0x14
softintr_biglock_wrap() at softintr_dispatch+0x84
softintr_dispatch() at arm_do_pending_intr+0xfc
arm_do_pending_intr() at arm_cpu_fiq+0x34
arm_cpu_fiq() at handle_el1h_fiq+0x68
handle_el1h_fiq() at mtx_enter+0xec
mtx_enter() at sched_idle+0x12c
sched_idle() at proc_trampoline+0xc
ddb{0}> mach ddbcpu 1
Stopped at  aplintc_fiq_handler+0x70:   b   ff8000265cb0 
ddb{1}> trace
db_enter() at aplintc_fiq_handler+0x6c
aplintc_fiq_handler() at arm_cpu_fiq+0x34
arm_cpu_fiq() at handle_el1h_fiq+0x68
handle_el1h_fiq() at mtx_enter+0xec
mtx_enter() at mi_switch+0x2f0
mi_switch() at sched_idle+0x134
sched_idle() at proc_trampoline+0xc
ddb{1}> mach ddbcpu 2
Stopped at  aplintc_fiq_handler+0x70:   b   ff8000265cb0 
ddb{2}> trace
db_enter() at aplintc_fiq_handler+0x6c
aplintc_fiq_handler() at arm_cpu_fiq+0x34
arm_cpu_fiq() at handle_el1h_fiq+0x68
handle_el1h_fiq() at mtx_enter+0xec
mtx_enter() at sched_idle+0x12c
sched_idle() at proc_trampoline+0xc
ddb{2}> mach ddbcpu 3
Stopped at  aplintc_fiq_handler+0x70:   b   ff8000265cb0 
ddb{3}> trace
db_enter() at aplintc_fiq_handler+0x6c
aplintc_fiq_handler() at arm_cpu_fiq+0x34
arm_cpu_fiq() at handle_el1h_fiq+0x68
handle_el1h_fiq() at mtx_enter+0xb8
mtx_enter() at sched_idle+0x12c
sched_idle() at proc_trampoline+0xc
ddb{3}> mach ddbcpu 4
Stopped at  aplintc_fiq_handler+0x70:   b   ff8000265cb0 
ddb{4}> trace
db_enter() at aplintc_fiq_handler+0x6c
aplintc_fiq_handler() at arm_cpu_fiq+0x34
arm_cpu_fiq() at handle_el1h_fiq+0x68
handle_el1h_fiq() at mtx_enter+0x114
mtx_enter() at sys_sched_yield+0x6c
sys_sched_yield() at svc_handler+0x478
svc_handler() at do_el0_sync+0xc8
do_el0_sync() at handle_el0_sync+0x70
handle_el0_sync() at 0x479a67b88
--- trap ---
end of kernel
ddb{4}> mach ddbcpu 5
Stopped at  aplintc_fiq_handler+0x70:   b   ff8000265cb0 
ddb{5}> trace
db_enter() at aplintc_fiq_handler+0x6c
aplintc_fiq_handler() at arm_cpu_fiq+0x34
arm_cpu_fiq() at handle_el1h_fiq+0x68
handle_el1h_fiq() at cpu_idle_cycle+0x28
cpu_idle_cycle() at sched_idle+0x218
sched_idle() at proc_trampoline+0xc
ddb{5}> mach ddbcpu 6
Stopped at  aplintc_fiq_handler+0x70:   b   ff8000265cb0 
ddb{6}> trace
db_enter() at aplintc_fiq_handler+0x6c
aplintc_fiq_handler() at arm_cpu_fiq+0x34
arm_cpu_fiq() at handle_el1h_fiq+0x68
handle_el1h_fiq() at __mp_lock+0x104
__mp_lock() at svc_handler+0x42c
svc_handler() at do_el0_sync+0xc8
do_el0_sync() at handle_el0_sync+0x70
handle_el0_sync() at 0x479a436a0
--- trap ---
end of kernel
ddb{6}> mach ddbcpu 7
Stopped at  aplintc_fiq_handler+0x70:   b   ff8000265cb0 
ddb{7}> trace
db_enter() at aplintc_fiq_handler+0x6c
aplintc_fiq_handler() at arm_cpu_fiq+0x34
arm_cpu_fiq() at handle_el1h_fiq+0x68
handle_el1h_fiq() at __mp_lock+0x104
__mp_lock() at reaper+0x108
reaper() at proc_trampoline+0xc
ddb{7}> mach ddbcpu 8
Stopped at  aplintc_fiq_handler+0x70:   b   ff8000265cb0 
ddb{8}> trace
db_enter() at aplintc_fiq_handler+0x6c
aplintc_fiq_handler() at arm_cpu_fiq+0x34
arm_cpu_fiq() at handle_el1h_fi

Re: strmode should take a mode_t instead of int.

2024-06-19 Thread Mark Kettenis
> Date: Wed, 19 Jun 2024 15:17:05 +0200
> From: Otto Moerbeek 
> 
> On Tue, Jun 18, 2024 at 10:00:20PM -0700, Collin Funk wrote:
> 
> > Hi,
> > 
> > I noticed that strmode(3) says that the first argument should be
> > mode_t. OpenBSD declares it with int which is not compatible since
> > mode_t appears to be unsigned, from what I can tell.
> > 
> > NetBSD fixed this a long time ago and FreeBSD did the same before the
> > 14.0 release.
> > 
> > Apologies for the lack of diff, I don't have access to an OpenBSD
> > machine at the moment. I think something like this would work though:
> > 
> > In sys/_types.h:
> 
> I think this snippet should be in sys/types.h.
> 
> > 
> > #ifndef _MODE_T_DEFINED_
> > #define _MODE_T_DEFINED_
> > typedef __mode_tmode_t
> > #endif
> > 
> > and then in string.h:
> 
> > This part is not going to work as string.h includes machine/_types.h
> > but not sys/_types.h (or sys/types.h for that matter). FreeBSD
> > modified it to include sys/_types.h.
> 
> > #ifndef _MODE_T_DEFINED_
> > #define _MODE_T_DEFINED_
> > typedef __mode_tmode_t
> > #endif
> > void strmode(mode_t, char *);
> > 
> > Thanks,
> > Collin
> > 
> 
> > Additionally, the implementation in src/lib/libc/string/strmode.c
> > needs to start using mode_t.
> > 
> > Building base now with the diff below. So far so good.
> > 
> > But this is more tricky than you would think. Modifying string.h to
> > include more could have unwanted side effects for applications.
> 
>   -Otto
> 
> Index: include/string.h
> ===
> RCS file: /home/cvs/src/include/string.h,v
> diff -u -p -r1.32 string.h
> --- include/string.h  5 Sep 2017 03:16:13 -   1.32
> +++ include/string.h  19 Jun 2024 13:11:42 -
> @@ -37,7 +37,7 @@
>  
>  #include 
>  #include 
> -#include <machine/_types.h>
> +#include <sys/_types.h>
>  
>  /*
>   * POSIX mandates that certain string functions not present in ISO C
> @@ -128,7 +128,11 @@ size_tstrlcat(char *, const char *, si
>   __attribute__ ((__bounded__(__string__,1,3)));
>  size_tstrlcpy(char *, const char *, size_t)
>   __attribute__ ((__bounded__(__string__,1,3)));
> -void  strmode(int, char *);
> +#ifndef _MODE_T_DEFINED_
> +#define _MODE_T_DEFINED_
> +typedef __mode_t mode_t;
> +#endif

It may be safer to drop this bit...

> +void  strmode(mode_t, char *);

...and use __mode_t in the prototype and implementation.




Re: strmode should take a mode_t instead of int.

2024-06-19 Thread Todd C . Miller
On Wed, 19 Jun 2024 15:17:05 +0200, Otto Moerbeek wrote:

> Additionally, the implementation in src/lib/libc/string/strmode.c
> needs to start using mode_t.
>
> Building base now with the diff below. So far so good.
>
> But this is more tricky than you would think. Modifying string.h to
> include more could have unwanted side effects for applications.

This looks fine to me but I agree with kettenis@ that just using
__mode_t directly in the prototype would be a bit safer since it
avoids any namespace pollution.  I don't expect any problems including
sys/_types.h in place of machine/_types.h.

 - todd
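
To make the signature under discussion concrete, here is a minimal, portable sketch of a strmode-like function whose first parameter is mode_t rather than int. It is an illustration only: demo_strmode is a hypothetical name, the body handles just the file type character and the nine rwx bits (no setuid, setgid or sticky handling), and it is not the libc implementation.

```c
#include <sys/types.h>
#include <sys/stat.h>

/*
 * Simplified illustration: write a type character followed by nine
 * permission characters and a terminating NUL into p, which must have
 * room for at least 11 bytes. Hypothetical helper, not libc strmode().
 */
void
demo_strmode(mode_t mode, char *p)
{
    static const char rwx[] = "rwxrwxrwx";
    int i;

    /* file type */
    if (S_ISDIR(mode))
        *p++ = 'd';
    else if (S_ISREG(mode))
        *p++ = '-';
    else
        *p++ = '?';

    /* owner, group and other permission bits, highest bit first */
    for (i = 0; i < 9; i++)
        *p++ = (mode & (0400 >> i)) ? rwx[i] : '-';
    *p = '\0';
}
```

A caller can pass st_mode from stat(2) directly, since that field already has type mode_t; with an int parameter the same call relies on an implicit conversion between an unsigned and a signed type.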



Re: strmode should take a mode_t instead of int.

2024-06-19 Thread Theo de Raadt
Todd C. Miller  wrote:

> On Wed, 19 Jun 2024 15:17:05 +0200, Otto Moerbeek wrote:
> 
> > Additionally, the implementation in src/lib/libc/string/strmode.c
> > needs to start using mode_t.
> >
> > Building base now with the diff below. So far so good.
> >
> > But this is more tricky than you would think. Modifying string.h to
> > include more could have unwanted side effects for applications.
> 
> This looks fine to me but I agree with kettenis@ that just using
> __mode_t directly in the prototype would be a bit safer since it
> avoids any namespace pollution.  I don't expect any problems including
> sys/_types.h in place of machine/_types.h.

Agree.  I don't think it needs any API/ABI cranks, either.



Re: strmode should take a mode_t instead of int.

2024-06-19 Thread Theo Buehler
These are the ports using strmode.

archivers/libarchive
archivers/libtar
editors/emacs
games/gemrb
math/octave
misc/findutils
net/lftp
security/ssh-ldap-helper
shells/ksh93
sysutils/bfs
sysutils/colorls
sysutils/coreutils
sysutils/lnav
sysutils/tarsnap

Given the short list and the nature of the change, I don't think it's
necessary to run a bulk build, but inspecting a few of them would be good,
especially libarchive and coreutils, which a lot of ports depend on.
And there's emacs in this list.



TLS handshake failure during pkg_add

2024-06-19 Thread Mizsei Zoltán
Hi,

I am facing this issue on my VPS. All other machines are unaffected.
All of them are in the same TZ.

vps# pkg-add -u
/bin/ksh: pkg-add: not found
vps#
vps# pkg_add -u
quirks-7.14 signed on 2024-06-15T18:27:56Z
https://cloudflare.cdn.openbsd.org/pub/OpenBSD//7.5/packages/amd64/updatedb-0p0.tgz:
 TLS handshake failure: handshake failed: error:02FFF00D:system library:func(4095):Permission denied
signify: gzheader truncated
vps# uname -a
OpenBSD vps.extrowerk.com 7.5 GENERIC#79 amd64
vps# df -h
Filesystem     Size    Used   Avail Capacity  Mounted on
/dev/sd1a      605M   83.6M    491M    15%    /
/dev/sd1k      3.7G    1.0M    3.5G     1%    /home
/dev/sd1d      848M   10.0K    806M     1%    /tmp
/dev/sd1f      2.3G    1.5G    764M    67%    /usr
/dev/sd1g      643M    303M    308M    50%    /usr/X11R6
/dev/sd1h      2.3G    182M    2.0G     9%    /usr/local
/dev/sd1j      5.2G    2.0K    4.9G     1%    /usr/obj
/dev/sd1i      1.6G    2.0K    1.5G     1%    /usr/src
/dev/sd1e      1.2G   29.1M    1.1G     3%    /var
vps# cat /etc/installurl
https://cloudflare.cdn.openbsd.org/pub/OpenBSD/
#http://cdn.openbsd.org/pub/OpenBSD <- I have disabled it, the same issue happens with that too.
vps# cat /etc/t
termcap  ttys
vps# date
Wed Jun 19 23:31:01 CEST 2024 <- it is correct
vps# ls -la /etc/localtime
lrwxr-xr-x  1 root  wheel  35 Feb 27 21:45 /etc/localtime -> /usr/share/zoneinfo/Europe/Budapest <- it is correct


--ext



Re: TLS handshake failure during pkg_add

2024-06-19 Thread Stuart Henderson
On 2024/06/19 23:36, Mizsei Zoltán wrote:
> Hi,
> 
> I am facing this issue on my VPS. All other machines are unaffected. All of 
> them are in the same TZ.
> 
> vps# pkg-add -u
> /bin/ksh: pkg-add: not found
> vps#
> vps# pkg_add -u
> quirks-7.14 signed on 2024-06-15T18:27:56Z
> https://cloudflare.cdn.openbsd.org/pub/OpenBSD//7.5/packages/amd64/updatedb-0p0.tgz:
>  TLS handshake failure: handshake failed: error:02FFF00D:system library:func(4095):Permission denied

That looks rather like PF is blocking the outbound connection.
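
The reasoning above goes from the "Permission denied" text in the error back to a likely cause. A small C sketch of that kind of mapping follows, assuming the message corresponds to an EACCES errno on the socket; connect_errno_hint and its hint strings are hypothetical and purely illustrative.

```c
#include <errno.h>

/*
 * Hypothetical helper: map an errno value from a failed connect(2) to
 * a short troubleshooting hint. The hint texts are illustrative only.
 */
const char *
connect_errno_hint(int err)
{
    switch (err) {
    case EACCES:
        return "permission denied: check local packet filter rules";
    case ECONNREFUSED:
        return "connection refused: nothing listening on the remote port";
    case ETIMEDOUT:
        return "timed out: packets may be dropped along the path";
    default:
        return "unclassified error";
    }
}
```

The point of the sketch is that a permission error on an outbound connection points at local policy rather than at the remote server or the TLS layer.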



Re: strmode should take a mode_t instead of int.

2024-06-19 Thread Collin Funk
Mark Kettenis wrote:
>> +#ifndef _MODE_T_DEFINED_
>> +#define _MODE_T_DEFINED_
>> +typedef __mode_tmode_t;
>> +#endif
>
> It may be safer to drop this bit...
> 
>> +void strmode(mode_t, char *);
> ...and use __mode_t in the prototype and implementation.

Someone including <string.h> might expect mode_t to be defined without
having to include another header.

However, no standard requires that and it seems strmode is the only
function that uses it. So if it is too much of an inconvenience I
wouldn't stress about it.

Collin



Re: TLS handshake failure during pkg_add

2024-06-19 Thread Mizsei Zoltán
Hi and thanks for your reply.

Some extra information:
- If I try pkg_add many times, it eventually does its job without any error,
but it takes many tries.
- Switching to another mirror via /etc/installurl helps *sometimes*.
- I don't have any issues with other networking programs.
The firewall could still be the culprit, as you suggest. I set up pf
according to this blog post:
https://blog.thechases.com/posts/bsd/aggressive-pf-config-for-ssh-protection/
Do you see any obvious errors here?

Regards,
-ext


Stuart Henderson wrote on Wed, Jun 19, 2024, at 23:41:
> On 2024/06/19 23:36, Mizsei Zoltán wrote:
>> Hi,
>> 
>> I am facing this issue on my VPS. All other machines are unaffected. All of 
>> them are in the same TZ.
>> 
>> vps# pkg-add -u
>> /bin/ksh: pkg-add: not found
>> vps#
>> vps# pkg_add -u
>> quirks-7.14 signed on 2024-06-15T18:27:56Z
>> https://cloudflare.cdn.openbsd.org/pub/OpenBSD//7.5/packages/amd64/updatedb-0p0.tgz:
>>  TLS handshake failure: handshake failed: error:02FFF00D:system library:func(4095):Permission denied
>
> That looks rather like PF is blocking the outbound connection.

-- 
--Z--