Re: vmd/vmm: remove an ioctl from the vcpu hotpath, go brrr

2023-09-05 Thread Dave Voutila


Mischa  writes:

> On 2023-09-05 14:27, Dave Voutila wrote:
>> Mike Larkin  writes:
>>
>>> On Mon, Sep 04, 2023 at 07:57:18PM +0200, Mischa wrote:
 On 2023-09-04 18:58, Mischa wrote:
 > On 2023-09-04 18:55, Mischa wrote:
>> /snip
>>
 > > Adding the sleep 2 does indeed help. I managed to get 20 VMs started
 > > this way, before it would choke on 2-3.
 > >
 > > Do I only need the unpatched kernel or also the vmd/vmctl from snap?
 >
 > I do still get the same message on the console, but the machine isn't
 > freezing up.
 >
 > [umd173152/210775 sp=7a5f577a1780 inside 702698535000-702698d34fff: not
 > MAP_STACK
 Starting 30 VMs this way caused the machine to become unresponsive
 again,
 but nothing on the console. :(
 Mischa
>>> Were you seeing these uvm errors before this diff? If so, this
>>> isn't
>>> causing the problem and something else is.
>> I don't believe we solved any of the underlying uvm issues in Bruges
>> last year. Mischa, can you test with just the latest snapshot/-current?
>> I'd imagine starting and stopping many vm's now is exacerbating the
>> issue because of the fork/exec for devices plus the ioctl to do a uvm
>> share into the device process address space.
>>
>>> If this diff causes the errors to occur, and without the diff it's
>>> fine, then
>>> we need to look into that.
>>> Also I think a pid number in that printf might be useful, I'll see
>>> what I can
>>> find. If it's not vmd causing this and rather some other process
>>> then that
>>> would be good to know also.
>> Sadly it looks like that printf doesn't spit out the offending
>> pid. :(
>
> Just to confirm I am seeing this behavior on the latest snap without
> the patch as well.

Since this diff isn't the cause, I've committed it. Thanks for
testing. I'll see if I can reproduce your MAP_STACK issues.

> Just started 10 VMs with sleep 2, machine freezes, but nothing on the
> console. :(

For now, I'd recommend spacing out vm launches. I'm pretty sure it's
related to the uvm corruption we saw last year when creating, starting,
and destroying vm's rapidly in a loop.

-dv



Re: clang 15 and zlib

2023-09-05 Thread Todd C . Miller
On Wed, 06 Sep 2023 10:43:34 +1000, Jonathan Gray wrote:

> tb updated us to the newer version a while ago

OK millert@

 - todd



Re: riscv64 possible alignment issue?

2023-09-05 Thread Theo Buehler
> If you have another idea why I get these let me know, I'm going to bed now
> and won't be up until 6AM or later tomorrow.  I'm letting my kernel compile
> on the slow QEMU host overnight.

Please cut that crap out. Nobody cares.



Re: clang 15 and zlib

2023-09-05 Thread Theo Buehler
On Wed, Sep 06, 2023 at 10:43:34AM +1000, Jonathan Gray wrote:
> On Wed, Dec 28, 2022 at 02:36:56PM -0700, Todd C. Miller wrote:
> > OK millert@ as well.  There is no point in trying to fix this locally
> > when upstream zlib will be changing it in the near(?) future.
> > 
> >  - todd
> 
> tb updated us to the newer version a while ago

Please go ahead



Re: clang 15 and zlib

2023-09-05 Thread Jonathan Gray
On Wed, Dec 28, 2022 at 02:36:56PM -0700, Todd C. Miller wrote:
> OK millert@ as well.  There is no point in trying to fix this locally
> when upstream zlib will be changing it in the near(?) future.
> 
>  - todd

tb updated us to the newer version a while ago

diff --git sys/arch/amd64/conf/Makefile.amd64 sys/arch/amd64/conf/Makefile.amd64
index 6f760b174bb..e246d883a63 100644
--- sys/arch/amd64/conf/Makefile.amd64
+++ sys/arch/amd64/conf/Makefile.amd64
@@ -78,9 +78,6 @@ CMACHFLAGS+=  -mretpoline-external-thunk 
-fcf-protection=branch
 NO_INTEGR_AS=  -no-integrated-as
 CWARNFLAGS+=   -Wno-address-of-packed-member -Wno-constant-conversion \
-Wno-unused-but-set-variable -Wno-gnu-folding-constant
-# XXX Workaround for zlib + clang 15
-# https://github.com/madler/zlib/issues/633
-CWARNFLAGS+=   -Wno-deprecated-non-prototype -Wno-unknown-warning-option
 .endif
 
 DEBUG?=-g
diff --git sys/arch/amd64/stand/Makefile.inc sys/arch/amd64/stand/Makefile.inc
index d423cae6555..e37f23005d6 100644
--- sys/arch/amd64/stand/Makefile.inc
+++ sys/arch/amd64/stand/Makefile.inc
@@ -25,9 +25,6 @@ SACFLAGS+=-nostdinc -fno-builtin -fpack-struct
 
 .include 
 .if ${COMPILER_VERSION:Mclang}
-# XXX Workaround for zlib + clang 15
-# https://github.com/madler/zlib/issues/633
-CFLAGS+=   -Wno-deprecated-non-prototype -Wno-unknown-warning-option
 NO_INTEGR_AS=  -no-integrated-as
 .endif
 
diff --git sys/arch/arm64/conf/Makefile.arm64 sys/arch/arm64/conf/Makefile.arm64
index 1719e3d045d..1149af714a8 100644
--- sys/arch/arm64/conf/Makefile.arm64
+++ sys/arch/arm64/conf/Makefile.arm64
@@ -55,9 +55,6 @@ CWARNFLAGS=   -Werror -Wall -Wimplicit-function-declaration \
-Wno-constant-conversion -Wno-address-of-packed-member \
-Wno-unused-but-set-variable -Wno-gnu-folding-constant \
-Wframe-larger-than=2047
-# XXX Workaround for zlib + clang 15
-# https://github.com/madler/zlib/issues/633
-CWARNFLAGS+=   -Wno-deprecated-non-prototype -Wno-unknown-warning-option
 
 CMACHFLAGS=-march=armv8-a+nofp+nosimd \
-fno-omit-frame-pointer -mno-omit-leaf-frame-pointer \
diff --git sys/arch/arm64/stand/efiboot/Makefile 
sys/arch/arm64/stand/efiboot/Makefile
index fbd86b1c292..216a9a98b7c 100644
--- sys/arch/arm64/stand/efiboot/Makefile
+++ sys/arch/arm64/stand/efiboot/Makefile
@@ -53,9 +53,6 @@ COPTS+=   -Wno-attributes -Wno-format
 COPTS+=-ffreestanding -fno-stack-protector
 COPTS+=-fshort-wchar -fPIC -fno-builtin
 COPTS+=-Wall -Werror
-# XXX Workaround for zlib + clang 15
-# https://github.com/madler/zlib/issues/633
-COPTS+=-Wno-deprecated-non-prototype 
-Wno-unknown-warning-option
 
 PROG.elf=  ${PROG:S/.EFI/.elf/}
 CLEANFILES+=   ${PROG.elf} ${PROG.elf}.tmp
diff --git sys/arch/armv7/conf/Makefile.armv7 sys/arch/armv7/conf/Makefile.armv7
index 78944a7d74d..30d2aea8d7e 100644
--- sys/arch/armv7/conf/Makefile.armv7
+++ sys/arch/armv7/conf/Makefile.armv7
@@ -28,9 +28,6 @@ CWARNFLAGS=   -Werror -Wall -Wimplicit-function-declaration \
-Wno-constant-conversion -Wno-address-of-packed-member \
-Wno-unused-but-set-variable -Wno-gnu-folding-constant \
-Wframe-larger-than=2047
-# XXX Workaround for zlib + clang 15
-# https://github.com/madler/zlib/issues/633
-CWARNFLAGS+=   -Wno-deprecated-non-prototype -Wno-unknown-warning-option
 
 CMACHFLAGS=-msoft-float -march=armv7a
 CMACHFLAGS+=   -ffreestanding ${NOPIE_FLAGS}
diff --git sys/arch/armv7/stand/efiboot/Makefile 
sys/arch/armv7/stand/efiboot/Makefile
index f769148f6ae..907d21e7a8e 100644
--- sys/arch/armv7/stand/efiboot/Makefile
+++ sys/arch/armv7/stand/efiboot/Makefile
@@ -51,9 +51,6 @@ COPTS+=   -ffreestanding -fno-stack-protector
 COPTS+=-fshort-wchar -fPIC -fno-builtin
 COPTS+=-Wall -Werror
 COPTS+=-mfloat-abi=soft
-# XXX Workaround for zlib + clang 15
-# https://github.com/madler/zlib/issues/633
-COPTS+=-Wno-deprecated-non-prototype 
-Wno-unknown-warning-option
 
 PROG.elf=  ${PROG:S/.EFI/.elf/}
 CLEANFILES+=   ${PROG.elf} ${PROG.elf}.tmp
diff --git sys/arch/i386/conf/Makefile.i386 sys/arch/i386/conf/Makefile.i386
index 77b2bb5540e..ddc752aa3bc 100644
--- sys/arch/i386/conf/Makefile.i386
+++ sys/arch/i386/conf/Makefile.i386
@@ -46,9 +46,6 @@ CMACHFLAGS+=  -mretpoline
 NO_INTEGR_AS=  -no-integrated-as
 CWARNFLAGS+=   -Wno-address-of-packed-member -Wno-constant-conversion \
-Wno-unused-but-set-variable -Wno-gnu-folding-constant
-# XXX Workaround for zlib + clang 15
-# https://github.com/madler/zlib/issues/633
-CWARNFLAGS+=   -Wno-deprecated-non-prototype -Wno-unknown-warning-option
 .endif
 
 DEBUG?=-g
diff --git sys/arch/i386/stand/Makefile.inc sys/arch/i386/stand/Makefile.inc
index d9898462e50..ca6c22b84f3 100644
--- sys/arch/i386/stand/Makefile.inc
+++ sys/arch/i386/st

riscv64 possible alignment issue?

2023-09-05 Thread Peter J. Philipp
Hi,

I'm porting OpenBSD to the Mango Pi D1.  Most of the work is done but now
is the time to try to get a RAMDISK kernel to boot.

I'm having an issue with atomic_store_64() as shown here (line 1192):

   1184 for (; va < DMAP_MAX_ADDRESS && pa < max_pa;
   1185 pa += L1_SIZE, va += L1_SIZE, l1_slot++) {
   1186 KASSERT(l1_slot < Ln_ENTRIES);
   1187
   1188 /* gigapages */
   1189 pn = (pa / PAGE_SIZE);
   1190 entry = PTE_KERN;
   1191 entry |= (pn << PTE_PPN0_S);
   1192 atomic_store_64(&l1[l1_slot], entry);
   1193 }

The D1 seems to hang on this call.  To me that seems like the store hangs that
I did on powerpc64 when memory was not aligned.  When I replaced the 
AMO instruction with a simple C replacement it worked.  So I wonder if the 
following is correct or not:

entry = _ALIGN(PTE_KERN);

I looked at riscv-privileged-202111203.pdf (page 79) which is for Sv32 and
a few pages down is the Sv39 (page 84).  But I can't make sense of it.  And
my Mona Lisa book that I have from Waterman mentions on page 101:

''Misaligned address exceptions occur when the effective address isn't divisible
by the access size - for example, amoadd.w with an address of 0x12.''

I don't think in the bootstrapping of pmap that traps are turned on yet,
but I could be mistaken, hence the hang.

If you have another idea why I get these let me know, I'm going to bed now
and won't be up until 6AM or later tomorrow.  I'm letting my kernel compile
on the slow QEMU host overnight.

Best Regards,
-peter

-- 
Over thirty years experience on Unix-like Operating Systems starting with QNX.



pax(1): pax format write support

2023-09-05 Thread Jeremie Courreges-Anglas


A long time ago the posix folks extended the ustar format to allow
representing arbitrarily big files, long file names, precise timestamps,
etc.  We have support to read such archives but no support to write them
out.  Here's a minimal proposal following discussions with Caspar and
other folks.

What the diff below does:

- provide a 'pax' format usable with pax(1) -x pax

- use extended headers to store file names and link names that don't fit
  in a standard ustar header.  Long file names in the ports tree were
  the intial motive for this effort.

What the diff below doesn't do:

- handle all the file attributes that ought to benefit from extended
  headers.  Right now I haven't even written that code and I hope to
  extend handled attributes gradually.

- the diff *doesn't change the default format* used by either pax(1) or
  tar(1).  I think changing the default at some point makes sense, since
  the format is superior and support is widely available.  My opinion is
  that we shouldn't expose this format by default until we're confident
  that the code is reasonably complete and exercised.  I have no idea
  whether this ought to happen before the next release.

- since the diff doesn't change the default for tar(1) and since tar(1)
  doesn't have a generic option to choose the format used, this code is
  unreachable using tar(1).  I'd *love* to have a bikeshed discussion
  about the proper way to handle that, but preferably after the code gets
  reviewed, pushed in the tree and improved. :)

Two more notes:
- no size increase for distrib/special/pax
- pax 101: you can exercise the new format with:
pax -x pax -w files... > outfile.tar

Thoughts?  ok?


Index: cpio.1
===
RCS file: /home/cvs/src/bin/pax/cpio.1,v
retrieving revision 1.36
diff -u -p -r1.36 cpio.1
--- cpio.1  16 Jan 2020 16:46:46 -  1.36
+++ cpio.1  4 Sep 2023 16:26:53 -
@@ -98,6 +98,8 @@ format.
 Old octal character
 .Nm
 format.
+.It Ar pax
+POSIX pax format.
 .It Ar sv4cpio
 SVR4 hex
 .Nm
@@ -173,6 +175,8 @@ format.
 Old octal character
 .Nm
 format.
+.It Ar pax
+POSIX pax format.
 .It Ar sv4cpio
 SVR4 hex
 .Nm
@@ -298,6 +302,8 @@ be used for larger files.
 .It bcpio Ta "4 Gigabytes"
 .It sv4cpio Ta "4 Gigabytes"
 .It cpio Ta "8 Gigabytes"
+.\" XXX should be "unlimited"
+.It pax Ta "8 Gigabytes"
 .It tar Ta "8 Gigabytes"
 .It ustar Ta "8 Gigabytes"
 .El
Index: extern.h
===
RCS file: /home/cvs/src/bin/pax/extern.h,v
retrieving revision 1.60
diff -u -p -r1.60 extern.h
--- extern.h23 Mar 2020 20:04:19 -  1.60
+++ extern.h4 Sep 2023 14:59:23 -
@@ -296,6 +296,7 @@ int tar_wr(ARCHD *);
 int ustar_id(char *, int);
 int ustar_rd(ARCHD *, char *);
 int ustar_wr(ARCHD *);
+int pax_wr(ARCHD *);
 
 /*
  * tty_subs.c
Index: options.c
===
RCS file: /home/cvs/src/bin/pax/options.c,v
retrieving revision 1.105
diff -u -p -r1.105 options.c
--- options.c   17 Jan 2023 16:20:28 -  1.105
+++ options.c   4 Sep 2023 16:44:55 -
@@ -214,6 +214,8 @@ FSUB fsub[] = {
{ },
 /* 9: gzip, to detect failure to use -z */
{ },
+/* 10: POSIX PAX */
+   { },
 #else
 /* 6: compress, to detect failure to use -Z */
{NULL, 0, 4, 0, 0, 0, 0, compress_id},
@@ -223,6 +225,10 @@ FSUB fsub[] = {
{NULL, 0, 4, 0, 0, 0, 0, bzip2_id},
 /* 9: gzip, to detect failure to use -z */
{NULL, 0, 4, 0, 0, 0, 0, gzip_id},
+/* 10: POSIX PAX */
+   {"pax", 5120, BLKMULT, 0, 1, BLKMULT, 0, ustar_id, no_op,
+   ustar_rd, tar_endrd, no_op, pax_wr, tar_endwr, tar_trail,
+   tar_opt},
 #endif
 };
 #defineF_OCPIO 0   /* format when called as cpio -6 */
Index: pax.1
===
RCS file: /home/cvs/src/bin/pax/pax.1,v
retrieving revision 1.76
diff -u -p -r1.76 pax.1
--- pax.1   31 Mar 2022 17:27:14 -  1.76
+++ pax.1   5 Sep 2023 13:46:15 -
@@ -868,6 +868,11 @@ standard.
 The default blocksize for this format is 10240 bytes.
 Filenames stored by this format must be 100 characters or less in length;
 the total pathname must be 256 characters or less.
+.It Cm pax
+The pax interchange format specified in the
+.St -p1003.1-2001
+standard.
+The default blocksize for this format is 5120 bytes.
 .El
 .Pp
 .Nm
@@ -1081,9 +1086,10 @@ utility is compliant with the
 specification,
 except that the
 .Cm pax
-archive format and the
+archive format is only partially supported,
+and the
 .Cm listopt
-keyword are unsupported.
+keyword is unsupported.
 .Pp
 The flags
 .Op Fl 0BDEGjOPTUYZz ,
Index: tar.c
===
RCS file: /home/cvs/src/bin/pax/tar.c,v
retrieving revision 1.73
diff -u -p -r1.73 tar.c
--- tar.c   4 Sep 2023 17:05:34 -   1.73
+++ tar.c   4 Sep 2023 2

Re: clockintr: add clockintr_advance_random()

2023-09-05 Thread Mike Larkin
On Tue, Sep 05, 2023 at 09:17:27AM -0500, Scott Cheloha wrote:
> mpi@ suggests folding the pseudorandom advance code from
> clockintr_statclock() into the clockintr API itself.  This replaces
> three API calls -- clockintr_expiration(), clockintr_nsecuptime(), and
> clockintr_schedule() -- we just one call to a new function,
> clockintr_advance_random().
>
> I'm fine with it.  A pseudorandom period is an odd thing and
> supporting it is difficult.  Having a single bespoke API to support it
> might be the lesser of two evils.
>
> With this in place, the statclock() patch on tech@ can be simplified.
>
> ok?
>

This seems like a good idea. ok mlarkin

> Index: kern_clockintr.c
> ===
> RCS file: /cvs/src/sys/kern/kern_clockintr.c,v
> retrieving revision 1.33
> diff -u -p -r1.33 kern_clockintr.c
> --- kern_clockintr.c  26 Aug 2023 22:21:00 -  1.33
> +++ kern_clockintr.c  5 Sep 2023 14:11:38 -
> @@ -42,8 +42,8 @@ uint32_t statclock_avg; /* [I] average
>  uint32_t statclock_min;  /* [I] minimum statclock period 
> (ns) */
>  uint32_t statclock_mask; /* [I] set of allowed offsets */
>
> +uint64_t clockintr_advance_random(struct clockintr *, uint64_t, uint32_t);
>  void clockintr_cancel_locked(struct clockintr *);
> -uint64_t clockintr_expiration(const struct clockintr *);
>  void clockintr_hardclock(struct clockintr *, void *);
>  uint64_t clockintr_nsecuptime(const struct clockintr *);
>  void clockintr_schedule(struct clockintr *, uint64_t);
> @@ -345,6 +345,30 @@ clockintr_advance(struct clockintr *cl,
>   return count;
>  }
>
> +/*
> + * Custom version of clockintr_advance() to support a pseudorandom
> + * statclock() period.  Hopefully we can throw this out at some point
> + * in the future.
> + */
> +uint64_t
> +clockintr_advance_random(struct clockintr *cl, uint64_t lo, uint32_t mask)
> +{
> + uint64_t count = 0;
> + struct clockintr_queue *cq = cl->cl_queue;
> + uint32_t off;
> +
> + KASSERT(cl == &cq->cq_shadow);
> +
> + while (cl->cl_expiration <= cq->cq_uptime) {
> + while ((off = (random() & mask)) == 0)
> + continue;
> + cl->cl_expiration += lo + off;
> + count++;
> + }
> + SET(cl->cl_flags, CLST_SHADOW_PENDING);
> + return count;
> +}
> +
>  void
>  clockintr_cancel(struct clockintr *cl)
>  {
> @@ -402,21 +426,6 @@ clockintr_establish(struct clockintr_que
>   return cl;
>  }
>
> -uint64_t
> -clockintr_expiration(const struct clockintr *cl)
> -{
> - uint64_t expiration;
> - struct clockintr_queue *cq = cl->cl_queue;
> -
> - if (cl == &cq->cq_shadow)
> - return cl->cl_expiration;
> -
> - mtx_enter(&cq->cq_mtx);
> - expiration = cl->cl_expiration;
> - mtx_leave(&cq->cq_mtx);
> - return expiration;
> -}
> -
>  void
>  clockintr_schedule(struct clockintr *cl, uint64_t expiration)
>  {
> @@ -478,13 +487,6 @@ clockintr_stagger(struct clockintr *cl,
>   mtx_leave(&cq->cq_mtx);
>  }
>
> -uint64_t
> -clockintr_nsecuptime(const struct clockintr *cl)
> -{
> - KASSERT(cl == &cl->cl_queue->cq_shadow);
> - return cl->cl_queue->cq_uptime;
> -}
> -
>  void
>  clockintr_hardclock(struct clockintr *cl, void *frame)
>  {
> @@ -498,20 +500,11 @@ clockintr_hardclock(struct clockintr *cl
>  void
>  clockintr_statclock(struct clockintr *cl, void *frame)
>  {
> - uint64_t count, expiration, i, uptime;
> - uint32_t off;
> + uint64_t count, i;
>
>   if (ISSET(clockintr_flags, CL_RNDSTAT)) {
> - count = 0;
> - expiration = clockintr_expiration(cl);
> - uptime = clockintr_nsecuptime(cl);
> - while (expiration <= uptime) {
> - while ((off = (random() & statclock_mask)) == 0)
> - continue;
> - expiration += statclock_min + off;
> - count++;
> - }
> - clockintr_schedule(cl, expiration);
> + count = clockintr_advance_random(cl, statclock_min,
> + statclock_mask);
>   } else {
>   count = clockintr_advance(cl, statclock_avg);
>   }
>



Re: vmd/vmm: remove an ioctl from the vcpu hotpath, go brrr

2023-09-05 Thread Mischa

On 2023-09-05 14:27, Dave Voutila wrote:

Mike Larkin  writes:


On Mon, Sep 04, 2023 at 07:57:18PM +0200, Mischa wrote:

On 2023-09-04 18:58, Mischa wrote:
> On 2023-09-04 18:55, Mischa wrote:


/snip


> > Adding the sleep 2 does indeed help. I managed to get 20 VMs started
> > this way, before it would choke on 2-3.
> >
> > Do I only need the unpatched kernel or also the vmd/vmctl from snap?
>
> I do still get the same message on the console, but the machine isn't
> freezing up.
>
> [umd173152/210775 sp=7a5f577a1780 inside 702698535000-702698d34fff: not
> MAP_STACK

Starting 30 VMs this way caused the machine to become unresponsive 
again,

but nothing on the console. :(

Mischa


Were you seeing these uvm errors before this diff? If so, this isn't
causing the problem and something else is.


I don't believe we solved any of the underlying uvm issues in Bruges
last year. Mischa, can you test with just the latest snapshot/-current?

I'd imagine starting and stopping many vm's now is exacerbating the
issue because of the fork/exec for devices plus the ioctl to do a uvm
share into the device process address space.



If this diff causes the errors to occur, and without the diff it's 
fine, then

we need to look into that.


Also I think a pid number in that printf might be useful, I'll see 
what I can
find. If it's not vmd causing this and rather some other process then 
that

would be good to know also.


Sadly it looks like that printf doesn't spit out the offending pid. :(


Just to confirm I am seeing this behavior on the latest snap without the 
patch as well.
Just started 10 VMs with sleep 2, machine freezes, but nothing on the 
console. :(


Mischa



clockintr: add clockintr_advance_random()

2023-09-05 Thread Scott Cheloha
mpi@ suggests folding the pseudorandom advance code from
clockintr_statclock() into the clockintr API itself.  This replaces
three API calls -- clockintr_expiration(), clockintr_nsecuptime(), and
clockintr_schedule() -- we just one call to a new function,
clockintr_advance_random().

I'm fine with it.  A pseudorandom period is an odd thing and
supporting it is difficult.  Having a single bespoke API to support it
might be the lesser of two evils.

With this in place, the statclock() patch on tech@ can be simplified.

ok?

Index: kern_clockintr.c
===
RCS file: /cvs/src/sys/kern/kern_clockintr.c,v
retrieving revision 1.33
diff -u -p -r1.33 kern_clockintr.c
--- kern_clockintr.c26 Aug 2023 22:21:00 -  1.33
+++ kern_clockintr.c5 Sep 2023 14:11:38 -
@@ -42,8 +42,8 @@ uint32_t statclock_avg;   /* [I] average
 uint32_t statclock_min;/* [I] minimum statclock period 
(ns) */
 uint32_t statclock_mask;   /* [I] set of allowed offsets */
 
+uint64_t clockintr_advance_random(struct clockintr *, uint64_t, uint32_t);
 void clockintr_cancel_locked(struct clockintr *);
-uint64_t clockintr_expiration(const struct clockintr *);
 void clockintr_hardclock(struct clockintr *, void *);
 uint64_t clockintr_nsecuptime(const struct clockintr *);
 void clockintr_schedule(struct clockintr *, uint64_t);
@@ -345,6 +345,30 @@ clockintr_advance(struct clockintr *cl, 
return count;
 }
 
+/*
+ * Custom version of clockintr_advance() to support a pseudorandom
+ * statclock() period.  Hopefully we can throw this out at some point
+ * in the future.
+ */
+uint64_t
+clockintr_advance_random(struct clockintr *cl, uint64_t lo, uint32_t mask)
+{
+   uint64_t count = 0;
+   struct clockintr_queue *cq = cl->cl_queue;
+   uint32_t off;
+
+   KASSERT(cl == &cq->cq_shadow);
+
+   while (cl->cl_expiration <= cq->cq_uptime) {
+   while ((off = (random() & mask)) == 0)
+   continue;
+   cl->cl_expiration += lo + off;
+   count++;
+   }
+   SET(cl->cl_flags, CLST_SHADOW_PENDING);
+   return count;
+}
+
 void
 clockintr_cancel(struct clockintr *cl)
 {
@@ -402,21 +426,6 @@ clockintr_establish(struct clockintr_que
return cl;
 }
 
-uint64_t
-clockintr_expiration(const struct clockintr *cl)
-{
-   uint64_t expiration;
-   struct clockintr_queue *cq = cl->cl_queue;
-
-   if (cl == &cq->cq_shadow)
-   return cl->cl_expiration;
-
-   mtx_enter(&cq->cq_mtx);
-   expiration = cl->cl_expiration;
-   mtx_leave(&cq->cq_mtx);
-   return expiration;
-}
-
 void
 clockintr_schedule(struct clockintr *cl, uint64_t expiration)
 {
@@ -478,13 +487,6 @@ clockintr_stagger(struct clockintr *cl, 
mtx_leave(&cq->cq_mtx);
 }
 
-uint64_t
-clockintr_nsecuptime(const struct clockintr *cl)
-{
-   KASSERT(cl == &cl->cl_queue->cq_shadow);
-   return cl->cl_queue->cq_uptime;
-}
-
 void
 clockintr_hardclock(struct clockintr *cl, void *frame)
 {
@@ -498,20 +500,11 @@ clockintr_hardclock(struct clockintr *cl
 void
 clockintr_statclock(struct clockintr *cl, void *frame)
 {
-   uint64_t count, expiration, i, uptime;
-   uint32_t off;
+   uint64_t count, i;
 
if (ISSET(clockintr_flags, CL_RNDSTAT)) {
-   count = 0;
-   expiration = clockintr_expiration(cl);
-   uptime = clockintr_nsecuptime(cl);
-   while (expiration <= uptime) {
-   while ((off = (random() & statclock_mask)) == 0)
-   continue;
-   expiration += statclock_min + off;
-   count++;
-   }
-   clockintr_schedule(cl, expiration);
+   count = clockintr_advance_random(cl, statclock_min,
+   statclock_mask);
} else {
count = clockintr_advance(cl, statclock_avg);
}



ip send shared netlock

2023-09-05 Thread Alexander Bluhm
Hi,

ip_output() and ip6_output() should be MP safe when called with
NULL options.

ok?

bluhm

Index: netinet/ip_input.c
===
RCS file: /data/mirror/openbsd/cvs/src/sys/netinet/ip_input.c,v
retrieving revision 1.385
diff -u -p -r1.385 ip_input.c
--- netinet/ip_input.c  18 May 2023 09:59:43 -  1.385
+++ netinet/ip_input.c  5 Sep 2023 11:48:25 -
@@ -1851,7 +1851,7 @@ ip_send_do_dispatch(void *xmq, int flags
if (ml_empty(&ml))
return;
 
-   NET_LOCK();
+   NET_LOCK_SHARED();
while ((m = ml_dequeue(&ml)) != NULL) {
u_int32_t ipsecflowinfo = 0;
 
@@ -1862,7 +1862,7 @@ ip_send_do_dispatch(void *xmq, int flags
}
ip_output(m, NULL, NULL, flags, NULL, NULL, ipsecflowinfo);
}
-   NET_UNLOCK();
+   NET_UNLOCK_SHARED();
 }
 
 void
Index: netinet6/ip6_input.c
===
RCS file: /data/mirror/openbsd/cvs/src/sys/netinet6/ip6_input.c,v
retrieving revision 1.254
diff -u -p -r1.254 ip6_input.c
--- netinet6/ip6_input.c21 Aug 2022 14:15:55 -  1.254
+++ netinet6/ip6_input.c5 Sep 2023 11:48:44 -
@@ -1572,11 +1572,11 @@ ip6_send_dispatch(void *xmq)
if (ml_empty(&ml))
return;
 
-   NET_LOCK();
+   NET_LOCK_SHARED();
while ((m = ml_dequeue(&ml)) != NULL) {
ip6_output(m, NULL, NULL, 0, NULL, NULL);
}
-   NET_UNLOCK();
+   NET_UNLOCK_SHARED();
 }
 
 void



Re: vmd/vmm: remove an ioctl from the vcpu hotpath, go brrr

2023-09-05 Thread Mischa



On 2023-09-05 14:27, Dave Voutila wrote:

Mike Larkin  writes:


On Mon, Sep 04, 2023 at 07:57:18PM +0200, Mischa wrote:

On 2023-09-04 18:58, Mischa wrote:
> On 2023-09-04 18:55, Mischa wrote:


/snip


> > Adding the sleep 2 does indeed help. I managed to get 20 VMs started
> > this way, before it would choke on 2-3.
> >
> > Do I only need the unpatched kernel or also the vmd/vmctl from snap?
>
> I do still get the same message on the console, but the machine isn't
> freezing up.
>
> [umd173152/210775 sp=7a5f577a1780 inside 702698535000-702698d34fff: not
> MAP_STACK

Starting 30 VMs this way caused the machine to become unresponsive 
again,

but nothing on the console. :(

Mischa


Were you seeing these uvm errors before this diff? If so, this isn't
causing the problem and something else is.


I don't believe we solved any of the underlying uvm issues in Bruges
last year. Mischa, can you test with just the latest snapshot/-current?


Yes, after Mike's email I already started getting an extra machine up 
and running.

Will finish that shortly and run the tests on the latest snap.


I'd imagine starting and stopping many vm's now is exacerbating the
issue because of the fork/exec for devices plus the ioctl to do a uvm
share into the device process address space.


I will adjust my scripts accordingly. I currently start as many VMs as 
there are cores in production. Will test if that is still possible.


Mischa

If this diff causes the errors to occur, and without the diff it's 
fine, then

we need to look into that.


Also I think a pid number in that printf might be useful, I'll see 
what I can
find. If it's not vmd causing this and rather some other process then 
that

would be good to know also.


Sadly it looks like that printf doesn't spit out the offending pid. :(




Re: vmd/vmm: remove an ioctl from the vcpu hotpath, go brrr

2023-09-05 Thread Dave Voutila


Mike Larkin  writes:

> On Mon, Sep 04, 2023 at 07:57:18PM +0200, Mischa wrote:
>> On 2023-09-04 18:58, Mischa wrote:
>> > On 2023-09-04 18:55, Mischa wrote:

/snip

>> > > Adding the sleep 2 does indeed help. I managed to get 20 VMs started
>> > > this way, before it would choke on 2-3.
>> > >
>> > > Do I only need the unpatched kernel or also the vmd/vmctl from snap?
>> >
>> > I do still get the same message on the console, but the machine isn't
>> > freezing up.
>> >
>> > [umd173152/210775 sp=7a5f577a1780 inside 702698535000-702698d34fff: not
>> > MAP_STACK
>>
>> Starting 30 VMs this way caused the machine to become unresponsive again,
>> but nothing on the console. :(
>>
>> Mischa
>
> Were you seeing these uvm errors before this diff? If so, this isn't
> causing the problem and something else is.

I don't believe we solved any of the underlying uvm issues in Bruges
last year. Mischa, can you test with just the latest snapshot/-current?

I'd imagine starting and stopping many vm's now is exacerbating the
issue because of the fork/exec for devices plus the ioctl to do a uvm
share into the device process address space.

>
> If this diff causes the errors to occur, and without the diff it's fine, then
> we need to look into that.
>
>
> Also I think a pid number in that printf might be useful, I'll see what I can
> find. If it's not vmd causing this and rather some other process then that
> would be good to know also.

Sadly it looks like that printf doesn't spit out the offending pid. :(

-dv



Re: unifdef HAS_INLINES in make(1)

2023-09-05 Thread Marc Espie
On Tue, Sep 05, 2023 at 11:51:34AM +1000, Jonathan Gray wrote:
> On Mon, Sep 04, 2023 at 11:51:33PM +0200, Marc Espie wrote:
> > On Tue, Sep 05, 2023 at 12:04:24AM +1000, Jonathan Gray wrote:
> > > inline is part of gnu89 and c99
> > > 
> > > Index: defines.h
> > > ===
> > > RCS file: /cvs/src/usr.bin/make/defines.h,v
> > > retrieving revision 1.15
> > > diff -u -p -r1.15 defines.h
> > > --- defines.h 14 Oct 2015 13:50:22 -  1.15
> > > +++ defines.h 4 Sep 2023 13:49:16 -
> > > @@ -61,16 +61,8 @@ typedef struct Suff_ Suff;
> > >  
> > >  #ifdef __GNUC__
> > >  # define UNUSED  __attribute__((__unused__))
> > > -# define HAS_INLINES
> > > -# define INLINE  __inline__
> > >  #else
> > >  # define UNUSED
> > > -#endif
> > > -
> > > -#ifdef HAS_INLINES
> > > -# ifndef INLINE
> > > -#  define INLINE inline
> > > -# endif
> > >  #endif
> > >  
> > >  /*
> > > Index: lst.h
> > > ===
> > > RCS file: /cvs/src/usr.bin/make/lst.h,v
> > > retrieving revision 1.33
> > > diff -u -p -r1.33 lst.h
> > > --- lst.h 4 Mar 2021 09:45:31 -   1.33
> > > +++ lst.h 4 Sep 2023 13:49:52 -
> > > @@ -159,25 +159,16 @@ extern void *   Lst_DeQueue(Lst);
> > >  #define Lst_Adv(ln)  ((ln)->nextPtr)
> > >  #define Lst_Rev(ln)  ((ln)->prevPtr)
> > >  
> > > -
> > > -/* Inlines are preferable to macros here because of the type checking. */
> > > -#ifdef HAS_INLINES
> > > -static INLINE LstNode
> > > +static inline LstNode
> > >  Lst_FindConst(Lst l, FindProcConst cProc, const void *d)
> > >  {
> > >   return Lst_FindFrom(Lst_First(l), (FindProc)cProc, (void *)d);
> > >  }
> > >  
> > > -static INLINE LstNode
> > > +static inline LstNode
> > >  Lst_FindFromConst(LstNode ln, FindProcConst cProc, const void *d)
> > >  {
> > >   return Lst_FindFrom(ln, (FindProc)cProc, (void *)d);
> > >  }
> > > -#else
> > > -#define Lst_FindConst(l, cProc, d) \
> > > - Lst_FindFrom(Lst_First(l), (FindProc)cProc, (void *)d)
> > > -#define Lst_FindFromConst(ln, cProc, d) \
> > > - Lst_FindFrom(ln, (FindProc)cProc, (void *)d)
> > > -#endif
> > >  
> > >  #endif /* _LST_H_ */
> > > Index: lst.lib/lst.h
> > > ===
> > > RCS file: /cvs/src/usr.bin/make/lst.lib/lst.h,v
> > > retrieving revision 1.2
> > > diff -u -p -r1.2 lst.h
> > > --- lst.lib/lst.h 14 Oct 2015 13:52:11 -  1.2
> > > +++ lst.lib/lst.h 4 Sep 2023 13:50:30 -
> > > @@ -154,25 +154,16 @@ extern void *   Lst_DeQueue(Lst);
> > >  #define Lst_Adv(ln)  ((ln)->nextPtr)
> > >  #define Lst_Rev(ln)  ((ln)->prevPtr)
> > >  
> > > -
> > > -/* Inlines are preferable to macros here because of the type checking. */
> > > -#ifdef HAS_INLINES
> > > -static INLINE LstNode
> > > +static inline LstNode
> > >  Lst_FindConst(Lst l, FindProcConst cProc, const void *d)
> > >  {
> > >   return Lst_FindFrom(Lst_First(l), (FindProc)cProc, (void *)d);
> > >  }
> > >  
> > > -static INLINE LstNode
> > > +static inline LstNode
> > >  Lst_FindFromConst(LstNode ln, FindProcConst cProc, const void *d)
> > >  {
> > >   return Lst_FindFrom(ln, (FindProc)cProc, (void *)d);
> > >  }
> > > -#else
> > > -#define Lst_FindConst(l, cProc, d) \
> > > - Lst_FindFrom(Lst_First(l), (FindProc)cProc, (void *)d)
> > > -#define Lst_FindFromConst(ln, cProc, d) \
> > > - Lst_FindFrom(ln, (FindProc)cProc, (void *)d)
> > > -#endif
> > >  
> > >  #endif /* _LST_H_ */
> > > 
> > > 
> > Please Cc: me on make stuff, I don't always see all mails.
> > 
> > Have you checked it works on gcc3 or gotten someone to check ?
> > If so, it's okay.
> 
> No, but it must work.  Used in many places, such as:
> sys/kern/subr_pool.c:static inline int
> usr.bin/sed/process.c:static inline int
> 
> 
Okay then