On Mon, Jun 19, 2023 at 04:45:03PM -0500, Scott Cheloha wrote:
> On Mon, Jun 19, 2023 at 10:22:56AM +0200, Claudio Jeker wrote:
> > On Sun, Jun 18, 2023 at 12:43:18PM -0500, Scott Cheloha wrote:
> > > On Sun, Jun 18, 2023 at 12:36:07PM -0500, Scott Cheloha wrote:
> > > > On Sun, Jun 18, 2023 at 07:32:56PM +0200, Mark Kettenis wrote:
> > > > > > Date: Sun, 18 Jun 2023 12:27:17 -0500
> > > > > > From: Scott Cheloha <[email protected]>
> > > > > >
> > > > > > The intent here is to update the load averages every five seconds.
> > > > > > However:
> > > > > >
> > > > > > 1. Measuring elapsed time with the UTC clock is unreliable
> > > > > >    because of settimeofday(2).
> > > > > >
> > > > > > 2. "Call uvm_loadav() no more than once every five seconds" is
> > > > > >    not equivalent to "call uvm_loadav() if the current second is
> > > > > >    equal to zero, modulo five".
> > > > > >
> > > > > > Not hard to imagine edge cases where timeouts are delayed and
> > > > > > the load averages are not updated.
> > > > > >
> > > > > > So, (1) use the monotonic clock, and (2) keep the next uvm_loadav()
> > > > > > call time in a static value.
> > > > > >
> > > > > > ok?
> > > > >
> > > > > I really don't see why the calculation of something vague like the
> > > > > load average warrants complicating the code like this.
> > > >
> > > > Aren't load averages used to make decisions about thread placement in
> > > > the scheduler?
> > > >
> > > > Regardless, the code is still wrong. At minimum you should use
> > > > getuptime(9).
> > >
> > > Maybe I misunderstood. Are you suggesting this?
> > >
> > >
> > > 	now = getuptime();
> > > 	if (now >= next_uvm_loadav) {
> > > 		next_uvm_loadav = now + 5;
> > > 		uvm_loadav(...);
> > > 	}
> > >
> > > The patch I posted preserves the current behavior. It is equivalent
> > > to:
> > >
> > > 	while (next_uvm_loadav <= now)
> > > 		next_uvm_loadav += 5;
> > >
> > > Changing it to (now + 5) changes the behavior.
> >
> > To be honest, I think uvm_meter() should be called via timeout(9) and
> > not via the current path through schedcpu().
> > At some point schedcpu() may be removed and then we need to fix this
> > properly anyway.
>
> See attached.
>
> This changes the wakeup interval for proc0 (the swapper) on kernels
> where maxslp is patched (highly unusual).
>
> On default kernels, maxslp is 20, which is evenly divisible by 5, so
> the patch does not change the proc0 wakeup interval.
>
> Another approach would be to run the uvm_meter() timeout every second
> and track the uvm_loadav() deadline in a static variable.
I think the proper fix is to just remove this wakeup, since proc0 does
nothing anymore.
This wakeup triggers this other bit of code:
	/*
	 * proc0: nothing to do, back to sleep
	 */
	while (1)
		tsleep_nsec(&proc0, PVM, "scheduler", INFSLP);
	/* NOTREACHED */
at the end of main(). So we wake up a while (1) loop.
I need to look more into this. I wonder if there is some silly side effect
that this code tickles. All of this just smells like old cruft.
> > Running the lbolt wakeup inside schedcpu() has the same issue.
>
> Eliminating lbolt has been a goal of mine for several years. I tried
> to remove as many users as possible a few years ago. The tricky cases
> are in the sys/kern tty code.
> Index: kern/sched_bsd.c
> ===================================================================
> RCS file: /cvs/src/sys/kern/sched_bsd.c,v
> retrieving revision 1.74
> diff -u -p -r1.74 sched_bsd.c
> --- kern/sched_bsd.c 4 Feb 2023 19:33:03 -0000 1.74
> +++ kern/sched_bsd.c 19 Jun 2023 21:35:22 -0000
> @@ -234,7 +234,6 @@ schedcpu(void *arg)
> }
> SCHED_UNLOCK(s);
> }
> - uvm_meter();
> wakeup(&lbolt);
> timeout_add_sec(to, 1);
> }
> @@ -669,6 +668,7 @@ scheduler_start(void)
>
> rrticks_init = hz / 10;
> schedcpu(&schedcpu_to);
> + uvm_meter_start();
>
> #ifndef SMALL_KERNEL
> if (perfpolicy == PERFPOL_AUTO)
> Index: uvm/uvm_meter.c
> ===================================================================
> RCS file: /cvs/src/sys/uvm/uvm_meter.c,v
> retrieving revision 1.42
> diff -u -p -r1.42 uvm_meter.c
> --- uvm/uvm_meter.c 28 Dec 2020 14:01:23 -0000 1.42
> +++ uvm/uvm_meter.c 19 Jun 2023 21:35:22 -0000
> @@ -42,6 +42,7 @@
> #include <sys/percpu.h>
> #include <sys/proc.h>
> #include <sys/sysctl.h>
> +#include <sys/timeout.h>
> #include <sys/vmmeter.h>
> #include <uvm/uvm.h>
> #include <uvm/uvm_ddb.h>
> @@ -65,6 +66,9 @@
> int maxslp = MAXSLP; /* patchable ... */
> struct loadavg averunnable;
>
> +/* Update load averages every five seconds. */
> +#define UVM_METER_INTVL 5
> +
> /*
> * constants for averages over 1, 5, and 15 minutes when sampling at
> * 5 second intervals.
> @@ -78,17 +82,29 @@ static fixpt_t cexp[3] = {
>
>
> static void uvm_loadav(struct loadavg *);
> +void uvm_meter(void *);
> void uvm_total(struct vmtotal *);
> void uvmexp_read(struct uvmexp *);
>
> +void
> +uvm_meter_start(void)
> +{
> + static struct timeout to = TIMEOUT_INITIALIZER(uvm_meter, &to);
> +
> + uvm_meter(&to);
> +}
> +
> /*
> * uvm_meter: calculate load average and wake up the swapper (if needed)
> */
> void
> -uvm_meter(void)
> +uvm_meter(void *arg)
> {
> - if ((gettime() % 5) == 0)
> - uvm_loadav(&averunnable);
> + struct timeout *to = arg;
> +
> + timeout_add_sec(to, UVM_METER_INTVL);
> +
> + uvm_loadav(&averunnable);
> if (proc0.p_slptime > (maxslp / 2))
> wakeup(&proc0);
> }
> Index: uvm/uvm_extern.h
> ===================================================================
> RCS file: /cvs/src/sys/uvm/uvm_extern.h,v
> retrieving revision 1.168
> diff -u -p -r1.168 uvm_extern.h
> --- uvm/uvm_extern.h 30 May 2023 08:30:01 -0000 1.168
> +++ uvm/uvm_extern.h 19 Jun 2023 21:35:22 -0000
> @@ -414,7 +414,7 @@ void uvmspace_free(struct vmspace *);
> struct vmspace *uvmspace_share(struct process *);
> int uvm_share(vm_map_t, vaddr_t, vm_prot_t,
> vm_map_t, vaddr_t, vsize_t);
> -void uvm_meter(void);
> +void uvm_meter_start(void);
> int uvm_sysctl(int *, u_int, void *, size_t *,
> void *, size_t, struct proc *);
> struct vm_page *uvm_pagealloc(struct uvm_object *,
--
:wq Claudio