Re: mips64: trigger deferred timer interrupt from splx(9)

2022-08-18 Thread Visa Hankala
On Thu, Aug 18, 2022 at 02:33:55PM +, Miod Vallat wrote:
> > After about 92 hours, one machine showed cp0_raise_calls=622486 and
> > another 695892. cp0_raise_miss was zero on both of them. On two other
> > machines I had forgotten to allow ddb access from console and could
> > not check the values.
> 
> Put kern.allowkmem=1 in /etc/sysctl.conf, and then you can fetch the
> values with pstat -d.

Not without rebooting first. And I don't want to run the machines
with kern.allowkmem=1.



Re: mips64: trigger deferred timer interrupt from splx(9)

2022-08-18 Thread Miod Vallat
> After about 92 hours, one machine showed cp0_raise_calls=622486 and
> another 695892. cp0_raise_miss was zero on both of them. On two other
> machines I had forgotten to allow ddb access from console and could
> not check the values.

Put kern.allowkmem=1 in /etc/sysctl.conf, and then you can fetch the
values with pstat -d.



Re: mips64: trigger deferred timer interrupt from splx(9)

2022-08-18 Thread Visa Hankala
On Wed, Aug 17, 2022 at 11:42:50AM -0500, Scott Cheloha wrote:
> On Wed, Aug 17, 2022 at 01:30:29PM +, Visa Hankala wrote:
> > On Tue, Aug 09, 2022 at 09:54:02AM -0500, Scott Cheloha wrote:
> > > On Tue, Aug 09, 2022 at 02:03:31PM +, Visa Hankala wrote:
> > > > On Mon, Aug 08, 2022 at 02:52:37AM -0500, Scott Cheloha wrote:
> > > > > One thing I'm still uncertain about is how glxclk fits into the
> > > > > loongson picture.  It's an interrupt clock that runs hardclock() and
> > > > > statclock(), but the code doesn't do any logical masking, so I don't
> > > > > know whether or not I need to adjust anything in that code or account
> > > > > for it at all.  If there's no logical masking there's no deferral, so
> > > > > it would never need to call md_triggerclock() from splx(9).
> > > > 
> > > > I think the masking of glxclk interrupts is handled by the ISA
> > > > interrupt code.
> > > 
> > > Do those machines not have Coprocessor 0?  If they do, why would you
> > > prefer glxclk over CP0?
> > > 
> > > > The patch misses md_triggerclock definition in mips64_machdep.c.
> > > 
> > > Whoops, forgot that file.  Fuller patch below.
> > > 
> > > > I have put this to the test on the mips64 ports builder machines.
> > 
> > The machines completed a build with this patch without problems.
> > I tested with the debug counters removed from cp0_trigger_int5().
> > 
> > OK visa@
> 
> Thank you for testing!
> 
> There was a loongson portion to this patch.  Is this OK on loongson or
> just octeon?

OK for both.

> Also, what did the debug counters look like when you yanked them?  If
> cp0_raise_miss was non-zero I will double the initial offset to 32
> cycles.

After about 92 hours, one machine showed cp0_raise_calls=622486 and
another 695892. cp0_raise_miss was zero on both of them. On two other
machines I had forgotten to allow ddb access from console and could
not check the values.

The current offset is good enough.



Re: mips64: trigger deferred timer interrupt from splx(9)

2022-08-17 Thread Scott Cheloha
On Wed, Aug 17, 2022 at 01:30:29PM +, Visa Hankala wrote:
> On Tue, Aug 09, 2022 at 09:54:02AM -0500, Scott Cheloha wrote:
> > On Tue, Aug 09, 2022 at 02:03:31PM +, Visa Hankala wrote:
> > > On Mon, Aug 08, 2022 at 02:52:37AM -0500, Scott Cheloha wrote:
> > > > One thing I'm still uncertain about is how glxclk fits into the
> > > > loongson picture.  It's an interrupt clock that runs hardclock() and
> > > > statclock(), but the code doesn't do any logical masking, so I don't
> > > > know whether or not I need to adjust anything in that code or account
> > > > for it at all.  If there's no logical masking there's no deferral, so
> > > > > it would never need to call md_triggerclock() from splx(9).
> > > 
> > > I think the masking of glxclk interrupts is handled by the ISA
> > > interrupt code.
> > 
> > Do those machines not have Coprocessor 0?  If they do, why would you
> > prefer glxclk over CP0?
> > 
> > > The patch misses md_triggerclock definition in mips64_machdep.c.
> > 
> > Whoops, forgot that file.  Fuller patch below.
> > 
> > > I have put this to the test on the mips64 ports builder machines.
> 
> The machines completed a build with this patch without problems.
> I tested with the debug counters removed from cp0_trigger_int5().
> 
> OK visa@

Thank you for testing!

There was a loongson portion to this patch.  Is this OK on loongson or
just octeon?

Also, what did the debug counters look like when you yanked them?  If
cp0_raise_miss was non-zero I will double the initial offset to 32
cycles.



Re: mips64: trigger deferred timer interrupt from splx(9)

2022-08-17 Thread Visa Hankala
On Tue, Aug 09, 2022 at 09:54:02AM -0500, Scott Cheloha wrote:
> On Tue, Aug 09, 2022 at 02:03:31PM +, Visa Hankala wrote:
> > On Mon, Aug 08, 2022 at 02:52:37AM -0500, Scott Cheloha wrote:
> > > One thing I'm still uncertain about is how glxclk fits into the
> > > loongson picture.  It's an interrupt clock that runs hardclock() and
> > > statclock(), but the code doesn't do any logical masking, so I don't
> > > know whether or not I need to adjust anything in that code or account
> > > for it at all.  If there's no logical masking there's no deferral, so
> > > it would never need to call md_triggerclock() from splx(9).
> > 
> > I think the masking of glxclk interrupts is handled by the ISA
> > interrupt code.
> 
> Do those machines not have Coprocessor 0?  If they do, why would you
> prefer glxclk over CP0?
> 
> > The patch misses md_triggerclock definition in mips64_machdep.c.
> 
> Whoops, forgot that file.  Fuller patch below.
> 
> > I have put this to the test on the mips64 ports builder machines.

The machines completed a build with this patch without problems.
I tested with the debug counters removed from cp0_trigger_int5().

OK visa@

> Index: mips64/mips64/clock.c
> ===
> RCS file: /cvs/src/sys/arch/mips64/mips64/clock.c,v
> retrieving revision 1.45
> diff -u -p -r1.45 clock.c
> --- mips64/mips64/clock.c 6 Apr 2022 18:59:26 -   1.45
> +++ mips64/mips64/clock.c 9 Aug 2022 14:48:47 -
> @@ -60,6 +60,7 @@ const struct cfattach clock_ca = {
>  };
>  
>  void cp0_startclock(struct cpu_info *);
> +void cp0_trigger_int5(void);
>  uint32_t cp0_int5(uint32_t, struct trapframe *);
>  
>  int
> @@ -86,19 +87,20 @@ clockattach(struct device *parent, struc
>   cp0_set_compare(cp0_get_count() - 1);
>  
>   md_startclock = cp0_startclock;
> + md_triggerclock = cp0_trigger_int5;
>  }
>  
>  /*
>   *  Interrupt handler for targets using the internal count register
>   *  as interval clock. Normally the system is run with the clock
>   *  interrupt always enabled. Masking is done here and if the clock
> - *  can not be run the tick is just counted and handled later when
> - *  the clock is logically unmasked again.
> + *  cannot be run the tick is handled later when the clock is logically
> + *  unmasked again.
>   */
>  uint32_t
>  cp0_int5(uint32_t mask, struct trapframe *tf)
>  {
> - u_int32_t clkdiff;
> + u_int32_t clkdiff, pendingticks = 0;
>   struct cpu_info *ci = curcpu();
>  
>   /*
> @@ -113,15 +115,26 @@ cp0_int5(uint32_t mask, struct trapframe
>   }
>  
>   /*
> +  * If the clock interrupt is masked, defer any work until it
> +  * is unmasked from splx(9).
> +  */
> + if (tf->ipl >= IPL_CLOCK) {
> + ci->ci_clock_deferred = 1;
> + cp0_set_compare(cp0_get_count() - 1);
> + return CR_INT_5;
> + }
> + ci->ci_clock_deferred = 0;
> +
> + /*
>* Count how many ticks have passed since the last clock interrupt...
>*/
>   clkdiff = cp0_get_count() - ci->ci_cpu_counter_last;
>   while (clkdiff >= ci->ci_cpu_counter_interval) {
>   ci->ci_cpu_counter_last += ci->ci_cpu_counter_interval;
>   clkdiff = cp0_get_count() - ci->ci_cpu_counter_last;
> - ci->ci_pendingticks++;
> + pendingticks++;
>   }
> - ci->ci_pendingticks++;
> + pendingticks++;
>   ci->ci_cpu_counter_last += ci->ci_cpu_counter_interval;
>  
>   /*
> @@ -132,32 +145,64 @@ cp0_int5(uint32_t mask, struct trapframe
>   clkdiff = cp0_get_count() - ci->ci_cpu_counter_last;
>   if ((int)clkdiff >= 0) {
>   ci->ci_cpu_counter_last += ci->ci_cpu_counter_interval;
> - ci->ci_pendingticks++;
> + pendingticks++;
>   cp0_set_compare(ci->ci_cpu_counter_last);
>   }
>  
>   /*
> -  * Process clock interrupt unless it is currently masked.
> +  * Process clock interrupt.
>*/
> - if (tf->ipl < IPL_CLOCK) {
>  #ifdef MULTIPROCESSOR
> - register_t sr;
> + register_t sr;
>  
> - sr = getsr();
> - ENABLEIPI();
> + sr = getsr();
> + ENABLEIPI();
>  #endif
> - while (ci->ci_pendingticks) {
> - atomic_inc_long(
> - (unsigned long *)&cp0_clock_count.ec_count);
> - hardclock(tf);
> - ci->ci_pendingticks--;
> - }
> + while (pendingticks) {
> + atomic_inc_long((unsigned long *)&cp0_clock_count.ec_count);
> + hardclock(tf);
> + pendingticks--;
> + }
>  #ifdef MULTIPROCESSOR
> - setsr(sr);
> + setsr(sr);
>  #endif
> - }
>  
>   return CR_INT_5;/* Clock is always on 5 */
> +}
> +
> +unsigned long cp0_raise_calls, cp0_raise_miss;
> +
> +/*
> + * Trigger the clock interrupt.
> + * 
> + * We need to spin 

Re: mips64: trigger deferred timer interrupt from splx(9)

2022-08-09 Thread Scott Cheloha
On Tue, Aug 09, 2022 at 06:02:10PM +, Miod Vallat wrote:
> > Other platforms (architectures?) (powerpc, powerpc64, arm64, riscv64)
> > multiplex their singular interrupt clock to schedule both a
> > fixed-period hardclock and a pseudorandom statclock.
> > 
> > This is the direction I intend to take every platform, mips64
> > included, after the next release.
> > 
> > In that context, would there be any reason to prefer glxclk to
> > CP0.count?
> 
> No. The cop0 timer is supposed to be the most reliable timer available.
> (although one may argue that, on sgi, the xbow timer on some systems is
> even better quality)

Alright, got it.  If glxclk provides no other utility aside from an
interrupt clock on loongson, then you and I can coordinate unhooking
it when we switch loongson to the new clockintr code in the Fall.

If I'm missing something and it does other work, then nevermind.

Does the latest patch work on any loongson machines you have?

I didn't see any other splx(9) implementations aside from bonito and
the one for loongson3.

Index: mips64/mips64/clock.c
===
RCS file: /cvs/src/sys/arch/mips64/mips64/clock.c,v
retrieving revision 1.45
diff -u -p -r1.45 clock.c
--- mips64/mips64/clock.c   6 Apr 2022 18:59:26 -   1.45
+++ mips64/mips64/clock.c   9 Aug 2022 14:48:47 -
@@ -60,6 +60,7 @@ const struct cfattach clock_ca = {
 };
 
 void   cp0_startclock(struct cpu_info *);
+void   cp0_trigger_int5(void);
 uint32_t cp0_int5(uint32_t, struct trapframe *);
 
 int
@@ -86,19 +87,20 @@ clockattach(struct device *parent, struc
cp0_set_compare(cp0_get_count() - 1);
 
md_startclock = cp0_startclock;
+   md_triggerclock = cp0_trigger_int5;
 }
 
 /*
  *  Interrupt handler for targets using the internal count register
  *  as interval clock. Normally the system is run with the clock
  *  interrupt always enabled. Masking is done here and if the clock
- *  can not be run the tick is just counted and handled later when
- *  the clock is logically unmasked again.
+ *  cannot be run the tick is handled later when the clock is logically
+ *  unmasked again.
  */
 uint32_t
 cp0_int5(uint32_t mask, struct trapframe *tf)
 {
-   u_int32_t clkdiff;
+   u_int32_t clkdiff, pendingticks = 0;
struct cpu_info *ci = curcpu();
 
/*
@@ -113,15 +115,26 @@ cp0_int5(uint32_t mask, struct trapframe
}
 
/*
+* If the clock interrupt is masked, defer any work until it
+* is unmasked from splx(9).
+*/
+   if (tf->ipl >= IPL_CLOCK) {
+   ci->ci_clock_deferred = 1;
+   cp0_set_compare(cp0_get_count() - 1);
+   return CR_INT_5;
+   }
+   ci->ci_clock_deferred = 0;
+
+   /*
 * Count how many ticks have passed since the last clock interrupt...
 */
clkdiff = cp0_get_count() - ci->ci_cpu_counter_last;
while (clkdiff >= ci->ci_cpu_counter_interval) {
ci->ci_cpu_counter_last += ci->ci_cpu_counter_interval;
clkdiff = cp0_get_count() - ci->ci_cpu_counter_last;
-   ci->ci_pendingticks++;
+   pendingticks++;
}
-   ci->ci_pendingticks++;
+   pendingticks++;
ci->ci_cpu_counter_last += ci->ci_cpu_counter_interval;
 
/*
@@ -132,32 +145,64 @@ cp0_int5(uint32_t mask, struct trapframe
clkdiff = cp0_get_count() - ci->ci_cpu_counter_last;
if ((int)clkdiff >= 0) {
ci->ci_cpu_counter_last += ci->ci_cpu_counter_interval;
-   ci->ci_pendingticks++;
+   pendingticks++;
cp0_set_compare(ci->ci_cpu_counter_last);
}
 
/*
-* Process clock interrupt unless it is currently masked.
+* Process clock interrupt.
 */
-   if (tf->ipl < IPL_CLOCK) {
 #ifdef MULTIPROCESSOR
-   register_t sr;
+   register_t sr;
 
-   sr = getsr();
-   ENABLEIPI();
+   sr = getsr();
+   ENABLEIPI();
 #endif
-   while (ci->ci_pendingticks) {
-   atomic_inc_long(
-   (unsigned long *)&cp0_clock_count.ec_count);
-   hardclock(tf);
-   ci->ci_pendingticks--;
-   }
+   while (pendingticks) {
+   atomic_inc_long((unsigned long *)&cp0_clock_count.ec_count);
+   hardclock(tf);
+   pendingticks--;
+   }
 #ifdef MULTIPROCESSOR
-   setsr(sr);
+   setsr(sr);
 #endif
-   }
 
return CR_INT_5;/* Clock is always on 5 */
+}
+
+unsigned long cp0_raise_calls, cp0_raise_miss;
+
+/*
+ * Trigger the clock interrupt.
+ * 
+ * We need to spin until either (a) INT5 is pending or (b) the compare
+ * register leads the count register, i.e. we know INT5 will be pending
+ * very soon.
+ *
+ * To ensure we don't spin forever, double the compensatory offset
+ 

Re: mips64: trigger deferred timer interrupt from splx(9)

2022-08-09 Thread Miod Vallat
> Other platforms (architectures?) (powerpc, powerpc64, arm64, riscv64)
> multiplex their singular interrupt clock to schedule both a
> fixed-period hardclock and a pseudorandom statclock.
> 
> This is the direction I intend to take every platform, mips64
> included, after the next release.
> 
> In that context, would there be any reason to prefer glxclk to
> CP0.count?

No. The cop0 timer is supposed to be the most reliable timer available.
(although one may argue that, on sgi, the xbow timer on some systems is
even better quality)



Re: mips64: trigger deferred timer interrupt from splx(9)

2022-08-09 Thread Scott Cheloha
On Tue, Aug 09, 2022 at 02:56:54PM +, Miod Vallat wrote:
> > Do those machines not have Coprocessor 0?  If they do, why would you
> > prefer glxclk over CP0?
> 
> cop0 only provides one timer, from which both the scheduling clock and
> statclk are derived. glxclk allows two timers to be used, and thus can
> provide a more reliable statclk (see the Torek paper, etc - it is even
> mentioned in the glxclk manual page).

Other platforms (architectures?) (powerpc, powerpc64, arm64, riscv64)
multiplex their singular interrupt clock to schedule both a
fixed-period hardclock and a pseudorandom statclock.

This is the direction I intend to take every platform, mips64
included, after the next release.

In that context, would there be any reason to prefer glxclk to
CP0.count?



Re: mips64: trigger deferred timer interrupt from splx(9)

2022-08-09 Thread Miod Vallat
> Do those machines not have Coprocessor 0?  If they do, why would you
> prefer glxclk over CP0?

cop0 only provides one timer, from which both the scheduling clock and
statclk are derived. glxclk allows two timers to be used, and thus can
provide a more reliable statclk (see the Torek paper, etc - it is even
mentioned in the glxclk manual page).



Re: mips64: trigger deferred timer interrupt from splx(9)

2022-08-09 Thread Scott Cheloha
On Tue, Aug 09, 2022 at 02:03:31PM +, Visa Hankala wrote:
> On Mon, Aug 08, 2022 at 02:52:37AM -0500, Scott Cheloha wrote:
> > One thing I'm still uncertain about is how glxclk fits into the
> > loongson picture.  It's an interrupt clock that runs hardclock() and
> > statclock(), but the code doesn't do any logical masking, so I don't
> > know whether or not I need to adjust anything in that code or account
> > for it at all.  If there's no logical masking there's no deferral, so
> > it would never need to call md_triggerclock() from splx(9).
> 
> I think the masking of glxclk interrupts is handled by the ISA
> interrupt code.

Do those machines not have Coprocessor 0?  If they do, why would you
prefer glxclk over CP0?

> The patch misses md_triggerclock definition in mips64_machdep.c.

Whoops, forgot that file.  Fuller patch below.

> I have put this to the test on the mips64 ports builder machines.

Cool, thank you for testing.

Index: mips64/mips64/clock.c
===
RCS file: /cvs/src/sys/arch/mips64/mips64/clock.c,v
retrieving revision 1.45
diff -u -p -r1.45 clock.c
--- mips64/mips64/clock.c   6 Apr 2022 18:59:26 -   1.45
+++ mips64/mips64/clock.c   9 Aug 2022 14:48:47 -
@@ -60,6 +60,7 @@ const struct cfattach clock_ca = {
 };
 
 void   cp0_startclock(struct cpu_info *);
+void   cp0_trigger_int5(void);
 uint32_t cp0_int5(uint32_t, struct trapframe *);
 
 int
@@ -86,19 +87,20 @@ clockattach(struct device *parent, struc
cp0_set_compare(cp0_get_count() - 1);
 
md_startclock = cp0_startclock;
+   md_triggerclock = cp0_trigger_int5;
 }
 
 /*
  *  Interrupt handler for targets using the internal count register
  *  as interval clock. Normally the system is run with the clock
  *  interrupt always enabled. Masking is done here and if the clock
- *  can not be run the tick is just counted and handled later when
- *  the clock is logically unmasked again.
+ *  cannot be run the tick is handled later when the clock is logically
+ *  unmasked again.
  */
 uint32_t
 cp0_int5(uint32_t mask, struct trapframe *tf)
 {
-   u_int32_t clkdiff;
+   u_int32_t clkdiff, pendingticks = 0;
struct cpu_info *ci = curcpu();
 
/*
@@ -113,15 +115,26 @@ cp0_int5(uint32_t mask, struct trapframe
}
 
/*
+* If the clock interrupt is masked, defer any work until it
+* is unmasked from splx(9).
+*/
+   if (tf->ipl >= IPL_CLOCK) {
+   ci->ci_clock_deferred = 1;
+   cp0_set_compare(cp0_get_count() - 1);
+   return CR_INT_5;
+   }
+   ci->ci_clock_deferred = 0;
+
+   /*
 * Count how many ticks have passed since the last clock interrupt...
 */
clkdiff = cp0_get_count() - ci->ci_cpu_counter_last;
while (clkdiff >= ci->ci_cpu_counter_interval) {
ci->ci_cpu_counter_last += ci->ci_cpu_counter_interval;
clkdiff = cp0_get_count() - ci->ci_cpu_counter_last;
-   ci->ci_pendingticks++;
+   pendingticks++;
}
-   ci->ci_pendingticks++;
+   pendingticks++;
ci->ci_cpu_counter_last += ci->ci_cpu_counter_interval;
 
/*
@@ -132,32 +145,64 @@ cp0_int5(uint32_t mask, struct trapframe
clkdiff = cp0_get_count() - ci->ci_cpu_counter_last;
if ((int)clkdiff >= 0) {
ci->ci_cpu_counter_last += ci->ci_cpu_counter_interval;
-   ci->ci_pendingticks++;
+   pendingticks++;
cp0_set_compare(ci->ci_cpu_counter_last);
}
 
/*
-* Process clock interrupt unless it is currently masked.
+* Process clock interrupt.
 */
-   if (tf->ipl < IPL_CLOCK) {
 #ifdef MULTIPROCESSOR
-   register_t sr;
+   register_t sr;
 
-   sr = getsr();
-   ENABLEIPI();
+   sr = getsr();
+   ENABLEIPI();
 #endif
-   while (ci->ci_pendingticks) {
-   atomic_inc_long(
-   (unsigned long *)&cp0_clock_count.ec_count);
-   hardclock(tf);
-   ci->ci_pendingticks--;
-   }
+   while (pendingticks) {
+   atomic_inc_long((unsigned long *)&cp0_clock_count.ec_count);
+   hardclock(tf);
+   pendingticks--;
+   }
 #ifdef MULTIPROCESSOR
-   setsr(sr);
+   setsr(sr);
 #endif
-   }
 
return CR_INT_5;/* Clock is always on 5 */
+}
+
+unsigned long cp0_raise_calls, cp0_raise_miss;
+
+/*
+ * Trigger the clock interrupt.
+ * 
+ * We need to spin until either (a) INT5 is pending or (b) the compare
+ * register leads the count register, i.e. we know INT5 will be pending
+ * very soon.
+ *
+ * To ensure we don't spin forever, double the compensatory offset
+ * added to the compare value every time we miss the count register.
+ */
+void

Re: mips64: trigger deferred timer interrupt from splx(9)

2022-08-09 Thread Visa Hankala
On Mon, Aug 08, 2022 at 02:52:37AM -0500, Scott Cheloha wrote:
> One thing I'm still uncertain about is how glxclk fits into the
> loongson picture.  It's an interrupt clock that runs hardclock() and
> statclock(), but the code doesn't do any logical masking, so I don't
> know whether or not I need to adjust anything in that code or account
> for it at all.  If there's no logical masking there's no deferral, so
> it would never need to call md_triggerclock() from splx(9).

I think the masking of glxclk interrupts is handled by the ISA
interrupt code.

The patch misses md_triggerclock definition in mips64_machdep.c.

I have put this to the test on the mips64 ports builder machines.



Re: mips64: trigger deferred timer interrupt from splx(9)

2022-08-08 Thread Scott Cheloha
On Sun, Aug 07, 2022 at 11:05:37AM +, Visa Hankala wrote:
> On Sun, Jul 31, 2022 at 01:28:18PM -0500, Scott Cheloha wrote:
> > Apparently mips64, i.e. octeon and loongson, has the same problem as
> > powerpc/macppc and powerpc64.  The timer interrupt is normally only
> > logically masked, not physically masked in the hardware, when we're
> > running at or above IPL_CLOCK.  If we arrive at cp0_int5() when the
> > clock interrupt is logically masked we postpone all work until the
> > next tick.  This is a problem for my WIP clock interrupt work.
> 
> I think the use of logical masking has been a design choice, not
> something dictated by the hardware. Physical masking should be possible,
> but some extra care would be needed to implement it, as the mips64
> interrupt code is a bit clunky.

That would be cleaner, but from the sound of it, it's easier to start
with this.

> > So, this patch is basically the same as what I did for macppc and what
> > I have proposed for powerpc64.
> > 
> > - Add a new member, ci_timer_deferred, to mips64's cpu_info struct.
> > 
> >   While here, remove ci_pendingticks.  We don't need it anymore.
> > 
> > - If we get to cp0_int5() and our IPL is too high, set
> >   cpu_info.ci_timer_deferred and return.
> > 
> > - If we get to cp0_int5() and our IPL is low enough, clear
> >   cpu_info.ci_timer_deferred and do clock interrupt work.
> > 
> > - In splx(9), if the new IPL is low enough and cpu_info.ci_timer_deferred
> >   is set, trigger the clock interrupt.
> > 
> > The only big difference is that mips64 uses an equality comparison
> > when deciding whether to arm the timer interrupt, so it's really easy
> > to "miss" CP0.count when you're setting CP0.compare.
> > 
> > To address this I've written a function, cp0_raise_int5(), that spins
> > until it is sure the timer interrupt will go off.  The function needed
> > a bit of new code for reading CP0.cause, which I've added to
> > cp0access.S.  I am using an initial offset of 16 cycles based on
> > experimentation with the machine I have access to, a 500Mhz CN50xx.
> > Values lower than 16 require more than one loop to arm the timer.  If
> > that value is insufficient for other machines we can try passing the
> > initial offset as an argument to the function.
> 
> It should not be necessary to make the initial offset variable. The
> offset is primarily a function of the length and content of the
> instruction sequence. Some unpredictability comes from cache misses
> and maybe branch prediction failures.

Gotcha.  So it mostly depends on the number of instructions between
loading CP0.count and storing CP0.compare.

> > I wasn't sure where to put the prototype for cp0_raise_int5() so I
> > stuck it in mips64/cpu.h.  If there's a better place for it, just say
> > so.
> 
> Currently, mips64 clock.c is formulated as a proper driver. I think
> callers should not invoke its functions directly but use a hook instead.
> The MI mips64 code starts the clock through the md_startclock function
> pointer. Maybe there could be md_triggerclock.
> 
> To reduce risk of confusion, I would rename cp0_raise_int5 to
> cp0_trigger_int5, as `raise' overlaps with the spl API. Also,
> ci_clock_deferred instead of ci_timer_deferred would look more
> consistent with the surrounding code.

Okay, I took all these suggestions and incorporated them.  Updated
patch attached.

One thing I'm still uncertain about is how glxclk fits into the
loongson picture.  It's an interrupt clock that runs hardclock() and
statclock(), but the code doesn't do any logical masking, so I don't
know whether or not I need to adjust anything in that code or account
for it at all.  If there's no logical masking there's no deferral, so
it would never need to call md_triggerclock() from splx(9).

Also:

My EdgeRouter PoE just finished a serial `make build`.  Took almost 12
days.  Which is a good sign!  Lots of opportunity for the patch to
fail and the clock to die.

In that time, under what I assume is relatively heavy load, the clock
interrupt deferral counters look like this:

cp0_raise_calls at 0x81701308: 133049
cp0_raise_miss at 0x81701300: 0

So 16 cycles as the initial offset works great.  We never ran the loop
more than once, i.e. we never "missed" CP0.count.

The machine has been up a little more than a million seconds.  So, at
100hz, with no separate statclock, and 2 CPUs, we'd expect ~200 clock
interrupts a second, or 200 million in total.

In ~200,000,000 cp0_int5() calls, we deferred ~133,000 of them, or
~0.0665%.

Index: mips64/mips64/clock.c
===
RCS file: /cvs/src/sys/arch/mips64/mips64/clock.c,v
retrieving revision 1.45
diff -u -p -r1.45 clock.c
--- mips64/mips64/clock.c   6 Apr 2022 18:59:26 -   1.45
+++ mips64/mips64/clock.c   8 Aug 2022 07:41:44 -
@@ -60,6 +60,7 @@ const struct cfattach clock_ca = {
 };
 
 void   cp0_startclock(struct cpu_info *);
+void   

Re: mips64: trigger deferred timer interrupt from splx(9)

2022-08-07 Thread Visa Hankala
On Sun, Jul 31, 2022 at 01:28:18PM -0500, Scott Cheloha wrote:
> Apparently mips64, i.e. octeon and loongson, has the same problem as
> powerpc/macppc and powerpc64.  The timer interrupt is normally only
> logically masked, not physically masked in the hardware, when we're
> running at or above IPL_CLOCK.  If we arrive at cp0_int5() when the
> clock interrupt is logically masked we postpone all work until the
> next tick.  This is a problem for my WIP clock interrupt work.

I think the use of logical masking has been a design choice, not
something dictated by the hardware. Physical masking should be possible,
but some extra care would be needed to implement it, as the mips64
interrupt code is a bit clunky.

> So, this patch is basically the same as what I did for macppc and what
> I have proposed for powerpc64.
> 
> - Add a new member, ci_timer_deferred, to mips64's cpu_info struct.
> 
>   While here, remove ci_pendingticks.  We don't need it anymore.
> 
> - If we get to cp0_int5() and our IPL is too high, set
>   cpu_info.ci_timer_deferred and return.
> 
> - If we get to cp0_int5() and our IPL is low enough, clear
>   cpu_info.ci_timer_deferred and do clock interrupt work.
> 
> - In splx(9), if the new IPL is low enough and cpu_info.ci_timer_deferred
>   is set, trigger the clock interrupt.
> 
> The only big difference is that mips64 uses an equality comparison
> when deciding whether to arm the timer interrupt, so it's really easy
> to "miss" CP0.count when you're setting CP0.compare.
> 
> To address this I've written a function, cp0_raise_int5(), that spins
> until it is sure the timer interrupt will go off.  The function needed
> a bit of new code for reading CP0.cause, which I've added to
> cp0access.S.  I am using an initial offset of 16 cycles based on
> experimentation with the machine I have access to, a 500Mhz CN50xx.
> Values lower than 16 require more than one loop to arm the timer.  If
> that value is insufficient for other machines we can try passing the
> initial offset as an argument to the function.

It should not be necessary to make the initial offset variable. The
offset is primarily a function of the length and content of the
instruction sequence. Some unpredictability comes from cache misses
and maybe branch prediction failures.

> I wasn't sure where to put the prototype for cp0_raise_int5() so I
> stuck it in mips64/cpu.h.  If there's a better place for it, just say
> so.

Currently, mips64 clock.c is formulated as a proper driver. I think
callers should not invoke its functions directly but use a hook instead.
The MI mips64 code starts the clock through the md_startclock function
pointer. Maybe there could be md_triggerclock.

To reduce risk of confusion, I would rename cp0_raise_int5 to
cp0_trigger_int5, as `raise' overlaps with the spl API. Also,
ci_clock_deferred instead of ci_timer_deferred would look more
consistent with the surrounding code.



mips64: trigger deferred timer interrupt from splx(9)

2022-07-31 Thread Scott Cheloha
Hi,

Apparently mips64, i.e. octeon and loongson, has the same problem as
powerpc/macppc and powerpc64.  The timer interrupt is normally only
logically masked, not physically masked in the hardware, when we're
running at or above IPL_CLOCK.  If we arrive at cp0_int5() when the
clock interrupt is logically masked we postpone all work until the
next tick.  This is a problem for my WIP clock interrupt work.

So, this patch is basically the same as what I did for macppc and what
I have proposed for powerpc64.

- Add a new member, ci_timer_deferred, to mips64's cpu_info struct.

  While here, remove ci_pendingticks.  We don't need it anymore.

- If we get to cp0_int5() and our IPL is too high, set
  cpu_info.ci_timer_deferred and return.

- If we get to cp0_int5() and our IPL is low enough, clear
  cpu_info.ci_timer_deferred and do clock interrupt work.

- In splx(9), if the new IPL is low enough and cpu_info.ci_timer_deferred
  is set, trigger the clock interrupt.

The only big difference is that mips64 uses an equality comparison
when deciding whether to arm the timer interrupt, so it's really easy
to "miss" CP0.count when you're setting CP0.compare.

To address this I've written a function, cp0_raise_int5(), that spins
until it is sure the timer interrupt will go off.  The function needed
a bit of new code for reading CP0.cause, which I've added to
cp0access.S.  I am using an initial offset of 16 cycles based on
experimentation with the machine I have access to, a 500Mhz CN50xx.
Values lower than 16 require more than one loop to arm the timer.  If
that value is insufficient for other machines we can try passing the
initial offset as an argument to the function.

I wasn't sure where to put the prototype for cp0_raise_int5() so I
stuck it in mips64/cpu.h.  If there's a better place for it, just say
so.

I also left some atomic counters for you to poke at with pstat(8) if
you want to see what the machine is doing in cp0_raise_int5(), i.e.
how often we defer clock interrupt work and how many loops you take to
arm the timer interrupt.  Those will be removed before commit.

I'm running a `make build` on my EdgeRouter PoE.  It only has 512MB of
RAM, so I can't do a parallel build without hanging the machine when
attempting to compile LLVM.  The build has been running for four days
and the machine has not yet hung, so I think this patch is correct-ish.
I will holler if it hangs.

visa: Assuming this code looks right, could you test this on a
  beefier octeon machine?  Preferably a parallel build?

miod: I'm unclear whether loongson uses cp0_int5().  Am I missing
  code here, or are my changes in arch/loongson sufficient?
  If it's sufficient, could you test this?

  I have no loongson hardware, so this is uncompiled there.
  Sorry in advance if it does not compile.

Thoughts?

Index: mips64/mips64/clock.c
===
RCS file: /cvs/src/sys/arch/mips64/mips64/clock.c,v
retrieving revision 1.45
diff -u -p -r1.45 clock.c
--- mips64/mips64/clock.c   6 Apr 2022 18:59:26 -   1.45
+++ mips64/mips64/clock.c   31 Jul 2022 18:18:05 -
@@ -92,13 +92,13 @@ clockattach(struct device *parent, struc
  *  Interrupt handler for targets using the internal count register
  *  as interval clock. Normally the system is run with the clock
  *  interrupt always enabled. Masking is done here and if the clock
- *  can not be run the tick is just counted and handled later when
- *  the clock is logically unmasked again.
+ *  can not be run the tick is handled later when the clock is logically
+ *  unmasked again.
  */
 uint32_t
 cp0_int5(uint32_t mask, struct trapframe *tf)
 {
-   u_int32_t clkdiff;
+   u_int32_t clkdiff, pendingticks = 0;
struct cpu_info *ci = curcpu();
 
/*
@@ -113,15 +113,26 @@ cp0_int5(uint32_t mask, struct trapframe
}
 
/*
+* If the clock interrupt is masked we can't do any work until
+* it is unmasked.
+*/
+   if (tf->ipl >= IPL_CLOCK) {
+   ci->ci_timer_deferred = 1;
+   cp0_set_compare(cp0_get_count() - 1);
+   return CR_INT_5;
+   }
+   ci->ci_timer_deferred = 0;
+
+   /*
 * Count how many ticks have passed since the last clock interrupt...
 */
clkdiff = cp0_get_count() - ci->ci_cpu_counter_last;
while (clkdiff >= ci->ci_cpu_counter_interval) {
ci->ci_cpu_counter_last += ci->ci_cpu_counter_interval;
clkdiff = cp0_get_count() - ci->ci_cpu_counter_last;
-   ci->ci_pendingticks++;
+   pendingticks++;
}
-   ci->ci_pendingticks++;
+   pendingticks++;
ci->ci_cpu_counter_last += ci->ci_cpu_counter_interval;
 
/*
@@ -132,32 +143,64 @@ cp0_int5(uint32_t mask, struct trapframe
clkdiff = cp0_get_count() - ci->ci_cpu_counter_last;
if ((int)clkdiff >= 0) {