Re: Consider switching to WQ_UNBOUND messages (was: Re: [PATCH v2 6/7] workqueue: Report work funcs that trigger automatic CPU_INTENSIVE mechanism)

2023-07-25 Thread Geert Uytterhoeven
Hi Tejun,

On Wed, Jul 19, 2023 at 12:01 AM Tejun Heo  wrote:
> On Tue, Jul 18, 2023 at 11:54:58AM +0200, Geert Uytterhoeven wrote:
> > I gave it a try on a system with an 800 MHz Cortex A9, only to discover
> > it makes no difference, as that machine has 1600 BogoMIPS:
>
> Oops.
>
> > workqueue: blk_mq_run_work_fn hogged CPU for >1us 4 times,
> > consider switching to WQ_UNBOUND
>
> It could be that we actually want to switch to UNBOUND for some reports but
> the above triggering most likely indicates that the threshold is too
> aggressive.
>
> > Artificially low BogoMIPS numbers only happen on systems that have
> > the related timers (Cortex A7/A15 and later, Cortex A9 MPCore,
> > and arm64).
>
> Ah, I see. Thanks for the explanation.
>
> > I will test on more systems, but that will probably not happen until
> > next week...
>
> Thanks, really appreciate it. Can you try the following instead when you
> have time? I just pushed up the lower boundary to 4000 MIPS. The scaling is
> still capped at 1s.

Thanks, with the below, I see no more WQ_UNBOUND messages.

> From 8555cbd4b22e5f85eb2bdcb84fd1d1f519a0a0d3 Mon Sep 17 00:00:00 2001
> From: Tejun Heo 
> Date: Mon, 17 Jul 2023 12:50:02 -1000
> Subject: [PATCH] workqueue: Scale up wq_cpu_intensive_thresh_us if BogoMIPS is
>  below 4000

> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c

> @@ -6513,6 +6516,42 @@ void __init workqueue_init_early(void)
>!system_freezable_power_efficient_wq);
>  }
>
> +static void __init wq_cpu_intensive_thresh_init(void)
> +{
> +   unsigned long thresh;
> +   unsigned long mips;

This fails to build on mips.
Apparently mips is a predefined preprocessor macro:

$ echo | mipsel-linux-gnu-gcc -dM -E - | grep -w mips
#define mips 1

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: Consider switching to WQ_UNBOUND messages (was: Re: [PATCH v2 6/7] workqueue: Report work funcs that trigger automatic CPU_INTENSIVE mechanism)

2023-07-18 Thread Tejun Heo
Hello,

On Tue, Jul 18, 2023 at 11:54:58AM +0200, Geert Uytterhoeven wrote:
> I gave it a try on a system with an 800 MHz Cortex A9, only to discover
> it makes no difference, as that machine has 1600 BogoMIPS:

Oops.

> workqueue: blk_mq_run_work_fn hogged CPU for >1us 4 times,
> consider switching to WQ_UNBOUND

It could be that we actually want to switch to UNBOUND for some reports but
the above triggering most likely indicates that the threshold is too
aggressive.

> Artificially low BogoMIPS numbers only happen on systems that have
> the related timers (Cortex A7/A15 and later, Cortex A9 MPCore,
> and arm64).

Ah, I see. Thanks for the explanation.

> I will test on more systems, but that will probably not happen until
> next week...

Thanks, really appreciate it. Can you try the following instead when you
have time? I just pushed up the lower boundary to 4000 MIPS. The scaling is
still capped at 1s.

>From 8555cbd4b22e5f85eb2bdcb84fd1d1f519a0a0d3 Mon Sep 17 00:00:00 2001
From: Tejun Heo 
Date: Mon, 17 Jul 2023 12:50:02 -1000
Subject: [PATCH] workqueue: Scale up wq_cpu_intensive_thresh_us if BogoMIPS is
 below 4000

wq_cpu_intensive_thresh_us is used to detect CPU-hogging per-cpu work items.
Once detected, they're excluded from concurrency management to prevent them
from blocking other per-cpu work items. If CONFIG_WQ_CPU_INTENSIVE_REPORT is
enabled, repeat offenders are also reported so that the code can be updated.

The default threshold is 10ms which is long enough to do fair bit of work on
modern CPUs while short enough to be usually not noticeable. This
unfortunately leads to a lot of, arguable spurious, detections on very slow
CPUs. Using the same threshold across CPUs whose performance levels may be
apart by multiple levels of magnitude doesn't make whole lot of sense.

This patch scales up wq_cpu_intensive_thresh_us upto 1 second when BogoMIPS
is below 4000. This is obviously very inaccurate but it doesn't have to be
accurate to be useful. The mechanism is still useful when the threshold is
fully scaled up and the benefits of reports are usually shared with everyone
regardless of who's reporting, so as long as there are sufficient number of
fast machines reporting, we don't lose much.

Some (or is it all?) ARM CPUs systemtically report significantly lower
BogoMIPS. While this doesn't break anything, given how widespread ARM CPUs
are, it's at least a missed opportunity and it probably would be a good idea
to teach workqueue about it.

Signed-off-by: Tejun Heo 
Cc: Geert Uytterhoeven 
---
 kernel/workqueue.c | 43 ++-
 1 file changed, 42 insertions(+), 1 deletion(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 02a8f402eeb5..0d7a3d29762f 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -52,6 +52,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "workqueue_internal.h"
 
@@ -338,8 +339,10 @@ static cpumask_var_t *wq_numa_possible_cpumask;
  * Per-cpu work items which run for longer than the following threshold are
  * automatically considered CPU intensive and excluded from concurrency
  * management to prevent them from noticeably delaying other per-cpu work 
items.
+ * ULONG_MAX indicates that the user hasn't overridden it with a boot 
parameter.
+ * The actual value is initialized in wq_cpu_intensive_thresh_init().
  */
-static unsigned long wq_cpu_intensive_thresh_us = 1;
+static unsigned long wq_cpu_intensive_thresh_us = ULONG_MAX;
 module_param_named(cpu_intensive_thresh_us, wq_cpu_intensive_thresh_us, ulong, 
0644);
 
 static bool wq_disable_numa;
@@ -6513,6 +6516,42 @@ void __init workqueue_init_early(void)
   !system_freezable_power_efficient_wq);
 }
 
+static void __init wq_cpu_intensive_thresh_init(void)
+{
+   unsigned long thresh;
+   unsigned long mips;
+
+   /* if the user set it to a specific value, keep it */
+   if (wq_cpu_intensive_thresh_us != ULONG_MAX)
+   return;
+
+   /*
+* The default of 10ms is derived from the fact that most modern (as of
+* 2023) processors can do a lot in 10ms and that it's just below what
+* most consider human-perceivable. However, the kernel also runs on a
+* lot slower CPUs including microcontrollers where the threshold is way
+* too low.
+*
+* Let's scale up the threshold upto 1 second if BogoMips is below 4000.
+* This is by no means accurate but it doesn't have to be. The mechanism
+* is still useful even when the threshold is fully scaled up. Also, as
+* the reports would usually be applicable to everyone, some machines
+* operating on longer thresholds won't significantly diminish their
+* usefulness.
+*/
+   thresh = 10 * USEC_PER_MSEC;
+
+   /* see init/calibrate.c for lpj -> BogoMIPS calculation */
+   mips = max_t(unsigned long, loops_per_jiffy / 50 * HZ, 1);
+   if (mips < 4000)
+

Re: Consider switching to WQ_UNBOUND messages (was: Re: [PATCH v2 6/7] workqueue: Report work funcs that trigger automatic CPU_INTENSIVE mechanism)

2023-07-18 Thread Geert Uytterhoeven
Hi Tejun,

On Tue, Jul 18, 2023 at 1:03 AM Tejun Heo  wrote:
> Can you please the following patch and see how many reports you get? Looking
> back at your reports, I think some of them probably should be converted to
> UNBOUND but we should have a better idea with the adjusted threshold.
>
> Thanks.
>
> From 8555cbd4b22e5f85eb2bdcb84fd1d1f519a0a0d3 Mon Sep 17 00:00:00 2001
> From: Tejun Heo 
> Date: Mon, 17 Jul 2023 12:50:02 -1000
> Subject: [PATCH] workqueue: Scale up wq_cpu_intensive_thresh_us if BogoMIPS is
>  below 1000
>
> wq_cpu_intensive_thresh_us is used to detect CPU-hogging per-cpu work items.
> Once detected, they're excluded from concurrency management to prevent them
> from blocking other per-cpu work items. If CONFIG_WQ_CPU_INTENSIVE_REPORT is
> enabled, repeat offenders are also reported so that the code can be updated.
>
> The default threshold is 10ms which is long enough to do fair bit of work on
> modern CPUs while short enough to be usually not noticeable. This
> unfortunately leads to a lot of, arguable spurious, detections on very slow
> CPUs. Using the same threshold across CPUs whose performance levels may be
> apart by multiple levels of magnitude doesn't make whole lot of sense.
>
> This patch scales up wq_cpu_intensive_thresh_us upto 1 second when BogoMIPS
> is below 1000. This is obviously very inaccurate but it doesn't have to be
> accurate to be useful. The mechanism is still useful when the threshold is
> fully scaled up and the benefits of reports are usually shared with everyone
> regardless of who's reporting, so as long as there are sufficient number of
> fast machines reporting, we don't lose much.
>
> Some (or is it all?) ARM CPUs systemtically report significantly lower
> BogoMIPS. While this doesn't break anything, given how widespread ARM CPUs
> are, it's at least a missed opportunity and it probably would be a good idea
> to teach workqueue about it.
>
> Signed-off-by: Tejun Heo 

Thanks!

I gave it a try on a system with an 800 MHz Cortex A9, only to discover
it makes no difference, as that machine has 1600 BogoMIPS:

workqueue: drm_fb_helper_damage_work hogged CPU for >1us 4 times,
consider switching to WQ_UNBOUND
workqueue: drm_fb_helper_damage_work hogged CPU for >1us 8 times,
consider switching to WQ_UNBOUND
workqueue: genpd_power_off_work_fn hogged CPU for >1us 4 times,
consider switching to WQ_UNBOUND
workqueue: blk_mq_run_work_fn hogged CPU for >1us 4 times,
consider switching to WQ_UNBOUND
workqueue: pm_runtime_work hogged CPU for >1us 4 times, consider
switching to WQ_UNBOUND
workqueue: phy_state_machine hogged CPU for >1us 4 times, consider
switching to WQ_UNBOUND
workqueue: drm_mode_rmfb_work_fn hogged CPU for >1us 4 times,
consider switching to WQ_UNBOUND
workqueue: sync_hw_clock hogged CPU for >1us 4 times, consider
switching to WQ_UNBOUND
workqueue: rtc_timer_do_work hogged CPU for >1us 4 times, consider
switching to WQ_UNBOUND

Artificially low BogoMIPS numbers only happen on systems that have
the related timers (Cortex A7/A15 and later, Cortex A9 MPCore,
and arm64).

I will test on more systems, but that will probably not happen until
next week...

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: Consider switching to WQ_UNBOUND messages (was: Re: [PATCH v2 6/7] workqueue: Report work funcs that trigger automatic CPU_INTENSIVE mechanism)

2023-07-17 Thread Tejun Heo
Hello, Geert.

Can you please the following patch and see how many reports you get? Looking
back at your reports, I think some of them probably should be converted to
UNBOUND but we should have a better idea with the adjusted threshold.

Thanks.

>From 8555cbd4b22e5f85eb2bdcb84fd1d1f519a0a0d3 Mon Sep 17 00:00:00 2001
From: Tejun Heo 
Date: Mon, 17 Jul 2023 12:50:02 -1000
Subject: [PATCH] workqueue: Scale up wq_cpu_intensive_thresh_us if BogoMIPS is
 below 1000

wq_cpu_intensive_thresh_us is used to detect CPU-hogging per-cpu work items.
Once detected, they're excluded from concurrency management to prevent them
from blocking other per-cpu work items. If CONFIG_WQ_CPU_INTENSIVE_REPORT is
enabled, repeat offenders are also reported so that the code can be updated.

The default threshold is 10ms which is long enough to do fair bit of work on
modern CPUs while short enough to be usually not noticeable. This
unfortunately leads to a lot of, arguable spurious, detections on very slow
CPUs. Using the same threshold across CPUs whose performance levels may be
apart by multiple levels of magnitude doesn't make whole lot of sense.

This patch scales up wq_cpu_intensive_thresh_us upto 1 second when BogoMIPS
is below 1000. This is obviously very inaccurate but it doesn't have to be
accurate to be useful. The mechanism is still useful when the threshold is
fully scaled up and the benefits of reports are usually shared with everyone
regardless of who's reporting, so as long as there are sufficient number of
fast machines reporting, we don't lose much.

Some (or is it all?) ARM CPUs systemtically report significantly lower
BogoMIPS. While this doesn't break anything, given how widespread ARM CPUs
are, it's at least a missed opportunity and it probably would be a good idea
to teach workqueue about it.

Signed-off-by: Tejun Heo 
Cc: Geert Uytterhoeven 
---
 kernel/workqueue.c | 43 ++-
 1 file changed, 42 insertions(+), 1 deletion(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 02a8f402eeb5..0d7a3d29762f 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -52,6 +52,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "workqueue_internal.h"
 
@@ -338,8 +339,10 @@ static cpumask_var_t *wq_numa_possible_cpumask;
  * Per-cpu work items which run for longer than the following threshold are
  * automatically considered CPU intensive and excluded from concurrency
  * management to prevent them from noticeably delaying other per-cpu work 
items.
+ * ULONG_MAX indicates that the user hasn't overridden it with a boot 
parameter.
+ * The actual value is initialized in wq_cpu_intensive_thresh_init().
  */
-static unsigned long wq_cpu_intensive_thresh_us = 1;
+static unsigned long wq_cpu_intensive_thresh_us = ULONG_MAX;
 module_param_named(cpu_intensive_thresh_us, wq_cpu_intensive_thresh_us, ulong, 
0644);
 
 static bool wq_disable_numa;
@@ -6513,6 +6516,42 @@ void __init workqueue_init_early(void)
   !system_freezable_power_efficient_wq);
 }
 
+static void __init wq_cpu_intensive_thresh_init(void)
+{
+   unsigned long thresh;
+   unsigned long mips;
+
+   /* if the user set it to a specific value, keep it */
+   if (wq_cpu_intensive_thresh_us != ULONG_MAX)
+   return;
+
+   /*
+* The default of 10ms is derived from the fact that most modern (as of
+* 2023) processors can do a lot in 10ms and that it's just below what
+* most consider human-perceivable. However, the kernel also runs on a
+* lot slower CPUs including microcontrollers where the threshold is way
+* too low.
+*
+* Let's scale up the threshold upto 1 second if BogoMips is below 1000.
+* This is by no means accurate but it doesn't have to be. The mechanism
+* is still useful even when the threshold is fully scaled up. Also, as
+* the reports would usually be applicable to everyone, some machines
+* operating on longer thresholds won't significantly diminish their
+* usefulness.
+*/
+   thresh = 10 * USEC_PER_MSEC;
+
+   /* see init/calibrate.c for lpj -> BogoMIPS calculation */
+   mips = max_t(unsigned long, loops_per_jiffy / 50 * HZ, 1);
+   if (mips < 1000)
+   thresh = min_t(unsigned long, thresh * 1000 / mips, 
USEC_PER_SEC);
+
+   pr_debug("wq_cpu_intensive_thresh: lpj=%lu mips=%lu thresh_us=%lu\n",
+loops_per_jiffy, mips, thresh);
+
+   wq_cpu_intensive_thresh_us = thresh;
+}
+
 /**
  * workqueue_init - bring workqueue subsystem fully online
  *
@@ -6528,6 +6567,8 @@ void __init workqueue_init(void)
struct worker_pool *pool;
int cpu, bkt;
 
+   wq_cpu_intensive_thresh_init();
+
/*
 * It'd be simpler to initialize NUMA in workqueue_init_early() but
 * CPU to node mapping may not be available that early on some
-- 
2.41.0



Re: Consider switching to WQ_UNBOUND messages (was: Re: [PATCH v2 6/7] workqueue: Report work funcs that trigger automatic CPU_INTENSIVE mechanism)

2023-07-13 Thread Tejun Heo
On Wed, Jul 12, 2023 at 02:27:45PM +0200, Peter Zijlstra wrote:
> On Wed, Jul 12, 2023 at 11:04:16AM +0200, Geert Uytterhoeven wrote:
> > Hoi Peter,
> > 
> > On Wed, Jul 12, 2023 at 10:05 AM Peter Zijlstra  
> > wrote:
> > > On Tue, Jul 11, 2023 at 11:39:17AM -1000, Tejun Heo wrote:
> > > > I wonder whether the right thing to do here is somehow scaling the 
> > > > threshold
> > > > according to the relative processing power. It's difficult to come up 
> > > > with a
> > > > threshold which works well across the latest & fastest and really tiny 
> > > > CPUs.
> > > > I'll think about it some more but if you have some ideas, please feel 
> > > > free
> > > > to suggest.
> > >
> > > We could scale by BogoMIPS I suppose, it's a bogus measurement, as per
> > > the name, but it does have some relation to how fast the machine is.
> > 
> > That's gonna fail miserably on e.g. ARM and RISC-V, where BogoMIPS
> > depends on some timer frequency.
> > 
> > R-Car M2-W with 1.5 GHz Cortex-A15: 40.00 BogoMIPS
> > R-Car V4H with 1.8 GHz Cortex-A76: 33.33 BogoMIPS
> > 
> > while the real slow 48 MHz VexRiscV gets 128 BogoMIPS.
> 
> Hehe, OK, really bogus then. Lets file this idea in the bit-bucket then.

I think it can still be useful. On ryzen 3975wx, it's 6989.92, so while it
may be off by some hundreds of percents, there are still orders of magnitude
signal range and that should be enough to suppress most spurious warnings.
I'll post something later today.

Thanks.

-- 
tejun


Re: Consider switching to WQ_UNBOUND messages (was: Re: [PATCH v2 6/7] workqueue: Report work funcs that trigger automatic CPU_INTENSIVE mechanism)

2023-07-12 Thread Peter Zijlstra
On Wed, Jul 12, 2023 at 11:04:16AM +0200, Geert Uytterhoeven wrote:
> Hoi Peter,
> 
> On Wed, Jul 12, 2023 at 10:05 AM Peter Zijlstra  wrote:
> > On Tue, Jul 11, 2023 at 11:39:17AM -1000, Tejun Heo wrote:
> > > I wonder whether the right thing to do here is somehow scaling the 
> > > threshold
> > > according to the relative processing power. It's difficult to come up 
> > > with a
> > > threshold which works well across the latest & fastest and really tiny 
> > > CPUs.
> > > I'll think about it some more but if you have some ideas, please feel free
> > > to suggest.
> >
> > We could scale by BogoMIPS I suppose, it's a bogus measurement, as per
> > the name, but it does have some relation to how fast the machine is.
> 
> That's gonna fail miserably on e.g. ARM and RISC-V, where BogoMIPS
> depends on some timer frequency.
> 
> R-Car M2-W with 1.5 GHz Cortex-A15: 40.00 BogoMIPS
> R-Car V4H with 1.8 GHz Cortex-A76: 33.33 BogoMIPS
> 
> while the real slow 48 MHz VexRiscV gets 128 BogoMIPS.

Hehe, OK, really bogus then. Lets file this idea in the bit-bucket then.


Re: Consider switching to WQ_UNBOUND messages (was: Re: [PATCH v2 6/7] workqueue: Report work funcs that trigger automatic CPU_INTENSIVE mechanism)

2023-07-12 Thread Geert Uytterhoeven
Hi Tejun,

On Wed, Jul 12, 2023 at 2:30 AM Tejun Heo  wrote:
> On Tue, Jul 11, 2023 at 11:39:17AM -1000, Tejun Heo wrote:
> > On Tue, Jul 11, 2023 at 04:06:22PM +0200, Geert Uytterhoeven wrote:
> > > On Tue, Jul 11, 2023 at 3:55 PM Geert Uytterhoeven  
> > > wrote:
> ...
> > > workqueue: neigh_managed_work hogged CPU for >1us 4 times,
> > > consider switching to WQ_UNBOUND
> >
> > I wonder whether the right thing to do here is somehow scaling the threshold
> > according to the relative processing power. It's difficult to come up with a
> > threshold which works well across the latest & fastest and really tiny CPUs.
> > I'll think about it some more but if you have some ideas, please feel free
> > to suggest.
>
> Geert, do you mind posting the full kernel logs for the affected machines?

https://drive.google.com/file/d/1toDs7ugZJ2eXatpdvySY4yxSsNam9xAC
is an archive with boot and s2ram logs.  Note that my kernels do contain
local debug code, and may be noisy.

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: Consider switching to WQ_UNBOUND messages (was: Re: [PATCH v2 6/7] workqueue: Report work funcs that trigger automatic CPU_INTENSIVE mechanism)

2023-07-12 Thread Geert Uytterhoeven
Hoi Peter,

On Wed, Jul 12, 2023 at 10:05 AM Peter Zijlstra  wrote:
> On Tue, Jul 11, 2023 at 11:39:17AM -1000, Tejun Heo wrote:
> > I wonder whether the right thing to do here is somehow scaling the threshold
> > according to the relative processing power. It's difficult to come up with a
> > threshold which works well across the latest & fastest and really tiny CPUs.
> > I'll think about it some more but if you have some ideas, please feel free
> > to suggest.
>
> We could scale by BogoMIPS I suppose, it's a bogus measurement, as per
> the name, but it does have some relation to how fast the machine is.

That's gonna fail miserably on e.g. ARM and RISC-V, where BogoMIPS
depends on some timer frequency.

R-Car M2-W with 1.5 GHz Cortex-A15: 40.00 BogoMIPS
R-Car V4H with 1.8 GHz Cortex-A76: 33.33 BogoMIPS

while the real slow 48 MHz VexRiscV gets 128 BogoMIPS.

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: Consider switching to WQ_UNBOUND messages (was: Re: [PATCH v2 6/7] workqueue: Report work funcs that trigger automatic CPU_INTENSIVE mechanism)

2023-07-12 Thread Peter Zijlstra
On Tue, Jul 11, 2023 at 11:39:17AM -1000, Tejun Heo wrote:

> I wonder whether the right thing to do here is somehow scaling the threshold
> according to the relative processing power. It's difficult to come up with a
> threshold which works well across the latest & fastest and really tiny CPUs.
> I'll think about it some more but if you have some ideas, please feel free
> to suggest.

We could scale by BogoMIPS I suppose, it's a bogus measurement, as per
the name, but it does have some relation to how fast the machine is.


Re: Consider switching to WQ_UNBOUND messages (was: Re: [PATCH v2 6/7] workqueue: Report work funcs that trigger automatic CPU_INTENSIVE mechanism)

2023-07-11 Thread Tejun Heo
On Tue, Jul 11, 2023 at 11:39:17AM -1000, Tejun Heo wrote:
> On Tue, Jul 11, 2023 at 04:06:22PM +0200, Geert Uytterhoeven wrote:
> > On Tue, Jul 11, 2023 at 3:55 PM Geert Uytterhoeven  
> > wrote:
...
> > workqueue: neigh_managed_work hogged CPU for >1us 4 times,
> > consider switching to WQ_UNBOUND
> 
> I wonder whether the right thing to do here is somehow scaling the threshold
> according to the relative processing power. It's difficult to come up with a
> threshold which works well across the latest & fastest and really tiny CPUs.
> I'll think about it some more but if you have some ideas, please feel free
> to suggest.

Geert, do you mind posting the full kernel logs for the affected machines?

Thanks.

-- 
tejun


Re: Consider switching to WQ_UNBOUND messages (was: Re: [PATCH v2 6/7] workqueue: Report work funcs that trigger automatic CPU_INTENSIVE mechanism)

2023-07-11 Thread Tejun Heo
Hello,

On Tue, Jul 11, 2023 at 04:06:22PM +0200, Geert Uytterhoeven wrote:
> On Tue, Jul 11, 2023 at 3:55 PM Geert Uytterhoeven  
> wrote:
> >
> > Hi Tejun,
> >
> > On Fri, May 12, 2023 at 9:54 PM Tejun Heo  wrote:
> > > Workqueue now automatically marks per-cpu work items that hog CPU for too
> > > long as CPU_INTENSIVE, which excludes them from concurrency management and
> > > prevents stalling other concurrency-managed work items. If a work function
> > > keeps running over the thershold, it likely needs to be switched to use an
> > > unbound workqueue.
> > >
> > > This patch adds a debug mechanism which tracks the work functions which
> > > trigger the automatic CPU_INTENSIVE mechanism and report them using
> > > pr_warn() with exponential backoff.
> > >
> > > v2: Drop bouncing through kthread_worker for printing messages. It was to
> > > avoid introducing circular locking dependency but wasn't effective as 
> > > it
> > > still had pool lock -> wci_lock -> printk -> pool lock loop. Let's 
> > > just
> > > print directly using printk_deferred().
> > >
> > > Signed-off-by: Tejun Heo 
> > > Suggested-by: Peter Zijlstra 
> >
> > Thanks for your patch, which is now commit 6363845005202148
> > ("workqueue: Report work funcs that trigger automatic CPU_INTENSIVE
> > mechanism") in v6.5-rc1.
> >
> > I guess you are interested to know where this triggers.
> > I enabled CONFIG_WQ_CPU_INTENSIVE_REPORT=y, and tested
> > the result on various machines...
> 
> > OrangeCrab/Linux-on-LiteX-VexRiscV with ht16k33 14-seg display and 
> > ssd130xdrmfb:
> >
> >   workqueue: check_lifetime hogged CPU for >1us 4 times, consider
> > switching to WQ_UNBOUND
> >   workqueue: drm_fb_helper_damage_work hogged CPU for >1us 1024
> > times, consider switching to WQ_UNBOUND
> >   workqueue: fb_flashcursor hogged CPU for >1us 128 times,
> > consider switching to WQ_UNBOUND
> >   workqueue: ht16k33_seg14_update hogged CPU for >1us 128 times,
> > consider switching to WQ_UNBOUND
> >   workqueue: mmc_rescan hogged CPU for >1us 128 times, consider
> > switching to WQ_UNBOUND
> 
> Got one more after a while:
> 
> workqueue: neigh_managed_work hogged CPU for >1us 4 times,
> consider switching to WQ_UNBOUND

I wonder whether the right thing to do here is somehow scaling the threshold
according to the relative processing power. It's difficult to come up with a
threshold which works well across the latest & fastest and really tiny CPUs.
I'll think about it some more but if you have some ideas, please feel free
to suggest.

Thanks.

-- 
tejun


Re: Consider switching to WQ_UNBOUND messages (was: Re: [PATCH v2 6/7] workqueue: Report work funcs that trigger automatic CPU_INTENSIVE mechanism)

2023-07-11 Thread Geert Uytterhoeven
On Tue, Jul 11, 2023 at 3:55 PM Geert Uytterhoeven  wrote:
>
> Hi Tejun,
>
> On Fri, May 12, 2023 at 9:54 PM Tejun Heo  wrote:
> > Workqueue now automatically marks per-cpu work items that hog CPU for too
> > long as CPU_INTENSIVE, which excludes them from concurrency management and
> > prevents stalling other concurrency-managed work items. If a work function
> > keeps running over the thershold, it likely needs to be switched to use an
> > unbound workqueue.
> >
> > This patch adds a debug mechanism which tracks the work functions which
> > trigger the automatic CPU_INTENSIVE mechanism and report them using
> > pr_warn() with exponential backoff.
> >
> > v2: Drop bouncing through kthread_worker for printing messages. It was to
> > avoid introducing circular locking dependency but wasn't effective as it
> > still had pool lock -> wci_lock -> printk -> pool lock loop. Let's just
> > print directly using printk_deferred().
> >
> > Signed-off-by: Tejun Heo 
> > Suggested-by: Peter Zijlstra 
>
> Thanks for your patch, which is now commit 6363845005202148
> ("workqueue: Report work funcs that trigger automatic CPU_INTENSIVE
> mechanism") in v6.5-rc1.
>
> I guess you are interested to know where this triggers.
> I enabled CONFIG_WQ_CPU_INTENSIVE_REPORT=y, and tested
> the result on various machines...

> OrangeCrab/Linux-on-LiteX-VexRiscV with ht16k33 14-seg display and 
> ssd130xdrmfb:
>
>   workqueue: check_lifetime hogged CPU for >1us 4 times, consider
> switching to WQ_UNBOUND
>   workqueue: drm_fb_helper_damage_work hogged CPU for >1us 1024
> times, consider switching to WQ_UNBOUND
>   workqueue: fb_flashcursor hogged CPU for >1us 128 times,
> consider switching to WQ_UNBOUND
>   workqueue: ht16k33_seg14_update hogged CPU for >1us 128 times,
> consider switching to WQ_UNBOUND
>   workqueue: mmc_rescan hogged CPU for >1us 128 times, consider
> switching to WQ_UNBOUND

Got one more after a while:

workqueue: neigh_managed_work hogged CPU for >1us 4 times,
consider switching to WQ_UNBOUND

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds