Re: [PATCH v5 7/7] sched, smp: Trace smp callback causing an IPI

2023-03-22 Thread Peter Zijlstra
On Wed, Mar 22, 2023 at 06:22:28PM +, Valentin Schneider wrote:
> On 22/03/23 18:22, Peter Zijlstra wrote:

> >>    hackbench-157   [001]    10.894320: ipi_send_cpu: cpu=3 callsite=check_preempt_curr+0x37 callback=0x0
> >
> > Arguably we should be setting callback to scheduler_ipi(), except
> > of course, that's not an actual function...
> >
> > Maybe we can do "extern inline" for the actual users and provide a dummy
> > function for the symbol when tracing.
> >
> 
> Huh, I wasn't aware that was an option, I'll look into that. I did scribble
> down a comment next to smp_send_reschedule(), but having a decodable
> function name would be better!

So clang-15 builds the below (and generates the expected code), but
gcc-12 vomits nonsense about a non-static inline calling a static inline
or some such bollocks :-/

--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1991,7 +1991,7 @@ extern char *__get_task_comm(char *to, s
 })
 
 #ifdef CONFIG_SMP
-static __always_inline void scheduler_ipi(void)
+extern __always_inline void scheduler_ipi(void)
 {
/*
 * Fold TIF_NEED_RESCHED into the preempt_count; anybody setting
--- a/include/linux/smp.h
+++ b/include/linux/smp.h
@@ -130,9 +130,9 @@ extern void arch_smp_send_reschedule(int
  * scheduler_ipi() is inline so can't be passed as callback reason, but the
  * callsite IP should be sufficient for root-causing IPIs sent from here.
  */
-#define smp_send_reschedule(cpu) ({  \
-   trace_ipi_send_cpu(cpu, _RET_IP_, NULL);  \
-   arch_smp_send_reschedule(cpu);\
+#define smp_send_reschedule(cpu) ({\
+   trace_ipi_send_cpu(cpu, _RET_IP_, &scheduler_ipi);  \
+   arch_smp_send_reschedule(cpu);  \
 })
 
 /*
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3790,6 +3790,15 @@ static int ttwu_runnable(struct task_str
 }
 
 #ifdef CONFIG_SMP
+void scheduler_ipi(void)
+{
+   /*
+* Actual users should end up using the extern inline, this is only
+* here for the symbol.
+*/
+   BUG();
+}
+
 void sched_ttwu_pending(void *arg)
 {
struct llist_node *llist = arg;
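
A minimal standalone sketch of the pattern being reached for here - file
names and attribute spellings are illustrative, not the kernel patch: with
GNU-style "extern inline", the header body is inlined at every direct call
site and emits no symbol, while one translation unit provides an
out-of-line definition purely so &scheduler_ipi exists for tracing to
resolve:

  /* ipi.h - hypothetical header */
  extern inline __attribute__((gnu_inline, always_inline))
  void scheduler_ipi(void)
  {
          /* the real work, inlined at every direct call site */
  }

  /* ipi.c - hypothetical separate translation unit (does not include
   * ipi.h): the out-of-line body exists only for the symbol; direct
   * callers get the inline copy and never land here. */
  void scheduler_ipi(void)
  {
          __builtin_trap();       /* stand-in for BUG() */
  }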



Re: [PATCH v5 7/7] sched, smp: Trace smp callback causing an IPI

2023-03-22 Thread Valentin Schneider
On 22/03/23 18:22, Peter Zijlstra wrote:
> On Wed, Mar 22, 2023 at 05:01:13PM +, Valentin Schneider wrote:
>
>> > So I was thinking something like this:
>
>> Hm, this does get rid of the func being passed down the helpers, but this
>> means the trace events are now stateful, i.e. I need the first and last
>> events in a CSD stack to figure out which one actually caused the IPI.
>
> Isn't much of tracing stateful? I mean, why am I always writing awk
> programs to parse trace output?
>
> The one that is directly followed by
> generic_smp_call_function_single_interrupt() (horrible name that) is
> the one that tripped the IPI.
>

Right.

>> It also requires whoever is looking at the trace to be aware of which IPIs
>> are attached to a CSD, and which ones aren't. ATM that's only the resched
>> IPI, but per the cover letter there's more to come (e.g. tick_broadcast()
>> for arm64/riscv and a few others). For instance:
>> 
>>    hackbench-157   [001]    10.894320: ipi_send_cpu: cpu=3 callsite=check_preempt_curr+0x37 callback=0x0
>
> Arguably we should be setting callback to scheduler_ipi(), except
> of course, that's not an actual function...
>
> Maybe we can do "extern inline" for the actual users and provide a dummy
> function for the symbol when tracing.
>

Huh, I wasn't aware that was an option, I'll look into that. I did scribble
down a comment next to smp_send_reschedule(), but having a decodable
function name would be better!

>>    hackbench-157   [001]    10.895068: ipi_send_cpu: cpu=3 callsite=try_to_wake_up+0x29e callback=sched_ttwu_pending+0x0
>>    hackbench-157   [001]    10.895068: ipi_send_cpu: cpu=3 callsite=try_to_wake_up+0x29e callback=generic_smp_call_function_single_interrupt+0x0
>> 
>> That first one sent a RESCHEDULE IPI, the second one a CALL_FUNCTION one,
>> but you really have to know what you're looking at...
>
> But you have to know that anyway, you can't do tracing and not know wtf
> you're doing. Or rather, if you do, I don't give a crap and you can keep
> the pieces :-)
>
> Grepping the callback should be pretty quick resolution as to what trips
> it, no?
>
> (also, if you *really* can't manage, we can always add yet another
> argument that gives a type thingy)
>

Ah, I was a bit unclear here - I don't care too much about the IPI type
being used, but rather about being able to figure out on IRQ entry where that IPI
came from - thinking some more about it now, I don't think logging *all* CSDs
causes an issue there, as you'd look at the earliest-not-seen-yet event
targeting this CPU anyway.
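
A sketch of that matching rule as a post-processing model - the names,
sizes and sample callsite values are illustrative, and it assumes sends
to a given CPU are consumed in order:

  #include <stdio.h>

  #define NCPUS 8
  #define NPEND 64

  /* Per-CPU FIFO of send events not yet matched to an IPI entry. */
  static unsigned long pending[NCPUS][NPEND];
  static unsigned int head[NCPUS], tail[NCPUS];

  /* ipi_send_cpu event: remember @callsite as pending for @cpu. */
  static void on_send(int cpu, unsigned long callsite)
  {
          pending[cpu][tail[cpu]++ % NPEND] = callsite;
  }

  /* IPI entry on @cpu: attribute it to the earliest unmatched send. */
  static void on_ipi_entry(int cpu)
  {
          if (head[cpu] != tail[cpu])
                  printf("cpu%d IPI <- send from %#lx\n",
                         cpu, pending[cpu][head[cpu]++ % NPEND]);
  }

  int main(void)
  {
          on_send(3, 0x1234);     /* e.g. check_preempt_curr+0x37 */
          on_send(3, 0x5678);     /* e.g. try_to_wake_up+0x29e */
          on_ipi_entry(3);        /* attributed to the earliest send */
          on_ipi_entry(3);
          return 0;
  }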

That'll be made easy once I get to having cpumask filters for ftrace, so
I can just issue something like:

  trace-cmd record -e 'ipi_send_cpu' -f "cpu == 3" -e 'ipi_send_cpumask' -f "cpus \in {3}" -T hackbench

(it's somewhere on the todolist...)

TL;DR: I *think* I've convinced myself logging all of them isn't an issue -
I'm going to play with this on something "smarter" than just hackbench
under QEMU just to drill it in.




Re: [PATCH v5 7/7] sched, smp: Trace smp callback causing an IPI

2023-03-22 Thread Peter Zijlstra
On Wed, Mar 22, 2023 at 05:01:13PM +, Valentin Schneider wrote:

> > So I was thinking something like this:

> Hm, this does get rid of the func being passed down the helpers, but this
> means the trace events are now stateful, i.e. I need the first and last
> events in a CSD stack to figure out which one actually caused the IPI.

Isn't much of tracing stateful? I mean, why am I always writing awk
programs to parse trace output?

The one that is directly followed by
generic_smp_call_function_single_interrupt() (horrible name that) is
the one that tripped the IPI.

> It also requires whoever is looking at the trace to be aware of which IPIs
> are attached to a CSD, and which ones aren't. ATM that's only the resched
> IPI, but per the cover letter there's more to come (e.g. tick_broadcast()
> for arm64/riscv and a few others). For instance:
> 
>    hackbench-157   [001]    10.894320: ipi_send_cpu: cpu=3 callsite=check_preempt_curr+0x37 callback=0x0

Arguably we should be setting callback to scheduler_ipi(), except
of course, that's not an actual function...

Maybe we can do "extern inline" for the actual users and provide a dummy
function for the symbol when tracing.

>    hackbench-157   [001]    10.895068: ipi_send_cpu: cpu=3 callsite=try_to_wake_up+0x29e callback=sched_ttwu_pending+0x0
>    hackbench-157   [001]    10.895068: ipi_send_cpu: cpu=3 callsite=try_to_wake_up+0x29e callback=generic_smp_call_function_single_interrupt+0x0
> 
> That first one sent a RESCHEDULE IPI, the second one a CALL_FUNCTION one,
> but you really have to know what you're looking at...

But you have to know that anyway, you can't do tracing and not know wtf
you're doing. Or rather, if you do, I don't give a crap and you can keep
the pieces :-)

Grepping the callback should be pretty quick resolution as to what trips
it, no?

(also, if you *really* can't manage, we can always add yet another
argument that gives a type thingy)

> Are you worried about the @func being pushed down?

Not really, I was finding it odd that only the first csd was being
logged. Either you should log them all (after all, the target CPU will
run them all and you might still wonder where the heck they came from)
or it should log none and always report that hideous long function name
I can't be arsed to type again :-)

> Staring at x86 asm is not good for the soul,

Scarred for life :-) What's worse, due to being exposed to Intel syntax
at a young age, I'm now permanently confused as to the argument order
of x86 asm.




Re: [PATCH v5 7/7] sched, smp: Trace smp callback causing an IPI

2023-03-22 Thread Valentin Schneider
On 22/03/23 15:04, Peter Zijlstra wrote:
> On Wed, Mar 22, 2023 at 12:20:28PM +, Valentin Schneider wrote:
>> On 22/03/23 10:53, Peter Zijlstra wrote:
>
>> > Hurmph... so we only really consume @func when we IPI. Would it not be
>> > more useful to trace this thing for *every* csd enqueued?
>>
>> It's true that any CSD enqueued on that CPU's call_single_queue in the
>> [first CSD llist_add()'ed, IPI IRQ hits] timeframe is a potential source of
>> interference.
>>
>> However, can we be sure that first CSD isn't an indirect cause for the
>> following ones? Say the target CPU exits RCU EQS due to the IPI, there's a
>> bit of time before it gets to flush_smp_call_function_queue() where some
>> other CSD could be enqueued *because* of that change in state.
>>
>> I couldn't find an easy example of that, I might be biased as this is where
>> I'd like to go wrt IPI'ing isolated CPUs in usermode. But regardless, when
>> correlating an IPI IRQ with its source, we'd always have to look at the
>> first CSD in that CSD stack.
>
> So I was thinking something like this:
>
> ---
> Subject: trace,smp: Trace all smp_function_call*() invocations
> From: Peter Zijlstra 
> Date: Wed Mar 22 14:58:36 CET 2023
>
> (Ab)use the trace_ipi_send_cpu*() family to trace all
> smp_function_call*() invocations, not only those that result in an
> actual IPI.
>
> The queued entries log their callback function while the actual IPIs
> are traced on generic_smp_call_function_single_interrupt().
>
> Signed-off-by: Peter Zijlstra (Intel) 
> ---
>  kernel/smp.c |   58 ++++++++++++++++++++++++++++++----------------------------
>  1 file changed, 30 insertions(+), 28 deletions(-)
>
> --- a/kernel/smp.c
> +++ b/kernel/smp.c
> @@ -106,18 +106,20 @@ void __init call_function_init(void)
>  }
>
>  static __always_inline void
> -send_call_function_single_ipi(int cpu, smp_call_func_t func)
> +send_call_function_single_ipi(int cpu)
>  {
>   if (call_function_single_prep_ipi(cpu)) {
> - trace_ipi_send_cpu(cpu, _RET_IP_, func);
> + trace_ipi_send_cpu(cpu, _RET_IP_,
> +generic_smp_call_function_single_interrupt);

Hm, this does get rid of the func being passed down the helpers, but this
means the trace events are now stateful, i.e. I need the first and last
events in a CSD stack to figure out which one actually caused the IPI.

It also requires whoever is looking at the trace to be aware of which IPIs
are attached to a CSD, and which ones aren't. ATM that's only the resched
IPI, but per the cover letter there's more to come (e.g. tick_broadcast()
for arm64/riscv and a few others). For instance:

   hackbench-157   [001]    10.894320: ipi_send_cpu: cpu=3 callsite=check_preempt_curr+0x37 callback=0x0
   hackbench-157   [001]    10.895068: ipi_send_cpu: cpu=3 callsite=try_to_wake_up+0x29e callback=sched_ttwu_pending+0x0
   hackbench-157   [001]    10.895068: ipi_send_cpu: cpu=3 callsite=try_to_wake_up+0x29e callback=generic_smp_call_function_single_interrupt+0x0

That first one sent a RESCHEDULE IPI, the second one a CALL_FUNCTION one,
but you really have to know what you're looking at...

Are you worried about the @func being pushed down? Staring at x86 asm is
not good for the soul, but AFAICT this does cause an extra register to be
popped in the prologue because all of the helpers are __always_inline, so
both paths of the static key(s) are in the same stackframe.

I can "improve" this with:

---
diff --git a/kernel/smp.c b/kernel/smp.c
index 5cd680a7e78ef..55f120dae1713 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -511,6 +511,26 @@ raw_smp_call_single_queue(int cpu, struct llist_node *node, smp_call_func_t func
 
 static DEFINE_PER_CPU_SHARED_ALIGNED(call_single_data_t, csd_data);
 
+static noinline void __smp_call_single_queue_trace(int cpu, struct llist_node *node)
+{
+   call_single_data_t *csd;
+   smp_call_func_t func;
+
+   /*
+* We have to check the type of the CSD before queueing it, because
+* once queued it can have its flags cleared by
+*   flush_smp_call_function_queue()
+* even if we haven't sent the smp_call IPI yet (e.g. the stopper
+* executes migration_cpu_stop() on the remote CPU).
+*/
+   csd = container_of(node, call_single_data_t, node.llist);
+   func = CSD_TYPE(csd) == CSD_TYPE_TTWU ?
+   sched_ttwu_pending : csd->func;
+
+   raw_smp_call_single_queue(cpu, node, func);
+}
+
 void __smp_call_single_queue(int cpu, struct llist_node *node)
 {
 #ifdef CONFIG_CSD_LOCK_WAIT_DEBUG
@@ -525,25 +545,10 @@ void __smp_call_single_queue(int cpu, struct llist_node *node)
}
}
 #endif
-   /*
-* We have to check the type of the CSD before queueing it, because
-* once queued it can have its flags cleared by
-*   flush_smp_call_function_queue()
-* even if we haven't sent the smp_call IPI yet (e.g. the stopper
-* executes migration_cpu_stop() on the remote CPU).
-*/

Re: [PATCH v5 7/7] sched, smp: Trace smp callback causing an IPI

2023-03-22 Thread Peter Zijlstra
On Wed, Mar 22, 2023 at 12:20:28PM +, Valentin Schneider wrote:
> On 22/03/23 10:53, Peter Zijlstra wrote:

> > Hurmph... so we only really consume @func when we IPI. Would it not be
> > more useful to trace this thing for *every* csd enqueued?
> 
> It's true that any CSD enqueued on that CPU's call_single_queue in the
> [first CSD llist_add()'ed, IPI IRQ hits] timeframe is a potential source of
> interference.
> 
> However, can we be sure that first CSD isn't an indirect cause for the
> following ones? Say the target CPU exits RCU EQS due to the IPI, there's a
> bit of time before it gets to flush_smp_call_function_queue() where some
> other CSD could be enqueued *because* of that change in state.
> 
> I couldn't find an easy example of that, I might be biased as this is where
> I'd like to go wrt IPI'ing isolated CPUs in usermode. But regardless, when
> correlating an IPI IRQ with its source, we'd always have to look at the
> first CSD in that CSD stack.

So I was thinking something like this:

---
Subject: trace,smp: Trace all smp_function_call*() invocations
From: Peter Zijlstra 
Date: Wed Mar 22 14:58:36 CET 2023

(Ab)use the trace_ipi_send_cpu*() family to trace all
smp_function_call*() invocations, not only those that result in an
actual IPI.

The queued entries log their callback function while the actual IPIs
are traced on generic_smp_call_function_single_interrupt().

Signed-off-by: Peter Zijlstra (Intel) 
---
 kernel/smp.c |   58 ++++++++++++++++++++++++++++++----------------------------
 1 file changed, 30 insertions(+), 28 deletions(-)

--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -106,18 +106,20 @@ void __init call_function_init(void)
 }
 
 static __always_inline void
-send_call_function_single_ipi(int cpu, smp_call_func_t func)
+send_call_function_single_ipi(int cpu)
 {
if (call_function_single_prep_ipi(cpu)) {
-   trace_ipi_send_cpu(cpu, _RET_IP_, func);
+   trace_ipi_send_cpu(cpu, _RET_IP_,
+  generic_smp_call_function_single_interrupt);
arch_send_call_function_single_ipi(cpu);
}
 }
 
 static __always_inline void
-send_call_function_ipi_mask(const struct cpumask *mask, smp_call_func_t func)
+send_call_function_ipi_mask(const struct cpumask *mask)
 {
-   trace_ipi_send_cpumask(mask, _RET_IP_, func);
+   trace_ipi_send_cpumask(mask, _RET_IP_,
+  generic_smp_call_function_single_interrupt);
arch_send_call_function_ipi_mask(mask);
 }
 
@@ -318,25 +320,6 @@ static __always_inline void csd_unlock(s
smp_store_release(&csd->node.u_flags, 0);
 }
 
-static __always_inline void
-raw_smp_call_single_queue(int cpu, struct llist_node *node, smp_call_func_t func)
-{
-   /*
-* The list addition should be visible to the target CPU when it pops
-* the head of the list to pull the entry off it in the IPI handler
-* because of normal cache coherency rules implied by the underlying
-* llist ops.
-*
-* If IPIs can go out of order to the cache coherency protocol
-* in an architecture, sufficient synchronisation should be added
-* to arch code to make it appear to obey cache coherency WRT
-* locking and barrier primitives. Generic code isn't really
-* equipped to do the right thing...
-*/
-   if (llist_add(node, &per_cpu(call_single_queue, cpu)))
-   send_call_function_single_ipi(cpu, func);
-}
-
 static DEFINE_PER_CPU_SHARED_ALIGNED(call_single_data_t, csd_data);
 
 void __smp_call_single_queue(int cpu, struct llist_node *node)
@@ -356,10 +339,23 @@ void __smp_call_single_queue(int cpu, st
func = CSD_TYPE(csd) == CSD_TYPE_TTWU ?
sched_ttwu_pending : csd->func;
 
-   raw_smp_call_single_queue(cpu, node, func);
-   } else {
-   raw_smp_call_single_queue(cpu, node, NULL);
+   trace_ipi_send_cpu(cpu, _RET_IP_, func);
}
+
+   /*
+* The list addition should be visible to the target CPU when it pops
+* the head of the list to pull the entry off it in the IPI handler
+* because of normal cache coherency rules implied by the underlying
+* llist ops.
+*
+* If IPIs can go out of order to the cache coherency protocol
+* in an architecture, sufficient synchronisation should be added
+* to arch code to make it appear to obey cache coherency WRT
+* locking and barrier primitives. Generic code isn't really
+* equipped to do the right thing...
+*/
+   if (llist_add(node, &per_cpu(call_single_queue, cpu)))
+   send_call_function_single_ipi(cpu);
 }
 
 /*
@@ -798,14 +794,20 @@ static void smp_call_function_many_cond(
}
 
/*
+* Trace each smp_function_call_*() as an IPI, actual IPIs
+* will be traced with func==generic_smp_call_function_single_interrupt().
+*/

Re: [PATCH v5 7/7] sched, smp: Trace smp callback causing an IPI

2023-03-22 Thread Valentin Schneider
On 22/03/23 10:53, Peter Zijlstra wrote:
> On Tue, Mar 07, 2023 at 02:35:58PM +, Valentin Schneider wrote:
>
>> @@ -477,6 +490,25 @@ static __always_inline void csd_unlock(struct __call_single_data *csd)
>>  smp_store_release(&csd->node.u_flags, 0);
>>  }
>>
>> +static __always_inline void
>> +raw_smp_call_single_queue(int cpu, struct llist_node *node, smp_call_func_t func)
>> +{
>> +/*
>> + * The list addition should be visible to the target CPU when it pops
>> + * the head of the list to pull the entry off it in the IPI handler
>> + * because of normal cache coherency rules implied by the underlying
>> + * llist ops.
>> + *
>> + * If IPIs can go out of order to the cache coherency protocol
>> + * in an architecture, sufficient synchronisation should be added
>> + * to arch code to make it appear to obey cache coherency WRT
>> + * locking and barrier primitives. Generic code isn't really
>> + * equipped to do the right thing...
>> + */
>> +if (llist_add(node, &per_cpu(call_single_queue, cpu)))
>> +send_call_function_single_ipi(cpu, func);
>> +}
>> +
>>  static DEFINE_PER_CPU_SHARED_ALIGNED(call_single_data_t, csd_data);
>>
>>  void __smp_call_single_queue(int cpu, struct llist_node *node)
>> @@ -493,21 +525,25 @@ void __smp_call_single_queue(int cpu, struct llist_node *node)
>>  }
>>  }
>>  #endif
>>  /*
>> + * We have to check the type of the CSD before queueing it, because
>> + * once queued it can have its flags cleared by
>> + *   flush_smp_call_function_queue()
>> + * even if we haven't sent the smp_call IPI yet (e.g. the stopper
>> + * executes migration_cpu_stop() on the remote CPU).
>>   */
>> +if (trace_ipi_send_cpumask_enabled()) {
>> +call_single_data_t *csd;
>> +smp_call_func_t func;
>> +
>> +csd = container_of(node, call_single_data_t, node.llist);
>> +func = CSD_TYPE(csd) == CSD_TYPE_TTWU ?
>> +sched_ttwu_pending : csd->func;
>> +
>> +raw_smp_call_single_queue(cpu, node, func);
>> +} else {
>> +raw_smp_call_single_queue(cpu, node, NULL);
>> +}
>>  }
>
> Hurmph... so we only really consume @func when we IPI. Would it not be
> more useful to trace this thing for *every* csd enqueued?

It's true that any CSD enqueued on that CPU's call_single_queue in the
[first CSD llist_add()'ed, IPI IRQ hits] timeframe is a potential source of
interference.

However, can we be sure that first CSD isn't an indirect cause for the
following ones? Say the target CPU exits RCU EQS due to the IPI, there's a
bit of time before it gets to flush_smp_call_function_queue() where some other
CSD could be enqueued *because* of that change in state.

I couldn't find an easy example of that, I might be biased as this is where
I'd like to go wrt IPI'ing isolated CPUs in usermode. But regardless, when
correlating an IPI IRQ with its source, we'd always have to look at the
first CSD in that CSD stack.




Re: [PATCH v5 1/7] trace: Add trace_ipi_send_cpumask()

2023-03-22 Thread Valentin Schneider
On 22/03/23 11:30, Peter Zijlstra wrote:
> On Wed, Mar 22, 2023 at 10:39:55AM +0100, Peter Zijlstra wrote:
>> On Tue, Mar 07, 2023 at 02:35:52PM +, Valentin Schneider wrote:
>> > +TRACE_EVENT(ipi_send_cpumask,
>> > +
>> > +  TP_PROTO(const struct cpumask *cpumask, unsigned long callsite, void *callback),
>> > +
>> > +  TP_ARGS(cpumask, callsite, callback),
>> > +
>> > +  TP_STRUCT__entry(
>> > +  __cpumask(cpumask)
>> > +  __field(void *, callsite)
>> > +  __field(void *, callback)
>> > +  ),
>> > +
>> > +  TP_fast_assign(
>> > +  __assign_cpumask(cpumask, cpumask_bits(cpumask));
>> > +  __entry->callsite = (void *)callsite;
>> > +  __entry->callback = callback;
>> > +  ),
>> > +
>> > +  TP_printk("cpumask=%s callsite=%pS callback=%pS",
>> > +__get_cpumask(cpumask), __entry->callsite, __entry->callback)
>> > +);
>>
>> Would it make sense to add a variant like: ipi_send_cpu() that records a
>> single cpu instead of a cpumask. A lot of sites seems to do:
>> cpumask_of(cpu) for that first argument, and it seems to me it is quite
>> daft to have to memcpy a full multi-word cpumask in those cases.
>>
>> Remember, nr_possible_cpus > 64 is quite common these days.
>
> Something a little bit like so...
>

I was wondering whether we could stick with a single trace event, but let
ftrace be aware of weight=1 vs weight>1 cpumasks.

For weight>1, it would memcpy() as usual, for weight=1, it could write a
pointer to a cpu_bit_bitmap[] equivalent embedded in the trace itself.

Unfortunately, Ftrace bitmasks are represented as a u32 made of two 16-bit
values: [offset in event record, size], so there isn't a straightforward
way to point to a "reusable" cpumask. AFAICT the only alternative would be
to do that via a different trace event, but then we should just go with a
plain old uint - i.e. do what you're doing here, so:

Tested-and-reviewed-by: Valentin Schneider 

(with the tiny typo fix below)

> @@ -35,6 +35,28 @@ TRACE_EVENT(ipi_raise,
>   TP_printk("target_mask=%s (%s)", __get_bitmask(target_cpus), __entry->reason)
>  );
>
> +TRACE_EVENT(ipi_send_cpu,
> +
> + TP_PROTO(const unsigned int cpu, unsigned long callsite, void *callback),
> +
> + TP_ARGS(cpu, callsite, callback),
> +
> + TP_STRUCT__entry(
> + __field(unsigned int, cpu)
> + __field(void *, callsite)
> + __field(void *, callback)
> + ),
> +
> + TP_fast_assign(
> + __entry->cpu = cpu;
> + __entry->callsite = (void *)callsite;
> + __entry->callback = callback;
> + ),
> +
> + TP_printk("cpu=%s callsite=%pS callback=%pS",
                    ^
  s/s/u/

> +   __entry->cpu, __entry->callsite, __entry->callback)
> +);
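
For reference, a sketch of the representation described above - assuming
the usual __data_loc-style packing of a dynamic field into a single u32,
low half = offset, high half = size:

  #include <stdint.h>

  /* The descriptor can only locate a payload *inside* its own event
   * record, which is why it can't point at a shared, pre-built
   * cpu_bit_bitmap[]-style mask living outside the record. */
  static inline void data_loc_decode(uint32_t loc,
                                     uint16_t *offset, uint16_t *len)
  {
          *offset = loc & 0xffff; /* byte offset within the record */
          *len    = loc >> 16;    /* payload size in bytes */
  }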
> +




Re: [PATCH v5 1/7] trace: Add trace_ipi_send_cpumask()

2023-03-22 Thread Peter Zijlstra
On Wed, Mar 22, 2023 at 10:39:55AM +0100, Peter Zijlstra wrote:
> On Tue, Mar 07, 2023 at 02:35:52PM +, Valentin Schneider wrote:
> > trace_ipi_raise() is unsuitable for generically tracing IPI sources due to
> > its "reason" argument being an uninformative string (on arm64 all you get
> > is "Function call interrupts" for SMP calls).
> > 
> > Add a variant of it that exports a target cpumask, a callsite and a callback.
> > 
> > Signed-off-by: Valentin Schneider 
> > Reviewed-by: Steven Rostedt (Google) 
> > ---
> >  include/trace/events/ipi.h | 22 ++++++++++++++++++++++
> >  1 file changed, 22 insertions(+)
> > 
> > diff --git a/include/trace/events/ipi.h b/include/trace/events/ipi.h
> > index 0be71dad6ec03..b1125dc27682c 100644
> > --- a/include/trace/events/ipi.h
> > +++ b/include/trace/events/ipi.h
> > @@ -35,6 +35,28 @@ TRACE_EVENT(ipi_raise,
> > TP_printk("target_mask=%s (%s)", __get_bitmask(target_cpus), __entry->reason)
> >  );
> >  
> > +TRACE_EVENT(ipi_send_cpumask,
> > +
> > +   TP_PROTO(const struct cpumask *cpumask, unsigned long callsite, void *callback),
> > +
> > +   TP_ARGS(cpumask, callsite, callback),
> > +
> > +   TP_STRUCT__entry(
> > +   __cpumask(cpumask)
> > +   __field(void *, callsite)
> > +   __field(void *, callback)
> > +   ),
> > +
> > +   TP_fast_assign(
> > +   __assign_cpumask(cpumask, cpumask_bits(cpumask));
> > +   __entry->callsite = (void *)callsite;
> > +   __entry->callback = callback;
> > +   ),
> > +
> > +   TP_printk("cpumask=%s callsite=%pS callback=%pS",
> > + __get_cpumask(cpumask), __entry->callsite, __entry->callback)
> > +);
> 
> Would it make sense to add a variant like: ipi_send_cpu() that records a
> single cpu instead of a cpumask. A lot of sites seem to do:
> cpumask_of(cpu) for that first argument, and it seems to me it is quite
> daft to have to memcpy a full multi-word cpumask in those cases.
> 
> Remember, nr_possible_cpus > 64 is quite common these days.

Something a little bit like so...

---
Subject: trace: Add trace_ipi_send_cpu()
From: Peter Zijlstra 
Date: Wed Mar 22 11:28:36 CET 2023

Because copying cpumasks around when targeting a single CPU is a bit
daft...

Signed-off-by: Peter Zijlstra (Intel) 
---
 include/linux/smp.h        |    6 +++---
 include/trace/events/ipi.h |   22 ++++++++++++++++++++++
 kernel/irq_work.c          |    6 ++----
 kernel/smp.c               |    4 ++--
 4 files changed, 29 insertions(+), 9 deletions(-)

--- a/include/linux/smp.h
+++ b/include/linux/smp.h
@@ -130,9 +130,9 @@ extern void arch_smp_send_reschedule(int
  * scheduler_ipi() is inline so can't be passed as callback reason, but the
  * callsite IP should be sufficient for root-causing IPIs sent from here.
  */
-#define smp_send_reschedule(cpu) ({  \
-   trace_ipi_send_cpumask(cpumask_of(cpu), _RET_IP_, NULL);  \
-   arch_smp_send_reschedule(cpu);\
+#define smp_send_reschedule(cpu) ({  \
+   trace_ipi_send_cpu(cpu, _RET_IP_, NULL);  \
+   arch_smp_send_reschedule(cpu);\
 })
 
 /*
--- a/include/trace/events/ipi.h
+++ b/include/trace/events/ipi.h
@@ -35,6 +35,28 @@ TRACE_EVENT(ipi_raise,
	TP_printk("target_mask=%s (%s)", __get_bitmask(target_cpus), __entry->reason)
 );
 
+TRACE_EVENT(ipi_send_cpu,
+
+   TP_PROTO(const unsigned int cpu, unsigned long callsite, void *callback),
+
+   TP_ARGS(cpu, callsite, callback),
+
+   TP_STRUCT__entry(
+   __field(unsigned int, cpu)
+   __field(void *, callsite)
+   __field(void *, callback)
+   ),
+
+   TP_fast_assign(
+   __entry->cpu = cpu;
+   __entry->callsite = (void *)callsite;
+   __entry->callback = callback;
+   ),
+
+   TP_printk("cpu=%s callsite=%pS callback=%pS",
+ __entry->cpu, __entry->callsite, __entry->callback)
+);
+
 TRACE_EVENT(ipi_send_cpumask,
 
	TP_PROTO(const struct cpumask *cpumask, unsigned long callsite, void *callback),
--- a/kernel/irq_work.c
+++ b/kernel/irq_work.c
@@ -78,10 +78,8 @@ void __weak arch_irq_work_raise(void)
 
 static __always_inline void irq_work_raise(struct irq_work *work)
 {
-   if (trace_ipi_send_cpumask_enabled() && arch_irq_work_has_interrupt())
-   trace_ipi_send_cpumask(cpumask_of(smp_processor_id()),
-  _RET_IP_,
-  work->func);
+   if (trace_ipi_send_cpu_enabled() && arch_irq_work_has_interrupt())
+   trace_ipi_send_cpu(smp_processor_id(), _RET_IP_, work->func);
 
arch_irq_work_raise();
 }
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -109,7 +109,7 @@ static __always_inline void
 send_call_function_single_ipi(int cpu, smp_call_func_t func)
 {
if (call_function_single_prep_ipi(cpu)) {
-   trace_ipi_send_cpumask(cpumask_of(cpu), _RET_IP_, func);

Re: [PATCH v5 7/7] sched, smp: Trace smp callback causing an IPI

2023-03-22 Thread Peter Zijlstra
On Tue, Mar 07, 2023 at 02:35:58PM +, Valentin Schneider wrote:

> @@ -477,6 +490,25 @@ static __always_inline void csd_unlock(struct __call_single_data *csd)
>   smp_store_release(&csd->node.u_flags, 0);
>  }
>  
> +static __always_inline void
> +raw_smp_call_single_queue(int cpu, struct llist_node *node, smp_call_func_t func)
> +{
> + /*
> +  * The list addition should be visible to the target CPU when it pops
> +  * the head of the list to pull the entry off it in the IPI handler
> +  * because of normal cache coherency rules implied by the underlying
> +  * llist ops.
> +  *
> +  * If IPIs can go out of order to the cache coherency protocol
> +  * in an architecture, sufficient synchronisation should be added
> +  * to arch code to make it appear to obey cache coherency WRT
> +  * locking and barrier primitives. Generic code isn't really
> +  * equipped to do the right thing...
> +  */
> + if (llist_add(node, &per_cpu(call_single_queue, cpu)))
> + send_call_function_single_ipi(cpu, func);
> +}
> +
>  static DEFINE_PER_CPU_SHARED_ALIGNED(call_single_data_t, csd_data);
>  
>  void __smp_call_single_queue(int cpu, struct llist_node *node)
> @@ -493,21 +525,25 @@ void __smp_call_single_queue(int cpu, struct llist_node *node)
>   }
>   }
>  #endif
>   /*
> +  * We have to check the type of the CSD before queueing it, because
> +  * once queued it can have its flags cleared by
> +  *   flush_smp_call_function_queue()
> +  * even if we haven't sent the smp_call IPI yet (e.g. the stopper
> +  * executes migration_cpu_stop() on the remote CPU).
>*/
> + if (trace_ipi_send_cpumask_enabled()) {
> + call_single_data_t *csd;
> + smp_call_func_t func;
> +
> + csd = container_of(node, call_single_data_t, node.llist);
> + func = CSD_TYPE(csd) == CSD_TYPE_TTWU ?
> + sched_ttwu_pending : csd->func;
> +
> + raw_smp_call_single_queue(cpu, node, func);
> + } else {
> + raw_smp_call_single_queue(cpu, node, NULL);
> + }
>  }

Hurmph... so we only really consume @func when we IPI. Would it not be
more useful to trace this thing for *every* csd enqueued?






Re: [PATCH v5 1/7] trace: Add trace_ipi_send_cpumask()

2023-03-22 Thread Peter Zijlstra
On Tue, Mar 07, 2023 at 02:35:52PM +, Valentin Schneider wrote:
> trace_ipi_raise() is unsuitable for generically tracing IPI sources due to
> its "reason" argument being an uninformative string (on arm64 all you get
> is "Function call interrupts" for SMP calls).
> 
> Add a variant of it that exports a target cpumask, a callsite and a callback.
> 
> Signed-off-by: Valentin Schneider 
> Reviewed-by: Steven Rostedt (Google) 
> ---
>  include/trace/events/ipi.h | 22 ++++++++++++++++++++++
>  1 file changed, 22 insertions(+)
> 
> diff --git a/include/trace/events/ipi.h b/include/trace/events/ipi.h
> index 0be71dad6ec03..b1125dc27682c 100644
> --- a/include/trace/events/ipi.h
> +++ b/include/trace/events/ipi.h
> @@ -35,6 +35,28 @@ TRACE_EVENT(ipi_raise,
>   TP_printk("target_mask=%s (%s)", __get_bitmask(target_cpus), __entry->reason)
>  );
>  
> +TRACE_EVENT(ipi_send_cpumask,
> +
> + TP_PROTO(const struct cpumask *cpumask, unsigned long callsite, void *callback),
> +
> + TP_ARGS(cpumask, callsite, callback),
> +
> + TP_STRUCT__entry(
> + __cpumask(cpumask)
> + __field(void *, callsite)
> + __field(void *, callback)
> + ),
> +
> + TP_fast_assign(
> + __assign_cpumask(cpumask, cpumask_bits(cpumask));
> + __entry->callsite = (void *)callsite;
> + __entry->callback = callback;
> + ),
> +
> + TP_printk("cpumask=%s callsite=%pS callback=%pS",
> +   __get_cpumask(cpumask), __entry->callsite, __entry->callback)
> +);

Would it make sense to add a variant like: ipi_send_cpu() that records a
single cpu instead of a cpumask. A lot of sites seem to do:
cpumask_of(cpu) for that first argument, and it seems to me it is quite
daft to have to memcpy a full multi-word cpumask in those cases.

Remember, nr_possible_cpus > 64 is quite common these days.
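
To put rough numbers on that - NR_CPUS is an illustrative config value
here, and ftrace copies nr_cpu_ids bits rather than NR_CPUS, so this is
an upper bound:

  #include <stdio.h>

  #define NR_CPUS 8192    /* hypothetical distro-style kernel config */

  int main(void)
  {
          /* __cpumask records one bit per possible CPU per event;
           * a plain cpu field is a single integer. */
          printf("cpumask payload: up to %d bytes per event\n", NR_CPUS / 8);
          printf("cpu field:       %zu bytes per event\n", sizeof(unsigned int));
          return 0;
  }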

___
linux-snps-arc mailing list
linux-snps-arc@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-snps-arc