Re: [PATCH 11/11 v2] ftrace: Add recording of functions that caused recursion

2020-11-03 Thread Petr Mladek
On Mon 2020-11-02 12:09:07, Steven Rostedt wrote:
> On Mon, 2 Nov 2020 17:41:47 +0100
> Petr Mladek  wrote:
> 
> > On Fri 2020-10-30 17:31:53, Steven Rostedt wrote:
> > > From: "Steven Rostedt (VMware)" 
> > > 
> > > This adds CONFIG_FTRACE_RECORD_RECURSION that will record to a file
> > > "recursed_functions" all the functions that caused recursion while a
> > > callback to the function tracer was running.
> > >   
> > 
> > > --- /dev/null
> > > +++ b/kernel/trace/trace_recursion_record.c
> > > + if (index >= CONFIG_FTRACE_RECORD_RECURSION_SIZE)
> > > + return;
> > > +
> > > + for (i = index - 1; i >= 0; i--) {
> > > + if (recursed_functions[i].ip == ip) {
> > > + cached_function = ip;
> > > + return;
> > > + }
> > > + }
> > > +
> > > + cached_function = ip;
> > > +
> > > + /*
> > > +  * We only want to add a function if it hasn't been added before.
> > > +  * Add to the current location before incrementing the count.
> > > +  * If it fails to add, then increment the index (save in i)
> > > +  * and try again.
> > > +  */
> > > + old = cmpxchg(_functions[index].ip, 0, ip);
> > > + if (old != 0) {
> > > + /* Did something else already added this for us? */
> > > + if (old == ip)
> > > + return;
> > > + /* Try the next location (use i for the next index) */
> > > + i = index + 1;  
> > 
> > What about
> > 
> > index++;
> > 
> > We basically want to run the code again with index + 1 limit.
> 
> But something else could update nr_records, and we want to use that if
> nr_records is greater than i.
> 
> Now, we could swap the use case, and have
> 
>   int index = 0;
> 
>   [..]
>   i = atomic_read(_records);
>   if (i > index)
>   index = i;
> 
>   [..]
> 
>   index++;
>   goto again;
> 
> 
> > 
> > Maybe, it even does not make sense to check the array again
> > and we should just try to store the value into the next slot.
> 
> We do this dance to prevent duplicates.

I see.

My code was wrong. It reserved slot for the new "ip" by cmpxchg
on nr_records. The "ip" was stored later so that any parallel
call need not see that it is a dumplicate.

Your code reserves the slot by cmpxchg of "ip".
Any parallel call would fail to take the slot and see
the "ip" in the next iteration.

Best Regards,
Petr


Re: [PATCH 11/11 v2] ftrace: Add recording of functions that caused recursion

2020-11-02 Thread Steven Rostedt
On Mon, 2 Nov 2020 12:37:21 -0500
Steven Rostedt  wrote:


> The only race that I see that can happen, is the one in the comment I
> showed. And that is after enabling the recursed functions again after
> clearing, one CPU could add a function while another CPU that just added
> that same function could be just exiting this routine, notice that a
> clearing of the array happened, and remove its function (which was the same
> as the one just happened). So we get a "zero" in the array. If this
> happens, it is likely that that function will recurse again and will be
> added later.
> 

Updated version of this function:

-- Steve


void ftrace_record_recursion(unsigned long ip, unsigned long parent_ip)
{
int index = 0;
int i;
unsigned long old;

 again:
/* First check the last one recorded */
if (ip == cached_function)
return;

i = atomic_read(_records);
/* nr_records is -1 when clearing records */
smp_mb__after_atomic();
if (i < 0)
return;

/*
 * If there's two writers and this writer comes in second,
 * the cmpxchg() below to update the ip will fail. Then this
 * writer will try again. It is possible that index will now
 * be greater than nr_records. This is because the writer
 * that succeeded has not updated the nr_records yet.
 * This writer could keep trying again until the other writer
 * updates nr_records. But if the other writer takes an
 * interrupt, and that interrupt locks up that CPU, we do
 * not want this CPU to lock up due to the recursion protection,
 * and have a bug report showing this CPU as the cause of
 * locking up the computer. To not lose this record, this
 * writer will simply use the next position to update the
 * recursed_functions, and it will update the nr_records
 * accordingly.
 */
if (index < i)
index = i;
if (index >= CONFIG_FTRACE_RECORD_RECURSION_SIZE)
return;

for (i = index - 1; i >= 0; i--) {
if (recursed_functions[i].ip == ip) {
cached_function = ip;
return;
}
}

cached_function = ip;

/*
 * We only want to add a function if it hasn't been added before.
 * Add to the current location before incrementing the count.
 * If it fails to add, then increment the index (save in i)
 * and try again.
 */
old = cmpxchg(_functions[index].ip, 0, ip);
if (old != 0) {
/* Did something else already added this for us? */
if (old == ip)
return;
/* Try the next location (use i for the next index) */
index++;
goto again;
}

recursed_functions[index].parent_ip = parent_ip;

/*
 * It's still possible that we could race with the clearing
 *CPU0CPU1
 *
 *   ip = func
 *  nr_records = -1;
 *  recursed_functions[0] = 0;
 *   i = -1
 *   if (i < 0)
 *  nr_records = 0;
 *  (new recursion detected)
 *  recursed_functions[0] = func
 *
cmpxchg(recursed_functions[0],
 *func, 0)
 *
 * But the worse that could happen is that we get a zero in
 * the recursed_functions array, and it's likely that "func" will
 * be recorded again.
 */
i = atomic_read(_records);
smp_mb__after_atomic();
if (i < 0)
cmpxchg(_functions[index].ip, ip, 0);
else if (i <= index)
atomic_cmpxchg(_records, i, index + 1);
}


Re: [PATCH 11/11 v2] ftrace: Add recording of functions that caused recursion

2020-11-02 Thread Steven Rostedt
On Mon, 2 Nov 2020 17:41:47 +0100
Petr Mladek  wrote:

> > +   i = atomic_read(_records);
> > +   smp_mb__after_atomic();
> > +   if (i < 0)
> > +   cmpxchg(_functions[index].ip, ip, 0);
> > +   else if (i <= index)
> > +   atomic_cmpxchg(_records, i, index + 1);  
> 
> This looks weird. It would shift nr_records past the record added
> in this call. It might skip many slots that were zeroed when clearing.
> Also we do not know if our entry was not zeroed as well.

nr_records always holds the next position to write to.

index = nr_records;
recursed_functions[index].ip = ip;
nr_records++;

Before clearing, we have:

nr_records = -1;
smp_mb();
memset(recursed_functions, 0);
smp_wmb();
nr_records = 0;

When we enter this function:

i = nr_records;
smp_mb();
if (i < 0)
return;


Thus, we just stopped all new updates while clearing the records.

But what about if something is currently updating?

i = nr_records;
smp_mb();
if (i < 0)
cmpxchg(recursed_functions, ip, 0);

The above shows that if the current updating process notices that the
clearing happens, it will clear the function it added.

else if (i <= index)
cmpxchg(nr_records, i, index + 1);

This makes sure that nr_records only grows if it is greater or equal to
zero.

The only race that I see that can happen, is the one in the comment I
showed. And that is after enabling the recursed functions again after
clearing, one CPU could add a function while another CPU that just added
that same function could be just exiting this routine, notice that a
clearing of the array happened, and remove its function (which was the same
as the one just happened). So we get a "zero" in the array. If this
happens, it is likely that that function will recurse again and will be
added later.

-- Steve


Re: [PATCH 11/11 v2] ftrace: Add recording of functions that caused recursion

2020-11-02 Thread Steven Rostedt
On Mon, 2 Nov 2020 12:09:07 -0500
Steven Rostedt  wrote:

> > > +void ftrace_record_recursion(unsigned long ip, unsigned long parent_ip)
> > > +{
> > > + int index;
> > > + int i = 0;
> > > + unsigned long old;
> > > +
> > > + again:
> > > + /* First check the last one recorded */
> > > + if (ip == cached_function)
> > > + return;
> > > +
> > > + index = atomic_read(_records);
> > > + /* nr_records is -1 when clearing records */
> > > + smp_mb__after_atomic();
> > > + if (index < 0)
> > > + return;
> > > +
> > > + /* See below */
> > > + if (i > index)
> > > + index = i;
> > 
> > This looks like a complicated way to do index++ via "i" variable.
> > I guess that it was needed only in some older variant of the code.
> > See below.  
> 
> Because we reread the index above, and index could be bigger than i (more
> than index + 1).
> 
> >   
> > > + if (index >= CONFIG_FTRACE_RECORD_RECURSION_SIZE)
> > > + return;
> > > +
> > > + for (i = index - 1; i >= 0; i--) {
> > > + if (recursed_functions[i].ip == ip) {
> > > + cached_function = ip;
> > > + return;
> > > + }
> > > + }
> > > +
> > > + cached_function = ip;
> > > +
> > > + /*
> > > +  * We only want to add a function if it hasn't been added before.
> > > +  * Add to the current location before incrementing the count.
> > > +  * If it fails to add, then increment the index (save in i)
> > > +  * and try again.
> > > +  */
> > > + old = cmpxchg(_functions[index].ip, 0, ip);
> > > + if (old != 0) {
> > > + /* Did something else already added this for us? */
> > > + if (old == ip)
> > > + return;
> > > + /* Try the next location (use i for the next index) */
> > > + i = index + 1;
> > 
> > What about
> > 
> > index++;
> > 
> > We basically want to run the code again with index + 1 limit.  
> 
> But something else could update nr_records, and we want to use that if
> nr_records is greater than i.
> 
> Now, we could swap the use case, and have
> 
>   int index = 0;
> 
>   [..]
>   i = atomic_read(_records);
>   if (i > index)
>   index = i;
> 
>   [..]
> 
>   index++;
>   goto again;
> 
> 
> > 
> > Maybe, it even does not make sense to check the array again
> > and we should just try to store the value into the next slot.  
> 
> We do this dance to prevent duplicates.
> 
> But you are correct, that this went through a few iterations. And the first
> ones didn't have the cmpxchg on the ip itself, and that could make it so
> that we don't need this index = i dance.

Playing with this more, I remember why I did this song and dance.

If we have two or more writers, and one beats the other in updating the ip
(with a different function). This one will go and try again. The reason to
look at one passed nr_records, is because of the race between the multiple
writers. This one may loop before the other can update nr_records, and it
will fail to apply it again.

You could just say, "hey we'll just keep looping until the other writer
eventually updates nr_records". But this is where my paranoia gets in. What
happens if that other writer takes an interrupt (interrupts are not
disabled), and then deadlocks, or does something bad? This CPU will not get
locked up spinning.

Unlikely scenario, and it would require a bug someplace else. But I don't
want a bug report stating that it found this recursion locking locking up
the CPU and hide the real culprit.

I'll add a comment to explain this in the code. And also swap the i and
index around to make a little more sense.

-- Steve


Re: [PATCH 11/11 v2] ftrace: Add recording of functions that caused recursion

2020-11-02 Thread Steven Rostedt
On Mon, 2 Nov 2020 17:41:47 +0100
Petr Mladek  wrote:

> On Fri 2020-10-30 17:31:53, Steven Rostedt wrote:
> > From: "Steven Rostedt (VMware)" 
> > 
> > This adds CONFIG_FTRACE_RECORD_RECURSION that will record to a file
> > "recursed_functions" all the functions that caused recursion while a
> > callback to the function tracer was running.
> >   
> 
> > --- /dev/null
> > +++ b/kernel/trace/trace_recursion_record.c
> > @@ -0,0 +1,220 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +
> > +#include "trace_output.h"
> > +
> > +struct recursed_functions {
> > +   unsigned long   ip;
> > +   unsigned long   parent_ip;
> > +};
> > +
> > +static struct recursed_functions 
> > recursed_functions[CONFIG_FTRACE_RECORD_RECURSION_SIZE];  
> 
> The code tries to be lockless safe as much as possible. It would make
> sense to allign the array.

Hmm, is there an arch where the compiler would put an array of structures
with two unsigned long, misaligned?

> 
> 
> > +static atomic_t nr_records;
> > +
> > +/*
> > + * Cache the last found function. Yes, updates to this is racey, but
> > + * so is memory cache ;-)
> > + */
> > +static unsigned long cached_function;
> > +
> > +void ftrace_record_recursion(unsigned long ip, unsigned long parent_ip)
> > +{
> > +   int index;
> > +   int i = 0;
> > +   unsigned long old;
> > +
> > + again:
> > +   /* First check the last one recorded */
> > +   if (ip == cached_function)
> > +   return;
> > +
> > +   index = atomic_read(_records);
> > +   /* nr_records is -1 when clearing records */
> > +   smp_mb__after_atomic();
> > +   if (index < 0)
> > +   return;
> > +
> > +   /* See below */
> > +   if (i > index)
> > +   index = i;  
> 
> This looks like a complicated way to do index++ via "i" variable.
> I guess that it was needed only in some older variant of the code.
> See below.

Because we reread the index above, and index could be bigger than i (more
than index + 1).

> 
> > +   if (index >= CONFIG_FTRACE_RECORD_RECURSION_SIZE)
> > +   return;
> > +
> > +   for (i = index - 1; i >= 0; i--) {
> > +   if (recursed_functions[i].ip == ip) {
> > +   cached_function = ip;
> > +   return;
> > +   }
> > +   }
> > +
> > +   cached_function = ip;
> > +
> > +   /*
> > +* We only want to add a function if it hasn't been added before.
> > +* Add to the current location before incrementing the count.
> > +* If it fails to add, then increment the index (save in i)
> > +* and try again.
> > +*/
> > +   old = cmpxchg(_functions[index].ip, 0, ip);
> > +   if (old != 0) {
> > +   /* Did something else already added this for us? */
> > +   if (old == ip)
> > +   return;
> > +   /* Try the next location (use i for the next index) */
> > +   i = index + 1;  
> 
> What about
> 
>   index++;
> 
> We basically want to run the code again with index + 1 limit.

But something else could update nr_records, and we want to use that if
nr_records is greater than i.

Now, we could swap the use case, and have

int index = 0;

[..]
i = atomic_read(_records);
if (i > index)
index = i;

[..]

index++;
goto again;


> 
> Maybe, it even does not make sense to check the array again
> and we should just try to store the value into the next slot.

We do this dance to prevent duplicates.

But you are correct, that this went through a few iterations. And the first
ones didn't have the cmpxchg on the ip itself, and that could make it so
that we don't need this index = i dance.

> 
> > +   goto again;
> > +   }
> > +
> > +   recursed_functions[index].parent_ip = parent_ip;  
> 
> WRITE_ONCE() ?

Does it really matter?

> 
> > +
> > +   /*
> > +* It's still possible that we could race with the clearing
> > +*CPU0CPU1
> > +*
> > +*   ip = func
> > +*  nr_records = -1;
> > +*  recursed_functions[0] = 0;
> > +*   i = -1
> > +*   if (i < 0)
> > +*  nr_records = 0;
> > +*  (new recursion detected)
> > +*  recursed_functions[0] = func
> > +*
> > cmpxchg(recursed_functions[0],
> > +*func, 0)
> > +*
> > +* But the worse that could happen is that we get a zero in
> > +* the recursed_functions array, and it's likely that "func" will
> > +* be recorded again.
> > +*/
> > +   i = atomic_read(_records);
> > +   smp_mb__after_atomic();
> > +   if (i < 0)
> > +   cmpxchg(_functions[index].ip, ip, 0);
> > +   else if (i <= 

Re: [PATCH 11/11 v2] ftrace: Add recording of functions that caused recursion

2020-11-02 Thread Petr Mladek
On Fri 2020-10-30 17:31:53, Steven Rostedt wrote:
> From: "Steven Rostedt (VMware)" 
> 
> This adds CONFIG_FTRACE_RECORD_RECURSION that will record to a file
> "recursed_functions" all the functions that caused recursion while a
> callback to the function tracer was running.
> 

> --- /dev/null
> +++ b/kernel/trace/trace_recursion_record.c
> @@ -0,0 +1,220 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include "trace_output.h"
> +
> +struct recursed_functions {
> + unsigned long   ip;
> + unsigned long   parent_ip;
> +};
> +
> +static struct recursed_functions 
> recursed_functions[CONFIG_FTRACE_RECORD_RECURSION_SIZE];

The code tries to be lockless safe as much as possible. It would make
sense to allign the array.


> +static atomic_t nr_records;
> +
> +/*
> + * Cache the last found function. Yes, updates to this is racey, but
> + * so is memory cache ;-)
> + */
> +static unsigned long cached_function;
> +
> +void ftrace_record_recursion(unsigned long ip, unsigned long parent_ip)
> +{
> + int index;
> + int i = 0;
> + unsigned long old;
> +
> + again:
> + /* First check the last one recorded */
> + if (ip == cached_function)
> + return;
> +
> + index = atomic_read(_records);
> + /* nr_records is -1 when clearing records */
> + smp_mb__after_atomic();
> + if (index < 0)
> + return;
> +
> + /* See below */
> + if (i > index)
> + index = i;

This looks like a complicated way to do index++ via "i" variable.
I guess that it was needed only in some older variant of the code.
See below.

> + if (index >= CONFIG_FTRACE_RECORD_RECURSION_SIZE)
> + return;
> +
> + for (i = index - 1; i >= 0; i--) {
> + if (recursed_functions[i].ip == ip) {
> + cached_function = ip;
> + return;
> + }
> + }
> +
> + cached_function = ip;
> +
> + /*
> +  * We only want to add a function if it hasn't been added before.
> +  * Add to the current location before incrementing the count.
> +  * If it fails to add, then increment the index (save in i)
> +  * and try again.
> +  */
> + old = cmpxchg(_functions[index].ip, 0, ip);
> + if (old != 0) {
> + /* Did something else already added this for us? */
> + if (old == ip)
> + return;
> + /* Try the next location (use i for the next index) */
> + i = index + 1;

What about

index++;

We basically want to run the code again with index + 1 limit.

Maybe, it even does not make sense to check the array again
and we should just try to store the value into the next slot.

> + goto again;
> + }
> +
> + recursed_functions[index].parent_ip = parent_ip;

WRITE_ONCE() ?

> +
> + /*
> +  * It's still possible that we could race with the clearing
> +  *CPU0CPU1
> +  *
> +  *   ip = func
> +  *  nr_records = -1;
> +  *  recursed_functions[0] = 0;
> +  *   i = -1
> +  *   if (i < 0)
> +  *  nr_records = 0;
> +  *  (new recursion detected)
> +  *  recursed_functions[0] = func
> +  *
> cmpxchg(recursed_functions[0],
> +  *func, 0)
> +  *
> +  * But the worse that could happen is that we get a zero in
> +  * the recursed_functions array, and it's likely that "func" will
> +  * be recorded again.
> +  */
> + i = atomic_read(_records);
> + smp_mb__after_atomic();
> + if (i < 0)
> + cmpxchg(_functions[index].ip, ip, 0);
> + else if (i <= index)
> + atomic_cmpxchg(_records, i, index + 1);

This looks weird. It would shift nr_records past the record added
in this call. It might skip many slots that were zeroed when clearing.
Also we do not know if our entry was not zeroed as well.

I would suggest to do it some other way (not even compile tested):

void ftrace_record_recursion(unsigned long ip, unsigned long parent_ip)
{
int index, old_index;
int i = 0;
unsigned long old_ip;

 again:
/* First check the last one recorded. */
if (ip == READ_ONCE(cached_function))
return;

index = atomic_read(_records);
/* nr_records is -1 when clearing records. */
smp_mb__after_atomic();
if (index < 0)
return;

/* Already cached? */
for (i = index - 1; i >= 0; i--) {
if (recursed_functions[i].ip == ip) {
WRITE_ONCE(cached_function, ip);
return;