On Tue, May 24, 2016 at 03:52:28PM -0700, Andy Lutomirski wrote:
> Currently, the trace_printk code chooses which static buffer to use based
> on what type of atomic context (NMI, IRQ, etc) it's in.  Simplify the
> code and make it more robust: simply count the nesting depth and choose
> a buffer based on the current nesting depth.
> 
> The new code will only drop an event if we nest more than 4 deep,
> and the old code was guaranteed to malfunction if that happened.
> 
> Signed-off-by: Andy Lutomirski <[email protected]>
> ---
>  kernel/trace/trace.c | 83 +++++++++++++++-------------------------------------
>  1 file changed, 24 insertions(+), 59 deletions(-)
> 
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index a2f0b9f33e9b..4508f3bf4a97 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -1986,83 +1986,41 @@ static void __trace_userstack(struct trace_array *tr, unsigned long flags)
>  
>  /* created for use with alloc_percpu */
>  struct trace_buffer_struct {
> -     char buffer[TRACE_BUF_SIZE];
> +     int nesting;
> +     char buffer[4][TRACE_BUF_SIZE];
>  };
>  
>  static struct trace_buffer_struct *trace_percpu_buffer;
>  /*
> + * This allows for lockless recording.  If we're nested too deeply, then
> + * this returns NULL.
>   */
>  static char *get_trace_buf(void)
>  {
> +     struct trace_buffer_struct *buffer = this_cpu_ptr(trace_percpu_buffer);
>  
> +     if (!buffer || buffer->nesting >= 4)
>               return NULL;

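This is buggy, fwiw: buffer->nesting has to be incremented unconditionally
to match the unconditional decrement in put_trace_buf().

Otherwise five nested get_trace_buf() calls (the fifth returns NULL without
incrementing) followed by five put_trace_buf() calls land you at -1.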
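
To make the imbalance concrete, here is a rough userspace sketch, not kernel code. It assumes every get is paired with a put even when the get returned NULL. A plain static struct stands in for the per-CPU data, and MAX_NESTING, get_buf_conditional(), get_buf_balanced(), put_buf(), nest_and_unwind() and nesting_demo() are names made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define TRACE_BUF_SIZE 1024	/* illustrative size, not the kernel's value */
#define MAX_NESTING 4

/* Userspace stand-in for the per-CPU struct in the quoted patch. */
struct trace_buffer_struct {
	int nesting;
	char buffer[MAX_NESTING][TRACE_BUF_SIZE];
};

static struct trace_buffer_struct demo_buf;

/* As in the quoted patch: the counter only moves when a buffer is handed out. */
char *get_buf_conditional(struct trace_buffer_struct *b)
{
	if (b->nesting >= MAX_NESTING)
		return NULL;
	return &b->buffer[b->nesting++][0];
}

/* Variant: always count the entry, so an unconditional put stays balanced. */
char *get_buf_balanced(struct trace_buffer_struct *b)
{
	int level = b->nesting++;

	if (level >= MAX_NESTING)
		return NULL;
	return &b->buffer[level][0];
}

void put_buf(struct trace_buffer_struct *b)
{
	b->nesting--;
}

/* Nest `depth` deep, then call put once per get regardless of the result. */
int nest_and_unwind(char *(*get)(struct trace_buffer_struct *), int depth)
{
	struct trace_buffer_struct *b = &demo_buf;
	int i;

	b->nesting = 0;
	for (i = 0; i < depth; i++)
		(void)get(b);
	for (i = 0; i < depth; i++)
		put_buf(b);
	return b->nesting;
}

void nesting_demo(void)
{
	/* Five deep with the conditional increment: counter underflows. */
	assert(nest_and_unwind(get_buf_conditional, 5) == -1);
	/* Five deep with the unconditional increment: back to zero. */
	assert(nest_and_unwind(get_buf_balanced, 5) == 0);
}
```

The other way to keep the counter balanced would be to skip the put whenever the get returned NULL; either works as long as the two sides agree.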

>  
> +     return &buffer->buffer[buffer->nesting++][0];
> +}
> +
> +static void put_trace_buf(void)
> +{
> +     this_cpu_dec(trace_percpu_buffer->nesting);
>  }

So I don't know about tracing; but for perf this construct would not
work 'properly'.

The per context counter -- which is lost in this scheme -- guards
against in-context recursion.

Only if we nest from another context do we allow generation of a new
event.
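
For reference, a simplified userspace sketch of the per-context guard I mean. It is only loosely modeled on what perf does: the context is passed in explicitly instead of being derived from the preempt count, a single array stands in for the per-CPU counters, and enum demo_ctx, demo_get_recursion_context(), demo_put_recursion_context() and recursion_demo() are hypothetical names for this example only.

```c
#include <assert.h>

/* One slot per context type; the real thing would keep these per CPU. */
enum demo_ctx { CTX_TASK, CTX_SOFTIRQ, CTX_IRQ, CTX_NMI, CTX_MAX };

static int demo_recursion[CTX_MAX];

/*
 * Claim the slot for context `c`.  Returns the slot index on success,
 * or -1 if this context is already generating an event (recursion).
 */
int demo_get_recursion_context(enum demo_ctx c)
{
	if (demo_recursion[c])
		return -1;
	demo_recursion[c]++;
	return (int)c;
}

/* Release a slot previously returned by demo_get_recursion_context(). */
void demo_put_recursion_context(int rctx)
{
	demo_recursion[rctx]--;
}

void recursion_demo(void)
{
	int task, irq;

	task = demo_get_recursion_context(CTX_TASK);
	assert(task == CTX_TASK);

	/* Same context again: refused, no new event. */
	assert(demo_get_recursion_context(CTX_TASK) == -1);

	/* An interrupt nesting on top of the task context: allowed. */
	irq = demo_get_recursion_context(CTX_IRQ);
	assert(irq == CTX_IRQ);

	demo_put_recursion_context(irq);
	demo_put_recursion_context(task);
}
```

A plain depth counter cannot tell the second case from the third: both look like "depth 2", so in-context recursion would get a buffer too.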
