On Mon, 14 Sep 2020 10:00:50 +0530
Gaurav Kohli <gko...@codeaurora.org> wrote:

> Hi Steven,
> 
> Please let us know if the change below looks good,
> or suggest some other way to solve this.
> 
> Thanks,
> Gaurav
> 
> 

Hmm, for some reason, I don't see this in my INBOX, but it shows up in my
LKML folder. :-/


> 
> On 9/4/2020 11:39 AM, Gaurav Kohli wrote:
> > The race below can occur if trace_open and a resize of the
> > cpu buffer run in parallel on different cpus:
> > CPUX                                CPUY
> >                                 ring_buffer_resize
> >                                 atomic_read(&buffer->resize_disabled)
> > tracing_open
> > tracing_reset_online_cpus
> > ring_buffer_reset_cpu
> > rb_reset_cpu
> >                                 rb_update_pages
> >                                 remove/insert pages
> > resetting pointer
> > This race can cause a data abort or sometimes an infinite loop in
> > rb_remove_pages and rb_insert_pages while checking pages
> > for sanity.
> > Take the ring buffer lock in trace_open to avoid resetting the cpu
> > buffer during a resize.
> > 
> > Signed-off-by: Gaurav Kohli <gko...@codeaurora.org>
> > 
> > diff --git a/include/linux/ring_buffer.h b/include/linux/ring_buffer.h
> > index 136ea09..55f9115 100644
> > --- a/include/linux/ring_buffer.h
> > +++ b/include/linux/ring_buffer.h
> > @@ -163,6 +163,8 @@ bool ring_buffer_empty_cpu(struct trace_buffer *buffer, int cpu);
> >   
> >   void ring_buffer_record_disable(struct trace_buffer *buffer);
> >   void ring_buffer_record_enable(struct trace_buffer *buffer);
> > +void ring_buffer_mutex_acquire(struct trace_buffer *buffer);
> > +void ring_buffer_mutex_release(struct trace_buffer *buffer);
> >   void ring_buffer_record_off(struct trace_buffer *buffer);
> >   void ring_buffer_record_on(struct trace_buffer *buffer);
> >   bool ring_buffer_record_is_on(struct trace_buffer *buffer);
> > diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
> > index 93ef0ab..638ec8f 100644
> > --- a/kernel/trace/ring_buffer.c
> > +++ b/kernel/trace/ring_buffer.c
> > @@ -3632,6 +3632,25 @@ void ring_buffer_record_enable(struct trace_buffer *buffer)
> >   EXPORT_SYMBOL_GPL(ring_buffer_record_enable);
> >   
> >   /**
> > + * ring_buffer_mutex_acquire - prevent resetting of buffer
> > + * during resize
> > + */
> > +void ring_buffer_mutex_acquire(struct trace_buffer *buffer)
> > +{
> > +   mutex_lock(&buffer->mutex);
> > +}
> > +EXPORT_SYMBOL_GPL(ring_buffer_mutex_acquire);
> > +
> > +/**
> > + * ring_buffer_mutex_release - allow resetting of buffer
> > + * again after ring_buffer_mutex_acquire
> > + */
> > +void ring_buffer_mutex_release(struct trace_buffer *buffer)
> > +{
> > +   mutex_unlock(&buffer->mutex);
> > +}
> > +EXPORT_SYMBOL_GPL(ring_buffer_mutex_release);

I really do not like to export these.
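
A minimal userspace sketch of the alternative, assuming the aim is to keep
the lock private to the buffer code: the reset path takes the mutex itself,
so callers never see acquire/release helpers. This is not kernel code. It
uses pthreads, and every name in it (toy_buffer, toy_buffer_resize,
toy_buffer_reset, toy_demo) is hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a ring buffer; not the kernel structure. */
struct toy_buffer {
	pthread_mutex_t mutex;	/* plays the role of buffer->mutex */
	size_t nr_pages;	/* changed by resize */
	size_t head;		/* cleared by reset */
};

static void toy_buffer_init(struct toy_buffer *b, size_t nr_pages)
{
	pthread_mutex_init(&b->mutex, NULL);
	b->nr_pages = nr_pages;
	b->head = 0;
}

/* Resize under the mutex and keep head inside the new page count. */
static void toy_buffer_resize(struct toy_buffer *b, size_t nr_pages)
{
	pthread_mutex_lock(&b->mutex);
	b->nr_pages = nr_pages;
	if (b->head >= nr_pages)
		b->head = 0;
	assert(b->head < b->nr_pages);
	pthread_mutex_unlock(&b->mutex);
}

/* Reset takes the same mutex internally; no lock helper is exposed. */
static void toy_buffer_reset(struct toy_buffer *b)
{
	pthread_mutex_lock(&b->mutex);
	b->head = 0;
	pthread_mutex_unlock(&b->mutex);
}

static void *resize_thread(void *arg)
{
	struct toy_buffer *b = arg;

	for (int i = 0; i < 10000; i++)
		toy_buffer_resize(b, (i % 2) ? 4 : 8);
	return NULL;
}

static void *reset_thread(void *arg)
{
	struct toy_buffer *b = arg;

	for (int i = 0; i < 10000; i++)
		toy_buffer_reset(b);
	return NULL;
}

/* Run resize and reset concurrently; return 0 if the invariant held. */
int toy_demo(void)
{
	struct toy_buffer b;
	pthread_t resizer, resetter;
	int ok;

	toy_buffer_init(&b, 8);
	if (pthread_create(&resizer, NULL, resize_thread, &b))
		return -1;
	if (pthread_create(&resetter, NULL, reset_thread, &b)) {
		pthread_join(resizer, NULL);
		return -1;
	}
	pthread_join(resizer, NULL);
	pthread_join(resetter, NULL);

	ok = b.head < b.nr_pages;
	pthread_mutex_destroy(&b.mutex);
	return ok ? 0 : 1;
}
```

Calling toy_demo() runs both threads to completion. Because the two paths
serialize on the same mutex, a reset never runs in the middle of a resize,
and the locking stays an internal detail of the buffer code.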

> > +/**
> >    * ring_buffer_record_off - stop all writes into the buffer
> >    * @buffer: The ring buffer to stop writes to.
> >    *
> > @@ -4918,6 +4937,8 @@ void ring_buffer_reset(struct trace_buffer *buffer)
> >     struct ring_buffer_per_cpu *cpu_buffer;
> >     int cpu;
> >   
> > +   /* prevent another thread from changing buffer sizes */
> > +   mutex_lock(&buffer->mutex);
> >     for_each_buffer_cpu(buffer, cpu) {
> >             cpu_buffer = buffer->buffers[cpu];
> >   
> > @@ -4936,6 +4957,7 @@ void ring_buffer_reset(struct trace_buffer *buffer)
> >             atomic_dec(&cpu_buffer->record_disabled);
> >             atomic_dec(&cpu_buffer->resize_disabled);
> >     }
> > +   mutex_unlock(&buffer->mutex);
> >   }
> >   EXPORT_SYMBOL_GPL(ring_buffer_reset);
> >   
> > diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> > index f40d850..392e9aa 100644
> > --- a/kernel/trace/trace.c
> > +++ b/kernel/trace/trace.c
> > @@ -2006,6 +2006,8 @@ void tracing_reset_online_cpus(struct array_buffer *buf)
> >     if (!buffer)
> >             return;
> >   
> > +   ring_buffer_mutex_acquire(buffer);
> > +
> >     ring_buffer_record_disable(buffer);

Hmm, why do we disable here as it gets disabled again in the call to
ring_buffer_reset_online_cpus()? Perhaps we don't need to disable the
buffer here. The only difference is that we have:

 buf->time_start = buffer_ftrace_now(buf, buf->cpu);

And that the above disables the entire buffer, whereas the reset only
disables the individual per-CPU buffers.

But I don't think that will make any difference.

-- Steve


> >   
> >     /* Make sure all commits have finished */
> > @@ -2016,6 +2018,8 @@ void tracing_reset_online_cpus(struct array_buffer *buf)
> >     ring_buffer_reset_online_cpus(buffer);
> >   
> >     ring_buffer_record_enable(buffer);
> > +
> > +   ring_buffer_mutex_release(buffer);
> >   }
> >   
> >   /* Must have trace_types_lock held */
> >   
> 
