On Mon, 14 Sep 2020 10:00:50 +0530 Gaurav Kohli <gko...@codeaurora.org> wrote:
> Hi Steven, > > Please let us know, if below change looks good. > Or let us know some other way to solve this. > > Thanks, > Gaurav > > Hmm, for some reason, I don't see this in my INBOX, but it shows up in my LKML folder. :-/ > > On 9/4/2020 11:39 AM, Gaurav Kohli wrote: > > Below race can come, if trace_open and resize of > > cpu buffer is running parallely on different cpus > > CPUX CPUY > > ring_buffer_resize > > atomic_read(&buffer->resize_disabled) > > tracing_open > > tracing_reset_online_cpus > > ring_buffer_reset_cpu > > rb_reset_cpu > > rb_update_pages > > remove/insert pages > > resetting pointer > > This race can cause data abort or some times infinte loop in > > rb_remove_pages and rb_insert_pages while checking pages > > for sanity. > > Take ring buffer lock in trace_open to avoid resetting of cpu buffer. > > > > Signed-off-by: Gaurav Kohli <gko...@codeaurora.org> > > > > diff --git a/include/linux/ring_buffer.h b/include/linux/ring_buffer.h > > index 136ea09..55f9115 100644 > > --- a/include/linux/ring_buffer.h > > +++ b/include/linux/ring_buffer.h > > @@ -163,6 +163,8 @@ bool ring_buffer_empty_cpu(struct trace_buffer *buffer, > > int cpu); > > > > void ring_buffer_record_disable(struct trace_buffer *buffer); > > void ring_buffer_record_enable(struct trace_buffer *buffer); > > +void ring_buffer_mutex_acquire(struct trace_buffer *buffer); > > +void ring_buffer_mutex_release(struct trace_buffer *buffer); > > void ring_buffer_record_off(struct trace_buffer *buffer); > > void ring_buffer_record_on(struct trace_buffer *buffer); > > bool ring_buffer_record_is_on(struct trace_buffer *buffer); > > diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c > > index 93ef0ab..638ec8f 100644 > > --- a/kernel/trace/ring_buffer.c > > +++ b/kernel/trace/ring_buffer.c > > @@ -3632,6 +3632,25 @@ void ring_buffer_record_enable(struct trace_buffer > > *buffer) > > EXPORT_SYMBOL_GPL(ring_buffer_record_enable); > > > > /** > > + * ring_buffer_mutex_acquire - prevent resetting of buffer > > + * during resize > > + */ > > +void ring_buffer_mutex_acquire(struct trace_buffer *buffer) > > +{ > > + mutex_lock(&buffer->mutex); > > +} > > +EXPORT_SYMBOL_GPL(ring_buffer_mutex_acquire); > > + > > +/** > > + * ring_buffer_mutex_release - prevent resetting of buffer > > + * during resize > > + */ > > +void ring_buffer_mutex_release(struct trace_buffer *buffer) > > +{ > > + mutex_unlock(&buffer->mutex); > > +} > > +EXPORT_SYMBOL_GPL(ring_buffer_mutex_release); I really do not like to export these. > > +/** > > * ring_buffer_record_off - stop all writes into the buffer > > * @buffer: The ring buffer to stop writes to. > > * > > @@ -4918,6 +4937,8 @@ void ring_buffer_reset(struct trace_buffer *buffer) > > struct ring_buffer_per_cpu *cpu_buffer; > > int cpu; > > > > + /* prevent another thread from changing buffer sizes */ > > + mutex_lock(&buffer->mutex); > > for_each_buffer_cpu(buffer, cpu) { > > cpu_buffer = buffer->buffers[cpu]; > > > > @@ -4936,6 +4957,7 @@ void ring_buffer_reset(struct trace_buffer *buffer) > > atomic_dec(&cpu_buffer->record_disabled); > > atomic_dec(&cpu_buffer->resize_disabled); > > } > > + mutex_unlock(&buffer->mutex); > > } > > EXPORT_SYMBOL_GPL(ring_buffer_reset); > > > > diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c > > index f40d850..392e9aa 100644 > > --- a/kernel/trace/trace.c > > +++ b/kernel/trace/trace.c > > @@ -2006,6 +2006,8 @@ void tracing_reset_online_cpus(struct array_buffer > > *buf) > > if (!buffer) > > return; > > > > + ring_buffer_mutex_acquire(buffer); > > + > > ring_buffer_record_disable(buffer); Hmm, why do we disable here as it gets disabled again in the call to ring_buffer_reset_online_cpus()? Perhaps we don't need to disable the buffer here. The only difference is that we have: buf->time_start = buffer_ftrace_now(buf, buf->cpu); And that the above disables the entire buffer, whereas the reset only resets individual ones. But I don't think that will make any difference. -- Steve > > > > /* Make sure all commits have finished */ > > @@ -2016,6 +2018,8 @@ void tracing_reset_online_cpus(struct array_buffer > > *buf) > > ring_buffer_reset_online_cpus(buffer); > > > > ring_buffer_record_enable(buffer); > > + > > + ring_buffer_mutex_release(buffer); > > } > > > > /* Must have trace_types_lock held */ > > >