Re: [PATCH] ring-buffer: Race when writing and swapping cpu buffer in parallel
On Thu 2014-06-26 20:55:00, Steven Rostedt wrote: > On Thu, 26 Jun 2014 09:58:31 -0400 > Steven Rostedt wrote: > > > What we can do is force ring_buffer_swap_cpu() to only work for the CPU > > that it is on. As we have snapshot in per_cpu buffers, to make that > > work, we will need to change the per_cpu version of snapshot to do a > > smp_call_function_single() to the CPU that it wants to take a snapshot > > of, and run the swap there. > > > > To force this, we can remove the cpu parameter from the > > ring_buffer_swap_cpu(). By doing this, we may be able to remove some of > > the CONFIG_RING_BUFFER_ALLOW_SWAP hacks too! > > > > I'm not going to sacrifice the general performance of the ring buffer > > for a feature that is seldom (if ever) used. Fair enough. I see the point. > Did you want to do the above, or do you want me to write something up? I could look at it the following week. When I think about it, the race is not that critical. In the worst case, it slightly modifies the swapped buffer but there should not be danger of any data corruption. Anyway, I will try to get rid of it the way you suggested. It might be more important when the ringbuffer has more users. Best Regards, Petr -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] ring-buffer: Race when writing and swapping cpu buffer in parallel
On Thu 2014-06-26 20:55:00, Steven Rostedt wrote: On Thu, 26 Jun 2014 09:58:31 -0400 Steven Rostedt rost...@goodmis.org wrote: What we can do is force ring_buffer_swap_cpu() to only work for the CPU that it is on. As we have snapshot in per_cpu buffers, to make that work, we will need to change the per_cpu version of snapshot to do a smp_call_function_single() to the CPU that it wants to take a snapshot of, and run the swap there. To force this, we can remove the cpu parameter from the ring_buffer_swap_cpu(). By doing this, we may be able to remove some of the CONFIG_RING_BUFFER_ALLOW_SWAP hacks too! I'm not going to sacrifice the general performance of the ring buffer for a feature that is seldom (if ever) used. Fair enough. I see the point. Did you want to do the above, or do you want me to write something up? I could look at it the following week. When I think about it, the race is not that critical. In the worst case, it slightly modifies the swapped buffer but there should not be danger of any data corruption. Anyway, I will try to get rid of it the way you suggested. It might be more important when the ringbuffer has more users. Best Regards, Petr -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] ring-buffer: Race when writing and swapping cpu buffer in parallel
On Thu, 26 Jun 2014 09:58:31 -0400 Steven Rostedt wrote: > What we can do is force ring_buffer_swap_cpu() to only work for the CPU > that it is on. As we have snapshot in per_cpu buffers, to make that > work, we will need to change the per_cpu version of snapshot to do a > smp_call_function_single() to the CPU that it wants to take a snapshot > of, and run the swap there. > > To force this, we can remove the cpu parameter from the > ring_buffer_swap_cpu(). By doing this, we may be able to remove some of > the CONFIG_RING_BUFFER_ALLOW_SWAP hacks too! > > I'm not going to sacrifice the general performance of the ring buffer > for a feature that is seldom (if ever) used. Did you want to do the above, or do you want me to write something up? -- Steve -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] ring-buffer: Race when writing and swapping cpu buffer in parallel
On Thu, 26 Jun 2014 15:22:38 +0200 Petr Mladek wrote: > The trace/ring_buffer allows to swap the entire ring buffer. Everything has to > be done lockless. I think that I have found a race when trying to understand > the code. The problematic situation is the following: > > CPU 1 (write/reserve event) CPU 2 (swap the cpu buffer) > - > ring_buffer_write() > if (cpu_buffer->record_disabled) > ^^^ test fails and write continues > > ring_buffer_swap_cpu() This is a per cpu swap, not a full ring buffer swap. > > inc(_buffer_a->record_disabled); > inc(_buffer_b->record_disabled); > > if (cpu_buffer_a->committing) > ^^^ test fails and swap continues > if (cpu_buffer_b->committing) > ^^^ test fails and swap continues Grumble, this was originally written such that the swap can only be done on the CPU that it is running on. > > rb_reserve_next_event() > rb_start_commit() > local_inc(_buffer->committing); > local_inc(_buffer->commits); > > if (unlikely(ACCESS_ONCE(cpu_buffer->buffer) != buffer)) { > ^^^ test fails and reserve_next continues > > buffer_a->buffers[cpu] = cpu_buffer_b; > buffer_b->buffers[cpu] = cpu_buffer_a; > cpu_buffer_b->buffer = buffer_a; > cpu_buffer_a->buffer = buffer_b; > > Ph, reservation continues in the removed buffer. > > This can be solved by a better check in rb_reserve_next_event(). The > reservation > could continue only when "committing" is enabled and there is no swap in > progress or when any swap was not finished in the meantime. > > Note that the solution is not perfect. It stops writing also in this > situation: > > CPU 1 (write/reserve event) CPU 2 (swap the cpu buffer) > > ring_buffer_write() > if (cpu_buffer->record_disabled) > ^^^ test fails and write continues > > ring_buffer_swap_cpu() > > inc(_buffer_a->record_disabled); > inc(_buffer_b->record_disabled); > > rb_reserve_next_event() > rb_start_commit() > local_inc(_buffer->committing); > local_inc(_buffer->commits); > > if (cpu_buffer_a->committing) > ^^^ test passes and swap is canceled > > if (cpu_buffer->record_disabled) > ^^^ test passes and reserve is canceled > > dec(_buffer_a->record_disabled); > dec(_buffer_b->record_disabled); > > Ph, both actions were canceled. The write/reserve could have continued > if the recording was enabled in time. > > Well, it is the price for using lockless algorithms. I think that it happens > in more situations here, it is not worth more complicated code, and we could > live with it. > > The patch also adds some missing memory barriers. Note that compiler barriers > are not enough because the data can be accessed by different CPUs. > > Signed-off-by: Petr Mladek > --- > kernel/trace/ring_buffer.c | 27 +-- > 1 file changed, 21 insertions(+), 6 deletions(-) > > diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c > index 7c56c3d06943..3020060ded7e 100644 > --- a/kernel/trace/ring_buffer.c > +++ b/kernel/trace/ring_buffer.c > @@ -2529,13 +2529,15 @@ rb_reserve_next_event(struct ring_buffer *buffer, > > #ifdef CONFIG_RING_BUFFER_ALLOW_SWAP > /* > - * Due to the ability to swap a cpu buffer from a buffer > - * it is possible it was swapped before we committed. > - * (committing stops a swap). We check for it here and > - * if it happened, we have to fail the write. > + * The cpu buffer swap could have started before we managed to stop it > + * by incrementing the "committing" values. If the swap is in progress, > + * we see disabled recording. If the swap has finished, we see the new > + * cpu buffer. In both cases, we should go back and stop committing > + * to the old buffer. See also ring_buffer_swap_cpu() >*/ > - barrier(); > - if (unlikely(ACCESS_ONCE(cpu_buffer->buffer) != buffer)) { > + smp_mb(); > + if (unlikely(atomic_read(_buffer->record_disabled) || > + ACCESS_ONCE(cpu_buffer->buffer) != buffer)) { > local_dec(_buffer->committing); >
[PATCH] ring-buffer: Race when writing and swapping cpu buffer in parallel
The trace/ring_buffer allows to swap the entire ring buffer. Everything has to be done lockless. I think that I have found a race when trying to understand the code. The problematic situation is the following: CPU 1 (write/reserve event) CPU 2 (swap the cpu buffer) - ring_buffer_write() if (cpu_buffer->record_disabled) ^^^ test fails and write continues ring_buffer_swap_cpu() inc(_buffer_a->record_disabled); inc(_buffer_b->record_disabled); if (cpu_buffer_a->committing) ^^^ test fails and swap continues if (cpu_buffer_b->committing) ^^^ test fails and swap continues rb_reserve_next_event() rb_start_commit() local_inc(_buffer->committing); local_inc(_buffer->commits); if (unlikely(ACCESS_ONCE(cpu_buffer->buffer) != buffer)) { ^^^ test fails and reserve_next continues buffer_a->buffers[cpu] = cpu_buffer_b; buffer_b->buffers[cpu] = cpu_buffer_a; cpu_buffer_b->buffer = buffer_a; cpu_buffer_a->buffer = buffer_b; Ph, reservation continues in the removed buffer. This can be solved by a better check in rb_reserve_next_event(). The reservation could continue only when "committing" is enabled and there is no swap in progress or when any swap was not finished in the meantime. Note that the solution is not perfect. It stops writing also in this situation: CPU 1 (write/reserve event) CPU 2 (swap the cpu buffer) ring_buffer_write() if (cpu_buffer->record_disabled) ^^^ test fails and write continues ring_buffer_swap_cpu() inc(_buffer_a->record_disabled); inc(_buffer_b->record_disabled); rb_reserve_next_event() rb_start_commit() local_inc(_buffer->committing); local_inc(_buffer->commits); if (cpu_buffer_a->committing) ^^^ test passes and swap is canceled if (cpu_buffer->record_disabled) ^^^ test passes and reserve is canceled dec(_buffer_a->record_disabled); dec(_buffer_b->record_disabled); Ph, both actions were canceled. The write/reserve could have continued if the recording was enabled in time. Well, it is the price for using lockless algorithms. I think that it happens in more situations here, it is not worth more complicated code, and we could live with it. The patch also adds some missing memory barriers. Note that compiler barriers are not enough because the data can be accessed by different CPUs. Signed-off-by: Petr Mladek --- kernel/trace/ring_buffer.c | 27 +-- 1 file changed, 21 insertions(+), 6 deletions(-) diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c index 7c56c3d06943..3020060ded7e 100644 --- a/kernel/trace/ring_buffer.c +++ b/kernel/trace/ring_buffer.c @@ -2529,13 +2529,15 @@ rb_reserve_next_event(struct ring_buffer *buffer, #ifdef CONFIG_RING_BUFFER_ALLOW_SWAP /* -* Due to the ability to swap a cpu buffer from a buffer -* it is possible it was swapped before we committed. -* (committing stops a swap). We check for it here and -* if it happened, we have to fail the write. +* The cpu buffer swap could have started before we managed to stop it +* by incrementing the "committing" values. If the swap is in progress, +* we see disabled recording. If the swap has finished, we see the new +* cpu buffer. In both cases, we should go back and stop committing +* to the old buffer. See also ring_buffer_swap_cpu() */ - barrier(); - if (unlikely(ACCESS_ONCE(cpu_buffer->buffer) != buffer)) { + smp_mb(); + if (unlikely(atomic_read(_buffer->record_disabled) || +ACCESS_ONCE(cpu_buffer->buffer) != buffer)) { local_dec(_buffer->committing); local_dec(_buffer->commits); return NULL; @@ -4334,6 +4336,13 @@ int ring_buffer_swap_cpu(struct ring_buffer *buffer_a, atomic_inc(_buffer_a->record_disabled); atomic_inc(_buffer_b->record_disabled); + /* +* We could not swap if a commit is in progress. Also any commit could +* not start after we have stopped recording.
[PATCH] ring-buffer: Race when writing and swapping cpu buffer in parallel
The trace/ring_buffer allows to swap the entire ring buffer. Everything has to be done lockless. I think that I have found a race when trying to understand the code. The problematic situation is the following: CPU 1 (write/reserve event) CPU 2 (swap the cpu buffer) - ring_buffer_write() if (cpu_buffer-record_disabled) ^^^ test fails and write continues ring_buffer_swap_cpu() inc(cpu_buffer_a-record_disabled); inc(cpu_buffer_b-record_disabled); if (cpu_buffer_a-committing) ^^^ test fails and swap continues if (cpu_buffer_b-committing) ^^^ test fails and swap continues rb_reserve_next_event() rb_start_commit() local_inc(cpu_buffer-committing); local_inc(cpu_buffer-commits); if (unlikely(ACCESS_ONCE(cpu_buffer-buffer) != buffer)) { ^^^ test fails and reserve_next continues buffer_a-buffers[cpu] = cpu_buffer_b; buffer_b-buffers[cpu] = cpu_buffer_a; cpu_buffer_b-buffer = buffer_a; cpu_buffer_a-buffer = buffer_b; Ph, reservation continues in the removed buffer. This can be solved by a better check in rb_reserve_next_event(). The reservation could continue only when committing is enabled and there is no swap in progress or when any swap was not finished in the meantime. Note that the solution is not perfect. It stops writing also in this situation: CPU 1 (write/reserve event) CPU 2 (swap the cpu buffer) ring_buffer_write() if (cpu_buffer-record_disabled) ^^^ test fails and write continues ring_buffer_swap_cpu() inc(cpu_buffer_a-record_disabled); inc(cpu_buffer_b-record_disabled); rb_reserve_next_event() rb_start_commit() local_inc(cpu_buffer-committing); local_inc(cpu_buffer-commits); if (cpu_buffer_a-committing) ^^^ test passes and swap is canceled if (cpu_buffer-record_disabled) ^^^ test passes and reserve is canceled dec(cpu_buffer_a-record_disabled); dec(cpu_buffer_b-record_disabled); Ph, both actions were canceled. The write/reserve could have continued if the recording was enabled in time. Well, it is the price for using lockless algorithms. I think that it happens in more situations here, it is not worth more complicated code, and we could live with it. The patch also adds some missing memory barriers. Note that compiler barriers are not enough because the data can be accessed by different CPUs. Signed-off-by: Petr Mladek pmla...@suse.cz --- kernel/trace/ring_buffer.c | 27 +-- 1 file changed, 21 insertions(+), 6 deletions(-) diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c index 7c56c3d06943..3020060ded7e 100644 --- a/kernel/trace/ring_buffer.c +++ b/kernel/trace/ring_buffer.c @@ -2529,13 +2529,15 @@ rb_reserve_next_event(struct ring_buffer *buffer, #ifdef CONFIG_RING_BUFFER_ALLOW_SWAP /* -* Due to the ability to swap a cpu buffer from a buffer -* it is possible it was swapped before we committed. -* (committing stops a swap). We check for it here and -* if it happened, we have to fail the write. +* The cpu buffer swap could have started before we managed to stop it +* by incrementing the committing values. If the swap is in progress, +* we see disabled recording. If the swap has finished, we see the new +* cpu buffer. In both cases, we should go back and stop committing +* to the old buffer. See also ring_buffer_swap_cpu() */ - barrier(); - if (unlikely(ACCESS_ONCE(cpu_buffer-buffer) != buffer)) { + smp_mb(); + if (unlikely(atomic_read(cpu_buffer-record_disabled) || +ACCESS_ONCE(cpu_buffer-buffer) != buffer)) { local_dec(cpu_buffer-committing); local_dec(cpu_buffer-commits); return NULL; @@ -4334,6 +4336,13 @@ int ring_buffer_swap_cpu(struct ring_buffer *buffer_a, atomic_inc(cpu_buffer_a-record_disabled); atomic_inc(cpu_buffer_b-record_disabled); + /* +* We could not swap if a commit is in progress. Also any commit could +* not start
Re: [PATCH] ring-buffer: Race when writing and swapping cpu buffer in parallel
On Thu, 26 Jun 2014 15:22:38 +0200 Petr Mladek pmla...@suse.cz wrote: The trace/ring_buffer allows to swap the entire ring buffer. Everything has to be done lockless. I think that I have found a race when trying to understand the code. The problematic situation is the following: CPU 1 (write/reserve event) CPU 2 (swap the cpu buffer) - ring_buffer_write() if (cpu_buffer-record_disabled) ^^^ test fails and write continues ring_buffer_swap_cpu() This is a per cpu swap, not a full ring buffer swap. inc(cpu_buffer_a-record_disabled); inc(cpu_buffer_b-record_disabled); if (cpu_buffer_a-committing) ^^^ test fails and swap continues if (cpu_buffer_b-committing) ^^^ test fails and swap continues Grumble, this was originally written such that the swap can only be done on the CPU that it is running on. rb_reserve_next_event() rb_start_commit() local_inc(cpu_buffer-committing); local_inc(cpu_buffer-commits); if (unlikely(ACCESS_ONCE(cpu_buffer-buffer) != buffer)) { ^^^ test fails and reserve_next continues buffer_a-buffers[cpu] = cpu_buffer_b; buffer_b-buffers[cpu] = cpu_buffer_a; cpu_buffer_b-buffer = buffer_a; cpu_buffer_a-buffer = buffer_b; Ph, reservation continues in the removed buffer. This can be solved by a better check in rb_reserve_next_event(). The reservation could continue only when committing is enabled and there is no swap in progress or when any swap was not finished in the meantime. Note that the solution is not perfect. It stops writing also in this situation: CPU 1 (write/reserve event) CPU 2 (swap the cpu buffer) ring_buffer_write() if (cpu_buffer-record_disabled) ^^^ test fails and write continues ring_buffer_swap_cpu() inc(cpu_buffer_a-record_disabled); inc(cpu_buffer_b-record_disabled); rb_reserve_next_event() rb_start_commit() local_inc(cpu_buffer-committing); local_inc(cpu_buffer-commits); if (cpu_buffer_a-committing) ^^^ test passes and swap is canceled if (cpu_buffer-record_disabled) ^^^ test passes and reserve is canceled dec(cpu_buffer_a-record_disabled); dec(cpu_buffer_b-record_disabled); Ph, both actions were canceled. The write/reserve could have continued if the recording was enabled in time. Well, it is the price for using lockless algorithms. I think that it happens in more situations here, it is not worth more complicated code, and we could live with it. The patch also adds some missing memory barriers. Note that compiler barriers are not enough because the data can be accessed by different CPUs. Signed-off-by: Petr Mladek pmla...@suse.cz --- kernel/trace/ring_buffer.c | 27 +-- 1 file changed, 21 insertions(+), 6 deletions(-) diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c index 7c56c3d06943..3020060ded7e 100644 --- a/kernel/trace/ring_buffer.c +++ b/kernel/trace/ring_buffer.c @@ -2529,13 +2529,15 @@ rb_reserve_next_event(struct ring_buffer *buffer, #ifdef CONFIG_RING_BUFFER_ALLOW_SWAP /* - * Due to the ability to swap a cpu buffer from a buffer - * it is possible it was swapped before we committed. - * (committing stops a swap). We check for it here and - * if it happened, we have to fail the write. + * The cpu buffer swap could have started before we managed to stop it + * by incrementing the committing values. If the swap is in progress, + * we see disabled recording. If the swap has finished, we see the new + * cpu buffer. In both cases, we should go back and stop committing + * to the old buffer. See also ring_buffer_swap_cpu() */ - barrier(); - if (unlikely(ACCESS_ONCE(cpu_buffer-buffer) != buffer)) { + smp_mb(); + if (unlikely(atomic_read(cpu_buffer-record_disabled) || + ACCESS_ONCE(cpu_buffer-buffer) != buffer)) { local_dec(cpu_buffer-committing); local_dec(cpu_buffer-commits); return NULL; @@ -4334,6 +4336,13 @@
Re: [PATCH] ring-buffer: Race when writing and swapping cpu buffer in parallel
On Thu, 26 Jun 2014 09:58:31 -0400 Steven Rostedt rost...@goodmis.org wrote: What we can do is force ring_buffer_swap_cpu() to only work for the CPU that it is on. As we have snapshot in per_cpu buffers, to make that work, we will need to change the per_cpu version of snapshot to do a smp_call_function_single() to the CPU that it wants to take a snapshot of, and run the swap there. To force this, we can remove the cpu parameter from the ring_buffer_swap_cpu(). By doing this, we may be able to remove some of the CONFIG_RING_BUFFER_ALLOW_SWAP hacks too! I'm not going to sacrifice the general performance of the ring buffer for a feature that is seldom (if ever) used. Did you want to do the above, or do you want me to write something up? -- Steve -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/