Re: [PATCH] ring-buffer: Race when writing and swapping cpu buffer in parallel

2014-06-27 Thread Petr Mládek
On Thu 2014-06-26 20:55:00, Steven Rostedt wrote:
> On Thu, 26 Jun 2014 09:58:31 -0400
> Steven Rostedt  wrote:
> 
> > What we can do is force ring_buffer_swap_cpu() to only work for the CPU
> > that it is on. As we have snapshot in per_cpu buffers, to make that
> > work, we will need to change the per_cpu version of snapshot to do a
> > smp_call_function_single() to the CPU that it wants to take a snapshot
> > of, and run the swap there.
> > 
> > To force this, we can remove the cpu parameter from the
> > ring_buffer_swap_cpu(). By doing this, we may be able to remove some of
> > the CONFIG_RING_BUFFER_ALLOW_SWAP hacks too!
> > 
> > I'm not going to sacrifice the general performance of the ring buffer
> > for a feature that is seldom (if ever) used.

Fair enough. I see the point.

> Did you want to do the above, or do you want me to write something up?

I could look at it the following week.

When I think about it, the race is not that critical. In the worst
case, it slightly modifies the swapped buffer but there should not
be danger of any data corruption. Anyway, I will try to get rid of
it the way you suggested. It might be more important when the
ringbuffer has more users.

Best Regards,
Petr
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] ring-buffer: Race when writing and swapping cpu buffer in parallel

2014-06-27 Thread Petr Mládek
On Thu 2014-06-26 20:55:00, Steven Rostedt wrote:
 On Thu, 26 Jun 2014 09:58:31 -0400
 Steven Rostedt rost...@goodmis.org wrote:
 
  What we can do is force ring_buffer_swap_cpu() to only work for the CPU
  that it is on. As we have snapshot in per_cpu buffers, to make that
  work, we will need to change the per_cpu version of snapshot to do a
  smp_call_function_single() to the CPU that it wants to take a snapshot
  of, and run the swap there.
  
  To force this, we can remove the cpu parameter from the
  ring_buffer_swap_cpu(). By doing this, we may be able to remove some of
  the CONFIG_RING_BUFFER_ALLOW_SWAP hacks too!
  
  I'm not going to sacrifice the general performance of the ring buffer
  for a feature that is seldom (if ever) used.

Fair enough. I see the point.

 Did you want to do the above, or do you want me to write something up?

I could look at it the following week.

When I think about it, the race is not that critical. In the worst
case, it slightly modifies the swapped buffer but there should not
be danger of any data corruption. Anyway, I will try to get rid of
it the way you suggested. It might be more important when the
ringbuffer has more users.

Best Regards,
Petr
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] ring-buffer: Race when writing and swapping cpu buffer in parallel

2014-06-26 Thread Steven Rostedt
On Thu, 26 Jun 2014 09:58:31 -0400
Steven Rostedt  wrote:

> What we can do is force ring_buffer_swap_cpu() to only work for the CPU
> that it is on. As we have snapshot in per_cpu buffers, to make that
> work, we will need to change the per_cpu version of snapshot to do a
> smp_call_function_single() to the CPU that it wants to take a snapshot
> of, and run the swap there.
> 
> To force this, we can remove the cpu parameter from the
> ring_buffer_swap_cpu(). By doing this, we may be able to remove some of
> the CONFIG_RING_BUFFER_ALLOW_SWAP hacks too!
> 
> I'm not going to sacrifice the general performance of the ring buffer
> for a feature that is seldom (if ever) used.

Did you want to do the above, or do you want me to write something up?

-- Steve
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] ring-buffer: Race when writing and swapping cpu buffer in parallel

2014-06-26 Thread Steven Rostedt
On Thu, 26 Jun 2014 15:22:38 +0200
Petr Mladek  wrote:

> The trace/ring_buffer allows to swap the entire ring buffer. Everything has to
> be done lockless. I think that I have found a race when trying to understand
> the code. The problematic situation is the following:
> 
> CPU 1 (write/reserve event)   CPU 2 (swap the cpu buffer)
> -
> ring_buffer_write()
>   if (cpu_buffer->record_disabled)
>   ^^^ test fails and write continues
> 
>   ring_buffer_swap_cpu()

This is a per cpu swap, not a full ring buffer swap.

> 
> inc(_buffer_a->record_disabled);
> inc(_buffer_b->record_disabled);
> 
> if (cpu_buffer_a->committing)
> ^^^ test fails and swap continues
> if (cpu_buffer_b->committing)
> ^^^ test fails and swap continues

Grumble, this was originally written such that the swap can only be
done on the CPU that it is running on.


> 
>   rb_reserve_next_event()
> rb_start_commit()
>   local_inc(_buffer->committing);
>   local_inc(_buffer->commits);
> 
> if (unlikely(ACCESS_ONCE(cpu_buffer->buffer) != buffer)) {
> ^^^ test fails and reserve_next continues
> 
> buffer_a->buffers[cpu] = cpu_buffer_b;
> buffer_b->buffers[cpu] = cpu_buffer_a;
> cpu_buffer_b->buffer = buffer_a;
> cpu_buffer_a->buffer = buffer_b;
> 
>   Ph, reservation continues in the removed buffer.
> 
> This can be solved by a better check in rb_reserve_next_event(). The 
> reservation
> could continue only when "committing" is enabled and there is no swap in
> progress or when any swap was not finished in the meantime.
> 
> Note that the solution is not perfect. It stops writing also in this 
> situation:
> 
> CPU 1 (write/reserve event)   CPU 2 (swap the cpu buffer)
> 
> ring_buffer_write()
>   if (cpu_buffer->record_disabled)
>   ^^^ test fails and write continues
> 
>   ring_buffer_swap_cpu()
> 
> inc(_buffer_a->record_disabled);
> inc(_buffer_b->record_disabled);
> 
>   rb_reserve_next_event()
> rb_start_commit()
>   local_inc(_buffer->committing);
>   local_inc(_buffer->commits);
> 
> if (cpu_buffer_a->committing)
> ^^^ test passes and swap is canceled
> 
> if (cpu_buffer->record_disabled)
> ^^^ test passes and reserve is canceled
> 
> dec(_buffer_a->record_disabled);
> dec(_buffer_b->record_disabled);
> 
>   Ph, both actions were canceled. The write/reserve could have continued
>   if the recording was enabled in time.
> 
> Well, it is the price for using lockless algorithms. I think that it happens
> in more situations here, it is not worth more complicated code, and we could
> live with it.
> 
> The patch also adds some missing memory barriers. Note that compiler barriers
> are not enough because the data can be accessed by different CPUs.
> 
> Signed-off-by: Petr Mladek 
> ---
>  kernel/trace/ring_buffer.c | 27 +--
>  1 file changed, 21 insertions(+), 6 deletions(-)
> 
> diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
> index 7c56c3d06943..3020060ded7e 100644
> --- a/kernel/trace/ring_buffer.c
> +++ b/kernel/trace/ring_buffer.c
> @@ -2529,13 +2529,15 @@ rb_reserve_next_event(struct ring_buffer *buffer,
>  
>  #ifdef CONFIG_RING_BUFFER_ALLOW_SWAP
>   /*
> -  * Due to the ability to swap a cpu buffer from a buffer
> -  * it is possible it was swapped before we committed.
> -  * (committing stops a swap). We check for it here and
> -  * if it happened, we have to fail the write.
> +  * The cpu buffer swap could have started before we managed to stop it
> +  * by incrementing the "committing" values. If the swap is in progress,
> +  * we see disabled recording. If the swap has finished, we see the new
> +  * cpu buffer. In both cases, we should go back and stop committing
> +  * to the old buffer. See also ring_buffer_swap_cpu()
>*/
> - barrier();
> - if (unlikely(ACCESS_ONCE(cpu_buffer->buffer) != buffer)) {
> + smp_mb();
> + if (unlikely(atomic_read(_buffer->record_disabled) ||
> +  ACCESS_ONCE(cpu_buffer->buffer) != buffer)) {
>   local_dec(_buffer->committing);
>   

[PATCH] ring-buffer: Race when writing and swapping cpu buffer in parallel

2014-06-26 Thread Petr Mladek
The trace/ring_buffer allows to swap the entire ring buffer. Everything has to
be done lockless. I think that I have found a race when trying to understand
the code. The problematic situation is the following:

CPU 1 (write/reserve event) CPU 2 (swap the cpu buffer)
-
ring_buffer_write()
  if (cpu_buffer->record_disabled)
  ^^^ test fails and write continues

ring_buffer_swap_cpu()

  inc(_buffer_a->record_disabled);
  inc(_buffer_b->record_disabled);

  if (cpu_buffer_a->committing)
  ^^^ test fails and swap continues
  if (cpu_buffer_b->committing)
  ^^^ test fails and swap continues

  rb_reserve_next_event()
rb_start_commit()
  local_inc(_buffer->committing);
  local_inc(_buffer->commits);

if (unlikely(ACCESS_ONCE(cpu_buffer->buffer) != buffer)) {
^^^ test fails and reserve_next continues

  buffer_a->buffers[cpu] = cpu_buffer_b;
  buffer_b->buffers[cpu] = cpu_buffer_a;
  cpu_buffer_b->buffer = buffer_a;
  cpu_buffer_a->buffer = buffer_b;

  Ph, reservation continues in the removed buffer.

This can be solved by a better check in rb_reserve_next_event(). The reservation
could continue only when "committing" is enabled and there is no swap in
progress or when any swap was not finished in the meantime.

Note that the solution is not perfect. It stops writing also in this situation:

CPU 1 (write/reserve event) CPU 2 (swap the cpu buffer)

ring_buffer_write()
  if (cpu_buffer->record_disabled)
  ^^^ test fails and write continues

ring_buffer_swap_cpu()

  inc(_buffer_a->record_disabled);
  inc(_buffer_b->record_disabled);

  rb_reserve_next_event()
rb_start_commit()
  local_inc(_buffer->committing);
  local_inc(_buffer->commits);

  if (cpu_buffer_a->committing)
  ^^^ test passes and swap is canceled

if (cpu_buffer->record_disabled)
^^^ test passes and reserve is canceled

  dec(_buffer_a->record_disabled);
  dec(_buffer_b->record_disabled);

  Ph, both actions were canceled. The write/reserve could have continued
  if the recording was enabled in time.

Well, it is the price for using lockless algorithms. I think that it happens
in more situations here, it is not worth more complicated code, and we could
live with it.

The patch also adds some missing memory barriers. Note that compiler barriers
are not enough because the data can be accessed by different CPUs.

Signed-off-by: Petr Mladek 
---
 kernel/trace/ring_buffer.c | 27 +--
 1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 7c56c3d06943..3020060ded7e 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -2529,13 +2529,15 @@ rb_reserve_next_event(struct ring_buffer *buffer,
 
 #ifdef CONFIG_RING_BUFFER_ALLOW_SWAP
/*
-* Due to the ability to swap a cpu buffer from a buffer
-* it is possible it was swapped before we committed.
-* (committing stops a swap). We check for it here and
-* if it happened, we have to fail the write.
+* The cpu buffer swap could have started before we managed to stop it
+* by incrementing the "committing" values. If the swap is in progress,
+* we see disabled recording. If the swap has finished, we see the new
+* cpu buffer. In both cases, we should go back and stop committing
+* to the old buffer. See also ring_buffer_swap_cpu()
 */
-   barrier();
-   if (unlikely(ACCESS_ONCE(cpu_buffer->buffer) != buffer)) {
+   smp_mb();
+   if (unlikely(atomic_read(_buffer->record_disabled) ||
+ACCESS_ONCE(cpu_buffer->buffer) != buffer)) {
local_dec(_buffer->committing);
local_dec(_buffer->commits);
return NULL;
@@ -4334,6 +4336,13 @@ int ring_buffer_swap_cpu(struct ring_buffer *buffer_a,
atomic_inc(_buffer_a->record_disabled);
atomic_inc(_buffer_b->record_disabled);
 
+   /*
+* We could not swap if a commit is in progress. Also any commit could
+* not start after we have stopped recording. 

[PATCH] ring-buffer: Race when writing and swapping cpu buffer in parallel

2014-06-26 Thread Petr Mladek
The trace/ring_buffer allows to swap the entire ring buffer. Everything has to
be done lockless. I think that I have found a race when trying to understand
the code. The problematic situation is the following:

CPU 1 (write/reserve event) CPU 2 (swap the cpu buffer)
-
ring_buffer_write()
  if (cpu_buffer-record_disabled)
  ^^^ test fails and write continues

ring_buffer_swap_cpu()

  inc(cpu_buffer_a-record_disabled);
  inc(cpu_buffer_b-record_disabled);

  if (cpu_buffer_a-committing)
  ^^^ test fails and swap continues
  if (cpu_buffer_b-committing)
  ^^^ test fails and swap continues

  rb_reserve_next_event()
rb_start_commit()
  local_inc(cpu_buffer-committing);
  local_inc(cpu_buffer-commits);

if (unlikely(ACCESS_ONCE(cpu_buffer-buffer) != buffer)) {
^^^ test fails and reserve_next continues

  buffer_a-buffers[cpu] = cpu_buffer_b;
  buffer_b-buffers[cpu] = cpu_buffer_a;
  cpu_buffer_b-buffer = buffer_a;
  cpu_buffer_a-buffer = buffer_b;

  Ph, reservation continues in the removed buffer.

This can be solved by a better check in rb_reserve_next_event(). The reservation
could continue only when committing is enabled and there is no swap in
progress or when any swap was not finished in the meantime.

Note that the solution is not perfect. It stops writing also in this situation:

CPU 1 (write/reserve event) CPU 2 (swap the cpu buffer)

ring_buffer_write()
  if (cpu_buffer-record_disabled)
  ^^^ test fails and write continues

ring_buffer_swap_cpu()

  inc(cpu_buffer_a-record_disabled);
  inc(cpu_buffer_b-record_disabled);

  rb_reserve_next_event()
rb_start_commit()
  local_inc(cpu_buffer-committing);
  local_inc(cpu_buffer-commits);

  if (cpu_buffer_a-committing)
  ^^^ test passes and swap is canceled

if (cpu_buffer-record_disabled)
^^^ test passes and reserve is canceled

  dec(cpu_buffer_a-record_disabled);
  dec(cpu_buffer_b-record_disabled);

  Ph, both actions were canceled. The write/reserve could have continued
  if the recording was enabled in time.

Well, it is the price for using lockless algorithms. I think that it happens
in more situations here, it is not worth more complicated code, and we could
live with it.

The patch also adds some missing memory barriers. Note that compiler barriers
are not enough because the data can be accessed by different CPUs.

Signed-off-by: Petr Mladek pmla...@suse.cz
---
 kernel/trace/ring_buffer.c | 27 +--
 1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 7c56c3d06943..3020060ded7e 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -2529,13 +2529,15 @@ rb_reserve_next_event(struct ring_buffer *buffer,
 
 #ifdef CONFIG_RING_BUFFER_ALLOW_SWAP
/*
-* Due to the ability to swap a cpu buffer from a buffer
-* it is possible it was swapped before we committed.
-* (committing stops a swap). We check for it here and
-* if it happened, we have to fail the write.
+* The cpu buffer swap could have started before we managed to stop it
+* by incrementing the committing values. If the swap is in progress,
+* we see disabled recording. If the swap has finished, we see the new
+* cpu buffer. In both cases, we should go back and stop committing
+* to the old buffer. See also ring_buffer_swap_cpu()
 */
-   barrier();
-   if (unlikely(ACCESS_ONCE(cpu_buffer-buffer) != buffer)) {
+   smp_mb();
+   if (unlikely(atomic_read(cpu_buffer-record_disabled) ||
+ACCESS_ONCE(cpu_buffer-buffer) != buffer)) {
local_dec(cpu_buffer-committing);
local_dec(cpu_buffer-commits);
return NULL;
@@ -4334,6 +4336,13 @@ int ring_buffer_swap_cpu(struct ring_buffer *buffer_a,
atomic_inc(cpu_buffer_a-record_disabled);
atomic_inc(cpu_buffer_b-record_disabled);
 
+   /*
+* We could not swap if a commit is in progress. Also any commit could
+* not start 

Re: [PATCH] ring-buffer: Race when writing and swapping cpu buffer in parallel

2014-06-26 Thread Steven Rostedt
On Thu, 26 Jun 2014 15:22:38 +0200
Petr Mladek pmla...@suse.cz wrote:

 The trace/ring_buffer allows to swap the entire ring buffer. Everything has to
 be done lockless. I think that I have found a race when trying to understand
 the code. The problematic situation is the following:
 
 CPU 1 (write/reserve event)   CPU 2 (swap the cpu buffer)
 -
 ring_buffer_write()
   if (cpu_buffer-record_disabled)
   ^^^ test fails and write continues
 
   ring_buffer_swap_cpu()

This is a per cpu swap, not a full ring buffer swap.

 
 inc(cpu_buffer_a-record_disabled);
 inc(cpu_buffer_b-record_disabled);
 
 if (cpu_buffer_a-committing)
 ^^^ test fails and swap continues
 if (cpu_buffer_b-committing)
 ^^^ test fails and swap continues

Grumble, this was originally written such that the swap can only be
done on the CPU that it is running on.


 
   rb_reserve_next_event()
 rb_start_commit()
   local_inc(cpu_buffer-committing);
   local_inc(cpu_buffer-commits);
 
 if (unlikely(ACCESS_ONCE(cpu_buffer-buffer) != buffer)) {
 ^^^ test fails and reserve_next continues
 
 buffer_a-buffers[cpu] = cpu_buffer_b;
 buffer_b-buffers[cpu] = cpu_buffer_a;
 cpu_buffer_b-buffer = buffer_a;
 cpu_buffer_a-buffer = buffer_b;
 
   Ph, reservation continues in the removed buffer.
 
 This can be solved by a better check in rb_reserve_next_event(). The 
 reservation
 could continue only when committing is enabled and there is no swap in
 progress or when any swap was not finished in the meantime.
 
 Note that the solution is not perfect. It stops writing also in this 
 situation:
 
 CPU 1 (write/reserve event)   CPU 2 (swap the cpu buffer)
 
 ring_buffer_write()
   if (cpu_buffer-record_disabled)
   ^^^ test fails and write continues
 
   ring_buffer_swap_cpu()
 
 inc(cpu_buffer_a-record_disabled);
 inc(cpu_buffer_b-record_disabled);
 
   rb_reserve_next_event()
 rb_start_commit()
   local_inc(cpu_buffer-committing);
   local_inc(cpu_buffer-commits);
 
 if (cpu_buffer_a-committing)
 ^^^ test passes and swap is canceled
 
 if (cpu_buffer-record_disabled)
 ^^^ test passes and reserve is canceled
 
 dec(cpu_buffer_a-record_disabled);
 dec(cpu_buffer_b-record_disabled);
 
   Ph, both actions were canceled. The write/reserve could have continued
   if the recording was enabled in time.
 
 Well, it is the price for using lockless algorithms. I think that it happens
 in more situations here, it is not worth more complicated code, and we could
 live with it.
 
 The patch also adds some missing memory barriers. Note that compiler barriers
 are not enough because the data can be accessed by different CPUs.
 
 Signed-off-by: Petr Mladek pmla...@suse.cz
 ---
  kernel/trace/ring_buffer.c | 27 +--
  1 file changed, 21 insertions(+), 6 deletions(-)
 
 diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
 index 7c56c3d06943..3020060ded7e 100644
 --- a/kernel/trace/ring_buffer.c
 +++ b/kernel/trace/ring_buffer.c
 @@ -2529,13 +2529,15 @@ rb_reserve_next_event(struct ring_buffer *buffer,
  
  #ifdef CONFIG_RING_BUFFER_ALLOW_SWAP
   /*
 -  * Due to the ability to swap a cpu buffer from a buffer
 -  * it is possible it was swapped before we committed.
 -  * (committing stops a swap). We check for it here and
 -  * if it happened, we have to fail the write.
 +  * The cpu buffer swap could have started before we managed to stop it
 +  * by incrementing the committing values. If the swap is in progress,
 +  * we see disabled recording. If the swap has finished, we see the new
 +  * cpu buffer. In both cases, we should go back and stop committing
 +  * to the old buffer. See also ring_buffer_swap_cpu()
*/
 - barrier();
 - if (unlikely(ACCESS_ONCE(cpu_buffer-buffer) != buffer)) {
 + smp_mb();
 + if (unlikely(atomic_read(cpu_buffer-record_disabled) ||
 +  ACCESS_ONCE(cpu_buffer-buffer) != buffer)) {
   local_dec(cpu_buffer-committing);
   local_dec(cpu_buffer-commits);
   return NULL;
 @@ -4334,6 +4336,13 @@ 

Re: [PATCH] ring-buffer: Race when writing and swapping cpu buffer in parallel

2014-06-26 Thread Steven Rostedt
On Thu, 26 Jun 2014 09:58:31 -0400
Steven Rostedt rost...@goodmis.org wrote:

 What we can do is force ring_buffer_swap_cpu() to only work for the CPU
 that it is on. As we have snapshot in per_cpu buffers, to make that
 work, we will need to change the per_cpu version of snapshot to do a
 smp_call_function_single() to the CPU that it wants to take a snapshot
 of, and run the swap there.
 
 To force this, we can remove the cpu parameter from the
 ring_buffer_swap_cpu(). By doing this, we may be able to remove some of
 the CONFIG_RING_BUFFER_ALLOW_SWAP hacks too!
 
 I'm not going to sacrifice the general performance of the ring buffer
 for a feature that is seldom (if ever) used.

Did you want to do the above, or do you want me to write something up?

-- Steve
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/