Re: [PATCH 2/2] irq_work: Fix racy IRQ_WORK_BUSY flag setting

2012-10-31 Thread Steven Rostedt
On Wed, 2012-10-31 at 20:04 +0900, anish kumar wrote:

> > This now does:
> > 
> > CPU 1   CPU 2
> > -   -
> > (flags = 0)
> > cmpxchg(flags, 0, IRQ_WORK_FLAGS)
> > (flags = 3)
> > [...]

We can still add here:

(fetch flags)

> > xchg(&flags, IRQ_WORK_BUSY)
> > (flags = 2)
> > func()
> > oflags = cmpxchg(&flags, flags, nflags);

Then the cmpxchg() would fail, and oflags would be 2

> > (sees flags = 2)
> > if (flags & IRQ_WORK_PENDING)

This should be:
  if (oflags & IRQ_WORK_PENDING)


> > (not true)
> > (loop)  
> > 
> > cmpxchg(flags, 2, 0);
> > (flags = 2)

This should be:
  (flags = 0)
as we described the previous cmpxchg() as failing, flags would still be
2 before this cmpxchg(), and this one would succeed.

-- Steve

> > flags = 3
> > 
> > 
> > 


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] irq_work: Fix racy IRQ_WORK_BUSY flag setting

2012-10-31 Thread Frederic Weisbecker
2012/10/31 Steven Rostedt :
> On Wed, 2012-10-31 at 20:04 +0900, anish kumar wrote:
>> nflags = 1 | 3
>> nflags = 2 | 3
> > In both cases the result would be the same. If I am right, then wouldn't this
>> operation be redundant?
>
> Right. Actually we could change the new loop to:
>
> for (;;) {
> oflags = cmpxchg(&work->flags, flags, IRQ_WORK_FLAGS);
> if (oflags == flags)
> break;
> if (oflags & IRQ_WORK_PENDING)
> return false;
> flags = oflags;
> cpu_relax();
> }

We could. But I wanted to keep the code able to handle new flags in
the future (such as IRQ_WORK_LAZY).

> Frederic,
>
> Would you like to add my explanation to your change log? You can add the
> entire thing, which I think would explain a lot to people.

It's indeed a very clear explanation. I'll put that in the changelog, thanks!


Re: [PATCH 2/2] irq_work: Fix racy IRQ_WORK_BUSY flag setting

2012-10-31 Thread Steven Rostedt
On Wed, 2012-10-31 at 20:04 +0900, anish kumar wrote:

> > /*
> >  * Claim the entry so that no one else will poke at it.
> >  */
> > static bool irq_work_claim(struct irq_work *work)
> > {
> > unsigned long flags, nflags;
> > 
> > for (;;) {
> > flags = work->flags;
> > if (flags & IRQ_WORK_PENDING)
> > return false;
> > nflags = flags | IRQ_WORK_FLAGS;
> nflags = 1 | 3
> nflags = 2 | 3
> In both cases the result would be the same. If I am right, then wouldn't this
> operation be redundant?

Right. Actually we could change the new loop to:

for (;;) {
oflags = cmpxchg(&work->flags, flags, IRQ_WORK_FLAGS);
if (oflags == flags)
break;
if (oflags & IRQ_WORK_PENDING)
return false;
flags = oflags;
cpu_relax();
}


> > if (cmpxchg(&work->flags, flags, nflags) == flags)
> > break;
> > cpu_relax();
> > }
> > 
> > return true;
> > }
> > 


> > 
> > This now does:
> > 
> > CPU 1   CPU 2
> > -   -
> > (flags = 0)
> > cmpxchg(flags, 0, IRQ_WORK_FLAGS)
> > (flags = 3)
> > [...]
> > xchg(&flags, IRQ_WORK_BUSY)
> > (flags = 2)
> > func()
> > oflags = cmpxchg(&flags, flags, nflags);
> > (sees flags = 2)
> > if (flags & IRQ_WORK_PENDING)
> > (not true)
> > (loop)  
> > 
> > cmpxchg(flags, 2, 0);
> > (flags = 2)
> > flags = 3
> > 
> > 
> > 
> > With both patch 1 and 2 you fixed the bug.
> This is the best explanation anyone can put forward for this problem.

Frederic,

Would you like to add my explanation to your change log? You can add the
entire thing, which I think would explain a lot to people.

Thanks,

-- Steve




Re: [PATCH 2/2] irq_work: Fix racy IRQ_WORK_BUSY flag setting

2012-10-31 Thread anish kumar
On Tue, 2012-10-30 at 14:45 -0400, Steven Rostedt wrote:
> On Wed, 2012-10-31 at 03:33 +0900, anish kumar wrote:
> 
> > > CPU 0 CPU 1
> > > 
> > > data = something flags = IRQ_WORK_BUSY
> > > smp_mb() (implicit with cmpxchg  smp_mb()
> > >   on flags in claim) execute_work (sees data from CPU
> > > 0)
> > > try to claim
> > > 
> > As I understand without the memory barrier proposed by you the situation
> > would be as below:
> > CPU 0 CPU 1
> >  
> > data = something flags = IRQ_WORK_BUSY
> > smp_mb() (implicit with cmpxchg  execute_work (sees data from CPU 0)
> >on flags in claim)
> > _success_ in claiming and goes
> 
> Correct, because it would see the stale value of flags.
> 
> > ahead and execute the work(wrong?)
> >  cmpxchg cause flag to IRQ_WORK_BUSY
> > 
> > Now knows the flag==IRQ_WORK_BUSY
> > 
> > Am I right?
> 
> right.
> 
> > 
> > Probably a stupid question. Why do we return the bool from irq_work_queue
> > when no one bothers to check the return value? Wouldn't it be better if
> > this function were void, given how its users call it, or am I
> > looking at the wrong code.
> 
> Not a stupid question, as I was asking that to myself just earlier
> today. But forgot to mention it as well. Especially, because it makes it
> look like there's a bug in the code. Maybe someday someone will care if
> their work was finished by itself, or some other CPU.
> 
> Probably should just nix the return value.
Can I send a patch to fix this?
> 
> -- Steve
> 
> 




Re: [PATCH 2/2] irq_work: Fix racy IRQ_WORK_BUSY flag setting

2012-10-31 Thread anish kumar
On Tue, 2012-10-30 at 22:23 -0400, Steven Rostedt wrote:
> On Wed, 2012-10-31 at 01:36 +0100, Frederic Weisbecker wrote:
> > 2012/10/30 anish kumar :
> > > As I understand without the memory barrier proposed by you the situation
> > > would be as below:
> > > CPU 0 CPU 1
> > >
> > > data = something flags = IRQ_WORK_BUSY
> > > smp_mb() (implicit with cmpxchg  execute_work (sees data from CPU 0)
> > >on flags in claim)
> > > _success_ in claiming and goes
> > > ahead and execute the work(wrong?)
> > >  cmpxchg cause flag to IRQ_WORK_BUSY
> > >
> > > Now knows the flag==IRQ_WORK_BUSY
> > >
> > > Am I right?
> > 
> > (Adding Paul in Cc because I'm again confused with memory barriers)
> > 
> > Actually what I had in mind is rather that CPU 0 fails its claim
> > because it's not seeing the IRQ_WORK_BUSY flag as it should:
> > 
> > 
> > CPU 0 CPU 1
> > 
> > data = something  flags = IRQ_WORK_BUSY
> > cmpxchg() for claim   execute_work (sees data from CPU 0)
> > 
> > CPU 0 should see IRQ_WORK_BUSY but it may not because CPU 1 sets this
> > value in a non-atomic way.
> > 
> > Also, while browsing Paul's perfbook, I realize my changelog is buggy.
> > It seems we can't reliably use memory barriers here because we would
> > be in the following case:
> > 
> > CPU 0  CPU 1
> > 
> > store(work data)store(flags)
> > smp_mb()smp_mb()
> > load(flags)load(work data)
> > 
> > On top of this barrier pairing, we can't make the assumption that, for
> > example, if CPU 1 sees the work data stored in CPU 0 then CPU 0 sees
> > the flags stored in CPU 1.
> > 
> > So now I wonder if cmpxchg() can give us more confidence:
> 
> More confidence over what? The xchg()? They are equivalent (wrt memory
> barriers).
> 
> Here's the issue that currently exists. Let's look at the code:
> 
> 
> /*
>  * Claim the entry so that no one else will poke at it.
>  */
> static bool irq_work_claim(struct irq_work *work)
> {
>   unsigned long flags, nflags;
> 
>   for (;;) {
>   flags = work->flags;
>   if (flags & IRQ_WORK_PENDING)
>   return false;
>   nflags = flags | IRQ_WORK_FLAGS;
nflags = 1 | 3
nflags = 2 | 3
In both cases the result would be the same. If I am right, then wouldn't this
operation be redundant?
>   if (cmpxchg(&work->flags, flags, nflags) == flags)
>   break;
>   cpu_relax();
>   }
> 
>   return true;
> }
> 
> and
> 
>   llnode = llist_del_all(this_list);
>   while (llnode != NULL) {
>   work = llist_entry(llnode, struct irq_work, llnode);
> 
>   llnode = llist_next(llnode);
> 
>   /*
>* Clear the PENDING bit, after this point the @work
>* can be re-used.
>*/
>   work->flags = IRQ_WORK_BUSY;
>   work->func(work);
>   /*
>* Clear the BUSY bit and return to the free state if
>* no-one else claimed it meanwhile.
>*/
>   (void)cmpxchg(&work->flags, IRQ_WORK_BUSY, 0);
>   }
> 
> The irq_work_claim() will only queue its work if it's not already
> pending. If it is pending, then someone is going to process it anyway.
> But once we start running the function, new work needs to be processed
> again.
> 
> Thus we have:
> 
>   CPU 1   CPU 2
>   -   -
>   (flags = 0)
>   cmpxchg(flags, 0, IRQ_WORK_FLAGS)
>   (flags = 3)
>   [...]
> 
>   if (flags & IRQ_WORK_PENDING)
>   return false
>   flags = IRQ_WORK_BUSY
>   (flags = 2)
>   func()
> 
> The above shows the normal case where CPU 2 doesn't need to queue work,
> because it's going to be done for it by CPU 1. But...
> 
> 
> 
>   CPU 1   CPU 2
>   -   -
>   (flags = 0)
>   cmpxchg(flags, 0, IRQ_WORK_FLAGS)
>   (flags = 3)
>   [...]
>   flags = IRQ_WORK_BUSY
>   (flags = 2)
>   func()
>   (sees flags = 3)
>   if (flags & IRQ_WORK_PENDING)
>   return false
>   cmpxchg(flags, 2, 0);
>   (flags = 0)
> 
> 
> Here, because we did not do a memory barrier after 
> flags = IRQ_WORK_BUSY, CPU 2 saw stale data and did not queue its work,
> and missed the opportunity. Now if you had this fix with the xchg() as
> you have in your patch, then CPU 2 would not see the stale flags.
> Except, with the code I showed above it still can!


Re: [PATCH 2/2] irq_work: Fix racy IRQ_WORK_BUSY flag setting

2012-10-30 Thread Frederic Weisbecker
2012/10/31 Steven Rostedt :
> More confidence over what? The xchg()? They are equivalent (wrt memory
> barriers).
>
> Here's the issue that currently exists. Let's look at the code:
>
>
> /*
>  * Claim the entry so that no one else will poke at it.
>  */
> static bool irq_work_claim(struct irq_work *work)
> {
>         unsigned long flags, nflags;
>
>         for (;;) {
>                 flags = work->flags;
>                 if (flags & IRQ_WORK_PENDING)
>                         return false;
>                 nflags = flags | IRQ_WORK_FLAGS;
>                 if (cmpxchg(&work->flags, flags, nflags) == flags)
>                         break;
>                 cpu_relax();
>         }
>
>         return true;
> }
>
> and
>
>         llnode = llist_del_all(this_list);
>         while (llnode != NULL) {
>                 work = llist_entry(llnode, struct irq_work, llnode);
>
>                 llnode = llist_next(llnode);
>
>                 /*
>                  * Clear the PENDING bit, after this point the @work
>                  * can be re-used.
>                  */
>                 work->flags = IRQ_WORK_BUSY;
>                 work->func(work);
>                 /*
>                  * Clear the BUSY bit and return to the free state if
>                  * no-one else claimed it meanwhile.
>                  */
>                 (void)cmpxchg(&work->flags, IRQ_WORK_BUSY, 0);
>         }
>
> The irq_work_claim() will only queue its work if it's not already
> pending. If it is pending, then someone is going to process it anyway.
> But once we start running the function, new work needs to be processed
> again.
>
> Thus we have:
>
> CPU 1   CPU 2
> -   -
> (flags = 0)
> cmpxchg(flags, 0, IRQ_WORK_FLAGS)
> (flags = 3)
> [...]
>
> if (flags & IRQ_WORK_PENDING)
> return false
> flags = IRQ_WORK_BUSY
> (flags = 2)
> func()
>
> The above shows the normal case where CPU 2 doesn't need to queue work,
> because it's going to be done for it by CPU 1. But...
>
>
>
> CPU 1   CPU 2
> -   -
> (flags = 0)
> cmpxchg(flags, 0, IRQ_WORK_FLAGS)
> (flags = 3)
> [...]
> flags = IRQ_WORK_BUSY
> (flags = 2)
> func()
> (sees flags = 3)
> if (flags & IRQ_WORK_PENDING)
> return false
> cmpxchg(flags, 2, 0);
> (flags = 0)
>
>
> Here, because we did not do a memory barrier after
> flags = IRQ_WORK_BUSY, CPU 2 saw stale data and did not queue its work,
> and missed the opportunity. Now if you had this fix with the xchg() as
> you have in your patch, then CPU 2 would not see the stale flags.
> Except, with the code I showed above it still can!
>
> CPU 1   CPU 2
> -   -
> (flags = 0)
> cmpxchg(flags, 0, IRQ_WORK_FLAGS)
> (flags = 3)
> [...]
> (fetch flags)
> xchg(&flags, IRQ_WORK_BUSY)
> (flags = 2)
> func()
> (sees flags = 3)
> if (flags & IRQ_WORK_PENDING)
> return false
> cmpxchg(flags, 2, 0);
> (flags = 0)
>
>
> Even with the update of xchg(), if CPU2 fetched the flags before CPU1
> did the xchg, then it would still lose out. But that's where your
> previous patch comes in that does:
>
>        flags = work->flags & ~IRQ_WORK_PENDING;
>        for (;;) {
>                nflags = flags | IRQ_WORK_FLAGS;
>                oflags = cmpxchg(&work->flags, flags, nflags);
>                if (oflags == flags)
>                        break;
>                if (oflags & IRQ_WORK_PENDING)
>                        return false;
>                flags = oflags;
>                cpu_relax();
>        }
>
>
> This now does:
>
> CPU 1   CPU 2
> -   -
> (flags = 0)
> cmpxchg(flags, 0, IRQ_WORK_FLAGS)
> (flags = 3)
> [...]
> xchg(&flags, IRQ_WORK_BUSY)
> (flags = 2)
> func()
> oflags = cmpxchg(&flags, flags, nflags);
> (sees flags = 2)
> if (flags & IRQ_WORK_PENDING)
> (not true)
> (loop)

Re: [PATCH 2/2] irq_work: Fix racy IRQ_WORK_BUSY flag setting

2012-10-30 Thread Steven Rostedt
On Wed, 2012-10-31 at 01:36 +0100, Frederic Weisbecker wrote:
> 2012/10/30 anish kumar :
> > As I understand without the memory barrier proposed by you the situation
> > would be as below:
> > CPU 0 CPU 1
> >
> > data = something flags = IRQ_WORK_BUSY
> > smp_mb() (implicit with cmpxchg  execute_work (sees data from CPU 0)
> >on flags in claim)
> > _success_ in claiming and goes
> > ahead and execute the work(wrong?)
> >  cmpxchg cause flag to IRQ_WORK_BUSY
> >
> > Now knows the flag==IRQ_WORK_BUSY
> >
> > Am I right?
> 
> (Adding Paul in Cc because I'm again confused with memory barriers)
> 
> Actually what I had in mind is rather that CPU 0 fails its claim
> because it's not seeing the IRQ_WORK_BUSY flag as it should:
> 
> 
> CPU 0 CPU 1
> 
> data = something  flags = IRQ_WORK_BUSY
> cmpxchg() for claim   execute_work (sees data from CPU 0)
> 
> CPU 0 should see IRQ_WORK_BUSY but it may not because CPU 1 sets this
> value in a non-atomic way.
> 
> Also, while browsing Paul's perfbook, I realize my changelog is buggy.
> It seems we can't reliably use memory barriers here because we would
> be in the following case:
> 
> CPU 0  CPU 1
> 
> store(work data)store(flags)
> smp_mb()smp_mb()
> load(flags)load(work data)
> 
> On top of this barrier pairing, we can't make the assumption that, for
> example, if CPU 1 sees the work data stored in CPU 0 then CPU 0 sees
> the flags stored in CPU 1.
> 
> So now I wonder if cmpxchg() can give us more confidence:

More confidence over what? The xchg()? They are equivalent (wrt memory
barriers).

Here's the issue that currently exists. Let's look at the code:


/*
 * Claim the entry so that no one else will poke at it.
 */
static bool irq_work_claim(struct irq_work *work)
{
        unsigned long flags, nflags;

        for (;;) {
                flags = work->flags;
                if (flags & IRQ_WORK_PENDING)
                        return false;
                nflags = flags | IRQ_WORK_FLAGS;
                if (cmpxchg(&work->flags, flags, nflags) == flags)
                        break;
                cpu_relax();
        }

        return true;
}

and

        llnode = llist_del_all(this_list);
        while (llnode != NULL) {
                work = llist_entry(llnode, struct irq_work, llnode);

                llnode = llist_next(llnode);

                /*
                 * Clear the PENDING bit, after this point the @work
                 * can be re-used.
                 */
                work->flags = IRQ_WORK_BUSY;
                work->func(work);
                /*
                 * Clear the BUSY bit and return to the free state if
                 * no-one else claimed it meanwhile.
                 */
                (void)cmpxchg(&work->flags, IRQ_WORK_BUSY, 0);
        }

The irq_work_claim() will only queue its work if it's not already
pending. If it is pending, then someone is going to process it anyway.
But once we start running the function, new work needs to be processed
again.

Thus we have:

CPU 1   CPU 2
-   -
(flags = 0)
cmpxchg(flags, 0, IRQ_WORK_FLAGS)
(flags = 3)
[...]

if (flags & IRQ_WORK_PENDING)
return false
flags = IRQ_WORK_BUSY
(flags = 2)
func()

The above shows the normal case where CPU 2 doesn't need to queue work,
because it's going to be done for it by CPU 1. But...



CPU 1   CPU 2
-   -
(flags = 0)
cmpxchg(flags, 0, IRQ_WORK_FLAGS)
(flags = 3)
[...]
flags = IRQ_WORK_BUSY
(flags = 2)
func()
(sees flags = 3)
if (flags & IRQ_WORK_PENDING)
return false
cmpxchg(flags, 2, 0);
(flags = 0)


Here, because we did not do a memory barrier after 
flags = IRQ_WORK_BUSY, CPU 2 saw stale data and did not queue its work,
and missed the opportunity. Now if you had this fix with the xchg() as
you have in your patch, then CPU 2 would not see the stale flags.
Except, with the code I showed above it still can!

CPU 1   CPU 2
-   -
(flags = 0)
cmpxchg(flags, 0, IRQ_WORK_FLAGS)
(flags = 3)
[...]
(fetch flags)
xchg(&flags, IRQ_WORK_BUSY)
(flags = 2)
func()

Re: [PATCH 2/2] irq_work: Fix racy IRQ_WORK_BUSY flag setting

2012-10-30 Thread Frederic Weisbecker
2012/10/30 anish kumar :
> As I understand without the memory barrier proposed by you the situation
> would be as below:
> CPU 0 CPU 1
>
> data = something flags = IRQ_WORK_BUSY
> smp_mb() (implicit with cmpxchg  execute_work (sees data from CPU 0)
>on flags in claim)
> _success_ in claiming and goes
> ahead and execute the work(wrong?)
>  cmpxchg cause flag to IRQ_WORK_BUSY
>
> Now knows the flag==IRQ_WORK_BUSY
>
> Am I right?

(Adding Paul in Cc because I'm again confused with memory barriers)

Actually what I had in mind is rather that CPU 0 fails its claim
because it's not seeing the IRQ_WORK_BUSY flag as it should:


CPU 0 CPU 1

data = something  flags = IRQ_WORK_BUSY
cmpxchg() for claim   execute_work (sees data from CPU 0)

CPU 0 should see IRQ_WORK_BUSY but it may not because CPU 1 sets this
value in a non-atomic way.

Also, while browsing Paul's perfbook, I realize my changelog is buggy.
It seems we can't reliably use memory barriers here because we would
be in the following case:

CPU 0  CPU 1

store(work data)store(flags)
smp_mb()smp_mb()
load(flags)load(work data)

On top of this barrier pairing, we can't make the assumption that, for
example, if CPU 1 sees the work data stored in CPU 0 then CPU 0 sees
the flags stored in CPU 1.

So now I wonder if cmpxchg() can give us more confidence:


CPU 0CPU 1

store(work data)  xchg(flags, IRQ_WORK_BUSY)
cmpxchg(flags, IRQ_WORK_FLAGS)load(work data)

Can I make this assumption?

- If CPU 0 fails the cmpxchg() (which means CPU 1 has not yet xchg())
then CPU 1 will execute the work and see our data.

At least cmpxchg / xchg pair orders correctly to ensure somebody will
execute our work. Now probably some locking is needed from the work
function itself if it's not per cpu.

>
> Probably a stupid question. Why do we return the bool from irq_work_queue
> when no one bothers to check the return value? Wouldn't it be better if
> this function were void, given how its users call it, or am I
> looking at the wrong code.

No idea. Probably Peter had plans there.


Re: [PATCH 2/2] irq_work: Fix racy IRQ_WORK_BUSY flag setting

2012-10-30 Thread Steven Rostedt
On Wed, 2012-10-31 at 03:33 +0900, anish kumar wrote:

> > CPU 0 CPU 1
> > 
> > data = something flags = IRQ_WORK_BUSY
> > smp_mb() (implicit with cmpxchg  smp_mb()
> >   on flags in claim) execute_work (sees data from CPU
> > 0)
> > try to claim
> > 
> As I understand without the memory barrier proposed by you the situation
> would be as below:
> CPU 0 CPU 1
>  
> data = something flags = IRQ_WORK_BUSY
> smp_mb() (implicit with cmpxchg  execute_work (sees data from CPU 0)
>on flags in claim)
> _success_ in claiming and goes

Correct, because it would see the stale value of flags.

> ahead and execute the work(wrong?)
>cmpxchg cause flag to IRQ_WORK_BUSY
> 
> Now knows the flag==IRQ_WORK_BUSY
> 
> Am I right?

right.

> 
> Probably a stupid question. Why do we return the bool from irq_work_queue
> when no one bothers to check the return value? Wouldn't it be better if
> this function were void, given how its users call it, or am I
> looking at the wrong code.

Not a stupid question, as I was asking that to myself just earlier
today. But forgot to mention it as well. Especially, because it makes it
look like there's a bug in the code. Maybe someday someone will care if
their work was finished by itself, or some other CPU.

Probably should just nix the return value.

-- Steve




Re: [PATCH 2/2] irq_work: Fix racy IRQ_WORK_BUSY flag setting

2012-10-30 Thread anish kumar
> The IRQ_WORK_BUSY flag is set right before we execute the
> work. Once this flag value is set, the work enters a
> claimable state again.
> 
> This is necessary because if we want to enqueue a work but we
> fail the claim, we want to ensure that the CPU where that work
> is still pending will see and handle the data we expected the
> work to compute.
> 
> This might not work as expected though because IRQ_WORK_BUSY
> isn't set atomically. By the time a CPU fails a work claim,
> this work may well have been already executed by the CPU where
> it was previously pending.
> 
> Due to the lack of an appropriate memory barrier, the IRQ_WORK_BUSY
> flag value may not be visible to the CPU trying to claim while
> the work is executing, until we clear the busy bit in
> the work flags using cmpxchg(), which implies a full barrier.
> 
> One solution could involve a full barrier between setting
> IRQ_WORK_BUSY flag and the work execution. This way we
> ensure that the work execution site sees the expected data
> and the claim site sees the IRQ_WORK_BUSY:
> 
> CPU 0 CPU 1
> 
> data = something flags = IRQ_WORK_BUSY
> smp_mb() (implicit with cmpxchg  smp_mb()
>   on flags in claim) execute_work (sees data from CPU
> 0)
> try to claim
> 
As I understand without the memory barrier proposed by you the situation
would be as below:
CPU 0                               CPU 1

data = something                    flags = IRQ_WORK_BUSY
smp_mb() (implicit with cmpxchg     execute_work (sees data from CPU 0)
  on flags in claim)
_success_ in claiming and goes
ahead and execute the work (wrong?)
                                    cmpxchg cause flag to IRQ_WORK_BUSY

Now knows the flag==IRQ_WORK_BUSY
Am I right?

Probably a stupid question. Why do we return the bool from irq_work_queue
when no one bothers to check the return value? Wouldn't it be better if
this function were void, as its users treat it, or am I
looking at the wrong code?
> 
> As a shortcut, let's just use xchg() that implies a full memory
> barrier.
> 
> Signed-off-by: Frederic Weisbecker 
> Cc: Peter Zijlstra 
> Cc: Ingo Molnar 
> Cc: Thomas Gleixner 
> Cc: Andrew Morton 
> Cc: Steven Rostedt 
> Cc: Paul Gortmaker 
> ---
>  kernel/irq_work.c |    7 +++++--
>  1 files changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/irq_work.c b/kernel/irq_work.c
> index 764240a..ea79365 100644
> --- a/kernel/irq_work.c
> +++ b/kernel/irq_work.c
> @@ -130,9 +130,12 @@ void irq_work_run(void)
>  
>  		/*
>  		 * Clear the PENDING bit, after this point the @work
> -		 * can be re-used.
> +		 * can be re-used. Use xchg to force ordering against
> +		 * data to process, such that if claiming fails on
> +		 * another CPU, we see and handle the data it wants
> +		 * us to process on the work.
>  		 */
> -		work->flags = IRQ_WORK_BUSY;
> +		xchg(&work->flags, IRQ_WORK_BUSY);
>  		work->func(work);
>  		/*
>  		 * Clear the BUSY bit and return to the free state if
> -- 
> 1.7.5.4
> 


Re: [PATCH 2/2] irq_work: Fix racy IRQ_WORK_BUSY flag setting

2012-10-30 Thread Steven Rostedt
On Tue, 2012-10-30 at 16:35 +0100, Frederic Weisbecker wrote:
> The IRQ_WORK_BUSY flag is set right before we execute the
> work. Once this flag value is set, the work enters a
> claimable state again.
> 
> This is necessary because if we want to enqueue a work but we
> fail the claim, we want to ensure that the CPU where that work
> is still pending will see and handle the data we expected the
> work to compute.
> 
> This might not work as expected though because IRQ_WORK_BUSY
> isn't set atomically. By the time a CPU fails a work claim,
> this work may well have been already executed by the CPU where
> it was previously pending.
> 
> Due to the lack of appropriate memory barrier, the IRQ_WORK_BUSY
> flag value may not be visible by the CPU trying to claim while
> the work is executing, and that until we clear the busy bit in
> the work flags using cmpxchg() that implies the full barrier.
> 
> One solution could involve a full barrier between setting
> IRQ_WORK_BUSY flag and the work execution. This way we
> ensure that the work execution site sees the expected data
> and the claim site sees the IRQ_WORK_BUSY:
> 
> CPU 0                               CPU 1
> 
> data = something                    flags = IRQ_WORK_BUSY
> smp_mb() (implicit with cmpxchg     smp_mb()
>   on flags in claim)                execute_work (sees data from CPU 0)
> try to claim
> 
> As a shortcut, let's just use xchg() that implies a full memory
> barrier.
> 
> Signed-off-by: Frederic Weisbecker 
> Cc: Peter Zijlstra 
> Cc: Ingo Molnar 
> Cc: Thomas Gleixner 
> Cc: Andrew Morton 
> Cc: Steven Rostedt 
> Cc: Paul Gortmaker 

Reviewed-by: Steven Rostedt 

-- Steve

> ---
>  kernel/irq_work.c |    7 +++++--
>  1 files changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/irq_work.c b/kernel/irq_work.c
> index 764240a..ea79365 100644
> --- a/kernel/irq_work.c
> +++ b/kernel/irq_work.c
> @@ -130,9 +130,12 @@ void irq_work_run(void)
>  
>  		/*
>  		 * Clear the PENDING bit, after this point the @work
> -		 * can be re-used.
> +		 * can be re-used. Use xchg to force ordering against
> +		 * data to process, such that if claiming fails on
> +		 * another CPU, we see and handle the data it wants
> +		 * us to process on the work.
>  		 */
> -		work->flags = IRQ_WORK_BUSY;
> +		xchg(&work->flags, IRQ_WORK_BUSY);
>  		work->func(work);
>  		/*
>  		 * Clear the BUSY bit and return to the free state if






Re: [PATCH 2/2] irq_work: Fix racy IRQ_WORK_BUSY flag setting

2012-10-30 Thread Steven Rostedt
On Wed, 2012-10-31 at 03:33 +0900, anish kumar wrote:

  CPU 0                               CPU 1
  
  data = something                    flags = IRQ_WORK_BUSY
  smp_mb() (implicit with cmpxchg     smp_mb()
    on flags in claim)                execute_work (sees data from CPU 0)
  try to claim
  
 As I understand without the memory barrier proposed by you the situation
 would be as below:
 CPU 0                               CPU 1
 
 data = something                    flags = IRQ_WORK_BUSY
 smp_mb() (implicit with cmpxchg     execute_work (sees data from CPU 0)
   on flags in claim)
 _success_ in claiming and goes

Correct, because it would see the stale value of flags.

 ahead and execute the work (wrong?)
                                     cmpxchg cause flag to IRQ_WORK_BUSY
 
 Now knows the flag==IRQ_WORK_BUSY
 
 Am I right?

right.

 
 Probably a stupid question. Why do we return the bool from irq_work_queue
 when no one bothers to check the return value? Wouldn't it be better if
 this function were void, as its users treat it, or am I
 looking at the wrong code?

Not a stupid question, as I was asking that to myself just earlier
today. But I forgot to mention it as well, especially because it makes it
look like there's a bug in the code. Maybe someday someone will care
whether their work was finished by itself or by some other CPU.

Probably should just nix the return value.

-- Steve




Re: [PATCH 2/2] irq_work: Fix racy IRQ_WORK_BUSY flag setting

2012-10-30 Thread Frederic Weisbecker
2012/10/30 anish kumar anish198519851...@gmail.com:
 As I understand without the memory barrier proposed by you the situation
 would be as below:
 CPU 0                               CPU 1

 data = something                    flags = IRQ_WORK_BUSY
 smp_mb() (implicit with cmpxchg     execute_work (sees data from CPU 0)
   on flags in claim)
 _success_ in claiming and goes
 ahead and execute the work (wrong?)
                                     cmpxchg cause flag to IRQ_WORK_BUSY

 Now knows the flag==IRQ_WORK_BUSY

 Am I right?

(Adding Paul in Cc because I'm again confused with memory barriers)

Actually what I had in mind is rather that CPU 0 fails its claim
because it's not seeing the IRQ_WORK_BUSY flag as it should:


CPU 0                 CPU 1

data = something      flags = IRQ_WORK_BUSY
cmpxchg() for claim   execute_work (sees data from CPU 0)

CPU 0 should see IRQ_WORK_BUSY but it may not because CPU 1 sets this
value in a non-atomic way.

Also, while browsing Paul's perfbook, I realize my changelog is buggy.
It seems we can't reliably use memory barriers here because we would
be in the following case:

CPU 0               CPU 1

store(work data)    store(flags)
smp_mb()            smp_mb()
load(flags)         load(work data)

On top of this barrier pairing, we can't make the assumption that, for
example, if CPU 1 sees the work data stored in CPU 0 then CPU 0 sees
the flags stored in CPU 1.

So now I wonder if cmpxchg() can give us more confidence:


CPU 0                           CPU 1

store(work data)                xchg(flags, IRQ_WORK_BUSY)
cmpxchg(flags, IRQ_WORK_FLAGS)  load(work data)

Can I make this assumption?

- If CPU 0 fails the cmpxchg() (which means CPU 1 has not yet xchg())
then CPU 1 will execute the work and see our data.

At least cmpxchg / xchg pair orders correctly to ensure somebody will
execute our work. Now probably some locking is needed from the work
function itself if it's not per cpu.


 Probably a stupid question. Why do we return the bool from irq_work_queue
 when no one bothers to check the return value? Wouldn't it be better if
 this function were void, as its users treat it, or am I
 looking at the wrong code?

No idea. Probably Peter had plans there.


Re: [PATCH 2/2] irq_work: Fix racy IRQ_WORK_BUSY flag setting

2012-10-30 Thread Steven Rostedt
On Wed, 2012-10-31 at 01:36 +0100, Frederic Weisbecker wrote:
 2012/10/30 anish kumar anish198519851...@gmail.com:
  As I understand without the memory barrier proposed by you the situation
  would be as below:
  CPU 0 CPU 1
 
  data = something flags = IRQ_WORK_BUSY
  smp_mb() (implicit with cmpxchg  execute_work (sees data from CPU 0)
 on flags in claim)
  _success_ in claiming and goes
  ahead and execute the work(wrong?)
   cmpxchg cause flag to IRQ_WORK_BUSY
 
  Now knows the flag==IRQ_WORK_BUSY
 
  Am I right?
 
 (Adding Paul in Cc because I'm again confused with memory barriers)
 
 Actually what I had in mind is rather that CPU 0 fails its claim
 because it's not seeing the IRQ_WORK_BUSY flag as it should:
 
 
 CPU 0 CPU 1
 
 data = something  flags = IRQ_WORK_BUSY
 cmpxchg() for claim   execute_work (sees data from CPU 0)
 
 CPU 0 should see IRQ_WORK_BUSY but it may not because CPU 1 sets this
 value in a non-atomic way.
 
 Also, while browsing Paul's perfbook, I realize my changelog is buggy.
 It seems we can't reliably use memory barriers here because we would
 be in the following case:
 
 CPU 0  CPU 1
 
 store(work data)store(flags)
 smp_mb()smp_mb()
 load(flags)load(work data)
 
 On top of this barrier pairing, we can't make the assumption that, for
 example, if CPU 1 sees the work data stored in CPU 0 then CPU 0 sees
 the flags stored in CPU 1.
 
 So now I wonder if cmpxchg() can give us more confidence:

More confidence over what? The xchg()? They are equivalent (wrt memory
barriers).

Here's the issue that currently exists. Let's look at the code:


/*
 * Claim the entry so that no one else will poke at it.
 */
static bool irq_work_claim(struct irq_work *work)
{
	unsigned long flags, nflags;

	for (;;) {
		flags = work->flags;
		if (flags & IRQ_WORK_PENDING)
			return false;
		nflags = flags | IRQ_WORK_FLAGS;
		if (cmpxchg(&work->flags, flags, nflags) == flags)
			break;
		cpu_relax();
	}

	return true;
}

and

	llnode = llist_del_all(this_list);
	while (llnode != NULL) {
		work = llist_entry(llnode, struct irq_work, llnode);

		llnode = llist_next(llnode);

		/*
		 * Clear the PENDING bit, after this point the @work
		 * can be re-used.
		 */
		work->flags = IRQ_WORK_BUSY;
		work->func(work);
		/*
		 * Clear the BUSY bit and return to the free state if
		 * no-one else claimed it meanwhile.
		 */
		(void)cmpxchg(&work->flags, IRQ_WORK_BUSY, 0);
	}

The irq_work_claim() will only queue its work if it's not already
pending. If it is pending, then someone is going to process it anyway.
But once we start running the function, new work needs to be processed
again.

Thus we have:

CPU 1                                   CPU 2
-----                                   -----
(flags = 0)
cmpxchg(flags, 0, IRQ_WORK_FLAGS)
(flags = 3)
[...]

                                        if (flags & IRQ_WORK_PENDING)
                                                return false
flags = IRQ_WORK_BUSY
(flags = 2)
func()

The above shows the normal case where CPU 2 doesn't need to queue work,
because it's going to be done for it by CPU 1. But...



CPU 1                                   CPU 2
-----                                   -----
(flags = 0)
cmpxchg(flags, 0, IRQ_WORK_FLAGS)
(flags = 3)
[...]
flags = IRQ_WORK_BUSY
(flags = 2)
func()
                                        (sees flags = 3)
                                        if (flags & IRQ_WORK_PENDING)
                                                return false
cmpxchg(flags, 2, 0);
(flags = 0)


Here, because we did not do a memory barrier after 
flags = IRQ_WORK_BUSY, CPU 2 saw stale data and did not queue its work,
and missed the opportunity. Now if you had this fix with the xchg() as
you have in your patch, then CPU 2 would not see the stale flags.
Except, with the code I showed above it still can!

CPU 1                                   CPU 2
-----                                   -----
(flags = 0)
cmpxchg(flags, 0, IRQ_WORK_FLAGS)
(flags = 3)
[...]
                                        (fetch flags)
xchg(flags, IRQ_WORK_BUSY)
(flags = 2)
func()
 

Re: [PATCH 2/2] irq_work: Fix racy IRQ_WORK_BUSY flag setting

2012-10-30 Thread Frederic Weisbecker
2012/10/31 Steven Rostedt rost...@goodmis.org:
 More confidence over what? The xchg()? They are equivalent (wrt memory
 barriers).

 Here's the issue that currently exists. Let's look at the code:


 /*
  * Claim the entry so that no one else will poke at it.
  */
 static bool irq_work_claim(struct irq_work *work)
 {
 	unsigned long flags, nflags;
 
 	for (;;) {
 		flags = work->flags;
 		if (flags & IRQ_WORK_PENDING)
 			return false;
 		nflags = flags | IRQ_WORK_FLAGS;
 		if (cmpxchg(&work->flags, flags, nflags) == flags)
 			break;
 		cpu_relax();
 	}
 
 	return true;
 }

 and

 	llnode = llist_del_all(this_list);
 	while (llnode != NULL) {
 		work = llist_entry(llnode, struct irq_work, llnode);
 
 		llnode = llist_next(llnode);
 
 		/*
 		 * Clear the PENDING bit, after this point the @work
 		 * can be re-used.
 		 */
 		work->flags = IRQ_WORK_BUSY;
 		work->func(work);
 		/*
 		 * Clear the BUSY bit and return to the free state if
 		 * no-one else claimed it meanwhile.
 		 */
 		(void)cmpxchg(&work->flags, IRQ_WORK_BUSY, 0);
 	}

 The irq_work_claim() will only queue its work if it's not already
 pending. If it is pending, then someone is going to process it anyway.
 But once we start running the function, new work needs to be processed
 again.

 Thus we have:

 CPU 1                                  CPU 2
 -----                                  -----
 (flags = 0)
 cmpxchg(flags, 0, IRQ_WORK_FLAGS)
 (flags = 3)
 [...]
 
                                        if (flags & IRQ_WORK_PENDING)
                                                return false
 flags = IRQ_WORK_BUSY
 (flags = 2)
 func()

 The above shows the normal case where CPU 2 doesn't need to queue work,
 because it's going to be done for it by CPU 1. But...



 CPU 1                                  CPU 2
 -----                                  -----
 (flags = 0)
 cmpxchg(flags, 0, IRQ_WORK_FLAGS)
 (flags = 3)
 [...]
 flags = IRQ_WORK_BUSY
 (flags = 2)
 func()
                                        (sees flags = 3)
                                        if (flags & IRQ_WORK_PENDING)
                                                return false
 cmpxchg(flags, 2, 0);
 (flags = 0)


 Here, because we did not do a memory barrier after
 flags = IRQ_WORK_BUSY, CPU 2 saw stale data and did not queue its work,
 and missed the opportunity. Now if you had this fix with the xchg() as
 you have in your patch, then CPU 2 would not see the stale flags.
 Except, with the code I showed above it still can!

 CPU 1                                  CPU 2
 -----                                  -----
 (flags = 0)
 cmpxchg(flags, 0, IRQ_WORK_FLAGS)
 (flags = 3)
 [...]
                                        (fetch flags)
 xchg(flags, IRQ_WORK_BUSY)
 (flags = 2)
 func()
                                        (sees flags = 3)
                                        if (flags & IRQ_WORK_PENDING)
                                                return false
 cmpxchg(flags, 2, 0);
 (flags = 0)


 Even with the update of xchg(), if CPU2 fetched the flags before CPU1
 did the xchg, then it would still lose out. But that's where your
 previous patch comes in that does:

	flags = work->flags & ~IRQ_WORK_PENDING;
	for (;;) {
		nflags = flags | IRQ_WORK_FLAGS;
		oflags = cmpxchg(&work->flags, flags, nflags);
		if (oflags == flags)
			break;
		if (oflags & IRQ_WORK_PENDING)
			return false;
		flags = oflags;
		cpu_relax();
	}


 This now does:

 CPU 1                                  CPU 2
 -----                                  -----
 (flags = 0)
 cmpxchg(flags, 0, IRQ_WORK_FLAGS)
 (flags = 3)
 [...]
 xchg(flags, IRQ_WORK_BUSY)
 (flags = 2)
 func()
                                        oflags = cmpxchg(&work->flags, flags, nflags);
                                        (sees flags = 2)
                                        if (flags & IRQ_WORK_PENDING)
                                        (not true)
                                        (loop)
 cmpxchg(flags, 2, 0);
 (flags = 2)
 flags =