subject:"\[PATCH\] PHYLIB\: IRQ event workqueue handling fixes"

Re: [PATCH] flush_work_sync vs. flush_scheduled_work Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-23 Thread Jarek Poplawski

On Mon, Oct 22, 2007 at 10:02:59PM +0400, Oleg Nesterov wrote:
...
> If this work doesn't rearm itself - yes. (otherwise, the same ->func
> can run twice _at the same time_)
> 
> But again, in this case wait_on_work() after try_to_grab_pending() == 1
> doesn't block, so we can just do
> 
>   if (cancel_work_sync(w))
>   w->func();
> 

...but, if it were run just before work_clear_pending()?

Jarek P.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] flush_work_sync vs. flush_scheduled_work Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-23 Thread Jarek Poplawski

On Mon, Oct 22, 2007 at 10:02:59PM +0400, Oleg Nesterov wrote:
> On 10/22, Jarek Poplawski wrote:
...
> > OK, I know I'm dumber and dumber everyday,
> 
> You are not alone. I have the same feeling about myself!

Feeling is not the same, only true knowledge counts!

> 
> > these all flushes are
> > rtnl lockup vulnerable wrt. other work functions, but cancel_work_sync
> > looks perfectly fine
> 
> Yes,
> 
> > Then, if by any chance I'm right, something like flush_work_sync
> > (or changed flush_scheduled_work, if there is no problem with such
> > a change of implementation) could be safely (if it's called without
> > locks used by flushed work only) done cancel_work_sync() way, by
> > running a work function after try_to_grab_pending() returns 1
> 
> If this work doesn't rearm itself - yes. (otherwise, the same ->func
> can run twice _at the same time_)
> 
> But again, in this case wait_on_work() after try_to_grab_pending() == 1
> doesn't block, so we can just do
> 
>   if (cancel_work_sync(w))
>   w->func();

Of course, I should have written it much shorter by saying
flush_scheduled_work could be done internally just like you suggested!

My point is to make this all safer and simpler, so we don't have to
remember each time about these differences wrt. locking. Then checking
of such patches could be much easier. Unless this current behavior
of flush_scheduled_work has any real advantages, worth of this
potential unsafety. (Btw. is there any reason to use this with
rearming works, anyway?)

Thanks,
Jarek P.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] flush_work_sync vs. flush_scheduled_work Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-23 Thread Jarek Poplawski

On Mon, Oct 22, 2007 at 10:02:59PM +0400, Oleg Nesterov wrote:
 On 10/22, Jarek Poplawski wrote:
...
  OK, I know I'm dumber and dumber everyday,
 
 You are not alone. I have the same feeling about myself!

Feeling is not the same, only true knowledge counts!

 
  these all flushes are
  rtnl lockup vulnerable wrt. other work functions, but cancel_work_sync
  looks perfectly fine
 
 Yes,
 
  Then, if by any chance I'm right, something like flush_work_sync
  (or changed flush_scheduled_work, if there is no problem with such
  a change of implementation) could be safely (if it's called without
  locks used by flushed work only) done cancel_work_sync() way, by
  running a work function after try_to_grab_pending() returns 1
 
 If this work doesn't rearm itself - yes. (otherwise, the same -func
 can run twice _at the same time_)
 
 But again, in this case wait_on_work() after try_to_grab_pending() == 1
 doesn't block, so we can just do
 
   if (cancel_work_sync(w))
   w-func();

Of course, I should have written it much shorter by saying
flush_scheduled_work could be done internally just like you suggested!

My point is to make this all safer and simpler, so we don't have to
remember each time about these differences wrt. locking. Then checking
of such patches could be much easier. Unless this current behavior
of flush_scheduled_work has any real advantages, worth of this
potential unsafety. (Btw. is there any reason to use this with
rearming works, anyway?)

Thanks,
Jarek P.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] flush_work_sync vs. flush_scheduled_work Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-23 Thread Jarek Poplawski

On Mon, Oct 22, 2007 at 10:02:59PM +0400, Oleg Nesterov wrote:
...
 If this work doesn't rearm itself - yes. (otherwise, the same -func
 can run twice _at the same time_)
 
 But again, in this case wait_on_work() after try_to_grab_pending() == 1
 doesn't block, so we can just do
 
   if (cancel_work_sync(w))
   w-func();
 

...but, if it were run just before work_clear_pending()?

Jarek P.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] flush_work_sync vs. flush_scheduled_work Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-22 Thread Oleg Nesterov

On 10/22, Jarek Poplawski wrote:
>
> On Fri, Oct 19, 2007 at 09:50:14AM +0200, Jarek Poplawski wrote:
> > On Thu, Oct 18, 2007 at 07:48:19PM +0400, Oleg Nesterov wrote:
> > > On 10/18, Jarek Poplawski wrote:
> > > >
> > > > +/**
> > > > + * flush_work_sync - block until a work_struct's callback has 
> > > > terminated
> > > 
> > > ^^^
> > > Hmm...
> > > 
> > > > + * Similar to cancel_work_sync() but will only busy wait (without 
> > > > cancel)
> > > > + * if the work is queued.
> > > 
> > > Yes, it won't block, but will spin in busy-wait loop until all other works
> > > scheduled before this work are finished. Not good. After that it really
> > > blocks waiting for this work to complete.
> > > 
> > > And I am a bit confused. We can't use flush_workqueue() because some of 
> > > the
> > > queued work_structs may take rtnl_lock, yes? But in that case we can't use
> > > the new flush_work_sync() helper as well, no?
> 
> OK, I know I'm dumber and dumber everyday,

You are not alone. I have the same feeling about myself!

> these all flushes are
> rtnl lockup vulnerable wrt. other work functions, but cancel_work_sync
> looks perfectly fine

Yes,

> Then, if by any chance I'm right, something like flush_work_sync
> (or changed flush_scheduled_work, if there is no problem with such
> a change of implementation) could be safely (if it's called without
> locks used by flushed work only) done cancel_work_sync() way, by
> running a work function after try_to_grab_pending() returns 1

If this work doesn't rearm itself - yes. (otherwise, the same ->func
can run twice _at the same time_)

But again, in this case wait_on_work() after try_to_grab_pending() == 1
doesn't block, so we can just do

if (cancel_work_sync(w))
w->func();

Oleg.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] flush_work_sync vs. flush_scheduled_work Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-22 Thread Jarek Poplawski

On Fri, Oct 19, 2007 at 09:50:14AM +0200, Jarek Poplawski wrote:
> On Thu, Oct 18, 2007 at 07:48:19PM +0400, Oleg Nesterov wrote:
> > On 10/18, Jarek Poplawski wrote:
> > >
> > > +/**
> > > + * flush_work_sync - block until a work_struct's callback has terminated
> > ^^^
> > Hmm...
> > 
> > > + * Similar to cancel_work_sync() but will only busy wait (without cancel)
> > > + * if the work is queued.
> > 
> > Yes, it won't block, but will spin in busy-wait loop until all other works
> > scheduled before this work are finished. Not good. After that it really
> > blocks waiting for this work to complete.
> > 
> > And I am a bit confused. We can't use flush_workqueue() because some of the
> > queued work_structs may take rtnl_lock, yes? But in that case we can't use
> > the new flush_work_sync() helper as well, no?

OK, I know I'm dumber and dumber everyday, but it seems in a hurry I
got it wrong again or miss something (as usual): these all flushes are
rtnl lockup vulnerable wrt. other work functions, but cancel_work_sync
looks perfectly fine... (Or am I wrong because: ...?)

Then, if by any chance I'm right, something like flush_work_sync
(or changed flush_scheduled_work, if there is no problem with such
a change of implementation) could be safely (if it's called without
locks used by flushed work only) done cancel_work_sync() way, by
running a work function after try_to_grab_pending() returns 1 (after
list_del_init - of course without respecting a queue order).

Regards,
Jarek P.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] flush_work_sync vs. flush_scheduled_work Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-22 Thread Jarek Poplawski

On Fri, Oct 19, 2007 at 09:50:14AM +0200, Jarek Poplawski wrote:
 On Thu, Oct 18, 2007 at 07:48:19PM +0400, Oleg Nesterov wrote:
  On 10/18, Jarek Poplawski wrote:
  
   +/**
   + * flush_work_sync - block until a work_struct's callback has terminated
  ^^^
  Hmm...
  
   + * Similar to cancel_work_sync() but will only busy wait (without cancel)
   + * if the work is queued.
  
  Yes, it won't block, but will spin in busy-wait loop until all other works
  scheduled before this work are finished. Not good. After that it really
  blocks waiting for this work to complete.
  
  And I am a bit confused. We can't use flush_workqueue() because some of the
  queued work_structs may take rtnl_lock, yes? But in that case we can't use
  the new flush_work_sync() helper as well, no?

OK, I know I'm dumber and dumber everyday, but it seems in a hurry I
got it wrong again or miss something (as usual): these all flushes are
rtnl lockup vulnerable wrt. other work functions, but cancel_work_sync
looks perfectly fine... (Or am I wrong because: ...?)

Then, if by any chance I'm right, something like flush_work_sync
(or changed flush_scheduled_work, if there is no problem with such
a change of implementation) could be safely (if it's called without
locks used by flushed work only) done cancel_work_sync() way, by
running a work function after try_to_grab_pending() returns 1 (after
list_del_init - of course without respecting a queue order).

Regards,
Jarek P.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] flush_work_sync vs. flush_scheduled_work Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-22 Thread Oleg Nesterov

On 10/22, Jarek Poplawski wrote:

 On Fri, Oct 19, 2007 at 09:50:14AM +0200, Jarek Poplawski wrote:
  On Thu, Oct 18, 2007 at 07:48:19PM +0400, Oleg Nesterov wrote:
   On 10/18, Jarek Poplawski wrote:
   
+/**
+ * flush_work_sync - block until a work_struct's callback has 
terminated
   
   ^^^
   Hmm...
   
+ * Similar to cancel_work_sync() but will only busy wait (without 
cancel)
+ * if the work is queued.
   
   Yes, it won't block, but will spin in busy-wait loop until all other works
   scheduled before this work are finished. Not good. After that it really
   blocks waiting for this work to complete.
   
   And I am a bit confused. We can't use flush_workqueue() because some of 
   the
   queued work_structs may take rtnl_lock, yes? But in that case we can't use
   the new flush_work_sync() helper as well, no?
 
 OK, I know I'm dumber and dumber everyday,

You are not alone. I have the same feeling about myself!

 these all flushes are
 rtnl lockup vulnerable wrt. other work functions, but cancel_work_sync
 looks perfectly fine

Yes,

 Then, if by any chance I'm right, something like flush_work_sync
 (or changed flush_scheduled_work, if there is no problem with such
 a change of implementation) could be safely (if it's called without
 locks used by flushed work only) done cancel_work_sync() way, by
 running a work function after try_to_grab_pending() returns 1

If this work doesn't rearm itself - yes. (otherwise, the same -func
can run twice _at the same time_)

But again, in this case wait_on_work() after try_to_grab_pending() == 1
doesn't block, so we can just do

if (cancel_work_sync(w))
w-func();

Oleg.

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-19 Thread Benjamin Herrenschmidt


> Actually I'm not convinced with this explanation. It seems to me that
> since there are such serious locking problems (especially with rntl),
> there could be once more considered a private workqueue. You've
> written earlier about being a lonely user of this code. But, since
> Benjamin offered his help with changing to mutexes, which looks like
> very reasonable idea to me (probably I miss most of the points...),
> maybe it's very good opportunity to both: make this code better and
> double the user base! I'm interested in looking for such solution
> if Benjamin thinks there could be too few problems for him... So,
> let somebody tell us what could be wrong with this idea?

My main problem is time :-) But I need to do the change to mutex if I am
to use phylib for emac. I can't tell when I'll have time to do it tho.

Ben.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-19 Thread Maciej W. Rozycki

On Fri, 19 Oct 2007, Jarek Poplawski wrote:

> Actually I'm not convinced with this explanation. It seems to me that
> since there are such serious locking problems (especially with rntl),
> there could be once more considered a private workqueue. You've
> written earlier about being a lonely user of this code. But, since

 Well, this will change eventually when I submit the patch for Broadcom 
SiByte platforms so that sb1250-mac.c will be able to request an interrupt 
for the PHYs.  All the infrastructure is in place except from platform 
code to configure the SOC's GPIO line used for the interrupt input (when 
applicable) and let the driver know what the line actually is.

> Benjamin offered his help with changing to mutexes, which looks like
> very reasonable idea to me (probably I miss most of the points...),
> maybe it's very good opportunity to both: make this code better and
> double the user base! I'm interested in looking for such solution
> if Benjamin thinks there could be too few problems for him... So,
> let somebody tell us what could be wrong with this idea?

 I do not object and can certainly cooperate, but cannot make it a high 
priority work for me at the moment -- sorry.

  Maciej
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-19 Thread Jarek Poplawski

On Fri, Oct 19, 2007 at 12:38:29PM +0100, Maciej W. Rozycki wrote:
> On Thu, 18 Oct 2007, Maciej W. Rozycki wrote:
> 
> > > 1) phy_change() checks PHY_HALTED flag without lock; I think it's
> > > racy: eg. if it's done during phy_stop() it can check just before
> > > the flag is set and reenable interrupts just after phy_stop() ends.
> > 
> >  I remember having a look into it, but it was long ago and I cannot 
> > immediately recall the conclusion.  Which means it is either broken or 
> > deserves a comment as non-obvious.  I will have a look into it again, but 
> > I am resource-starved a little at the moment, sorry.
> 
>  Well, I have now recalled what the issue is -- we just plainly and simply 
> want to avoid a hardirq spinlock for the very reason we do not do all the 
> processing in the hardirq handler.  The thing is we make accesses to the 
> MDIO bus with the phydev lock held and it may take ages until these 
> accesses will have completed.  And we cannot afford keeping interrupts 
> disabled for so long.
> 
>  So the only way is to make the check for the HALTED state lockless and 
> make sure any race condition is handled gracefully and does not lead to 
> inconsistent behaviour.  Which I think as of what we have in the 
> net-2.6.24 tree is the case, but there are never too many eyes to look at 
> a piece of code, so if anybody feels like proving me wrong, then just go 
> ahead!

Actually I'm not convinced with this explanation. It seems to me that
since there are such serious locking problems (especially with rntl),
there could be once more considered a private workqueue. You've
written earlier about being a lonely user of this code. But, since
Benjamin offered his help with changing to mutexes, which looks like
very reasonable idea to me (probably I miss most of the points...),
maybe it's very good opportunity to both: make this code better and
double the user base! I'm interested in looking for such solution
if Benjamin thinks there could be too few problems for him... So,
let somebody tell us what could be wrong with this idea?

Cheers (till Monday),
Jarek P.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-19 Thread Maciej W. Rozycki

On Fri, 19 Oct 2007, Jarek Poplawski wrote:

> But then... your patch seems to make it possible, because it enables
> irq to the initial state of the counter. Of course, this could happen
> on closing only.

 That's because free_irq() does not disable the interrupt in the correct 
manner.  The scenario is more or less like this:

phy_interrupt() [depth == 0]
disable_irq()
depth++; status |= IRQ_DISABLED;
...
free_irq()  [depth == 1]
status |= IRQ_DISABLED;
...
phy_change()[depth == 1]
enable_irq()
depth--; status &= ~IRQ_DISABLED;
oops!

Now if free_irq() correctly incremented the depth counter, then the last
enable_irq() would still decrement it, but with its initial value of 2 it 
would not change the status to reenable the line.

  Maciej
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-19 Thread Maciej W. Rozycki

On Thu, 18 Oct 2007, Maciej W. Rozycki wrote:

> > 1) phy_change() checks PHY_HALTED flag without lock; I think it's
> > racy: eg. if it's done during phy_stop() it can check just before
> > the flag is set and reenable interrupts just after phy_stop() ends.
> 
>  I remember having a look into it, but it was long ago and I cannot 
> immediately recall the conclusion.  Which means it is either broken or 
> deserves a comment as non-obvious.  I will have a look into it again, but 
> I am resource-starved a little at the moment, sorry.

 Well, I have now recalled what the issue is -- we just plainly and simply 
want to avoid a hardirq spinlock for the very reason we do not do all the 
processing in the hardirq handler.  The thing is we make accesses to the 
MDIO bus with the phydev lock held and it may take ages until these 
accesses will have completed.  And we cannot afford keeping interrupts 
disabled for so long.

 So the only way is to make the check for the HALTED state lockless and 
make sure any race condition is handled gracefully and does not lead to 
inconsistent behaviour.  Which I think as of what we have in the 
net-2.6.24 tree is the case, but there are never too many eyes to look at 
a piece of code, so if anybody feels like proving me wrong, then just go 
ahead!

  Maciej
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-19 Thread Jarek Poplawski

On Thu, Oct 18, 2007 at 12:30:35PM +0100, Maciej W. Rozycki wrote:
> On Wed, 17 Oct 2007, Jarek Poplawski wrote:
...
> > 2) phy_change() doesn't reenable irq line after it sees returns
> > with errors; IMHO it should at least write some warning, but maybe
> > try some safety plan, so enable_irq() and try to disable interrupts
> > and free_irq() on the next call (if it happens). (But, I can be very
> > wrong with this - maybe it's OK and official way.)
> 
>  No way to do this safely -- at this point the device probably still has 
> its interrupt output asserted and the register to clear it is 
> inaccessible, so enabling the line will enter an infinite loop.  At this 
> point the system is no longer stable, so it is better to keep at least 
> some functionality, so that it may be attempted to be shut down cleanly, 
> rather than make it completely irresponsive.  The alternative is panic().

But then... your patch seems to make it possible, because it enables
irq to the initial state of the counter. Of course, this could happen
on closing only.

Jarek P.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] flush_work_sync vs. flush_scheduled_work Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-19 Thread Johannes Berg

On Thu, 2007-10-18 at 19:48 +0400, Oleg Nesterov wrote:

> > +void flush_work_sync(struct work_struct *work)

> If we really the new helper, perhaps we can make it a bit better?
> 
> 1. Modify insert_work() to take the "struct list_head *at" parameter instead
>of "int tail". I think this patch will also cleanup the code a bit, and
>shrink a couple of bytes from .text
> 
> 2. flush_work_sync() inserts a barrier right after this work and blocks.
>We still need some retry logic to handle the queueing is in progress
>of course, but we won't spin waiting for the other works.

3. Add lockdep annotation like the other API. :) Andrew just sent my
patch (used to be two patches by somebody's request but that's fine)
titled "workqueue: debug flushing deadlocks with lockdep" to Linus.

johannes


signature.asc
Description: This is a digitally signed message part

Re: [PATCH] flush_work_sync vs. flush_scheduled_work Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-19 Thread Jarek Poplawski

On Fri, Oct 19, 2007 at 09:50:14AM +0200, Jarek Poplawski wrote:
...
> sched_work_sync() with rtnl_lock(). It's only less probable to lockup
> with this than with flush_schedule_work().

...But, not much less...

Jarek P.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] flush_work_sync vs. flush_scheduled_work Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-19 Thread Jarek Poplawski

On Thu, Oct 18, 2007 at 07:48:19PM +0400, Oleg Nesterov wrote:
> On 10/18, Jarek Poplawski wrote:
> >
> > +/**
> > + * flush_work_sync - block until a work_struct's callback has terminated
> ^^^
> Hmm...
> 
> > + * Similar to cancel_work_sync() but will only busy wait (without cancel)
> > + * if the work is queued.
> 
> Yes, it won't block, but will spin in busy-wait loop until all other works
> scheduled before this work are finished. Not good. After that it really
> blocks waiting for this work to complete.
> 
> And I am a bit confused. We can't use flush_workqueue() because some of the
> queued work_structs may take rtnl_lock, yes? But in that case we can't use
> the new flush_work_sync() helper as well, no?

OOOPS!

Of course, we can't!!! I remembered there was this issue long time
ago, but then I've had some break in tracking net & workqueue. So,
while reading this patch I was alarmed at first, and self-misled
later. I think, there is definitely needed some warning about
locking (or unlocking) during these flush_ & cancel_ functions.
(Btw, I've very much wondered now, why I didn't notice at that 'old'
time, that you added such a great feature (wrt. locking) and I even
didn't notice this...). 

So, Maciej (and other readers of this thread) - I withdraw my false
opinion from my second message here: it's very wrong to call this
sched_work_sync() with rtnl_lock(). It's only less probable to lockup
with this than with flush_schedule_work().

> 
> If we can't just cancel the work, can't we do something like
> 
>   if (cancel_work_sync(w))
>   w->func(w);
> 
> instead?
> 
> > +void flush_work_sync(struct work_struct *work)
> > +{
> > +   int ret;
> > +
> > +   do {
> > +   ret = work_pending(work);
> > +   wait_on_work(work);
> > +   if (ret)
> > +   cpu_relax();
> > +   } while (ret);
> > +}
> 
> If we really the new helper, perhaps we can make it a bit better?
> 
> 1. Modify insert_work() to take the "struct list_head *at" parameter instead
>of "int tail". I think this patch will also cleanup the code a bit, and
>shrink a couple of bytes from .text

Looks like a very good idea, but I need more time to rethink this.
Probably some code example should be helpful.

> 
> 2. flush_work_sync() inserts a barrier right after this work and blocks.
>We still need some retry logic to handle the queueing is in progress
>of course, but we won't spin waiting for the other works.

Until monday I should have an opinion on that (today a bit under
fire...).

> 
> What do you think?

Since there is no gain wrt. locking with my current proposal, I
withdraw this patch of course.

It looks like my wrong patch was great idea because we got this very
precious Oleg's opinion! (I know I'm a genius sometimes...)

Thanks very much,
Jarek P.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-19 Thread Jarek Poplawski

On Fri, Oct 19, 2007 at 12:38:29PM +0100, Maciej W. Rozycki wrote:
 On Thu, 18 Oct 2007, Maciej W. Rozycki wrote:
 
   1) phy_change() checks PHY_HALTED flag without lock; I think it's
   racy: eg. if it's done during phy_stop() it can check just before
   the flag is set and reenable interrupts just after phy_stop() ends.
  
   I remember having a look into it, but it was long ago and I cannot 
  immediately recall the conclusion.  Which means it is either broken or 
  deserves a comment as non-obvious.  I will have a look into it again, but 
  I am resource-starved a little at the moment, sorry.
 
  Well, I have now recalled what the issue is -- we just plainly and simply 
 want to avoid a hardirq spinlock for the very reason we do not do all the 
 processing in the hardirq handler.  The thing is we make accesses to the 
 MDIO bus with the phydev lock held and it may take ages until these 
 accesses will have completed.  And we cannot afford keeping interrupts 
 disabled for so long.
 
  So the only way is to make the check for the HALTED state lockless and 
 make sure any race condition is handled gracefully and does not lead to 
 inconsistent behaviour.  Which I think as of what we have in the 
 net-2.6.24 tree is the case, but there are never too many eyes to look at 
 a piece of code, so if anybody feels like proving me wrong, then just go 
 ahead!

Actually I'm not convinced with this explanation. It seems to me that
since there are such serious locking problems (especially with rntl),
there could be once more considered a private workqueue. You've
written earlier about being a lonely user of this code. But, since
Benjamin offered his help with changing to mutexes, which looks like
very reasonable idea to me (probably I miss most of the points...),
maybe it's very good opportunity to both: make this code better and
double the user base! I'm interested in looking for such solution
if Benjamin thinks there could be too few problems for him... So,
let somebody tell us what could be wrong with this idea?

Cheers (till Monday),
Jarek P.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-19 Thread Benjamin Herrenschmidt


 Actually I'm not convinced with this explanation. It seems to me that
 since there are such serious locking problems (especially with rntl),
 there could be once more considered a private workqueue. You've
 written earlier about being a lonely user of this code. But, since
 Benjamin offered his help with changing to mutexes, which looks like
 very reasonable idea to me (probably I miss most of the points...),
 maybe it's very good opportunity to both: make this code better and
 double the user base! I'm interested in looking for such solution
 if Benjamin thinks there could be too few problems for him... So,
 let somebody tell us what could be wrong with this idea?

My main problem is time :-) But I need to do the change to mutex if I am
to use phylib for emac. I can't tell when I'll have time to do it tho.

Ben.

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-19 Thread Maciej W. Rozycki

On Fri, 19 Oct 2007, Jarek Poplawski wrote:

 Actually I'm not convinced with this explanation. It seems to me that
 since there are such serious locking problems (especially with rntl),
 there could be once more considered a private workqueue. You've
 written earlier about being a lonely user of this code. But, since

 Well, this will change eventually when I submit the patch for Broadcom 
SiByte platforms so that sb1250-mac.c will be able to request an interrupt 
for the PHYs.  All the infrastructure is in place except from platform 
code to configure the SOC's GPIO line used for the interrupt input (when 
applicable) and let the driver know what the line actually is.

 Benjamin offered his help with changing to mutexes, which looks like
 very reasonable idea to me (probably I miss most of the points...),
 maybe it's very good opportunity to both: make this code better and
 double the user base! I'm interested in looking for such solution
 if Benjamin thinks there could be too few problems for him... So,
 let somebody tell us what could be wrong with this idea?

 I do not object and can certainly cooperate, but cannot make it a high 
priority work for me at the moment -- sorry.

  Maciej
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-19 Thread Maciej W. Rozycki

On Thu, 18 Oct 2007, Maciej W. Rozycki wrote:

  1) phy_change() checks PHY_HALTED flag without lock; I think it's
  racy: eg. if it's done during phy_stop() it can check just before
  the flag is set and reenable interrupts just after phy_stop() ends.
 
  I remember having a look into it, but it was long ago and I cannot 
 immediately recall the conclusion.  Which means it is either broken or 
 deserves a comment as non-obvious.  I will have a look into it again, but 
 I am resource-starved a little at the moment, sorry.

 Well, I have now recalled what the issue is -- we just plainly and simply 
want to avoid a hardirq spinlock for the very reason we do not do all the 
processing in the hardirq handler.  The thing is we make accesses to the 
MDIO bus with the phydev lock held and it may take ages until these 
accesses will have completed.  And we cannot afford keeping interrupts 
disabled for so long.

 So the only way is to make the check for the HALTED state lockless and 
make sure any race condition is handled gracefully and does not lead to 
inconsistent behaviour.  Which I think as of what we have in the 
net-2.6.24 tree is the case, but there are never too many eyes to look at 
a piece of code, so if anybody feels like proving me wrong, then just go 
ahead!

  Maciej
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-19 Thread Jarek Poplawski

On Thu, Oct 18, 2007 at 12:30:35PM +0100, Maciej W. Rozycki wrote:
 On Wed, 17 Oct 2007, Jarek Poplawski wrote:
...
  2) phy_change() doesn't reenable irq line after it sees returns
  with errors; IMHO it should at least write some warning, but maybe
  try some safety plan, so enable_irq() and try to disable interrupts
  and free_irq() on the next call (if it happens). (But, I can be very
  wrong with this - maybe it's OK and official way.)
 
  No way to do this safely -- at this point the device probably still has 
 its interrupt output asserted and the register to clear it is 
 inaccessible, so enabling the line will enter an infinite loop.  At this 
 point the system is no longer stable, so it is better to keep at least 
 some functionality, so that it may be attempted to be shut down cleanly, 
 rather than make it completely irresponsive.  The alternative is panic().

But then... your patch seems to make it possible, because it enables
irq to the initial state of the counter. Of course, this could happen
on closing only.

Jarek P.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-19 Thread Maciej W. Rozycki

On Fri, 19 Oct 2007, Jarek Poplawski wrote:

 But then... your patch seems to make it possible, because it enables
 irq to the initial state of the counter. Of course, this could happen
 on closing only.

 That's because free_irq() does not disable the interrupt in the correct 
manner.  The scenario is more or less like this:

phy_interrupt() [depth == 0]
disable_irq()
depth++; status |= IRQ_DISABLED;
...
free_irq()  [depth == 1]
status |= IRQ_DISABLED;
...
phy_change()[depth == 1]
enable_irq()
depth--; status = ~IRQ_DISABLED;
oops!

Now if free_irq() correctly incremented the depth counter, then the last
enable_irq() would still decrement it, but with its initial value of 2 it 
would not change the status to reenable the line.

  Maciej
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] flush_work_sync vs. flush_scheduled_work Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-19 Thread Johannes Berg

On Thu, 2007-10-18 at 19:48 +0400, Oleg Nesterov wrote:

  +void flush_work_sync(struct work_struct *work)

 If we really the new helper, perhaps we can make it a bit better?
 
 1. Modify insert_work() to take the struct list_head *at parameter instead
of int tail. I think this patch will also cleanup the code a bit, and
shrink a couple of bytes from .text
 
 2. flush_work_sync() inserts a barrier right after this work and blocks.
We still need some retry logic to handle the queueing is in progress
of course, but we won't spin waiting for the other works.

3. Add lockdep annotation like the other API. :) Andrew just sent my
patch (used to be two patches by somebody's request but that's fine)
titled workqueue: debug flushing deadlocks with lockdep to Linus.

johannes


signature.asc
Description: This is a digitally signed message part

Re: [PATCH] flush_work_sync vs. flush_scheduled_work Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-19 Thread Jarek Poplawski

On Fri, Oct 19, 2007 at 09:50:14AM +0200, Jarek Poplawski wrote:
...
 sched_work_sync() with rtnl_lock(). It's only less probable to lockup
 with this than with flush_schedule_work().

...But, not much less...

Jarek P.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] flush_work_sync vs. flush_scheduled_work Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-19 Thread Jarek Poplawski

On Thu, Oct 18, 2007 at 07:48:19PM +0400, Oleg Nesterov wrote:
 On 10/18, Jarek Poplawski wrote:
 
  +/**
  + * flush_work_sync - block until a work_struct's callback has terminated
 ^^^
 Hmm...
 
  + * Similar to cancel_work_sync() but will only busy wait (without cancel)
  + * if the work is queued.
 
 Yes, it won't block, but will spin in busy-wait loop until all other works
 scheduled before this work are finished. Not good. After that it really
 blocks waiting for this work to complete.
 
 And I am a bit confused. We can't use flush_workqueue() because some of the
 queued work_structs may take rtnl_lock, yes? But in that case we can't use
 the new flush_work_sync() helper as well, no?

OOOPS!

Of course, we can't!!! I remembered there was this issue long time
ago, but then I've had some break in tracking net  workqueue. So,
while reading this patch I was alarmed at first, and self-misled
later. I think, there is definitely needed some warning about
locking (or unlocking) during these flush_  cancel_ functions.
(Btw, I've very much wondered now, why I didn't notice at that 'old'
time, that you added such a great feature (wrt. locking) and I even
didn't notice this...). 

So, Maciej (and other readers of this thread) - I withdraw my false
opinion from my second message here: it's very wrong to call this
sched_work_sync() with rtnl_lock(). It's only less probable to lockup
with this than with flush_schedule_work().

 
 If we can't just cancel the work, can't we do something like
 
   if (cancel_work_sync(w))
   w-func(w);
 
 instead?
 
  +void flush_work_sync(struct work_struct *work)
  +{
  +   int ret;
  +
  +   do {
  +   ret = work_pending(work);
  +   wait_on_work(work);
  +   if (ret)
  +   cpu_relax();
  +   } while (ret);
  +}
 
 If we really the new helper, perhaps we can make it a bit better?
 
 1. Modify insert_work() to take the struct list_head *at parameter instead
of int tail. I think this patch will also cleanup the code a bit, and
shrink a couple of bytes from .text

Looks like a very good idea, but I need more time to rethink this.
Probably some code example should be helpful.

 
 2. flush_work_sync() inserts a barrier right after this work and blocks.
We still need some retry logic to handle the queueing is in progress
of course, but we won't spin waiting for the other works.

Until monday I should have an opinion on that (today a bit under
fire...).

 
 What do you think?

Since there is no gain wrt. locking with my current proposal, I
withdraw this patch of course.

It looks like my wrong patch was great idea because we got this very
precious Oleg's opinion! (I know I'm a genius sometimes...)

Thanks very much,
Jarek P.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] flush_work_sync vs. flush_scheduled_work Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-18 Thread Maciej W. Rozycki

On Thu, 18 Oct 2007, Oleg Nesterov wrote:

> If we can't just cancel the work, can't we do something like
> 
>   if (cancel_work_sync(w))
>   w->func(w);
> 
> instead?

 We do an equivalent of this -- all that we care about that w->func(w) 
would do is enable_irq() and the rest we want to skip at this point.  We 
probably do not need the counter in the end though.

  Maciej
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] flush_work_sync vs. flush_scheduled_work Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-18 Thread Oleg Nesterov

On 10/18, Jarek Poplawski wrote:
>
> +/**
> + * flush_work_sync - block until a work_struct's callback has terminated
^^^
Hmm...

> + * Similar to cancel_work_sync() but will only busy wait (without cancel)
> + * if the work is queued.

Yes, it won't block, but will spin in busy-wait loop until all other works
scheduled before this work are finished. Not good. After that it really
blocks waiting for this work to complete.

And I am a bit confused. We can't use flush_workqueue() because some of the
queued work_structs may take rtnl_lock, yes? But in that case we can't use
the new flush_work_sync() helper as well, no?

If we can't just cancel the work, can't we do something like

if (cancel_work_sync(w))
w->func(w);

instead?

> +void flush_work_sync(struct work_struct *work)
> +{
> + int ret;
> +
> + do {
> + ret = work_pending(work);
> + wait_on_work(work);
> + if (ret)
> + cpu_relax();
> + } while (ret);
> +}

If we really the new helper, perhaps we can make it a bit better?

1. Modify insert_work() to take the "struct list_head *at" parameter instead
   of "int tail". I think this patch will also cleanup the code a bit, and
   shrink a couple of bytes from .text

2. flush_work_sync() inserts a barrier right after this work and blocks.
   We still need some retry logic to handle the queueing is in progress
   of course, but we won't spin waiting for the other works.

What do you think?

Oleg.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-18 Thread Maciej W. Rozycki

On Thu, 18 Oct 2007, Jarek Poplawski wrote:

> Technically until free_irq returns a handler should respond to calls
> and with proper hardware it should have no problem with checking if
> it's its interrupt, even after disabling this hardware, because of
> possible races.

 Well, the hardirq handler can check it, no problem -- it is just it is so 
slow, the latency would be unacceptable.  The problem with the softirq 
handler is we do really not want it to be called after the driver has 
already been shut down and its structures freed.  It used to happen before 
this flush/cancel call was added with the usual effect (oops) as by then 
they may well have been stamped on already.

> But with a scenario like this:
> 
> - disable_irq()
> - disable_my_hadrware_irq()
> - clear_my_hardware_irq()
> - free_irq()
> - enable_irq()
> 
> it seems the handler should respond even after free_irq because there
> could be still interrupts to resend, originally generated by its
> hardware, so such behavior looks very suspicious, at least with some
> type of interrupts.

 These are softirqs, not hardware interrupts, so they are as such not 
related to *_irq() infrastructure.  The flaw is the depth count of IRQ 
lines is not maintained consistently.  This is, according to comments 
around the code in question, to cover up bugs elsewhere.  Not a brillant 
idea, I am afraid -- there should be no need to reset the depth upon 
request_irq() and likewise with free_irq(), but there you go.  I would be 
happy to investigate a possible solution and rewrite the necessary bits, 
but right now I am committed to other stuff, overdue already, sorry.

 The view could change if we supported hot-pluggable interrupt 
controllers, but it is not the case at the moment right now, so the 
interrupt lines are there to stay for the duration of the system lifespan 
and could be manipulated as necessary.

> So, I think, the idea of DEBUG_SHIRQ is generally right and very
> useful - but, of course, there could be exceptions, which btw. could
> try some hacks under DEBUG_SHIRQ too. And my opinion about
> 'properness' was very general (not about phy) too, just like my
> 'knowledge' of drivers.

 The idea is right, no question, but I am not quite sure it has been 
properly architected into our current design.  Actually I am almost sure 
of the reverse.  This is why I was (and still am) interested in feedback 
on it.

> Right! But then some warning could be useful, I presume.

 Of course; adding one to phy_error() should be straightforward.

> > > 4) phy_interrupt() should check return value from schedule_work() and
> > > enable irq on 0.
> > 
> >  No -- the work already pending will do that.
> 
> How? If schedule_work won't succeed because there is a pending one,
> we did disable_irq_nosync 2x, so we would stay at least 1x too deep!

 Correct and my note is misleading, sorry.

 The thing is we shouldn't have come here the second time in the first 
place (which is I think why the check is not there) as handlers for the 
same line are not allowed to run in parallel (cf. irq_desc->lock and 
IRQ_INPROGRESS).  Perhaps BUG_ON(!schedule_work()) would be appropriate, 
though I am not sure if we should handle every "impossible" condition we 
can imagine.  Quite a lot of hardirq handlers assume two instances will 
not run in parallel, so if it ever happened, then a serious breakage would 
unleash.

> But, I've enough of other concerns too, so nothing urgent here...

 The problem is at the moment I am still probably the only user of this 
code ;-) -- `grep' through the sources to see how many drivers request an 
IRQ for their PHY through phylib; unless something has changed very 
recently.

  Maciej
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-18 Thread Jarek Poplawski

On Thu, Oct 18, 2007 at 12:30:35PM +0100, Maciej W. Rozycki wrote:
> On Wed, 17 Oct 2007, Jarek Poplawski wrote:
> 
> > I'm not sure free_irq() should maintain the depth count - rather warn
> > on not zero. But, IMHO, any activity on freed irq seems suspicious to
> > me (and doesn't look like very common), even if it's safe with current
> > implementation.
> 
>  No way to avoid it with DEBUG_SHIRQ.
> 
> > Yes, these DEBUG_SHIRQ checks are suspicious to me too, but they seem
> > to be reasonable only in the case of possible resent irqs (so not for
> > all irqs). On the other hand, it seems, proper irq handler with proper
> > hardware shouldn't have any problems with such a check.
> 
>  What do you mean by "proper irq handler with proper hardware"?  Using
> softirqs (they used to be called bottom-halves) is actually a natural way 
> of handling any interrupt which requires extensive processing.

Technically until free_irq returns a handler should respond to calls
and with proper hardware it should have no problem with checking if
it's its interrupt, even after disabling this hardware, because of
possible races.

But with a scenario like this:

- disable_irq()
- disable_my_hadrware_irq()
- clear_my_hardware_irq()
- free_irq()
- enable_irq()

it seems the handler should respond even after free_irq because there
could be still interrupts to resend, originally generated by its
hardware, so such behavior looks very suspicious, at least with some
type of interrupts.

So, I think, the idea of DEBUG_SHIRQ is generally right and very
useful - but, of course, there could be exceptions, which btw. could
try some hacks under DEBUG_SHIRQ too. And my opinion about
'properness' was very general (not about phy) too, just like my
'knowledge' of drivers.

> 
> > 1) phy_change() checks PHY_HALTED flag without lock; I think it's
> > racy: eg. if it's done during phy_stop() it can check just before
> > the flag is set and reenable interrupts just after phy_stop() ends.
> 
>  I remember having a look into it, but it was long ago and I cannot 
> immediately recall the conclusion.  Which means it is either broken or 
> deserves a comment as non-obvious.  I will have a look into it again, but 
> I am resource-starved a little at the moment, sorry.
> 
> > 2) phy_change() doesn't reenable irq line after it sees returns
> > with errors; IMHO it should at least write some warning, but maybe
> > try some safety plan, so enable_irq() and try to disable interrupts
> > and free_irq() on the next call (if it happens). (But, I can be very
> > wrong with this - maybe it's OK and official way.)
> 
>  No way to do this safely -- at this point the device probably still has 
> its interrupt output asserted and the register to clear it is 
> inaccessible, so enabling the line will enter an infinite loop.  At this 
> point the system is no longer stable, so it is better to keep at least 
> some functionality, so that it may be attempted to be shut down cleanly, 
> rather than make it completely irresponsive.  The alternative is panic().

Right! But then some warning could be useful, I presume.

> 
> > 3) phy_interrupt() checks PHY_HALTED flag without lock too, but I'm
> > not sure now if it could be dangerous after fixing #1; on the other
> > hand even if we know it's not our regular interrupt, with current
> > DEBUG_SHIRQ it could be easier to call schedule_work() anyway since
> > we are sure it's before/in free_irq() yet.
> 
>   See #1 above.
> 
> > 4) phy_interrupt() should check return value from schedule_work() and
> > enable irq on 0.
> 
>  No -- the work already pending will do that.

How? If schedule_work won't succeed because there is a pending one,
we did disable_irq_nosync 2x, so we would stay at least 1x too deep!

> 
> > 5) phy_stop_interrupts(): maybe I miss something, but it seems
> > phy_stop() is required before this, so maybe there should be a
> > comment on this?
> 
>  The API is documented in: Documentation/networking/phy.txt -- you are 
> welcome to improve.  If you do not want to get into the gory details, just 
> use the cooked interface and phy_disconnect() will do the dirty work for 
> you.
> 
> > 6) phy_stop_interrupts(): if I'm not wrong with #3 calling
> > phy_disable_interrupts() looks like we are not sure this phy_stop()
> > really works; than maybe a WARN_ON?
> 
>  WARN_ON what?

I've thought that phy_stop() could be needed before
phy_stop_interrupt() to set PHY_HALTED, but since it disables and
clears interrupts too, then there should be no need to repeat this,
maybe only check it's done. But if there is no such dependency, then
no point at all.

> 
> > 7) phy_stop_interrupts(): after above mentioned changes in
> > phy_interrupt(), and phy_changes() (always enable_irq()) I can't see
> > any reason why there should be more than one skipped enable_irq(),
> > so checking return from cancel_work_sync() shouldn't be enough
> > instead of this atomic counter.
> 
>  CONFIG_DEBUG_SHIRQ.  Barring this option,

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-18 Thread Maciej W. Rozycki

On Thu, 18 Oct 2007, Jarek Poplawski wrote:

> After rethinking, it looks like this last cancel should be useless.
> So, if phy_interrupt() schedules only if !PHY_HALTED and phy_change()
> does enable_irq() with no exeptions, it seems phy_interrupt() even
> without lock must see PHY_HALTED state before this free_irq() with
> possible DEBUG_SHIRQ call, then maybe only this safety:
> 
>   WARN_ON(work_pending(>phy_queue));

 Good point.

  Maciej
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-18 Thread Maciej W. Rozycki

On Wed, 17 Oct 2007, Jarek Poplawski wrote:

> I'm not sure free_irq() should maintain the depth count - rather warn
> on not zero. But, IMHO, any activity on freed irq seems suspicious to
> me (and doesn't look like very common), even if it's safe with current
> implementation.

 No way to avoid it with DEBUG_SHIRQ.

> Yes, these DEBUG_SHIRQ checks are suspicious to me too, but they seem
> to be reasonable only in the case of possible resent irqs (so not for
> all irqs). On the other hand, it seems, proper irq handler with proper
> hardware shouldn't have any problems with such a check.

 What do you mean by "proper irq handler with proper hardware"?  Using
softirqs (they used to be called bottom-halves) is actually a natural way 
of handling any interrupt which requires extensive processing.

> 1) phy_change() checks PHY_HALTED flag without lock; I think it's
> racy: eg. if it's done during phy_stop() it can check just before
> the flag is set and reenable interrupts just after phy_stop() ends.

 I remember having a look into it, but it was long ago and I cannot 
immediately recall the conclusion.  Which means it is either broken or 
deserves a comment as non-obvious.  I will have a look into it again, but 
I am resource-starved a little at the moment, sorry.

> 2) phy_change() doesn't reenable irq line after it sees returns
> with errors; IMHO it should at least write some warning, but maybe
> try some safety plan, so enable_irq() and try to disable interrupts
> and free_irq() on the next call (if it happens). (But, I can be very
> wrong with this - maybe it's OK and official way.)

 No way to do this safely -- at this point the device probably still has 
its interrupt output asserted and the register to clear it is 
inaccessible, so enabling the line will enter an infinite loop.  At this 
point the system is no longer stable, so it is better to keep at least 
some functionality, so that it may be attempted to be shut down cleanly, 
rather than make it completely irresponsive.  The alternative is panic().

> 3) phy_interrupt() checks PHY_HALTED flag without lock too, but I'm
> not sure now if it could be dangerous after fixing #1; on the other
> hand even if we know it's not our regular interrupt, with current
> DEBUG_SHIRQ it could be easier to call schedule_work() anyway since
> we are sure it's before/in free_irq() yet.

  See #1 above.

> 4) phy_interrupt() should check return value from schedule_work() and
> enable irq on 0.

 No -- the work already pending will do that.

> 5) phy_stop_interrupts(): maybe I miss something, but it seems
> phy_stop() is required before this, so maybe there should be a
> comment on this?

 The API is documented in: Documentation/networking/phy.txt -- you are 
welcome to improve.  If you do not want to get into the gory details, just 
use the cooked interface and phy_disconnect() will do the dirty work for 
you.

> 6) phy_stop_interrupts(): if I'm not wrong with #3 calling
> phy_disable_interrupts() looks like we are not sure this phy_stop()
> really works; than maybe a WARN_ON?

 WARN_ON what?

> 7) phy_stop_interrupts(): after above mentioned changes in
> phy_interrupt(), and phy_changes() (always enable_irq()) I can't see
> any reason why there should be more than one skipped enable_irq(),
> so checking return from cancel_work_sync() shouldn't be enough
> instead of this atomic counter.

 CONFIG_DEBUG_SHIRQ.  Barring this option, cancel_work_sync() could have 
been moved to the front of free_irq().  I think I have seen this in 
reality, with the interrupt line left stuck disabled afterwards, but I 
will double check when I have an opportunity.  The approach implemented 
with this patch does work, which is the important bit, and if 
simplification is possible, then it may be applied later.

> 8) phy_stop_interrupts(): I'm not sure this additional call from
> DEBUG_SHIRQ should be so dangerous, eg.:
> 
>   /*
>* status == PHY_HALTED &&
>* interrupts are stopped after phy_stop()
>*/
>   if (cancel_work_sync(...))
>   enable_irq();
> 
>   free_irq(...);
>   /*
>* possible schedule_work() from DEBUG_SHIRQ only,
>* but proper check for PHY_HALTED is done;
>* so, let's flush after this too:
>*/
>   cancel_work_sync();

 Well, if there is another handler registered on this line, you'll get 
your interrupt line stuck disabled.

> Of course, I don't know phy.c enough, so most of this can be terribly
> wrong, then feel free to forget about this - I don't expect you should
> waste any time for explaining me these things - after all they are
> doubts only.

 I am not sure I know phy.c well enough either, ;-) and your concerns are 
appreciated as interesting conclusions may develop.  If someone disagrees 
with what I have written here, they are welcome to speak out too.

  Maciej
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL

[PATCH] flush_work_sync vs. flush_scheduled_work Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-18 Thread Jarek Poplawski

After reading this and earlier threads about phylib's way of using
workqueue I think such a lighter and safer wrt. locking alternative
for flush_scheduled_work should be useful, but maybe it's only my
imagination.

So, let's ask Oleg Nesterov, whose solutions are here only
copy-cut-pasted & possibly abused by myslef.

->
Subject: flush_work_sync as an alternative for flush_scheduled_work

Similar to cancel_work_sync() but will only busy wait & block
(without cancel).

Signed-off-by: Jarek Poplawski <[EMAIL PROTECTED]>

---

 include/linux/workqueue.h |1 +
 kernel/workqueue.c|   24 
 2 files changed, 25 insertions(+)

diff -Nurp 2.6.23-mm1-/include/linux/workqueue.h 
2.6.23-mm1/include/linux/workqueue.h
--- 2.6.23-mm1-/include/linux/workqueue.h   2007-10-12 23:45:24.0 
+0200
+++ 2.6.23-mm1/include/linux/workqueue.h2007-10-17 20:55:26.0 
+0200
@@ -192,6 +192,7 @@ extern void init_workqueues(void);
 int execute_in_process_context(work_func_t fn, struct execute_work *);
 
 extern int cancel_work_sync(struct work_struct *work);
+extern void flush_work_sync(struct work_struct *work);
 
 /*
  * Kill off a pending schedule_delayed_work().  Note that the work callback
diff -Nurp 2.6.23-mm1-/kernel/workqueue.c 2.6.23-mm1/kernel/workqueue.c
--- 2.6.23-mm1-/kernel/workqueue.c  2007-10-12 23:45:25.0 +0200
+++ 2.6.23-mm1/kernel/workqueue.c   2007-10-17 20:54:03.0 +0200
@@ -539,6 +539,30 @@ int cancel_delayed_work_sync(struct dela
 }
 EXPORT_SYMBOL(cancel_delayed_work_sync);
 
+/**
+ * flush_work_sync - block until a work_struct's callback has terminated
+ * @work: the work which is to be flushed
+ *
+ * Similar to cancel_work_sync() but will only busy wait (without cancel)
+ * if the work is queued. If the work's callback appears to be running,
+ * flush_work_sync() will block until it has completed (but doesn't block
+ * while other callbacks are running, like flush_scheduled_work() does).
+ *
+ * It is not allowed to use this function if the work re-queues itself.
+ */
+void flush_work_sync(struct work_struct *work)
+{
+   int ret;
+
+   do {
+   ret = work_pending(work);
+   wait_on_work(work);
+   if (ret)
+   cpu_relax();
+   } while (ret);
+}
+EXPORT_SYMBOL(flush_work_sync);
+
 static struct workqueue_struct *keventd_wq __read_mostly;
 
 /**
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-18 Thread Jarek Poplawski

On Wed, Oct 17, 2007 at 10:58:09AM +0200, Jarek Poplawski wrote:
...
> 8) phy_stop_interrupts(): I'm not sure this additional call from
> DEBUG_SHIRQ should be so dangerous, eg.:
> 
>   /*
>* status == PHY_HALTED &&
>* interrupts are stopped after phy_stop()
>*/
>   if (cancel_work_sync(...))
>   enable_irq();
> 
>   free_irq(...);
>   /*
>* possible schedule_work() from DEBUG_SHIRQ only,
>* but proper check for PHY_HALTED is done;
>* so, let's flush after this too:
>*/
>   cancel_work_sync();

After rethinking, it looks like this last cancel should be useless.
So, if phy_interrupt() schedules only if !PHY_HALTED and phy_change()
does enable_irq() with no exeptions, it seems phy_interrupt() even
without lock must see PHY_HALTED state before this free_irq() with
possible DEBUG_SHIRQ call, then maybe only this safety:

WARN_ON(work_pending(>phy_queue));

Btw, I've read this was considered and not liked, but IMHO, if this
really has to be like this, creating phy's own workqueue seems to be
resonable, at the very least to reduce latencies to other users of
this irq.

Jarek P.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-18 Thread Jarek Poplawski

On Wed, Oct 17, 2007 at 10:58:09AM +0200, Jarek Poplawski wrote:
...
 8) phy_stop_interrupts(): I'm not sure this additional call from
 DEBUG_SHIRQ should be so dangerous, eg.:
 
   /*
* status == PHY_HALTED 
* interrupts are stopped after phy_stop()
*/
   if (cancel_work_sync(...))
   enable_irq();
 
   free_irq(...);
   /*
* possible schedule_work() from DEBUG_SHIRQ only,
* but proper check for PHY_HALTED is done;
* so, let's flush after this too:
*/
   cancel_work_sync();

After rethinking, it looks like this last cancel should be useless.
So, if phy_interrupt() schedules only if !PHY_HALTED and phy_change()
does enable_irq() with no exeptions, it seems phy_interrupt() even
without lock must see PHY_HALTED state before this free_irq() with
possible DEBUG_SHIRQ call, then maybe only this safety:

WARN_ON(work_pending(phydev-phy_queue));


Btw, I've read this was considered and not liked, but IMHO, if this
really has to be like this, creating phy's own workqueue seems to be
resonable, at the very least to reduce latencies to other users of
this irq.

Jarek P.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH] flush_work_sync vs. flush_scheduled_work Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-18 Thread Jarek Poplawski

After reading this and earlier threads about phylib's way of using
workqueue I think such a lighter and safer wrt. locking alternative
for flush_scheduled_work should be useful, but maybe it's only my
imagination.

So, let's ask Oleg Nesterov, whose solutions are here only
copy-cut-pasted  possibly abused by myslef.

-
Subject: flush_work_sync as an alternative for flush_scheduled_work

Similar to cancel_work_sync() but will only busy wait  block
(without cancel).

Signed-off-by: Jarek Poplawski [EMAIL PROTECTED]

---

 include/linux/workqueue.h |1 +
 kernel/workqueue.c|   24 
 2 files changed, 25 insertions(+)

diff -Nurp 2.6.23-mm1-/include/linux/workqueue.h 
2.6.23-mm1/include/linux/workqueue.h
--- 2.6.23-mm1-/include/linux/workqueue.h   2007-10-12 23:45:24.0 
+0200
+++ 2.6.23-mm1/include/linux/workqueue.h2007-10-17 20:55:26.0 
+0200
@@ -192,6 +192,7 @@ extern void init_workqueues(void);
 int execute_in_process_context(work_func_t fn, struct execute_work *);
 
 extern int cancel_work_sync(struct work_struct *work);
+extern void flush_work_sync(struct work_struct *work);
 
 /*
  * Kill off a pending schedule_delayed_work().  Note that the work callback
diff -Nurp 2.6.23-mm1-/kernel/workqueue.c 2.6.23-mm1/kernel/workqueue.c
--- 2.6.23-mm1-/kernel/workqueue.c  2007-10-12 23:45:25.0 +0200
+++ 2.6.23-mm1/kernel/workqueue.c   2007-10-17 20:54:03.0 +0200
@@ -539,6 +539,30 @@ int cancel_delayed_work_sync(struct dela
 }
 EXPORT_SYMBOL(cancel_delayed_work_sync);
 
+/**
+ * flush_work_sync - block until a work_struct's callback has terminated
+ * @work: the work which is to be flushed
+ *
+ * Similar to cancel_work_sync() but will only busy wait (without cancel)
+ * if the work is queued. If the work's callback appears to be running,
+ * flush_work_sync() will block until it has completed (but doesn't block
+ * while other callbacks are running, like flush_scheduled_work() does).
+ *
+ * It is not allowed to use this function if the work re-queues itself.
+ */
+void flush_work_sync(struct work_struct *work)
+{
+   int ret;
+
+   do {
+   ret = work_pending(work);
+   wait_on_work(work);
+   if (ret)
+   cpu_relax();
+   } while (ret);
+}
+EXPORT_SYMBOL(flush_work_sync);
+
 static struct workqueue_struct *keventd_wq __read_mostly;
 
 /**
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-18 Thread Maciej W. Rozycki

On Thu, 18 Oct 2007, Jarek Poplawski wrote:

 After rethinking, it looks like this last cancel should be useless.
 So, if phy_interrupt() schedules only if !PHY_HALTED and phy_change()
 does enable_irq() with no exeptions, it seems phy_interrupt() even
 without lock must see PHY_HALTED state before this free_irq() with
 possible DEBUG_SHIRQ call, then maybe only this safety:
 
   WARN_ON(work_pending(phydev-phy_queue));

 Good point.

  Maciej
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-18 Thread Maciej W. Rozycki

On Wed, 17 Oct 2007, Jarek Poplawski wrote:

 I'm not sure free_irq() should maintain the depth count - rather warn
 on not zero. But, IMHO, any activity on freed irq seems suspicious to
 me (and doesn't look like very common), even if it's safe with current
 implementation.

 No way to avoid it with DEBUG_SHIRQ.

 Yes, these DEBUG_SHIRQ checks are suspicious to me too, but they seem
 to be reasonable only in the case of possible resent irqs (so not for
 all irqs). On the other hand, it seems, proper irq handler with proper
 hardware shouldn't have any problems with such a check.

 What do you mean by proper irq handler with proper hardware?  Using
softirqs (they used to be called bottom-halves) is actually a natural way 
of handling any interrupt which requires extensive processing.

 1) phy_change() checks PHY_HALTED flag without lock; I think it's
 racy: eg. if it's done during phy_stop() it can check just before
 the flag is set and reenable interrupts just after phy_stop() ends.

 I remember having a look into it, but it was long ago and I cannot 
immediately recall the conclusion.  Which means it is either broken or 
deserves a comment as non-obvious.  I will have a look into it again, but 
I am resource-starved a little at the moment, sorry.

 2) phy_change() doesn't reenable irq line after it sees returns
 with errors; IMHO it should at least write some warning, but maybe
 try some safety plan, so enable_irq() and try to disable interrupts
 and free_irq() on the next call (if it happens). (But, I can be very
 wrong with this - maybe it's OK and official way.)

 No way to do this safely -- at this point the device probably still has 
its interrupt output asserted and the register to clear it is 
inaccessible, so enabling the line will enter an infinite loop.  At this 
point the system is no longer stable, so it is better to keep at least 
some functionality, so that it may be attempted to be shut down cleanly, 
rather than make it completely irresponsive.  The alternative is panic().

 3) phy_interrupt() checks PHY_HALTED flag without lock too, but I'm
 not sure now if it could be dangerous after fixing #1; on the other
 hand even if we know it's not our regular interrupt, with current
 DEBUG_SHIRQ it could be easier to call schedule_work() anyway since
 we are sure it's before/in free_irq() yet.

  See #1 above.

 4) phy_interrupt() should check return value from schedule_work() and
 enable irq on 0.

 No -- the work already pending will do that.

 5) phy_stop_interrupts(): maybe I miss something, but it seems
 phy_stop() is required before this, so maybe there should be a
 comment on this?

 The API is documented in: Documentation/networking/phy.txt -- you are 
welcome to improve.  If you do not want to get into the gory details, just 
use the cooked interface and phy_disconnect() will do the dirty work for 
you.

 6) phy_stop_interrupts(): if I'm not wrong with #3 calling
 phy_disable_interrupts() looks like we are not sure this phy_stop()
 really works; than maybe a WARN_ON?

 WARN_ON what?

 7) phy_stop_interrupts(): after above mentioned changes in
 phy_interrupt(), and phy_changes() (always enable_irq()) I can't see
 any reason why there should be more than one skipped enable_irq(),
 so checking return from cancel_work_sync() shouldn't be enough
 instead of this atomic counter.

 CONFIG_DEBUG_SHIRQ.  Barring this option, cancel_work_sync() could have 
been moved to the front of free_irq().  I think I have seen this in 
reality, with the interrupt line left stuck disabled afterwards, but I 
will double check when I have an opportunity.  The approach implemented 
with this patch does work, which is the important bit, and if 
simplification is possible, then it may be applied later.

 8) phy_stop_interrupts(): I'm not sure this additional call from
 DEBUG_SHIRQ should be so dangerous, eg.:
 
   /*
* status == PHY_HALTED 
* interrupts are stopped after phy_stop()
*/
   if (cancel_work_sync(...))
   enable_irq();
 
   free_irq(...);
   /*
* possible schedule_work() from DEBUG_SHIRQ only,
* but proper check for PHY_HALTED is done;
* so, let's flush after this too:
*/
   cancel_work_sync();

 Well, if there is another handler registered on this line, you'll get 
your interrupt line stuck disabled.

 Of course, I don't know phy.c enough, so most of this can be terribly
 wrong, then feel free to forget about this - I don't expect you should
 waste any time for explaining me these things - after all they are
 doubts only.

 I am not sure I know phy.c well enough either, ;-) and your concerns are 
appreciated as interesting conclusions may develop.  If someone disagrees 
with what I have written here, they are welcome to speak out too.

  Maciej
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-18 Thread Jarek Poplawski

On Thu, Oct 18, 2007 at 12:30:35PM +0100, Maciej W. Rozycki wrote:
 On Wed, 17 Oct 2007, Jarek Poplawski wrote:
 
  I'm not sure free_irq() should maintain the depth count - rather warn
  on not zero. But, IMHO, any activity on freed irq seems suspicious to
  me (and doesn't look like very common), even if it's safe with current
  implementation.
 
  No way to avoid it with DEBUG_SHIRQ.
 
  Yes, these DEBUG_SHIRQ checks are suspicious to me too, but they seem
  to be reasonable only in the case of possible resent irqs (so not for
  all irqs). On the other hand, it seems, proper irq handler with proper
  hardware shouldn't have any problems with such a check.
 
  What do you mean by proper irq handler with proper hardware?  Using
 softirqs (they used to be called bottom-halves) is actually a natural way 
 of handling any interrupt which requires extensive processing.

Technically until free_irq returns a handler should respond to calls
and with proper hardware it should have no problem with checking if
it's its interrupt, even after disabling this hardware, because of
possible races.

But with a scenario like this:

- disable_irq()
- disable_my_hadrware_irq()
- clear_my_hardware_irq()
- free_irq()
- enable_irq()

it seems the handler should respond even after free_irq because there
could be still interrupts to resend, originally generated by its
hardware, so such behavior looks very suspicious, at least with some
type of interrupts.

So, I think, the idea of DEBUG_SHIRQ is generally right and very
useful - but, of course, there could be exceptions, which btw. could
try some hacks under DEBUG_SHIRQ too. And my opinion about
'properness' was very general (not about phy) too, just like my
'knowledge' of drivers.

 
  1) phy_change() checks PHY_HALTED flag without lock; I think it's
  racy: eg. if it's done during phy_stop() it can check just before
  the flag is set and reenable interrupts just after phy_stop() ends.
 
  I remember having a look into it, but it was long ago and I cannot 
 immediately recall the conclusion.  Which means it is either broken or 
 deserves a comment as non-obvious.  I will have a look into it again, but 
 I am resource-starved a little at the moment, sorry.
 
  2) phy_change() doesn't reenable irq line after it sees returns
  with errors; IMHO it should at least write some warning, but maybe
  try some safety plan, so enable_irq() and try to disable interrupts
  and free_irq() on the next call (if it happens). (But, I can be very
  wrong with this - maybe it's OK and official way.)
 
  No way to do this safely -- at this point the device probably still has 
 its interrupt output asserted and the register to clear it is 
 inaccessible, so enabling the line will enter an infinite loop.  At this 
 point the system is no longer stable, so it is better to keep at least 
 some functionality, so that it may be attempted to be shut down cleanly, 
 rather than make it completely irresponsive.  The alternative is panic().

Right! But then some warning could be useful, I presume.

 
  3) phy_interrupt() checks PHY_HALTED flag without lock too, but I'm
  not sure now if it could be dangerous after fixing #1; on the other
  hand even if we know it's not our regular interrupt, with current
  DEBUG_SHIRQ it could be easier to call schedule_work() anyway since
  we are sure it's before/in free_irq() yet.
 
   See #1 above.
 
  4) phy_interrupt() should check return value from schedule_work() and
  enable irq on 0.
 
  No -- the work already pending will do that.

How? If schedule_work won't succeed because there is a pending one,
we did disable_irq_nosync 2x, so we would stay at least 1x too deep!

 
  5) phy_stop_interrupts(): maybe I miss something, but it seems
  phy_stop() is required before this, so maybe there should be a
  comment on this?
 
  The API is documented in: Documentation/networking/phy.txt -- you are 
 welcome to improve.  If you do not want to get into the gory details, just 
 use the cooked interface and phy_disconnect() will do the dirty work for 
 you.
 
  6) phy_stop_interrupts(): if I'm not wrong with #3 calling
  phy_disable_interrupts() looks like we are not sure this phy_stop()
  really works; than maybe a WARN_ON?
 
  WARN_ON what?

I've thought that phy_stop() could be needed before
phy_stop_interrupt() to set PHY_HALTED, but since it disables and
clears interrupts too, then there should be no need to repeat this,
maybe only check it's done. But if there is no such dependency, then
no point at all.

 
  7) phy_stop_interrupts(): after above mentioned changes in
  phy_interrupt(), and phy_changes() (always enable_irq()) I can't see
  any reason why there should be more than one skipped enable_irq(),
  so checking return from cancel_work_sync() shouldn't be enough
  instead of this atomic counter.
 
  CONFIG_DEBUG_SHIRQ.  Barring this option, cancel_work_sync() could have 
 been moved to the front of free_irq().  I think I have seen this in 
 reality, with the

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-18 Thread Maciej W. Rozycki

On Thu, 18 Oct 2007, Jarek Poplawski wrote:

 Technically until free_irq returns a handler should respond to calls
 and with proper hardware it should have no problem with checking if
 it's its interrupt, even after disabling this hardware, because of
 possible races.

 Well, the hardirq handler can check it, no problem -- it is just it is so 
slow, the latency would be unacceptable.  The problem with the softirq 
handler is we do really not want it to be called after the driver has 
already been shut down and its structures freed.  It used to happen before 
this flush/cancel call was added with the usual effect (oops) as by then 
they may well have been stamped on already.

 But with a scenario like this:
 
 - disable_irq()
 - disable_my_hadrware_irq()
 - clear_my_hardware_irq()
 - free_irq()
 - enable_irq()
 
 it seems the handler should respond even after free_irq because there
 could be still interrupts to resend, originally generated by its
 hardware, so such behavior looks very suspicious, at least with some
 type of interrupts.

 These are softirqs, not hardware interrupts, so they are as such not 
related to *_irq() infrastructure.  The flaw is the depth count of IRQ 
lines is not maintained consistently.  This is, according to comments 
around the code in question, to cover up bugs elsewhere.  Not a brillant 
idea, I am afraid -- there should be no need to reset the depth upon 
request_irq() and likewise with free_irq(), but there you go.  I would be 
happy to investigate a possible solution and rewrite the necessary bits, 
but right now I am committed to other stuff, overdue already, sorry.

 The view could change if we supported hot-pluggable interrupt 
controllers, but it is not the case at the moment right now, so the 
interrupt lines are there to stay for the duration of the system lifespan 
and could be manipulated as necessary.

 So, I think, the idea of DEBUG_SHIRQ is generally right and very
 useful - but, of course, there could be exceptions, which btw. could
 try some hacks under DEBUG_SHIRQ too. And my opinion about
 'properness' was very general (not about phy) too, just like my
 'knowledge' of drivers.

 The idea is right, no question, but I am not quite sure it has been 
properly architected into our current design.  Actually I am almost sure 
of the reverse.  This is why I was (and still am) interested in feedback 
on it.

 Right! But then some warning could be useful, I presume.

 Of course; adding one to phy_error() should be straightforward.

   4) phy_interrupt() should check return value from schedule_work() and
   enable irq on 0.
  
   No -- the work already pending will do that.
 
 How? If schedule_work won't succeed because there is a pending one,
 we did disable_irq_nosync 2x, so we would stay at least 1x too deep!

 Correct and my note is misleading, sorry.

 The thing is we shouldn't have come here the second time in the first 
place (which is I think why the check is not there) as handlers for the 
same line are not allowed to run in parallel (cf. irq_desc-lock and 
IRQ_INPROGRESS).  Perhaps BUG_ON(!schedule_work()) would be appropriate, 
though I am not sure if we should handle every impossible condition we 
can imagine.  Quite a lot of hardirq handlers assume two instances will 
not run in parallel, so if it ever happened, then a serious breakage would 
unleash.

 But, I've enough of other concerns too, so nothing urgent here...

 The problem is at the moment I am still probably the only user of this 
code ;-) -- `grep' through the sources to see how many drivers request an 
IRQ for their PHY through phylib; unless something has changed very 
recently.

  Maciej
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] flush_work_sync vs. flush_scheduled_work Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-18 Thread Oleg Nesterov

On 10/18, Jarek Poplawski wrote:

 +/**
 + * flush_work_sync - block until a work_struct's callback has terminated
^^^
Hmm...

 + * Similar to cancel_work_sync() but will only busy wait (without cancel)
 + * if the work is queued.

Yes, it won't block, but will spin in busy-wait loop until all other works
scheduled before this work are finished. Not good. After that it really
blocks waiting for this work to complete.

And I am a bit confused. We can't use flush_workqueue() because some of the
queued work_structs may take rtnl_lock, yes? But in that case we can't use
the new flush_work_sync() helper as well, no?

If we can't just cancel the work, can't we do something like

if (cancel_work_sync(w))
w-func(w);

instead?

 +void flush_work_sync(struct work_struct *work)
 +{
 + int ret;
 +
 + do {
 + ret = work_pending(work);
 + wait_on_work(work);
 + if (ret)
 + cpu_relax();
 + } while (ret);
 +}

If we really the new helper, perhaps we can make it a bit better?

1. Modify insert_work() to take the struct list_head *at parameter instead
   of int tail. I think this patch will also cleanup the code a bit, and
   shrink a couple of bytes from .text

2. flush_work_sync() inserts a barrier right after this work and blocks.
   We still need some retry logic to handle the queueing is in progress
   of course, but we won't spin waiting for the other works.

What do you think?

Oleg.

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] flush_work_sync vs. flush_scheduled_work Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-18 Thread Maciej W. Rozycki

On Thu, 18 Oct 2007, Oleg Nesterov wrote:

 If we can't just cancel the work, can't we do something like
 
   if (cancel_work_sync(w))
   w-func(w);
 
 instead?

 We do an equivalent of this -- all that we care about that w-func(w) 
would do is enable_irq() and the rest we want to skip at this point.  We 
probably do not need the counter in the end though.

  Maciej
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-17 Thread Benjamin Herrenschmidt


> Btw., since, because of this patch, I've had a one more look at phy.c
> and there are a few more doubts, so this time I'll try to bother you
> for real:

 .../...

While there... is somebody interested in making the whole PHY lib
operate a task level and use mutexes instead of spinlock ? I need that
for drivers like EMAC (who use their own PHY layer for now), and I might
even give a go at adapting phylib myself, but I'd like to take the
temperature about it first.

Basically, there is nothing in phylib that is performance critical or
such that requires it to run at irq time and/or use locks. On the other
hand, it complicates things in various areas. The most obvious one being
that it prevents the network driver mii access callbacks from sleeping
which can be annoying as MDIO can be slow, and some drivers have fancy
muxes in there that are better off mutexed than spinlocked.

Ben.
 

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-17 Thread Jarek Poplawski

On Wed, Oct 17, 2007 at 10:58:09AM +0200, Jarek Poplawski wrote:
...
> 5) phy_stop_interrupts(): maybe I miss something, but it seems
> phy_stop() is required before this, so maybe there should be a
> comment on this?
> 
> 6) phy_stop_interrupts(): if I'm not wrong with #3 calling

Should be:
  6) phy_stop_interrupts(): if I'm not wrong with #5 calling

Jarek P.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-17 Thread Jarek Poplawski

On Tue, Oct 16, 2007 at 06:19:32PM +0100, Maciej W. Rozycki wrote:
...
>  Well, enable_irq() and disable_irq() themselves are nesting, so they are 
> not a problem.  OTOH, free_irq() does not seem to maintain the depth count 
> correctly, which looks like a bug to me and which could trigger regardless 
> of whether flush_scheduled_work() or cancel_work_sync() was called.

I'm not sure free_irq() should maintain the depth count - rather warn
on not zero. But, IMHO, any activity on freed irq seems suspicious to
me (and doesn't look like very common), even if it's safe with current
implementation.

>  The reason is CONFIG_DEBUG_SHIRQ which makes a simulated interrupt event 
> be sent at the end of free_irq().  It looks like a problem that is 
> complementary to one I signalled here:
> 
> http://lkml.org/lkml/2007/9/12/82
> 
> with respect to request_irq(), where, similarly, such an interrupt event 
> is sent at the beginning.  It looks like nobody was concerned back then, 
> but perhaps it is time to do a better investigation now and propose a 
> solution.

Yes, these DEBUG_SHIRQ checks are suspicious to me too, but they seem
to be reasonable only in the case of possible resent irqs (so not for
all irqs). On the other hand, it seems, proper irq handler with proper
hardware shouldn't have any problems with such a check.

>  I'll think about it and thanks for your inquisitiveness that has led to 
> these conclusions.

Btw., since, because of this patch, I've had a one more look at phy.c
and there are a few more doubts, so this time I'll try to bother you
for real:

1) phy_change() checks PHY_HALTED flag without lock; I think it's
racy: eg. if it's done during phy_stop() it can check just before
the flag is set and reenable interrupts just after phy_stop() ends.

2) phy_change() doesn't reenable irq line after it sees returns
with errors; IMHO it should at least write some warning, but maybe
try some safety plan, so enable_irq() and try to disable interrupts
and free_irq() on the next call (if it happens). (But, I can be very
wrong with this - maybe it's OK and official way.)

3) phy_interrupt() checks PHY_HALTED flag without lock too, but I'm
not sure now if it could be dangerous after fixing #1; on the other
hand even if we know it's not our regular interrupt, with current
DEBUG_SHIRQ it could be easier to call schedule_work() anyway since
we are sure it's before/in free_irq() yet.

4) phy_interrupt() should check return value from schedule_work() and
enable irq on 0.

5) phy_stop_interrupts(): maybe I miss something, but it seems
phy_stop() is required before this, so maybe there should be a
comment on this?

6) phy_stop_interrupts(): if I'm not wrong with #3 calling
phy_disable_interrupts() looks like we are not sure this phy_stop()
really works; than maybe a WARN_ON?

7) phy_stop_interrupts(): after above mentioned changes in
phy_interrupt(), and phy_changes() (always enable_irq()) I can't see
any reason why there should be more than one skipped enable_irq(),
so checking return from cancel_work_sync() shouldn't be enough
instead of this atomic counter.

8) phy_stop_interrupts(): I'm not sure this additional call from
DEBUG_SHIRQ should be so dangerous, eg.:

/*
 * status == PHY_HALTED &&
 * interrupts are stopped after phy_stop()
 */
if (cancel_work_sync(...))
enable_irq();

free_irq(...);
/*
 * possible schedule_work() from DEBUG_SHIRQ only,
 * but proper check for PHY_HALTED is done;
 * so, let's flush after this too:
 */
cancel_work_sync();

Of course, I don't know phy.c enough, so most of this can be terribly
wrong, then feel free to forget about this - I don't expect you should
waste any time for explaining me these things - after all they are
doubts only.

Regards,
Jarek P.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-17 Thread Jarek Poplawski

On Tue, Oct 16, 2007 at 06:19:32PM +0100, Maciej W. Rozycki wrote:
...
  Well, enable_irq() and disable_irq() themselves are nesting, so they are 
 not a problem.  OTOH, free_irq() does not seem to maintain the depth count 
 correctly, which looks like a bug to me and which could trigger regardless 
 of whether flush_scheduled_work() or cancel_work_sync() was called.

I'm not sure free_irq() should maintain the depth count - rather warn
on not zero. But, IMHO, any activity on freed irq seems suspicious to
me (and doesn't look like very common), even if it's safe with current
implementation.

  The reason is CONFIG_DEBUG_SHIRQ which makes a simulated interrupt event 
 be sent at the end of free_irq().  It looks like a problem that is 
 complementary to one I signalled here:
 
 http://lkml.org/lkml/2007/9/12/82
 
 with respect to request_irq(), where, similarly, such an interrupt event 
 is sent at the beginning.  It looks like nobody was concerned back then, 
 but perhaps it is time to do a better investigation now and propose a 
 solution.

Yes, these DEBUG_SHIRQ checks are suspicious to me too, but they seem
to be reasonable only in the case of possible resent irqs (so not for
all irqs). On the other hand, it seems, proper irq handler with proper
hardware shouldn't have any problems with such a check.

  I'll think about it and thanks for your inquisitiveness that has led to 
 these conclusions.

Btw., since, because of this patch, I've had a one more look at phy.c
and there are a few more doubts, so this time I'll try to bother you
for real:

1) phy_change() checks PHY_HALTED flag without lock; I think it's
racy: eg. if it's done during phy_stop() it can check just before
the flag is set and reenable interrupts just after phy_stop() ends.

2) phy_change() doesn't reenable irq line after it sees returns
with errors; IMHO it should at least write some warning, but maybe
try some safety plan, so enable_irq() and try to disable interrupts
and free_irq() on the next call (if it happens). (But, I can be very
wrong with this - maybe it's OK and official way.)

3) phy_interrupt() checks PHY_HALTED flag without lock too, but I'm
not sure now if it could be dangerous after fixing #1; on the other
hand even if we know it's not our regular interrupt, with current
DEBUG_SHIRQ it could be easier to call schedule_work() anyway since
we are sure it's before/in free_irq() yet.

4) phy_interrupt() should check return value from schedule_work() and
enable irq on 0.

5) phy_stop_interrupts(): maybe I miss something, but it seems
phy_stop() is required before this, so maybe there should be a
comment on this?

6) phy_stop_interrupts(): if I'm not wrong with #3 calling
phy_disable_interrupts() looks like we are not sure this phy_stop()
really works; than maybe a WARN_ON?

7) phy_stop_interrupts(): after above mentioned changes in
phy_interrupt(), and phy_changes() (always enable_irq()) I can't see
any reason why there should be more than one skipped enable_irq(),
so checking return from cancel_work_sync() shouldn't be enough
instead of this atomic counter.

8) phy_stop_interrupts(): I'm not sure this additional call from
DEBUG_SHIRQ should be so dangerous, eg.:

/*
 * status == PHY_HALTED 
 * interrupts are stopped after phy_stop()
 */
if (cancel_work_sync(...))
enable_irq();

free_irq(...);
/*
 * possible schedule_work() from DEBUG_SHIRQ only,
 * but proper check for PHY_HALTED is done;
 * so, let's flush after this too:
 */
cancel_work_sync();


Of course, I don't know phy.c enough, so most of this can be terribly
wrong, then feel free to forget about this - I don't expect you should
waste any time for explaining me these things - after all they are
doubts only.

Regards,
Jarek P.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-17 Thread Jarek Poplawski

On Wed, Oct 17, 2007 at 10:58:09AM +0200, Jarek Poplawski wrote:
...
 5) phy_stop_interrupts(): maybe I miss something, but it seems
 phy_stop() is required before this, so maybe there should be a
 comment on this?
 
 6) phy_stop_interrupts(): if I'm not wrong with #3 calling

Should be:
  6) phy_stop_interrupts(): if I'm not wrong with #5 calling

Jarek P.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-17 Thread Benjamin Herrenschmidt


 Btw., since, because of this patch, I've had a one more look at phy.c
 and there are a few more doubts, so this time I'll try to bother you
 for real:

 .../...

While there... is somebody interested in making the whole PHY lib
operate a task level and use mutexes instead of spinlock ? I need that
for drivers like EMAC (who use their own PHY layer for now), and I might
even give a go at adapting phylib myself, but I'd like to take the
temperature about it first.

Basically, there is nothing in phylib that is performance critical or
such that requires it to run at irq time and/or use locks. On the other
hand, it complicates things in various areas. The most obvious one being
that it prevents the network driver mii access callbacks from sleeping
which can be annoying as MDIO can be slow, and some drivers have fancy
muxes in there that are better off mutexed than spinlocked.

Ben.
 

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-16 Thread Maciej W. Rozycki

On Tue, 16 Oct 2007, Jarek Poplawski wrote:

> Yes, it's all right here. Sorry for bothering - I should've found this
> by myself.

 Ah, no problem -- even with the right keys you may sometimes get lost in 
the results.

> I've still some doubts about this possible enable_irq() after
> free_irq(). If it's the only handler the status would be changed again
> and at least some of this code in check_irq_resend() would be run, but
> I can miss something again or/and this doesn't matter, as well.

 Well, enable_irq() and disable_irq() themselves are nesting, so they are 
not a problem.  OTOH, free_irq() does not seem to maintain the depth count 
correctly, which looks like a bug to me and which could trigger regardless 
of whether flush_scheduled_work() or cancel_work_sync() was called.

 The reason is CONFIG_DEBUG_SHIRQ which makes a simulated interrupt event 
be sent at the end of free_irq().  It looks like a problem that is 
complementary to one I signalled here:

http://lkml.org/lkml/2007/9/12/82

with respect to request_irq(), where, similarly, such an interrupt event 
is sent at the beginning.  It looks like nobody was concerned back then, 
but perhaps it is time to do a better investigation now and propose a 
solution.

 I'll think about it and thanks for your inquisitiveness that has led to 
these conclusions.

  Maciej
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-16 Thread Jarek Poplawski

On Mon, Oct 15, 2007 at 06:03:20PM +0100, Maciej W. Rozycki wrote:
> On Mon, 15 Oct 2007, Jarek Poplawski wrote:
> 
> > Could you explain why cancel_work_sync() is better here than
> > flush_scheduled_work() wrt. rtnl_lock()?
> 
>  Well, this is actually the bit that made cancel_work_sync() be written in 
> the first place.  The short story is the netlink lock is most probably 
> held at this point (depending on the usage of phy_disconnect()) and there 
> is also an event waiting in the queue that requires the lock, so if 
> flush_scheduled_work() is called here a deadlock will happen.
> 
>  Let me find a reference for a longer story...:
> 
> http://www.linux-mips.org/cgi-bin/mesg.cgi?a=linux-mips=Pine.LNX.4.64N.0610031509380.4642%40blysk.ds.pg.gda.pl
> 
> and then discussed again:
> 
> http://www.uwsg.indiana.edu/hypermail/linux/kernel/0612.0/0593.html
> 

Yes, it's all right here. Sorry for bothering - I should've found this
by myself.

I've still some doubts about this possible enable_irq() after
free_irq(). If it's the only handler the status would be changed again
and at least some of this code in check_irq_resend() would be run, but
I can miss something again or/and this doesn't matter, as well.

Thanks,
Jarek P.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-16 Thread Jarek Poplawski

On Mon, Oct 15, 2007 at 06:03:20PM +0100, Maciej W. Rozycki wrote:
 On Mon, 15 Oct 2007, Jarek Poplawski wrote:
 
  Could you explain why cancel_work_sync() is better here than
  flush_scheduled_work() wrt. rtnl_lock()?
 
  Well, this is actually the bit that made cancel_work_sync() be written in 
 the first place.  The short story is the netlink lock is most probably 
 held at this point (depending on the usage of phy_disconnect()) and there 
 is also an event waiting in the queue that requires the lock, so if 
 flush_scheduled_work() is called here a deadlock will happen.
 
  Let me find a reference for a longer story...:
 
 http://www.linux-mips.org/cgi-bin/mesg.cgi?a=linux-mipsi=Pine.LNX.4.64N.0610031509380.4642%40blysk.ds.pg.gda.pl
 
 and then discussed again:
 
 http://www.uwsg.indiana.edu/hypermail/linux/kernel/0612.0/0593.html
 

Yes, it's all right here. Sorry for bothering - I should've found this
by myself.

I've still some doubts about this possible enable_irq() after
free_irq(). If it's the only handler the status would be changed again
and at least some of this code in check_irq_resend() would be run, but
I can miss something again or/and this doesn't matter, as well.

Thanks,
Jarek P.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-16 Thread Maciej W. Rozycki

On Tue, 16 Oct 2007, Jarek Poplawski wrote:

 Yes, it's all right here. Sorry for bothering - I should've found this
 by myself.

 Ah, no problem -- even with the right keys you may sometimes get lost in 
the results.

 I've still some doubts about this possible enable_irq() after
 free_irq(). If it's the only handler the status would be changed again
 and at least some of this code in check_irq_resend() would be run, but
 I can miss something again or/and this doesn't matter, as well.

 Well, enable_irq() and disable_irq() themselves are nesting, so they are 
not a problem.  OTOH, free_irq() does not seem to maintain the depth count 
correctly, which looks like a bug to me and which could trigger regardless 
of whether flush_scheduled_work() or cancel_work_sync() was called.

 The reason is CONFIG_DEBUG_SHIRQ which makes a simulated interrupt event 
be sent at the end of free_irq().  It looks like a problem that is 
complementary to one I signalled here:

http://lkml.org/lkml/2007/9/12/82

with respect to request_irq(), where, similarly, such an interrupt event 
is sent at the beginning.  It looks like nobody was concerned back then, 
but perhaps it is time to do a better investigation now and propose a 
solution.

 I'll think about it and thanks for your inquisitiveness that has led to 
these conclusions.

  Maciej
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-15 Thread Maciej W. Rozycki

On Mon, 15 Oct 2007, Jarek Poplawski wrote:

> Could you explain why cancel_work_sync() is better here than
> flush_scheduled_work() wrt. rtnl_lock()?

 Well, this is actually the bit that made cancel_work_sync() be written in 
the first place.  The short story is the netlink lock is most probably 
held at this point (depending on the usage of phy_disconnect()) and there 
is also an event waiting in the queue that requires the lock, so if 
flush_scheduled_work() is called here a deadlock will happen.

 Let me find a reference for a longer story...:

http://www.linux-mips.org/cgi-bin/mesg.cgi?a=linux-mips=Pine.LNX.4.64N.0610031509380.4642%40blysk.ds.pg.gda.pl

and then discussed again:

http://www.uwsg.indiana.edu/hypermail/linux/kernel/0612.0/0593.html

  Maciej
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-15 Thread Jarek Poplawski

On 19-09-2007 16:38, Maciej W. Rozycki wrote:
...
> @@ -661,13 +664,22 @@ int phy_stop_interrupts(struct phy_devic
>   if (err)
>   phy_error(phydev);
>  
> + free_irq(phydev->irq, phydev);
> +
>   /*
> -  * Finish any pending work; we might have been scheduled to be called
> -  * from keventd ourselves, but cancel_work_sync() handles that.
> +  * Cannot call flush_scheduled_work() here as desired because
> +  * of rtnl_lock(), but we do not really care about what would
> +  * be done, except from enable_irq(), so cancel any work
> +  * possibly pending and take care of the matter below.
>*/
>   cancel_work_sync(>phy_queue);

Hi,

Could you explain why cancel_work_sync() is better here than
flush_scheduled_work() wrt. rtnl_lock()?

Regards,
Jarek P.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-15 Thread Jarek Poplawski

On 19-09-2007 16:38, Maciej W. Rozycki wrote:
...
 @@ -661,13 +664,22 @@ int phy_stop_interrupts(struct phy_devic
   if (err)
   phy_error(phydev);
  
 + free_irq(phydev-irq, phydev);
 +
   /*
 -  * Finish any pending work; we might have been scheduled to be called
 -  * from keventd ourselves, but cancel_work_sync() handles that.
 +  * Cannot call flush_scheduled_work() here as desired because
 +  * of rtnl_lock(), but we do not really care about what would
 +  * be done, except from enable_irq(), so cancel any work
 +  * possibly pending and take care of the matter below.
*/
   cancel_work_sync(phydev-phy_queue);

Hi,

Could you explain why cancel_work_sync() is better here than
flush_scheduled_work() wrt. rtnl_lock()?

Regards,
Jarek P.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-15 Thread Maciej W. Rozycki

On Mon, 15 Oct 2007, Jarek Poplawski wrote:

 Could you explain why cancel_work_sync() is better here than
 flush_scheduled_work() wrt. rtnl_lock()?

 Well, this is actually the bit that made cancel_work_sync() be written in 
the first place.  The short story is the netlink lock is most probably 
held at this point (depending on the usage of phy_disconnect()) and there 
is also an event waiting in the queue that requires the lock, so if 
flush_scheduled_work() is called here a deadlock will happen.

 Let me find a reference for a longer story...:

http://www.linux-mips.org/cgi-bin/mesg.cgi?a=linux-mipsi=Pine.LNX.4.64N.0610031509380.4642%40blysk.ds.pg.gda.pl

and then discussed again:

http://www.uwsg.indiana.edu/hypermail/linux/kernel/0612.0/0593.html

  Maciej
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-09-21 Thread Andrew Morton

On Fri, 21 Sep 2007 13:51:12 +0100 (BST)
"Maciej W. Rozycki" <[EMAIL PROTECTED]> wrote:

> On Thu, 20 Sep 2007, Andrew Morton wrote:
> 
> > You always put boring, crappy, insufficient text in the for-the-changelog
> > section and interesting, useful, sufficient text in the 
> > not-for-the-changelog
> > section.
> 
>  I'll swap the sections in the future then. ;-)  Frankly I was not sure 
> whether the changelog was happy about being fed with lengthy explanations 
> and it has not spoken out.

I think it's worth putting plenty of details in the changelog: it's compressed
on-disk and on-the-wire and is overall pretty cheap.  If people don't
actually seek the information out, it has close to zero impact on them.

But on those occasions when people _do_ seek the information out (and it
can be years later) then they want every drop of information they can get.

Numerous times I've gone back to the 2.5.x mm/ changelogs to work out
what on earth we were thinking when we did something, and it has proved
quite useful in explaining the existing code, or in suggesting possible
problems which we had forgotten about by 2007.

otoh, you can get a lot of handy info by googling for strategic parts of
the kernel code, or by googling snippets of the existing-but-short
changelog.  For example, this patch: google for "Keep track of
disable_irq_nosync() invocations" and voila.  Perhaps we don't need
changelogs at all ;)

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-09-21 Thread Maciej W. Rozycki

On Thu, 20 Sep 2007, Andrew Morton wrote:

> You always put boring, crappy, insufficient text in the for-the-changelog
> section and interesting, useful, sufficient text in the not-for-the-changelog
> section.

 I'll swap the sections in the future then. ;-)  Frankly I was not sure 
whether the changelog was happy about being fed with lengthy explanations 
and it has not spoken out.

 I have to admit this is a habit carried over from the FSF-style ChangeLog 
-- where the enforced rule is actually *not* to provide any explanation 
for why changes are done and only describe what has been modified (with 
any discussion around archived in the respective mailing list).

> But you can't fool me!  I have an editor and I fix it up.

 Thank you and sorry for the extra work I caused you -- I shall keep your 
suggestion in mind in the future.

  Maciej
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-09-21 Thread Maciej W. Rozycki

On Thu, 20 Sep 2007, Andrew Morton wrote:

 You always put boring, crappy, insufficient text in the for-the-changelog
 section and interesting, useful, sufficient text in the not-for-the-changelog
 section.

 I'll swap the sections in the future then. ;-)  Frankly I was not sure 
whether the changelog was happy about being fed with lengthy explanations 
and it has not spoken out.

 I have to admit this is a habit carried over from the FSF-style ChangeLog 
-- where the enforced rule is actually *not* to provide any explanation 
for why changes are done and only describe what has been modified (with 
any discussion around archived in the respective mailing list).

 But you can't fool me!  I have an editor and I fix it up.

 Thank you and sorry for the extra work I caused you -- I shall keep your 
suggestion in mind in the future.

  Maciej
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-09-21 Thread Andrew Morton

On Fri, 21 Sep 2007 13:51:12 +0100 (BST)
Maciej W. Rozycki [EMAIL PROTECTED] wrote:

 On Thu, 20 Sep 2007, Andrew Morton wrote:
 
  You always put boring, crappy, insufficient text in the for-the-changelog
  section and interesting, useful, sufficient text in the 
  not-for-the-changelog
  section.
 
  I'll swap the sections in the future then. ;-)  Frankly I was not sure 
 whether the changelog was happy about being fed with lengthy explanations 
 and it has not spoken out.

I think it's worth putting plenty of details in the changelog: it's compressed
on-disk and on-the-wire and is overall pretty cheap.  If people don't
actually seek the information out, it has close to zero impact on them.

But on those occasions when people _do_ seek the information out (and it
can be years later) then they want every drop of information they can get.

Numerous times I've gone back to the 2.5.x mm/ changelogs to work out
what on earth we were thinking when we did something, and it has proved
quite useful in explaining the existing code, or in suggesting possible
problems which we had forgotten about by 2007.


otoh, you can get a lot of handy info by googling for strategic parts of
the kernel code, or by googling snippets of the existing-but-short
changelog.  For example, this patch: google for Keep track of
disable_irq_nosync() invocations and voila.  Perhaps we don't need
changelogs at all ;)

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-09-20 Thread Andrew Morton

On Wed, 19 Sep 2007 15:38:19 +0100 (BST)
"Maciej W. Rozycki" <[EMAIL PROTECTED]> wrote:

>  Keep track of disable_irq_nosync() invocations and call enable_irq() the 
> right number of times if work has been cancelled that would include them.
> 
> Signed-off-by: Maciej W. Rozycki <[EMAIL PROTECTED]>
> ---
>  Now that the call to flush_work_keventd() (problematic because of 
> rtnl_mutex being held) has been replaced by cancel_work_sync() another 
> issue has arisen and been left unresolved.  As the MDIO bus cannot be 
> accessed from the interrupt context the PHY interrupt handler uses 
> disable_irq_nosync() to prevent from looping and schedules some work to be 
> done as a softirq, which, apart from handling the state change of the 
> originating PHY, is responsible for reenabling the interrupt.  Now if the 
> interrupt line is shared by another device and a call to the softirq 
> handler has been cancelled, that call to enable_irq() never happens and 
> the other device cannot use its interrupt anymore as its stuck disabled.
> 
>  I decided to use a counter rather than a flag because there may be more 
> than one call to phy_change() cancelled in the queue -- a real one and a 
> fake one triggered by free_irq() if DEBUG_SHIRQ is used, if nothing else.  
> Therefore because of its nesting property enable_irq() has to be called 
> the right number of times to match the number disable_irq_nosync() was 
> called and restore the original state.  This DEBUG_SHIRQ feature is also 
> the reason why free_irq() has to be called before cancel_work_sync().
> 
>  While at it I updated the comment about phy_stop_interrupts() being 
> called from `keventd' -- this is no longer relevant as the use of 
> cancel_work_sync() makes such an approach unnecessary.  OTOH a similar 
> comment referring to flush_scheduled_work() in phy_stop() still applies as 
> using cancel_work_sync() there would be dangerous.
> 
>  Checked with checkpatch.pl and at the run time (with and without 
> DEBUG_SHIRQ).

You always put boring, crappy, insufficient text in the for-the-changelog
section and interesting, useful, sufficient text in the not-for-the-changelog
section.

But you can't fool me!  I have an editor and I fix it up.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-09-20 Thread Andrew Morton

On Wed, 19 Sep 2007 15:38:19 +0100 (BST)
Maciej W. Rozycki [EMAIL PROTECTED] wrote:

  Keep track of disable_irq_nosync() invocations and call enable_irq() the 
 right number of times if work has been cancelled that would include them.
 
 Signed-off-by: Maciej W. Rozycki [EMAIL PROTECTED]
 ---
  Now that the call to flush_work_keventd() (problematic because of 
 rtnl_mutex being held) has been replaced by cancel_work_sync() another 
 issue has arisen and been left unresolved.  As the MDIO bus cannot be 
 accessed from the interrupt context the PHY interrupt handler uses 
 disable_irq_nosync() to prevent from looping and schedules some work to be 
 done as a softirq, which, apart from handling the state change of the 
 originating PHY, is responsible for reenabling the interrupt.  Now if the 
 interrupt line is shared by another device and a call to the softirq 
 handler has been cancelled, that call to enable_irq() never happens and 
 the other device cannot use its interrupt anymore as its stuck disabled.
 
  I decided to use a counter rather than a flag because there may be more 
 than one call to phy_change() cancelled in the queue -- a real one and a 
 fake one triggered by free_irq() if DEBUG_SHIRQ is used, if nothing else.  
 Therefore because of its nesting property enable_irq() has to be called 
 the right number of times to match the number disable_irq_nosync() was 
 called and restore the original state.  This DEBUG_SHIRQ feature is also 
 the reason why free_irq() has to be called before cancel_work_sync().
 
  While at it I updated the comment about phy_stop_interrupts() being 
 called from `keventd' -- this is no longer relevant as the use of 
 cancel_work_sync() makes such an approach unnecessary.  OTOH a similar 
 comment referring to flush_scheduled_work() in phy_stop() still applies as 
 using cancel_work_sync() there would be dangerous.
 
  Checked with checkpatch.pl and at the run time (with and without 
 DEBUG_SHIRQ).

You always put boring, crappy, insufficient text in the for-the-changelog
section and interesting, useful, sufficient text in the not-for-the-changelog
section.

But you can't fool me!  I have an editor and I fix it up.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-09-19 Thread Maciej W. Rozycki

 Keep track of disable_irq_nosync() invocations and call enable_irq() the 
right number of times if work has been cancelled that would include them.

Signed-off-by: Maciej W. Rozycki <[EMAIL PROTECTED]>
---
 Now that the call to flush_work_keventd() (problematic because of 
rtnl_mutex being held) has been replaced by cancel_work_sync() another 
issue has arisen and been left unresolved.  As the MDIO bus cannot be 
accessed from the interrupt context the PHY interrupt handler uses 
disable_irq_nosync() to prevent from looping and schedules some work to be 
done as a softirq, which, apart from handling the state change of the 
originating PHY, is responsible for reenabling the interrupt.  Now if the 
interrupt line is shared by another device and a call to the softirq 
handler has been cancelled, that call to enable_irq() never happens and 
the other device cannot use its interrupt anymore as its stuck disabled.

 I decided to use a counter rather than a flag because there may be more 
than one call to phy_change() cancelled in the queue -- a real one and a 
fake one triggered by free_irq() if DEBUG_SHIRQ is used, if nothing else.  
Therefore because of its nesting property enable_irq() has to be called 
the right number of times to match the number disable_irq_nosync() was 
called and restore the original state.  This DEBUG_SHIRQ feature is also 
the reason why free_irq() has to be called before cancel_work_sync().

 While at it I updated the comment about phy_stop_interrupts() being 
called from `keventd' -- this is no longer relevant as the use of 
cancel_work_sync() makes such an approach unnecessary.  OTOH a similar 
comment referring to flush_scheduled_work() in phy_stop() still applies as 
using cancel_work_sync() there would be dangerous.

 Checked with checkpatch.pl and at the run time (with and without 
DEBUG_SHIRQ).

 Please apply.

  Maciej
 
patch-mips-2.6.23-rc5-20070904-phy-irq-fix-9
diff -up --recursive --new-file 
linux-mips-2.6.23-rc5-20070904.macro/drivers/net/phy/phy.c 
linux-mips-2.6.23-rc5-20070904/drivers/net/phy/phy.c
--- linux-mips-2.6.23-rc5-20070904.macro/drivers/net/phy/phy.c  2007-09-16 
17:17:20.0 +
+++ linux-mips-2.6.23-rc5-20070904/drivers/net/phy/phy.c2007-09-18 
14:28:07.0 +
@@ -7,7 +7,7 @@
  * Author: Andy Fleming
  *
  * Copyright (c) 2004 Freescale Semiconductor, Inc.
- * Copyright (c) 2006  Maciej W. Rozycki
+ * Copyright (c) 2006, 2007  Maciej W. Rozycki
  *
  * This program is free software; you can redistribute  it and/or modify it
  * under  the terms of  the GNU General  Public License as published by the
@@ -35,6 +35,7 @@
 #include 
 #include 
 
+#include 
 #include 
 #include 
 #include 
@@ -561,6 +562,7 @@ static irqreturn_t phy_interrupt(int irq
 * queue will write the PHY to disable and clear the
 * interrupt, and then reenable the irq line. */
disable_irq_nosync(irq);
+   atomic_inc(>irq_disable);
 
schedule_work(>phy_queue);
 
@@ -631,6 +633,7 @@ int phy_start_interrupts(struct phy_devi
 
INIT_WORK(>phy_queue, phy_change);
 
+   atomic_set(>irq_disable, 0);
if (request_irq(phydev->irq, phy_interrupt,
IRQF_SHARED,
"phy_interrupt",
@@ -661,13 +664,22 @@ int phy_stop_interrupts(struct phy_devic
if (err)
phy_error(phydev);
 
+   free_irq(phydev->irq, phydev);
+
/*
-* Finish any pending work; we might have been scheduled to be called
-* from keventd ourselves, but cancel_work_sync() handles that.
+* Cannot call flush_scheduled_work() here as desired because
+* of rtnl_lock(), but we do not really care about what would
+* be done, except from enable_irq(), so cancel any work
+* possibly pending and take care of the matter below.
 */
cancel_work_sync(>phy_queue);
-
-   free_irq(phydev->irq, phydev);
+   /*
+* If work indeed has been cancelled, disable_irq() will have
+* been left unbalanced from phy_interrupt() and enable_irq()
+* has to be called so that other devices on the line work.
+*/
+   while (atomic_dec_return(>irq_disable) >= 0)
+   enable_irq(phydev->irq);
 
return err;
 }
@@ -694,6 +706,7 @@ static void phy_change(struct work_struc
phydev->state = PHY_CHANGELINK;
spin_unlock(>lock);
 
+   atomic_dec(>irq_disable);
enable_irq(phydev->irq);
 
/* Reenable interrupts */
@@ -707,6 +720,7 @@ static void phy_change(struct work_struc
 
 irq_enable_err:
disable_irq(phydev->irq);
+   atomic_inc(>irq_disable);
 phy_err:
phy_error(phydev);
 }
diff -up --recursive --new-file 
linux-mips-2.6.23-rc5-20070904.macro/include/linux/phy.h 
linux-mips-2.6.23-rc5-20070904/include/linux/phy.h
--- linux-mips-2.6.23-rc5-20070904.macro/include/linux/phy.h2007-07-10 
04:56:57.0 +
+++

[PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-09-19 Thread Maciej W. Rozycki

 Keep track of disable_irq_nosync() invocations and call enable_irq() the 
right number of times if work has been cancelled that would include them.

Signed-off-by: Maciej W. Rozycki [EMAIL PROTECTED]
---
 Now that the call to flush_work_keventd() (problematic because of 
rtnl_mutex being held) has been replaced by cancel_work_sync() another 
issue has arisen and been left unresolved.  As the MDIO bus cannot be 
accessed from the interrupt context the PHY interrupt handler uses 
disable_irq_nosync() to prevent from looping and schedules some work to be 
done as a softirq, which, apart from handling the state change of the 
originating PHY, is responsible for reenabling the interrupt.  Now if the 
interrupt line is shared by another device and a call to the softirq 
handler has been cancelled, that call to enable_irq() never happens and 
the other device cannot use its interrupt anymore as its stuck disabled.

 I decided to use a counter rather than a flag because there may be more 
than one call to phy_change() cancelled in the queue -- a real one and a 
fake one triggered by free_irq() if DEBUG_SHIRQ is used, if nothing else.  
Therefore because of its nesting property enable_irq() has to be called 
the right number of times to match the number disable_irq_nosync() was 
called and restore the original state.  This DEBUG_SHIRQ feature is also 
the reason why free_irq() has to be called before cancel_work_sync().

 While at it I updated the comment about phy_stop_interrupts() being 
called from `keventd' -- this is no longer relevant as the use of 
cancel_work_sync() makes such an approach unnecessary.  OTOH a similar 
comment referring to flush_scheduled_work() in phy_stop() still applies as 
using cancel_work_sync() there would be dangerous.

 Checked with checkpatch.pl and at the run time (with and without 
DEBUG_SHIRQ).

 Please apply.

  Maciej
 
patch-mips-2.6.23-rc5-20070904-phy-irq-fix-9
diff -up --recursive --new-file 
linux-mips-2.6.23-rc5-20070904.macro/drivers/net/phy/phy.c 
linux-mips-2.6.23-rc5-20070904/drivers/net/phy/phy.c
--- linux-mips-2.6.23-rc5-20070904.macro/drivers/net/phy/phy.c  2007-09-16 
17:17:20.0 +
+++ linux-mips-2.6.23-rc5-20070904/drivers/net/phy/phy.c2007-09-18 
14:28:07.0 +
@@ -7,7 +7,7 @@
  * Author: Andy Fleming
  *
  * Copyright (c) 2004 Freescale Semiconductor, Inc.
- * Copyright (c) 2006  Maciej W. Rozycki
+ * Copyright (c) 2006, 2007  Maciej W. Rozycki
  *
  * This program is free software; you can redistribute  it and/or modify it
  * under  the terms of  the GNU General  Public License as published by the
@@ -35,6 +35,7 @@
 #include linux/timer.h
 #include linux/workqueue.h
 
+#include asm/atomic.h
 #include asm/io.h
 #include asm/irq.h
 #include asm/uaccess.h
@@ -561,6 +562,7 @@ static irqreturn_t phy_interrupt(int irq
 * queue will write the PHY to disable and clear the
 * interrupt, and then reenable the irq line. */
disable_irq_nosync(irq);
+   atomic_inc(phydev-irq_disable);
 
schedule_work(phydev-phy_queue);
 
@@ -631,6 +633,7 @@ int phy_start_interrupts(struct phy_devi
 
INIT_WORK(phydev-phy_queue, phy_change);
 
+   atomic_set(phydev-irq_disable, 0);
if (request_irq(phydev-irq, phy_interrupt,
IRQF_SHARED,
phy_interrupt,
@@ -661,13 +664,22 @@ int phy_stop_interrupts(struct phy_devic
if (err)
phy_error(phydev);
 
+   free_irq(phydev-irq, phydev);
+
/*
-* Finish any pending work; we might have been scheduled to be called
-* from keventd ourselves, but cancel_work_sync() handles that.
+* Cannot call flush_scheduled_work() here as desired because
+* of rtnl_lock(), but we do not really care about what would
+* be done, except from enable_irq(), so cancel any work
+* possibly pending and take care of the matter below.
 */
cancel_work_sync(phydev-phy_queue);
-
-   free_irq(phydev-irq, phydev);
+   /*
+* If work indeed has been cancelled, disable_irq() will have
+* been left unbalanced from phy_interrupt() and enable_irq()
+* has to be called so that other devices on the line work.
+*/
+   while (atomic_dec_return(phydev-irq_disable) = 0)
+   enable_irq(phydev-irq);
 
return err;
 }
@@ -694,6 +706,7 @@ static void phy_change(struct work_struc
phydev-state = PHY_CHANGELINK;
spin_unlock(phydev-lock);
 
+   atomic_dec(phydev-irq_disable);
enable_irq(phydev-irq);
 
/* Reenable interrupts */
@@ -707,6 +720,7 @@ static void phy_change(struct work_struc
 
 irq_enable_err:
disable_irq(phydev-irq);
+   atomic_inc(phydev-irq_disable);
 phy_err:
phy_error(phydev);
 }
diff -up --recursive --new-file 
linux-mips-2.6.23-rc5-20070904.macro/include/linux/phy.h

64 matches

Mail list logo