Re: [PATCH] flush_work_sync vs. flush_scheduled_work Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Mon, Oct 22, 2007 at 10:02:59PM +0400, Oleg Nesterov wrote: ... > If this work doesn't rearm itself - yes. (otherwise, the same ->func > can run twice _at the same time_) > > But again, in this case wait_on_work() after try_to_grab_pending() == 1 > doesn't block, so we can just do > > if (cancel_work_sync(w)) > w->func(); > ...but, if it were run just before work_clear_pending()? Jarek P. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] flush_work_sync vs. flush_scheduled_work Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Mon, Oct 22, 2007 at 10:02:59PM +0400, Oleg Nesterov wrote: > On 10/22, Jarek Poplawski wrote: ... > > OK, I know I'm dumber and dumber everyday, > > You are not alone. I have the same feeling about myself! Feeling is not the same, only true knowledge counts! > > > these all flushes are > > rtnl lockup vulnerable wrt. other work functions, but cancel_work_sync > > looks perfectly fine > > Yes, > > > Then, if by any chance I'm right, something like flush_work_sync > > (or changed flush_scheduled_work, if there is no problem with such > > a change of implementation) could be safely (if it's called without > > locks used by flushed work only) done cancel_work_sync() way, by > > running a work function after try_to_grab_pending() returns 1 > > If this work doesn't rearm itself - yes. (otherwise, the same ->func > can run twice _at the same time_) > > But again, in this case wait_on_work() after try_to_grab_pending() == 1 > doesn't block, so we can just do > > if (cancel_work_sync(w)) > w->func(); Of course, I should have written it much shorter by saying flush_scheduled_work could be done internally just like you suggested! My point is to make this all safer and simpler, so we don't have to remember each time about these differences wrt. locking. Then checking of such patches could be much easier. Unless this current behavior of flush_scheduled_work has any real advantages, worth of this potential unsafety. (Btw. is there any reason to use this with rearming works, anyway?) Thanks, Jarek P. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] flush_work_sync vs. flush_scheduled_work Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Mon, Oct 22, 2007 at 10:02:59PM +0400, Oleg Nesterov wrote: On 10/22, Jarek Poplawski wrote: ... OK, I know I'm dumber and dumber everyday, You are not alone. I have the same feeling about myself! Feeling is not the same, only true knowledge counts! these all flushes are rtnl lockup vulnerable wrt. other work functions, but cancel_work_sync looks perfectly fine Yes, Then, if by any chance I'm right, something like flush_work_sync (or changed flush_scheduled_work, if there is no problem with such a change of implementation) could be safely (if it's called without locks used by flushed work only) done cancel_work_sync() way, by running a work function after try_to_grab_pending() returns 1 If this work doesn't rearm itself - yes. (otherwise, the same -func can run twice _at the same time_) But again, in this case wait_on_work() after try_to_grab_pending() == 1 doesn't block, so we can just do if (cancel_work_sync(w)) w-func(); Of course, I should have written it much shorter by saying flush_scheduled_work could be done internally just like you suggested! My point is to make this all safer and simpler, so we don't have to remember each time about these differences wrt. locking. Then checking of such patches could be much easier. Unless this current behavior of flush_scheduled_work has any real advantages, worth of this potential unsafety. (Btw. is there any reason to use this with rearming works, anyway?) Thanks, Jarek P. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] flush_work_sync vs. flush_scheduled_work Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Mon, Oct 22, 2007 at 10:02:59PM +0400, Oleg Nesterov wrote: ... If this work doesn't rearm itself - yes. (otherwise, the same -func can run twice _at the same time_) But again, in this case wait_on_work() after try_to_grab_pending() == 1 doesn't block, so we can just do if (cancel_work_sync(w)) w-func(); ...but, if it were run just before work_clear_pending()? Jarek P. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] flush_work_sync vs. flush_scheduled_work Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On 10/22, Jarek Poplawski wrote: > > On Fri, Oct 19, 2007 at 09:50:14AM +0200, Jarek Poplawski wrote: > > On Thu, Oct 18, 2007 at 07:48:19PM +0400, Oleg Nesterov wrote: > > > On 10/18, Jarek Poplawski wrote: > > > > > > > > +/** > > > > + * flush_work_sync - block until a work_struct's callback has > > > > terminated > > > > > > ^^^ > > > Hmm... > > > > > > > + * Similar to cancel_work_sync() but will only busy wait (without > > > > cancel) > > > > + * if the work is queued. > > > > > > Yes, it won't block, but will spin in busy-wait loop until all other works > > > scheduled before this work are finished. Not good. After that it really > > > blocks waiting for this work to complete. > > > > > > And I am a bit confused. We can't use flush_workqueue() because some of > > > the > > > queued work_structs may take rtnl_lock, yes? But in that case we can't use > > > the new flush_work_sync() helper as well, no? > > OK, I know I'm dumber and dumber everyday, You are not alone. I have the same feeling about myself! > these all flushes are > rtnl lockup vulnerable wrt. other work functions, but cancel_work_sync > looks perfectly fine Yes, > Then, if by any chance I'm right, something like flush_work_sync > (or changed flush_scheduled_work, if there is no problem with such > a change of implementation) could be safely (if it's called without > locks used by flushed work only) done cancel_work_sync() way, by > running a work function after try_to_grab_pending() returns 1 If this work doesn't rearm itself - yes. (otherwise, the same ->func can run twice _at the same time_) But again, in this case wait_on_work() after try_to_grab_pending() == 1 doesn't block, so we can just do if (cancel_work_sync(w)) w->func(); Oleg. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] flush_work_sync vs. flush_scheduled_work Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Fri, Oct 19, 2007 at 09:50:14AM +0200, Jarek Poplawski wrote: > On Thu, Oct 18, 2007 at 07:48:19PM +0400, Oleg Nesterov wrote: > > On 10/18, Jarek Poplawski wrote: > > > > > > +/** > > > + * flush_work_sync - block until a work_struct's callback has terminated > > ^^^ > > Hmm... > > > > > + * Similar to cancel_work_sync() but will only busy wait (without cancel) > > > + * if the work is queued. > > > > Yes, it won't block, but will spin in busy-wait loop until all other works > > scheduled before this work are finished. Not good. After that it really > > blocks waiting for this work to complete. > > > > And I am a bit confused. We can't use flush_workqueue() because some of the > > queued work_structs may take rtnl_lock, yes? But in that case we can't use > > the new flush_work_sync() helper as well, no? OK, I know I'm dumber and dumber everyday, but it seems in a hurry I got it wrong again or miss something (as usual): these all flushes are rtnl lockup vulnerable wrt. other work functions, but cancel_work_sync looks perfectly fine... (Or am I wrong because: ...?) Then, if by any chance I'm right, something like flush_work_sync (or changed flush_scheduled_work, if there is no problem with such a change of implementation) could be safely (if it's called without locks used by flushed work only) done cancel_work_sync() way, by running a work function after try_to_grab_pending() returns 1 (after list_del_init - of course without respecting a queue order). Regards, Jarek P. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] flush_work_sync vs. flush_scheduled_work Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Fri, Oct 19, 2007 at 09:50:14AM +0200, Jarek Poplawski wrote: On Thu, Oct 18, 2007 at 07:48:19PM +0400, Oleg Nesterov wrote: On 10/18, Jarek Poplawski wrote: +/** + * flush_work_sync - block until a work_struct's callback has terminated ^^^ Hmm... + * Similar to cancel_work_sync() but will only busy wait (without cancel) + * if the work is queued. Yes, it won't block, but will spin in busy-wait loop until all other works scheduled before this work are finished. Not good. After that it really blocks waiting for this work to complete. And I am a bit confused. We can't use flush_workqueue() because some of the queued work_structs may take rtnl_lock, yes? But in that case we can't use the new flush_work_sync() helper as well, no? OK, I know I'm dumber and dumber everyday, but it seems in a hurry I got it wrong again or miss something (as usual): these all flushes are rtnl lockup vulnerable wrt. other work functions, but cancel_work_sync looks perfectly fine... (Or am I wrong because: ...?) Then, if by any chance I'm right, something like flush_work_sync (or changed flush_scheduled_work, if there is no problem with such a change of implementation) could be safely (if it's called without locks used by flushed work only) done cancel_work_sync() way, by running a work function after try_to_grab_pending() returns 1 (after list_del_init - of course without respecting a queue order). Regards, Jarek P. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] flush_work_sync vs. flush_scheduled_work Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On 10/22, Jarek Poplawski wrote: On Fri, Oct 19, 2007 at 09:50:14AM +0200, Jarek Poplawski wrote: On Thu, Oct 18, 2007 at 07:48:19PM +0400, Oleg Nesterov wrote: On 10/18, Jarek Poplawski wrote: +/** + * flush_work_sync - block until a work_struct's callback has terminated ^^^ Hmm... + * Similar to cancel_work_sync() but will only busy wait (without cancel) + * if the work is queued. Yes, it won't block, but will spin in busy-wait loop until all other works scheduled before this work are finished. Not good. After that it really blocks waiting for this work to complete. And I am a bit confused. We can't use flush_workqueue() because some of the queued work_structs may take rtnl_lock, yes? But in that case we can't use the new flush_work_sync() helper as well, no? OK, I know I'm dumber and dumber everyday, You are not alone. I have the same feeling about myself! these all flushes are rtnl lockup vulnerable wrt. other work functions, but cancel_work_sync looks perfectly fine Yes, Then, if by any chance I'm right, something like flush_work_sync (or changed flush_scheduled_work, if there is no problem with such a change of implementation) could be safely (if it's called without locks used by flushed work only) done cancel_work_sync() way, by running a work function after try_to_grab_pending() returns 1 If this work doesn't rearm itself - yes. (otherwise, the same -func can run twice _at the same time_) But again, in this case wait_on_work() after try_to_grab_pending() == 1 doesn't block, so we can just do if (cancel_work_sync(w)) w-func(); Oleg. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
> Actually I'm not convinced with this explanation. It seems to me that > since there are such serious locking problems (especially with rntl), > there could be once more considered a private workqueue. You've > written earlier about being a lonely user of this code. But, since > Benjamin offered his help with changing to mutexes, which looks like > very reasonable idea to me (probably I miss most of the points...), > maybe it's very good opportunity to both: make this code better and > double the user base! I'm interested in looking for such solution > if Benjamin thinks there could be too few problems for him... So, > let somebody tell us what could be wrong with this idea? My main problem is time :-) But I need to do the change to mutex if I am to use phylib for emac. I can't tell when I'll have time to do it tho. Ben. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Fri, 19 Oct 2007, Jarek Poplawski wrote: > Actually I'm not convinced with this explanation. It seems to me that > since there are such serious locking problems (especially with rntl), > there could be once more considered a private workqueue. You've > written earlier about being a lonely user of this code. But, since Well, this will change eventually when I submit the patch for Broadcom SiByte platforms so that sb1250-mac.c will be able to request an interrupt for the PHYs. All the infrastructure is in place except from platform code to configure the SOC's GPIO line used for the interrupt input (when applicable) and let the driver know what the line actually is. > Benjamin offered his help with changing to mutexes, which looks like > very reasonable idea to me (probably I miss most of the points...), > maybe it's very good opportunity to both: make this code better and > double the user base! I'm interested in looking for such solution > if Benjamin thinks there could be too few problems for him... So, > let somebody tell us what could be wrong with this idea? I do not object and can certainly cooperate, but cannot make it a high priority work for me at the moment -- sorry. Maciej - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Fri, Oct 19, 2007 at 12:38:29PM +0100, Maciej W. Rozycki wrote: > On Thu, 18 Oct 2007, Maciej W. Rozycki wrote: > > > > 1) phy_change() checks PHY_HALTED flag without lock; I think it's > > > racy: eg. if it's done during phy_stop() it can check just before > > > the flag is set and reenable interrupts just after phy_stop() ends. > > > > I remember having a look into it, but it was long ago and I cannot > > immediately recall the conclusion. Which means it is either broken or > > deserves a comment as non-obvious. I will have a look into it again, but > > I am resource-starved a little at the moment, sorry. > > Well, I have now recalled what the issue is -- we just plainly and simply > want to avoid a hardirq spinlock for the very reason we do not do all the > processing in the hardirq handler. The thing is we make accesses to the > MDIO bus with the phydev lock held and it may take ages until these > accesses will have completed. And we cannot afford keeping interrupts > disabled for so long. > > So the only way is to make the check for the HALTED state lockless and > make sure any race condition is handled gracefully and does not lead to > inconsistent behaviour. Which I think as of what we have in the > net-2.6.24 tree is the case, but there are never too many eyes to look at > a piece of code, so if anybody feels like proving me wrong, then just go > ahead! Actually I'm not convinced with this explanation. It seems to me that since there are such serious locking problems (especially with rntl), there could be once more considered a private workqueue. You've written earlier about being a lonely user of this code. But, since Benjamin offered his help with changing to mutexes, which looks like very reasonable idea to me (probably I miss most of the points...), maybe it's very good opportunity to both: make this code better and double the user base! I'm interested in looking for such solution if Benjamin thinks there could be too few problems for him... So, let somebody tell us what could be wrong with this idea? Cheers (till Monday), Jarek P. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Fri, 19 Oct 2007, Jarek Poplawski wrote: > But then... your patch seems to make it possible, because it enables > irq to the initial state of the counter. Of course, this could happen > on closing only. That's because free_irq() does not disable the interrupt in the correct manner. The scenario is more or less like this: phy_interrupt() [depth == 0] disable_irq() depth++; status |= IRQ_DISABLED; ... free_irq() [depth == 1] status |= IRQ_DISABLED; ... phy_change()[depth == 1] enable_irq() depth--; status &= ~IRQ_DISABLED; oops! Now if free_irq() correctly incremented the depth counter, then the last enable_irq() would still decrement it, but with its initial value of 2 it would not change the status to reenable the line. Maciej - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Thu, 18 Oct 2007, Maciej W. Rozycki wrote: > > 1) phy_change() checks PHY_HALTED flag without lock; I think it's > > racy: eg. if it's done during phy_stop() it can check just before > > the flag is set and reenable interrupts just after phy_stop() ends. > > I remember having a look into it, but it was long ago and I cannot > immediately recall the conclusion. Which means it is either broken or > deserves a comment as non-obvious. I will have a look into it again, but > I am resource-starved a little at the moment, sorry. Well, I have now recalled what the issue is -- we just plainly and simply want to avoid a hardirq spinlock for the very reason we do not do all the processing in the hardirq handler. The thing is we make accesses to the MDIO bus with the phydev lock held and it may take ages until these accesses will have completed. And we cannot afford keeping interrupts disabled for so long. So the only way is to make the check for the HALTED state lockless and make sure any race condition is handled gracefully and does not lead to inconsistent behaviour. Which I think as of what we have in the net-2.6.24 tree is the case, but there are never too many eyes to look at a piece of code, so if anybody feels like proving me wrong, then just go ahead! Maciej - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Thu, Oct 18, 2007 at 12:30:35PM +0100, Maciej W. Rozycki wrote: > On Wed, 17 Oct 2007, Jarek Poplawski wrote: ... > > 2) phy_change() doesn't reenable irq line after it sees returns > > with errors; IMHO it should at least write some warning, but maybe > > try some safety plan, so enable_irq() and try to disable interrupts > > and free_irq() on the next call (if it happens). (But, I can be very > > wrong with this - maybe it's OK and official way.) > > No way to do this safely -- at this point the device probably still has > its interrupt output asserted and the register to clear it is > inaccessible, so enabling the line will enter an infinite loop. At this > point the system is no longer stable, so it is better to keep at least > some functionality, so that it may be attempted to be shut down cleanly, > rather than make it completely irresponsive. The alternative is panic(). But then... your patch seems to make it possible, because it enables irq to the initial state of the counter. Of course, this could happen on closing only. Jarek P. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] flush_work_sync vs. flush_scheduled_work Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Thu, 2007-10-18 at 19:48 +0400, Oleg Nesterov wrote: > > +void flush_work_sync(struct work_struct *work) > If we really the new helper, perhaps we can make it a bit better? > > 1. Modify insert_work() to take the "struct list_head *at" parameter instead >of "int tail". I think this patch will also cleanup the code a bit, and >shrink a couple of bytes from .text > > 2. flush_work_sync() inserts a barrier right after this work and blocks. >We still need some retry logic to handle the queueing is in progress >of course, but we won't spin waiting for the other works. 3. Add lockdep annotation like the other API. :) Andrew just sent my patch (used to be two patches by somebody's request but that's fine) titled "workqueue: debug flushing deadlocks with lockdep" to Linus. johannes signature.asc Description: This is a digitally signed message part
Re: [PATCH] flush_work_sync vs. flush_scheduled_work Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Fri, Oct 19, 2007 at 09:50:14AM +0200, Jarek Poplawski wrote: ... > sched_work_sync() with rtnl_lock(). It's only less probable to lockup > with this than with flush_schedule_work(). ...But, not much less... Jarek P. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] flush_work_sync vs. flush_scheduled_work Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Thu, Oct 18, 2007 at 07:48:19PM +0400, Oleg Nesterov wrote: > On 10/18, Jarek Poplawski wrote: > > > > +/** > > + * flush_work_sync - block until a work_struct's callback has terminated > ^^^ > Hmm... > > > + * Similar to cancel_work_sync() but will only busy wait (without cancel) > > + * if the work is queued. > > Yes, it won't block, but will spin in busy-wait loop until all other works > scheduled before this work are finished. Not good. After that it really > blocks waiting for this work to complete. > > And I am a bit confused. We can't use flush_workqueue() because some of the > queued work_structs may take rtnl_lock, yes? But in that case we can't use > the new flush_work_sync() helper as well, no? OOOPS! Of course, we can't!!! I remembered there was this issue long time ago, but then I've had some break in tracking net & workqueue. So, while reading this patch I was alarmed at first, and self-misled later. I think, there is definitely needed some warning about locking (or unlocking) during these flush_ & cancel_ functions. (Btw, I've very much wondered now, why I didn't notice at that 'old' time, that you added such a great feature (wrt. locking) and I even didn't notice this...). So, Maciej (and other readers of this thread) - I withdraw my false opinion from my second message here: it's very wrong to call this sched_work_sync() with rtnl_lock(). It's only less probable to lockup with this than with flush_schedule_work(). > > If we can't just cancel the work, can't we do something like > > if (cancel_work_sync(w)) > w->func(w); > > instead? > > > +void flush_work_sync(struct work_struct *work) > > +{ > > + int ret; > > + > > + do { > > + ret = work_pending(work); > > + wait_on_work(work); > > + if (ret) > > + cpu_relax(); > > + } while (ret); > > +} > > If we really the new helper, perhaps we can make it a bit better? > > 1. Modify insert_work() to take the "struct list_head *at" parameter instead >of "int tail". I think this patch will also cleanup the code a bit, and >shrink a couple of bytes from .text Looks like a very good idea, but I need more time to rethink this. Probably some code example should be helpful. > > 2. flush_work_sync() inserts a barrier right after this work and blocks. >We still need some retry logic to handle the queueing is in progress >of course, but we won't spin waiting for the other works. Until monday I should have an opinion on that (today a bit under fire...). > > What do you think? Since there is no gain wrt. locking with my current proposal, I withdraw this patch of course. It looks like my wrong patch was great idea because we got this very precious Oleg's opinion! (I know I'm a genius sometimes...) Thanks very much, Jarek P. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Fri, Oct 19, 2007 at 12:38:29PM +0100, Maciej W. Rozycki wrote: On Thu, 18 Oct 2007, Maciej W. Rozycki wrote: 1) phy_change() checks PHY_HALTED flag without lock; I think it's racy: eg. if it's done during phy_stop() it can check just before the flag is set and reenable interrupts just after phy_stop() ends. I remember having a look into it, but it was long ago and I cannot immediately recall the conclusion. Which means it is either broken or deserves a comment as non-obvious. I will have a look into it again, but I am resource-starved a little at the moment, sorry. Well, I have now recalled what the issue is -- we just plainly and simply want to avoid a hardirq spinlock for the very reason we do not do all the processing in the hardirq handler. The thing is we make accesses to the MDIO bus with the phydev lock held and it may take ages until these accesses will have completed. And we cannot afford keeping interrupts disabled for so long. So the only way is to make the check for the HALTED state lockless and make sure any race condition is handled gracefully and does not lead to inconsistent behaviour. Which I think as of what we have in the net-2.6.24 tree is the case, but there are never too many eyes to look at a piece of code, so if anybody feels like proving me wrong, then just go ahead! Actually I'm not convinced with this explanation. It seems to me that since there are such serious locking problems (especially with rntl), there could be once more considered a private workqueue. You've written earlier about being a lonely user of this code. But, since Benjamin offered his help with changing to mutexes, which looks like very reasonable idea to me (probably I miss most of the points...), maybe it's very good opportunity to both: make this code better and double the user base! I'm interested in looking for such solution if Benjamin thinks there could be too few problems for him... So, let somebody tell us what could be wrong with this idea? Cheers (till Monday), Jarek P. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
Actually I'm not convinced with this explanation. It seems to me that since there are such serious locking problems (especially with rntl), there could be once more considered a private workqueue. You've written earlier about being a lonely user of this code. But, since Benjamin offered his help with changing to mutexes, which looks like very reasonable idea to me (probably I miss most of the points...), maybe it's very good opportunity to both: make this code better and double the user base! I'm interested in looking for such solution if Benjamin thinks there could be too few problems for him... So, let somebody tell us what could be wrong with this idea? My main problem is time :-) But I need to do the change to mutex if I am to use phylib for emac. I can't tell when I'll have time to do it tho. Ben. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Fri, 19 Oct 2007, Jarek Poplawski wrote: Actually I'm not convinced with this explanation. It seems to me that since there are such serious locking problems (especially with rntl), there could be once more considered a private workqueue. You've written earlier about being a lonely user of this code. But, since Well, this will change eventually when I submit the patch for Broadcom SiByte platforms so that sb1250-mac.c will be able to request an interrupt for the PHYs. All the infrastructure is in place except from platform code to configure the SOC's GPIO line used for the interrupt input (when applicable) and let the driver know what the line actually is. Benjamin offered his help with changing to mutexes, which looks like very reasonable idea to me (probably I miss most of the points...), maybe it's very good opportunity to both: make this code better and double the user base! I'm interested in looking for such solution if Benjamin thinks there could be too few problems for him... So, let somebody tell us what could be wrong with this idea? I do not object and can certainly cooperate, but cannot make it a high priority work for me at the moment -- sorry. Maciej - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Thu, 18 Oct 2007, Maciej W. Rozycki wrote: 1) phy_change() checks PHY_HALTED flag without lock; I think it's racy: eg. if it's done during phy_stop() it can check just before the flag is set and reenable interrupts just after phy_stop() ends. I remember having a look into it, but it was long ago and I cannot immediately recall the conclusion. Which means it is either broken or deserves a comment as non-obvious. I will have a look into it again, but I am resource-starved a little at the moment, sorry. Well, I have now recalled what the issue is -- we just plainly and simply want to avoid a hardirq spinlock for the very reason we do not do all the processing in the hardirq handler. The thing is we make accesses to the MDIO bus with the phydev lock held and it may take ages until these accesses will have completed. And we cannot afford keeping interrupts disabled for so long. So the only way is to make the check for the HALTED state lockless and make sure any race condition is handled gracefully and does not lead to inconsistent behaviour. Which I think as of what we have in the net-2.6.24 tree is the case, but there are never too many eyes to look at a piece of code, so if anybody feels like proving me wrong, then just go ahead! Maciej - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Thu, Oct 18, 2007 at 12:30:35PM +0100, Maciej W. Rozycki wrote: On Wed, 17 Oct 2007, Jarek Poplawski wrote: ... 2) phy_change() doesn't reenable irq line after it sees returns with errors; IMHO it should at least write some warning, but maybe try some safety plan, so enable_irq() and try to disable interrupts and free_irq() on the next call (if it happens). (But, I can be very wrong with this - maybe it's OK and official way.) No way to do this safely -- at this point the device probably still has its interrupt output asserted and the register to clear it is inaccessible, so enabling the line will enter an infinite loop. At this point the system is no longer stable, so it is better to keep at least some functionality, so that it may be attempted to be shut down cleanly, rather than make it completely irresponsive. The alternative is panic(). But then... your patch seems to make it possible, because it enables irq to the initial state of the counter. Of course, this could happen on closing only. Jarek P. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Fri, 19 Oct 2007, Jarek Poplawski wrote: But then... your patch seems to make it possible, because it enables irq to the initial state of the counter. Of course, this could happen on closing only. That's because free_irq() does not disable the interrupt in the correct manner. The scenario is more or less like this: phy_interrupt() [depth == 0] disable_irq() depth++; status |= IRQ_DISABLED; ... free_irq() [depth == 1] status |= IRQ_DISABLED; ... phy_change()[depth == 1] enable_irq() depth--; status = ~IRQ_DISABLED; oops! Now if free_irq() correctly incremented the depth counter, then the last enable_irq() would still decrement it, but with its initial value of 2 it would not change the status to reenable the line. Maciej - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] flush_work_sync vs. flush_scheduled_work Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Thu, 2007-10-18 at 19:48 +0400, Oleg Nesterov wrote: +void flush_work_sync(struct work_struct *work) If we really the new helper, perhaps we can make it a bit better? 1. Modify insert_work() to take the struct list_head *at parameter instead of int tail. I think this patch will also cleanup the code a bit, and shrink a couple of bytes from .text 2. flush_work_sync() inserts a barrier right after this work and blocks. We still need some retry logic to handle the queueing is in progress of course, but we won't spin waiting for the other works. 3. Add lockdep annotation like the other API. :) Andrew just sent my patch (used to be two patches by somebody's request but that's fine) titled workqueue: debug flushing deadlocks with lockdep to Linus. johannes signature.asc Description: This is a digitally signed message part
Re: [PATCH] flush_work_sync vs. flush_scheduled_work Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Fri, Oct 19, 2007 at 09:50:14AM +0200, Jarek Poplawski wrote: ... sched_work_sync() with rtnl_lock(). It's only less probable to lockup with this than with flush_schedule_work(). ...But, not much less... Jarek P. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] flush_work_sync vs. flush_scheduled_work Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Thu, Oct 18, 2007 at 07:48:19PM +0400, Oleg Nesterov wrote: On 10/18, Jarek Poplawski wrote: +/** + * flush_work_sync - block until a work_struct's callback has terminated ^^^ Hmm... + * Similar to cancel_work_sync() but will only busy wait (without cancel) + * if the work is queued. Yes, it won't block, but will spin in busy-wait loop until all other works scheduled before this work are finished. Not good. After that it really blocks waiting for this work to complete. And I am a bit confused. We can't use flush_workqueue() because some of the queued work_structs may take rtnl_lock, yes? But in that case we can't use the new flush_work_sync() helper as well, no? OOOPS! Of course, we can't!!! I remembered there was this issue long time ago, but then I've had some break in tracking net workqueue. So, while reading this patch I was alarmed at first, and self-misled later. I think, there is definitely needed some warning about locking (or unlocking) during these flush_ cancel_ functions. (Btw, I've very much wondered now, why I didn't notice at that 'old' time, that you added such a great feature (wrt. locking) and I even didn't notice this...). So, Maciej (and other readers of this thread) - I withdraw my false opinion from my second message here: it's very wrong to call this sched_work_sync() with rtnl_lock(). It's only less probable to lockup with this than with flush_schedule_work(). If we can't just cancel the work, can't we do something like if (cancel_work_sync(w)) w-func(w); instead? +void flush_work_sync(struct work_struct *work) +{ + int ret; + + do { + ret = work_pending(work); + wait_on_work(work); + if (ret) + cpu_relax(); + } while (ret); +} If we really the new helper, perhaps we can make it a bit better? 1. Modify insert_work() to take the struct list_head *at parameter instead of int tail. I think this patch will also cleanup the code a bit, and shrink a couple of bytes from .text Looks like a very good idea, but I need more time to rethink this. Probably some code example should be helpful. 2. flush_work_sync() inserts a barrier right after this work and blocks. We still need some retry logic to handle the queueing is in progress of course, but we won't spin waiting for the other works. Until monday I should have an opinion on that (today a bit under fire...). What do you think? Since there is no gain wrt. locking with my current proposal, I withdraw this patch of course. It looks like my wrong patch was great idea because we got this very precious Oleg's opinion! (I know I'm a genius sometimes...) Thanks very much, Jarek P. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] flush_work_sync vs. flush_scheduled_work Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Thu, 18 Oct 2007, Oleg Nesterov wrote: > If we can't just cancel the work, can't we do something like > > if (cancel_work_sync(w)) > w->func(w); > > instead? We do an equivalent of this -- all that we care about that w->func(w) would do is enable_irq() and the rest we want to skip at this point. We probably do not need the counter in the end though. Maciej - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] flush_work_sync vs. flush_scheduled_work Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On 10/18, Jarek Poplawski wrote: > > +/** > + * flush_work_sync - block until a work_struct's callback has terminated ^^^ Hmm... > + * Similar to cancel_work_sync() but will only busy wait (without cancel) > + * if the work is queued. Yes, it won't block, but will spin in busy-wait loop until all other works scheduled before this work are finished. Not good. After that it really blocks waiting for this work to complete. And I am a bit confused. We can't use flush_workqueue() because some of the queued work_structs may take rtnl_lock, yes? But in that case we can't use the new flush_work_sync() helper as well, no? If we can't just cancel the work, can't we do something like if (cancel_work_sync(w)) w->func(w); instead? > +void flush_work_sync(struct work_struct *work) > +{ > + int ret; > + > + do { > + ret = work_pending(work); > + wait_on_work(work); > + if (ret) > + cpu_relax(); > + } while (ret); > +} If we really the new helper, perhaps we can make it a bit better? 1. Modify insert_work() to take the "struct list_head *at" parameter instead of "int tail". I think this patch will also cleanup the code a bit, and shrink a couple of bytes from .text 2. flush_work_sync() inserts a barrier right after this work and blocks. We still need some retry logic to handle the queueing is in progress of course, but we won't spin waiting for the other works. What do you think? Oleg. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Thu, 18 Oct 2007, Jarek Poplawski wrote: > Technically until free_irq returns a handler should respond to calls > and with proper hardware it should have no problem with checking if > it's its interrupt, even after disabling this hardware, because of > possible races. Well, the hardirq handler can check it, no problem -- it is just it is so slow, the latency would be unacceptable. The problem with the softirq handler is we do really not want it to be called after the driver has already been shut down and its structures freed. It used to happen before this flush/cancel call was added with the usual effect (oops) as by then they may well have been stamped on already. > But with a scenario like this: > > - disable_irq() > - disable_my_hadrware_irq() > - clear_my_hardware_irq() > - free_irq() > - enable_irq() > > it seems the handler should respond even after free_irq because there > could be still interrupts to resend, originally generated by its > hardware, so such behavior looks very suspicious, at least with some > type of interrupts. These are softirqs, not hardware interrupts, so they are as such not related to *_irq() infrastructure. The flaw is the depth count of IRQ lines is not maintained consistently. This is, according to comments around the code in question, to cover up bugs elsewhere. Not a brillant idea, I am afraid -- there should be no need to reset the depth upon request_irq() and likewise with free_irq(), but there you go. I would be happy to investigate a possible solution and rewrite the necessary bits, but right now I am committed to other stuff, overdue already, sorry. The view could change if we supported hot-pluggable interrupt controllers, but it is not the case at the moment right now, so the interrupt lines are there to stay for the duration of the system lifespan and could be manipulated as necessary. > So, I think, the idea of DEBUG_SHIRQ is generally right and very > useful - but, of course, there could be exceptions, which btw. could > try some hacks under DEBUG_SHIRQ too. And my opinion about > 'properness' was very general (not about phy) too, just like my > 'knowledge' of drivers. The idea is right, no question, but I am not quite sure it has been properly architected into our current design. Actually I am almost sure of the reverse. This is why I was (and still am) interested in feedback on it. > Right! But then some warning could be useful, I presume. Of course; adding one to phy_error() should be straightforward. > > > 4) phy_interrupt() should check return value from schedule_work() and > > > enable irq on 0. > > > > No -- the work already pending will do that. > > How? If schedule_work won't succeed because there is a pending one, > we did disable_irq_nosync 2x, so we would stay at least 1x too deep! Correct and my note is misleading, sorry. The thing is we shouldn't have come here the second time in the first place (which is I think why the check is not there) as handlers for the same line are not allowed to run in parallel (cf. irq_desc->lock and IRQ_INPROGRESS). Perhaps BUG_ON(!schedule_work()) would be appropriate, though I am not sure if we should handle every "impossible" condition we can imagine. Quite a lot of hardirq handlers assume two instances will not run in parallel, so if it ever happened, then a serious breakage would unleash. > But, I've enough of other concerns too, so nothing urgent here... The problem is at the moment I am still probably the only user of this code ;-) -- `grep' through the sources to see how many drivers request an IRQ for their PHY through phylib; unless something has changed very recently. Maciej - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Thu, Oct 18, 2007 at 12:30:35PM +0100, Maciej W. Rozycki wrote: > On Wed, 17 Oct 2007, Jarek Poplawski wrote: > > > I'm not sure free_irq() should maintain the depth count - rather warn > > on not zero. But, IMHO, any activity on freed irq seems suspicious to > > me (and doesn't look like very common), even if it's safe with current > > implementation. > > No way to avoid it with DEBUG_SHIRQ. > > > Yes, these DEBUG_SHIRQ checks are suspicious to me too, but they seem > > to be reasonable only in the case of possible resent irqs (so not for > > all irqs). On the other hand, it seems, proper irq handler with proper > > hardware shouldn't have any problems with such a check. > > What do you mean by "proper irq handler with proper hardware"? Using > softirqs (they used to be called bottom-halves) is actually a natural way > of handling any interrupt which requires extensive processing. Technically until free_irq returns a handler should respond to calls and with proper hardware it should have no problem with checking if it's its interrupt, even after disabling this hardware, because of possible races. But with a scenario like this: - disable_irq() - disable_my_hadrware_irq() - clear_my_hardware_irq() - free_irq() - enable_irq() it seems the handler should respond even after free_irq because there could be still interrupts to resend, originally generated by its hardware, so such behavior looks very suspicious, at least with some type of interrupts. So, I think, the idea of DEBUG_SHIRQ is generally right and very useful - but, of course, there could be exceptions, which btw. could try some hacks under DEBUG_SHIRQ too. And my opinion about 'properness' was very general (not about phy) too, just like my 'knowledge' of drivers. > > > 1) phy_change() checks PHY_HALTED flag without lock; I think it's > > racy: eg. if it's done during phy_stop() it can check just before > > the flag is set and reenable interrupts just after phy_stop() ends. > > I remember having a look into it, but it was long ago and I cannot > immediately recall the conclusion. Which means it is either broken or > deserves a comment as non-obvious. I will have a look into it again, but > I am resource-starved a little at the moment, sorry. > > > 2) phy_change() doesn't reenable irq line after it sees returns > > with errors; IMHO it should at least write some warning, but maybe > > try some safety plan, so enable_irq() and try to disable interrupts > > and free_irq() on the next call (if it happens). (But, I can be very > > wrong with this - maybe it's OK and official way.) > > No way to do this safely -- at this point the device probably still has > its interrupt output asserted and the register to clear it is > inaccessible, so enabling the line will enter an infinite loop. At this > point the system is no longer stable, so it is better to keep at least > some functionality, so that it may be attempted to be shut down cleanly, > rather than make it completely irresponsive. The alternative is panic(). Right! But then some warning could be useful, I presume. > > > 3) phy_interrupt() checks PHY_HALTED flag without lock too, but I'm > > not sure now if it could be dangerous after fixing #1; on the other > > hand even if we know it's not our regular interrupt, with current > > DEBUG_SHIRQ it could be easier to call schedule_work() anyway since > > we are sure it's before/in free_irq() yet. > > See #1 above. > > > 4) phy_interrupt() should check return value from schedule_work() and > > enable irq on 0. > > No -- the work already pending will do that. How? If schedule_work won't succeed because there is a pending one, we did disable_irq_nosync 2x, so we would stay at least 1x too deep! > > > 5) phy_stop_interrupts(): maybe I miss something, but it seems > > phy_stop() is required before this, so maybe there should be a > > comment on this? > > The API is documented in: Documentation/networking/phy.txt -- you are > welcome to improve. If you do not want to get into the gory details, just > use the cooked interface and phy_disconnect() will do the dirty work for > you. > > > 6) phy_stop_interrupts(): if I'm not wrong with #3 calling > > phy_disable_interrupts() looks like we are not sure this phy_stop() > > really works; than maybe a WARN_ON? > > WARN_ON what? I've thought that phy_stop() could be needed before phy_stop_interrupt() to set PHY_HALTED, but since it disables and clears interrupts too, then there should be no need to repeat this, maybe only check it's done. But if there is no such dependency, then no point at all. > > > 7) phy_stop_interrupts(): after above mentioned changes in > > phy_interrupt(), and phy_changes() (always enable_irq()) I can't see > > any reason why there should be more than one skipped enable_irq(), > > so checking return from cancel_work_sync() shouldn't be enough > > instead of this atomic counter. > > CONFIG_DEBUG_SHIRQ. Barring this option,
Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Thu, 18 Oct 2007, Jarek Poplawski wrote: > After rethinking, it looks like this last cancel should be useless. > So, if phy_interrupt() schedules only if !PHY_HALTED and phy_change() > does enable_irq() with no exeptions, it seems phy_interrupt() even > without lock must see PHY_HALTED state before this free_irq() with > possible DEBUG_SHIRQ call, then maybe only this safety: > > WARN_ON(work_pending(>phy_queue)); Good point. Maciej - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Wed, 17 Oct 2007, Jarek Poplawski wrote: > I'm not sure free_irq() should maintain the depth count - rather warn > on not zero. But, IMHO, any activity on freed irq seems suspicious to > me (and doesn't look like very common), even if it's safe with current > implementation. No way to avoid it with DEBUG_SHIRQ. > Yes, these DEBUG_SHIRQ checks are suspicious to me too, but they seem > to be reasonable only in the case of possible resent irqs (so not for > all irqs). On the other hand, it seems, proper irq handler with proper > hardware shouldn't have any problems with such a check. What do you mean by "proper irq handler with proper hardware"? Using softirqs (they used to be called bottom-halves) is actually a natural way of handling any interrupt which requires extensive processing. > 1) phy_change() checks PHY_HALTED flag without lock; I think it's > racy: eg. if it's done during phy_stop() it can check just before > the flag is set and reenable interrupts just after phy_stop() ends. I remember having a look into it, but it was long ago and I cannot immediately recall the conclusion. Which means it is either broken or deserves a comment as non-obvious. I will have a look into it again, but I am resource-starved a little at the moment, sorry. > 2) phy_change() doesn't reenable irq line after it sees returns > with errors; IMHO it should at least write some warning, but maybe > try some safety plan, so enable_irq() and try to disable interrupts > and free_irq() on the next call (if it happens). (But, I can be very > wrong with this - maybe it's OK and official way.) No way to do this safely -- at this point the device probably still has its interrupt output asserted and the register to clear it is inaccessible, so enabling the line will enter an infinite loop. At this point the system is no longer stable, so it is better to keep at least some functionality, so that it may be attempted to be shut down cleanly, rather than make it completely irresponsive. The alternative is panic(). > 3) phy_interrupt() checks PHY_HALTED flag without lock too, but I'm > not sure now if it could be dangerous after fixing #1; on the other > hand even if we know it's not our regular interrupt, with current > DEBUG_SHIRQ it could be easier to call schedule_work() anyway since > we are sure it's before/in free_irq() yet. See #1 above. > 4) phy_interrupt() should check return value from schedule_work() and > enable irq on 0. No -- the work already pending will do that. > 5) phy_stop_interrupts(): maybe I miss something, but it seems > phy_stop() is required before this, so maybe there should be a > comment on this? The API is documented in: Documentation/networking/phy.txt -- you are welcome to improve. If you do not want to get into the gory details, just use the cooked interface and phy_disconnect() will do the dirty work for you. > 6) phy_stop_interrupts(): if I'm not wrong with #3 calling > phy_disable_interrupts() looks like we are not sure this phy_stop() > really works; than maybe a WARN_ON? WARN_ON what? > 7) phy_stop_interrupts(): after above mentioned changes in > phy_interrupt(), and phy_changes() (always enable_irq()) I can't see > any reason why there should be more than one skipped enable_irq(), > so checking return from cancel_work_sync() shouldn't be enough > instead of this atomic counter. CONFIG_DEBUG_SHIRQ. Barring this option, cancel_work_sync() could have been moved to the front of free_irq(). I think I have seen this in reality, with the interrupt line left stuck disabled afterwards, but I will double check when I have an opportunity. The approach implemented with this patch does work, which is the important bit, and if simplification is possible, then it may be applied later. > 8) phy_stop_interrupts(): I'm not sure this additional call from > DEBUG_SHIRQ should be so dangerous, eg.: > > /* >* status == PHY_HALTED && >* interrupts are stopped after phy_stop() >*/ > if (cancel_work_sync(...)) > enable_irq(); > > free_irq(...); > /* >* possible schedule_work() from DEBUG_SHIRQ only, >* but proper check for PHY_HALTED is done; >* so, let's flush after this too: >*/ > cancel_work_sync(); Well, if there is another handler registered on this line, you'll get your interrupt line stuck disabled. > Of course, I don't know phy.c enough, so most of this can be terribly > wrong, then feel free to forget about this - I don't expect you should > waste any time for explaining me these things - after all they are > doubts only. I am not sure I know phy.c well enough either, ;-) and your concerns are appreciated as interesting conclusions may develop. If someone disagrees with what I have written here, they are welcome to speak out too. Maciej - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL
[PATCH] flush_work_sync vs. flush_scheduled_work Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
After reading this and earlier threads about phylib's way of using workqueue I think such a lighter and safer wrt. locking alternative for flush_scheduled_work should be useful, but maybe it's only my imagination. So, let's ask Oleg Nesterov, whose solutions are here only copy-cut-pasted & possibly abused by myslef. -> Subject: flush_work_sync as an alternative for flush_scheduled_work Similar to cancel_work_sync() but will only busy wait & block (without cancel). Signed-off-by: Jarek Poplawski <[EMAIL PROTECTED]> --- include/linux/workqueue.h |1 + kernel/workqueue.c| 24 2 files changed, 25 insertions(+) diff -Nurp 2.6.23-mm1-/include/linux/workqueue.h 2.6.23-mm1/include/linux/workqueue.h --- 2.6.23-mm1-/include/linux/workqueue.h 2007-10-12 23:45:24.0 +0200 +++ 2.6.23-mm1/include/linux/workqueue.h2007-10-17 20:55:26.0 +0200 @@ -192,6 +192,7 @@ extern void init_workqueues(void); int execute_in_process_context(work_func_t fn, struct execute_work *); extern int cancel_work_sync(struct work_struct *work); +extern void flush_work_sync(struct work_struct *work); /* * Kill off a pending schedule_delayed_work(). Note that the work callback diff -Nurp 2.6.23-mm1-/kernel/workqueue.c 2.6.23-mm1/kernel/workqueue.c --- 2.6.23-mm1-/kernel/workqueue.c 2007-10-12 23:45:25.0 +0200 +++ 2.6.23-mm1/kernel/workqueue.c 2007-10-17 20:54:03.0 +0200 @@ -539,6 +539,30 @@ int cancel_delayed_work_sync(struct dela } EXPORT_SYMBOL(cancel_delayed_work_sync); +/** + * flush_work_sync - block until a work_struct's callback has terminated + * @work: the work which is to be flushed + * + * Similar to cancel_work_sync() but will only busy wait (without cancel) + * if the work is queued. If the work's callback appears to be running, + * flush_work_sync() will block until it has completed (but doesn't block + * while other callbacks are running, like flush_scheduled_work() does). + * + * It is not allowed to use this function if the work re-queues itself. + */ +void flush_work_sync(struct work_struct *work) +{ + int ret; + + do { + ret = work_pending(work); + wait_on_work(work); + if (ret) + cpu_relax(); + } while (ret); +} +EXPORT_SYMBOL(flush_work_sync); + static struct workqueue_struct *keventd_wq __read_mostly; /** - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Wed, Oct 17, 2007 at 10:58:09AM +0200, Jarek Poplawski wrote: ... > 8) phy_stop_interrupts(): I'm not sure this additional call from > DEBUG_SHIRQ should be so dangerous, eg.: > > /* >* status == PHY_HALTED && >* interrupts are stopped after phy_stop() >*/ > if (cancel_work_sync(...)) > enable_irq(); > > free_irq(...); > /* >* possible schedule_work() from DEBUG_SHIRQ only, >* but proper check for PHY_HALTED is done; >* so, let's flush after this too: >*/ > cancel_work_sync(); After rethinking, it looks like this last cancel should be useless. So, if phy_interrupt() schedules only if !PHY_HALTED and phy_change() does enable_irq() with no exeptions, it seems phy_interrupt() even without lock must see PHY_HALTED state before this free_irq() with possible DEBUG_SHIRQ call, then maybe only this safety: WARN_ON(work_pending(>phy_queue)); Btw, I've read this was considered and not liked, but IMHO, if this really has to be like this, creating phy's own workqueue seems to be resonable, at the very least to reduce latencies to other users of this irq. Jarek P. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Wed, Oct 17, 2007 at 10:58:09AM +0200, Jarek Poplawski wrote: ... 8) phy_stop_interrupts(): I'm not sure this additional call from DEBUG_SHIRQ should be so dangerous, eg.: /* * status == PHY_HALTED * interrupts are stopped after phy_stop() */ if (cancel_work_sync(...)) enable_irq(); free_irq(...); /* * possible schedule_work() from DEBUG_SHIRQ only, * but proper check for PHY_HALTED is done; * so, let's flush after this too: */ cancel_work_sync(); After rethinking, it looks like this last cancel should be useless. So, if phy_interrupt() schedules only if !PHY_HALTED and phy_change() does enable_irq() with no exeptions, it seems phy_interrupt() even without lock must see PHY_HALTED state before this free_irq() with possible DEBUG_SHIRQ call, then maybe only this safety: WARN_ON(work_pending(phydev-phy_queue)); Btw, I've read this was considered and not liked, but IMHO, if this really has to be like this, creating phy's own workqueue seems to be resonable, at the very least to reduce latencies to other users of this irq. Jarek P. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] flush_work_sync vs. flush_scheduled_work Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
After reading this and earlier threads about phylib's way of using workqueue I think such a lighter and safer wrt. locking alternative for flush_scheduled_work should be useful, but maybe it's only my imagination. So, let's ask Oleg Nesterov, whose solutions are here only copy-cut-pasted possibly abused by myslef. - Subject: flush_work_sync as an alternative for flush_scheduled_work Similar to cancel_work_sync() but will only busy wait block (without cancel). Signed-off-by: Jarek Poplawski [EMAIL PROTECTED] --- include/linux/workqueue.h |1 + kernel/workqueue.c| 24 2 files changed, 25 insertions(+) diff -Nurp 2.6.23-mm1-/include/linux/workqueue.h 2.6.23-mm1/include/linux/workqueue.h --- 2.6.23-mm1-/include/linux/workqueue.h 2007-10-12 23:45:24.0 +0200 +++ 2.6.23-mm1/include/linux/workqueue.h2007-10-17 20:55:26.0 +0200 @@ -192,6 +192,7 @@ extern void init_workqueues(void); int execute_in_process_context(work_func_t fn, struct execute_work *); extern int cancel_work_sync(struct work_struct *work); +extern void flush_work_sync(struct work_struct *work); /* * Kill off a pending schedule_delayed_work(). Note that the work callback diff -Nurp 2.6.23-mm1-/kernel/workqueue.c 2.6.23-mm1/kernel/workqueue.c --- 2.6.23-mm1-/kernel/workqueue.c 2007-10-12 23:45:25.0 +0200 +++ 2.6.23-mm1/kernel/workqueue.c 2007-10-17 20:54:03.0 +0200 @@ -539,6 +539,30 @@ int cancel_delayed_work_sync(struct dela } EXPORT_SYMBOL(cancel_delayed_work_sync); +/** + * flush_work_sync - block until a work_struct's callback has terminated + * @work: the work which is to be flushed + * + * Similar to cancel_work_sync() but will only busy wait (without cancel) + * if the work is queued. If the work's callback appears to be running, + * flush_work_sync() will block until it has completed (but doesn't block + * while other callbacks are running, like flush_scheduled_work() does). + * + * It is not allowed to use this function if the work re-queues itself. + */ +void flush_work_sync(struct work_struct *work) +{ + int ret; + + do { + ret = work_pending(work); + wait_on_work(work); + if (ret) + cpu_relax(); + } while (ret); +} +EXPORT_SYMBOL(flush_work_sync); + static struct workqueue_struct *keventd_wq __read_mostly; /** - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Thu, 18 Oct 2007, Jarek Poplawski wrote: After rethinking, it looks like this last cancel should be useless. So, if phy_interrupt() schedules only if !PHY_HALTED and phy_change() does enable_irq() with no exeptions, it seems phy_interrupt() even without lock must see PHY_HALTED state before this free_irq() with possible DEBUG_SHIRQ call, then maybe only this safety: WARN_ON(work_pending(phydev-phy_queue)); Good point. Maciej - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Wed, 17 Oct 2007, Jarek Poplawski wrote: I'm not sure free_irq() should maintain the depth count - rather warn on not zero. But, IMHO, any activity on freed irq seems suspicious to me (and doesn't look like very common), even if it's safe with current implementation. No way to avoid it with DEBUG_SHIRQ. Yes, these DEBUG_SHIRQ checks are suspicious to me too, but they seem to be reasonable only in the case of possible resent irqs (so not for all irqs). On the other hand, it seems, proper irq handler with proper hardware shouldn't have any problems with such a check. What do you mean by proper irq handler with proper hardware? Using softirqs (they used to be called bottom-halves) is actually a natural way of handling any interrupt which requires extensive processing. 1) phy_change() checks PHY_HALTED flag without lock; I think it's racy: eg. if it's done during phy_stop() it can check just before the flag is set and reenable interrupts just after phy_stop() ends. I remember having a look into it, but it was long ago and I cannot immediately recall the conclusion. Which means it is either broken or deserves a comment as non-obvious. I will have a look into it again, but I am resource-starved a little at the moment, sorry. 2) phy_change() doesn't reenable irq line after it sees returns with errors; IMHO it should at least write some warning, but maybe try some safety plan, so enable_irq() and try to disable interrupts and free_irq() on the next call (if it happens). (But, I can be very wrong with this - maybe it's OK and official way.) No way to do this safely -- at this point the device probably still has its interrupt output asserted and the register to clear it is inaccessible, so enabling the line will enter an infinite loop. At this point the system is no longer stable, so it is better to keep at least some functionality, so that it may be attempted to be shut down cleanly, rather than make it completely irresponsive. The alternative is panic(). 3) phy_interrupt() checks PHY_HALTED flag without lock too, but I'm not sure now if it could be dangerous after fixing #1; on the other hand even if we know it's not our regular interrupt, with current DEBUG_SHIRQ it could be easier to call schedule_work() anyway since we are sure it's before/in free_irq() yet. See #1 above. 4) phy_interrupt() should check return value from schedule_work() and enable irq on 0. No -- the work already pending will do that. 5) phy_stop_interrupts(): maybe I miss something, but it seems phy_stop() is required before this, so maybe there should be a comment on this? The API is documented in: Documentation/networking/phy.txt -- you are welcome to improve. If you do not want to get into the gory details, just use the cooked interface and phy_disconnect() will do the dirty work for you. 6) phy_stop_interrupts(): if I'm not wrong with #3 calling phy_disable_interrupts() looks like we are not sure this phy_stop() really works; than maybe a WARN_ON? WARN_ON what? 7) phy_stop_interrupts(): after above mentioned changes in phy_interrupt(), and phy_changes() (always enable_irq()) I can't see any reason why there should be more than one skipped enable_irq(), so checking return from cancel_work_sync() shouldn't be enough instead of this atomic counter. CONFIG_DEBUG_SHIRQ. Barring this option, cancel_work_sync() could have been moved to the front of free_irq(). I think I have seen this in reality, with the interrupt line left stuck disabled afterwards, but I will double check when I have an opportunity. The approach implemented with this patch does work, which is the important bit, and if simplification is possible, then it may be applied later. 8) phy_stop_interrupts(): I'm not sure this additional call from DEBUG_SHIRQ should be so dangerous, eg.: /* * status == PHY_HALTED * interrupts are stopped after phy_stop() */ if (cancel_work_sync(...)) enable_irq(); free_irq(...); /* * possible schedule_work() from DEBUG_SHIRQ only, * but proper check for PHY_HALTED is done; * so, let's flush after this too: */ cancel_work_sync(); Well, if there is another handler registered on this line, you'll get your interrupt line stuck disabled. Of course, I don't know phy.c enough, so most of this can be terribly wrong, then feel free to forget about this - I don't expect you should waste any time for explaining me these things - after all they are doubts only. I am not sure I know phy.c well enough either, ;-) and your concerns are appreciated as interesting conclusions may develop. If someone disagrees with what I have written here, they are welcome to speak out too. Maciej - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at
Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Thu, Oct 18, 2007 at 12:30:35PM +0100, Maciej W. Rozycki wrote: On Wed, 17 Oct 2007, Jarek Poplawski wrote: I'm not sure free_irq() should maintain the depth count - rather warn on not zero. But, IMHO, any activity on freed irq seems suspicious to me (and doesn't look like very common), even if it's safe with current implementation. No way to avoid it with DEBUG_SHIRQ. Yes, these DEBUG_SHIRQ checks are suspicious to me too, but they seem to be reasonable only in the case of possible resent irqs (so not for all irqs). On the other hand, it seems, proper irq handler with proper hardware shouldn't have any problems with such a check. What do you mean by proper irq handler with proper hardware? Using softirqs (they used to be called bottom-halves) is actually a natural way of handling any interrupt which requires extensive processing. Technically until free_irq returns a handler should respond to calls and with proper hardware it should have no problem with checking if it's its interrupt, even after disabling this hardware, because of possible races. But with a scenario like this: - disable_irq() - disable_my_hadrware_irq() - clear_my_hardware_irq() - free_irq() - enable_irq() it seems the handler should respond even after free_irq because there could be still interrupts to resend, originally generated by its hardware, so such behavior looks very suspicious, at least with some type of interrupts. So, I think, the idea of DEBUG_SHIRQ is generally right and very useful - but, of course, there could be exceptions, which btw. could try some hacks under DEBUG_SHIRQ too. And my opinion about 'properness' was very general (not about phy) too, just like my 'knowledge' of drivers. 1) phy_change() checks PHY_HALTED flag without lock; I think it's racy: eg. if it's done during phy_stop() it can check just before the flag is set and reenable interrupts just after phy_stop() ends. I remember having a look into it, but it was long ago and I cannot immediately recall the conclusion. Which means it is either broken or deserves a comment as non-obvious. I will have a look into it again, but I am resource-starved a little at the moment, sorry. 2) phy_change() doesn't reenable irq line after it sees returns with errors; IMHO it should at least write some warning, but maybe try some safety plan, so enable_irq() and try to disable interrupts and free_irq() on the next call (if it happens). (But, I can be very wrong with this - maybe it's OK and official way.) No way to do this safely -- at this point the device probably still has its interrupt output asserted and the register to clear it is inaccessible, so enabling the line will enter an infinite loop. At this point the system is no longer stable, so it is better to keep at least some functionality, so that it may be attempted to be shut down cleanly, rather than make it completely irresponsive. The alternative is panic(). Right! But then some warning could be useful, I presume. 3) phy_interrupt() checks PHY_HALTED flag without lock too, but I'm not sure now if it could be dangerous after fixing #1; on the other hand even if we know it's not our regular interrupt, with current DEBUG_SHIRQ it could be easier to call schedule_work() anyway since we are sure it's before/in free_irq() yet. See #1 above. 4) phy_interrupt() should check return value from schedule_work() and enable irq on 0. No -- the work already pending will do that. How? If schedule_work won't succeed because there is a pending one, we did disable_irq_nosync 2x, so we would stay at least 1x too deep! 5) phy_stop_interrupts(): maybe I miss something, but it seems phy_stop() is required before this, so maybe there should be a comment on this? The API is documented in: Documentation/networking/phy.txt -- you are welcome to improve. If you do not want to get into the gory details, just use the cooked interface and phy_disconnect() will do the dirty work for you. 6) phy_stop_interrupts(): if I'm not wrong with #3 calling phy_disable_interrupts() looks like we are not sure this phy_stop() really works; than maybe a WARN_ON? WARN_ON what? I've thought that phy_stop() could be needed before phy_stop_interrupt() to set PHY_HALTED, but since it disables and clears interrupts too, then there should be no need to repeat this, maybe only check it's done. But if there is no such dependency, then no point at all. 7) phy_stop_interrupts(): after above mentioned changes in phy_interrupt(), and phy_changes() (always enable_irq()) I can't see any reason why there should be more than one skipped enable_irq(), so checking return from cancel_work_sync() shouldn't be enough instead of this atomic counter. CONFIG_DEBUG_SHIRQ. Barring this option, cancel_work_sync() could have been moved to the front of free_irq(). I think I have seen this in reality, with the
Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Thu, 18 Oct 2007, Jarek Poplawski wrote: Technically until free_irq returns a handler should respond to calls and with proper hardware it should have no problem with checking if it's its interrupt, even after disabling this hardware, because of possible races. Well, the hardirq handler can check it, no problem -- it is just it is so slow, the latency would be unacceptable. The problem with the softirq handler is we do really not want it to be called after the driver has already been shut down and its structures freed. It used to happen before this flush/cancel call was added with the usual effect (oops) as by then they may well have been stamped on already. But with a scenario like this: - disable_irq() - disable_my_hadrware_irq() - clear_my_hardware_irq() - free_irq() - enable_irq() it seems the handler should respond even after free_irq because there could be still interrupts to resend, originally generated by its hardware, so such behavior looks very suspicious, at least with some type of interrupts. These are softirqs, not hardware interrupts, so they are as such not related to *_irq() infrastructure. The flaw is the depth count of IRQ lines is not maintained consistently. This is, according to comments around the code in question, to cover up bugs elsewhere. Not a brillant idea, I am afraid -- there should be no need to reset the depth upon request_irq() and likewise with free_irq(), but there you go. I would be happy to investigate a possible solution and rewrite the necessary bits, but right now I am committed to other stuff, overdue already, sorry. The view could change if we supported hot-pluggable interrupt controllers, but it is not the case at the moment right now, so the interrupt lines are there to stay for the duration of the system lifespan and could be manipulated as necessary. So, I think, the idea of DEBUG_SHIRQ is generally right and very useful - but, of course, there could be exceptions, which btw. could try some hacks under DEBUG_SHIRQ too. And my opinion about 'properness' was very general (not about phy) too, just like my 'knowledge' of drivers. The idea is right, no question, but I am not quite sure it has been properly architected into our current design. Actually I am almost sure of the reverse. This is why I was (and still am) interested in feedback on it. Right! But then some warning could be useful, I presume. Of course; adding one to phy_error() should be straightforward. 4) phy_interrupt() should check return value from schedule_work() and enable irq on 0. No -- the work already pending will do that. How? If schedule_work won't succeed because there is a pending one, we did disable_irq_nosync 2x, so we would stay at least 1x too deep! Correct and my note is misleading, sorry. The thing is we shouldn't have come here the second time in the first place (which is I think why the check is not there) as handlers for the same line are not allowed to run in parallel (cf. irq_desc-lock and IRQ_INPROGRESS). Perhaps BUG_ON(!schedule_work()) would be appropriate, though I am not sure if we should handle every impossible condition we can imagine. Quite a lot of hardirq handlers assume two instances will not run in parallel, so if it ever happened, then a serious breakage would unleash. But, I've enough of other concerns too, so nothing urgent here... The problem is at the moment I am still probably the only user of this code ;-) -- `grep' through the sources to see how many drivers request an IRQ for their PHY through phylib; unless something has changed very recently. Maciej - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] flush_work_sync vs. flush_scheduled_work Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On 10/18, Jarek Poplawski wrote: +/** + * flush_work_sync - block until a work_struct's callback has terminated ^^^ Hmm... + * Similar to cancel_work_sync() but will only busy wait (without cancel) + * if the work is queued. Yes, it won't block, but will spin in busy-wait loop until all other works scheduled before this work are finished. Not good. After that it really blocks waiting for this work to complete. And I am a bit confused. We can't use flush_workqueue() because some of the queued work_structs may take rtnl_lock, yes? But in that case we can't use the new flush_work_sync() helper as well, no? If we can't just cancel the work, can't we do something like if (cancel_work_sync(w)) w-func(w); instead? +void flush_work_sync(struct work_struct *work) +{ + int ret; + + do { + ret = work_pending(work); + wait_on_work(work); + if (ret) + cpu_relax(); + } while (ret); +} If we really the new helper, perhaps we can make it a bit better? 1. Modify insert_work() to take the struct list_head *at parameter instead of int tail. I think this patch will also cleanup the code a bit, and shrink a couple of bytes from .text 2. flush_work_sync() inserts a barrier right after this work and blocks. We still need some retry logic to handle the queueing is in progress of course, but we won't spin waiting for the other works. What do you think? Oleg. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] flush_work_sync vs. flush_scheduled_work Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Thu, 18 Oct 2007, Oleg Nesterov wrote: If we can't just cancel the work, can't we do something like if (cancel_work_sync(w)) w-func(w); instead? We do an equivalent of this -- all that we care about that w-func(w) would do is enable_irq() and the rest we want to skip at this point. We probably do not need the counter in the end though. Maciej - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
> Btw., since, because of this patch, I've had a one more look at phy.c > and there are a few more doubts, so this time I'll try to bother you > for real: .../... While there... is somebody interested in making the whole PHY lib operate a task level and use mutexes instead of spinlock ? I need that for drivers like EMAC (who use their own PHY layer for now), and I might even give a go at adapting phylib myself, but I'd like to take the temperature about it first. Basically, there is nothing in phylib that is performance critical or such that requires it to run at irq time and/or use locks. On the other hand, it complicates things in various areas. The most obvious one being that it prevents the network driver mii access callbacks from sleeping which can be annoying as MDIO can be slow, and some drivers have fancy muxes in there that are better off mutexed than spinlocked. Ben. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Wed, Oct 17, 2007 at 10:58:09AM +0200, Jarek Poplawski wrote: ... > 5) phy_stop_interrupts(): maybe I miss something, but it seems > phy_stop() is required before this, so maybe there should be a > comment on this? > > 6) phy_stop_interrupts(): if I'm not wrong with #3 calling Should be: 6) phy_stop_interrupts(): if I'm not wrong with #5 calling Jarek P. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Tue, Oct 16, 2007 at 06:19:32PM +0100, Maciej W. Rozycki wrote: ... > Well, enable_irq() and disable_irq() themselves are nesting, so they are > not a problem. OTOH, free_irq() does not seem to maintain the depth count > correctly, which looks like a bug to me and which could trigger regardless > of whether flush_scheduled_work() or cancel_work_sync() was called. I'm not sure free_irq() should maintain the depth count - rather warn on not zero. But, IMHO, any activity on freed irq seems suspicious to me (and doesn't look like very common), even if it's safe with current implementation. > The reason is CONFIG_DEBUG_SHIRQ which makes a simulated interrupt event > be sent at the end of free_irq(). It looks like a problem that is > complementary to one I signalled here: > > http://lkml.org/lkml/2007/9/12/82 > > with respect to request_irq(), where, similarly, such an interrupt event > is sent at the beginning. It looks like nobody was concerned back then, > but perhaps it is time to do a better investigation now and propose a > solution. Yes, these DEBUG_SHIRQ checks are suspicious to me too, but they seem to be reasonable only in the case of possible resent irqs (so not for all irqs). On the other hand, it seems, proper irq handler with proper hardware shouldn't have any problems with such a check. > I'll think about it and thanks for your inquisitiveness that has led to > these conclusions. Btw., since, because of this patch, I've had a one more look at phy.c and there are a few more doubts, so this time I'll try to bother you for real: 1) phy_change() checks PHY_HALTED flag without lock; I think it's racy: eg. if it's done during phy_stop() it can check just before the flag is set and reenable interrupts just after phy_stop() ends. 2) phy_change() doesn't reenable irq line after it sees returns with errors; IMHO it should at least write some warning, but maybe try some safety plan, so enable_irq() and try to disable interrupts and free_irq() on the next call (if it happens). (But, I can be very wrong with this - maybe it's OK and official way.) 3) phy_interrupt() checks PHY_HALTED flag without lock too, but I'm not sure now if it could be dangerous after fixing #1; on the other hand even if we know it's not our regular interrupt, with current DEBUG_SHIRQ it could be easier to call schedule_work() anyway since we are sure it's before/in free_irq() yet. 4) phy_interrupt() should check return value from schedule_work() and enable irq on 0. 5) phy_stop_interrupts(): maybe I miss something, but it seems phy_stop() is required before this, so maybe there should be a comment on this? 6) phy_stop_interrupts(): if I'm not wrong with #3 calling phy_disable_interrupts() looks like we are not sure this phy_stop() really works; than maybe a WARN_ON? 7) phy_stop_interrupts(): after above mentioned changes in phy_interrupt(), and phy_changes() (always enable_irq()) I can't see any reason why there should be more than one skipped enable_irq(), so checking return from cancel_work_sync() shouldn't be enough instead of this atomic counter. 8) phy_stop_interrupts(): I'm not sure this additional call from DEBUG_SHIRQ should be so dangerous, eg.: /* * status == PHY_HALTED && * interrupts are stopped after phy_stop() */ if (cancel_work_sync(...)) enable_irq(); free_irq(...); /* * possible schedule_work() from DEBUG_SHIRQ only, * but proper check for PHY_HALTED is done; * so, let's flush after this too: */ cancel_work_sync(); Of course, I don't know phy.c enough, so most of this can be terribly wrong, then feel free to forget about this - I don't expect you should waste any time for explaining me these things - after all they are doubts only. Regards, Jarek P. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Tue, Oct 16, 2007 at 06:19:32PM +0100, Maciej W. Rozycki wrote: ... Well, enable_irq() and disable_irq() themselves are nesting, so they are not a problem. OTOH, free_irq() does not seem to maintain the depth count correctly, which looks like a bug to me and which could trigger regardless of whether flush_scheduled_work() or cancel_work_sync() was called. I'm not sure free_irq() should maintain the depth count - rather warn on not zero. But, IMHO, any activity on freed irq seems suspicious to me (and doesn't look like very common), even if it's safe with current implementation. The reason is CONFIG_DEBUG_SHIRQ which makes a simulated interrupt event be sent at the end of free_irq(). It looks like a problem that is complementary to one I signalled here: http://lkml.org/lkml/2007/9/12/82 with respect to request_irq(), where, similarly, such an interrupt event is sent at the beginning. It looks like nobody was concerned back then, but perhaps it is time to do a better investigation now and propose a solution. Yes, these DEBUG_SHIRQ checks are suspicious to me too, but they seem to be reasonable only in the case of possible resent irqs (so not for all irqs). On the other hand, it seems, proper irq handler with proper hardware shouldn't have any problems with such a check. I'll think about it and thanks for your inquisitiveness that has led to these conclusions. Btw., since, because of this patch, I've had a one more look at phy.c and there are a few more doubts, so this time I'll try to bother you for real: 1) phy_change() checks PHY_HALTED flag without lock; I think it's racy: eg. if it's done during phy_stop() it can check just before the flag is set and reenable interrupts just after phy_stop() ends. 2) phy_change() doesn't reenable irq line after it sees returns with errors; IMHO it should at least write some warning, but maybe try some safety plan, so enable_irq() and try to disable interrupts and free_irq() on the next call (if it happens). (But, I can be very wrong with this - maybe it's OK and official way.) 3) phy_interrupt() checks PHY_HALTED flag without lock too, but I'm not sure now if it could be dangerous after fixing #1; on the other hand even if we know it's not our regular interrupt, with current DEBUG_SHIRQ it could be easier to call schedule_work() anyway since we are sure it's before/in free_irq() yet. 4) phy_interrupt() should check return value from schedule_work() and enable irq on 0. 5) phy_stop_interrupts(): maybe I miss something, but it seems phy_stop() is required before this, so maybe there should be a comment on this? 6) phy_stop_interrupts(): if I'm not wrong with #3 calling phy_disable_interrupts() looks like we are not sure this phy_stop() really works; than maybe a WARN_ON? 7) phy_stop_interrupts(): after above mentioned changes in phy_interrupt(), and phy_changes() (always enable_irq()) I can't see any reason why there should be more than one skipped enable_irq(), so checking return from cancel_work_sync() shouldn't be enough instead of this atomic counter. 8) phy_stop_interrupts(): I'm not sure this additional call from DEBUG_SHIRQ should be so dangerous, eg.: /* * status == PHY_HALTED * interrupts are stopped after phy_stop() */ if (cancel_work_sync(...)) enable_irq(); free_irq(...); /* * possible schedule_work() from DEBUG_SHIRQ only, * but proper check for PHY_HALTED is done; * so, let's flush after this too: */ cancel_work_sync(); Of course, I don't know phy.c enough, so most of this can be terribly wrong, then feel free to forget about this - I don't expect you should waste any time for explaining me these things - after all they are doubts only. Regards, Jarek P. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Wed, Oct 17, 2007 at 10:58:09AM +0200, Jarek Poplawski wrote: ... 5) phy_stop_interrupts(): maybe I miss something, but it seems phy_stop() is required before this, so maybe there should be a comment on this? 6) phy_stop_interrupts(): if I'm not wrong with #3 calling Should be: 6) phy_stop_interrupts(): if I'm not wrong with #5 calling Jarek P. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
Btw., since, because of this patch, I've had a one more look at phy.c and there are a few more doubts, so this time I'll try to bother you for real: .../... While there... is somebody interested in making the whole PHY lib operate a task level and use mutexes instead of spinlock ? I need that for drivers like EMAC (who use their own PHY layer for now), and I might even give a go at adapting phylib myself, but I'd like to take the temperature about it first. Basically, there is nothing in phylib that is performance critical or such that requires it to run at irq time and/or use locks. On the other hand, it complicates things in various areas. The most obvious one being that it prevents the network driver mii access callbacks from sleeping which can be annoying as MDIO can be slow, and some drivers have fancy muxes in there that are better off mutexed than spinlocked. Ben. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Tue, 16 Oct 2007, Jarek Poplawski wrote: > Yes, it's all right here. Sorry for bothering - I should've found this > by myself. Ah, no problem -- even with the right keys you may sometimes get lost in the results. > I've still some doubts about this possible enable_irq() after > free_irq(). If it's the only handler the status would be changed again > and at least some of this code in check_irq_resend() would be run, but > I can miss something again or/and this doesn't matter, as well. Well, enable_irq() and disable_irq() themselves are nesting, so they are not a problem. OTOH, free_irq() does not seem to maintain the depth count correctly, which looks like a bug to me and which could trigger regardless of whether flush_scheduled_work() or cancel_work_sync() was called. The reason is CONFIG_DEBUG_SHIRQ which makes a simulated interrupt event be sent at the end of free_irq(). It looks like a problem that is complementary to one I signalled here: http://lkml.org/lkml/2007/9/12/82 with respect to request_irq(), where, similarly, such an interrupt event is sent at the beginning. It looks like nobody was concerned back then, but perhaps it is time to do a better investigation now and propose a solution. I'll think about it and thanks for your inquisitiveness that has led to these conclusions. Maciej - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Mon, Oct 15, 2007 at 06:03:20PM +0100, Maciej W. Rozycki wrote: > On Mon, 15 Oct 2007, Jarek Poplawski wrote: > > > Could you explain why cancel_work_sync() is better here than > > flush_scheduled_work() wrt. rtnl_lock()? > > Well, this is actually the bit that made cancel_work_sync() be written in > the first place. The short story is the netlink lock is most probably > held at this point (depending on the usage of phy_disconnect()) and there > is also an event waiting in the queue that requires the lock, so if > flush_scheduled_work() is called here a deadlock will happen. > > Let me find a reference for a longer story...: > > http://www.linux-mips.org/cgi-bin/mesg.cgi?a=linux-mips=Pine.LNX.4.64N.0610031509380.4642%40blysk.ds.pg.gda.pl > > and then discussed again: > > http://www.uwsg.indiana.edu/hypermail/linux/kernel/0612.0/0593.html > Yes, it's all right here. Sorry for bothering - I should've found this by myself. I've still some doubts about this possible enable_irq() after free_irq(). If it's the only handler the status would be changed again and at least some of this code in check_irq_resend() would be run, but I can miss something again or/and this doesn't matter, as well. Thanks, Jarek P. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Mon, Oct 15, 2007 at 06:03:20PM +0100, Maciej W. Rozycki wrote: On Mon, 15 Oct 2007, Jarek Poplawski wrote: Could you explain why cancel_work_sync() is better here than flush_scheduled_work() wrt. rtnl_lock()? Well, this is actually the bit that made cancel_work_sync() be written in the first place. The short story is the netlink lock is most probably held at this point (depending on the usage of phy_disconnect()) and there is also an event waiting in the queue that requires the lock, so if flush_scheduled_work() is called here a deadlock will happen. Let me find a reference for a longer story...: http://www.linux-mips.org/cgi-bin/mesg.cgi?a=linux-mipsi=Pine.LNX.4.64N.0610031509380.4642%40blysk.ds.pg.gda.pl and then discussed again: http://www.uwsg.indiana.edu/hypermail/linux/kernel/0612.0/0593.html Yes, it's all right here. Sorry for bothering - I should've found this by myself. I've still some doubts about this possible enable_irq() after free_irq(). If it's the only handler the status would be changed again and at least some of this code in check_irq_resend() would be run, but I can miss something again or/and this doesn't matter, as well. Thanks, Jarek P. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Tue, 16 Oct 2007, Jarek Poplawski wrote: Yes, it's all right here. Sorry for bothering - I should've found this by myself. Ah, no problem -- even with the right keys you may sometimes get lost in the results. I've still some doubts about this possible enable_irq() after free_irq(). If it's the only handler the status would be changed again and at least some of this code in check_irq_resend() would be run, but I can miss something again or/and this doesn't matter, as well. Well, enable_irq() and disable_irq() themselves are nesting, so they are not a problem. OTOH, free_irq() does not seem to maintain the depth count correctly, which looks like a bug to me and which could trigger regardless of whether flush_scheduled_work() or cancel_work_sync() was called. The reason is CONFIG_DEBUG_SHIRQ which makes a simulated interrupt event be sent at the end of free_irq(). It looks like a problem that is complementary to one I signalled here: http://lkml.org/lkml/2007/9/12/82 with respect to request_irq(), where, similarly, such an interrupt event is sent at the beginning. It looks like nobody was concerned back then, but perhaps it is time to do a better investigation now and propose a solution. I'll think about it and thanks for your inquisitiveness that has led to these conclusions. Maciej - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Mon, 15 Oct 2007, Jarek Poplawski wrote: > Could you explain why cancel_work_sync() is better here than > flush_scheduled_work() wrt. rtnl_lock()? Well, this is actually the bit that made cancel_work_sync() be written in the first place. The short story is the netlink lock is most probably held at this point (depending on the usage of phy_disconnect()) and there is also an event waiting in the queue that requires the lock, so if flush_scheduled_work() is called here a deadlock will happen. Let me find a reference for a longer story...: http://www.linux-mips.org/cgi-bin/mesg.cgi?a=linux-mips=Pine.LNX.4.64N.0610031509380.4642%40blysk.ds.pg.gda.pl and then discussed again: http://www.uwsg.indiana.edu/hypermail/linux/kernel/0612.0/0593.html Maciej - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On 19-09-2007 16:38, Maciej W. Rozycki wrote: ... > @@ -661,13 +664,22 @@ int phy_stop_interrupts(struct phy_devic > if (err) > phy_error(phydev); > > + free_irq(phydev->irq, phydev); > + > /* > - * Finish any pending work; we might have been scheduled to be called > - * from keventd ourselves, but cancel_work_sync() handles that. > + * Cannot call flush_scheduled_work() here as desired because > + * of rtnl_lock(), but we do not really care about what would > + * be done, except from enable_irq(), so cancel any work > + * possibly pending and take care of the matter below. >*/ > cancel_work_sync(>phy_queue); Hi, Could you explain why cancel_work_sync() is better here than flush_scheduled_work() wrt. rtnl_lock()? Regards, Jarek P. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On 19-09-2007 16:38, Maciej W. Rozycki wrote: ... @@ -661,13 +664,22 @@ int phy_stop_interrupts(struct phy_devic if (err) phy_error(phydev); + free_irq(phydev-irq, phydev); + /* - * Finish any pending work; we might have been scheduled to be called - * from keventd ourselves, but cancel_work_sync() handles that. + * Cannot call flush_scheduled_work() here as desired because + * of rtnl_lock(), but we do not really care about what would + * be done, except from enable_irq(), so cancel any work + * possibly pending and take care of the matter below. */ cancel_work_sync(phydev-phy_queue); Hi, Could you explain why cancel_work_sync() is better here than flush_scheduled_work() wrt. rtnl_lock()? Regards, Jarek P. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Mon, 15 Oct 2007, Jarek Poplawski wrote: Could you explain why cancel_work_sync() is better here than flush_scheduled_work() wrt. rtnl_lock()? Well, this is actually the bit that made cancel_work_sync() be written in the first place. The short story is the netlink lock is most probably held at this point (depending on the usage of phy_disconnect()) and there is also an event waiting in the queue that requires the lock, so if flush_scheduled_work() is called here a deadlock will happen. Let me find a reference for a longer story...: http://www.linux-mips.org/cgi-bin/mesg.cgi?a=linux-mipsi=Pine.LNX.4.64N.0610031509380.4642%40blysk.ds.pg.gda.pl and then discussed again: http://www.uwsg.indiana.edu/hypermail/linux/kernel/0612.0/0593.html Maciej - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Fri, 21 Sep 2007 13:51:12 +0100 (BST) "Maciej W. Rozycki" <[EMAIL PROTECTED]> wrote: > On Thu, 20 Sep 2007, Andrew Morton wrote: > > > You always put boring, crappy, insufficient text in the for-the-changelog > > section and interesting, useful, sufficient text in the > > not-for-the-changelog > > section. > > I'll swap the sections in the future then. ;-) Frankly I was not sure > whether the changelog was happy about being fed with lengthy explanations > and it has not spoken out. I think it's worth putting plenty of details in the changelog: it's compressed on-disk and on-the-wire and is overall pretty cheap. If people don't actually seek the information out, it has close to zero impact on them. But on those occasions when people _do_ seek the information out (and it can be years later) then they want every drop of information they can get. Numerous times I've gone back to the 2.5.x mm/ changelogs to work out what on earth we were thinking when we did something, and it has proved quite useful in explaining the existing code, or in suggesting possible problems which we had forgotten about by 2007. otoh, you can get a lot of handy info by googling for strategic parts of the kernel code, or by googling snippets of the existing-but-short changelog. For example, this patch: google for "Keep track of disable_irq_nosync() invocations" and voila. Perhaps we don't need changelogs at all ;) - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Thu, 20 Sep 2007, Andrew Morton wrote: > You always put boring, crappy, insufficient text in the for-the-changelog > section and interesting, useful, sufficient text in the not-for-the-changelog > section. I'll swap the sections in the future then. ;-) Frankly I was not sure whether the changelog was happy about being fed with lengthy explanations and it has not spoken out. I have to admit this is a habit carried over from the FSF-style ChangeLog -- where the enforced rule is actually *not* to provide any explanation for why changes are done and only describe what has been modified (with any discussion around archived in the respective mailing list). > But you can't fool me! I have an editor and I fix it up. Thank you and sorry for the extra work I caused you -- I shall keep your suggestion in mind in the future. Maciej - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Thu, 20 Sep 2007, Andrew Morton wrote: You always put boring, crappy, insufficient text in the for-the-changelog section and interesting, useful, sufficient text in the not-for-the-changelog section. I'll swap the sections in the future then. ;-) Frankly I was not sure whether the changelog was happy about being fed with lengthy explanations and it has not spoken out. I have to admit this is a habit carried over from the FSF-style ChangeLog -- where the enforced rule is actually *not* to provide any explanation for why changes are done and only describe what has been modified (with any discussion around archived in the respective mailing list). But you can't fool me! I have an editor and I fix it up. Thank you and sorry for the extra work I caused you -- I shall keep your suggestion in mind in the future. Maciej - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Fri, 21 Sep 2007 13:51:12 +0100 (BST) Maciej W. Rozycki [EMAIL PROTECTED] wrote: On Thu, 20 Sep 2007, Andrew Morton wrote: You always put boring, crappy, insufficient text in the for-the-changelog section and interesting, useful, sufficient text in the not-for-the-changelog section. I'll swap the sections in the future then. ;-) Frankly I was not sure whether the changelog was happy about being fed with lengthy explanations and it has not spoken out. I think it's worth putting plenty of details in the changelog: it's compressed on-disk and on-the-wire and is overall pretty cheap. If people don't actually seek the information out, it has close to zero impact on them. But on those occasions when people _do_ seek the information out (and it can be years later) then they want every drop of information they can get. Numerous times I've gone back to the 2.5.x mm/ changelogs to work out what on earth we were thinking when we did something, and it has proved quite useful in explaining the existing code, or in suggesting possible problems which we had forgotten about by 2007. otoh, you can get a lot of handy info by googling for strategic parts of the kernel code, or by googling snippets of the existing-but-short changelog. For example, this patch: google for Keep track of disable_irq_nosync() invocations and voila. Perhaps we don't need changelogs at all ;) - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Wed, 19 Sep 2007 15:38:19 +0100 (BST) "Maciej W. Rozycki" <[EMAIL PROTECTED]> wrote: > Keep track of disable_irq_nosync() invocations and call enable_irq() the > right number of times if work has been cancelled that would include them. > > Signed-off-by: Maciej W. Rozycki <[EMAIL PROTECTED]> > --- > Now that the call to flush_work_keventd() (problematic because of > rtnl_mutex being held) has been replaced by cancel_work_sync() another > issue has arisen and been left unresolved. As the MDIO bus cannot be > accessed from the interrupt context the PHY interrupt handler uses > disable_irq_nosync() to prevent from looping and schedules some work to be > done as a softirq, which, apart from handling the state change of the > originating PHY, is responsible for reenabling the interrupt. Now if the > interrupt line is shared by another device and a call to the softirq > handler has been cancelled, that call to enable_irq() never happens and > the other device cannot use its interrupt anymore as its stuck disabled. > > I decided to use a counter rather than a flag because there may be more > than one call to phy_change() cancelled in the queue -- a real one and a > fake one triggered by free_irq() if DEBUG_SHIRQ is used, if nothing else. > Therefore because of its nesting property enable_irq() has to be called > the right number of times to match the number disable_irq_nosync() was > called and restore the original state. This DEBUG_SHIRQ feature is also > the reason why free_irq() has to be called before cancel_work_sync(). > > While at it I updated the comment about phy_stop_interrupts() being > called from `keventd' -- this is no longer relevant as the use of > cancel_work_sync() makes such an approach unnecessary. OTOH a similar > comment referring to flush_scheduled_work() in phy_stop() still applies as > using cancel_work_sync() there would be dangerous. > > Checked with checkpatch.pl and at the run time (with and without > DEBUG_SHIRQ). You always put boring, crappy, insufficient text in the for-the-changelog section and interesting, useful, sufficient text in the not-for-the-changelog section. But you can't fool me! I have an editor and I fix it up. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes
On Wed, 19 Sep 2007 15:38:19 +0100 (BST) Maciej W. Rozycki [EMAIL PROTECTED] wrote: Keep track of disable_irq_nosync() invocations and call enable_irq() the right number of times if work has been cancelled that would include them. Signed-off-by: Maciej W. Rozycki [EMAIL PROTECTED] --- Now that the call to flush_work_keventd() (problematic because of rtnl_mutex being held) has been replaced by cancel_work_sync() another issue has arisen and been left unresolved. As the MDIO bus cannot be accessed from the interrupt context the PHY interrupt handler uses disable_irq_nosync() to prevent from looping and schedules some work to be done as a softirq, which, apart from handling the state change of the originating PHY, is responsible for reenabling the interrupt. Now if the interrupt line is shared by another device and a call to the softirq handler has been cancelled, that call to enable_irq() never happens and the other device cannot use its interrupt anymore as its stuck disabled. I decided to use a counter rather than a flag because there may be more than one call to phy_change() cancelled in the queue -- a real one and a fake one triggered by free_irq() if DEBUG_SHIRQ is used, if nothing else. Therefore because of its nesting property enable_irq() has to be called the right number of times to match the number disable_irq_nosync() was called and restore the original state. This DEBUG_SHIRQ feature is also the reason why free_irq() has to be called before cancel_work_sync(). While at it I updated the comment about phy_stop_interrupts() being called from `keventd' -- this is no longer relevant as the use of cancel_work_sync() makes such an approach unnecessary. OTOH a similar comment referring to flush_scheduled_work() in phy_stop() still applies as using cancel_work_sync() there would be dangerous. Checked with checkpatch.pl and at the run time (with and without DEBUG_SHIRQ). You always put boring, crappy, insufficient text in the for-the-changelog section and interesting, useful, sufficient text in the not-for-the-changelog section. But you can't fool me! I have an editor and I fix it up. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] PHYLIB: IRQ event workqueue handling fixes
Keep track of disable_irq_nosync() invocations and call enable_irq() the right number of times if work has been cancelled that would include them. Signed-off-by: Maciej W. Rozycki <[EMAIL PROTECTED]> --- Now that the call to flush_work_keventd() (problematic because of rtnl_mutex being held) has been replaced by cancel_work_sync() another issue has arisen and been left unresolved. As the MDIO bus cannot be accessed from the interrupt context the PHY interrupt handler uses disable_irq_nosync() to prevent from looping and schedules some work to be done as a softirq, which, apart from handling the state change of the originating PHY, is responsible for reenabling the interrupt. Now if the interrupt line is shared by another device and a call to the softirq handler has been cancelled, that call to enable_irq() never happens and the other device cannot use its interrupt anymore as its stuck disabled. I decided to use a counter rather than a flag because there may be more than one call to phy_change() cancelled in the queue -- a real one and a fake one triggered by free_irq() if DEBUG_SHIRQ is used, if nothing else. Therefore because of its nesting property enable_irq() has to be called the right number of times to match the number disable_irq_nosync() was called and restore the original state. This DEBUG_SHIRQ feature is also the reason why free_irq() has to be called before cancel_work_sync(). While at it I updated the comment about phy_stop_interrupts() being called from `keventd' -- this is no longer relevant as the use of cancel_work_sync() makes such an approach unnecessary. OTOH a similar comment referring to flush_scheduled_work() in phy_stop() still applies as using cancel_work_sync() there would be dangerous. Checked with checkpatch.pl and at the run time (with and without DEBUG_SHIRQ). Please apply. Maciej patch-mips-2.6.23-rc5-20070904-phy-irq-fix-9 diff -up --recursive --new-file linux-mips-2.6.23-rc5-20070904.macro/drivers/net/phy/phy.c linux-mips-2.6.23-rc5-20070904/drivers/net/phy/phy.c --- linux-mips-2.6.23-rc5-20070904.macro/drivers/net/phy/phy.c 2007-09-16 17:17:20.0 + +++ linux-mips-2.6.23-rc5-20070904/drivers/net/phy/phy.c2007-09-18 14:28:07.0 + @@ -7,7 +7,7 @@ * Author: Andy Fleming * * Copyright (c) 2004 Freescale Semiconductor, Inc. - * Copyright (c) 2006 Maciej W. Rozycki + * Copyright (c) 2006, 2007 Maciej W. Rozycki * * This program is free software; you can redistribute it and/or modify it * under the terms of the GNU General Public License as published by the @@ -35,6 +35,7 @@ #include #include +#include #include #include #include @@ -561,6 +562,7 @@ static irqreturn_t phy_interrupt(int irq * queue will write the PHY to disable and clear the * interrupt, and then reenable the irq line. */ disable_irq_nosync(irq); + atomic_inc(>irq_disable); schedule_work(>phy_queue); @@ -631,6 +633,7 @@ int phy_start_interrupts(struct phy_devi INIT_WORK(>phy_queue, phy_change); + atomic_set(>irq_disable, 0); if (request_irq(phydev->irq, phy_interrupt, IRQF_SHARED, "phy_interrupt", @@ -661,13 +664,22 @@ int phy_stop_interrupts(struct phy_devic if (err) phy_error(phydev); + free_irq(phydev->irq, phydev); + /* -* Finish any pending work; we might have been scheduled to be called -* from keventd ourselves, but cancel_work_sync() handles that. +* Cannot call flush_scheduled_work() here as desired because +* of rtnl_lock(), but we do not really care about what would +* be done, except from enable_irq(), so cancel any work +* possibly pending and take care of the matter below. */ cancel_work_sync(>phy_queue); - - free_irq(phydev->irq, phydev); + /* +* If work indeed has been cancelled, disable_irq() will have +* been left unbalanced from phy_interrupt() and enable_irq() +* has to be called so that other devices on the line work. +*/ + while (atomic_dec_return(>irq_disable) >= 0) + enable_irq(phydev->irq); return err; } @@ -694,6 +706,7 @@ static void phy_change(struct work_struc phydev->state = PHY_CHANGELINK; spin_unlock(>lock); + atomic_dec(>irq_disable); enable_irq(phydev->irq); /* Reenable interrupts */ @@ -707,6 +720,7 @@ static void phy_change(struct work_struc irq_enable_err: disable_irq(phydev->irq); + atomic_inc(>irq_disable); phy_err: phy_error(phydev); } diff -up --recursive --new-file linux-mips-2.6.23-rc5-20070904.macro/include/linux/phy.h linux-mips-2.6.23-rc5-20070904/include/linux/phy.h --- linux-mips-2.6.23-rc5-20070904.macro/include/linux/phy.h2007-07-10 04:56:57.0 + +++
[PATCH] PHYLIB: IRQ event workqueue handling fixes
Keep track of disable_irq_nosync() invocations and call enable_irq() the right number of times if work has been cancelled that would include them. Signed-off-by: Maciej W. Rozycki [EMAIL PROTECTED] --- Now that the call to flush_work_keventd() (problematic because of rtnl_mutex being held) has been replaced by cancel_work_sync() another issue has arisen and been left unresolved. As the MDIO bus cannot be accessed from the interrupt context the PHY interrupt handler uses disable_irq_nosync() to prevent from looping and schedules some work to be done as a softirq, which, apart from handling the state change of the originating PHY, is responsible for reenabling the interrupt. Now if the interrupt line is shared by another device and a call to the softirq handler has been cancelled, that call to enable_irq() never happens and the other device cannot use its interrupt anymore as its stuck disabled. I decided to use a counter rather than a flag because there may be more than one call to phy_change() cancelled in the queue -- a real one and a fake one triggered by free_irq() if DEBUG_SHIRQ is used, if nothing else. Therefore because of its nesting property enable_irq() has to be called the right number of times to match the number disable_irq_nosync() was called and restore the original state. This DEBUG_SHIRQ feature is also the reason why free_irq() has to be called before cancel_work_sync(). While at it I updated the comment about phy_stop_interrupts() being called from `keventd' -- this is no longer relevant as the use of cancel_work_sync() makes such an approach unnecessary. OTOH a similar comment referring to flush_scheduled_work() in phy_stop() still applies as using cancel_work_sync() there would be dangerous. Checked with checkpatch.pl and at the run time (with and without DEBUG_SHIRQ). Please apply. Maciej patch-mips-2.6.23-rc5-20070904-phy-irq-fix-9 diff -up --recursive --new-file linux-mips-2.6.23-rc5-20070904.macro/drivers/net/phy/phy.c linux-mips-2.6.23-rc5-20070904/drivers/net/phy/phy.c --- linux-mips-2.6.23-rc5-20070904.macro/drivers/net/phy/phy.c 2007-09-16 17:17:20.0 + +++ linux-mips-2.6.23-rc5-20070904/drivers/net/phy/phy.c2007-09-18 14:28:07.0 + @@ -7,7 +7,7 @@ * Author: Andy Fleming * * Copyright (c) 2004 Freescale Semiconductor, Inc. - * Copyright (c) 2006 Maciej W. Rozycki + * Copyright (c) 2006, 2007 Maciej W. Rozycki * * This program is free software; you can redistribute it and/or modify it * under the terms of the GNU General Public License as published by the @@ -35,6 +35,7 @@ #include linux/timer.h #include linux/workqueue.h +#include asm/atomic.h #include asm/io.h #include asm/irq.h #include asm/uaccess.h @@ -561,6 +562,7 @@ static irqreturn_t phy_interrupt(int irq * queue will write the PHY to disable and clear the * interrupt, and then reenable the irq line. */ disable_irq_nosync(irq); + atomic_inc(phydev-irq_disable); schedule_work(phydev-phy_queue); @@ -631,6 +633,7 @@ int phy_start_interrupts(struct phy_devi INIT_WORK(phydev-phy_queue, phy_change); + atomic_set(phydev-irq_disable, 0); if (request_irq(phydev-irq, phy_interrupt, IRQF_SHARED, phy_interrupt, @@ -661,13 +664,22 @@ int phy_stop_interrupts(struct phy_devic if (err) phy_error(phydev); + free_irq(phydev-irq, phydev); + /* -* Finish any pending work; we might have been scheduled to be called -* from keventd ourselves, but cancel_work_sync() handles that. +* Cannot call flush_scheduled_work() here as desired because +* of rtnl_lock(), but we do not really care about what would +* be done, except from enable_irq(), so cancel any work +* possibly pending and take care of the matter below. */ cancel_work_sync(phydev-phy_queue); - - free_irq(phydev-irq, phydev); + /* +* If work indeed has been cancelled, disable_irq() will have +* been left unbalanced from phy_interrupt() and enable_irq() +* has to be called so that other devices on the line work. +*/ + while (atomic_dec_return(phydev-irq_disable) = 0) + enable_irq(phydev-irq); return err; } @@ -694,6 +706,7 @@ static void phy_change(struct work_struc phydev-state = PHY_CHANGELINK; spin_unlock(phydev-lock); + atomic_dec(phydev-irq_disable); enable_irq(phydev-irq); /* Reenable interrupts */ @@ -707,6 +720,7 @@ static void phy_change(struct work_struc irq_enable_err: disable_irq(phydev-irq); + atomic_inc(phydev-irq_disable); phy_err: phy_error(phydev); } diff -up --recursive --new-file linux-mips-2.6.23-rc5-20070904.macro/include/linux/phy.h