RE: Issues with calling pm_runtime functions in platform_pm_suspend_noirq/IRQ disabled context.
> -Original Message- > From: Kevin Hilman [mailto:khil...@deeprootsystems.com] > Sent: Monday, August 09, 2010 9:01 PM > To: Basak, Partha > Cc: Paul Walmsley; linux-omap@vger.kernel.org; Kalliguddi, Hema; Raja, > Govindraj; Varadarajan, Charulatha > Subject: Re: Issues with calling pm_runtime functions in > platform_pm_suspend_noirq/IRQ disabled context. > > "Basak, Partha" writes: > > >> Finally, we have omap_sram_idle(): > >> > >> > In particular, for USB, while executing SRAM_Idle, is we use > pm_runtime > >> > functions, we see spurious IO-Pad interrupts. > >> > >> Your message doesn't specify what PM runtime functions you are > executing. > >> But if those functions are calling mutex_lock(), then they obviously > must > >> not be called from omap_sram_idle() in interrupt context. > >> > >> It's not clear from your message why you need to call PM runtime > functions > >> from the idle loop. I'd suggest that you post the problematic code in > >> question to the list. > >> > > > > USB issue: > > > > Please refer to the USB patch attached (will be sent to the list as well > in a few minutes) > > As I mentioned previously, few drivers like GPIO, USB & UART have clock- > disable/enable + context save/restore in Idle path(omap_sram_idle()). > > > > In particular, I am referring to calling of the functions > pm_runtime_put_sync() & pm_runtime_get_sync() for USB around wfi. > > > > Now, the following call sequence from omap_sram_idle()will enable IRQs: > > pm_runtime_put_sync--> > > __pm_runtime_put--> > > pm_runtime_idle--> > > spin_unlock_irq(), > > __pm_runtime_idle(),--> > > spin_unlock_irq, > > spin_unlock_irq > > > > The functions pm_runtime_idle() & __ pm_runtime_idle() do not use > > spin_lock_irqsave & spin_unlock_irqrestore. This leads to enabling of > > the interrupts in omap_sram_idle unconditionally, which for USB in > > particular is leading to IO-pad interrupt being triggered which does > > not come otherwise. > > You seem to have correctly identified the problem. Can you try the > patch below (untested) which attempts to address the root cause you've > identified? > > Kevin > > > To work around this problem, instead of using Runtime APIs, we are using > omap_device_enable/disable, which in-turn call to __omap_hwmod_enable/idle > > > > Simply put, I am not talking about grabbing a mutex when interrupts are > disabled, rather of a scenario where interrupts are getting enabled as a > side-effect of calling these functions in omap_sram_idle(). > > From be3aeea5f8d29c8ce2fa278f48aef849eb682282 Mon Sep 17 00:00:00 2001 > From: Kevin Hilman > Date: Mon, 9 Aug 2010 08:12:39 -0700 > Subject: [PATCH] PM: allow runtime PM get/put from interrupts-disabled > context > > When using runtime PM in combination with CPUidle, the runtime PM > transtions of some devices may be triggered during the idle path. > Late in the idle sequence, interrupts may already be disabled when > runtime PM for these devices is initiated (using the _sync versions of > the runtime PM API.) > > Currently, the runtime PM core assumes methods are called with > interrupts enabled. However, if it is called with interrupts > disabled, the internal locking unconditionally enables interrupts, for > example: > > pm_runtime_put_sync() > __pm_runtime_put() > pm_runtime_idle() > spin_lock_irq() > __pm_runtime_idle() > spin_unlock_irq() <--- interrupts unconditionally > enabled > dev->bus->pm->runtime_idle() > spin_lock_irq() > spin_unlock_irq() > > To fix, use the save/restore versions of the spinlock API. > > Reported-by: Partha Basak > Signed-off-by: Kevin Hilman > --- > drivers/base/power/runtime.c | 68 ++--- > > include/linux/pm.h |1 + > 2 files changed, 37 insertions(+), 32 deletions(-) > > diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c > index b0ec0e9..b02ef35 100644 > --- a/drivers/base/power/runtime.c > +++ b/drivers/base/power/runtime.c > @@ -80,24 +80,24 @@ static int __pm_runtime_idle(struct device *dev) > dev->power.idle_notification = true; > > if (dev->bus && dev->bus->pm && dev->bus->p
RE: Issues with calling pm_runtime functions in platform_pm_suspend_noirq/IRQ disabled context.
> -Original Message- > From: linux-omap-ow...@vger.kernel.org [mailto:linux-omap- > ow...@vger.kernel.org] On Behalf Of Shilimkar, Santosh > Sent: Monday, August 09, 2010 9:43 PM > To: Kevin Hilman > Cc: Basak, Partha; Paul Walmsley; linux-omap@vger.kernel.org; Kalliguddi, > Hema; Raja, Govindraj; Varadarajan, Charulatha > Subject: RE: Issues with calling pm_runtime functions in > platform_pm_suspend_noirq/IRQ disabled context. > [Snip] > > >> pm_runtime_put_sync() > > >> __pm_runtime_put() > > >> pm_runtime_idle() > > >> spin_lock_irq() > > >> __pm_runtime_idle() > > >> spin_unlock_irq() <--- interrupts unconditionally > > >> enabled > > >> dev->bus->pm->runtime_idle() > > >> spin_lock_irq() > > >> spin_unlock_irq() > > >> > > >> To fix, use the save/restore versions of the spinlock API. > > >> > > > Similar thing was thought when this problem was encountered but > > > there was a concern that it may lead to interrupts are being disable > > > for longer times > > > > why? > > > > if interrupts are enabled when this API is used, this patch doesn't > > change the amount of time interrupts are disabled. > > > > if interrupts are already disabled, then you *want* interrupts to be > > disabled for the entire time. > > > Correct me if I am off the track here. > > If these API's are _always_ called with interrupt disabled then probably > we don't even need the new spinlock version locking. > > Considering the API is called with interrupt enabled. > > int pm_runtime_suspend(struct device *dev) > { > int retval; > > spin_lock_irqsave(&dev->power.lock, dev->power.lock_flags); > retval = __pm_runtime_suspend(dev, false); > spin_unlock_irqrestore(&dev->power.lock, dev->power.lock_flags);c > > The interrupt on local cpu remain disabled till "__pm_runtime_suspend" > does its job and re-enables the interrupts. No? > > Sorry Kevin for oversight. The existing used lock is already a irq version. You are right. This change should be fine. Some how I was under impression the existing code was using plain spin lock version. Really sorry about the noise. Regards, Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-omap" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: Issues with calling pm_runtime functions in platform_pm_suspend_noirq/IRQ disabled context.
> -Original Message- > From: Kevin Hilman [mailto:khil...@deeprootsystems.com] > Sent: Monday, August 09, 2010 9:25 PM > To: Shilimkar, Santosh > Cc: Basak, Partha; Paul Walmsley; linux-omap@vger.kernel.org; Kalliguddi, > Hema; Raja, Govindraj; Varadarajan, Charulatha > Subject: Re: Issues with calling pm_runtime functions in > platform_pm_suspend_noirq/IRQ disabled context. > > "Shilimkar, Santosh" writes: > > >> -Original Message- > >> From: linux-omap-ow...@vger.kernel.org [mailto:linux-omap- > >> ow...@vger.kernel.org] On Behalf Of Kevin Hilman > >> Sent: Monday, August 09, 2010 9:01 PM > >> To: Basak, Partha > >> Cc: Paul Walmsley; linux-omap@vger.kernel.org; Kalliguddi, Hema; Raja, > >> Govindraj; Varadarajan, Charulatha > >> Subject: Re: Issues with calling pm_runtime functions in > >> platform_pm_suspend_noirq/IRQ disabled context. > >> > >> "Basak, Partha" writes: > >> > >> >> Finally, we have omap_sram_idle(): > >> >> > >> >> > In particular, for USB, while executing SRAM_Idle, is we use > >> pm_runtime > >> >> > functions, we see spurious IO-Pad interrupts. > >> >> > >> >> Your message doesn't specify what PM runtime functions you are > >> executing. > >> >> But if those functions are calling mutex_lock(), then they obviously > >> must > >> >> not be called from omap_sram_idle() in interrupt context. > >> >> > >> >> It's not clear from your message why you need to call PM runtime > >> functions > >> >> from the idle loop. I'd suggest that you post the problematic code > in > >> >> question to the list. > >> >> > >> > > >> > USB issue: > >> > > >> > Please refer to the USB patch attached (will be sent to the list as > well > >> in a few minutes) > >> > As I mentioned previously, few drivers like GPIO, USB & UART have > clock- > >> disable/enable + context save/restore in Idle path(omap_sram_idle()). > >> > > >> > In particular, I am referring to calling of the functions > >> pm_runtime_put_sync() & pm_runtime_get_sync() for USB around wfi. > >> > > >> > Now, the following call sequence from omap_sram_idle()will enable > IRQs: > >> > pm_runtime_put_sync--> > >> > __pm_runtime_put--> > >> > pm_runtime_idle--> > >> > spin_unlock_irq(), > >> > __pm_runtime_idle(),--> > >> > spin_unlock_irq, > >> > spin_unlock_irq > >> > > >> > The functions pm_runtime_idle() & __ pm_runtime_idle() do not use > >> > spin_lock_irqsave & spin_unlock_irqrestore. This leads to enabling of > >> > the interrupts in omap_sram_idle unconditionally, which for USB in > >> > particular is leading to IO-pad interrupt being triggered which does > >> > not come otherwise. > >> > >> You seem to have correctly identified the problem. Can you try the > >> patch below (untested) which attempts to address the root cause you've > >> identified? > >> > >> Kevin > >> > >> > To work around this problem, instead of using Runtime APIs, we are > using > >> omap_device_enable/disable, which in-turn call to > __omap_hwmod_enable/idle > >> > > >> > Simply put, I am not talking about grabbing a mutex when interrupts > are > >> disabled, rather of a scenario where interrupts are getting enabled as > a > >> side-effect of calling these functions in omap_sram_idle(). > >> > >> From be3aeea5f8d29c8ce2fa278f48aef849eb682282 Mon Sep 17 00:00:00 2001 > >> From: Kevin Hilman > >> Date: Mon, 9 Aug 2010 08:12:39 -0700 > >> Subject: [PATCH] PM: allow runtime PM get/put from interrupts-disabled > >> context > >> > >> When using runtime PM in combination with CPUidle, the runtime PM > >> transtions of some devices may be triggered during the idle path. > >> Late in the idle sequence, interrupts may already be disabled when > >> runtime PM for these devices is initiated (using the _sync versions of > >> the runtime PM API.) > >> > >> Currently, the runtime PM core assumes methods are called with > >> interrupts enabled. However, if it is called wi
Re: Issues with calling pm_runtime functions in platform_pm_suspend_noirq/IRQ disabled context.
"Shilimkar, Santosh" writes: >> -Original Message- >> From: linux-omap-ow...@vger.kernel.org [mailto:linux-omap- >> ow...@vger.kernel.org] On Behalf Of Kevin Hilman >> Sent: Monday, August 09, 2010 9:01 PM >> To: Basak, Partha >> Cc: Paul Walmsley; linux-omap@vger.kernel.org; Kalliguddi, Hema; Raja, >> Govindraj; Varadarajan, Charulatha >> Subject: Re: Issues with calling pm_runtime functions in >> platform_pm_suspend_noirq/IRQ disabled context. >> >> "Basak, Partha" writes: >> >> >> Finally, we have omap_sram_idle(): >> >> >> >> > In particular, for USB, while executing SRAM_Idle, is we use >> pm_runtime >> >> > functions, we see spurious IO-Pad interrupts. >> >> >> >> Your message doesn't specify what PM runtime functions you are >> executing. >> >> But if those functions are calling mutex_lock(), then they obviously >> must >> >> not be called from omap_sram_idle() in interrupt context. >> >> >> >> It's not clear from your message why you need to call PM runtime >> functions >> >> from the idle loop. I'd suggest that you post the problematic code in >> >> question to the list. >> >> >> > >> > USB issue: >> > >> > Please refer to the USB patch attached (will be sent to the list as well >> in a few minutes) >> > As I mentioned previously, few drivers like GPIO, USB & UART have clock- >> disable/enable + context save/restore in Idle path(omap_sram_idle()). >> > >> > In particular, I am referring to calling of the functions >> pm_runtime_put_sync() & pm_runtime_get_sync() for USB around wfi. >> > >> > Now, the following call sequence from omap_sram_idle()will enable IRQs: >> > pm_runtime_put_sync--> >> >__pm_runtime_put--> >> >pm_runtime_idle--> >> >spin_unlock_irq(), >> >__pm_runtime_idle(),--> >> > spin_unlock_irq, >> >spin_unlock_irq >> > >> > The functions pm_runtime_idle() & __ pm_runtime_idle() do not use >> > spin_lock_irqsave & spin_unlock_irqrestore. This leads to enabling of >> > the interrupts in omap_sram_idle unconditionally, which for USB in >> > particular is leading to IO-pad interrupt being triggered which does >> > not come otherwise. >> >> You seem to have correctly identified the problem. Can you try the >> patch below (untested) which attempts to address the root cause you've >> identified? >> >> Kevin >> >> > To work around this problem, instead of using Runtime APIs, we are using >> omap_device_enable/disable, which in-turn call to __omap_hwmod_enable/idle >> > >> > Simply put, I am not talking about grabbing a mutex when interrupts are >> disabled, rather of a scenario where interrupts are getting enabled as a >> side-effect of calling these functions in omap_sram_idle(). >> >> From be3aeea5f8d29c8ce2fa278f48aef849eb682282 Mon Sep 17 00:00:00 2001 >> From: Kevin Hilman >> Date: Mon, 9 Aug 2010 08:12:39 -0700 >> Subject: [PATCH] PM: allow runtime PM get/put from interrupts-disabled >> context >> >> When using runtime PM in combination with CPUidle, the runtime PM >> transtions of some devices may be triggered during the idle path. >> Late in the idle sequence, interrupts may already be disabled when >> runtime PM for these devices is initiated (using the _sync versions of >> the runtime PM API.) >> >> Currently, the runtime PM core assumes methods are called with >> interrupts enabled. However, if it is called with interrupts >> disabled, the internal locking unconditionally enables interrupts, for >> example: >> >> pm_runtime_put_sync() >> __pm_runtime_put() >> pm_runtime_idle() >> spin_lock_irq() >> __pm_runtime_idle() >> spin_unlock_irq() <--- interrupts unconditionally >> enabled >> dev->bus->pm->runtime_idle() >> spin_lock_irq() >> spin_unlock_irq() >> >> To fix, use the save/restore versions of the spinlock API. >> > Similar thing was thought when this problem was encountered but > there was a concern that it may lead to interrupts are being disable > for longer times why? if interrupts are enabled when this API is used, this patch doesn't change the amount of time interrupts are disabled. if interrupts are already disabled, then you *want* interrupts to be disabled for the entire time. what exactly is the concern? Kevin > Are all run time PM APIs are short enough to keep interrupts > disabled without impacting interrupt latency? > -- To unsubscribe from this list: send the line "unsubscribe linux-omap" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: Issues with calling pm_runtime functions in platform_pm_suspend_noirq/IRQ disabled context.
> -Original Message- > From: linux-omap-ow...@vger.kernel.org [mailto:linux-omap- > ow...@vger.kernel.org] On Behalf Of Kevin Hilman > Sent: Monday, August 09, 2010 9:01 PM > To: Basak, Partha > Cc: Paul Walmsley; linux-omap@vger.kernel.org; Kalliguddi, Hema; Raja, > Govindraj; Varadarajan, Charulatha > Subject: Re: Issues with calling pm_runtime functions in > platform_pm_suspend_noirq/IRQ disabled context. > > "Basak, Partha" writes: > > >> Finally, we have omap_sram_idle(): > >> > >> > In particular, for USB, while executing SRAM_Idle, is we use > pm_runtime > >> > functions, we see spurious IO-Pad interrupts. > >> > >> Your message doesn't specify what PM runtime functions you are > executing. > >> But if those functions are calling mutex_lock(), then they obviously > must > >> not be called from omap_sram_idle() in interrupt context. > >> > >> It's not clear from your message why you need to call PM runtime > functions > >> from the idle loop. I'd suggest that you post the problematic code in > >> question to the list. > >> > > > > USB issue: > > > > Please refer to the USB patch attached (will be sent to the list as well > in a few minutes) > > As I mentioned previously, few drivers like GPIO, USB & UART have clock- > disable/enable + context save/restore in Idle path(omap_sram_idle()). > > > > In particular, I am referring to calling of the functions > pm_runtime_put_sync() & pm_runtime_get_sync() for USB around wfi. > > > > Now, the following call sequence from omap_sram_idle()will enable IRQs: > > pm_runtime_put_sync--> > > __pm_runtime_put--> > > pm_runtime_idle--> > > spin_unlock_irq(), > > __pm_runtime_idle(),--> > > spin_unlock_irq, > > spin_unlock_irq > > > > The functions pm_runtime_idle() & __ pm_runtime_idle() do not use > > spin_lock_irqsave & spin_unlock_irqrestore. This leads to enabling of > > the interrupts in omap_sram_idle unconditionally, which for USB in > > particular is leading to IO-pad interrupt being triggered which does > > not come otherwise. > > You seem to have correctly identified the problem. Can you try the > patch below (untested) which attempts to address the root cause you've > identified? > > Kevin > > > To work around this problem, instead of using Runtime APIs, we are using > omap_device_enable/disable, which in-turn call to __omap_hwmod_enable/idle > > > > Simply put, I am not talking about grabbing a mutex when interrupts are > disabled, rather of a scenario where interrupts are getting enabled as a > side-effect of calling these functions in omap_sram_idle(). > > From be3aeea5f8d29c8ce2fa278f48aef849eb682282 Mon Sep 17 00:00:00 2001 > From: Kevin Hilman > Date: Mon, 9 Aug 2010 08:12:39 -0700 > Subject: [PATCH] PM: allow runtime PM get/put from interrupts-disabled > context > > When using runtime PM in combination with CPUidle, the runtime PM > transtions of some devices may be triggered during the idle path. > Late in the idle sequence, interrupts may already be disabled when > runtime PM for these devices is initiated (using the _sync versions of > the runtime PM API.) > > Currently, the runtime PM core assumes methods are called with > interrupts enabled. However, if it is called with interrupts > disabled, the internal locking unconditionally enables interrupts, for > example: > > pm_runtime_put_sync() > __pm_runtime_put() > pm_runtime_idle() > spin_lock_irq() > __pm_runtime_idle() > spin_unlock_irq() <--- interrupts unconditionally > enabled > dev->bus->pm->runtime_idle() > spin_lock_irq() > spin_unlock_irq() > > To fix, use the save/restore versions of the spinlock API. > Similar thing was thought when this problem was encountered but there was a concern that it may lead to interrupts are being disable for longer times Are all run time PM APIs are short enough to keep interrupts disabled without impacting interrupt latency? -- To unsubscribe from this list: send the line "unsubscribe linux-omap" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Issues with calling pm_runtime functions in platform_pm_suspend_noirq/IRQ disabled context.
"Basak, Partha" writes: >> Finally, we have omap_sram_idle(): >> >> > In particular, for USB, while executing SRAM_Idle, is we use pm_runtime >> > functions, we see spurious IO-Pad interrupts. >> >> Your message doesn't specify what PM runtime functions you are executing. >> But if those functions are calling mutex_lock(), then they obviously must >> not be called from omap_sram_idle() in interrupt context. >> >> It's not clear from your message why you need to call PM runtime functions >> from the idle loop. I'd suggest that you post the problematic code in >> question to the list. >> > > USB issue: > > Please refer to the USB patch attached (will be sent to the list as well in a > few minutes) > As I mentioned previously, few drivers like GPIO, USB & UART have clock- > disable/enable + context save/restore in Idle path(omap_sram_idle()). > > In particular, I am referring to calling of the functions > pm_runtime_put_sync() & pm_runtime_get_sync() for USB around wfi. > > Now, the following call sequence from omap_sram_idle()will enable IRQs: > pm_runtime_put_sync--> > __pm_runtime_put--> > pm_runtime_idle--> > spin_unlock_irq(), > __pm_runtime_idle(),--> >spin_unlock_irq, > spin_unlock_irq > > The functions pm_runtime_idle() & __ pm_runtime_idle() do not use > spin_lock_irqsave & spin_unlock_irqrestore. This leads to enabling of > the interrupts in omap_sram_idle unconditionally, which for USB in > particular is leading to IO-pad interrupt being triggered which does > not come otherwise. You seem to have correctly identified the problem. Can you try the patch below (untested) which attempts to address the root cause you've identified? Kevin > To work around this problem, instead of using Runtime APIs, we are using > omap_device_enable/disable, which in-turn call to __omap_hwmod_enable/idle > > Simply put, I am not talking about grabbing a mutex when interrupts are > disabled, rather of a scenario where interrupts are getting enabled as a > side-effect of calling these functions in omap_sram_idle(). >From be3aeea5f8d29c8ce2fa278f48aef849eb682282 Mon Sep 17 00:00:00 2001 From: Kevin Hilman Date: Mon, 9 Aug 2010 08:12:39 -0700 Subject: [PATCH] PM: allow runtime PM get/put from interrupts-disabled context When using runtime PM in combination with CPUidle, the runtime PM transtions of some devices may be triggered during the idle path. Late in the idle sequence, interrupts may already be disabled when runtime PM for these devices is initiated (using the _sync versions of the runtime PM API.) Currently, the runtime PM core assumes methods are called with interrupts enabled. However, if it is called with interrupts disabled, the internal locking unconditionally enables interrupts, for example: pm_runtime_put_sync() __pm_runtime_put() pm_runtime_idle() spin_lock_irq() __pm_runtime_idle() spin_unlock_irq() <--- interrupts unconditionally enabled dev->bus->pm->runtime_idle() spin_lock_irq() spin_unlock_irq() To fix, use the save/restore versions of the spinlock API. Reported-by: Partha Basak Signed-off-by: Kevin Hilman --- drivers/base/power/runtime.c | 68 ++--- include/linux/pm.h |1 + 2 files changed, 37 insertions(+), 32 deletions(-) diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c index b0ec0e9..b02ef35 100644 --- a/drivers/base/power/runtime.c +++ b/drivers/base/power/runtime.c @@ -80,24 +80,24 @@ static int __pm_runtime_idle(struct device *dev) dev->power.idle_notification = true; if (dev->bus && dev->bus->pm && dev->bus->pm->runtime_idle) { - spin_unlock_irq(&dev->power.lock); + spin_unlock_irqrestore(&dev->power.lock, dev->power.lock_flags); dev->bus->pm->runtime_idle(dev); - spin_lock_irq(&dev->power.lock); + spin_lock_irqsave(&dev->power.lock, dev->power.lock_flags); } else if (dev->type && dev->type->pm && dev->type->pm->runtime_idle) { - spin_unlock_irq(&dev->power.lock); + spin_unlock_irqrestore(&dev->power.lock, dev->power.lock_flags); dev->type->pm->runtime_idle(dev); - spin_lock_irq(&dev->power.lock); + spin_lock_irqsave(&dev->power.lock, dev->power.lock_flags); } else if (dev->class && dev->class->pm && dev->class->pm->runtime_idle) { - spin_unlock_irq(&dev->power.lock); + spin_unlock_irqrestore(&dev->power.lock, dev->power.lock_flags); dev->class->pm->runtime_idle(dev); - spin_lock_irq(&dev->power.lock); + spin_lock_irqsave(&dev->power.lock, dev->power.lock_flags); } dev->power.idle_notificat
RE: Issues with calling pm_runtime functions in platform_pm_suspend_noirq/IRQ disabled context.
> Finally, we have omap_sram_idle(): > > > In particular, for USB, while executing SRAM_Idle, is we use pm_runtime > > functions, we see spurious IO-Pad interrupts. > > Your message doesn't specify what PM runtime functions you are executing. > But if those functions are calling mutex_lock(), then they obviously must > not be called from omap_sram_idle() in interrupt context. > > It's not clear from your message why you need to call PM runtime functions > from the idle loop. I'd suggest that you post the problematic code in > question to the list. > USB issue: Please refer to the USB patch attached (will be sent to the list as well in a few minutes) As I mentioned previously, few drivers like GPIO, USB & UART have clock- disable/enable + context save/restore in Idle path(omap_sram_idle()). In particular, I am referring to calling of the functions pm_runtime_put_sync() & pm_runtime_get_sync() for USB around wfi. Now, the following call sequence from omap_sram_idle()will enable IRQs: pm_runtime_put_sync--> __pm_runtime_put--> pm_runtime_idle--> spin_unlock_irq(), __pm_runtime_idle(),--> spin_unlock_irq, spin_unlock_irq The functions pm_runtime_idle() & __ pm_runtime_idle() do not use spin_lock_irqsave & spin_unlock_irqrestore. This leads to enabling of the interrupts in omap_sram_idle unconditionally, which for USB in particular is leading to IO-pad interrupt being triggered which does not come otherwise. To work around this problem, instead of using Runtime APIs, we are using omap_device_enable/disable, which in-turn call to __omap_hwmod_enable/idle Simply put, I am not talking about grabbing a mutex when interrupts are disabled, rather of a scenario where interrupts are getting enabled as a side-effect of calling these functions in omap_sram_idle(). > > - Paul 0008-runtime-pm-changes.patch Description: 0008-runtime-pm-changes.patch
RE: Issues with calling pm_runtime functions in platform_pm_suspend_noirq/IRQ disabled context.
On Sat, 7 Aug 2010, Basak, Partha wrote: > > -Original Message- > > From: Paul Walmsley [mailto:p...@pwsan.com] > > Sent: Saturday, August 07, 2010 1:00 AM > > > > On Tue, 3 Aug 2010, Basak, Partha wrote: > > > > > I believe, it is not correct to call the pm_runtime APIs from interrupt > > > locked context > > > > Why do you think that the PM runtime APIs are being called from interrupt > > context? Could you point out the call chain that you're seeing? > > > I meant that they should not not be called in a context which has > interrupts disabled. SRAM_Idle as well as suspend_no_irq are the > instances I was referring to. It would be really helpful if you would use the exact names of the functions that you are referring to. This message assumes that your use of "SRAM_Idle" refers to omap_sram_idle() in pm34xx.c, and that "suspend_no_irq" refers to platform_pm_suspend_noirq() in pm_bus.c. You write that "[the PM runtime functions] should not not be called in a context which has interrupts disabled." All interrupts? Some interrupts? "Context" in the general sense, or in the strict sense used in Linux interrupt handling? It is difficult to comment on this statement since it is unclear. I will guess that your message refers to the fact that some of the PM runtime functions take a mutex, and therefore it is invalid to call those functions when the timer interrupt is disabled. If that's what you mean, then it's worth observing that it's valid to call some PM runtime functions, such as pm_runtime_get_noresume(), pm_runtime_suspended(), etc., no matter which interrupts are enabled, since they don't take any mutex. Similarly, it seems quite valid to call pretty much any PM runtime function as long as the timer interrupt is still running. This is the case with platform_pm_suspend_noirq(), at least, with this call chain: dpm_suspend_noirq() -> device_suspend_noirq() -> pm_irq_op() -> dev_pm_ops.suspend_noirq -> platform_pm_suspend_noirq() Unfortunately, your message didn't provide a call chain, so, not sure if we're even referring to the same code path. Based on this chain, I don't see any interrupt-related problems with calling PM runtime functions from platform_pm_suspend_noirq(). If you are actually seeing some problem here, then you should post the warning that you're seeing and the call chain that's causing the problem. Finally, we have omap_sram_idle(): > In particular, for USB, while executing SRAM_Idle, is we use pm_runtime > functions, we see spurious IO-Pad interrupts. Your message doesn't specify what PM runtime functions you are executing. But if those functions are calling mutex_lock(), then they obviously must not be called from omap_sram_idle() in interrupt context. It's not clear from your message why you need to call PM runtime functions from the idle loop. I'd suggest that you post the problematic code in question to the list. - Paul -- To unsubscribe from this list: send the line "unsubscribe linux-omap" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: Issues with calling pm_runtime functions in platform_pm_suspend_noirq/IRQ disabled context.
> -Original Message- > From: Kevin Hilman [mailto:khil...@deeprootsystems.com] > Sent: Thursday, August 05, 2010 3:17 AM > To: Basak, Partha > Cc: Paul Walmsley; linux-omap@vger.kernel.org; Kalliguddi, Hema; Raja, > Govindraj; Varadarajan, Charulatha; Nayak, Rajendra; Cousson, Benoit > Subject: Re: Issues with calling pm_runtime functions in > platform_pm_suspend_noirq/IRQ disabled context. > > "Basak, Partha" writes: > > >> -Original Message- > >> From: Kevin Hilman [mailto:khil...@deeprootsystems.com] > >> Sent: Tuesday, August 03, 2010 11:06 PM > >> To: Basak, Partha > >> Cc: Paul Walmsley; linux-omap@vger.kernel.org; Kalliguddi, Hema; Raja, > >> Govindraj; Varadarajan, Charulatha; Nayak, Rajendra; Cousson, Benoit > >> Subject: Re: Issues with calling pm_runtime functions in > >> platform_pm_suspend_noirq/IRQ disabled context. > >> > >> "Basak, Partha" writes: > >> > >> > Resending with the corrected mailing list > >> > > >> > Hello Kevin, > >> > > >> > I want to draw your attention to an issue the following patch > >> > introduces. http://dev.omapzoom.org/?p=swarch/linux-omap- > >> adv.git;a=commitdiff;h=8041359e18e49bf8a3d41f15894db9e476f3a8fc > >> > >> [...] > >> > >> > I believe, it is not correct to call the pm_runtime APIs from > >> > interrupt locked context since the underlying functions > >> > __pm_runtime_suspend & __pm_runtime_resume enable interrupts and call > >> > schedule(). > >> > > >> > Further, from a s/w layering standpoint, platform-bus is a deeper > >> > layer than the Runtime layer, which the runtime layer calls into. So > >> > it may not be correct to have this layer calling into pm_runtime(). > It > >> > should directly call omap_device APIs. > >> > >> The original version of this patch did directly call the omap_device > >> APIs. However, Paul didn't like that approach and there were > >> use-counting problems to handle as well (e.g. if a driver was not (yet) > >> in use, it would already be disabled, and a suspend would trigger an > >> omap_device_disable() which would fail since device was already > >> disabled.) > >> > >> Paul and I agreed that this current approach would solve the > >> use-counting problems, but you're correct in that it introduces > >> new problems as these functions can _possibly_ sleep/schedule. > >> > >> That being said, have you actually seen problems here? I would like > >> to see how they are triggered? > > > > Scenario 1: > > > > Consider the case where a driver has already called > > pm_runtime_put_sync as part of aggressive clock cutting > > scheme. Subsequently, when system suspend is called, > > pm_runtime_put_sync is called again. > > > Due to atomic_dec_and_test check > > on usage_count, the second call goes through w/o error. > > I don't think you're fully understanding the point of the put/get in the > suspend/resume path. > > The reason for the 'put' in the suspend path is because the PM core has > done a 'get' in dpm_prepare() [c.f. drivers/base/power/main.c] so that > runtime PM transitions are prevented during suspend/resume. > > On OMAP, we want to allow those transitions to happen so we can use the > same runtime PM calls in the idle and suspend paths. > > > At System resume, pm_runtime_get_sync is called twice, once by resume, > > once by the driver itself. > > Yes, but there is a 'put_sync' in between those done by the PM core > during resume (c.f. dpm_complete() in drivers/base/power/main.c] > > > Because of the extra reference count, the > > aggressive clock control by the driver is broken, as call to put_sync > > by a driver reduces to usage_count to 1. As a result clock transition > > in Idle-path is broken. > I understand the rationale better, but having these changes is making the next Idle call after a suspend-resume cycle to fail. I will continue debugging on this and come back with more details. > > > Scenario2: > > > > Tested GPIO for Suspend with these changes & obtained dump of Circular > Locking dependencies: > > I don't think these to problems are related, AFAICT, they are separate > issues. I'll respond to this scenario in a different reply. > > >> > >> [...] > >> > >> The UART driver is a special case as it is mana
RE: Issues with calling pm_runtime functions in platform_pm_suspend_noirq/IRQ disabled context.
> -Original Message- > From: Kevin Hilman [mailto:khil...@deeprootsystems.com] > Sent: Thursday, August 05, 2010 4:51 AM > To: Basak, Partha > Cc: Paul Walmsley; linux-omap@vger.kernel.org; Kalliguddi, Hema; Raja, > Govindraj; Varadarajan, Charulatha; Nayak, Rajendra; Cousson, Benoit > Subject: Re: Issues with calling pm_runtime functions in > platform_pm_suspend_noirq/IRQ disabled context. > > Kevin Hilman writes: > > > Kevin Hilman writes: > > > >> "Basak, Partha" writes: > >> > >>>> -Original Message- > >>>> From: Kevin Hilman [mailto:khil...@deeprootsystems.com] > >>>> Sent: Tuesday, August 03, 2010 11:06 PM > >>>> To: Basak, Partha > >>>> Cc: Paul Walmsley; linux-omap@vger.kernel.org; Kalliguddi, Hema; > Raja, > >>>> Govindraj; Varadarajan, Charulatha; Nayak, Rajendra; Cousson, Benoit > >>>> Subject: Re: Issues with calling pm_runtime functions in > >>>> platform_pm_suspend_noirq/IRQ disabled context. > >>>> > >>>> "Basak, Partha" writes: > >>>> > >>>> > Resending with the corrected mailing list > >>>> > > >>>> > Hello Kevin, > >>>> > > >>>> > I want to draw your attention to an issue the following patch > >>>> > introduces. http://dev.omapzoom.org/?p=swarch/linux-omap- > >>>> adv.git;a=commitdiff;h=8041359e18e49bf8a3d41f15894db9e476f3a8fc > >>>> > >>>> [...] > >>>> > >>>> > I believe, it is not correct to call the pm_runtime APIs from > >>>> > interrupt locked context since the underlying functions > >>>> > __pm_runtime_suspend & __pm_runtime_resume enable interrupts and > call > >>>> > schedule(). > >>>> > > >>>> > Further, from a s/w layering standpoint, platform-bus is a deeper > >>>> > layer than the Runtime layer, which the runtime layer calls into. > So > >>>> > it may not be correct to have this layer calling into pm_runtime(). > It > >>>> > should directly call omap_device APIs. > >>>> > >>>> The original version of this patch did directly call the omap_device > >>>> APIs. However, Paul didn't like that approach and there were > >>>> use-counting problems to handle as well (e.g. if a driver was not > (yet) > >>>> in use, it would already be disabled, and a suspend would trigger an > >>>> omap_device_disable() which would fail since device was already > >>>> disabled.) > >>>> > >>>> Paul and I agreed that this current approach would solve the > >>>> use-counting problems, but you're correct in that it introduces > >>>> new problems as these functions can _possibly_ sleep/schedule. > >>>> > >>>> That being said, have you actually seen problems here? I would like > >>>> to see how they are triggered? > >>> > >>> Scenario 1: > >>> > >>> Consider the case where a driver has already called > >>> pm_runtime_put_sync as part of aggressive clock cutting > >>> scheme. Subsequently, when system suspend is called, > >>> pm_runtime_put_sync is called again. > >> > >>> Due to atomic_dec_and_test check > >>> on usage_count, the second call goes through w/o error. > >> > >> I don't think you're fully understanding the point of the put/get in > the > >> suspend/resume path. > >> > >> The reason for the 'put' in the suspend path is because the PM core has > >> done a 'get' in dpm_prepare() [c.f. drivers/base/power/main.c] so that > >> runtime PM transitions are prevented during suspend/resume. > >> > >> On OMAP, we want to allow those transitions to happen so we can use the > >> same runtime PM calls in the idle and suspend paths. > >> > >>> At System resume, pm_runtime_get_sync is called twice, once by resume, > >>> once by the driver itself. > >> > >> Yes, but there is a 'put_sync' in between those done by the PM core > >> during resume (c.f. dpm_complete() in drivers/base/power/main.c] > >> > >>> Because of the extra reference count, the > >>> aggressive clock control by the driver is broken, as call to put_sync > >>> by a driver
RE: Issues with calling pm_runtime functions in platform_pm_suspend_noirq/IRQ disabled context.
> -Original Message- > From: Kevin Hilman [mailto:khil...@deeprootsystems.com] > Sent: Thursday, August 05, 2010 5:05 AM > To: Basak, Partha > Cc: Paul Walmsley; linux-omap@vger.kernel.org; Kalliguddi, Hema; Raja, > Govindraj; Varadarajan, Charulatha; Nayak, Rajendra; Cousson, Benoit > Subject: Re: Issues with calling pm_runtime functions in > platform_pm_suspend_noirq/IRQ disabled context. > > "Basak, Partha" writes: > > > > > Tested GPIO for Suspend with these changes & obtained dump of Circular > Locking dependencies: > > > > It would *really* help to have these GPIO conversion patches posted > against a public tree/branch one could see exactly what is going on and > be able to reproduce this problem for easier analysis. I have some > hunches as to what is going wrong here, but cannot be sure. I'd much > rather be able to see and try the code. > > Fortunately, lockdep is verbose enough to try and understand the > symptoms here. Lockdep seems to have detected that these two locks have > been acquired in different order, resulting in *potential* deadlock. > > Indeed, during init (#0 below) you have the hwmod_mutex acquired > (hwmod_for_each_by_class()) then the dpm_list_mtx acquired > (device_pm_add()). Later, during suspend the dpm_list_mtx is aquired > first (dpm_suspend_noirq()), then the omap_hwmod_mutex is acquired > (omap_hwmod_idle()). > > Taking the same locks in different order at different times is the > circular dependency and a recipe for potential deadlock. > > In our case, there is no real potential for deadlock here as the locks > are only taken the hwmod->dpm order once during init, all other times > (when the omap_device/hwmod are ready) it will happen in the dpm->hwmod > order. > > The simple fix here is probably to remove the locking in the > omap_hwmod_for_each* functions. Could you try that? > We tried this, it works. But the GPIO patch series sent out(posted today, Aug 06 2010) are tested based on reverting the pm_bus.c suspend_no_irq patch. > I'm a bit confused why I don't see the same problem with the UART layer > which currently also handles things in the idle path as well. > > Kevin > > > # echo mem > /sys/power/state > > [ 809.981658] PM: Syncing filesystems ... done. > > [ 810.037963] Freezing user space processes ... (elapsed 0.02 seconds) > done. > > [ 810.067901] Freezing remaining freezable tasks ... (elapsed 0.02 > seconds) done. > > [ 810.105224] Suspending console(s) (use no_console_suspend to debug) > > [ 810.166748] PM: suspend of devices complete after 46.142 msecs > > [ 810.170562] > > [ 810.170593] === > > [ 810.170593] [ INFO: possible circular locking dependency detected ] > > [ 810.170623] 2.6.35-rc5-00131-g56e767c-dirty #34 > > [ 810.170654] --- > > [ 810.170654] sh/670 is trying to acquire lock: > > [ 810.170684] (omap_hwmod_mutex){+.+.+.}, at: [] > omap_hwmod_idle+0x1c/0x38 > > [ 810.170745] > > [ 810.170745] but task is already holding lock: > > [ 810.170776] (dpm_list_mtx){+.+...}, at: [] > dpm_suspend_noirq+0x28/0x188 > > [ 810.170806] > > [ 810.170837] which lock already depends on the new lock. > > [ 810.170837] > > [ 810.170837] > > [ 810.170837] the existing dependency chain (in reverse order) is: > > [ 810.170867] > > [ 810.170867] -> #1 (dpm_list_mtx){+.+...}: > > [ 810.170898][] lock_acquire+0x60/0x74 > > [ 810.170959][] mutex_lock_nested+0x58/0x2e4 > > [ 810.170989][] device_pm_add+0x14/0xcc > > [ 810.171020][] device_add+0x3b8/0x564 > > [ 810.171051][] platform_device_add+0x104/0x160 > > [ 810.171112][] omap_device_build_ss+0x14c/0x1c8 > > [ 810.171142][] omap_device_build+0x48/0x50 > > [ 810.171173][] omap2_init_gpio+0xf0/0x15c > > [ 810.171203][] > omap_hwmod_for_each_by_class+0x60/0xa4 > > [ 810.171264][] do_one_initcall+0x58/0x1b4 > > [ 810.171295][] kernel_init+0x98/0x150 > > [ 810.171325][] kernel_thread_exit+0x0/0x8 > > [ 810.171356] > > [ 810.171356] -> #0 (omap_hwmod_mutex){+.+.+.}: > > [ 810.171386][] __lock_acquire+0x108c/0x1784 > > [ 810.171447][] lock_acquire+0x60/0x74 > > [ 810.171478][] mutex_lock_nested+0x58/0x2e4 > > [ 810.171508][] omap_hwmod_idle+0x1c/0x38 > > [ 810.171539][] omap_device_idle_hwmods+0x20/0x3c > > [ 810.171600][] _o
Re: Issues with calling pm_runtime functions in platform_pm_suspend_noirq/IRQ disabled context.
"Basak, Partha" writes: > > Tested GPIO for Suspend with these changes & obtained dump of Circular > Locking dependencies: > It would *really* help to have these GPIO conversion patches posted against a public tree/branch one could see exactly what is going on and be able to reproduce this problem for easier analysis. I have some hunches as to what is going wrong here, but cannot be sure. I'd much rather be able to see and try the code. Fortunately, lockdep is verbose enough to try and understand the symptoms here. Lockdep seems to have detected that these two locks have been acquired in different order, resulting in *potential* deadlock. Indeed, during init (#0 below) you have the hwmod_mutex acquired (hwmod_for_each_by_class()) then the dpm_list_mtx acquired (device_pm_add()). Later, during suspend the dpm_list_mtx is aquired first (dpm_suspend_noirq()), then the omap_hwmod_mutex is acquired (omap_hwmod_idle()). Taking the same locks in different order at different times is the circular dependency and a recipe for potential deadlock. In our case, there is no real potential for deadlock here as the locks are only taken the hwmod->dpm order once during init, all other times (when the omap_device/hwmod are ready) it will happen in the dpm->hwmod order. The simple fix here is probably to remove the locking in the omap_hwmod_for_each* functions. Could you try that? I'm a bit confused why I don't see the same problem with the UART layer which currently also handles things in the idle path as well. Kevin > # echo mem > /sys/power/state > [ 809.981658] PM: Syncing filesystems ... done. > [ 810.037963] Freezing user space processes ... (elapsed 0.02 seconds) done. > [ 810.067901] Freezing remaining freezable tasks ... (elapsed 0.02 seconds) > done. > [ 810.105224] Suspending console(s) (use no_console_suspend to debug) > [ 810.166748] PM: suspend of devices complete after 46.142 msecs > [ 810.170562] > [ 810.170593] === > [ 810.170593] [ INFO: possible circular locking dependency detected ] > [ 810.170623] 2.6.35-rc5-00131-g56e767c-dirty #34 > [ 810.170654] --- > [ 810.170654] sh/670 is trying to acquire lock: > [ 810.170684] (omap_hwmod_mutex){+.+.+.}, at: [] > omap_hwmod_idle+0x1c/0x38 > [ 810.170745] > [ 810.170745] but task is already holding lock: > [ 810.170776] (dpm_list_mtx){+.+...}, at: [] > dpm_suspend_noirq+0x28/0x188 > [ 810.170806] > [ 810.170837] which lock already depends on the new lock. > [ 810.170837] > [ 810.170837] > [ 810.170837] the existing dependency chain (in reverse order) is: > [ 810.170867] > [ 810.170867] -> #1 (dpm_list_mtx){+.+...}: > [ 810.170898][] lock_acquire+0x60/0x74 > [ 810.170959][] mutex_lock_nested+0x58/0x2e4 > [ 810.170989][] device_pm_add+0x14/0xcc > [ 810.171020][] device_add+0x3b8/0x564 > [ 810.171051][] platform_device_add+0x104/0x160 > [ 810.171112][] omap_device_build_ss+0x14c/0x1c8 > [ 810.171142][] omap_device_build+0x48/0x50 > [ 810.171173][] omap2_init_gpio+0xf0/0x15c > [ 810.171203][] omap_hwmod_for_each_by_class+0x60/0xa4 > [ 810.171264][] do_one_initcall+0x58/0x1b4 > [ 810.171295][] kernel_init+0x98/0x150 > [ 810.171325][] kernel_thread_exit+0x0/0x8 > [ 810.171356] > [ 810.171356] -> #0 (omap_hwmod_mutex){+.+.+.}: > [ 810.171386][] __lock_acquire+0x108c/0x1784 > [ 810.171447][] lock_acquire+0x60/0x74 > [ 810.171478][] mutex_lock_nested+0x58/0x2e4 > [ 810.171508][] omap_hwmod_idle+0x1c/0x38 > [ 810.171539][] omap_device_idle_hwmods+0x20/0x3c > [ 810.171600][] _omap_device_deactivate+0x58/0x14c > [ 810.171630][] omap_device_idle+0x4c/0x6c > [ 810.171661][] platform_pm_runtime_suspend+0x4c/0x74 > [ 810.171691][] __pm_runtime_suspend+0x204/0x34c > [ 810.171722][] pm_runtime_suspend+0x20/0x34 > [ 810.171752][] platform_pm_runtime_idle+0x8/0x10 > [ 810.171783][] __pm_runtime_idle+0x15c/0x198 > [ 810.171813][] pm_runtime_idle+0x1c/0x30 > [ 810.171844][] platform_pm_suspend_noirq+0x48/0x50 > [ 810.171875][] pm_noirq_op+0xa0/0x184 > [ 810.171905][] dpm_suspend_noirq+0xac/0x188 > [ 810.171936][] suspend_devices_and_enter+0x94/0x1d8 > [ 810.171966][] enter_state+0xbc/0x120 > [ 810.171997][] state_store+0xa4/0xb8 > [ 810.172027][] kobj_attr_store+0x18/0x1c > [ 810.172088][] sysfs_write_file+0x10c/0x144 > [ 810.172119][] vfs_write+0xb0/0x148 > [ 810.172149][] sys_write+0x3c/0x68 > [ 810.172180][] ret_fast_syscall+0x0/0x3c > [ 810.172210] > [ 810.172210] other info that might help us debug this: > [ 810.172210] > [ 810.172241] 4 locks held by sh/670: > [ 810.172241] #0: (&buffer->mutex){+.+.+.}, at: [
Re: Issues with calling pm_runtime functions in platform_pm_suspend_noirq/IRQ disabled context.
Kevin Hilman writes: > Kevin Hilman writes: > >> "Basak, Partha" writes: >> >>>> -Original Message- >>>> From: Kevin Hilman [mailto:khil...@deeprootsystems.com] >>>> Sent: Tuesday, August 03, 2010 11:06 PM >>>> To: Basak, Partha >>>> Cc: Paul Walmsley; linux-omap@vger.kernel.org; Kalliguddi, Hema; Raja, >>>> Govindraj; Varadarajan, Charulatha; Nayak, Rajendra; Cousson, Benoit >>>> Subject: Re: Issues with calling pm_runtime functions in >>>> platform_pm_suspend_noirq/IRQ disabled context. >>>> >>>> "Basak, Partha" writes: >>>> >>>> > Resending with the corrected mailing list >>>> > >>>> > Hello Kevin, >>>> > >>>> > I want to draw your attention to an issue the following patch >>>> > introduces. http://dev.omapzoom.org/?p=swarch/linux-omap- >>>> adv.git;a=commitdiff;h=8041359e18e49bf8a3d41f15894db9e476f3a8fc >>>> >>>> [...] >>>> >>>> > I believe, it is not correct to call the pm_runtime APIs from >>>> > interrupt locked context since the underlying functions >>>> > __pm_runtime_suspend & __pm_runtime_resume enable interrupts and call >>>> > schedule(). >>>> > >>>> > Further, from a s/w layering standpoint, platform-bus is a deeper >>>> > layer than the Runtime layer, which the runtime layer calls into. So >>>> > it may not be correct to have this layer calling into pm_runtime(). It >>>> > should directly call omap_device APIs. >>>> >>>> The original version of this patch did directly call the omap_device >>>> APIs. However, Paul didn't like that approach and there were >>>> use-counting problems to handle as well (e.g. if a driver was not (yet) >>>> in use, it would already be disabled, and a suspend would trigger an >>>> omap_device_disable() which would fail since device was already >>>> disabled.) >>>> >>>> Paul and I agreed that this current approach would solve the >>>> use-counting problems, but you're correct in that it introduces >>>> new problems as these functions can _possibly_ sleep/schedule. >>>> >>>> That being said, have you actually seen problems here? I would like >>>> to see how they are triggered? >>> >>> Scenario 1: >>> >>> Consider the case where a driver has already called >>> pm_runtime_put_sync as part of aggressive clock cutting >>> scheme. Subsequently, when system suspend is called, >>> pm_runtime_put_sync is called again. >> >>> Due to atomic_dec_and_test check >>> on usage_count, the second call goes through w/o error. >> >> I don't think you're fully understanding the point of the put/get in the >> suspend/resume path. >> >> The reason for the 'put' in the suspend path is because the PM core has >> done a 'get' in dpm_prepare() [c.f. drivers/base/power/main.c] so that >> runtime PM transitions are prevented during suspend/resume. >> >> On OMAP, we want to allow those transitions to happen so we can use the >> same runtime PM calls in the idle and suspend paths. >> >>> At System resume, pm_runtime_get_sync is called twice, once by resume, >>> once by the driver itself. >> >> Yes, but there is a 'put_sync' in between those done by the PM core >> during resume (c.f. dpm_complete() in drivers/base/power/main.c] >> >>> Because of the extra reference count, the >>> aggressive clock control by the driver is broken, as call to put_sync >>> by a driver reduces to usage_count to 1. As a result clock transition >>> in Idle-path is broken. > > A closer look here, and there is indeed a bug in the _resume_noirq() > method. The get here needs to be a _noresume version to match what is > done in the PM core. > > This doesn't change the use count, but it does change whether the device > is actually awoken by this 'get' or not. This get should never actually > awake the device. It is only there to compensate for what the PM core > does. > > Can you try this patch? Will post a version to the list with > changelog shortly. After a little more thinking, I'm not sure yet if this is a real problem or not. One other question. At least for GPIO testing, does it work if you simply remove the _noirq hooks from pm_bus.c? If so, please feel to post a version with the dependency that the patch adding suspend/resume hooks in pm_bus.c needs to be reverted first. Kevin > Kevin > > diff --git a/arch/arm/mach-omap2/pm_bus.c b/arch/arm/mach-omap2/pm_bus.c > index bab0186..01bbe65 100644 > --- a/arch/arm/mach-omap2/pm_bus.c > +++ b/arch/arm/mach-omap2/pm_bus.c > @@ -90,7 +90,7 @@ int platform_pm_resume_noirq(struct device *dev) >* method so that the runtime PM usage counting is in the same >* state it was when suspend was called. >*/ > - pm_runtime_get_sync(dev); > + pm_runtime_get_noresume(dev); > > if (!drv) > return 0; -- To unsubscribe from this list: send the line "unsubscribe linux-omap" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Issues with calling pm_runtime functions in platform_pm_suspend_noirq/IRQ disabled context.
Kevin Hilman writes: > "Basak, Partha" writes: > >>> -Original Message- >>> From: Kevin Hilman [mailto:khil...@deeprootsystems.com] >>> Sent: Tuesday, August 03, 2010 11:06 PM >>> To: Basak, Partha >>> Cc: Paul Walmsley; linux-omap@vger.kernel.org; Kalliguddi, Hema; Raja, >>> Govindraj; Varadarajan, Charulatha; Nayak, Rajendra; Cousson, Benoit >>> Subject: Re: Issues with calling pm_runtime functions in >>> platform_pm_suspend_noirq/IRQ disabled context. >>> >>> "Basak, Partha" writes: >>> >>> > Resending with the corrected mailing list >>> > >>> > Hello Kevin, >>> > >>> > I want to draw your attention to an issue the following patch >>> > introduces. http://dev.omapzoom.org/?p=swarch/linux-omap- >>> adv.git;a=commitdiff;h=8041359e18e49bf8a3d41f15894db9e476f3a8fc >>> >>> [...] >>> >>> > I believe, it is not correct to call the pm_runtime APIs from >>> > interrupt locked context since the underlying functions >>> > __pm_runtime_suspend & __pm_runtime_resume enable interrupts and call >>> > schedule(). >>> > >>> > Further, from a s/w layering standpoint, platform-bus is a deeper >>> > layer than the Runtime layer, which the runtime layer calls into. So >>> > it may not be correct to have this layer calling into pm_runtime(). It >>> > should directly call omap_device APIs. >>> >>> The original version of this patch did directly call the omap_device >>> APIs. However, Paul didn't like that approach and there were >>> use-counting problems to handle as well (e.g. if a driver was not (yet) >>> in use, it would already be disabled, and a suspend would trigger an >>> omap_device_disable() which would fail since device was already >>> disabled.) >>> >>> Paul and I agreed that this current approach would solve the >>> use-counting problems, but you're correct in that it introduces >>> new problems as these functions can _possibly_ sleep/schedule. >>> >>> That being said, have you actually seen problems here? I would like >>> to see how they are triggered? >> >> Scenario 1: >> >> Consider the case where a driver has already called >> pm_runtime_put_sync as part of aggressive clock cutting >> scheme. Subsequently, when system suspend is called, >> pm_runtime_put_sync is called again. > >> Due to atomic_dec_and_test check >> on usage_count, the second call goes through w/o error. > > I don't think you're fully understanding the point of the put/get in the > suspend/resume path. > > The reason for the 'put' in the suspend path is because the PM core has > done a 'get' in dpm_prepare() [c.f. drivers/base/power/main.c] so that > runtime PM transitions are prevented during suspend/resume. > > On OMAP, we want to allow those transitions to happen so we can use the > same runtime PM calls in the idle and suspend paths. > >> At System resume, pm_runtime_get_sync is called twice, once by resume, >> once by the driver itself. > > Yes, but there is a 'put_sync' in between those done by the PM core > during resume (c.f. dpm_complete() in drivers/base/power/main.c] > >> Because of the extra reference count, the >> aggressive clock control by the driver is broken, as call to put_sync >> by a driver reduces to usage_count to 1. As a result clock transition >> in Idle-path is broken. A closer look here, and there is indeed a bug in the _resume_noirq() method. The get here needs to be a _noresume version to match what is done in the PM core. This doesn't change the use count, but it does change whether the device is actually awoken by this 'get' or not. This get should never actually awake the device. It is only there to compensate for what the PM core does. Can you try this patch? Will post a version to the list with changelog shortly. Kevin diff --git a/arch/arm/mach-omap2/pm_bus.c b/arch/arm/mach-omap2/pm_bus.c index bab0186..01bbe65 100644 --- a/arch/arm/mach-omap2/pm_bus.c +++ b/arch/arm/mach-omap2/pm_bus.c @@ -90,7 +90,7 @@ int platform_pm_resume_noirq(struct device *dev) * method so that the runtime PM usage counting is in the same * state it was when suspend was called. */ - pm_runtime_get_sync(dev); + pm_runtime_get_noresume(dev); if (!drv) return 0; -- To unsubscribe from this list: send the line "unsubscribe linux-omap" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Issues with calling pm_runtime functions in platform_pm_suspend_noirq/IRQ disabled context.
"Basak, Partha" writes: >> -Original Message- >> From: Kevin Hilman [mailto:khil...@deeprootsystems.com] >> Sent: Tuesday, August 03, 2010 11:06 PM >> To: Basak, Partha >> Cc: Paul Walmsley; linux-omap@vger.kernel.org; Kalliguddi, Hema; Raja, >> Govindraj; Varadarajan, Charulatha; Nayak, Rajendra; Cousson, Benoit >> Subject: Re: Issues with calling pm_runtime functions in >> platform_pm_suspend_noirq/IRQ disabled context. >> >> "Basak, Partha" writes: >> >> > Resending with the corrected mailing list >> > >> > Hello Kevin, >> > >> > I want to draw your attention to an issue the following patch >> > introduces. http://dev.omapzoom.org/?p=swarch/linux-omap- >> adv.git;a=commitdiff;h=8041359e18e49bf8a3d41f15894db9e476f3a8fc >> >> [...] >> >> > I believe, it is not correct to call the pm_runtime APIs from >> > interrupt locked context since the underlying functions >> > __pm_runtime_suspend & __pm_runtime_resume enable interrupts and call >> > schedule(). >> > >> > Further, from a s/w layering standpoint, platform-bus is a deeper >> > layer than the Runtime layer, which the runtime layer calls into. So >> > it may not be correct to have this layer calling into pm_runtime(). It >> > should directly call omap_device APIs. >> >> The original version of this patch did directly call the omap_device >> APIs. However, Paul didn't like that approach and there were >> use-counting problems to handle as well (e.g. if a driver was not (yet) >> in use, it would already be disabled, and a suspend would trigger an >> omap_device_disable() which would fail since device was already >> disabled.) >> >> Paul and I agreed that this current approach would solve the >> use-counting problems, but you're correct in that it introduces >> new problems as these functions can _possibly_ sleep/schedule. >> >> That being said, have you actually seen problems here? I would like >> to see how they are triggered? > > Scenario 1: > > Consider the case where a driver has already called > pm_runtime_put_sync as part of aggressive clock cutting > scheme. Subsequently, when system suspend is called, > pm_runtime_put_sync is called again. > Due to atomic_dec_and_test check > on usage_count, the second call goes through w/o error. I don't think you're fully understanding the point of the put/get in the suspend/resume path. The reason for the 'put' in the suspend path is because the PM core has done a 'get' in dpm_prepare() [c.f. drivers/base/power/main.c] so that runtime PM transitions are prevented during suspend/resume. On OMAP, we want to allow those transitions to happen so we can use the same runtime PM calls in the idle and suspend paths. > At System resume, pm_runtime_get_sync is called twice, once by resume, > once by the driver itself. Yes, but there is a 'put_sync' in between those done by the PM core during resume (c.f. dpm_complete() in drivers/base/power/main.c] > Because of the extra reference count, the > aggressive clock control by the driver is broken, as call to put_sync > by a driver reduces to usage_count to 1. As a result clock transition > in Idle-path is broken. > Scenario2: > > Tested GPIO for Suspend with these changes & obtained dump of Circular > Locking dependencies: I don't think these to problems are related, AFAICT, they are separate issues. I'll respond to this scenario in a different reply. >> >> [...] >> >> The UART driver is a special case as it is managed from deep inside the >> idle path. >> >> > Can you feedback on my observations and comment on the approach taken >> > for these drivers? >> >> I cannot comment until I understand why these drivers need to >> enable/disable with interrupts off. >> >> In general, any use of the interrupts-off versions of the omap_device >> APIs need to be thorougly justified. >> >> Even the UART one will go away when the omap-serial driver is converted >> to runtime PM. >> > > For GPIO, it is a must to relinquish clocks in Idle-path as it is > possible to have a GPIO bank requested and still allow PER domain to > go to OFF. These the kind of things that I would expect to be discussed/described in the changelogs to patches posted to the list. > For USB, this is required because of a h/w issue. Again, patches with descriptive changelogs describing the "h/w issue" please. > My point is, just by exposing a _omap_hwmod_idle()(mutex-free version) will &
RE: Issues with calling pm_runtime functions in platform_pm_suspend_noirq/IRQ disabled context.
> -Original Message- > From: Kevin Hilman [mailto:khil...@deeprootsystems.com] > Sent: Tuesday, August 03, 2010 11:06 PM > To: Basak, Partha > Cc: Paul Walmsley; linux-omap@vger.kernel.org; Kalliguddi, Hema; Raja, > Govindraj; Varadarajan, Charulatha; Nayak, Rajendra; Cousson, Benoit > Subject: Re: Issues with calling pm_runtime functions in > platform_pm_suspend_noirq/IRQ disabled context. > > "Basak, Partha" writes: > > > Resending with the corrected mailing list > > > > Hello Kevin, > > > > I want to draw your attention to an issue the following patch > > introduces. http://dev.omapzoom.org/?p=swarch/linux-omap- > adv.git;a=commitdiff;h=8041359e18e49bf8a3d41f15894db9e476f3a8fc > > [...] > > > I believe, it is not correct to call the pm_runtime APIs from > > interrupt locked context since the underlying functions > > __pm_runtime_suspend & __pm_runtime_resume enable interrupts and call > > schedule(). > > > > Further, from a s/w layering standpoint, platform-bus is a deeper > > layer than the Runtime layer, which the runtime layer calls into. So > > it may not be correct to have this layer calling into pm_runtime(). It > > should directly call omap_device APIs. > > The original version of this patch did directly call the omap_device > APIs. However, Paul didn't like that approach and there were > use-counting problems to handle as well (e.g. if a driver was not (yet) > in use, it would already be disabled, and a suspend would trigger an > omap_device_disable() which would fail since device was already > disabled.) > > Paul and I agreed that this current approach would solve the > use-counting problems, but you're correct in that it introduces > new problems as these functions can _possibly_ sleep/schedule. > > That being said, have you actually seen problems here? I would like > to see how they are triggered? Scenario 1: Consider the case where a driver has already called pm_runtime_put_sync as part of aggressive clock cutting scheme. Subsequently, when system suspend is called, pm_runtime_put_sync is called again. Due to atomic_dec_and_test check on usage_count, the second call goes through w/o error. At System resume, pm_runtime_get_sync is called twice, once by resume, once by the driver itself. Because of the extra reference count, the aggressive clock control by the driver is broken, as call to put_sync by a driver reduces to usage_count to 1. As a result clock transition in Idle-path is broken. Scenario2: Tested GPIO for Suspend with these changes & obtained dump of Circular Locking dependencies: # echo mem > /sys/power/state [ 809.981658] PM: Syncing filesystems ... done. [ 810.037963] Freezing user space processes ... (elapsed 0.02 seconds) done. [ 810.067901] Freezing remaining freezable tasks ... (elapsed 0.02 seconds) done. [ 810.105224] Suspending console(s) (use no_console_suspend to debug) [ 810.166748] PM: suspend of devices complete after 46.142 msecs [ 810.170562] [ 810.170593] === [ 810.170593] [ INFO: possible circular locking dependency detected ] [ 810.170623] 2.6.35-rc5-00131-g56e767c-dirty #34 [ 810.170654] --- [ 810.170654] sh/670 is trying to acquire lock: [ 810.170684] (omap_hwmod_mutex){+.+.+.}, at: [] omap_hwmod_idle+0x1c/0x38 [ 810.170745] [ 810.170745] but task is already holding lock: [ 810.170776] (dpm_list_mtx){+.+...}, at: [] dpm_suspend_noirq+0x28/0x188 [ 810.170806] [ 810.170837] which lock already depends on the new lock. [ 810.170837] [ 810.170837] [ 810.170837] the existing dependency chain (in reverse order) is: [ 810.170867] [ 810.170867] -> #1 (dpm_list_mtx){+.+...}: [ 810.170898][] lock_acquire+0x60/0x74 [ 810.170959][] mutex_lock_nested+0x58/0x2e4 [ 810.170989][] device_pm_add+0x14/0xcc [ 810.171020][] device_add+0x3b8/0x564 [ 810.171051][] platform_device_add+0x104/0x160 [ 810.171112][] omap_device_build_ss+0x14c/0x1c8 [ 810.171142][] omap_device_build+0x48/0x50 [ 810.171173][] omap2_init_gpio+0xf0/0x15c [ 810.171203][] omap_hwmod_for_each_by_class+0x60/0xa4 [ 810.171264][] do_one_initcall+0x58/0x1b4 [ 810.171295][] kernel_init+0x98/0x150 [ 810.171325][] kernel_thread_exit+0x0/0x8 [ 810.171356] [ 810.171356] -> #0 (omap_hwmod_mutex){+.+.+.}: [ 810.171386][] __lock_acquire+0x108c/0x1784 [ 810.171447][] lock_acquire+0x60/0x74 [ 810.171478][] mutex_lock_nested+0x58/0x2e4 [ 810.171508][] omap_hwmod_idle+0x1c/0x38 [ 810.171539][] omap_device_idle_hwmods+0x20/0x3c [ 810.171600][] _omap_device_deactivate+0x58/0x14c [ 810.17163
Re: Issues with calling pm_runtime functions in platform_pm_suspend_noirq/IRQ disabled context.
"Basak, Partha" writes: > Resending with the corrected mailing list > > Hello Kevin, > > I want to draw your attention to an issue the following patch > introduces. > http://dev.omapzoom.org/?p=swarch/linux-omap-adv.git;a=commitdiff;h=8041359e18e49bf8a3d41f15894db9e476f3a8fc [...] > I believe, it is not correct to call the pm_runtime APIs from > interrupt locked context since the underlying functions > __pm_runtime_suspend & __pm_runtime_resume enable interrupts and call > schedule(). > > Further, from a s/w layering standpoint, platform-bus is a deeper > layer than the Runtime layer, which the runtime layer calls into. So > it may not be correct to have this layer calling into pm_runtime(). It > should directly call omap_device APIs. The original version of this patch did directly call the omap_device APIs. However, Paul didn't like that approach and there were use-counting problems to handle as well (e.g. if a driver was not (yet) in use, it would already be disabled, and a suspend would trigger an omap_device_disable() which would fail since device was already disabled.) Paul and I agreed that this current approach would solve the use-counting problems, but you're correct in that it introduces new problems as these functions can _possibly_ sleep/schedule. That being said, have you actually seen problems here? I would like to see how they are triggered? By the time the drivers _noirq hooks are called, the PM core has shutdown all the drivers, prevented any runtime PM transitions and disabled interrupts, so no pending runtime transitions will be queued so the sleeping patch of the __pm_runtime_* should never be entered. > We are facing a similar issue with a few drivers USB, UART & GPIO > where, we need to enable/disable clocks & restore/save context in the > CPU-Idle path with interrupts locked. These are unrelated issues. The above is for the static suspend/resume case. These are for runtime PM. > As we are unable to use Runtime APIs, we are using omap_device APIs > directly with the following modification in the .activate_funcion/ > deactivate_funcion (example UART driver) [...] The UART driver is a special case as it is managed from deep inside the idle path. > Can you feedback on my observations and comment on the approach taken > for these drivers? I cannot comment until I understand why these drivers need to enable/disable with interrupts off. In general, any use of the interrupts-off versions of the omap_device APIs need to be thorougly justified. Even the UART one will go away when the omap-serial driver is converted to runtime PM. Kevin -- To unsubscribe from this list: send the line "unsubscribe linux-omap" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html