RE: Issues with calling pm_runtime functions in platform_pm_suspend_noirq/IRQ disabled context.

2010-08-10 Thread Basak, Partha


> -Original Message-
> From: Kevin Hilman [mailto:khil...@deeprootsystems.com]
> Sent: Monday, August 09, 2010 9:01 PM
> To: Basak, Partha
> Cc: Paul Walmsley; linux-omap@vger.kernel.org; Kalliguddi, Hema; Raja,
> Govindraj; Varadarajan, Charulatha
> Subject: Re: Issues with calling pm_runtime functions in
> platform_pm_suspend_noirq/IRQ disabled context.
> 
> "Basak, Partha"  writes:
> 
> >> Finally, we have omap_sram_idle():
> >>
> >> > In particular, for USB, while executing SRAM_Idle, is we use
> pm_runtime
> >> > functions, we see spurious IO-Pad interrupts.
> >>
> >> Your message doesn't specify what PM runtime functions you are
> executing.
> >> But if those functions are calling mutex_lock(), then they obviously
> must
> >> not be called from omap_sram_idle() in interrupt context.
> >>
> >> It's not clear from your message why you need to call PM runtime
> functions
> >> from the idle loop.  I'd suggest that you post the problematic code in
> >> question to the list.
> >>
> >
> > USB issue:
> >
> > Please refer to the USB patch attached (will be sent to the list as well
> in a few minutes)
> > As I mentioned previously, few drivers like GPIO, USB & UART have clock-
> disable/enable + context save/restore in Idle path(omap_sram_idle()).
> >
> > In particular, I am referring to calling of the functions
> pm_runtime_put_sync() & pm_runtime_get_sync() for USB around wfi.
> >
> > Now, the following call sequence from omap_sram_idle()will enable IRQs:
> > pm_runtime_put_sync-->
> > __pm_runtime_put-->
> > pm_runtime_idle-->
> > spin_unlock_irq(),
> > __pm_runtime_idle(),-->
> >  spin_unlock_irq,
> > spin_unlock_irq
> >
> > The functions pm_runtime_idle() & __ pm_runtime_idle() do not use
> > spin_lock_irqsave & spin_unlock_irqrestore. This leads to enabling of
> > the interrupts in omap_sram_idle unconditionally, which for USB in
> > particular is leading to IO-pad interrupt being triggered which does
> > not come otherwise.
> 
> You seem to have correctly identified the problem.  Can you try the
> patch below (untested) which attempts to address the root cause you've
> identified?
> 
> Kevin
> 
> > To work around this problem, instead of using Runtime APIs, we are using
> omap_device_enable/disable, which in-turn call to __omap_hwmod_enable/idle
> >
> > Simply put, I am not talking about grabbing a mutex when interrupts are
> disabled, rather of a scenario where interrupts are getting enabled as a
> side-effect of calling these functions in   omap_sram_idle().
> 
> From be3aeea5f8d29c8ce2fa278f48aef849eb682282 Mon Sep 17 00:00:00 2001
> From: Kevin Hilman 
> Date: Mon, 9 Aug 2010 08:12:39 -0700
> Subject: [PATCH] PM: allow runtime PM get/put from interrupts-disabled
> context
> 
> When using runtime PM in combination with CPUidle, the runtime PM
> transtions of some devices may be triggered during the idle path.
> Late in the idle sequence, interrupts may already be disabled when
> runtime PM for these devices is initiated (using the _sync versions of
> the runtime PM API.)
> 
> Currently, the runtime PM core assumes methods are called with
> interrupts enabled.  However, if it is called with interrupts
> disabled, the internal locking unconditionally enables interrupts, for
> example:
> 
> pm_runtime_put_sync()
> __pm_runtime_put()
> pm_runtime_idle()
> spin_lock_irq()
> __pm_runtime_idle()
> spin_unlock_irq()   <--- interrupts unconditionally
> enabled
> dev->bus->pm->runtime_idle()
> spin_lock_irq()
>  spin_unlock_irq()
> 
> To fix, use the save/restore versions of the spinlock API.
> 
> Reported-by: Partha Basak 
> Signed-off-by: Kevin Hilman 
> ---
>  drivers/base/power/runtime.c |   68 ++---
> 
>  include/linux/pm.h   |1 +
>  2 files changed, 37 insertions(+), 32 deletions(-)
> 
> diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
> index b0ec0e9..b02ef35 100644
> --- a/drivers/base/power/runtime.c
> +++ b/drivers/base/power/runtime.c
> @@ -80,24 +80,24 @@ static int __pm_runtime_idle(struct device *dev)
>   dev->power.idle_notification = true;
> 
>   if (dev->bus && dev->bus->pm && dev->bus->p

RE: Issues with calling pm_runtime functions in platform_pm_suspend_noirq/IRQ disabled context.

2010-08-09 Thread Shilimkar, Santosh

> -Original Message-
> From: linux-omap-ow...@vger.kernel.org [mailto:linux-omap-
> ow...@vger.kernel.org] On Behalf Of Shilimkar, Santosh
> Sent: Monday, August 09, 2010 9:43 PM
> To: Kevin Hilman
> Cc: Basak, Partha; Paul Walmsley; linux-omap@vger.kernel.org; Kalliguddi,
> Hema; Raja, Govindraj; Varadarajan, Charulatha
> Subject: RE: Issues with calling pm_runtime functions in
> platform_pm_suspend_noirq/IRQ disabled context.
> 

[Snip]

> > >> pm_runtime_put_sync()
> > >> __pm_runtime_put()
> > >> pm_runtime_idle()
> > >> spin_lock_irq()
> > >> __pm_runtime_idle()
> > >> spin_unlock_irq()   <--- interrupts unconditionally
> > >> enabled
> > >> dev->bus->pm->runtime_idle()
> > >> spin_lock_irq()
> > >>  spin_unlock_irq()
> > >>
> > >> To fix, use the save/restore versions of the spinlock API.
> > >>
> > > Similar thing was thought when this problem was encountered but
> > > there was a concern that it may lead to interrupts are being disable
> > > for longer times
> >
> > why?
> >
> > if interrupts are enabled when this API is used, this patch doesn't
> > change the amount of time interrupts are disabled.
> >
> > if interrupts are already disabled, then you *want* interrupts to be
> > disabled for the entire time.
> >
> Correct me if I am off the track here.
> 
> If these API's are _always_ called with interrupt disabled then probably
> we don't even need the new spinlock version locking.
> 
> Considering the API is called with interrupt enabled.
> 
> int pm_runtime_suspend(struct device *dev)
>  {
>   int retval;
> 
>   spin_lock_irqsave(&dev->power.lock, dev->power.lock_flags);
>   retval = __pm_runtime_suspend(dev, false);
>   spin_unlock_irqrestore(&dev->power.lock, dev->power.lock_flags);c
> 
> The interrupt on local cpu remain disabled till "__pm_runtime_suspend"
> does its job and re-enables the interrupts. No?
> 
> 
Sorry Kevin for oversight. The existing used lock is already a irq version.
You are right. This change should be fine. Some how I was under impression
the existing code was using plain spin lock version.

Really sorry about the noise.

Regards,
Santosh

--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: Issues with calling pm_runtime functions in platform_pm_suspend_noirq/IRQ disabled context.

2010-08-09 Thread Shilimkar, Santosh
> -Original Message-
> From: Kevin Hilman [mailto:khil...@deeprootsystems.com]
> Sent: Monday, August 09, 2010 9:25 PM
> To: Shilimkar, Santosh
> Cc: Basak, Partha; Paul Walmsley; linux-omap@vger.kernel.org; Kalliguddi,
> Hema; Raja, Govindraj; Varadarajan, Charulatha
> Subject: Re: Issues with calling pm_runtime functions in
> platform_pm_suspend_noirq/IRQ disabled context.
> 
> "Shilimkar, Santosh"  writes:
> 
> >> -Original Message-
> >> From: linux-omap-ow...@vger.kernel.org [mailto:linux-omap-
> >> ow...@vger.kernel.org] On Behalf Of Kevin Hilman
> >> Sent: Monday, August 09, 2010 9:01 PM
> >> To: Basak, Partha
> >> Cc: Paul Walmsley; linux-omap@vger.kernel.org; Kalliguddi, Hema; Raja,
> >> Govindraj; Varadarajan, Charulatha
> >> Subject: Re: Issues with calling pm_runtime functions in
> >> platform_pm_suspend_noirq/IRQ disabled context.
> >>
> >> "Basak, Partha"  writes:
> >>
> >> >> Finally, we have omap_sram_idle():
> >> >>
> >> >> > In particular, for USB, while executing SRAM_Idle, is we use
> >> pm_runtime
> >> >> > functions, we see spurious IO-Pad interrupts.
> >> >>
> >> >> Your message doesn't specify what PM runtime functions you are
> >> executing.
> >> >> But if those functions are calling mutex_lock(), then they obviously
> >> must
> >> >> not be called from omap_sram_idle() in interrupt context.
> >> >>
> >> >> It's not clear from your message why you need to call PM runtime
> >> functions
> >> >> from the idle loop.  I'd suggest that you post the problematic code
> in
> >> >> question to the list.
> >> >>
> >> >
> >> > USB issue:
> >> >
> >> > Please refer to the USB patch attached (will be sent to the list as
> well
> >> in a few minutes)
> >> > As I mentioned previously, few drivers like GPIO, USB & UART have
> clock-
> >> disable/enable + context save/restore in Idle path(omap_sram_idle()).
> >> >
> >> > In particular, I am referring to calling of the functions
> >> pm_runtime_put_sync() & pm_runtime_get_sync() for USB around wfi.
> >> >
> >> > Now, the following call sequence from omap_sram_idle()will enable
> IRQs:
> >> > pm_runtime_put_sync-->
> >> >  __pm_runtime_put-->
> >> >  pm_runtime_idle-->
> >> >  spin_unlock_irq(),
> >> >  __pm_runtime_idle(),-->
> >> >   spin_unlock_irq,
> >> >  spin_unlock_irq
> >> >
> >> > The functions pm_runtime_idle() & __ pm_runtime_idle() do not use
> >> > spin_lock_irqsave & spin_unlock_irqrestore. This leads to enabling of
> >> > the interrupts in omap_sram_idle unconditionally, which for USB in
> >> > particular is leading to IO-pad interrupt being triggered which does
> >> > not come otherwise.
> >>
> >> You seem to have correctly identified the problem.  Can you try the
> >> patch below (untested) which attempts to address the root cause you've
> >> identified?
> >>
> >> Kevin
> >>
> >> > To work around this problem, instead of using Runtime APIs, we are
> using
> >> omap_device_enable/disable, which in-turn call to
> __omap_hwmod_enable/idle
> >> >
> >> > Simply put, I am not talking about grabbing a mutex when interrupts
> are
> >> disabled, rather of a scenario where interrupts are getting enabled as
> a
> >> side-effect of calling these functions in   omap_sram_idle().
> >>
> >> From be3aeea5f8d29c8ce2fa278f48aef849eb682282 Mon Sep 17 00:00:00 2001
> >> From: Kevin Hilman 
> >> Date: Mon, 9 Aug 2010 08:12:39 -0700
> >> Subject: [PATCH] PM: allow runtime PM get/put from interrupts-disabled
> >> context
> >>
> >> When using runtime PM in combination with CPUidle, the runtime PM
> >> transtions of some devices may be triggered during the idle path.
> >> Late in the idle sequence, interrupts may already be disabled when
> >> runtime PM for these devices is initiated (using the _sync versions of
> >> the runtime PM API.)
> >>
> >> Currently, the runtime PM core assumes methods are called with
> >> interrupts enabled.  However, if it is called wi

Re: Issues with calling pm_runtime functions in platform_pm_suspend_noirq/IRQ disabled context.

2010-08-09 Thread Kevin Hilman
"Shilimkar, Santosh"  writes:

>> -Original Message-
>> From: linux-omap-ow...@vger.kernel.org [mailto:linux-omap-
>> ow...@vger.kernel.org] On Behalf Of Kevin Hilman
>> Sent: Monday, August 09, 2010 9:01 PM
>> To: Basak, Partha
>> Cc: Paul Walmsley; linux-omap@vger.kernel.org; Kalliguddi, Hema; Raja,
>> Govindraj; Varadarajan, Charulatha
>> Subject: Re: Issues with calling pm_runtime functions in
>> platform_pm_suspend_noirq/IRQ disabled context.
>> 
>> "Basak, Partha"  writes:
>> 
>> >> Finally, we have omap_sram_idle():
>> >>
>> >> > In particular, for USB, while executing SRAM_Idle, is we use
>> pm_runtime
>> >> > functions, we see spurious IO-Pad interrupts.
>> >>
>> >> Your message doesn't specify what PM runtime functions you are
>> executing.
>> >> But if those functions are calling mutex_lock(), then they obviously
>> must
>> >> not be called from omap_sram_idle() in interrupt context.
>> >>
>> >> It's not clear from your message why you need to call PM runtime
>> functions
>> >> from the idle loop.  I'd suggest that you post the problematic code in
>> >> question to the list.
>> >>
>> >
>> > USB issue:
>> >
>> > Please refer to the USB patch attached (will be sent to the list as well
>> in a few minutes)
>> > As I mentioned previously, few drivers like GPIO, USB & UART have clock-
>> disable/enable + context save/restore in Idle path(omap_sram_idle()).
>> >
>> > In particular, I am referring to calling of the functions
>> pm_runtime_put_sync() & pm_runtime_get_sync() for USB around wfi.
>> >
>> > Now, the following call sequence from omap_sram_idle()will enable IRQs:
>> > pm_runtime_put_sync-->
>> >__pm_runtime_put-->
>> >pm_runtime_idle-->
>> >spin_unlock_irq(),
>> >__pm_runtime_idle(),-->
>> > spin_unlock_irq,
>> >spin_unlock_irq
>> >
>> > The functions pm_runtime_idle() & __ pm_runtime_idle() do not use
>> > spin_lock_irqsave & spin_unlock_irqrestore. This leads to enabling of
>> > the interrupts in omap_sram_idle unconditionally, which for USB in
>> > particular is leading to IO-pad interrupt being triggered which does
>> > not come otherwise.
>> 
>> You seem to have correctly identified the problem.  Can you try the
>> patch below (untested) which attempts to address the root cause you've
>> identified?
>> 
>> Kevin
>> 
>> > To work around this problem, instead of using Runtime APIs, we are using
>> omap_device_enable/disable, which in-turn call to __omap_hwmod_enable/idle
>> >
>> > Simply put, I am not talking about grabbing a mutex when interrupts are
>> disabled, rather of a scenario where interrupts are getting enabled as a
>> side-effect of calling these functions in   omap_sram_idle().
>> 
>> From be3aeea5f8d29c8ce2fa278f48aef849eb682282 Mon Sep 17 00:00:00 2001
>> From: Kevin Hilman 
>> Date: Mon, 9 Aug 2010 08:12:39 -0700
>> Subject: [PATCH] PM: allow runtime PM get/put from interrupts-disabled
>> context
>> 
>> When using runtime PM in combination with CPUidle, the runtime PM
>> transtions of some devices may be triggered during the idle path.
>> Late in the idle sequence, interrupts may already be disabled when
>> runtime PM for these devices is initiated (using the _sync versions of
>> the runtime PM API.)
>> 
>> Currently, the runtime PM core assumes methods are called with
>> interrupts enabled.  However, if it is called with interrupts
>> disabled, the internal locking unconditionally enables interrupts, for
>> example:
>> 
>> pm_runtime_put_sync()
>> __pm_runtime_put()
>> pm_runtime_idle()
>> spin_lock_irq()
>> __pm_runtime_idle()
>> spin_unlock_irq()   <--- interrupts unconditionally
>> enabled
>> dev->bus->pm->runtime_idle()
>> spin_lock_irq()
>>  spin_unlock_irq()
>> 
>> To fix, use the save/restore versions of the spinlock API.
>>
> Similar thing was thought when this problem was encountered but 
> there was a concern that it may lead to interrupts are being disable
> for longer times

why?

if interrupts are enabled when this API is used, this patch doesn't
change the amount of time interrupts are disabled.

if interrupts are already disabled, then you *want* interrupts to be
disabled for the entire time.

what exactly is the concern?

Kevin

> Are all run time PM APIs are short enough to keep interrupts
> disabled without impacting interrupt latency?
>  
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: Issues with calling pm_runtime functions in platform_pm_suspend_noirq/IRQ disabled context.

2010-08-09 Thread Shilimkar, Santosh
> -Original Message-
> From: linux-omap-ow...@vger.kernel.org [mailto:linux-omap-
> ow...@vger.kernel.org] On Behalf Of Kevin Hilman
> Sent: Monday, August 09, 2010 9:01 PM
> To: Basak, Partha
> Cc: Paul Walmsley; linux-omap@vger.kernel.org; Kalliguddi, Hema; Raja,
> Govindraj; Varadarajan, Charulatha
> Subject: Re: Issues with calling pm_runtime functions in
> platform_pm_suspend_noirq/IRQ disabled context.
> 
> "Basak, Partha"  writes:
> 
> >> Finally, we have omap_sram_idle():
> >>
> >> > In particular, for USB, while executing SRAM_Idle, is we use
> pm_runtime
> >> > functions, we see spurious IO-Pad interrupts.
> >>
> >> Your message doesn't specify what PM runtime functions you are
> executing.
> >> But if those functions are calling mutex_lock(), then they obviously
> must
> >> not be called from omap_sram_idle() in interrupt context.
> >>
> >> It's not clear from your message why you need to call PM runtime
> functions
> >> from the idle loop.  I'd suggest that you post the problematic code in
> >> question to the list.
> >>
> >
> > USB issue:
> >
> > Please refer to the USB patch attached (will be sent to the list as well
> in a few minutes)
> > As I mentioned previously, few drivers like GPIO, USB & UART have clock-
> disable/enable + context save/restore in Idle path(omap_sram_idle()).
> >
> > In particular, I am referring to calling of the functions
> pm_runtime_put_sync() & pm_runtime_get_sync() for USB around wfi.
> >
> > Now, the following call sequence from omap_sram_idle()will enable IRQs:
> > pm_runtime_put_sync-->
> > __pm_runtime_put-->
> > pm_runtime_idle-->
> > spin_unlock_irq(),
> > __pm_runtime_idle(),-->
> >  spin_unlock_irq,
> > spin_unlock_irq
> >
> > The functions pm_runtime_idle() & __ pm_runtime_idle() do not use
> > spin_lock_irqsave & spin_unlock_irqrestore. This leads to enabling of
> > the interrupts in omap_sram_idle unconditionally, which for USB in
> > particular is leading to IO-pad interrupt being triggered which does
> > not come otherwise.
> 
> You seem to have correctly identified the problem.  Can you try the
> patch below (untested) which attempts to address the root cause you've
> identified?
> 
> Kevin
> 
> > To work around this problem, instead of using Runtime APIs, we are using
> omap_device_enable/disable, which in-turn call to __omap_hwmod_enable/idle
> >
> > Simply put, I am not talking about grabbing a mutex when interrupts are
> disabled, rather of a scenario where interrupts are getting enabled as a
> side-effect of calling these functions in   omap_sram_idle().
> 
> From be3aeea5f8d29c8ce2fa278f48aef849eb682282 Mon Sep 17 00:00:00 2001
> From: Kevin Hilman 
> Date: Mon, 9 Aug 2010 08:12:39 -0700
> Subject: [PATCH] PM: allow runtime PM get/put from interrupts-disabled
> context
> 
> When using runtime PM in combination with CPUidle, the runtime PM
> transtions of some devices may be triggered during the idle path.
> Late in the idle sequence, interrupts may already be disabled when
> runtime PM for these devices is initiated (using the _sync versions of
> the runtime PM API.)
> 
> Currently, the runtime PM core assumes methods are called with
> interrupts enabled.  However, if it is called with interrupts
> disabled, the internal locking unconditionally enables interrupts, for
> example:
> 
> pm_runtime_put_sync()
> __pm_runtime_put()
> pm_runtime_idle()
> spin_lock_irq()
> __pm_runtime_idle()
> spin_unlock_irq()   <--- interrupts unconditionally
> enabled
> dev->bus->pm->runtime_idle()
> spin_lock_irq()
>  spin_unlock_irq()
> 
> To fix, use the save/restore versions of the spinlock API.
>
Similar thing was thought when this problem was encountered but 
there was a concern that it may lead to interrupts are being disable
for longer times

Are all run time PM APIs are short enough to keep interrupts
disabled without impacting interrupt latency?
 
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Issues with calling pm_runtime functions in platform_pm_suspend_noirq/IRQ disabled context.

2010-08-09 Thread Kevin Hilman
"Basak, Partha"  writes:

>> Finally, we have omap_sram_idle():
>> 
>> > In particular, for USB, while executing SRAM_Idle, is we use pm_runtime
>> > functions, we see spurious IO-Pad interrupts.
>> 
>> Your message doesn't specify what PM runtime functions you are executing.
>> But if those functions are calling mutex_lock(), then they obviously must
>> not be called from omap_sram_idle() in interrupt context.
>> 
>> It's not clear from your message why you need to call PM runtime functions
>> from the idle loop.  I'd suggest that you post the problematic code in
>> question to the list.
>>
>
> USB issue:
>
> Please refer to the USB patch attached (will be sent to the list as well in a 
> few minutes)
> As I mentioned previously, few drivers like GPIO, USB & UART have clock- 
> disable/enable + context save/restore in Idle path(omap_sram_idle()).
>
> In particular, I am referring to calling of the functions 
> pm_runtime_put_sync() & pm_runtime_get_sync() for USB around wfi.
>
> Now, the following call sequence from omap_sram_idle()will enable IRQs: 
> pm_runtime_put_sync-->
>   __pm_runtime_put-->
>   pm_runtime_idle-->
>   spin_unlock_irq(),
>   __pm_runtime_idle(),-->
>spin_unlock_irq,
>   spin_unlock_irq 
>
> The functions pm_runtime_idle() & __ pm_runtime_idle() do not use
> spin_lock_irqsave & spin_unlock_irqrestore. This leads to enabling of
> the interrupts in omap_sram_idle unconditionally, which for USB in
> particular is leading to IO-pad interrupt being triggered which does
> not come otherwise.

You seem to have correctly identified the problem.  Can you try the
patch below (untested) which attempts to address the root cause you've
identified?

Kevin

> To work around this problem, instead of using Runtime APIs, we are using 
> omap_device_enable/disable, which in-turn call to __omap_hwmod_enable/idle
>
> Simply put, I am not talking about grabbing a mutex when interrupts are 
> disabled, rather of a scenario where interrupts are getting enabled as a 
> side-effect of calling these functions in   omap_sram_idle().

>From be3aeea5f8d29c8ce2fa278f48aef849eb682282 Mon Sep 17 00:00:00 2001
From: Kevin Hilman 
Date: Mon, 9 Aug 2010 08:12:39 -0700
Subject: [PATCH] PM: allow runtime PM get/put from interrupts-disabled context

When using runtime PM in combination with CPUidle, the runtime PM
transtions of some devices may be triggered during the idle path.
Late in the idle sequence, interrupts may already be disabled when
runtime PM for these devices is initiated (using the _sync versions of
the runtime PM API.)

Currently, the runtime PM core assumes methods are called with
interrupts enabled.  However, if it is called with interrupts
disabled, the internal locking unconditionally enables interrupts, for
example:

pm_runtime_put_sync()
__pm_runtime_put()
pm_runtime_idle()
spin_lock_irq()
__pm_runtime_idle()
spin_unlock_irq()   <--- interrupts unconditionally enabled
dev->bus->pm->runtime_idle()
spin_lock_irq()
 spin_unlock_irq()

To fix, use the save/restore versions of the spinlock API.

Reported-by: Partha Basak 
Signed-off-by: Kevin Hilman 
---
 drivers/base/power/runtime.c |   68 ++---
 include/linux/pm.h   |1 +
 2 files changed, 37 insertions(+), 32 deletions(-)

diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
index b0ec0e9..b02ef35 100644
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -80,24 +80,24 @@ static int __pm_runtime_idle(struct device *dev)
dev->power.idle_notification = true;
 
if (dev->bus && dev->bus->pm && dev->bus->pm->runtime_idle) {
-   spin_unlock_irq(&dev->power.lock);
+   spin_unlock_irqrestore(&dev->power.lock, dev->power.lock_flags);
 
dev->bus->pm->runtime_idle(dev);
 
-   spin_lock_irq(&dev->power.lock);
+   spin_lock_irqsave(&dev->power.lock, dev->power.lock_flags);
} else if (dev->type && dev->type->pm && dev->type->pm->runtime_idle) {
-   spin_unlock_irq(&dev->power.lock);
+   spin_unlock_irqrestore(&dev->power.lock, dev->power.lock_flags);
 
dev->type->pm->runtime_idle(dev);
 
-   spin_lock_irq(&dev->power.lock);
+   spin_lock_irqsave(&dev->power.lock, dev->power.lock_flags);
} else if (dev->class && dev->class->pm
&& dev->class->pm->runtime_idle) {
-   spin_unlock_irq(&dev->power.lock);
+   spin_unlock_irqrestore(&dev->power.lock, dev->power.lock_flags);
 
dev->class->pm->runtime_idle(dev);
 
-   spin_lock_irq(&dev->power.lock);
+   spin_lock_irqsave(&dev->power.lock, dev->power.lock_flags);
}
 
dev->power.idle_notificat

RE: Issues with calling pm_runtime functions in platform_pm_suspend_noirq/IRQ disabled context.

2010-08-09 Thread Basak, Partha
> Finally, we have omap_sram_idle():
> 
> > In particular, for USB, while executing SRAM_Idle, is we use pm_runtime
> > functions, we see spurious IO-Pad interrupts.
> 
> Your message doesn't specify what PM runtime functions you are executing.
> But if those functions are calling mutex_lock(), then they obviously must
> not be called from omap_sram_idle() in interrupt context.
> 
> It's not clear from your message why you need to call PM runtime functions
> from the idle loop.  I'd suggest that you post the problematic code in
> question to the list.
>

USB issue:

Please refer to the USB patch attached (will be sent to the list as well in a 
few minutes)
As I mentioned previously, few drivers like GPIO, USB & UART have clock- 
disable/enable + context save/restore in Idle path(omap_sram_idle()).

In particular, I am referring to calling of the functions pm_runtime_put_sync() 
& pm_runtime_get_sync() for USB around wfi.

Now, the following call sequence from omap_sram_idle()will enable IRQs: 
pm_runtime_put_sync-->
__pm_runtime_put-->
pm_runtime_idle-->
spin_unlock_irq(),
__pm_runtime_idle(),-->
 spin_unlock_irq,
spin_unlock_irq 

The functions pm_runtime_idle() & __ pm_runtime_idle() do not use 
spin_lock_irqsave & spin_unlock_irqrestore. This leads to enabling of the 
interrupts in omap_sram_idle unconditionally, which for USB in particular is 
leading to IO-pad interrupt being triggered which does not come otherwise.

To work around this problem, instead of using Runtime APIs, we are using 
omap_device_enable/disable, which in-turn call to __omap_hwmod_enable/idle

Simply put, I am not talking about grabbing a mutex when interrupts are 
disabled, rather of a scenario where interrupts are getting enabled as a 
side-effect of calling these functions in   omap_sram_idle().
> 
> - Paul


0008-runtime-pm-changes.patch
Description: 0008-runtime-pm-changes.patch


RE: Issues with calling pm_runtime functions in platform_pm_suspend_noirq/IRQ disabled context.

2010-08-09 Thread Paul Walmsley
On Sat, 7 Aug 2010, Basak, Partha wrote:

> > -Original Message-
> > From: Paul Walmsley [mailto:p...@pwsan.com]
> > Sent: Saturday, August 07, 2010 1:00 AM
> > 
> > On Tue, 3 Aug 2010, Basak, Partha wrote:
> > 
> > > I believe, it is not correct to call the pm_runtime APIs from interrupt
> > > locked context
> > 
> > Why do you think that the PM runtime APIs are being called from interrupt
> > context?  Could you point out the call chain that you're seeing?
> >
> I meant that they should not not be called in a context which has 
> interrupts disabled. SRAM_Idle as well as suspend_no_irq are the 
> instances I was referring to.

It would be really helpful if you would use the exact names of the 
functions that you are referring to. This message assumes that your use of 
"SRAM_Idle" refers to omap_sram_idle() in pm34xx.c, and that 
"suspend_no_irq" refers to platform_pm_suspend_noirq() in pm_bus.c.

You write that "[the PM runtime functions] should not not be called in a 
context which has interrupts disabled."  All interrupts?  Some interrupts?  
"Context" in the general sense, or in the strict sense used in Linux 
interrupt handling?  It is difficult to comment on this statement since it 
is unclear.  I will guess that your message refers to the fact that some
of the PM runtime functions take a mutex, and therefore it is invalid to 
call those functions when the timer interrupt is disabled.

If that's what you mean, then it's worth observing that it's valid to call 
some PM runtime functions, such as pm_runtime_get_noresume(), 
pm_runtime_suspended(), etc., no matter which interrupts are enabled, 
since they don't take any mutex.  Similarly, it seems quite valid to call 
pretty much any PM runtime function as long as the timer interrupt is 
still running.  This is the case with platform_pm_suspend_noirq(), at 
least, with this call chain:

dpm_suspend_noirq() ->
  device_suspend_noirq() ->
pm_irq_op() ->
  dev_pm_ops.suspend_noirq ->
 platform_pm_suspend_noirq()

Unfortunately, your message didn't provide a call chain, so, not sure if 
we're even referring to the same code path.  Based on this chain, I don't 
see any interrupt-related problems with calling PM runtime functions from 
platform_pm_suspend_noirq().  If you are actually seeing some problem 
here, then you should post the warning that you're seeing and the call 
chain that's causing the problem.

Finally, we have omap_sram_idle():

> In particular, for USB, while executing SRAM_Idle, is we use pm_runtime 
> functions, we see spurious IO-Pad interrupts.

Your message doesn't specify what PM runtime functions you are executing.  
But if those functions are calling mutex_lock(), then they obviously must 
not be called from omap_sram_idle() in interrupt context.

It's not clear from your message why you need to call PM runtime functions 
from the idle loop.  I'd suggest that you post the problematic code in 
question to the list.


- Paul
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: Issues with calling pm_runtime functions in platform_pm_suspend_noirq/IRQ disabled context.

2010-08-06 Thread Basak, Partha


> -Original Message-
> From: Kevin Hilman [mailto:khil...@deeprootsystems.com]
> Sent: Thursday, August 05, 2010 3:17 AM
> To: Basak, Partha
> Cc: Paul Walmsley; linux-omap@vger.kernel.org; Kalliguddi, Hema; Raja,
> Govindraj; Varadarajan, Charulatha; Nayak, Rajendra; Cousson, Benoit
> Subject: Re: Issues with calling pm_runtime functions in
> platform_pm_suspend_noirq/IRQ disabled context.
> 
> "Basak, Partha"  writes:
> 
> >> -Original Message-
> >> From: Kevin Hilman [mailto:khil...@deeprootsystems.com]
> >> Sent: Tuesday, August 03, 2010 11:06 PM
> >> To: Basak, Partha
> >> Cc: Paul Walmsley; linux-omap@vger.kernel.org; Kalliguddi, Hema; Raja,
> >> Govindraj; Varadarajan, Charulatha; Nayak, Rajendra; Cousson, Benoit
> >> Subject: Re: Issues with calling pm_runtime functions in
> >> platform_pm_suspend_noirq/IRQ disabled context.
> >>
> >> "Basak, Partha"  writes:
> >>
> >> > Resending with the corrected mailing list
> >> >
> >> > Hello Kevin,
> >> >
> >> > I want to draw your attention to an issue the following patch
> >> > introduces. http://dev.omapzoom.org/?p=swarch/linux-omap-
> >> adv.git;a=commitdiff;h=8041359e18e49bf8a3d41f15894db9e476f3a8fc
> >>
> >> [...]
> >>
> >> > I believe, it is not correct to call the pm_runtime APIs from
> >> > interrupt locked context since the underlying functions
> >> > __pm_runtime_suspend & __pm_runtime_resume enable interrupts and call
> >> > schedule().
> >> >
> >> > Further, from a s/w layering standpoint, platform-bus is a deeper
> >> > layer than the Runtime layer, which the runtime layer calls into. So
> >> > it may not be correct to have this layer calling into pm_runtime().
> It
> >> > should directly call omap_device APIs.
> >>
> >> The original version of this patch did directly call the omap_device
> >> APIs.  However, Paul didn't like that approach and there were
> >> use-counting problems to handle as well (e.g. if a driver was not (yet)
> >> in use, it would already be disabled, and a suspend would trigger an
> >> omap_device_disable() which would fail since device was already
> >> disabled.)
> >>
> >> Paul and I agreed that this current approach would solve the
> >> use-counting problems, but you're correct in that it introduces
> >> new problems as these functions can _possibly_ sleep/schedule.
> >>
> >> That being said, have you actually seen problems here?   I would like
> >> to see how they are triggered?
> >
> > Scenario 1:
> >
> > Consider the case where a driver has already called
> > pm_runtime_put_sync as part of aggressive clock cutting
> > scheme. Subsequently, when system suspend is called,
> > pm_runtime_put_sync is called again.
> 
> > Due to atomic_dec_and_test check
> > on usage_count, the second call goes through w/o error.
> 
> I don't think you're fully understanding the point of the put/get in the
> suspend/resume path.
> 
> The reason for the 'put' in the suspend path is because the PM core has
> done a 'get' in dpm_prepare() [c.f. drivers/base/power/main.c] so that
> runtime PM transitions are prevented during suspend/resume.
> 
> On OMAP, we want to allow those transitions to happen so we can use the
> same runtime PM calls in the idle and suspend paths.
> 
> > At System resume, pm_runtime_get_sync is called twice, once by resume,
> > once by the driver itself.
> 
> Yes, but there is a 'put_sync' in between those done by the PM core
> during resume (c.f. dpm_complete() in drivers/base/power/main.c]
> 
> > Because of the extra reference count, the
> > aggressive clock control by the driver is broken, as call to put_sync
> > by a driver reduces to usage_count to 1. As a result clock transition
> > in Idle-path is broken.
> 
I understand the rationale better, but having these changes is making the next 
Idle call after a suspend-resume cycle to fail. I will continue debugging on 
this and come back with more details.

> 
> > Scenario2:
> >
> > Tested GPIO for Suspend with these changes & obtained dump of Circular
> Locking dependencies:
> 
> I don't think these to problems are related, AFAICT, they are separate
> issues.  I'll respond to this scenario in a different reply.
> 
> >>
> >> [...]
> >>
> >> The UART driver is a special case as it is mana

RE: Issues with calling pm_runtime functions in platform_pm_suspend_noirq/IRQ disabled context.

2010-08-06 Thread Basak, Partha


> -Original Message-
> From: Kevin Hilman [mailto:khil...@deeprootsystems.com]
> Sent: Thursday, August 05, 2010 4:51 AM
> To: Basak, Partha
> Cc: Paul Walmsley; linux-omap@vger.kernel.org; Kalliguddi, Hema; Raja,
> Govindraj; Varadarajan, Charulatha; Nayak, Rajendra; Cousson, Benoit
> Subject: Re: Issues with calling pm_runtime functions in
> platform_pm_suspend_noirq/IRQ disabled context.
> 
> Kevin Hilman  writes:
> 
> > Kevin Hilman  writes:
> >
> >> "Basak, Partha"  writes:
> >>
> >>>> -Original Message-
> >>>> From: Kevin Hilman [mailto:khil...@deeprootsystems.com]
> >>>> Sent: Tuesday, August 03, 2010 11:06 PM
> >>>> To: Basak, Partha
> >>>> Cc: Paul Walmsley; linux-omap@vger.kernel.org; Kalliguddi, Hema;
> Raja,
> >>>> Govindraj; Varadarajan, Charulatha; Nayak, Rajendra; Cousson, Benoit
> >>>> Subject: Re: Issues with calling pm_runtime functions in
> >>>> platform_pm_suspend_noirq/IRQ disabled context.
> >>>>
> >>>> "Basak, Partha"  writes:
> >>>>
> >>>> > Resending with the corrected mailing list
> >>>> >
> >>>> > Hello Kevin,
> >>>> >
> >>>> > I want to draw your attention to an issue the following patch
> >>>> > introduces. http://dev.omapzoom.org/?p=swarch/linux-omap-
> >>>> adv.git;a=commitdiff;h=8041359e18e49bf8a3d41f15894db9e476f3a8fc
> >>>>
> >>>> [...]
> >>>>
> >>>> > I believe, it is not correct to call the pm_runtime APIs from
> >>>> > interrupt locked context since the underlying functions
> >>>> > __pm_runtime_suspend & __pm_runtime_resume enable interrupts and
> call
> >>>> > schedule().
> >>>> >
> >>>> > Further, from a s/w layering standpoint, platform-bus is a deeper
> >>>> > layer than the Runtime layer, which the runtime layer calls into.
> So
> >>>> > it may not be correct to have this layer calling into pm_runtime().
> It
> >>>> > should directly call omap_device APIs.
> >>>>
> >>>> The original version of this patch did directly call the omap_device
> >>>> APIs.  However, Paul didn't like that approach and there were
> >>>> use-counting problems to handle as well (e.g. if a driver was not
> (yet)
> >>>> in use, it would already be disabled, and a suspend would trigger an
> >>>> omap_device_disable() which would fail since device was already
> >>>> disabled.)
> >>>>
> >>>> Paul and I agreed that this current approach would solve the
> >>>> use-counting problems, but you're correct in that it introduces
> >>>> new problems as these functions can _possibly_ sleep/schedule.
> >>>>
> >>>> That being said, have you actually seen problems here?   I would like
> >>>> to see how they are triggered?
> >>>
> >>> Scenario 1:
> >>>
> >>> Consider the case where a driver has already called
> >>> pm_runtime_put_sync as part of aggressive clock cutting
> >>> scheme. Subsequently, when system suspend is called,
> >>> pm_runtime_put_sync is called again.
> >>
> >>> Due to atomic_dec_and_test check
> >>> on usage_count, the second call goes through w/o error.
> >>
> >> I don't think you're fully understanding the point of the put/get in
> the
> >> suspend/resume path.
> >>
> >> The reason for the 'put' in the suspend path is because the PM core has
> >> done a 'get' in dpm_prepare() [c.f. drivers/base/power/main.c] so that
> >> runtime PM transitions are prevented during suspend/resume.
> >>
> >> On OMAP, we want to allow those transitions to happen so we can use the
> >> same runtime PM calls in the idle and suspend paths.
> >>
> >>> At System resume, pm_runtime_get_sync is called twice, once by resume,
> >>> once by the driver itself.
> >>
> >> Yes, but there is a 'put_sync' in between those done by the PM core
> >> during resume (c.f. dpm_complete() in drivers/base/power/main.c]
> >>
> >>> Because of the extra reference count, the
> >>> aggressive clock control by the driver is broken, as call to put_sync
> >>> by a driver

RE: Issues with calling pm_runtime functions in platform_pm_suspend_noirq/IRQ disabled context.

2010-08-06 Thread Basak, Partha


> -Original Message-
> From: Kevin Hilman [mailto:khil...@deeprootsystems.com]
> Sent: Thursday, August 05, 2010 5:05 AM
> To: Basak, Partha
> Cc: Paul Walmsley; linux-omap@vger.kernel.org; Kalliguddi, Hema; Raja,
> Govindraj; Varadarajan, Charulatha; Nayak, Rajendra; Cousson, Benoit
> Subject: Re: Issues with calling pm_runtime functions in
> platform_pm_suspend_noirq/IRQ disabled context.
> 
> "Basak, Partha"  writes:
> 
> >
> > Tested GPIO for Suspend with these changes & obtained dump of Circular
> Locking dependencies:
> >
> 
> It would *really* help to have these GPIO conversion patches posted
> against a public tree/branch one could see exactly what is going on and
> be able to reproduce this problem for easier analysis.  I have some
> hunches as to what is going wrong here, but cannot be sure.  I'd much
> rather be able to see and try the code.
> 
> Fortunately, lockdep is verbose enough to try and understand the
> symptoms here.  Lockdep seems to have detected that these two locks have
> been acquired in different order, resulting in *potential* deadlock.
> 
> Indeed, during init (#0 below) you have the hwmod_mutex acquired
> (hwmod_for_each_by_class()) then the dpm_list_mtx acquired
> (device_pm_add()).  Later, during suspend the dpm_list_mtx is aquired
> first (dpm_suspend_noirq()), then the omap_hwmod_mutex is acquired
> (omap_hwmod_idle()).
> 
> Taking the same locks in different order at different times is the
> circular dependency and a recipe for potential deadlock.
> 
> In our case, there is no real potential for deadlock here as the locks
> are only taken the hwmod->dpm order once during init, all other times
> (when the omap_device/hwmod are ready) it will happen in the dpm->hwmod
> order.
> 
> The simple fix here is probably to remove the locking in the
> omap_hwmod_for_each* functions.  Could you try that?
> 
We tried this, it works. But the GPIO patch series sent out(posted today, Aug 
06 2010) are tested based on reverting the pm_bus.c suspend_no_irq patch. 

> I'm a bit confused why I don't see the same problem with the UART layer
> which currently also handles things in the idle path as well.
> 
> Kevin
> 
> > # echo mem > /sys/power/state
> > [  809.981658] PM: Syncing filesystems ... done.
> > [  810.037963] Freezing user space processes ... (elapsed 0.02 seconds)
> done.
> > [  810.067901] Freezing remaining freezable tasks ... (elapsed 0.02
> seconds) done.
> > [  810.105224] Suspending console(s) (use no_console_suspend to debug)
> > [  810.166748] PM: suspend of devices complete after 46.142 msecs
> > [  810.170562]
> > [  810.170593] ===
> > [  810.170593] [ INFO: possible circular locking dependency detected ]
> > [  810.170623] 2.6.35-rc5-00131-g56e767c-dirty #34
> > [  810.170654] ---
> > [  810.170654] sh/670 is trying to acquire lock:
> > [  810.170684]  (omap_hwmod_mutex){+.+.+.}, at: []
> omap_hwmod_idle+0x1c/0x38
> > [  810.170745]
> > [  810.170745] but task is already holding lock:
> > [  810.170776]  (dpm_list_mtx){+.+...}, at: []
> dpm_suspend_noirq+0x28/0x188
> > [  810.170806]
> > [  810.170837] which lock already depends on the new lock.
> > [  810.170837]
> > [  810.170837]
> > [  810.170837] the existing dependency chain (in reverse order) is:
> > [  810.170867]
> > [  810.170867] -> #1 (dpm_list_mtx){+.+...}:
> > [  810.170898][] lock_acquire+0x60/0x74
> > [  810.170959][] mutex_lock_nested+0x58/0x2e4
> > [  810.170989][] device_pm_add+0x14/0xcc
> > [  810.171020][] device_add+0x3b8/0x564
> > [  810.171051][] platform_device_add+0x104/0x160
> > [  810.171112][] omap_device_build_ss+0x14c/0x1c8
> > [  810.171142][] omap_device_build+0x48/0x50
> > [  810.171173][] omap2_init_gpio+0xf0/0x15c
> > [  810.171203][]
> omap_hwmod_for_each_by_class+0x60/0xa4
> > [  810.171264][] do_one_initcall+0x58/0x1b4
> > [  810.171295][] kernel_init+0x98/0x150
> > [  810.171325][] kernel_thread_exit+0x0/0x8
> > [  810.171356]
> > [  810.171356] -> #0 (omap_hwmod_mutex){+.+.+.}:
> > [  810.171386][] __lock_acquire+0x108c/0x1784
> > [  810.171447][] lock_acquire+0x60/0x74
> > [  810.171478][] mutex_lock_nested+0x58/0x2e4
> > [  810.171508][] omap_hwmod_idle+0x1c/0x38
> > [  810.171539][] omap_device_idle_hwmods+0x20/0x3c
> > [  810.171600][] _o

Re: Issues with calling pm_runtime functions in platform_pm_suspend_noirq/IRQ disabled context.

2010-08-04 Thread Kevin Hilman
"Basak, Partha"  writes:

>
> Tested GPIO for Suspend with these changes & obtained dump of Circular 
> Locking dependencies:
>

It would *really* help to have these GPIO conversion patches posted
against a public tree/branch one could see exactly what is going on and
be able to reproduce this problem for easier analysis.  I have some
hunches as to what is going wrong here, but cannot be sure.  I'd much
rather be able to see and try the code.

Fortunately, lockdep is verbose enough to try and understand the
symptoms here.  Lockdep seems to have detected that these two locks have
been acquired in different order, resulting in *potential* deadlock.

Indeed, during init (#0 below) you have the hwmod_mutex acquired
(hwmod_for_each_by_class()) then the dpm_list_mtx acquired
(device_pm_add()).  Later, during suspend the dpm_list_mtx is aquired
first (dpm_suspend_noirq()), then the omap_hwmod_mutex is acquired
(omap_hwmod_idle()).

Taking the same locks in different order at different times is the
circular dependency and a recipe for potential deadlock.

In our case, there is no real potential for deadlock here as the locks
are only taken the hwmod->dpm order once during init, all other times
(when the omap_device/hwmod are ready) it will happen in the dpm->hwmod
order.

The simple fix here is probably to remove the locking in the
omap_hwmod_for_each* functions.  Could you try that?

I'm a bit confused why I don't see the same problem with the UART layer
which currently also handles things in the idle path as well.

Kevin

> # echo mem > /sys/power/state
> [  809.981658] PM: Syncing filesystems ... done.
> [  810.037963] Freezing user space processes ... (elapsed 0.02 seconds) done.
> [  810.067901] Freezing remaining freezable tasks ... (elapsed 0.02 seconds) 
> done.
> [  810.105224] Suspending console(s) (use no_console_suspend to debug)
> [  810.166748] PM: suspend of devices complete after 46.142 msecs
> [  810.170562]
> [  810.170593] ===
> [  810.170593] [ INFO: possible circular locking dependency detected ]
> [  810.170623] 2.6.35-rc5-00131-g56e767c-dirty #34
> [  810.170654] ---
> [  810.170654] sh/670 is trying to acquire lock:
> [  810.170684]  (omap_hwmod_mutex){+.+.+.}, at: [] 
> omap_hwmod_idle+0x1c/0x38
> [  810.170745]
> [  810.170745] but task is already holding lock:
> [  810.170776]  (dpm_list_mtx){+.+...}, at: [] 
> dpm_suspend_noirq+0x28/0x188
> [  810.170806]
> [  810.170837] which lock already depends on the new lock.
> [  810.170837]
> [  810.170837]
> [  810.170837] the existing dependency chain (in reverse order) is:
> [  810.170867]
> [  810.170867] -> #1 (dpm_list_mtx){+.+...}:
> [  810.170898][] lock_acquire+0x60/0x74
> [  810.170959][] mutex_lock_nested+0x58/0x2e4
> [  810.170989][] device_pm_add+0x14/0xcc
> [  810.171020][] device_add+0x3b8/0x564
> [  810.171051][] platform_device_add+0x104/0x160
> [  810.171112][] omap_device_build_ss+0x14c/0x1c8
> [  810.171142][] omap_device_build+0x48/0x50
> [  810.171173][] omap2_init_gpio+0xf0/0x15c
> [  810.171203][] omap_hwmod_for_each_by_class+0x60/0xa4
> [  810.171264][] do_one_initcall+0x58/0x1b4
> [  810.171295][] kernel_init+0x98/0x150
> [  810.171325][] kernel_thread_exit+0x0/0x8
> [  810.171356]
> [  810.171356] -> #0 (omap_hwmod_mutex){+.+.+.}:
> [  810.171386][] __lock_acquire+0x108c/0x1784
> [  810.171447][] lock_acquire+0x60/0x74
> [  810.171478][] mutex_lock_nested+0x58/0x2e4
> [  810.171508][] omap_hwmod_idle+0x1c/0x38
> [  810.171539][] omap_device_idle_hwmods+0x20/0x3c
> [  810.171600][] _omap_device_deactivate+0x58/0x14c
> [  810.171630][] omap_device_idle+0x4c/0x6c
> [  810.171661][] platform_pm_runtime_suspend+0x4c/0x74
> [  810.171691][] __pm_runtime_suspend+0x204/0x34c
> [  810.171722][] pm_runtime_suspend+0x20/0x34
> [  810.171752][] platform_pm_runtime_idle+0x8/0x10
> [  810.171783][] __pm_runtime_idle+0x15c/0x198
> [  810.171813][] pm_runtime_idle+0x1c/0x30
> [  810.171844][] platform_pm_suspend_noirq+0x48/0x50
> [  810.171875][] pm_noirq_op+0xa0/0x184
> [  810.171905][] dpm_suspend_noirq+0xac/0x188
> [  810.171936][] suspend_devices_and_enter+0x94/0x1d8
> [  810.171966][] enter_state+0xbc/0x120
> [  810.171997][] state_store+0xa4/0xb8
> [  810.172027][] kobj_attr_store+0x18/0x1c
> [  810.172088][] sysfs_write_file+0x10c/0x144
> [  810.172119][] vfs_write+0xb0/0x148
> [  810.172149][] sys_write+0x3c/0x68
> [  810.172180][] ret_fast_syscall+0x0/0x3c
> [  810.172210]
> [  810.172210] other info that might help us debug this:
> [  810.172210]
> [  810.172241] 4 locks held by sh/670:
> [  810.172241]  #0:  (&buffer->mutex){+.+.+.}, at: [

Re: Issues with calling pm_runtime functions in platform_pm_suspend_noirq/IRQ disabled context.

2010-08-04 Thread Kevin Hilman
Kevin Hilman  writes:

> Kevin Hilman  writes:
>
>> "Basak, Partha"  writes:
>>
>>>> -Original Message-
>>>> From: Kevin Hilman [mailto:khil...@deeprootsystems.com]
>>>> Sent: Tuesday, August 03, 2010 11:06 PM
>>>> To: Basak, Partha
>>>> Cc: Paul Walmsley; linux-omap@vger.kernel.org; Kalliguddi, Hema; Raja,
>>>> Govindraj; Varadarajan, Charulatha; Nayak, Rajendra; Cousson, Benoit
>>>> Subject: Re: Issues with calling pm_runtime functions in
>>>> platform_pm_suspend_noirq/IRQ disabled context.
>>>> 
>>>> "Basak, Partha"  writes:
>>>> 
>>>> > Resending with the corrected mailing list
>>>> >
>>>> > Hello Kevin,
>>>> >
>>>> > I want to draw your attention to an issue the following patch
>>>> > introduces. http://dev.omapzoom.org/?p=swarch/linux-omap-
>>>> adv.git;a=commitdiff;h=8041359e18e49bf8a3d41f15894db9e476f3a8fc
>>>> 
>>>> [...]
>>>> 
>>>> > I believe, it is not correct to call the pm_runtime APIs from
>>>> > interrupt locked context since the underlying functions
>>>> > __pm_runtime_suspend & __pm_runtime_resume enable interrupts and call
>>>> > schedule().
>>>> >
>>>> > Further, from a s/w layering standpoint, platform-bus is a deeper
>>>> > layer than the Runtime layer, which the runtime layer calls into. So
>>>> > it may not be correct to have this layer calling into pm_runtime(). It
>>>> > should directly call omap_device APIs.
>>>> 
>>>> The original version of this patch did directly call the omap_device
>>>> APIs.  However, Paul didn't like that approach and there were
>>>> use-counting problems to handle as well (e.g. if a driver was not (yet)
>>>> in use, it would already be disabled, and a suspend would trigger an
>>>> omap_device_disable() which would fail since device was already
>>>> disabled.)
>>>> 
>>>> Paul and I agreed that this current approach would solve the
>>>> use-counting problems, but you're correct in that it introduces
>>>> new problems as these functions can _possibly_ sleep/schedule.
>>>> 
>>>> That being said, have you actually seen problems here?   I would like
>>>> to see how they are triggered?
>>>
>>> Scenario 1:
>>>
>>> Consider the case where a driver has already called
>>> pm_runtime_put_sync as part of aggressive clock cutting
>>> scheme. Subsequently, when system suspend is called,
>>> pm_runtime_put_sync is called again. 
>>
>>> Due to atomic_dec_and_test check
>>> on usage_count, the second call goes through w/o error.
>>
>> I don't think you're fully understanding the point of the put/get in the
>> suspend/resume path.
>>
>> The reason for the 'put' in the suspend path is because the PM core has
>> done a 'get' in dpm_prepare() [c.f. drivers/base/power/main.c] so that
>> runtime PM transitions are prevented during suspend/resume.
>>
>> On OMAP, we want to allow those transitions to happen so we can use the
>> same runtime PM calls in the idle and suspend paths.
>>
>>> At System resume, pm_runtime_get_sync is called twice, once by resume,
>>> once by the driver itself. 
>>
>> Yes, but there is a 'put_sync' in between those done by the PM core
>> during resume (c.f. dpm_complete() in drivers/base/power/main.c]
>>
>>> Because of the extra reference count, the
>>> aggressive clock control by the driver is broken, as call to put_sync
>>> by a driver reduces to usage_count to 1. As a result clock transition
>>> in Idle-path is broken.
>
> A closer look here, and there is indeed a bug in the _resume_noirq()
> method.  The get here needs to be a _noresume version to match what is
> done in the PM core.
>
> This doesn't change the use count, but it does change whether the device
> is actually awoken by this 'get' or not.  This get should never actually
> awake the device.  It is only there to compensate for what the PM core
> does.
>
> Can you try this patch?  Will post a version to the list with
> changelog shortly.

After a little more thinking, I'm not sure yet if this is a real problem
or not.

One other question.  At least for GPIO testing, does it work if you
simply remove the _noirq hooks from pm_bus.c?  If so, please feel to
post a version with the dependency that the patch adding suspend/resume
hooks in pm_bus.c needs to be reverted first.

Kevin

> Kevin
>
> diff --git a/arch/arm/mach-omap2/pm_bus.c b/arch/arm/mach-omap2/pm_bus.c
> index bab0186..01bbe65 100644
> --- a/arch/arm/mach-omap2/pm_bus.c
> +++ b/arch/arm/mach-omap2/pm_bus.c
> @@ -90,7 +90,7 @@ int platform_pm_resume_noirq(struct device *dev)
>* method so that the runtime PM usage counting is in the same
>* state it was when suspend was called.
>*/
> - pm_runtime_get_sync(dev);
> + pm_runtime_get_noresume(dev);
>  
>   if (!drv)
>   return 0;
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Issues with calling pm_runtime functions in platform_pm_suspend_noirq/IRQ disabled context.

2010-08-04 Thread Kevin Hilman
Kevin Hilman  writes:

> "Basak, Partha"  writes:
>
>>> -Original Message-
>>> From: Kevin Hilman [mailto:khil...@deeprootsystems.com]
>>> Sent: Tuesday, August 03, 2010 11:06 PM
>>> To: Basak, Partha
>>> Cc: Paul Walmsley; linux-omap@vger.kernel.org; Kalliguddi, Hema; Raja,
>>> Govindraj; Varadarajan, Charulatha; Nayak, Rajendra; Cousson, Benoit
>>> Subject: Re: Issues with calling pm_runtime functions in
>>> platform_pm_suspend_noirq/IRQ disabled context.
>>> 
>>> "Basak, Partha"  writes:
>>> 
>>> > Resending with the corrected mailing list
>>> >
>>> > Hello Kevin,
>>> >
>>> > I want to draw your attention to an issue the following patch
>>> > introduces. http://dev.omapzoom.org/?p=swarch/linux-omap-
>>> adv.git;a=commitdiff;h=8041359e18e49bf8a3d41f15894db9e476f3a8fc
>>> 
>>> [...]
>>> 
>>> > I believe, it is not correct to call the pm_runtime APIs from
>>> > interrupt locked context since the underlying functions
>>> > __pm_runtime_suspend & __pm_runtime_resume enable interrupts and call
>>> > schedule().
>>> >
>>> > Further, from a s/w layering standpoint, platform-bus is a deeper
>>> > layer than the Runtime layer, which the runtime layer calls into. So
>>> > it may not be correct to have this layer calling into pm_runtime(). It
>>> > should directly call omap_device APIs.
>>> 
>>> The original version of this patch did directly call the omap_device
>>> APIs.  However, Paul didn't like that approach and there were
>>> use-counting problems to handle as well (e.g. if a driver was not (yet)
>>> in use, it would already be disabled, and a suspend would trigger an
>>> omap_device_disable() which would fail since device was already
>>> disabled.)
>>> 
>>> Paul and I agreed that this current approach would solve the
>>> use-counting problems, but you're correct in that it introduces
>>> new problems as these functions can _possibly_ sleep/schedule.
>>> 
>>> That being said, have you actually seen problems here?   I would like
>>> to see how they are triggered?
>>
>> Scenario 1:
>>
>> Consider the case where a driver has already called
>> pm_runtime_put_sync as part of aggressive clock cutting
>> scheme. Subsequently, when system suspend is called,
>> pm_runtime_put_sync is called again. 
>
>> Due to atomic_dec_and_test check
>> on usage_count, the second call goes through w/o error.
>
> I don't think you're fully understanding the point of the put/get in the
> suspend/resume path.
>
> The reason for the 'put' in the suspend path is because the PM core has
> done a 'get' in dpm_prepare() [c.f. drivers/base/power/main.c] so that
> runtime PM transitions are prevented during suspend/resume.
>
> On OMAP, we want to allow those transitions to happen so we can use the
> same runtime PM calls in the idle and suspend paths.
>
>> At System resume, pm_runtime_get_sync is called twice, once by resume,
>> once by the driver itself. 
>
> Yes, but there is a 'put_sync' in between those done by the PM core
> during resume (c.f. dpm_complete() in drivers/base/power/main.c]
>
>> Because of the extra reference count, the
>> aggressive clock control by the driver is broken, as call to put_sync
>> by a driver reduces to usage_count to 1. As a result clock transition
>> in Idle-path is broken.

A closer look here, and there is indeed a bug in the _resume_noirq()
method.  The get here needs to be a _noresume version to match what is
done in the PM core.

This doesn't change the use count, but it does change whether the device
is actually awoken by this 'get' or not.  This get should never actually
awake the device.  It is only there to compensate for what the PM core
does.

Can you try this patch?  Will post a version to the list with
changelog shortly.

Kevin

diff --git a/arch/arm/mach-omap2/pm_bus.c b/arch/arm/mach-omap2/pm_bus.c
index bab0186..01bbe65 100644
--- a/arch/arm/mach-omap2/pm_bus.c
+++ b/arch/arm/mach-omap2/pm_bus.c
@@ -90,7 +90,7 @@ int platform_pm_resume_noirq(struct device *dev)
 * method so that the runtime PM usage counting is in the same
 * state it was when suspend was called.
 */
-   pm_runtime_get_sync(dev);
+   pm_runtime_get_noresume(dev);
 
if (!drv)
return 0;
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Issues with calling pm_runtime functions in platform_pm_suspend_noirq/IRQ disabled context.

2010-08-04 Thread Kevin Hilman
"Basak, Partha"  writes:

>> -Original Message-
>> From: Kevin Hilman [mailto:khil...@deeprootsystems.com]
>> Sent: Tuesday, August 03, 2010 11:06 PM
>> To: Basak, Partha
>> Cc: Paul Walmsley; linux-omap@vger.kernel.org; Kalliguddi, Hema; Raja,
>> Govindraj; Varadarajan, Charulatha; Nayak, Rajendra; Cousson, Benoit
>> Subject: Re: Issues with calling pm_runtime functions in
>> platform_pm_suspend_noirq/IRQ disabled context.
>> 
>> "Basak, Partha"  writes:
>> 
>> > Resending with the corrected mailing list
>> >
>> > Hello Kevin,
>> >
>> > I want to draw your attention to an issue the following patch
>> > introduces. http://dev.omapzoom.org/?p=swarch/linux-omap-
>> adv.git;a=commitdiff;h=8041359e18e49bf8a3d41f15894db9e476f3a8fc
>> 
>> [...]
>> 
>> > I believe, it is not correct to call the pm_runtime APIs from
>> > interrupt locked context since the underlying functions
>> > __pm_runtime_suspend & __pm_runtime_resume enable interrupts and call
>> > schedule().
>> >
>> > Further, from a s/w layering standpoint, platform-bus is a deeper
>> > layer than the Runtime layer, which the runtime layer calls into. So
>> > it may not be correct to have this layer calling into pm_runtime(). It
>> > should directly call omap_device APIs.
>> 
>> The original version of this patch did directly call the omap_device
>> APIs.  However, Paul didn't like that approach and there were
>> use-counting problems to handle as well (e.g. if a driver was not (yet)
>> in use, it would already be disabled, and a suspend would trigger an
>> omap_device_disable() which would fail since device was already
>> disabled.)
>> 
>> Paul and I agreed that this current approach would solve the
>> use-counting problems, but you're correct in that it introduces
>> new problems as these functions can _possibly_ sleep/schedule.
>> 
>> That being said, have you actually seen problems here?   I would like
>> to see how they are triggered?
>
> Scenario 1:
>
> Consider the case where a driver has already called
> pm_runtime_put_sync as part of aggressive clock cutting
> scheme. Subsequently, when system suspend is called,
> pm_runtime_put_sync is called again. 

> Due to atomic_dec_and_test check
> on usage_count, the second call goes through w/o error.

I don't think you're fully understanding the point of the put/get in the
suspend/resume path.

The reason for the 'put' in the suspend path is because the PM core has
done a 'get' in dpm_prepare() [c.f. drivers/base/power/main.c] so that
runtime PM transitions are prevented during suspend/resume.

On OMAP, we want to allow those transitions to happen so we can use the
same runtime PM calls in the idle and suspend paths.

> At System resume, pm_runtime_get_sync is called twice, once by resume,
> once by the driver itself. 

Yes, but there is a 'put_sync' in between those done by the PM core
during resume (c.f. dpm_complete() in drivers/base/power/main.c]

> Because of the extra reference count, the
> aggressive clock control by the driver is broken, as call to put_sync
> by a driver reduces to usage_count to 1. As a result clock transition
> in Idle-path is broken.


> Scenario2:
>
> Tested GPIO for Suspend with these changes & obtained dump of Circular 
> Locking dependencies:

I don't think these to problems are related, AFAICT, they are separate
issues.  I'll respond to this scenario in a different reply.

>> 
>> [...]
>> 
>> The UART driver is a special case as it is managed from deep inside the
>> idle path.
>> 
>> > Can you feedback on my observations and comment on the approach taken
>> > for these drivers?
>> 
>> I cannot comment until I understand why these drivers need to
>> enable/disable with interrupts off.
>> 
>> In general, any use of the interrupts-off versions of the omap_device
>> APIs need to be thorougly justified.
>> 
>> Even the UART one will go away when the omap-serial driver is converted
>> to runtime PM.
>>
>
> For GPIO, it is a must to relinquish clocks in Idle-path as it is
> possible to have a GPIO bank requested and still allow PER domain to
> go to OFF.

These the kind of things that I would expect to be discussed/described
in the changelogs to patches posted to the list.

> For USB, this is required because of a h/w issue.

Again, patches with descriptive changelogs describing the "h/w issue"
please.

> My point is, just by exposing a _omap_hwmod_idle()(mutex-free version) will 
&

RE: Issues with calling pm_runtime functions in platform_pm_suspend_noirq/IRQ disabled context.

2010-08-04 Thread Basak, Partha


> -Original Message-
> From: Kevin Hilman [mailto:khil...@deeprootsystems.com]
> Sent: Tuesday, August 03, 2010 11:06 PM
> To: Basak, Partha
> Cc: Paul Walmsley; linux-omap@vger.kernel.org; Kalliguddi, Hema; Raja,
> Govindraj; Varadarajan, Charulatha; Nayak, Rajendra; Cousson, Benoit
> Subject: Re: Issues with calling pm_runtime functions in
> platform_pm_suspend_noirq/IRQ disabled context.
> 
> "Basak, Partha"  writes:
> 
> > Resending with the corrected mailing list
> >
> > Hello Kevin,
> >
> > I want to draw your attention to an issue the following patch
> > introduces. http://dev.omapzoom.org/?p=swarch/linux-omap-
> adv.git;a=commitdiff;h=8041359e18e49bf8a3d41f15894db9e476f3a8fc
> 
> [...]
> 
> > I believe, it is not correct to call the pm_runtime APIs from
> > interrupt locked context since the underlying functions
> > __pm_runtime_suspend & __pm_runtime_resume enable interrupts and call
> > schedule().
> >
> > Further, from a s/w layering standpoint, platform-bus is a deeper
> > layer than the Runtime layer, which the runtime layer calls into. So
> > it may not be correct to have this layer calling into pm_runtime(). It
> > should directly call omap_device APIs.
> 
> The original version of this patch did directly call the omap_device
> APIs.  However, Paul didn't like that approach and there were
> use-counting problems to handle as well (e.g. if a driver was not (yet)
> in use, it would already be disabled, and a suspend would trigger an
> omap_device_disable() which would fail since device was already
> disabled.)
> 
> Paul and I agreed that this current approach would solve the
> use-counting problems, but you're correct in that it introduces
> new problems as these functions can _possibly_ sleep/schedule.
> 
> That being said, have you actually seen problems here?   I would like
> to see how they are triggered?

Scenario 1:
Consider the case where a driver has already called pm_runtime_put_sync as part 
of aggressive clock cutting scheme. Subsequently, when system suspend is 
called, pm_runtime_put_sync is called again. Due to atomic_dec_and_test check 
on usage_count, the second call goes through w/o error. 

At System resume, pm_runtime_get_sync is called twice, once by resume, once by 
the driver itself. Because of the extra reference count, the aggressive clock 
control by the driver is broken, as call to put_sync by a driver reduces to 
usage_count to 1. As a result clock transition in Idle-path is broken.

Scenario2:

Tested GPIO for Suspend with these changes & obtained dump of Circular Locking 
dependencies:

# echo mem > /sys/power/state
[  809.981658] PM: Syncing filesystems ... done.
[  810.037963] Freezing user space processes ... (elapsed 0.02 seconds) done.
[  810.067901] Freezing remaining freezable tasks ... (elapsed 0.02 seconds) 
done.
[  810.105224] Suspending console(s) (use no_console_suspend to debug)
[  810.166748] PM: suspend of devices complete after 46.142 msecs
[  810.170562]
[  810.170593] ===
[  810.170593] [ INFO: possible circular locking dependency detected ]
[  810.170623] 2.6.35-rc5-00131-g56e767c-dirty #34
[  810.170654] ---
[  810.170654] sh/670 is trying to acquire lock:
[  810.170684]  (omap_hwmod_mutex){+.+.+.}, at: [] 
omap_hwmod_idle+0x1c/0x38
[  810.170745]
[  810.170745] but task is already holding lock:
[  810.170776]  (dpm_list_mtx){+.+...}, at: [] 
dpm_suspend_noirq+0x28/0x188
[  810.170806]
[  810.170837] which lock already depends on the new lock.
[  810.170837]
[  810.170837]
[  810.170837] the existing dependency chain (in reverse order) is:
[  810.170867]
[  810.170867] -> #1 (dpm_list_mtx){+.+...}:
[  810.170898][] lock_acquire+0x60/0x74
[  810.170959][] mutex_lock_nested+0x58/0x2e4
[  810.170989][] device_pm_add+0x14/0xcc
[  810.171020][] device_add+0x3b8/0x564
[  810.171051][] platform_device_add+0x104/0x160
[  810.171112][] omap_device_build_ss+0x14c/0x1c8
[  810.171142][] omap_device_build+0x48/0x50
[  810.171173][] omap2_init_gpio+0xf0/0x15c
[  810.171203][] omap_hwmod_for_each_by_class+0x60/0xa4
[  810.171264][] do_one_initcall+0x58/0x1b4
[  810.171295][] kernel_init+0x98/0x150
[  810.171325][] kernel_thread_exit+0x0/0x8
[  810.171356]
[  810.171356] -> #0 (omap_hwmod_mutex){+.+.+.}:
[  810.171386][] __lock_acquire+0x108c/0x1784
[  810.171447][] lock_acquire+0x60/0x74
[  810.171478][] mutex_lock_nested+0x58/0x2e4
[  810.171508][] omap_hwmod_idle+0x1c/0x38
[  810.171539][] omap_device_idle_hwmods+0x20/0x3c
[  810.171600][] _omap_device_deactivate+0x58/0x14c
[  810.17163

Re: Issues with calling pm_runtime functions in platform_pm_suspend_noirq/IRQ disabled context.

2010-08-03 Thread Kevin Hilman
"Basak, Partha"  writes:

> Resending with the corrected mailing list
>
> Hello Kevin,
>
> I want to draw your attention to an issue the following patch
> introduces. 
> http://dev.omapzoom.org/?p=swarch/linux-omap-adv.git;a=commitdiff;h=8041359e18e49bf8a3d41f15894db9e476f3a8fc

[...]

> I believe, it is not correct to call the pm_runtime APIs from
> interrupt locked context since the underlying functions
> __pm_runtime_suspend & __pm_runtime_resume enable interrupts and call
> schedule().
>
> Further, from a s/w layering standpoint, platform-bus is a deeper
> layer than the Runtime layer, which the runtime layer calls into. So
> it may not be correct to have this layer calling into pm_runtime(). It
> should directly call omap_device APIs.

The original version of this patch did directly call the omap_device
APIs.  However, Paul didn't like that approach and there were
use-counting problems to handle as well (e.g. if a driver was not (yet)
in use, it would already be disabled, and a suspend would trigger an
omap_device_disable() which would fail since device was already
disabled.)

Paul and I agreed that this current approach would solve the
use-counting problems, but you're correct in that it introduces
new problems as these functions can _possibly_ sleep/schedule.

That being said, have you actually seen problems here?   I would like
to see how they are triggered?

By the time the drivers _noirq hooks are called, the PM core has
shutdown all the drivers, prevented any runtime PM transitions and
disabled interrupts, so no pending runtime transitions will be queued
so the sleeping patch of the __pm_runtime_* should never be entered.

> We are facing a similar issue with a few drivers USB, UART & GPIO
> where, we need to enable/disable clocks & restore/save context in the
> CPU-Idle path with interrupts locked.

These are unrelated issues.  The above is for the static suspend/resume
case.  These are for runtime PM.

> As we are unable to use Runtime APIs, we are using omap_device APIs
> directly with the following modification in the .activate_funcion/
> deactivate_funcion (example UART driver)

[...]

The UART driver is a special case as it is managed from deep inside the
idle path.

> Can you feedback on my observations and comment on the approach taken
> for these drivers?

I cannot comment until I understand why these drivers need to
enable/disable with interrupts off.

In general, any use of the interrupts-off versions of the omap_device
APIs need to be thorougly justified.

Even the UART one will go away when the omap-serial driver is converted
to runtime PM.

Kevin


--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html