Nishanth Menon <[email protected]> writes:

> OMAP device hooks around suspend|resume_noirq ensures that hwmod
> devices are forced to idle using omap_device_idle/enable as part of
> the last stage of suspend activity.
>
> For a device such as i2c who uses autosuspend, it is possible to enter
> the suspend path with dev->power.runtime_status = RPM_ACTIVE.
>
> As part of the suspend flow, the generic runtime logic would increment
> it's dev->power.disable_depth to 1. This should prevent further
> pm_runtime_get_sync from succeeding once the runtime_status has been
> set to RPM_SUSPENDED.
>
> Now, as part of the suspend_noirq handler in omap_device, we force the
> following: if the device status is !suspended, we force the device
> to idle using omap_device_idle (clocks are cut etc..). This ensures
> that from a hardware perspective, the device is "suspended". However,
> runtime_status is left to be active.
>
> *if* an operation is attempted after this point to
> pm_runtime_get_sync, runtime framework depends on runtime_status to
> indicate accurately the device status, and since it sees it to be
> ACTIVE, it assumes the module is functional and returns a non-error
> value. As a result the user will see pm_runtime_get succeed, however a
> register access will crash due to the lack of clocks.

Ouch.

Dumb Q: who is requesting an i2c transaction after ->suspend_noirq().
The i2c driver itself should be able to detect that it's being accessed
after this point and return an error.

That being said, I agree that omap_device should still be catching this
case in order to find/fix driver races like this.

> To prevent this from happening, we should ensure that runtime_status
> exactly indicates the device status. As a result of this change
> any further calls to pm_runtime_get* would return -EACCES (since
> disable_depth is 1). On resume, we restore the clocks and runtime
> status exactly as we suspended with.
>
> Reported-by: J Keerthy <[email protected]>
> Signed-off-by: Nishanth Menon <[email protected]>
> Acked-by: Rajendra Nayak <[email protected]>
> ---
> patch baseline: V3.12 tag (also applies on linux-next next-20131107 tag)
>
> Logs from 3.12 based vendor kernel:
> Before: http://pastebin.com/m5KxnB7a
> After: http://pastebin.com/8AfX4e7r
>
> The discussion on cpufreq side of the story which triggered this scenario:
>       http://marc.info/?l=linux-pm&m=138263811321921&w=2
>
> Tested on TI vendor kernel (with dt boot):
>       AM335x: evm, BBB, sk, BBW
>       OMAP5uEVM, DRA7-evm
>
>  arch/arm/mach-omap2/omap_device.c |   16 ++++++++++++++--
>  arch/arm/mach-omap2/omap_device.h |    1 +
>  2 files changed, 15 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm/mach-omap2/omap_device.c 
> b/arch/arm/mach-omap2/omap_device.c
> index b69dd9a..87ecbb0 100644
> --- a/arch/arm/mach-omap2/omap_device.c
> +++ b/arch/arm/mach-omap2/omap_device.c
> @@ -621,6 +621,13 @@ static int _od_suspend_noirq(struct device *dev)
>  
>       if (!ret && !pm_runtime_status_suspended(dev)) {
>               if (pm_generic_runtime_suspend(dev) == 0) {
> +                     if (!pm_runtime_suspended(dev)) {

Why the addition check for supended here?

This version (as opposed to the _status_suspended() version above will
fail if runtime PM has been disabled from userspace (via
/sys/devices/.../power/control), and will thus prevent low power states
from being hit in suspend if runtime suspend has been disabled from
userspace.  That's a bug.

> +                             /* NOTE: *might* indicate driver race */

Yes, a driver race which should then be fixed in the driver.

> +                             dev_dbg(dev, "%s: Force suspending\n",
> +                                     __func__);
> +                             pm_runtime_set_suspended(dev);
> +                             od->flags |= OMAP_DEVICE_SUSPEND_FORCED;

Not sure why you need an additonal flag.  Why not just always do this
and use the existin OMAP_DEVICE_SUSPENDED flag.

Kevin

> +                     }
>                       omap_device_idle(pdev);
>                       od->flags |= OMAP_DEVICE_SUSPENDED;
>               }
> @@ -634,10 +641,15 @@ static int _od_resume_noirq(struct device *dev)
>       struct platform_device *pdev = to_platform_device(dev);
>       struct omap_device *od = to_omap_device(pdev);
>  
> -     if ((od->flags & OMAP_DEVICE_SUSPENDED) &&
> -         !pm_runtime_status_suspended(dev)) {
> +     if (od->flags & OMAP_DEVICE_SUSPENDED) {
>               od->flags &= ~OMAP_DEVICE_SUSPENDED;
>               omap_device_enable(pdev);
> +
> +             if (od->flags & OMAP_DEVICE_SUSPEND_FORCED) {
> +                     pm_runtime_set_active(dev);
> +                     od->flags &= ~OMAP_DEVICE_SUSPEND_FORCED;
> +             }
> +
>               pm_generic_runtime_resume(dev);
>       }
>  
> diff --git a/arch/arm/mach-omap2/omap_device.h 
> b/arch/arm/mach-omap2/omap_device.h
> index 17ca1ae..45885b0 100644
> --- a/arch/arm/mach-omap2/omap_device.h
> +++ b/arch/arm/mach-omap2/omap_device.h
> @@ -38,6 +38,7 @@ extern struct dev_pm_domain omap_device_pm_domain;
>  
>  /* omap_device.flags values */
>  #define OMAP_DEVICE_SUSPENDED                BIT(0)
> +#define OMAP_DEVICE_SUSPEND_FORCED   BIT(1)
>  
>  /**
>   * struct omap_device - omap_device wrapper for platform_devices
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to