Re: [PATCH] iio: light: gp2ap002: Fix rumtime PM imbalance on error
On Sun, Apr 18, 2021 at 11:43 AM Jonathan Cameron wrote: > I'm still far from completely convinced that it is 'necessary' > to take the reference whilst going through this sequence because > there is nothing to kick off the suspend until we tell it to use > autosuspend. However, I appreciate (much like taking locks in > general in probe) that it makes it easy to see there is no race. One part is related to hierarchy, in the past if this device would sit on a I2C bus on an PCIE card, unless you take a reference (that escalate upwards) the I2C bus host or PCIE card may decide to go and runtime suspend, cutting communication with your device. Since 04f59143b571 the power state of I2C buses are decoupled from the clients and we decided that the I2C (as well as SPI hosts, separate commit) shall make sure PM resume themselves when transmitting messages to clients. So in this case, since it is an I2C device, we are probably fine without grabbing a reference. But this is not a general rule for any (non-slow) bus, so the idiomatic pattern to follow is better like this. Yours, Linus Walleij
Re: [PATCH] iio: light: gp2ap002: Fix rumtime PM imbalance on error
On Mon, 12 Apr 2021 13:47:58 +0200 Linus Walleij wrote: > On Mon, Apr 12, 2021 at 12:16 PM Jonathan Cameron > wrote: > > > An example would be the bmc150_magn driver which does exactly the > > same call sequence as this one, but without the reference count increment > > and decrement. Basically I want to know if there is a problem in > > those other drivers that is being protected against here! > > The bmc150_magn driver does not look like I would have done it. > > That said, I think it works, because the only thing it is calling is > bmc150_magn_set_power_mode() and that seems stateless, > just poking some regmap bits. The state is tracked by the driver > AFAICT and we don't need pm_runtime_get_noresume() because > it doesn't really matter if the pm_runtime_suspend() callback > gets called immediately or randomly out of sync with what we are > doing from this point. > > I would anyways patch it like the gp2ap002 driver because it > is easier to follow the code. Including using only runtime PM > att setting SET_SYSTEM_SLEEP_PM_OPS() to the > pm_runtime_force_suspend and pm_runtime_force_resume > functions, everything just get so much easier when you use > only one type of PM and not two orthogonal ones. > > drivers/iio/light/bh1780.c > should be a good example of how to do it idiomatically > because it was reviewed by Ulf Hansson who knows this > runtime PM stuff better than me. > > A sequence like this: > >pm_runtime_get_noresume(&client->dev); >pm_runtime_set_active(&client->dev); >pm_runtime_enable(&client->dev); >pm_runtime_set_autosuspend_delay(&client->dev, 5000); >pm_runtime_use_autosuspend(&client->dev); >pm_runtime_put(&client->dev); > > is very nice because you can clearly see that it will not race > and after the last put() unless something happens the > runtime suspend will kick in after 5000 ms. > > Likewise when disabling: > > pm_runtime_get_sync(&client->dev); > pm_runtime_put_noidle(&client->dev); > pm_runtime_disable(&client->dev); > > same thing: crystal clear there are no races, the device is > definately runtime resumed and we can proceed to > shut down resources explicitly after this point. > > If you then add: > > SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, > pm_runtime_force_resume) > > Now you have ordinary sleep PM for free. It will just > force the same suspend/resume callbacks and they are > guaranteed to be race free. > > This doesn't work for everyone but surprisingly often this is > what you want. I'm still far from completely convinced that it is 'necessary' to take the reference whilst going through this sequence because there is nothing to kick off the suspend until we tell it to use autosuspend. However, I appreciate (much like taking locks in general in probe) that it makes it easy to see there is no race. Anyhow, fix is still valid either way so applied to the fixes togreg branch of iio.git with a fixes tag added to the initial introduction of the driver (which I think is where this came in). Thanks, Jonathan > > Yours, > Linus Walleij
Re: [PATCH] iio: light: gp2ap002: Fix rumtime PM imbalance on error
On Mon, Apr 12, 2021 at 12:16 PM Jonathan Cameron wrote: > An example would be the bmc150_magn driver which does exactly the > same call sequence as this one, but without the reference count increment > and decrement. Basically I want to know if there is a problem in > those other drivers that is being protected against here! The bmc150_magn driver does not look like I would have done it. That said, I think it works, because the only thing it is calling is bmc150_magn_set_power_mode() and that seems stateless, just poking some regmap bits. The state is tracked by the driver AFAICT and we don't need pm_runtime_get_noresume() because it doesn't really matter if the pm_runtime_suspend() callback gets called immediately or randomly out of sync with what we are doing from this point. I would anyways patch it like the gp2ap002 driver because it is easier to follow the code. Including using only runtime PM att setting SET_SYSTEM_SLEEP_PM_OPS() to the pm_runtime_force_suspend and pm_runtime_force_resume functions, everything just get so much easier when you use only one type of PM and not two orthogonal ones. drivers/iio/light/bh1780.c should be a good example of how to do it idiomatically because it was reviewed by Ulf Hansson who knows this runtime PM stuff better than me. A sequence like this: pm_runtime_get_noresume(&client->dev); pm_runtime_set_active(&client->dev); pm_runtime_enable(&client->dev); pm_runtime_set_autosuspend_delay(&client->dev, 5000); pm_runtime_use_autosuspend(&client->dev); pm_runtime_put(&client->dev); is very nice because you can clearly see that it will not race and after the last put() unless something happens the runtime suspend will kick in after 5000 ms. Likewise when disabling: pm_runtime_get_sync(&client->dev); pm_runtime_put_noidle(&client->dev); pm_runtime_disable(&client->dev); same thing: crystal clear there are no races, the device is definately runtime resumed and we can proceed to shut down resources explicitly after this point. If you then add: SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, pm_runtime_force_resume) Now you have ordinary sleep PM for free. It will just force the same suspend/resume callbacks and they are guaranteed to be race free. This doesn't work for everyone but surprisingly often this is what you want. Yours, Linus Walleij
Re: [PATCH] iio: light: gp2ap002: Fix rumtime PM imbalance on error
On Mon, 12 Apr 2021 00:38:41 +0200 Linus Walleij wrote: > On Sun, Apr 11, 2021 at 5:07 PM Jonathan Cameron wrote: > > On Wed, 7 Apr 2021 11:49:27 +0800 > > Dinghao Liu wrote: > > > > > When devm_request_threaded_irq() fails, we should decrease the > > > runtime PM counter to keep the counter balanced. But when > > > iio_device_register() fails, we need not to decrease it because > > > we have already decreased it before. > > > > Whilst agree with your assessment that the code is wrong, I'm not > > totally sure why we need to do the pm_runtime_get_noresume() in > > the first place. Why do we need to hold the reference for > > the operations going on here? What can race against this that > > might care about that reference count? > > pm_runtime_get_noresume() is increasing the runtime PM > reference without calling the pm_runtime_resume() callback. > > It is often called in sequence like this: > > pm_runtime_get_noresume(dev); > pm_runtime_set_active(dev); > pm_runtime_enable(dev); > > This increases the reference, sets the device as active > and enables runtime PM. > > The reason that probe() has activated resources such as > enabling two regulators, and want to leave them on so that > later on pm_runtime_suspend() will disable them, i.e. > handover to runtime PM with the device in resumed state. > > I hope this is answering the question, not sure. There are drivers that look the same except they aren't holding the reference. Are those immediately disabling the power? I can't see the path by which that happens, but perhaps I'm just missing something? Maybe this is just paranoid locking in a probe path (before we are in a position where races can occur)? An example would be the bmc150_magn driver which does exactly the same call sequence as this one, but without the reference count increment and decrement. Basically I want to know if there is a problem in those other drivers that is being protected against here! > > Yours, > Linus Walleij
Re: [PATCH] iio: light: gp2ap002: Fix rumtime PM imbalance on error
On Sun, Apr 11, 2021 at 5:07 PM Jonathan Cameron wrote: > On Wed, 7 Apr 2021 11:49:27 +0800 > Dinghao Liu wrote: > > > When devm_request_threaded_irq() fails, we should decrease the > > runtime PM counter to keep the counter balanced. But when > > iio_device_register() fails, we need not to decrease it because > > we have already decreased it before. > > Whilst agree with your assessment that the code is wrong, I'm not > totally sure why we need to do the pm_runtime_get_noresume() in > the first place. Why do we need to hold the reference for > the operations going on here? What can race against this that > might care about that reference count? pm_runtime_get_noresume() is increasing the runtime PM reference without calling the pm_runtime_resume() callback. It is often called in sequence like this: pm_runtime_get_noresume(dev); pm_runtime_set_active(dev); pm_runtime_enable(dev); This increases the reference, sets the device as active and enables runtime PM. The reason that probe() has activated resources such as enabling two regulators, and want to leave them on so that later on pm_runtime_suspend() will disable them, i.e. handover to runtime PM with the device in resumed state. I hope this is answering the question, not sure. Yours, Linus Walleij
Re: [PATCH] iio: light: gp2ap002: Fix rumtime PM imbalance on error
On Wed, 7 Apr 2021 11:49:27 +0800 Dinghao Liu wrote: > When devm_request_threaded_irq() fails, we should decrease the > runtime PM counter to keep the counter balanced. But when > iio_device_register() fails, we need not to decrease it because > we have already decreased it before. Whilst agree with your assessment that the code is wrong, I'm not totally sure why we need to do the pm_runtime_get_noresume() in the first place. Why do we need to hold the reference for the operations going on here? What can race against this that might care about that reference count? Jonathan > > Signed-off-by: Dinghao Liu > --- > drivers/iio/light/gp2ap002.c | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/drivers/iio/light/gp2ap002.c b/drivers/iio/light/gp2ap002.c > index 7ba7aa59437c..040d8429a6e0 100644 > --- a/drivers/iio/light/gp2ap002.c > +++ b/drivers/iio/light/gp2ap002.c > @@ -583,7 +583,7 @@ static int gp2ap002_probe(struct i2c_client *client, > "gp2ap002", indio_dev); > if (ret) { > dev_err(dev, "unable to request IRQ\n"); > - goto out_disable_vio; > + goto out_put_pm; > } > gp2ap002->irq = client->irq; > > @@ -613,8 +613,9 @@ static int gp2ap002_probe(struct i2c_client *client, > > return 0; > > -out_disable_pm: > +out_put_pm: > pm_runtime_put_noidle(dev); > +out_disable_pm: > pm_runtime_disable(dev); > out_disable_vio: > regulator_disable(gp2ap002->vio);
Re: [PATCH] iio: light: gp2ap002: Fix rumtime PM imbalance on error
On Wed, Apr 7, 2021 at 5:49 AM Dinghao Liu wrote: > When devm_request_threaded_irq() fails, we should decrease the > runtime PM counter to keep the counter balanced. But when > iio_device_register() fails, we need not to decrease it because > we have already decreased it before. > > Signed-off-by: Dinghao Liu Looks correct, this semantic ordering always confuse me a bit: Reviewed-by: Linus Walleij Yours, Linus Walleij
[PATCH] iio: light: gp2ap002: Fix rumtime PM imbalance on error
When devm_request_threaded_irq() fails, we should decrease the runtime PM counter to keep the counter balanced. But when iio_device_register() fails, we need not to decrease it because we have already decreased it before. Signed-off-by: Dinghao Liu --- drivers/iio/light/gp2ap002.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/iio/light/gp2ap002.c b/drivers/iio/light/gp2ap002.c index 7ba7aa59437c..040d8429a6e0 100644 --- a/drivers/iio/light/gp2ap002.c +++ b/drivers/iio/light/gp2ap002.c @@ -583,7 +583,7 @@ static int gp2ap002_probe(struct i2c_client *client, "gp2ap002", indio_dev); if (ret) { dev_err(dev, "unable to request IRQ\n"); - goto out_disable_vio; + goto out_put_pm; } gp2ap002->irq = client->irq; @@ -613,8 +613,9 @@ static int gp2ap002_probe(struct i2c_client *client, return 0; -out_disable_pm: +out_put_pm: pm_runtime_put_noidle(dev); +out_disable_pm: pm_runtime_disable(dev); out_disable_vio: regulator_disable(gp2ap002->vio); -- 2.17.1