Re: [PATCH v2 10/15] soc: qcom: ipa: use new module_firmware_crashed()
On Fri, May 22, 2020 at 03:52:44PM -0500, Alex Elder wrote: > On 5/22/20 12:28 AM, Luis Chamberlain wrote: > > OK thanks. Can the user be affected by this crash? If so how? Can > > we recover ? Is that always guaranteed? > > We can't guarantee anything about recovering from a crash of > an independent entity. But by design, recovery from a modem > crash is possible and is supposed to leave Linux in a > consistent state. A modem crash is likely to be observable > to the user. Thanks this helps, I'll drop this patch! Luis
Re: [PATCH v2 10/15] soc: qcom: ipa: use new module_firmware_crashed()
On 5/22/20 12:28 AM, Luis Chamberlain wrote: On Tue, May 19, 2020 at 05:34:13PM -0500, Alex Elder wrote: On 5/15/20 4:28 PM, Luis Chamberlain wrote: This makes use of the new module_firmware_crashed() to help annotate when firmware for device drivers crash. When firmware crashes devices can sometimes become unresponsive, and recovery sometimes requires a driver unload / reload and in the worst cases a reboot. Using a taint flag allows us to annotate when this happens clearly. I don't fully understand what this is meant to do, so I can't fully assess whether it's the right thing to do. It is meant to taint the kernel to ensure it is clear that something critically bad has happened with the device firmware, it crashed, and recovery may or may not happen, we are not 100% certain. But in this particular place in the IPA code, the *modem* has crashed. And the IPA driver is not responsible for modem firmware, remoteproc is. Oi vei. So the device it depends on has crashed. Yes, more or less. It could be considered a little more nuanced than even that, but I won't get into it here. The IPA driver *can* be responsible for loading some other firmware, but even in that case, it only happens on initial boot, and it's basically assumed to never crash. OK is this an issue which we can recover from? If for the slightest bit this can affect users it is something we should inform them over. If the IPA driver successfully loads this firmware, it should be assumed to never crash. So in that respect, there is no recovery required. Again, the modem (with which the IPA hardware and driver interacts) can crash, or it can be shut down intentionally. And in either case it can recover, automatically or manually. But all of that (and the modem's firmware loading) is the responsibility of the remoteproc subsystem, not IPA. This patch set is missing uevents for these issues, but I just added support for this. So regardless of whether this module_firmware_crashed() call is appropriate in some places, I believe it should not be used here. OK thanks. Can the user be affected by this crash? If so how? Can we recover ? Is that always guaranteed? We can't guarantee anything about recovering from a crash of an independent entity. But by design, recovery from a modem crash is possible and is supposed to leave Linux in a consistent state. A modem crash is likely to be observable to the user. I'll repeat that I don't believe the new call you inserted in the IPA driver belongs there. -Alex Luis
Re: [PATCH v2 10/15] soc: qcom: ipa: use new module_firmware_crashed()
On Tue, May 19, 2020 at 05:34:13PM -0500, Alex Elder wrote: > On 5/15/20 4:28 PM, Luis Chamberlain wrote: > > This makes use of the new module_firmware_crashed() to help > > annotate when firmware for device drivers crash. When firmware > > crashes devices can sometimes become unresponsive, and recovery > > sometimes requires a driver unload / reload and in the worst cases > > a reboot. > > > > Using a taint flag allows us to annotate when this happens clearly. > > I don't fully understand what this is meant to do, so I can't > fully assess whether it's the right thing to do. It is meant to taint the kernel to ensure it is clear that something critically bad has happened with the device firmware, it crashed, and recovery may or may not happen, we are not 100% certain. > > But in this particular place in the IPA code, the *modem* has > crashed. And the IPA driver is not responsible for modem > firmware, remoteproc is. Oi vei. So the device it depends on has crashed. > The IPA driver *can* be responsible for loading some other > firmware, but even in that case, it only happens on initial > boot, and it's basically assumed to never crash. OK is this an issue which we can recover from? If for the slightest bit this can affect users it is something we should inform them over. This patch set is missing uevents for these issues, but I just added support for this. > So regardless of whether this module_firmware_crashed() call is > appropriate in some places, I believe it should not be used here. OK thanks. Can the user be affected by this crash? If so how? Can we recover ? Is that always guaranteed? Luis
Re: [PATCH v2 10/15] soc: qcom: ipa: use new module_firmware_crashed()
On 5/15/20 4:28 PM, Luis Chamberlain wrote: This makes use of the new module_firmware_crashed() to help annotate when firmware for device drivers crash. When firmware crashes devices can sometimes become unresponsive, and recovery sometimes requires a driver unload / reload and in the worst cases a reboot. Using a taint flag allows us to annotate when this happens clearly. I don't fully understand what this is meant to do, so I can't fully assess whether it's the right thing to do. But in this particular place in the IPA code, the *modem* has crashed. And the IPA driver is not responsible for modem firmware, remoteproc is. The IPA driver *can* be responsible for loading some other firmware, but even in that case, it only happens on initial boot, and it's basically assumed to never crash. So regardless of whether this module_firmware_crashed() call is appropriate in some places, I believe it should not be used here. -Alex Cc: Alex Elder Signed-off-by: Luis Chamberlain --- drivers/net/ipa/ipa_modem.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/net/ipa/ipa_modem.c b/drivers/net/ipa/ipa_modem.c index ed10818dd99f..1790b87446ed 100644 --- a/drivers/net/ipa/ipa_modem.c +++ b/drivers/net/ipa/ipa_modem.c @@ -285,6 +285,7 @@ static void ipa_modem_crashed(struct ipa *ipa) struct device *dev = &ipa->pdev->dev; int ret; + module_firmware_crashed(); ipa_endpoint_modem_pause_all(ipa, true); ipa_endpoint_modem_hol_block_clear_all(ipa);
Re: [PATCH v2 10/15] soc: qcom: ipa: use new module_firmware_crashed()
On Fri, May 15, 2020 at 09:28:41PM +, Luis Chamberlain wrote: > This makes use of the new module_firmware_crashed() to help > annotate when firmware for device drivers crash. When firmware > crashes devices can sometimes become unresponsive, and recovery > sometimes requires a driver unload / reload and in the worst cases > a reboot. > > Using a taint flag allows us to annotate when this happens clearly. > > Cc: Alex Elder > Signed-off-by: Luis Chamberlain > --- > drivers/net/ipa/ipa_modem.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/drivers/net/ipa/ipa_modem.c b/drivers/net/ipa/ipa_modem.c > index ed10818dd99f..1790b87446ed 100644 > --- a/drivers/net/ipa/ipa_modem.c > +++ b/drivers/net/ipa/ipa_modem.c > @@ -285,6 +285,7 @@ static void ipa_modem_crashed(struct ipa *ipa) > struct device *dev = &ipa->pdev->dev; > int ret; > > + module_firmware_crashed(); > ipa_endpoint_modem_pause_all(ipa, true); > > ipa_endpoint_modem_hol_block_clear_all(ipa); > -- > 2.26.2 > Acked-by: Rafael Aquini
[PATCH v2 10/15] soc: qcom: ipa: use new module_firmware_crashed()
This makes use of the new module_firmware_crashed() to help annotate when firmware for device drivers crash. When firmware crashes devices can sometimes become unresponsive, and recovery sometimes requires a driver unload / reload and in the worst cases a reboot. Using a taint flag allows us to annotate when this happens clearly. Cc: Alex Elder Signed-off-by: Luis Chamberlain --- drivers/net/ipa/ipa_modem.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/net/ipa/ipa_modem.c b/drivers/net/ipa/ipa_modem.c index ed10818dd99f..1790b87446ed 100644 --- a/drivers/net/ipa/ipa_modem.c +++ b/drivers/net/ipa/ipa_modem.c @@ -285,6 +285,7 @@ static void ipa_modem_crashed(struct ipa *ipa) struct device *dev = &ipa->pdev->dev; int ret; + module_firmware_crashed(); ipa_endpoint_modem_pause_all(ipa, true); ipa_endpoint_modem_hol_block_clear_all(ipa); -- 2.26.2