On Tue, Apr 14, 2026 at 11:23:50AM +0800, Jingyi Wang wrote:
> On 4/10/2026 10:15 PM, Stephan Gerhold wrote:
> > On Thu, Apr 09, 2026 at 01:46:22AM -0700, Jingyi Wang wrote:
> > > For rproc that doing attach, glink_subdev_start() is called only when
> > > attach successfully. If rproc_report_crash() is called in the attach
> > > function, rproc_boot_recovery()->rproc_stop()->glink_subdev_stop() could
> > > be called and cause NULL pointer dereference:
> > >
> > > Unable to handle kernel NULL pointer dereference at virtual address
> > > 0000000000000300
> > > Mem abort info:
> > > ...
> > > pc : qcom_glink_smem_unregister+0x14/0x48 [qcom_glink_smem]
> > > lr : glink_subdev_stop+0x1c/0x30 [qcom_common]
> > > ...
> > > Call trace:
> > > qcom_glink_smem_unregister+0x14/0x48 [qcom_glink_smem] (P)
> > > glink_subdev_stop+0x1c/0x30 [qcom_common]
> > > rproc_stop+0x58/0x17c
> > > rproc_trigger_recovery+0xb0/0x150
> > > rproc_crash_handler_work+0xa4/0xc4
> > > process_scheduled_works+0x18c/0x2d8
> > > worker_thread+0x144/0x280
> > > kthread+0x124/0x138
> > > ret_from_fork+0x10/0x20
> > > Code: a9be7bfd 910003fd a90153f3 aa0003f3 (b9430000)
> > > ---[ end trace 0000000000000000 ]---
> > >
> > > Add NULL pointer check in the glink_subdev_stop() to make sure
> > > qcom_glink_smem_unregister() will not be called if glink_subdev_start()
> > > is not called.
> > >
> >
> > You mention the actual root problem here: Why is glink_subdev_stop()
> > called if glink_subdev_start() wasn't called?
> >
> > The call to rproc_start_subdevices() in __rproc_attach() makes sure that
> > all subdevices are in consistent state when exiting the function (either
> > prepared+started or stopped+unprepared). Only if all subdevices were
> > started successfully, the rproc->state is changed to RPROC_ATTACHED.
> >
> > In your case, attaching the rproc failed so the rproc->state should be
> > still RPROC_DETACHED. All subdevices should be stopped+unprepared. We
> > shouldn't stop/unprepare any subdevices again in this state, they all
> > might crash like glink does here.
> >
> > We know that subdevices are already stopped+unprepared in RPROC_DETACHED
> > state, so I think you just need to skip rproc_stop_subdevices() and
> > rproc_unprepare_subdevices() inside rproc_stop() in this case, see diff
> > below.
> >
> > @@ -1708,8 +1709,9 @@ static int rproc_stop(struct rproc *rproc, bool
> > crashed)
> >  	if (!rproc->ops->stop)
> >  		return -EINVAL;
> > -	/* Stop any subdevices for the remote processor */
> > -	rproc_stop_subdevices(rproc, crashed);
> > +	/* Stop any subdevices for the remote processor if it was attached */
> > +	if (rproc->state != RPROC_DETACHED)
> > +		rproc_stop_subdevices(rproc, crashed);
> >  	/* the installed resource table is no longer accessible */
> >  	ret = rproc_reset_rsc_table_on_stop(rproc);
> > @@ -1726,7 +1728,8 @@ static int rproc_stop(struct rproc *rproc, bool
> > crashed)
> >  		return ret;
> >  	}
> > -	rproc_unprepare_subdevices(rproc);
> > +	if (rproc->state != RPROC_DETACHED)
> > +		rproc_unprepare_subdevices(rproc);
> >  	rproc->state = RPROC_OFFLINE;
>
> In this case, rproc_crash_handler_work()->rproc_trigger_recovery()->
> rproc_boot_recovery()->rproc_stop()->glink_subdev_stop() is called,
> "rproc->state = RPROC_CRASHED" is set in the rproc_crash_handler_work
> before rproc_boot_recovery is called, so checking RPROC_DETACHED can
> not work for this case.
>
You're right, I forgot about that.

I think we need a more generic solution for this though.
rproc_stop_subdevices() should not be called without a prior call to
rproc_start_subdevices(). I think there are a couple of options for
this:

 - Add a bool "subdevs_started" to struct rproc and manage that
   separately from the rproc->state.

 - Track the rproc state before the crash separately (something like
   rproc->state_before_crash) and check that in the stop path.

 - Add a new state RPROC_CRASHED_DETACHED to make sure the states are
   unique.

 - ...

Does the same issue also exist in qcom_pas_stop() of "[PATCH v5 4/5]
remoteproc: qcom: pas: Add late attach support for subsystems" [1]?
There you check for pas->rproc->state != RPROC_ATTACHED, wouldn't this
also fail for the RPROC_CRASHED case?

Thanks,
Stephan

[1]: https://lore.kernel.org/linux-arm-msm/[email protected]/
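
To illustrate the first option, here is a rough userspace model, not
kernel code. It assumes a much simplified struct rproc; the helper
names (start_subdevices, stop_checking_state, stop_checking_flag, ...)
and the glink_stop_calls counter are made up for the illustration, and
only "subdevs_started" comes from the list above. It compares a stop
path gated on rproc->state with one gated on the flag, for a failed
attach followed by a crash report:

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the remoteproc core; not the kernel definitions. */
enum rproc_state { RPROC_OFFLINE, RPROC_DETACHED, RPROC_ATTACHED, RPROC_CRASHED };

struct rproc {
	enum rproc_state state;
	bool subdevs_started;	/* hypothetical flag from the first option */
	int glink_stop_calls;	/* illustration only: counts simulated stop calls */
};

/* Models a successful rproc_start_subdevices(). */
static void start_subdevices(struct rproc *rproc)
{
	rproc->subdevs_started = true;
}

/* Models rproc_stop_subdevices(); only safe after a successful start. */
static void stop_subdevices(struct rproc *rproc)
{
	rproc->glink_stop_calls++;
	rproc->subdevs_started = false;
}

/* Models the crash handler overwriting the previous state. */
static void report_crash(struct rproc *rproc)
{
	rproc->state = RPROC_CRASHED;
}

/* Stop path gated on the state: the DETACHED check no longer matches. */
static void stop_checking_state(struct rproc *rproc)
{
	if (rproc->state != RPROC_DETACHED)
		stop_subdevices(rproc);
	rproc->state = RPROC_OFFLINE;
}

/* Stop path gated on the flag: independent of rproc->state. */
static void stop_checking_flag(struct rproc *rproc)
{
	if (rproc->subdevs_started)
		stop_subdevices(rproc);
	rproc->state = RPROC_OFFLINE;
}

/* Attach failed (subdevices never started), then a crash is reported. */
static int failed_attach_then_crash(void (*stop)(struct rproc *))
{
	struct rproc rproc = { .state = RPROC_DETACHED };

	report_crash(&rproc);
	stop(&rproc);
	return rproc.glink_stop_calls;
}

/* Attach succeeded (subdevices started), then a crash is reported. */
static int attach_then_crash(void (*stop)(struct rproc *))
{
	struct rproc rproc = { .state = RPROC_DETACHED };

	start_subdevices(&rproc);
	rproc.state = RPROC_ATTACHED;
	report_crash(&rproc);
	stop(&rproc);
	return rproc.glink_stop_calls;
}
```

In this model the state-gated variant still stops the subdevices once
after a failed attach, because the state is already RPROC_CRASHED by
then, while the flag-gated variant skips them. Both variants stop the
subdevices once after a successful attach.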

