On Tue, Apr 14, 2026 at 11:41:39AM +0800, Jingyi Wang wrote: > > > On 4/10/2026 10:28 PM, Stephan Gerhold wrote: > > +Cc Bartosz, Dmitry > > > > On Thu, Apr 09, 2026 at 01:46:21AM -0700, Jingyi Wang wrote: > > > For rproc with state RPROC_DETACHED and auto_boot enabled, the attach > > > callback will be called in the rproc_add()->rproc_trigger_auto_boot()-> > > > rproc_boot() path, the failure in this path will cause the rproc_add() > > > fail and the resource release, which will cause issue like rproc recovery > > > or falling back to firmware load fail. Add attach_work for rproc and call > > > it asynchronously in rproc_add() path like what rproc_start() do. > > > > > > Signed-off-by: Jingyi Wang <[email protected]> > > > --- > > > drivers/remoteproc/remoteproc_core.c | 20 ++++++++++++-------- > > > include/linux/remoteproc.h | 1 + > > > 2 files changed, 13 insertions(+), 8 deletions(-) > > > > > > diff --git a/drivers/remoteproc/remoteproc_core.c > > > b/drivers/remoteproc/remoteproc_core.c > > > index b087ed21858a..f02db1113fae 100644 > > > --- a/drivers/remoteproc/remoteproc_core.c > > > +++ b/drivers/remoteproc/remoteproc_core.c > > > @@ -1673,18 +1673,21 @@ static void rproc_auto_boot_callback(const struct > > > firmware *fw, void *context) > > > release_firmware(fw); > > > } > > > +static void rproc_attach_work(struct work_struct *work) > > > +{ > > > + struct rproc *rproc = container_of(work, struct rproc, attach_work); > > > + > > > + rproc_boot(rproc); > > > +} > > > + > > > static int rproc_trigger_auto_boot(struct rproc *rproc) > > > { > > > int ret; > > > - /* > > > - * Since the remote processor is in a detached state, it has already > > > - * been booted by another entity. As such there is no point in waiting > > > - * for a firmware image to be loaded, we can simply initiate the process > > > - * of attaching to it immediately. > > > - */ > > > - if (rproc->state == RPROC_DETACHED) > > > - return rproc_boot(rproc); > > > + if (rproc->state == RPROC_DETACHED) { > > > + schedule_work(&rproc->attach_work); > > > + return 0; > > > + } > > > > I think the change itself is reasonable to make "auto-attach" behavior > > consistent with "auto-boot". The commit message is a bit misleading > > though: > > > > - You're really doing two separate functional changes here: > > > > (1) Ignore the return value of rproc_boot() during auto-boot attach, > > to keep the remoteproc registered and available in sysfs even if > > attaching fails. > > (2) Run the rproc_boot() in the background using schedule_work(). > > [To improve boot performance? To work around some locking issues?] > > > > - The actual issue you are seeing sounds like a use-after-free in the > > remoteproc core error cleanup path. I think this one is still > > present, we should really have a call to > > cancel_work_sync(&rproc->crash_handler) as Dmitry wrote in the > > previous discussion [1]. > > > > Thanks, > > Stephan > > > > [1]: > > https://lore.kernel.org/all/ce24a2sgg4b6wymoxwgl2ve6np2nxn2wuxfqxfpmvqqrpvgouf@xihd6ziqwu4m/ > > Hi Stephan, > > Exactly as you say, what this change do is allowing rproc_attach return false. > It should be okay to keep this change and describe it more clear in commit msg > in next version? >
That's fine for me. > And the use-after-free issue is what we want to resolve in the patch2 > in this series, I think cancel_work_sync() is a reasonable change > but it cannot resolve this issue as the worker could be executing when > we call this(and this is what it behaves when I did local test) and > the use-after-free issue still exists. Shall we send a separate patch > for this cancel_work_sync? > cancel_work_sync() should wait until the worker execution has finished. If you call it before freeing the resources (= deleting the remoteproc), I would expect it should work as expected. It makes sense to have separate patches for this, one is about fixing the use-after-free issue, the other is more about the behavior when the initial auto-boot fails. Thanks, Stephan

