On Tue, Apr 14, 2026 at 11:41:39AM +0800, Jingyi Wang wrote:
> 
> 
> On 4/10/2026 10:28 PM, Stephan Gerhold wrote:
> > +Cc Bartosz, Dmitry
> > 
> > On Thu, Apr 09, 2026 at 01:46:21AM -0700, Jingyi Wang wrote:
> > > For rproc with state RPROC_DETACHED and auto_boot enabled, the attach
> > > callback will be called in the rproc_add()->rproc_trigger_auto_boot()->
> > > rproc_boot() path, the failure in this path will cause the rproc_add()
> > > fail and the resource release, which will cause issue like rproc recovery
> > > or falling back to firmware load fail. Add attach_work for rproc and call
> > > it asynchronously in rproc_add() path like what rproc_start() do.
> > > 
> > > Signed-off-by: Jingyi Wang <[email protected]>
> > > ---
> > >   drivers/remoteproc/remoteproc_core.c | 20 ++++++++++++--------
> > >   include/linux/remoteproc.h           |  1 +
> > >   2 files changed, 13 insertions(+), 8 deletions(-)
> > > 
> > > diff --git a/drivers/remoteproc/remoteproc_core.c 
> > > b/drivers/remoteproc/remoteproc_core.c
> > > index b087ed21858a..f02db1113fae 100644
> > > --- a/drivers/remoteproc/remoteproc_core.c
> > > +++ b/drivers/remoteproc/remoteproc_core.c
> > > @@ -1673,18 +1673,21 @@ static void rproc_auto_boot_callback(const struct 
> > > firmware *fw, void *context)
> > >           release_firmware(fw);
> > >   }
> > > +static void rproc_attach_work(struct work_struct *work)
> > > +{
> > > + struct rproc *rproc = container_of(work, struct rproc, attach_work);
> > > +
> > > + rproc_boot(rproc);
> > > +}
> > > +
> > >   static int rproc_trigger_auto_boot(struct rproc *rproc)
> > >   {
> > >           int ret;
> > > - /*
> > > -  * Since the remote processor is in a detached state, it has already
> > > -  * been booted by another entity.  As such there is no point in waiting
> > > -  * for a firmware image to be loaded, we can simply initiate the process
> > > -  * of attaching to it immediately.
> > > -  */
> > > - if (rproc->state == RPROC_DETACHED)
> > > -         return rproc_boot(rproc);
> > > + if (rproc->state == RPROC_DETACHED) {
> > > +         schedule_work(&rproc->attach_work);
> > > +         return 0;
> > > + }
> > 
> > I think the change itself is reasonable to make "auto-attach" behavior
> > consistent with "auto-boot". The commit message is a bit misleading
> > though:
> > 
> >   - You're really doing two separate functional changes here:
> > 
> >     (1) Ignore the return value of rproc_boot() during auto-boot attach,
> >         to keep the remoteproc registered and available in sysfs even if
> >         attaching fails.
> >     (2) Run the rproc_boot() in the background using schedule_work().
> >         [To improve boot performance? To work around some locking issues?]
> > 
> >   - The actual issue you are seeing sounds like a use-after-free in the
> >     remoteproc core error cleanup path. I think this one is still
> >     present, we should really have a call to
> >     cancel_work_sync(&rproc->crash_handler) as Dmitry wrote in the
> >     previous discussion [1].
> > 
> > Thanks,
> > Stephan
> > 
> > [1]: 
> > https://lore.kernel.org/all/ce24a2sgg4b6wymoxwgl2ve6np2nxn2wuxfqxfpmvqqrpvgouf@xihd6ziqwu4m/
> 
> Hi Stephan,
> 
> Exactly as you say, what this change do is allowing rproc_attach return false.
> It should be okay to keep this change and describe it more clear in commit msg
> in next version?
> 

That's fine for me.

> And the use-after-free issue is what we want to resolve in the patch2
> in this series, I think cancel_work_sync() is a reasonable change
> but it cannot resolve this issue as the worker could be executing when
> we call this(and this is what it behaves when I did local test) and
> the use-after-free issue still exists. Shall we send a separate patch
> for this cancel_work_sync?
> 

cancel_work_sync() should wait until the worker execution has finished.
If you call it before freeing the resources (= deleting the remoteproc),
I would expect it should work as expected.

It makes sense to have separate patches for this, one is about fixing
the use-after-free issue, the other is more about the behavior when the
initial auto-boot fails.

Thanks,
Stephan

Reply via email to