On Mon, Oct 20, 2025 at 08:57:34AM -0300, Jason Gunthorpe wrote:
> On Sun, Oct 19, 2025 at 11:08:29PM +0800, Tzung-Bi Shih wrote:
> > On Fri, Oct 17, 2025 at 01:21:16PM -0300, Jason Gunthorpe wrote:
> > > On Sat, Oct 18, 2025 at 12:07:58AM +0800, Tzung-Bi Shih wrote:
> > > > > This is already properly lifetime controlled!
> > > > > 
> > > > > It *HAS* to be, and even your patches are assuming it by blindly
> > > > > reaching into the parent's memory!
> > > > > 
> > > > > +     misc->rps[0] = ec->ec_dev->revocable_provider;
> > > > > 
> > > > > If the parent driver has been racily unbound at this point the
> > > > > ec->ec_dev is already a UAF!
> > > > 
> > > > Not really, it uses the fact that the caller is from probe().  I think 
> > > > the
> > > > driver can't be unbound when it is still in probe().
> > > 
> > > Right, but that's my point you are already relying on driver binding
> > > lifetime rules to make your access valid. You should continue to rely
> > > on that and fix the lack of synchronous remove to fix the bug.
> > 
> > I think what you're looking for is something similar to the following
> > patches.
> > 
> > - Instead of having a real resource to protect with revocable, use the
> >   subsystem device itself as a virtual resource.  Revoke the virtual
> >   resource when unregistering the device from the subsystem.
> > 
> > - Exit earlier if the virtual resource is NULL (i.e. the subsystem device
> >   has been unregistered) in the file operation wrappers.
> 
> Sure
>  
> > By doing so, we don't need to provide a misc_deregister_sync() which could
> > probably maintain a list of opening files in miscdevice and handle with all
> > opening files when unregistering.  
> 
> I don't think we want to change the default behavior of
> misc_deregister.. Maybe if it was a mutex not srcu it would be OK, but
> srcu you are looking at delaying driver removal by seconds
> potentially.
> 
> > @@ -234,6 +240,10 @@ int misc_register(struct miscdevice *misc)
> >                 return -EINVAL;
> >         }
> >  
> > +       misc->rp = revocable_provider_alloc(misc);
> > +       if (!misc->rp)
> > +               return -ENOMEM;
> 
> Just get rid of all this revocable stuff, all this needs is a scru or
> a mutex, none of this obfuscation around a simple lock is helpful in
> core kernel code.

I didn't get the idea.  With a mutex, how to handle the opening files?

Are they something like: (?)
- Maintain a list for opening files in both .open() and .release().
- In misc_deregister_sync(), traverse the list, do something (what?), and
  wait for the userspace programs close the files.

> > @@ -1066,6 +1066,7 @@ struct file {
> >                 freeptr_t               f_freeptr;
> >         };
> >         /* --- cacheline 3 boundary (192 bytes) --- */
> > +       struct fs_revocable_replacement *f_rr;
> >  } __randomize_layout
> 
> The thing that will likely attract objections is this. It is probably
> a good idea to try to remove it.
> 
> For simple misc users the inode->i_cdev will always be valid and you
> can reach the struct misc_dev/cdev from there in all the calls.
> 
> More complex cdev users replace the inode so that wouldn't work
> universally but it is good enough to get started at least.

The context is meant to be the same lifecycle with file opens/releases but
not the miscdevice.  I think the mutex vs. revocable stuff is the more
fundamental issue, we can focus on that first.

Reply via email to