On Mon, May 06, 2019 at 10:41:34AM +0530, Prateek Sood wrote:
> On 5/1/19 5:29 PM, Prateek Sood wrote:
> > While loading firmware blobs parallely in different threads, it is possible
> > to free sysfs node of glue_dirs in device_del() from a thread while another
> > thread is trying to add subdir from device_add() in glue_dirs sysfs node.
> > 
> >     CPU1                                           CPU2
> > fw_load_sysfs_fallback()
> >   device_add()
> >     get_device_parent()
> >       class_dir_create_and_add()
> >         kobject_add_internal()
> >           create_dir() // glue_dir
> > 
> >                                            fw_load_sysfs_fallback()
> >                                              device_add()
> >                                                get_device_parent()
> >                                                  kobject_get() //glue_dir
> > 
> >   device_del()
> >     cleanup_glue_dir()
> >       kobject_del()
> > 
> >                                                kobject_add()
> >                                                  kobject_add_internal()
> >                                                    create_dir() // in 
> > glue_dir
> >                                                      kernfs_create_dir_ns()
> > 
> >        sysfs_remove_dir() //glue_dir->sd=NULL
> >        sysfs_put() // free glue_dir->sd
> > 
> >                                                        kernfs_new_node()
> >                                                          
> > kernfs_get(glue_dir)
> > 
> > Fix this race by making sure that kernfs_node for glue_dir is released only
> > when refcount for glue_dir kobj is 1.
> > 
> > Signed-off-by: Prateek Sood <prs...@codeaurora.org>
> > 
> > ---
> > 
> > Changes from v2->v3:
> >  - Added patch version change related comments.
> > 
> > Changes from v1->v2:
> >  - Updated callstack from _request_firmware_load() to 
> > fw_load_sysfs_fallback().
> > 
> > 
> >  drivers/base/core.c | 5 ++++-
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/base/core.c b/drivers/base/core.c
> > index 4aeaa0c..3955d07 100644
> > --- a/drivers/base/core.c
> > +++ b/drivers/base/core.c
> > @@ -1820,12 +1820,15 @@ static inline struct kobject *get_glue_dir(struct 
> > device *dev)
> >   */
> >  static void cleanup_glue_dir(struct device *dev, struct kobject *glue_dir)
> >  {
> > +   unsigned int refcount;
> > +
> >     /* see if we live in a "glue" directory */
> >     if (!live_in_glue_dir(glue_dir, dev))
> >             return;
> >  
> >     mutex_lock(&gdp_mutex);
> > -   if (!kobject_has_children(glue_dir))
> > +   refcount = kref_read(&glue_dir->kref);
> > +   if (!kobject_has_children(glue_dir) && !--refcount)
> >             kobject_del(glue_dir);
> >     kobject_put(glue_dir);
> >     mutex_unlock(&gdp_mutex);
> > 
> 
> Folks,
> 
> Please share feedback on the race condition and the patch to
> fix it.

Please relax, we will get to this eventually, it has only been a week...

greg k-h

Reply via email to