On 5/1/19 5:29 PM, Prateek Sood wrote:
> While loading firmware blobs parallely in different threads, it is possible
> to free sysfs node of glue_dirs in device_del() from a thread while another
> thread is trying to add subdir from device_add() in glue_dirs sysfs node.
> 
>     CPU1                                           CPU2
> fw_load_sysfs_fallback()
>   device_add()
>     get_device_parent()
>       class_dir_create_and_add()
>         kobject_add_internal()
>           create_dir() // glue_dir
> 
>                                            fw_load_sysfs_fallback()
>                                              device_add()
>                                                get_device_parent()
>                                                  kobject_get() //glue_dir
> 
>   device_del()
>     cleanup_glue_dir()
>       kobject_del()
> 
>                                                kobject_add()
>                                                  kobject_add_internal()
>                                                    create_dir() // in glue_dir
>                                                      kernfs_create_dir_ns()
> 
>        sysfs_remove_dir() //glue_dir->sd=NULL
>        sysfs_put() // free glue_dir->sd
> 
>                                                        kernfs_new_node()
>                                                          kernfs_get(glue_dir)
> 
> Fix this race by making sure that kernfs_node for glue_dir is released only
> when refcount for glue_dir kobj is 1.
> 
> Signed-off-by: Prateek Sood <prs...@codeaurora.org>
> 
> ---
> 
> Changes from v2->v3:
>  - Added patch version change related comments.
> 
> Changes from v1->v2:
>  - Updated callstack from _request_firmware_load() to 
> fw_load_sysfs_fallback().
> 
> 
>  drivers/base/core.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index 4aeaa0c..3955d07 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -1820,12 +1820,15 @@ static inline struct kobject *get_glue_dir(struct 
> device *dev)
>   */
>  static void cleanup_glue_dir(struct device *dev, struct kobject *glue_dir)
>  {
> +     unsigned int refcount;
> +
>       /* see if we live in a "glue" directory */
>       if (!live_in_glue_dir(glue_dir, dev))
>               return;
>  
>       mutex_lock(&gdp_mutex);
> -     if (!kobject_has_children(glue_dir))
> +     refcount = kref_read(&glue_dir->kref);
> +     if (!kobject_has_children(glue_dir) && !--refcount)
>               kobject_del(glue_dir);
>       kobject_put(glue_dir);
>       mutex_unlock(&gdp_mutex);
> 

Folks,

Please share feedback on the race condition and the patch to
fix it.

-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation
Center, Inc., is a member of Code Aurora Forum, a Linux Foundation
Collaborative Project

Reply via email to