On Wed 08-02-17 17:27:58, Vivek Trivedi wrote:
> It has been observed that apps may block in sys_sync for long time if there
> is parallel mount request for large size storage block device in a loaded
> environment.
> 
> For example, sys_sync is reported to be hunged when a large size disk
> (e.g. 4TB/8TB disks) is connected. sys_mount may take time for reading large
> amount of metedata - free space accounting by reading bitmap during mount.
> The larger the volume, the larger the size of the bitmaps read during mount.
> 
> During mount operation s_umount lock is taken by sys_mount. The lock is not
> released till the mount is completed. The sync() process keeps on waiting till
> s_umount is relased by sys_mount.
> 
> Scenario
>         Process1                                                Process2
>         sys_sync()                                              sys_mount()
>         iterate_supers                                          do_mount()
>         ...                                                     
> vfs_kern_mount()
>                                                                 mount_fs() 
> (Filesystem_mount)
>         ...                                                     mount_bdev()
>                                                                 sget()
>                                                                 alloc_super()
>                                                                 
> down_write_nested(&s->s_umount, SINGLE_DEPTH_NESTING); => LOCK HELD by MOUNT
>                                                                 ...
>         down_read(&sb->s_umount); => WAITING FOR LOCK           ...
>         .                                                       ...
>         .                                                       fill_super() 
> -> time TAKING (as per filesystem)
>         .                                                       
> up_write(&sb->s_umount); => LOCK RELEASE by mount process
>         .
>         . STUCKED TILL MOUNT is completed
> 
> Since, the superblock gets added to the list during the mount path 
> alloc_super,
> so the 'sb' is visible in the s_list. But the behaviour of waiting to sync() a
> filesystem which is not active yet, seems ambigous here.
> 
> To avoid this issue, may be we should consider about having to check only
> the ACTIVE filesystem for doing operations from the superblock list.
> 'sb' is valid and referencable as long as part of the s_list and MS_ACTIVE is
> set after successful mount and cleared during the umount path from
> generic_shutdown_super.
> 
> ---
> Fixed a typo and updated the condition in iterate_supers.

Changelog of a patch belongs under the diffstat so that it does not get
into the final commit log.

> Signed-off-by: Vivek Trivedi <[email protected]>
> Reviewed-by: Amit Sahrawat <[email protected]>
> ---
>  fs/super.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/fs/super.c b/fs/super.c
> index b8b6a08..01711a4 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -587,6 +587,10 @@ void iterate_supers(void (*f)(struct super_block *, void 
> *), void *arg)
>       list_for_each_entry(sb, &super_blocks, s_list) {
>               if (hlist_unhashed(&sb->s_instances))
>                       continue;
> +
> +             if (!(sb->s_flags & MS_ACTIVE))
> +                     continue;
> +

So I have two comments to this:

1) Using MS_BORN would be more appropriate here - and consistent with the
fact that we do check it in iterate_supers(), just under s_umount.

2) iterate_supers_type() should be updated as well to keep them consistent.

Otherwise the patch looks OK to me.

                                                                Honza
-- 
Jan Kara <[email protected]>
SUSE Labs, CR

Reply via email to