On Fri, Nov 11, 2016 at 12:06 PM, Miklos Szeredi <mik...@szeredi.hu> wrote:
> On Fri, Nov 11, 2016 at 10:46 AM, Konstantin Khlebnikov
> <koc...@gmail.com> wrote:
>> On Fri, Nov 11, 2016 at 1:56 AM, Amir Goldstein <amir7...@gmail.com> wrote:
>>> On Mon, Nov 7, 2016 at 3:38 PM, Amir Goldstein <amir7...@gmail.com> wrote:
>>>> On Mon, Nov 7, 2016 at 12:08 PM, Konstantin Khlebnikov <koc...@gmail.com>
>>>> wrote:
>>>>> On Mon, Nov 7, 2016 at 1:04 PM, Miklos Szeredi <mik...@szeredi.hu> wrote:
>>>>>> On Mon, Nov 7, 2016 at 10:58 AM, Konstantin Khlebnikov
>>>>>> <koc...@gmail.com> wrote:
>>>>>>
>>>>>>> I've stumbled on somehow related problem - concurrent copy-ups are
>>>>>>> strictly serialized by rename locks.
>>>>>>> Obviously, file copying could be done in parallel: locks are required
>>>>>>> only for final rename.
>>>>>>> Because of that overlay slower that aufs for some workloads.
>>>>>>
>>>>>> Easy to fix: for each copy up create a separate subdir of "work".
>>>>>> Then the contention is only for the time of creating the subdir, which
>>>>>> is very short.
>>>>>
>>>>> Yeah, but lock_rename() also takes per-sb s_vfs_rename_mutex (kludge by
>>>>> Al Viro)
>>>>> I think proper synchronization for concurrent copy-up (for example
>>>>> round flag on ovl_entry) and locking rename only for rename could be
>>>>> better.
>>>>
>>>> Removing s_vfs_rename_mutex from copy-up path is something I have been
>>>> pondering about.
>>>> Assuming that I understand Al's comment above vfs_rename() correctly,
>>>> the sole purpose of per-sb serialization is to prevent loop creations.
>>>> However, how can one create a loop by moving a non-directory?
>>>> So it looks like at least for the non-dir copy up case, a much finer
>>>> grained lock is in order.
>>>>
>>>
>>>
>>> I posted patches to relax the s_vfs_rename_mutex for copy-up and
>>> whiteout in some use cases.
>>>
>>> Konstantin,
>>>
>>> It would be useful to know if those patches help with your use case.
>>>
>>
>> Well.. I think relaxing only s_vfs_rename_mutex wouldn't help much here.
>> Copying is still serialized by i_mutex on workdir?
>> Data copying should be done without rename locks at all.
>
> We do need something to prevent multiple copy-ups starting up in
> parallel on the same file, though.
>

I guess an inode_lock on the copy-up victim should suffice?
I will look into it as soon as I am done with profiling.

So far I have run only 2 rm -rf threads on 2 different overlay mounts on
the same underlying fs, and s_vfs_rename_mutex was contended about 4% of
the time.

In this test, copy-up is not dominant - only ~2% for the directory
copy-ups, but vfs_whiteouts take 20% and the vfs_rename itself 10%, both
with s_vfs_rename_mutex held.

Amir.
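
For illustration only, here is a rough userspace model of the locking
scheme discussed in this thread: a per-file lock on the copy-up victim
keeps two copy-ups of the same file from running at once, the data copy
runs without the rename lock, and the rename lock is held only for the
final rename. It is not kernel code. Every name in it (rename_lock,
victim_lock, copy_up, demo) is hypothetical, and a single Python lock
stands in for everything lock_rename() takes.

```python
import os
import shutil
import tempfile
import threading

# Hypothetical stand-in for the locks taken by lock_rename(); not a kernel API.
rename_lock = threading.Lock()

# Hypothetical per-file locks, standing in for a lock on the copy-up victim.
_victim_locks = {}
_victim_locks_guard = threading.Lock()


def victim_lock(name):
    """Return the lock for one file, creating it on first use."""
    with _victim_locks_guard:
        return _victim_locks.setdefault(name, threading.Lock())


def copy_up(name, lower, upper, work):
    """Copy lower/name to upper/name. Return True if this call did the copy."""
    dst = os.path.join(upper, name)
    with victim_lock(name):
        if os.path.exists(dst):
            return False  # another thread already copied this file up
        # Private temp file in the work dir, so concurrent copies never collide.
        fd, tmp = tempfile.mkstemp(dir=work)
        os.close(fd)
        # The slow part: data copy, done without the rename lock.
        shutil.copyfile(os.path.join(lower, name), tmp)
        # The short part: only the final rename is serialized.
        with rename_lock:
            os.rename(tmp, dst)
        return True


def demo(n_files=4, threads_per_file=3):
    """Race several threads on each file and report what happened."""
    with tempfile.TemporaryDirectory() as root:
        lower, upper, work = (os.path.join(root, d)
                              for d in ("lower", "upper", "work"))
        for d in (lower, upper, work):
            os.mkdir(d)
        for i in range(n_files):
            with open(os.path.join(lower, f"f{i}"), "w") as f:
                f.write(f"data{i}")

        results = []
        results_guard = threading.Lock()

        def worker(name):
            did_copy = copy_up(name, lower, upper, work)
            with results_guard:
                results.append((name, did_copy))

        threads = [threading.Thread(target=worker, args=(f"f{i}",))
                   for i in range(n_files)
                   for _ in range(threads_per_file)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        return results, sorted(os.listdir(upper)), os.listdir(work)
```

Calling demo() races three threads on each of four files. In this model,
copies of different files overlap freely, while each file is copied
exactly once and nothing is left behind in the work dir.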