On Thu, Dec 3, 2020 at 2:28 AM Greg KH <gre...@linuxfoundation.org> wrote:
>
> On Wed, Dec 02, 2020 at 10:58:35PM +0800, Fox Chen wrote:
> > Hello,
> >
> > kernfs is an important facility to support pseudo file systems and
> > cgroup. Currently, with a global mutex, reading files concurrently
> > from kernfs (e.g. /sys) is very slow.
> >
> > This problem was reported by Brice Goglin in the thread:
> > Re: [PATCH 1/4] drivers core: Introduce CPU type sysfs interface
> > https://lore.kernel.org/lkml/x60dvjot4furc...@kroah.com/
> >
> > I independently confirmed this on a 96-core AWS c5.metal server.
> > Do open+read+close on /sys/devices/system/cpu/cpu15/topology/core_id
> > 1000 times.
> > With a single thread it takes ~2.5 us for each open+read+close.
> > With one thread per core, 96 threads running simultaneously take
> > 540 us for each of the same operation (without much variation) --
> > 200x slower than the single-threaded case.
> >
> > The problem can only be observed on large machines (>=16 cores).
> > The more cores you have, the slower it can be.
> >
> > Perf shows that CPUs spend most of the time (>80%) waiting on mutex
> > locks in kernfs_iop_permission and kernfs_dop_revalidate.
> >
> > This patchset contains the following 2 patches:
> > 0001-kernfs-replace-the-mutex-in-kernfs_iop_permission-wi.patch
> > 0002-kernfs-remove-mutex-in-kernfs_dop_revalidate.patch
> >
> > 0001 replaces the mutex lock in kernfs_iop_permission with a new
> > rwlock and 0002 removes the mutex lock in kernfs_dop_revalidate.
> >
> > After applying this patchset, the multi-thread performance becomes
> > linear, with the fastest thread at ~30 us and the slowest at ~150 us,
> > very similar to what I measured on a normal ext4 file system, with
> > the fastest at ~20 us and the slowest at ~100 us.
> > And I believe that is largely due to spin_locks in filesystems, which
> > are normal.
> >
> > Although it's still slower than single thread, users can benefit from
> > this patchset, especially ones working in the HPC realm with lots of
> > cpu cores who want to fetch system information from sysfs.
>
> Does this mean that the changes slow down the single-threaded case?  Or
> that it's just not as good as the speed of a single-threaded access?

No, it won't influence the single-threaded case. I meant that the
multi-threaded case is still not as good as the single-threaded one.

> But anyway, thanks so much for looking into this, it should help the
> crazy systems out today, which means the normal systems in 5 years will
> really appreciate this :)

thanks :)
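
For anyone who wants to try the measurement described in the cover
letter, below is a minimal sketch in Python. It is not the program used
for the numbers quoted above; it is an approximation under a few
assumptions. It uses forked processes rather than threads, to keep the
Python interpreter lock out of the timing (the kernfs mutex is global,
so processes should contend in much the same way). The function names
and the start_delay parameter are made up for illustration, and the
interpreter overhead means absolute numbers will be higher than the
~2.5 us figure; only the single vs. parallel ratio is of interest.

```python
import multiprocessing as mp
import os
import time

# Path taken from the cover letter; any small readable file works.
SYSFS_FILE = "/sys/devices/system/cpu/cpu15/topology/core_id"


def time_open_read_close(path, iterations=1000):
    """Return the mean latency in microseconds of one open+read+close."""
    start = time.perf_counter_ns()
    for _ in range(iterations):
        fd = os.open(path, os.O_RDONLY)
        try:
            os.read(fd, 4096)
        finally:
            os.close(fd)
    elapsed_ns = time.perf_counter_ns() - start
    return elapsed_ns / iterations / 1000.0


def _worker(path, iterations, start_at, conn):
    # Busy-wait until the shared start time so that all loops overlap.
    while time.monotonic() < start_at:
        pass
    conn.send(time_open_read_close(path, iterations))
    conn.close()


def run_parallel(path, workers, iterations=1000, start_delay=0.5):
    """Run the loop in `workers` processes at once.

    Returns a list with each worker's mean latency in microseconds.
    `start_delay` (hypothetical knob) must be long enough for all
    workers to be forked before the timed loops begin.
    """
    ctx = mp.get_context("fork")  # assumption: Linux, fork is available
    start_at = time.monotonic() + start_delay
    procs, pipes = [], []
    for _ in range(workers):
        recv_end, send_end = ctx.Pipe(duplex=False)
        proc = ctx.Process(
            target=_worker, args=(path, iterations, start_at, send_end)
        )
        proc.start()
        send_end.close()  # the parent only reads from this pipe
        procs.append(proc)
        pipes.append(recv_end)
    results = [pipe.recv() for pipe in pipes]
    for proc in procs:
        proc.join()
    return results


if __name__ == "__main__":
    if os.path.exists(SYSFS_FILE):
        ncpus = os.cpu_count() or 1
        single = time_open_read_close(SYSFS_FILE)
        parallel = run_parallel(SYSFS_FILE, ncpus)
        print(f"1 worker: {single:.1f} us per open+read+close")
        print(
            f"{ncpus} workers: best {min(parallel):.1f} us, "
            f"worst {max(parallel):.1f} us"
        )
    else:
        print(f"{SYSFS_FILE} not found; pick another CPU's core_id")
```

Running it once on an unpatched kernel and once with the two patches
applied, on a machine with enough cores, should show whether the
per-operation latency under load moves in the direction described
above.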