[PATCH 57/64] drivers/gpu: use mm locking wrappers

2018-02-04 Thread Davidlohr Bueso
From: Davidlohr Bueso <d...@stgolabs.net> This becomes quite straightforward with the mmrange in place. Those mmap_sem users that don't know about mmrange are updated trivially as the sem is used in the same context of the caller. Signed-off-by: Davidlohr Bueso <dbu...@suse.de> ---

[PATCH 55/64] arch/riscv: use mm locking wrappers

2018-02-04 Thread Davidlohr Bueso
From: Davidlohr Bueso <d...@stgolabs.net> This becomes quite straightforward with the mmrange in place. Signed-off-by: Davidlohr Bueso <dbu...@suse.de> --- arch/riscv/kernel/vdso.c | 5 +++-- arch/riscv/mm/fault.c| 10 +- 2 files changed, 8 insertions(+), 7 deletions(-)

[PATCH 57/64] drivers/gpu: use mm locking wrappers

2018-02-04 Thread Davidlohr Bueso
From: Davidlohr Bueso This becomes quite straightforward with the mmrange in place. Those mmap_sem users that don't know about mmrange are updated trivially as the sem is used in the same context of the caller. Signed-off-by: Davidlohr Bueso --- drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c | 7

[PATCH 55/64] arch/riscv: use mm locking wrappers

2018-02-04 Thread Davidlohr Bueso
From: Davidlohr Bueso This becomes quite straightforward with the mmrange in place. Signed-off-by: Davidlohr Bueso --- arch/riscv/kernel/vdso.c | 5 +++-- arch/riscv/mm/fault.c| 10 +- 2 files changed, 8 insertions(+), 7 deletions(-) diff --git a/arch/riscv/kernel/vdso.c b/arch

[PATCH 58/64] drivers/infiniband: use mm locking wrappers

2018-02-04 Thread Davidlohr Bueso
From: Davidlohr Bueso <d...@stgolabs.net> This becomes quite straightforward with the mmrange in place. Signed-off-by: Davidlohr Bueso <dbu...@suse.de> --- drivers/infiniband/core/umem.c | 16 +--- drivers/infiniband/core/umem_odp.c | 11 ++--

[PATCH 58/64] drivers/infiniband: use mm locking wrappers

2018-02-04 Thread Davidlohr Bueso
From: Davidlohr Bueso This becomes quite straightforward with the mmrange in place. Signed-off-by: Davidlohr Bueso --- drivers/infiniband/core/umem.c | 16 +--- drivers/infiniband/core/umem_odp.c | 11 ++- drivers/infiniband/hw/hfi1/user_pages.c| 15

[PATCH 63/64] mm/mmap: hack drop down_write_nest_lock()

2018-02-04 Thread Davidlohr Bueso
From: Davidlohr Bueso <d...@stgolabs.net> * THIS IS A HACK * Directly call down_write() on i_mmap_rwsem (such that we don't have to convert it to a range lock) Signed-off-by: Davidlohr Bueso <dbu...@suse.de> --- mm/mmap.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)

[PATCH 63/64] mm/mmap: hack drop down_write_nest_lock()

2018-02-04 Thread Davidlohr Bueso
From: Davidlohr Bueso * THIS IS A HACK * Directly call down_write() on i_mmap_rwsem (such that we don't have to convert it to a range lock) Signed-off-by: Davidlohr Bueso --- mm/mmap.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/mm/mmap.c b/mm/mmap.c index

[PATCH 64/64] mm: convert mmap_sem to range mmap_lock

2018-02-04 Thread Davidlohr Bueso
From: Davidlohr Bueso <d...@stgolabs.net> With mmrange now in place and everyone using the mm locking wrappers, we can convert the rwsem to the range locking scheme. Every single user of mmap_sem will use a full range, which means that there is no more parallelism than what we alrea

[PATCH 64/64] mm: convert mmap_sem to range mmap_lock

2018-02-04 Thread Davidlohr Bueso
From: Davidlohr Bueso With mmrange now in place and everyone using the mm locking wrappers, we can convert the rwsem to the range locking scheme. Every single user of mmap_sem will use a full range, which means that there is no more parallelism than what we already had. This is the worst case

[PATCH 18/64] mm/ksm: teach about range locking

2018-02-04 Thread Davidlohr Bueso
From: Davidlohr Bueso <d...@stgolabs.net> Conversion is straightforward as most calls use mmap_sem within the same function context. No changes in semantics. Signed-off-by: Davidlohr Bueso <dbu...@suse.de> --- mm/ksm.c | 40 +++- 1 file changed, 2

[PATCH 18/64] mm/ksm: teach about range locking

2018-02-04 Thread Davidlohr Bueso
From: Davidlohr Bueso Conversion is straightforward as most calls use mmap_sem within the same function context. No changes in semantics. Signed-off-by: Davidlohr Bueso --- mm/ksm.c | 40 +++- 1 file changed, 23 insertions(+), 17 deletions(-) diff --git

[PATCH 01/64] interval-tree: build unconditionally

2018-02-04 Thread Davidlohr Bueso
From: Davidlohr Bueso <d...@stgolabs.net> In preparation for range locking, this patch gets rid of CONFIG_INTERVAL_TREE option as we will unconditionally build it. Signed-off-by: Davidlohr Bueso <dbu...@suse.de> --- drivers/gpu/drm/Kconfig | 2 -- drivers/gpu/drm/i915/Kconfig

[PATCH 13/64] fs/proc: teach about range locking

2018-02-04 Thread Davidlohr Bueso
From: Davidlohr Bueso <d...@stgolabs.net> And use mm locking wrappers -- no change in semantics. Signed-off-by: Davidlohr Bueso <dbu...@suse.de> --- fs/proc/base.c | 33 - fs/proc/task_mmu.c | 22 +++--- fs/proc/task_

[RFC PATCH 00/64] mm: towards parallel address space operations

2018-02-04 Thread Davidlohr Bueso
From: Davidlohr Bueso <d...@stgolabs.net> Hi, This patchset is a new version of both the range locking machinery as well as a full mmap_sem conversion that makes use of it -- as the worst case scenario as all mmap_sem calls are converted to a full range mmap_lock equivalent. As such,

[PATCH 08/64] mm: teach lock_page_or_retry() about range locking

2018-02-04 Thread Davidlohr Bueso
From: Davidlohr Bueso <d...@stgolabs.net> The mmap_sem locking rules for lock_page_or_retry() depend on the page being locked upon return, and can get funky. As such we need to teach the function about mmrange, which is passed on via vm_fault. Signed-off-by: Davidlohr Bueso <dbu..

[PATCH 01/64] interval-tree: build unconditionally

2018-02-04 Thread Davidlohr Bueso
From: Davidlohr Bueso In preparation for range locking, this patch gets rid of CONFIG_INTERVAL_TREE option as we will unconditionally build it. Signed-off-by: Davidlohr Bueso --- drivers/gpu/drm/Kconfig | 2 -- drivers/gpu/drm/i915/Kconfig | 1 - lib/Kconfig | 14

[PATCH 13/64] fs/proc: teach about range locking

2018-02-04 Thread Davidlohr Bueso
From: Davidlohr Bueso And use mm locking wrappers -- no change in semantics. Signed-off-by: Davidlohr Bueso --- fs/proc/base.c | 33 - fs/proc/task_mmu.c | 22 +++--- fs/proc/task_nommu.c | 22 +- 3 files changed, 44

[RFC PATCH 00/64] mm: towards parallel address space operations

2018-02-04 Thread Davidlohr Bueso
From: Davidlohr Bueso Hi, This patchset is a new version of both the range locking machinery as well as a full mmap_sem conversion that makes use of it -- as the worst case scenario as all mmap_sem calls are converted to a full range mmap_lock equivalent. As such, while there is no improvement

[PATCH 08/64] mm: teach lock_page_or_retry() about range locking

2018-02-04 Thread Davidlohr Bueso
From: Davidlohr Bueso The mmap_sem locking rules for lock_page_or_retry() depend on the page being locked upon return, and can get funky. As such we need to teach the function about mmrange, which is passed on via vm_fault. Signed-off-by: Davidlohr Bueso --- include/linux/pagemap.h | 7

[PATCH 03/64] mm: introduce mm locking wrappers

2018-02-04 Thread Davidlohr Bueso
From: Davidlohr Bueso <d...@stgolabs.net> This patch adds the necessary wrappers to encapsulate mmap_sem locking and will enable any future changes to be a lot more confined to here. In addition, future users will incrementally be added in the next patches. mm_[read/write]_[un]lock()

[PATCH 03/64] mm: introduce mm locking wrappers

2018-02-04 Thread Davidlohr Bueso
From: Davidlohr Bueso This patch adds the necessary wrappers to encapsulate mmap_sem locking and will enable any future changes to be a lot more confined to here. In addition, future users will incrementally be added in the next patches. mm_[read/write]_[un]lock() naming is used. Signed-off

[PATCH 05/64] mm,khugepaged: prepare passing of rangelock field to vm_fault

2018-02-04 Thread Davidlohr Bueso
From: Davidlohr Bueso <d...@stgolabs.net> When collapsing huge pages from swapin, a vm_fault structure is built and passed to do_swap_page(). The new range field of the vm_fault structure must be set correctly when dealing with range_lock. We teach the main workhorse, khugepaged_scan_m

[PATCH 05/64] mm,khugepaged: prepare passing of rangelock field to vm_fault

2018-02-04 Thread Davidlohr Bueso
From: Davidlohr Bueso When collapsing huge pages from swapin, a vm_fault structure is built and passed to do_swap_page(). The new range field of the vm_fault structure must be set correctly when dealing with range_lock. We teach the main workhorse, khugepaged_scan_mm_slot(), to pass on a full

[PATCH 15/64] ipc: use mm locking wrappers

2018-02-04 Thread Davidlohr Bueso
From: Davidlohr Bueso <d...@stgolabs.net> This is straightforward as the necessary syscalls already know about mmrange. No change in semantics. Signed-off-by: Davidlohr Bueso <dbu...@suse.de> --- ipc/shm.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a

[PATCH 15/64] ipc: use mm locking wrappers

2018-02-04 Thread Davidlohr Bueso
From: Davidlohr Bueso This is straightforward as the necessary syscalls already know about mmrange. No change in semantics. Signed-off-by: Davidlohr Bueso --- ipc/shm.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/ipc/shm.c b/ipc/shm.c index 6c29c791c7f2

[PATCH 02/64] Introduce range reader/writer lock

2018-02-04 Thread Davidlohr Bueso
From: Davidlohr Bueso <d...@stgolabs.net> This implements a sleepable range rwlock, based on interval tree, serializing conflicting/intersecting/overlapping ranges within the tree. The largest range is given by [0, ~0] (inclusive). Unlike traditional locks, range locking involves d

[PATCH 07/64] mm/hugetlb: teach hugetlb_fault() about range locking

2018-02-04 Thread Davidlohr Bueso
From: Davidlohr Bueso <d...@stgolabs.net> Such that we can pass the mmrange along to vm_fault for page in userfault range (handle_userfault()) which gets funky with mmap_sem - just look at the locking rules. Signed-off-by: Davidlohr Bueso <dbu...@suse.de> --- include/linux/hu

[PATCH 07/64] mm/hugetlb: teach hugetlb_fault() about range locking

2018-02-04 Thread Davidlohr Bueso
From: Davidlohr Bueso Such that we can pass the mmrange along to vm_fault for page in userfault range (handle_userfault()) which gets funky with mmap_sem - just look at the locking rules. Signed-off-by: Davidlohr Bueso --- include/linux/hugetlb.h | 9 + mm/gup.c| 3

[PATCH 02/64] Introduce range reader/writer lock

2018-02-04 Thread Davidlohr Bueso
From: Davidlohr Bueso This implements a sleepable range rwlock, based on interval tree, serializing conflicting/intersecting/overlapping ranges within the tree. The largest range is given by [0, ~0] (inclusive). Unlike traditional locks, range locking involves dealing with the tree itself
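The serialization rule described in this entry (conflicting ranges wait, the largest range being [0, ~0] inclusive) can be illustrated with a minimal user-space sketch. It is not the patch's code: `struct range`, `ranges_overlap()` and `count_conflicts()` are hypothetical names, a linear scan stands in for the interval tree lookup, and every holder is treated as exclusive, so the reader/writer distinction is ignored.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical inclusive range [start, last]; names are illustrative only. */
struct range {
    unsigned long start;
    unsigned long last;
};

/* The full range [0, ~0UL]; it intersects every other range. */
static const struct range FULL_RANGE = { 0UL, ~0UL };

/* Two inclusive ranges intersect when neither ends before the other begins. */
static bool ranges_overlap(const struct range *a, const struct range *b)
{
    return a->start <= b->last && b->start <= a->last;
}

/*
 * Count how many already-held ranges a new request would have to wait for.
 * A linear scan stands in for the interval tree lookup (assumption).
 */
static size_t count_conflicts(const struct range *held, size_t n,
                              const struct range *req)
{
    size_t i, blocked = 0;

    for (i = 0; i < n; i++)
        if (ranges_overlap(&held[i], req))
            blocked++;
    return blocked;
}
```

With held ranges [0, 9] and [20, 29], a request for [10, 19] conflicts with nothing, while a full-range request conflicts with both. That is why converting every mmap_sem user to a full range gives no more parallelism than the rwsem did.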

[PATCH 09/64] mm/mmu_notifier: teach oom reaper about range locking

2018-02-04 Thread Davidlohr Bueso
From: Davidlohr Bueso <d...@stgolabs.net> Also begin using mm_is_locked() wrappers (which is sometimes the only reason why mm_has_blockable_invalidate_notifiers() needs to be aware of the range passed back in oom_reap_task_mm()). Signed-off-by: Davidlohr Bueso <dbu...@suse.de> --- i

[PATCH 09/64] mm/mmu_notifier: teach oom reaper about range locking

2018-02-04 Thread Davidlohr Bueso
From: Davidlohr Bueso Also begin using mm_is_locked() wrappers (which is sometimes the only reason why mm_has_blockable_invalidate_notifiers() needs to be aware of the range passed back in oom_reap_task_mm()). Signed-off-by: Davidlohr Bueso --- include/linux/mmu_notifier.h | 6 -- mm

[PATCH 04/64] mm: add a range parameter to the vm_fault structure

2018-02-04 Thread Davidlohr Bueso
From: Davidlohr Bueso <d...@stgolabs.net> When handling a page fault, it happens that the mmap_sem is released during the processing. As moving to range lock requires passing the range parameter to the lock/unlock operation, this patch adds a pointer to the range structure used when l

[PATCH 04/64] mm: add a range parameter to the vm_fault structure

2018-02-04 Thread Davidlohr Bueso
From: Davidlohr Bueso When handling a page fault, it happens that the mmap_sem is released during the processing. As moving to range lock requires passing the range parameter to the lock/unlock operation, this patch adds a pointer to the range structure used when locking the mmap_sem to vm_fault

Re: [PATCH] IB/mthca: Fix how mthca_map_user_db() calls gup

2018-01-25 Thread Davidlohr Bueso
possible. Furthermore this is safe wrt to the mutex as other callers that take the lock (unmap and alloc_db) are not called under mmap_sem (hence possible deadlock). Signed-off-by: Davidlohr Bueso <dbu...@suse.de> --- drivers/infiniband/hw/mthca/mthca_memfree.c | 2 +- 1 file changed, 1 insertion

Re: [PATCH] IB/mthca: Fix how mthca_map_user_db() calls gup

2018-01-25 Thread Davidlohr Bueso
possible. Furthermore this is safe wrt to the mutex as other callers that take the lock (unmap and alloc_db) are not called under mmap_sem (hence possible deadlock). Signed-off-by: Davidlohr Bueso --- drivers/infiniband/hw/mthca/mthca_memfree.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) d

[PATCH] IB/mthca: Fix how mthca_map_user_db() calls gup

2018-01-23 Thread Davidlohr Bueso
<v...@zeniv.linux.org.uk> Signed-off-by: Davidlohr Bueso <dbu...@suse.de> --- - Compile tested only. - Should I be wrong about no callers already holding mmap_sem, I still think calling gup without the mutex makes sense for improved parallelism. Now, if callers can hold the mmap_sem, it's

[PATCH] IB/mthca: Fix how mthca_map_user_db() calls gup

2018-01-23 Thread Davidlohr Bueso
-by: Davidlohr Bueso --- - Compile tested only. - Should I be wrong about no callers already holding mmap_sem, I still think calling gup without the mutex makes sense for improved parallelism. Now, if callers can hold the mmap_sem, it's wrong to do copy_from_user right before calling

[PATCH] ia64/err-inject: Use get_user_pages_fast()

2018-01-22 Thread Davidlohr Bueso
At the point of sysfs callback, the call to gup is done without mmap_sem (or any lock for that matter). This is racy. As such, use the get_user_pages_fast() alternative and safely avoid taking the lock, if possible. Signed-off-by: Davidlohr Bueso <dbu...@suse.de> --- arch/ia64/

[PATCH] ia64/err-inject: Use get_user_pages_fast()

2018-01-22 Thread Davidlohr Bueso
At the point of sysfs callback, the call to gup is done without mmap_sem (or any lock for that matter). This is racy. As such, use the get_user_pages_fast() alternative and safely avoid taking the lock, if possible. Signed-off-by: Davidlohr Bueso --- arch/ia64/kernel/err_inject.c | 2 +- 1 file

Re: [PATCH] arch/cris: use get_user_pages_fast()

2018-01-21 Thread Davidlohr Bueso
On Sun, 21 Jan 2018, Al Viro wrote: On Sun, Jan 21, 2018 at 02:59:29PM -0800, Davidlohr Bueso wrote: Since we currently hold mmap_sem across both gup calls (and nothing more), we can substitute it with two _fast() alternatives and possibly avoid grabbing the lock. This was found while adding

[PATCH] arch/cris: use get_user_pages_fast()

2018-01-21 Thread Davidlohr Bueso
-off-by: Davidlohr Bueso <dbu...@suse.de> --- arch/cris/arch-v32/drivers/cryptocop.c | 29 ++--- 1 file changed, 10 insertions(+), 19 deletions(-) diff --git a/arch/cris/arch-v32/drivers/cryptocop.c b/arch/cris/arch-v32/drivers/cryptocop.c index d688fe117dca..76f8d3

[PATCH] arch/cris: use get_user_pages_fast()

2018-01-21 Thread Davidlohr Bueso
-off-by: Davidlohr Bueso --- arch/cris/arch-v32/drivers/cryptocop.c | 29 ++--- 1 file changed, 10 insertions(+), 19 deletions(-) diff --git a/arch/cris/arch-v32/drivers/cryptocop.c b/arch/cris/arch-v32/drivers/cryptocop.c index d688fe117dca..76f8d3b1d39e 100644 --- a/arch

Re: [lkp-robot] [perf machine] 8edf8850d5: stderr./usr/src/linux-perf-x86_64-rhel-#/tools/perf/util/rb_resort.h:#:#:error:passing_argument#of'threads_sorted__new'from_incompatible_pointer_type[-Werro

2018-01-05 Thread Davidlohr Bueso
On Thu, 04 Jan 2018, kernel test robot wrote: [ 68.830934] /usr/src/linux-perf-x86_64-rhel-7.2-8edf8850d51e911a35b5d7aad4f8604db11abc66/tools/perf/util/rb_resort.h:148:28: error: passing argument 1 of 'threads_sorted__new' from incompatible pointer type [-Werror=incompatible-pointer-types]

Re: [PATCH 2/2] lib: cleanup dead code from rbtree_test.c

2017-12-17 Thread Davidlohr Bueso
On Sun, 17 Dec 2017, Pravin Shedge wrote: lib/rbtree_test.c code allows to compile either as a loadable modules or builtin into the kernel. Current code returns -EAGAIN on successful termination from module_init. Such a fail will directly unload the module and hence there is no scope to

Re: [PATCH 2/3] perf bench futex: Add --affine-wakers option to wake-parallel

2017-12-06 Thread Davidlohr Bueso
Hi, any reason this patch didn't make it into -tip? Thanks, Davidlohr

[tip:perf/core] perf bench futex: Use cpumaps

2017-12-06 Thread tip-bot for Davidlohr Bueso
Commit-ID: 3b2323c2c1c4acf8961cfcdddcee9889daaa21e3 Gitweb: https://git.kernel.org/tip/3b2323c2c1c4acf8961cfcdddcee9889daaa21e3 Author: Davidlohr Bueso <d...@stgolabs.net> AuthorDate: Sun, 26 Nov 2017 20:20:59 -0800 Committer: Arnaldo Carvalho de Melo <a...@redhat.com> Com

[tip:perf/core] perf bench futex: Use cpumaps

2017-12-06 Thread tip-bot for Davidlohr Bueso
Commit-ID: 3b2323c2c1c4acf8961cfcdddcee9889daaa21e3 Gitweb: https://git.kernel.org/tip/3b2323c2c1c4acf8961cfcdddcee9889daaa21e3 Author: Davidlohr Bueso AuthorDate: Sun, 26 Nov 2017 20:20:59 -0800 Committer: Arnaldo Carvalho de Melo CommitDate: Thu, 30 Nov 2017 14:02:05 -0300 perf

Re: waitqueue lockdep annotation

2017-12-05 Thread Davidlohr Bueso
On Tue, 05 Dec 2017, Jason Baron wrote: On 12/01/2017 06:03 PM, Christoph Hellwig wrote: True. The patch below survives the amazing complex booting and starting systemd with lockdep enabled test. Do we have something resembling a epoll test suite? I don't think we have any in the kernel

Re: [PATCH 2/2] ipc: Fix ipc data structures inconsistency

2017-12-02 Thread Davidlohr Bueso
On Sat, 02 Dec 2017, Philippe Mikoyan wrote: On Fri, 1 Dec 2017 09:20:07 -0800 Davidlohr Bueso <d...@stgolabs.net> wrote: Hmm yeah that's pretty fishy, also shm_atime = 0, no? Yeah, definitely, other data structure fields can also be inconsistent, and applying not only to shmem, bu

Re: [PATCH 2/2] ipc: Fix ipc data structures inconsistency

2017-12-02 Thread Davidlohr Bueso
On Sat, 02 Dec 2017, Philippe Mikoyan wrote: On Fri, 1 Dec 2017 09:20:07 -0800 Davidlohr Bueso wrote: Hmm yeah that's pretty fishy, also shm_atime = 0, no? Yeah, definitely, other data structure fields can also be inconsistent, and applying not only to shmem, but also to other ipc

Re: [PATCH] rbtree.txt: fix typo in cached rbtree section

2017-12-02 Thread Davidlohr Bueso
I'm happy to ack your patch but you have to send it correctly. You are missing a changelog (although it ought to be small due to the trivial change) as well as you SoB tag. Please consult Documentation/process/submitting-patches.rst. You also need to Cc akpm as he routes such areas to Linus.

Re: [PATCH 2/2] ipc: Fix ipc data structures inconsistency

2017-12-01 Thread Davidlohr Bueso
much the same (being queued)... but that's irrelevant to this patch. I like that you manage to do security and such checks still only under rcu, like all ipc calls work; *_stat() is no longer special. With a nit below: Reviewed-by: Davidlohr Bueso <dbu...@suse.de> diff --git a/ipc/util.c

Re: [PATCH 2/2] ipc: Fix ipc data structures inconsistency

2017-12-01 Thread Davidlohr Bueso
much the same (being queued)... but that's irrelevant to this patch. I like that you manage to do security and such checks still only under rcu, like all ipc calls work; *_stat() is no longer special. With a nit below: Reviewed-by: Davidlohr Bueso diff --git a/ipc/util.c b/ipc/util.c index

Re: [PATCH v8 1/6] lib/dlock-list: Distributed and lock-protected lists

2017-11-29 Thread Davidlohr Bueso
to the list_for_each_entry() and list_for_each_entry_safe() macros respectively. The iteration states are kept in a dlock_list_iter structure that is passed to the iteration macros. Signed-off-by: Waiman Long <long...@redhat.com> Reviewed-by: Jan Kara <j...@suse.cz> Reviewed-by: Davidloh

Re: [PATCH v8 1/6] lib/dlock-list: Distributed and lock-protected lists

2017-11-29 Thread Davidlohr Bueso
to the list_for_each_entry() and list_for_each_entry_safe() macros respectively. The iteration states are kept in a dlock_list_iter structure that is passed to the iteration macros. Signed-off-by: Waiman Long Reviewed-by: Jan Kara Reviewed-by: Davidlohr Bueso

Re: [PATCH v8 0/6] vfs: Use dlock list for SB's s_inodes list

2017-11-29 Thread Davidlohr Bueso
Are you planning on sending a v9 with the discussed changes? afaict: - Drop last two patches - Fix tearing (WRITE/READ_ONCE()) - Reduce barrier usage for dlock_lists_empty() -- which I'll be sending you shortly. Thanks, Davidlohr On Tue, 31 Oct 2017, Waiman Long wrote: v7->v8: - Integrate

Re: [RFC PATCH] ipc, mqueue: lazy call kern_mount_data in new namespaces

2017-11-27 Thread Davidlohr Bueso
This is akpm domain. On Mon, 27 Nov 2017, Giuseppe Scrivano wrote: kern_mount_data is a relatively expensive operation when creating a new IPC namespace, so delay the mount until its first usage when not creating the global namespace. On my machine, the time for creating 1000 new IPC

Re: [PATCH -tip 0/7] tools/perf: Update rbtree implementation and optimize users

2017-11-27 Thread Davidlohr Bueso
On Sun, 26 Nov 2017, Andi Kleen wrote: Applies on today's -tip tree. Please consider for v4.16. No numbers on the improvements? I have not done performance tests per se for perf, but ultimately it will depend a lot on the workload and size of the tree. I had previously measured, on a Xeon

[PATCH 1/3] perf bench futex: Use cpumaps

2017-11-26 Thread Davidlohr Bueso
.@arm.com> Signed-off-by: Davidlohr Bueso <dbu...@suse.de> --- tools/perf/bench/futex-hash.c | 19 --- tools/perf/bench/futex-lock-pi.c | 23 ++- tools/perf/bench/futex-requeue.c | 22 +- tools/perf/bench/futex-wa

[PATCH 1/3] perf bench futex: Use cpumaps

2017-11-26 Thread Davidlohr Bueso
an approach with cpumaps that use an in house flavor. Instead of re-inventing the wheel, I've redone the patch such that we use the perf's util/cpumap.c interface instead. Applies to all futex benchmarks. Cc: Kim Phillips Originally-from: James Yang Signed-off-by: Davidlohr Bueso --- tools

[PATCH 3/3] perf bench futex: sync waker threads

2017-11-26 Thread Davidlohr Bueso
s show they are not all running concurrently because older waker threads finish their task before newer waker threads even start. This patch uses a barrier to better synchronize the waker threads. Cc: Kim Phillips <kim.phill...@arm.com> Signed-off-by: James Yang <james.y...@arm.com> Signed-off-by:

[PATCH 3/3] perf bench futex: sync waker threads

2017-11-26 Thread Davidlohr Bueso
From: James Yang Waker threads in the futex wake-parallel benchmark are started by a loop using pthread_create(). However, there is no synchronization for when the waker threads wake the waiting threads. Comparison of the waker threads' measurement timestamps show they are not all running

[PATCH -tip 0/3] perf bench futex: Improvements

2017-11-26 Thread Davidlohr Bueso
Hi, I'm resending the patches from James Yang and Kim Phillips that improve the perf bench futex benchmarks. Notably: patch1 makes use of util/cpumap.c patch2/3 are the same as the original only that I've split it into two and removed some bogus debug noise. Davidlohr Bueso (3): perf bench

[PATCH 2/3] perf bench futex: Add --affine-wakers option to wake-parallel

2017-11-26 Thread Davidlohr Bueso
e CPUs instead of having the scheduler place them. Cc: Kim Phillips <kim.phill...@arm.com> Signed-off-by: James Yang <james.y...@arm.com> Signed-off-by: Davidlohr Bueso <dbu...@suse.de> --- tools/perf/bench/futex-wake-parallel.c | 18 -- 1 file changed, 16 in

[PATCH 2/3] perf bench futex: Add --affine-wakers option to wake-parallel

2017-11-26 Thread Davidlohr Bueso
the scheduler place them. Cc: Kim Phillips Signed-off-by: James Yang Signed-off-by: Davidlohr Bueso --- tools/perf/bench/futex-wake-parallel.c | 18 -- 1 file changed, 16 insertions(+), 2 deletions(-) diff --git a/tools/perf/bench/futex-wake-parallel.c b/tools/perf/bench/futex

[PATCH 5/7] perf symbols: Use cached rbtrees

2017-11-26 Thread Davidlohr Bueso
At the cost of an extra pointer, we can avoid the O(logN) cost of finding the first element in the tree (smallest node). Signed-off-by: Davidlohr Bueso <dbu...@suse.de> --- tools/perf/builtin-annotate.c | 2 +- tools/perf/builtin-report.c | 2 +- tools/perf/builtin

[PATCH 5/7] perf symbols: Use cached rbtrees

2017-11-26 Thread Davidlohr Bueso
At the cost of an extra pointer, we can avoid the O(logN) cost of finding the first element in the tree (smallest node). Signed-off-by: Davidlohr Bueso --- tools/perf/builtin-annotate.c | 2 +- tools/perf/builtin-report.c | 2 +- tools/perf/builtin-top.c| 4
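The trade-off named in this entry (one extra pointer in exchange for skipping the O(logN) walk to the smallest node) can be shown with a small sketch. It is an illustration under stated assumptions, not the kernel's or perf's rbtree code: `struct node`, `struct root_cached`, `insert_cached()`, `first_cached()` and `first_walk()` are hypothetical names, and a plain unbalanced binary search tree is used so the example stays short.

```c
#include <stddef.h>

/* Hypothetical tree node; a real rbtree node also carries color/parent. */
struct node {
    long key;
    struct node *left, *right;
};

/* Root plus the one extra pointer: the smallest (leftmost) node. */
struct root_cached {
    struct node *root;
    struct node *leftmost;
};

/* Insert into an unbalanced BST, keeping the cached leftmost up to date. */
static void insert_cached(struct root_cached *t, struct node *n)
{
    struct node **link = &t->root;
    int is_leftmost = 1;

    n->left = n->right = NULL;
    while (*link) {
        if (n->key < (*link)->key) {
            link = &(*link)->left;
        } else {
            link = &(*link)->right;
            is_leftmost = 0; /* went right at least once: not the minimum */
        }
    }
    *link = n;
    if (is_leftmost)
        t->leftmost = n;
}

/* O(1): just read the cached pointer. */
static struct node *first_cached(const struct root_cached *t)
{
    return t->leftmost;
}

/* O(height): walk down the left spine, as a tree without the cache must. */
static struct node *first_walk(const struct root_cached *t)
{
    struct node *n = t->root;

    while (n && n->left)
        n = n->left;
    return n;
}
```

Both lookups return the same node; only the cost differs. Erasing the leftmost node would also have to advance the cached pointer to its successor, which this sketch leaves out.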

[PATCH 2/7] perf machine: Use cached rbtrees

2017-11-26 Thread Davidlohr Bueso
ticing that the rb_erase_init() calls have been replaced by rb_erase_cached() which has no _init() flavor, however, the node is explicitly cleared next anyway, which was redundant until now. Signed-off-by: Davidlohr Bueso <dbu...@suse.de> --- tools/perf/util/build-id.c | 12 +++ too

[PATCH 2/7] perf machine: Use cached rbtrees

2017-11-26 Thread Davidlohr Bueso
ticing that the rb_erase_init() calls have been replaced by rb_erase_cached() which has no _init() flavor, however, the node is explicitly cleared next anyway, which was redundant until now. Signed-off-by: Davidlohr Bueso --- tools/perf/util/build-id.c | 12 +++ tools/perf/util/machine.c

[PATCH 4/7] perf util: Use cached rbtree for rblists

2017-11-26 Thread Davidlohr Bueso
probes, and buildid. Signed-off-by: Davidlohr Bueso <dbu...@suse.de> --- tools/perf/util/intlist.h | 2 +- tools/perf/util/metricgroup.c | 2 +- tools/perf/util/rblist.c | 30 ++ tools/perf/util/rblist.h | 2 +- tools/perf/util/stat-shadow.

[PATCH 4/7] perf util: Use cached rbtree for rblists

2017-11-26 Thread Davidlohr Bueso
probes, and buildid. Signed-off-by: Davidlohr Bueso --- tools/perf/util/intlist.h | 2 +- tools/perf/util/metricgroup.c | 2 +- tools/perf/util/rblist.c | 30 ++ tools/perf/util/rblist.h | 2 +- tools/perf/util/stat-shadow.c | 2 +- tools/perf/util

[PATCH 6/7] perf hist: Use cached rbtrees

2017-11-26 Thread Davidlohr Bueso
hist::entries hist::entries_collapsed hist_entry::hroot_in hist_entry::hroot_out Signed-off-by: Davidlohr Bueso <dbu...@suse.de> --- tools/perf/builtin-annotate.c | 2 +- tools/perf/builtin-c2c.c | 6 +- tools/perf/builtin-diff.c | 10 +- tools/perf/builtin

[PATCH 6/7] perf hist: Use cached rbtrees

2017-11-26 Thread Davidlohr Bueso
hist::entries hist::entries_collapsed hist_entry::hroot_in hist_entry::hroot_out Signed-off-by: Davidlohr Bueso --- tools/perf/builtin-annotate.c | 2 +- tools/perf/builtin-c2c.c | 6 +- tools/perf/builtin-diff.c | 10 +- tools/perf/builtin-top.c | 2 +- tools

[PATCH -tip 0/7] tools/perf: Update rbtree implementation and optimize users

2017-11-26 Thread Davidlohr Bueso
be a little tricky depending on if the user get smart about the trees). I'm sorry if some patches seem too big, I've tried to split them the best I could. Applies on today's -tip tree. Please consider for v4.16. Thanks! Davidlohr Bueso (7): tools/perf: Update rbtree implementation perf machine

[PATCH 7/7] perf sched: Use cached rbtrees

2017-11-26 Thread Davidlohr Bueso
At the cost of an extra pointer, we can avoid the O(logN) cost of finding the first element in the tree (smallest node), which is something heavily required for perf-sched. Signed-off-by: Davidlohr Bueso <dbu...@suse.de> --- tools/perf/builtin-sched.

[PATCH 7/7] perf sched: Use cached rbtrees

2017-11-26 Thread Davidlohr Bueso
At the cost of an extra pointer, we can avoid the O(logN) cost of finding the first element in the tree (smallest node), which is something heavily required for perf-sched. Signed-off-by: Davidlohr Bueso --- tools/perf/builtin-sched.c | 45 + 1 file

[PATCH 3/7] perf callchain: Use cached rbtrees

2017-11-26 Thread Davidlohr Bueso
At the cost of an extra pointer, we can avoid the O(logN) cost of finding the first element in the tree (smallest node), which is something required for nearly every srcline callchain node deletion (srcline__tree_delete()). Signed-off-by: Davidlohr Bueso <dbu...@suse.de> --- tools/per

[PATCH 3/7] perf callchain: Use cached rbtrees

2017-11-26 Thread Davidlohr Bueso
At the cost of an extra pointer, we can avoid the O(logN) cost of finding the first element in the tree (smallest node), which is something required for nearly every srcline callchain node deletion (srcline__tree_delete()). Signed-off-by: Davidlohr Bueso --- tools/perf/util/dso.c | 2

[PATCH 1/7] tools/perf: Update rbtree implementation

2017-11-26 Thread Davidlohr Bueso
There have been a number of changes in the kernel's rbtree implementation, including loose lockless searching guarantees and rb_root_cached, which later patches will use as an optimization. Signed-off-by: Davidlohr Bueso <dbu...@suse.de> --- tools/include/linux/rbtree.h

[PATCH 1/7] tools/perf: Update rbtree implementation

2017-11-26 Thread Davidlohr Bueso
There have been a number of changes in the kernel's rbtree implementation, including loose lockless searching guarantees and rb_root_cached, which later patches will use as an optimization. Signed-off-by: Davidlohr Bueso --- tools/include/linux/rbtree.h | 50 -- tools/include

Re: [PATCH 1/3] perf bench futex: benchmark only online CPUs

2017-11-24 Thread Davidlohr Bueso
On Thu, 23 Nov 2017, Arnaldo Carvalho de Melo wrote: On Thu, Nov 23, 2017 at 12:09:48PM -0300, Arnaldo Carvalho de Melo wrote: On Wed, Nov 22, 2017 at 06:25:28PM -0600, Kim Phillips wrote: > From: James Yang > > The "perf bench futex" benchmarks have a problem when

Re: [PATCH 1/3] perf bench futex: benchmark only online CPUs

2017-11-24 Thread Davidlohr Bueso
On Thu, 23 Nov 2017, Arnaldo Carvalho de Melo wrote: On Thu, Nov 23, 2017 at 12:09:48PM -0300, Arnaldo Carvalho de Melo wrote: On Wed, Nov 22, 2017 at 06:25:28PM -0600, Kim Phillips wrote: > From: James Yang > > The "perf bench futex" benchmarks have a problem when not all CPUs in > the

Re: [PATCH] mm: Use vma_pages helper

2017-11-22 Thread Davidlohr Bueso
com> Acked-by: Davidlohr Bueso <dbu...@suse.de> ... but you missed akpm.

Re: [PATCH] mm: Use vma_pages helper

2017-11-22 Thread Davidlohr Bueso
On Wed, 22 Nov 2017, Vasyl Gomonovych wrote: Use vma_pages function on vma object instead of explicit computation. mm/interval_tree.c:21:27-33: WARNING: Consider using vma_pages helper Generated by: scripts/coccinelle/api/vma_pages.cocci Signed-off-by: Vasyl Gomonovych Acked-by: Davidlohr
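
For readers unfamiliar with the helper, the change amounts to replacing an open-coded size calculation with a named function. The sketch below is a userspace stand-in, not the kernel source: demo_vma, demo_vma_pages and DEMO_PAGE_SHIFT are hypothetical names, and a 4 KiB page size is assumed purely for illustration.

```c
/* Assumption for this sketch: 4 KiB pages (shift of 12). */
#define DEMO_PAGE_SHIFT 12

/* Hypothetical stand-in for the two fields the helper needs. */
struct demo_vma {
	unsigned long vm_start;	/* first byte of the mapping */
	unsigned long vm_end;	/* first byte past the mapping */
};

/* Number of pages spanned by the mapping. */
static inline unsigned long demo_vma_pages(const struct demo_vma *vma)
{
	return (vma->vm_end - vma->vm_start) >> DEMO_PAGE_SHIFT;
}
```

With this in place, a call such as demo_vma_pages(&vma) replaces the explicit `(vma.vm_end - vma.vm_start) >> DEMO_PAGE_SHIFT` at each call site, which is the kind of substitution the coccinelle warning in the quoted message asks for.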

Re: [PATCH v4] lib/dlock-list: Scale dlock_lists_empty()

2017-11-09 Thread Davidlohr Bueso
On Wed, 08 Nov 2017, Boqun Feng wrote: Or worse: * CPU0 CPU1 * dlock_list_add() dlock_lists_empty() *smp_mb__before_atomic(); *[L]

Re: [PATCH] locktorture: Fix Oops when reader/writer count is 0

2017-11-09 Thread Davidlohr Bueso
On Wed, 08 Nov 2017, Paul E. McKenney wrote: Jeremy, could you please test Dave's patches and make sure that they work for you? That way I can apply your Tested-by. Dave, any objection to my adding Jeremy's Reported-by to your /201 patch? No, feel free. Thanks, Davidlohr
