Re: [PATCH v2] signals: Avoid unnecessary taking of sighand->siglock
On 09/23/2016 03:43 PM, Stas Sergeev wrote:
> 23.09.2016 19:56, Waiman Long wrote:
>> When running a certain database workload on a high-end system with many
>> CPUs, it was found that spinlock contention in the sigprocmask syscalls
>> became a significant portion of the overall CPU cycles as shown below.
> Hi, I was recently facing the same problem, and my solution
> was to extract swapcontext() from libtask - it has better semantics
> and does not do sigprocmask. No matter how much you hack sigprocmask,
> it is still faster to just not call it at all.
> Alternatively, perhaps the speed-up can be achieved
> if the current mask is exported to glibc via vdso.
> Just my 2 cents.

The problem was in third-party software not under our control. I am
just doing my part to try to alleviate the problem from the kernel's
perspective.

Cheers,
Longman
Re: [PATCH v2] signals: Avoid unnecessary taking of sighand->siglock
On 09/23, Waiman Long wrote:
>
> +	/*
> +	 * In case the signal mask hasn't changed, we won't need to take
> +	 * the lock. The current blocked mask can be modified by other CPUs.
> +	 * To be safe, we need to do an atomic read without lock. As a result,
> +	 * this check will only be done on 64-bit architectures.
> +	 */
> +	if ((_NSIG_WORDS == 1) &&
> +	    (READ_ONCE(tsk->blocked.sig[0]) == newset->sig[0]))
> +		return;

so in case you missed my reply to V1, I still think that the comment is wrong
and you should drop the _NSIG_WORDS check.

Oleg.
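For readers following the _NSIG_WORDS point: one way to make the equality check independent of the word count is to compare every word of the mask. The following is only a userspace sketch under stated assumptions. The names toy_sigset_t, TOY_NSIG_WORDS and toy_sigequal are hypothetical, and the two-word layout is an assumption standing in for a 32-bit configuration. It is not code from any posted version of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: two words per mask, as a stand-in for a 32-bit layout. */
#define TOY_NSIG_WORDS 2

/* Hypothetical stand-in for the kernel's sigset_t; not the real type. */
typedef struct {
	unsigned long sig[TOY_NSIG_WORDS];
} toy_sigset_t;

/* Return true when every word of the two masks is identical. */
static bool toy_sigequal(const toy_sigset_t *a, const toy_sigset_t *b)
{
	assert(a != NULL && b != NULL);

	for (size_t i = 0; i < TOY_NSIG_WORDS; i++) {
		if (a->sig[i] != b->sig[i])
			return false;
	}
	return true;
}
```

A word-by-word comparison like this reads the mask in several steps, so it is only meaningful if nothing else modifies the mask during the comparison. Whether that holds for tsk->blocked is the question under discussion in this thread.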
Re: [PATCH v2] signals: Avoid unnecessary taking of sighand->siglock
23.09.2016 19:56, Waiman Long wrote:
> When running a certain database workload on a high-end system with many
> CPUs, it was found that spinlock contention in the sigprocmask syscalls
> became a significant portion of the overall CPU cycles as shown below.
Hi, I was recently facing the same problem, and my solution
was to extract swapcontext() from libtask - it has better semantics
and does not do sigprocmask. No matter how much you hack sigprocmask,
it is still faster to just not call it at all.
Alternatively, perhaps the speed-up can be achieved
if the current mask is exported to glibc via vdso.
Just my 2 cents.
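The "just not call it" idea can be approximated in userspace by remembering the last mask that was installed and skipping the syscall when the requested mask is the same. The sketch below is an illustration only, not the libtask code and not a vdso interface. The names cached_setmask, masks_equal and syscalls_made are hypothetical. It assumes every mask change on the thread goes through this wrapper, otherwise the cached copy goes stale, and it makes no claim that the comparison loop is cheaper than the syscall in practice.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stdbool.h>

/* Per-thread copy of the last mask installed through the wrapper. */
static _Thread_local sigset_t cached_mask;
static _Thread_local bool cache_valid;
/* Counts real sigprocmask() calls, for demonstration only. */
static _Thread_local unsigned long syscalls_made;

/* Compare two signal sets member by member. */
static bool masks_equal(const sigset_t *a, const sigset_t *b)
{
	for (int sig = 1; sig < NSIG; sig++) {
		if (sigismember(a, sig) != sigismember(b, sig))
			return false;
	}
	return true;
}

/*
 * Hypothetical wrapper: install newset as the signal mask, but skip the
 * syscall when the cached mask already matches.
 * Assumption: nothing else on this thread changes the mask.
 */
static int cached_setmask(const sigset_t *newset)
{
	assert(newset != NULL);

	if (cache_valid && masks_equal(&cached_mask, newset))
		return 0;	/* unchanged: no syscall */

	if (sigprocmask(SIG_SETMASK, newset, NULL) != 0)
		return -1;

	syscalls_made++;
	cached_mask = *newset;
	cache_valid = true;
	return 0;
}
```

Calling cached_setmask() twice in a row with the same set reaches sigprocmask() only once.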
[PATCH v2] signals: Avoid unnecessary taking of sighand->siglock
When running a certain database workload on a high-end system with many
CPUs, it was found that spinlock contention in the sigprocmask syscalls
became a significant portion of the overall CPU cycles as shown below.

  9.30%  9.30%  905387  dataserver  /proc/kcore 0x7fff8163f4d2
[k] _raw_spin_lock_irq
            |
            ---_raw_spin_lock_irq
               |
               |--99.34%-- __set_current_blocked
               |          sigprocmask
               |          sys_rt_sigprocmask
               |          system_call_fastpath
               |          |
               |          |--50.63%-- __swapcontext
               |          |          |
               |          |          |--99.91%-- upsleepgeneric
               |          |
               |          |--49.36%-- __setcontext
               |          |          ktskRun

Looking further into the swapcontext function in glibc, it was found
that the function always calls sigprocmask() without checking if there
are changes in the signal mask.

A check was added to the __set_current_blocked() function to avoid
taking the sighand->siglock spinlock if there is no change in the
signal mask. This will prevent unneeded spinlock contention when many
threads are trying to call sigprocmask().

With this patch applied, the spinlock contention in sigprocmask() was
gone. This patch is currently only active for 64-bit architectures.

Signed-off-by: Waiman Long
---
 v1->v2:
  - Fix compiler warning in mips.

 kernel/signal.c | 10 ++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index af21afc..e4296b6 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2485,6 +2485,16 @@ void __set_current_blocked(const sigset_t *newset)
 {
 	struct task_struct *tsk = current;
 
+	/*
+	 * In case the signal mask hasn't changed, we won't need to take
+	 * the lock. The current blocked mask can be modified by other CPUs.
+	 * To be safe, we need to do an atomic read without lock. As a result,
+	 * this check will only be done on 64-bit architectures.
+	 */
+	if ((_NSIG_WORDS == 1) &&
+	    (READ_ONCE(tsk->blocked.sig[0]) == newset->sig[0]))
+		return;
+
 	spin_lock_irq(&tsk->sighand->siglock);
 	__set_task_blocked(tsk, newset);
 	spin_unlock_irq(&tsk->sighand->siglock);
-- 
1.7.1
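The effect of the added check can be modelled outside the kernel. The sketch below is a userspace toy under stated assumptions, not kernel code. The names struct toy_task, toy_lock, toy_unlock and toy_set_blocked are hypothetical, a C11 atomic_flag stands in for sighand->siglock, and a single 64-bit word stands in for blocked.sig[0]. It only shows that an unchanged mask returns before the lock is touched.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct toy_task {
	atomic_flag siglock;		/* stands in for sighand->siglock */
	uint64_t blocked;		/* stands in for blocked.sig[0] */
	unsigned long lock_taken;	/* counts slow-path lock acquisitions */
};

/* Minimal spinlock: spin until the flag was previously clear. */
static void toy_lock(atomic_flag *l)
{
	while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
		;	/* spin */
}

static void toy_unlock(atomic_flag *l)
{
	atomic_flag_clear_explicit(l, memory_order_release);
}

/*
 * Model of the patched path. Assumption: only the owning thread calls
 * this for its own task, so the unlocked read of t->blocked is safe here.
 */
static void toy_set_blocked(struct toy_task *t, uint64_t newset)
{
	assert(t != NULL);

	/* Fast path: mask unchanged, so the lock is never taken. */
	if (t->blocked == newset)
		return;

	toy_lock(&t->siglock);
	t->blocked = newset;
	t->lock_taken++;
	toy_unlock(&t->siglock);
}
```

Repeated calls with the same mask leave lock_taken unchanged, which mirrors the case where swapcontext() keeps installing an identical mask.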