Hi,

this patchset makes __do_SAK() hold tasklist_lock for a much shorter time
than it does now. Although this function is executed in process context
and takes tasklist_lock for reading with interrupts enabled, other tasks
may want to take it for writing with interrupts disabled (e.g., forking
tasks), and these tasks may then trigger hard lockups.

I've observed several hard lockups caused by long execution of __do_SAK()
on a node with 200 big containers. A 3.10 kernel is used there, and the
mainline kernel does not differ in this respect, because __do_SAK() has
not changed for a long time. So, the mainline kernel has this problem too.

The patchset proposes two optimizations in __do_SAK(). The first one is
to skip threads that share the previous thread's fd table [2/3].
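
To illustrate the idea behind the first optimization, here is a minimal
user-space sketch. It is not the code from patch [2/3]: struct files_model,
struct thread_model and scan_threads() are hypothetical stand-ins for the
kernel structures and for the fd table walk. It only shows that a thread
whose fd table pointer equals the previous thread's can be skipped.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the kernel's structures. */
struct files_model {
	int uses_tty;			/* 1 if some fd refers to the tty */
};

struct thread_model {
	struct files_model *files;	/* may be NULL, may be shared */
};

/*
 * Walk n threads and scan an fd table only when it differs from the
 * table of the previously scanned thread. Returns the number of tables
 * scanned and stores the number of tables that use the tty in *hits.
 */
size_t scan_threads(const struct thread_model *threads, size_t n,
		    size_t *hits)
{
	const struct files_model *prev = NULL;
	size_t scanned = 0;
	size_t i;

	assert(threads != NULL || n == 0);
	*hits = 0;

	for (i = 0; i < n; i++) {
		const struct files_model *files = threads[i].files;

		/* No table, or the same table as last time: skip it. */
		if (files == NULL || files == prev)
			continue;

		prev = files;
		scanned++;
		if (files->uses_tty)
			(*hits)++;
	}
	return scanned;
}
```

With many threads per process, most threads share one fd table, so in this
model the number of table walks drops from the number of threads to roughly
the number of distinct tables.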

The second optimization is to iterate the task list under rcu_read_lock().
This allows tasklist_lock to be taken only for a very short time, just to
check that we have reached the end of the task list. See patch [3/3] for
the details.
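
The read-side idea can be sketched in user space as follows. This is not
RCU and not the code from patch [3/3]: it is a simplified model built on
C11 atomics, with hypothetical names (struct task_model, publish_task(),
count_tasks()), and it leaves out removal and grace periods entirely. It
only shows a reader walking a list without taking the lock that writers
use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for an entry on the task list. */
struct task_model {
	int pid;
	_Atomic(struct task_model *) next;
};

/* Hypothetical list head; a plain NULL-terminated list for simplicity. */
static _Atomic(struct task_model *) task_list_head;

/*
 * Writer side: publish a fully initialised entry at the head.
 * Assumption: writers are serialised by some external lock, which is
 * the role tasklist_lock (taken for writing) plays in the kernel.
 */
void publish_task(struct task_model *t, int pid)
{
	struct task_model *old;

	assert(t != NULL);
	old = atomic_load_explicit(&task_list_head, memory_order_relaxed);
	t->pid = pid;
	atomic_store_explicit(&t->next, old, memory_order_relaxed);
	/* Release: the entry's fields are visible before the pointer is. */
	atomic_store_explicit(&task_list_head, t, memory_order_release);
}

/* Reader side: walk the list without taking any lock. */
size_t count_tasks(void)
{
	struct task_model *t;
	size_t n = 0;

	t = atomic_load_explicit(&task_list_head, memory_order_acquire);
	while (t != NULL) {
		n++;
		t = atomic_load_explicit(&t->next, memory_order_acquire);
	}
	return n;
}
```

In this model the reader never blocks a writer, which is the property that
matters for the lockups described above. Real RCU additionally guarantees
that entries removed from the list are not freed while a reader may still
be looking at them.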

v2: All three patches have changed. Now we don't care about races with
    unshare_files() and do not take tasklist_lock on reaching the end
    of the task list. Link to v1: https://lkml.org/lkml/2018/1/11/486

Thanks,
Kirill

---

Kirill Tkhai (2):
      Revert "do_SAK: Don't recursively take the tasklist_lock"
      tty: Use RCU read lock to iterate tasks and threads in __do_SAK()

Oleg Nesterov (1):
      tty: Avoid threads files iterations in __do_SAK()


 drivers/tty/tty_io.c |   41 ++++++++++++++++++++++++++++-------------
 1 file changed, 28 insertions(+), 13 deletions(-)

--
Signed-off-by: Kirill Tkhai <ktk...@virtuozzo.com>
