I verified the fix on both the Bionic and Cosmic -proposed kernels, 4.15.0-44.47 and 4.18.0-14.15 respectively.
I don't have a reproducer, but to exercise the code paths modified by the patches, I took the following approach:

(a) Open an ssh connection to the host/test machine and run the following there:

  DIR="/sys/kernel/debug/tracing"
  echo tty_reopen > $DIR/set_ftrace_filter
  echo function > $DIR/current_tracer
  echo 'p:tty_name n_tty_receive_buf2 tty=+0x170(%di):string' > $DIR/kprobe_events
  echo 1 > $DIR/events/kprobes/tty_name/enable
  echo > $DIR/trace

Then start running the following loop:

  $ while true; do pkill -9 -t pts/1; sleep 1; done

At this point there is no pts/1 yet, but keep the loop running.

(b) In another terminal on the ssh client, run:

  $ while true; do ssh <host/test machine ip>; done

It helps to have the following in the .ssh/config of the ssh client machine, so that only one ssh connection is kept open:

  Host <test/host machine alias>
    ControlMaster auto
    ControlPath ~/.ssh/%r@%h-%p
    ControlPersist 600

(c) While the ssh session on pts/1 is opened and killed automatically (and reopened by the loop), keep typing on the keyboard in that terminal to force the tty flush.

(d) After running this for a few seconds, the trace output shows that the functions modified by the main patch in the SRUed series are being exercised:

  $ grep "pts1\|reopen" $DIR/trace|cut -f2- -d]|cut -f2- -d:|sort |uniq -c
     66 tty_name: (n_tty_receive_buf2+0x0/0x20) tty="pts1"
     60 tty_reopen <-tty_open

The trace file also shows that calls to the two functions are interleaved:

  [...]
  kworker/u56:1-3602 [000] .... 881.779225: tty_name: (n_tty_receive_buf2+0x0/0x20) tty="pts1"
  kworker/u56:1-3602 [000] .... 881.861901: tty_name: (n_tty_receive_buf2+0x0/0x20) tty="pts1"
  sshd-3403 [023] .... 882.249355: tty_reopen <-tty_open
  bash-4052 [008] .... 882.250432: tty_reopen <-tty_open
  bash-4052 [008] .... 882.250441: tty_reopen <-tty_open
  bash-4052 [008] .... 882.251935: tty_reopen <-tty_open
  kworker/u56:1-3602 [000] .... 882.440866: tty_name: (n_tty_receive_buf2+0x0/0x20) tty="pts1"
  kworker/u56:1-3602 [000] .... 882.482994: tty_name: (n_tty_receive_buf2+0x0/0x20) tty="pts1"
  [...]

It is worth noting that I had run the test on 4.18.0-13 before, and I noticed a small delay on the machine while running the test after updating to the -proposed version, probably due to the locking mechanism added.

** Tags removed: verification-needed-bionic verification-needed-cosmic
** Tags added: verification-done-bionic verification-done-cosmic

-- 
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1791758

Title:
  ldisc crash on reopened tty

Status in linux package in Ubuntu:
  Fix Committed
Status in linux source package in Trusty:
  Won't Fix
Status in linux source package in Xenial:
  Fix Committed
Status in linux source package in Bionic:
  Fix Committed
Status in linux source package in Cosmic:
  Fix Committed

Bug description:
  [Impact]

  * Line discipline code is racy when a buffer is being flushed while the tty is being initialized or reinitialized. For the first problem, there has been an upstream patch since January 2018: b027e2298bd5 ("tty: fix data race between tty_init_dev and flush of buf"), although it is not in Ubuntu kernel 4.4, only in kernels 4.15 and later.

  * For the race between the buffer flush and the tty reopen, a patch addressing the issue was recently merged for 5.0-rc1: 83d817f41070 ("tty: Hold tty_ldisc_lock() during tty_reopen()"). No Ubuntu kernel currently contains this patch, hence this SRU request. The complete upstream patch series is in [0].

  * The approach of both patches is similar: they rely on locking/semaphores to prevent race conditions.
    Some additional patches are necessary to prevent related issues, such as a potential deadlock caused by prioritizing I/O servicing over releasing tty_ldisc_lock(); refer to c96cf923a98d ("tty: Don't block on IO when ldisc change is pending"). All the necessary fixes are grouped in this SRU request.

  * The symptom of the race between the buffer flush and the tty reopen routine is a kernel crash with the following trace:

    BUG: unable to handle kernel paging request at 0000000000002268
    IP: [<addr>] n_tty_receive_buf_common+0x6a/0xae0
    [...]
    Call Trace:
    [<addr>] ? kvm_sched_clock_read+0x1e/0x30
    [<addr>] n_tty_receive_buf2+0x14/0x20
    [<addr>] flush_to_ldisc+0xd5/0x120
    [<addr>] process_one_work+0x156/0x400
    [<addr>] worker_thread+0x11a/0x480
    [...]

  * A kernel crash was collected from a user; the analysis is in comment #4 of this LP.

  [Test Case]

  * It is not trivial to trigger this fault, but the usual recipe is to keep accessing a machine through SSH (or keep killing getty when on an IPMI serial console) and in some way run commands before the terminal is ready on that machine (for example, hacking some echo into ttySx or pts in an infinite loop).

  * We have reports from users who could reproduce this issue in their production environment; with the patches in this SRU request the problem was fixed.

  [Regression Potential]

  * The tty subsystem is highly central and patches in that area are always delicate. For example, the upstream series [0] is a re-spin (V6) due to a hard-to-reproduce issue reported on the PA-RISC architecture, which was found in the V5 iteration [1] but was fixed by the patch c96cf923a98d, present in this SRU request.

  * The patchset [0] has been in the tty-next tree since mid-November, and the patch b027e2298bd5 has been available upstream since January 2018 (it is in both Ubuntu kernels 4.15 and 4.18), so the overall likelihood of regressions is low.
  * These patches were sniff-tested on the 3 versions (4.4, 4.15 and 4.18) and didn't show any issues.

  [0] https://marc.info/?l=linux-kernel&m=154103190111795
  [1] https://marc.info/?l=linux-kernel&m=153737852618183
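
A possible way to quantify the interleaving mentioned in step (d) of the verification comment is sketched below. This is only an illustration: the function name summarize_tty_trace and its output format are made up here, and it assumes the trace line layout shown in that comment (kprobe hits tagged "tty_name:" and function-tracer lines of the form "tty_reopen <-caller"). It reads a trace dump on stdin, counts both kinds of events, and counts how often the stream switches from one kind to the other.

```shell
#!/bin/sh
# Hypothetical helper, not part of the original report.
# Usage: summarize_tty_trace [tty] < trace-file   (tty defaults to pts1)
summarize_tty_trace() {
    tty_name="${1:-pts1}"
    awk -v tty="$tty_name" '
    {
        kind = ""
        # kprobe hit on n_tty_receive_buf2 for the requested tty
        if (index($0, "tty_name:") && index($0, "tty=\"" tty "\""))
            kind = "flush"
        # function-tracer hit on tty_reopen
        else if (index($0, "tty_reopen <-"))
            kind = "reopen"
        if (kind == "")
            next                      # unrelated line, e.g. "[...]" or headers
        count[kind]++
        if (prev != "" && prev != kind)
            switches++                # event kind changed: paths interleave
        prev = kind
    }
    END {
        printf "flush=%d reopen=%d switches=%d\n",
               count["flush"], count["reopen"], switches
    }'
}
```

On the test machine this could be run as root with something like: summarize_tty_trace pts1 < $DIR/trace. A non-zero "switches" value only indicates that both paths ran during the same window; it does not by itself prove that the race was hit.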