Re: [PATCH] Bluetooth: Fix l2cap_sock_teardown_cb race condition with bt_accept_dequeue
Hi Marcel,

> so I am not big fan of the conditional locking in case of parent is set or
> not. Do you have a test case that reproduces the mentioned race. It would
> love to have that in tools/l2cap-tester or similar.

So far I could only reproduce the bug by repeatedly performing RFCOMM
connections and resets. I'll try to implement something in rfcomm-tester or
l2cap-tester.

Since this is a race condition, I'm not confident that I can help you
reproduce the bug reliably on a different test setup. I'd appreciate it very
much if you can offer any tips on triggering a race condition faster in a
test case.

> Maybe the code needs some restructuring to avoid the conditional locking.

I agree that my patch is not very elegant, and I'd love any way to improve
it. I have some ideas, but I'm not familiar enough with kernel development to
know whether other solutions are safe to implement, such as:

* Removing bt_accept_unlink from l2cap_teardown_cb, and relying on
  bt_accept_dequeue to unlink the socket when it's enumerated. Is it safe to
  leave a zapped sock in accept_q?

* Perform "unlock sock; lock parent; lock sock" before calling
  bt_accept_unlink in teardown_cb. This is still conditional locking, but
  around a smaller block of code. Is it safe to unlock a zapped sock?

* Use RCU for handling accept_q. Is this appropriate?

Please let me know what you think.

Regards,
Yichen Zhao
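To make the second idea above concrete, here is a minimal userspace sketch of the "unlock sock; lock parent; lock sock" sequence. It is only a model under stated assumptions: struct model_sock, model_accept_unlink and model_teardown_unlink are hypothetical names, pthread mutexes stand in for lock_sock/release_sock, and the question of keeping the parent alive while the child lock is dropped (a reference would probably be needed in the kernel) is not modelled at all.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct sock; not the kernel type. */
struct model_sock {
	pthread_mutex_t lock;
	struct model_sock *parent;	/* non-NULL while queued on the parent */
	int backlog;			/* models the listener's backlog count */
};

static void model_sock_init(struct model_sock *sk, struct model_sock *parent)
{
	pthread_mutex_init(&sk->lock, NULL);
	sk->parent = parent;
	sk->backlog = 0;
	if (parent)
		parent->backlog++;
}

/* Models bt_accept_unlink; the caller must hold both parent and sk locks. */
static void model_accept_unlink(struct model_sock *sk)
{
	assert(sk->parent != NULL);
	sk->parent->backlog--;
	sk->parent = NULL;
}

/*
 * Called with sk->lock held and returns with sk->lock held.
 * Returns 1 if this call unlinked sk, 0 if there was nothing to unlink.
 */
static int model_teardown_unlink(struct model_sock *sk)
{
	struct model_sock *parent = sk->parent;
	int unlinked = 0;

	if (!parent)
		return 0;

	/* Drop the child lock so the parent can be taken first, matching
	 * the parent-then-child order used on the dequeue side.
	 */
	pthread_mutex_unlock(&sk->lock);
	pthread_mutex_lock(&parent->lock);
	pthread_mutex_lock(&sk->lock);

	/* Re-check: the dequeue side may have unlinked sk while it was
	 * unlocked.
	 */
	if (sk->parent == parent) {
		model_accept_unlink(sk);
		unlinked = 1;
	}

	pthread_mutex_unlock(&parent->lock);
	return unlinked;
}
```

The re-check after re-acquiring the child lock is the part that matters: without it, the window opened by the unlock would allow the same double unlink that the patch is trying to prevent.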
[PATCH] Bluetooth: Fix l2cap_sock_teardown_cb race condition with bt_accept_dequeue
Fix a race condition between l2cap_sock_teardown_cb on an L2CAP socket and
bt_accept_dequeue on its parent socket.

When the race condition is encountered bt_accept_dequeue may call
bt_accept_unlink on an already unlinked socket and result in a NULL pointer
dereference. Even if bt_accept_unlink is not called by bt_accept_dequeue,
bt_accept_unlink called by l2cap_sock_teardown_cb can race with
list_for_each_entry_safe in bt_accept_dequeue, causing the latter to loop
indefinitely on the unlinked socket, until release_sock crashes with a NULL
pointer dereference when the sock pointer is freed.

The race condition is fixed by locking the parent socket in
l2cap_sock_teardown_cb.

[50510.241632] BUG: unable to handle kernel NULL pointer dereference at 01a8
[50510.241694] IP: [] bt_accept_unlink+0x47/0xa0 [bluetooth]
[50510.241759] PGD 0
[50510.241776] Oops: 0002 [#1] SMP
[50510.241802] Modules linked in: rtl8192cu rtl_usb rtlwifi rtl8192c_common 8021q garp stp mrp llc rfcomm bnep nls_iso8859_1 intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp arc4 ath9k ath9k_common ath9k_hw ath kvm eeepc_wmi asus_wmi mac80211 snd_hda_codec_hdmi snd_hda_codec_realtek sparse_keymap crct10dif_pclmul snd_hda_codec_generic crc32_pclmul snd_hda_intel snd_hda_controller cfg80211 snd_hda_codec i915 snd_hwdep snd_pcm ghash_clmulni_intel snd_timer snd soundcore serio_raw cryptd drm_kms_helper drm i2c_algo_bit shpchp ath3k mei_me lpc_ich btusb bluetooth 6lowpan_iphc mei lp parport wmi video mac_hid psmouse ahci libahci r8169 mii
[50510.242279] CPU: 0 PID: 934 Comm: krfcommd Not tainted 3.16.0-49-generic #65~14.04.1-Ubuntu
[50510.242327] Hardware name: ASUSTeK Computer INC. VM40B/VM40B, BIOS 1501 12/09/2014
[50510.242370] task: 8800d9068a30 ti: 8800d7a54000 task.ti: 8800d7a54000
[50510.242413] RIP: 0010:[] [] bt_accept_unlink+0x47/0xa0 [bluetooth]
[50510.242480] RSP: 0018:8800d7a57d58 EFLAGS: 00010246
[50510.242511] RAX: RBX: 880119bb8c00 RCX: 880119bb8eb0
[50510.242552] RDX: 880119bb8eb0 RSI: fe01 RDI: 880119bb8c00
[50510.242592] RBP: 8800d7a57d60 R08: 0283 R09: 0001
[50510.242633] R10: R11: R12: 8800d8da9eb0
[50510.242673] R13: 8800d74fdb80 R14: 880119bb8c00 R15: 8800d8da9c00
[50510.242715] FS: () GS:88011fa0() knlGS:
[50510.242761] CS: 0010 DS: ES: CR0: 80050033
[50510.242794] CR2: 01a8 CR3: 01c13000 CR4: 001407f0
[50510.242835] Stack:
[50510.242849] 880119bb8eb0 8800d7a57da0 c0124506 8800d8da9eb0
[50510.242899] 8800d8da9c00 8800d9068a30 8800d74fdb80
[50510.242949] 8800d6f85208 8800d7a57e08 c0159985 001f
[50510.242999] Call Trace:
[50510.243027] [] bt_accept_dequeue+0xb6/0x180 [bluetooth]
[50510.243085] [] l2cap_sock_accept+0x125/0x220 [bluetooth]
[50510.243128] [] ? wake_up_state+0x20/0x20
[50510.243163] [] kernel_accept+0x4e/0xa0
[50510.243200] [] rfcomm_run+0x1ad/0x890 [rfcomm]
[50510.243238] [] ? rfcomm_process_rx+0x8a0/0x8a0 [rfcomm]
[50510.243281] [] kthread+0xd2/0xf0
[50510.243312] [] ? kthread_create_on_node+0x1c0/0x1c0
[50510.243353] [] ret_from_fork+0x58/0x90
[50510.243387] [] ? kthread_create_on_node+0x1c0/0x1c0
[50510.243424] Code: 00 48 8b 93 b8 02 00 00 48 8d 83 b0 02 00 00 48 89 51 08 48 89 0a 48 89 83 b0 02 00 00 48 89 83 b8 02 00 00 48 8b 83 c0 02 00 00 <66> 83 a8 a8 01 00 00 01 48 c7 83 c0 02 00 00 00 00 00 00 f0 ff
[50510.243685] RIP [] bt_accept_unlink+0x47/0xa0 [bluetooth]
[50510.243737] RSP
[50510.243758] CR2: 01a8
[50510.249457] ---[ end trace bb984f932c4e3ab3 ]---

Signed-off-by: Yichen Zhao
---
 net/bluetooth/l2cap_sock.c | 18 +-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/net/bluetooth/l2cap_sock.c b/net/bluetooth/l2cap_sock.c
index e4cae72..ff1c821 100644
--- a/net/bluetooth/l2cap_sock.c
+++ b/net/bluetooth/l2cap_sock.c
@@ -1307,6 +1307,15 @@ static void l2cap_sock_teardown_cb(struct l2cap_chan *chan, int err)
 	BT_DBG("chan %p state %s", chan, state_to_string(chan->state));
 
+	parent = bt_sk(sk)->parent;
+
+	/* The parent sock must be locked if its state is mutated by
+	 * bt_accept_unlink. It must be locked before sk to maintain the same
+	 * locking order as bt_accept_dequeue.
+	 */
+	if (parent)
+		lock_sock_nested(parent, L2CAP_NESTING_PARENT);
+
 	/* This callback can be called both for server (BT_LISTEN)
 	 * sockets as well as "normal" ones. To avoid lockdep warnings
 	 * with child socket locking (through l2cap_sock_cleanup_listen)
@@ -1316,7 +1325,11 @@ static void l2cap_sock_teardown_cb(struct l2cap_chan *chan, int err)
 	 */
 	lock_sock_nested(sk, atomic_read(&a
[PATCH] Bluetooth: Fix locking in bt_accept_dequeue after disconnection
Fix a crash that may happen when bt_accept_dequeue is run after a Bluetooth
connection has been disconnected. bt_accept_unlink was called after
release_sock, permitting bt_accept_unlink to run twice on the same socket and
cause a NULL pointer dereference.

[50510.241632] BUG: unable to handle kernel NULL pointer dereference at 01a8
[50510.241694] IP: [] bt_accept_unlink+0x47/0xa0 [bluetooth]
[50510.241759] PGD 0
[50510.241776] Oops: 0002 [#1] SMP
[50510.241802] Modules linked in: rtl8192cu rtl_usb rtlwifi rtl8192c_common 8021q garp stp mrp llc rfcomm bnep nls_iso8859_1 intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp arc4 ath9k ath9k_common ath9k_hw ath kvm eeepc_wmi asus_wmi mac80211 snd_hda_codec_hdmi snd_hda_codec_realtek sparse_keymap crct10dif_pclmul snd_hda_codec_generic crc32_pclmul snd_hda_intel snd_hda_controller cfg80211 snd_hda_codec i915 snd_hwdep snd_pcm ghash_clmulni_intel snd_timer snd soundcore serio_raw cryptd drm_kms_helper drm i2c_algo_bit shpchp ath3k mei_me lpc_ich btusb bluetooth 6lowpan_iphc mei lp parport wmi video mac_hid psmouse ahci libahci r8169 mii
[50510.242279] CPU: 0 PID: 934 Comm: krfcommd Not tainted 3.16.0-49-generic #65~14.04.1-Ubuntu
[50510.242327] Hardware name: ASUSTeK Computer INC. VM40B/VM40B, BIOS 1501 12/09/2014
[50510.242370] task: 8800d9068a30 ti: 8800d7a54000 task.ti: 8800d7a54000
[50510.242413] RIP: 0010:[] [] bt_accept_unlink+0x47/0xa0 [bluetooth]
[50510.242480] RSP: 0018:8800d7a57d58 EFLAGS: 00010246
[50510.242511] RAX: RBX: 880119bb8c00 RCX: 880119bb8eb0
[50510.242552] RDX: 880119bb8eb0 RSI: fe01 RDI: 880119bb8c00
[50510.242592] RBP: 8800d7a57d60 R08: 0283 R09: 0001
[50510.242633] R10: R11: R12: 8800d8da9eb0
[50510.242673] R13: 8800d74fdb80 R14: 880119bb8c00 R15: 8800d8da9c00
[50510.242715] FS: () GS:88011fa0() knlGS:
[50510.242761] CS: 0010 DS: ES: CR0: 80050033
[50510.242794] CR2: 01a8 CR3: 01c13000 CR4: 001407f0
[50510.242835] Stack:
[50510.242849] 880119bb8eb0 8800d7a57da0 c0124506 8800d8da9eb0
[50510.242899] 8800d8da9c00 8800d9068a30 8800d74fdb80
[50510.242949] 8800d6f85208 8800d7a57e08 c0159985 001f
[50510.242999] Call Trace:
[50510.243027] [] bt_accept_dequeue+0xb6/0x180 [bluetooth]
[50510.243085] [] l2cap_sock_accept+0x125/0x220 [bluetooth]
[50510.243128] [] ? wake_up_state+0x20/0x20
[50510.243163] [] kernel_accept+0x4e/0xa0
[50510.243200] [] rfcomm_run+0x1ad/0x890 [rfcomm]
[50510.243238] [] ? rfcomm_process_rx+0x8a0/0x8a0 [rfcomm]
[50510.243281] [] kthread+0xd2/0xf0
[50510.243312] [] ? kthread_create_on_node+0x1c0/0x1c0
[50510.243353] [] ret_from_fork+0x58/0x90
[50510.243387] [] ? kthread_create_on_node+0x1c0/0x1c0
[50510.243424] Code: 00 48 8b 93 b8 02 00 00 48 8d 83 b0 02 00 00 48 89 51 08 48 89 0a 48 89 83 b0 02 00 00 48 89 83 b8 02 00 00 48 8b 83 c0 02 00 00 <66> 83 a8 a8 01 00 00 01 48 c7 83 c0 02 00 00 00 00 00 00 f0 ff
[50510.243685] RIP [] bt_accept_unlink+0x47/0xa0 [bluetooth]
[50510.243737] RSP
[50510.243758] CR2: 01a8
[50510.249457] ---[ end trace bb984f932c4e3ab3 ]---

Signed-off-by: Yichen Zhao
---
 net/bluetooth/af_bluetooth.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/bluetooth/af_bluetooth.c b/net/bluetooth/af_bluetooth.c
index a3bffd1..a542b99 100644
--- a/net/bluetooth/af_bluetooth.c
+++ b/net/bluetooth/af_bluetooth.c
@@ -188,8 +188,8 @@ struct sock *bt_accept_dequeue(struct sock *parent, struct socket *newsock)
 		/* FIXME: Is this check still needed */
 		if (sk->sk_state == BT_CLOSED) {
-			release_sock(sk);
 			bt_accept_unlink(sk);
+			release_sock(sk);
 			continue;
 		}
-- 
2.6.0.rc2.230.g3dd15c0
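For readers who want to see why a second bt_accept_unlink on the same socket is fatal, the following is a rough userspace model. It assumes, from my reading of the oops rather than anything stated in the patch, that the unlink decrements a counter reached through the child's parent pointer and then clears that pointer. All names here (model_listener, model_child, model_enqueue, model_accept_unlink) are hypothetical, and the model returns -1 at the point where the kernel would dereference NULL.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the listening socket and a queued child. */
struct model_listener {
	int ack_backlog;		/* count of children waiting for accept */
};

struct model_child {
	struct model_listener *parent;	/* NULL once the child is unlinked */
};

/* Queue a child on the listener, as the enqueue side is assumed to do. */
static void model_enqueue(struct model_listener *parent,
			  struct model_child *sk)
{
	sk->parent = parent;
	parent->ack_backlog++;
}

/*
 * Model of the unlink step. Returns 0 on success, or -1 if sk was already
 * unlinked; the unguarded equivalent would write through a NULL parent.
 */
static int model_accept_unlink(struct model_child *sk)
{
	if (sk->parent == NULL)
		return -1;

	sk->parent->ack_backlog--;
	sk->parent = NULL;
	return 0;
}
```

In this model the first unlink succeeds and the second one hits the NULL parent. Moving bt_accept_unlink ahead of release_sock keeps the unlink inside the region where the child socket is still locked, which is what closes the window for the second call.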