Package: linux
Version: 4.4.6-1~bpo8+1

We plan to upgrade our servers to 4.4.6 for the networking improvements in the 4.4 series. We are using Debian wheezy and jessie.
During testing, we hit serious deadlocks on some servers. These servers run jessie (and thus systemd). The symptom is that the boot process hangs before the login prompt appears, so systemd itself is hanging. The same servers work well with the 3.16.7-ckt20 and 4.2.6 bpo kernels. We enabled lockdep and lock_stat to get the trace below:

[ 1680.821488] INFO: task systemd:1 blocked for more than 120 seconds.
[ 1680.821533] Tainted: G O 4.4.6 #2
[ 1680.821574] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1680.821634] systemd D ffff8818116f3d30 0 1 0 0x00000000
[ 1680.821690] ffff8818116f3d30 0000000000000007 0000000000000006 ffff8830065d6a58
[ 1680.821771] ffff881811202140 ffff8818116d2040 ffff8818116f4000 0000000000000246
[ 1680.821853] ffffffff81c69f08 ffff8818116d2040 00000000ffffffff ffff8818116f3d48
[ 1680.821941] Call Trace:
[ 1680.821982] [<ffffffff8165f42c>] schedule+0x3c/0x90
[ 1680.822021] [<ffffffff8165f775>] schedule_preempt_disabled+0x15/0x20
[ 1680.822064] [<ffffffff8166117c>] mutex_lock_nested+0x18c/0x3e0
[ 1680.822105] [<ffffffff8113d8be>] ? proc_cgroup_show+0x4e/0x300
[ 1680.822146] [<ffffffff8113d8be>] ? proc_cgroup_show+0x4e/0x300
[ 1680.822184] [<ffffffff8113d8be>] proc_cgroup_show+0x4e/0x300
[ 1680.822228] [<ffffffff812a7e40>] proc_single_show+0x50/0x90
[ 1680.822266] [<ffffffff81258e99>] seq_read+0xe9/0x3c0
[ 1680.822306] [<ffffffff8122f658>] __vfs_read+0x18/0x40
[ 1680.822342] [<ffffffff8122fc69>] vfs_read+0x89/0x130
[ 1680.822382] [<ffffffff81230a69>] SyS_read+0x49/0xb0
[ 1680.822418] [<ffffffff81665db6>] system_call_fast_compare_end+0xc/0x70
[ 1680.822461] 2 locks held by systemd/1:
[ 1680.822494] #0: (&p->lock){+.+.+.}, at: [<ffffffff81258ded>] seq_read+0x3d/0x3c0
[ 1680.822593] #1: (cgroup_mutex){+.+.+.}, at: [<ffffffff8113d8be>] proc_cgroup_show+0x4e/0x300
[ 1680.822689] INFO: task kthreadd:2 blocked for more than 120 seconds.
[ 1680.822726] Tainted: G O 4.4.6 #2
[ 1680.822764] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1680.822824] kthreadd D ffff8818116fbc78 0 2 0 0x00000000
[ 1680.822878] ffff8818116fbc78 ffffffff8166533c ffff8818116f4080 ffff8830061d6a58
[ 1680.822959] ffff8818111f60c0 ffff8818116f4080 ffff8818116fc000 ffffffff82e0ed08
[ 1680.823040] ffffffff82e0ed20 0000000000000000 0000000000000000 ffff8818116fbc90
[ 1680.823123] Call Trace:
[ 1680.823156] [<ffffffff8166533c>] ? _raw_spin_unlock_irq+0x2c/0x40
[ 1680.823197] [<ffffffff8165f42c>] schedule+0x3c/0x90
[ 1680.823236] [<ffffffff8166394e>] rwsem_down_read_failed+0xee/0x140
[ 1680.823275] [<ffffffff8137aa34>] call_rwsem_down_read_failed+0x14/0x30
[ 1680.823321] [<ffffffff810dc0c9>] ? percpu_down_read+0x79/0xa0
[ 1680.823359] [<ffffffff81081b57>] ? copy_process+0x5b7/0x1e40
[ 1680.823400] [<ffffffff81081b57>] copy_process+0x5b7/0x1e40
[ 1680.823437] [<ffffffff810e0bdd>] ? __lock_acquire+0x5cd/0x1e90
[ 1680.823480] [<ffffffff810aa670>] ? kthread_create_on_node+0x250/0x250
[ 1680.823519] [<ffffffff8108354e>] _do_fork+0x7e/0x760
[ 1680.823560] [<ffffffff810ab560>] ? kthreadd+0x1b0/0x280
[ 1680.823596] [<ffffffff810ab560>] ? kthreadd+0x1b0/0x280
[ 1680.823637] [<ffffffff810ab5b3>] ? kthreadd+0x203/0x280
[ 1680.823677] [<ffffffff81083c59>] kernel_thread+0x29/0x30
[ 1680.823718] [<ffffffff810ab5d4>] kthreadd+0x224/0x280
[ 1680.823754] [<ffffffff8166617f>] ? ret_from_fork+0x3f/0x70
[ 1680.823794] [<ffffffff810ab3b0>] ? kthread_create_on_cpu+0x70/0x70
[ 1680.823832] [<ffffffff8166617f>] ret_from_fork+0x3f/0x70
[ 1680.823875] [<ffffffff810ab3b0>] ? kthread_create_on_cpu+0x70/0x70
[ 1680.823913] 1 lock held by kthreadd/2:
[ 1680.823949] #0: (&cgroup_threadgroup_rwsem){++++++}, at: [<ffffffff81081b57>] copy_process+0x5b7/0x1e40
[ 1680.824071] INFO: task kworker/0:3:294 blocked for more than 120 seconds.
[ 1680.824109] Tainted: G O 4.4.6 #2
[ 1680.824147] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1680.824206] kworker/0:3 D ffff881809db7c50 0 294 2 0x00000000
[ 1680.824260] Workqueue: events cgroup_release_agent
[ 1680.824300] ffff881809db7c50 0000000000000007 0000000000000006 ffff88181e3d6a58
[ 1680.824381] ffffffff81c125c0 ffff881809db2200 ffff881809db8000 0000000000000246
[ 1680.824463] ffffffff81c69f08 ffff881809db2200 00000000ffffffff ffff881809db7c68
[ 1680.824544] Call Trace:
[ 1680.824578] [<ffffffff8165f42c>] schedule+0x3c/0x90
[ 1680.824614] [<ffffffff8165f775>] schedule_preempt_disabled+0x15/0x20
[ 1680.824657] [<ffffffff8166117c>] mutex_lock_nested+0x18c/0x3e0
[ 1680.824694] [<ffffffff81135053>] ? cgroup_release_agent+0x23/0xf0
[ 1680.824735] [<ffffffff81135053>] ? cgroup_release_agent+0x23/0xf0
[ 1680.824773] [<ffffffff81135053>] cgroup_release_agent+0x23/0xf0
[ 1680.824815] [<ffffffff810a35e5>] process_one_work+0x1f5/0x790
[ 1680.824853] [<ffffffff810a3550>] ? process_one_work+0x160/0x790
[ 1680.824894] [<ffffffff810a3be9>] worker_thread+0x69/0x480
[ 1680.824931] [<ffffffff810a3b80>] ? process_one_work+0x790/0x790
[ 1680.824972] [<ffffffff810a3b80>] ? process_one_work+0x790/0x790
[ 1680.825010] [<ffffffff810aa78c>] kthread+0x11c/0x140
[ 1680.825050] [<ffffffff810aa670>] ? kthread_create_on_node+0x250/0x250
[ 1680.825090] [<ffffffff8166617f>] ret_from_fork+0x3f/0x70
[ 1680.825130] [<ffffffff810aa670>] ? kthread_create_on_node+0x250/0x250
[ 1680.825169] 3 locks held by kworker/0:3/294:
[ 1680.825205] #0: ("events"){.+.+.+}, at: [<ffffffff810a3550>] process_one_work+0x160/0x790
[ 1680.825297] #1: ((&cgrp->release_agent_work)){+.+.+.}, at: [<ffffffff810a3550>] process_one_work+0x160/0x790
[ 1680.825393] #2: (cgroup_mutex){+.+.+.}, at: [<ffffffff81135053>] cgroup_release_agent+0x23/0xf0
[ 1680.825507] INFO: task systemd-journal:530 blocked for more than 120 seconds.
[ 1680.825547] Tainted: G O 4.4.6 #2
[ 1680.825585] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1680.825644] systemd-journal D ffff88180aaa7d30 0 530 1 0x00000000
[ 1680.825698] ffff88180aaa7d30 0000000000000007 0000000000000006 ffff88181efd6a58
[ 1680.825778] ffff8818111d0440 ffff88180a82a0c0 ffff88180aaa8000 0000000000000246
[ 1680.825839] ffffffff81c69f08 ffff88180a82a0c0 00000000ffffffff ffff88180aaa7d48
[ 1680.825841] Call Trace:
[ 1680.825845] [<ffffffff8165f42c>] schedule+0x3c/0x90
[ 1680.825847] [<ffffffff8165f775>] schedule_preempt_disabled+0x15/0x20
[ 1680.825848] [<ffffffff8166117c>] mutex_lock_nested+0x18c/0x3e0
[ 1680.825850] [<ffffffff8113d8be>] ? proc_cgroup_show+0x4e/0x300
[ 1680.825851] [<ffffffff8113d8be>] ? proc_cgroup_show+0x4e/0x300
[ 1680.825853] [<ffffffff8113d8be>] proc_cgroup_show+0x4e/0x300
[ 1680.825855] [<ffffffff812a7e40>] proc_single_show+0x50/0x90
[ 1680.825856] [<ffffffff81258e99>] seq_read+0xe9/0x3c0
[ 1680.825858] [<ffffffff8122f658>] __vfs_read+0x18/0x40
[ 1680.825859] [<ffffffff8122fc69>] vfs_read+0x89/0x130
[ 1680.825860] [<ffffffff81230a69>] SyS_read+0x49/0xb0
[ 1680.825861] [<ffffffff81665db6>] system_call_fast_compare_end+0xc/0x70
[ 1680.825862] 2 locks held by systemd-journal/530:
[ 1680.825865] #0: (&p->lock){+.+.+.}, at: [<ffffffff81258ded>] seq_read+0x3d/0x3c0
[ 1680.825867] #1: (cgroup_mutex){+.+.+.}, at: [<ffffffff8113d8be>] proc_cgroup_show+0x4e/0x300
[ 1680.825868] INFO: task kworker/12:2:544 blocked for more than 120 seconds.
[ 1680.825869] Tainted: G O 4.4.6 #2
[ 1680.825870] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1680.825872] kworker/12:2 D ffff882fef583c50 0 544 2 0x00000000
[ 1680.825873] Workqueue: events cgroup_release_agent
[ 1680.825875] ffff882fef583c50 0000000000000007 0000000000000006 ffff883005fd6a58
[ 1680.825876] ffff8818111f4080 ffff882ff0336040 ffff882fef584000 0000000000000246
[ 1680.825878] ffffffff81c69f08 ffff882ff0336040 00000000ffffffff ffff882fef583c68
[ 1680.825878] Call Trace:
[ 1680.825880] [<ffffffff8165f42c>] schedule+0x3c/0x90
[ 1680.825881] [<ffffffff8165f775>] schedule_preempt_disabled+0x15/0x20
[ 1680.825883] [<ffffffff8166117c>] mutex_lock_nested+0x18c/0x3e0
[ 1680.825884] [<ffffffff81135053>] ? cgroup_release_agent+0x23/0xf0
[ 1680.825885] [<ffffffff81135053>] ? cgroup_release_agent+0x23/0xf0
[ 1680.825886] [<ffffffff81135053>] cgroup_release_agent+0x23/0xf0
[ 1680.825887] [<ffffffff810a35e5>] process_one_work+0x1f5/0x790
[ 1680.825889] [<ffffffff810a3550>] ? process_one_work+0x160/0x790
[ 1680.825890] [<ffffffff810a3be9>] worker_thread+0x69/0x480
[ 1680.825891] [<ffffffff810a3b80>] ? process_one_work+0x790/0x790
[ 1680.825892] [<ffffffff810a3b80>] ? process_one_work+0x790/0x790
[ 1680.825894] [<ffffffff810aa78c>] kthread+0x11c/0x140
[ 1680.825896] [<ffffffff810aa670>] ? kthread_create_on_node+0x250/0x250
[ 1680.825897] [<ffffffff8166617f>] ret_from_fork+0x3f/0x70
[ 1680.825898] [<ffffffff810aa670>] ? kthread_create_on_node+0x250/0x250
[ 1680.825899] 3 locks held by kworker/12:2/544:
[ 1680.825902] #0: ("events"){.+.+.+}, at: [<ffffffff810a3550>] process_one_work+0x160/0x790
[ 1680.825904] #1: ((&cgrp->release_agent_work)){+.+.+.}, at: [<ffffffff810a3550>] process_one_work+0x160/0x790
[ 1680.825906] #2: (cgroup_mutex){+.+.+.}, at: [<ffffffff81135053>] cgroup_release_agent+0x23/0xf0
[ 1680.825912] INFO: task sshd:1341 blocked for more than 120 seconds.
[ 1680.825913] Tainted: G O 4.4.6 #2
[ 1680.825913] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1680.825915] sshd D ffff88180c9efcc8 0 1341 1 0x00000000
[ 1680.825916] ffff88180c9efcc8 ffffffff8166533c ffff88180cd740c0 ffff883005dd6a58
[ 1680.825920] ffff8818111ea040 ffff88180cd740c0 ffff88180c9f0000 ffffffff82e0ed08
[ 1680.825921] ffffffff82e0ed20 0000000000000000 00007fee2b563ad0 ffff88180c9efce0
[ 1680.825921] Call Trace:
[ 1680.825922] [<ffffffff8166533c>] ? _raw_spin_unlock_irq+0x2c/0x40
[ 1680.825924] [<ffffffff8165f42c>] schedule+0x3c/0x90
[ 1680.825926] [<ffffffff8166394e>] rwsem_down_read_failed+0xee/0x140
[ 1680.825927] [<ffffffff8137aa34>] call_rwsem_down_read_failed+0x14/0x30
[ 1680.825929] [<ffffffff810dc0c9>] ? percpu_down_read+0x79/0xa0
[ 1680.825930] [<ffffffff81081b57>] ? copy_process+0x5b7/0x1e40
[ 1680.825931] [<ffffffff81081b57>] copy_process+0x5b7/0x1e40
[ 1680.825933] [<ffffffff8108354e>] _do_fork+0x7e/0x760
[ 1680.825935] [<ffffffff81251e45>] ? __fd_install+0x5/0x2e0
[ 1680.825939] [<ffffffff811dc09e>] ? __might_fault+0x4e/0xb0
[ 1680.825942] [<ffffffff81003044>] ? lockdep_sys_exit_thunk+0x12/0x14
[ 1680.825943] [<ffffffff81083cd9>] SyS_clone+0x19/0x20
[ 1680.825944] [<ffffffff81665db6>] system_call_fast_compare_end+0xc/0x70
[ 1680.825945] 1 lock held by sshd/1341:
[ 1680.825947] #0: (&cgroup_threadgroup_rwsem){++++++}, at: [<ffffffff81081b57>] copy_process+0x5b7/0x1e40
[ 1680.825950] INFO: task gmond:1569 blocked for more than 120 seconds.
[ 1680.825951] Tainted: G O 4.4.6 #2
[ 1680.825951] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1680.825953] gmond D ffff8817ad8ebcc8 0 1569 1 0x00000000
[ 1680.825954] ffff8817ad8ebcc8 ffffffff8166533c ffff8817ad8e2480 ffff88181edd6a58
[ 1680.825956] ffff8818111c6400 ffff8817ad8e2480 ffff8817ad8ec000 ffffffff82e0ed08
[ 1680.825957] ffffffff82e0ed20 0000000000000000 00007f4dd8af4a50 ffff8817ad8ebce0
[ 1680.825958] Call Trace:
[ 1680.825959] [<ffffffff8166533c>] ? _raw_spin_unlock_irq+0x2c/0x40
[ 1680.825961] [<ffffffff8165f42c>] schedule+0x3c/0x90
[ 1680.825962] [<ffffffff8166394e>] rwsem_down_read_failed+0xee/0x140
[ 1680.825964] [<ffffffff8137aa34>] call_rwsem_down_read_failed+0x14/0x30
[ 1680.825965] [<ffffffff810dc0c9>] ? percpu_down_read+0x79/0xa0
[ 1680.825966] [<ffffffff81081b57>] ? copy_process+0x5b7/0x1e40
[ 1680.825967] [<ffffffff81081b57>] copy_process+0x5b7/0x1e40
[ 1680.825968] [<ffffffff8108354e>] _do_fork+0x7e/0x760
[ 1680.825970] [<ffffffff81666904>] ? retint_user+0x18/0x23
[ 1680.825971] [<ffffffff81003017>] ? trace_hardirqs_on_thunk+0x17/0x19
[ 1680.825972] [<ffffffff81083cd9>] SyS_clone+0x19/0x20
[ 1680.825973] [<ffffffff81665db6>] system_call_fast_compare_end+0xc/0x70
[ 1680.825974] 1 lock held by gmond/1569:
[ 1680.825976] #0: (&cgroup_threadgroup_rwsem){++++++}, at: [<ffffffff81081b57>] copy_process+0x5b7/0x1e40
[ 1680.825978] INFO: task python:1702 blocked for more than 120 seconds.
[ 1680.825978] Tainted: G O 4.4.6 #2
[ 1680.825978] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1680.825980] python D ffff8817b05d3cc8 0 1702 1543 0x00000000
[ 1680.825982] ffff8817b05d3cc8 ffffffff8166533c ffff88180c780240 ffff88181e7d6a58
[ 1680.825983] ffff8818111b8340 ffff88180c780240 ffff8817b05d4000 ffffffff82e0ed08
[ 1680.825984] ffffffff82e0ed20 0000000000000000 00007f5256eda9d0 ffff8817b05d3ce0
[ 1680.825984] Call Trace:
[ 1680.825985] [<ffffffff8166533c>] ? _raw_spin_unlock_irq+0x2c/0x40
[ 1680.825987] [<ffffffff8165f42c>] schedule+0x3c/0x90
[ 1680.825988] [<ffffffff8166394e>] rwsem_down_read_failed+0xee/0x140
[ 1680.825990] [<ffffffff8137aa34>] call_rwsem_down_read_failed+0x14/0x30
[ 1680.825991] [<ffffffff810dc0c9>] ? percpu_down_read+0x79/0xa0
[ 1680.825992] [<ffffffff81081b57>] ? copy_process+0x5b7/0x1e40
[ 1680.825993] [<ffffffff81081b57>] copy_process+0x5b7/0x1e40
[ 1680.825995] [<ffffffff810e0bdd>] ? __lock_acquire+0x5cd/0x1e90
[ 1680.825996] [<ffffffff8108354e>] _do_fork+0x7e/0x760
[ 1680.825998] [<ffffffff81251f3a>] ? __fd_install+0xfa/0x2e0
[ 1680.825999] [<ffffffff81251e45>] ? __fd_install+0x5/0x2e0
[ 1680.826000] [<ffffffff811dc09e>] ? __might_fault+0x4e/0xb0
[ 1680.826002] [<ffffffff81003044>] ? lockdep_sys_exit_thunk+0x12/0x14
[ 1680.826003] [<ffffffff81083cd9>] SyS_clone+0x19/0x20
[ 1680.826003] [<ffffffff81665db6>] system_call_fast_compare_end+0xc/0x70
[ 1680.826004] 1 lock held by python/1702:
[ 1680.826007] #0: (&cgroup_threadgroup_rwsem){++++++}, at: [<ffffffff81081b57>] copy_process+0x5b7/0x1e40
[ 1680.826008] INFO: task local_resource_:1775 blocked for more than 120 seconds.
[ 1680.826009] Tainted: G O 4.4.6 #2
[ 1680.826009] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1680.826011] local_resource_ D ffff8817ace0bcc8 0 1775 1531 0x00000000
[ 1680.826012] ffff8817ace0bcc8 ffffffff8166533c ffff8817acc861c0 ffff883005fd6a58
[ 1680.826013] ffff8818111f4080 ffff8817acc861c0 ffff8817ace0c000 ffffffff82e0ed08
[ 1680.826015] ffffffff82e0ed20 0000000000000000 0000000000000008 ffff8817ace0bce0
[ 1680.826015] Call Trace:
[ 1680.826016] [<ffffffff8166533c>] ? _raw_spin_unlock_irq+0x2c/0x40
[ 1680.826017] [<ffffffff8165f42c>] schedule+0x3c/0x90
[ 1680.826019] [<ffffffff8166394e>] rwsem_down_read_failed+0xee/0x140
[ 1680.826020] [<ffffffff8137aa34>] call_rwsem_down_read_failed+0x14/0x30
[ 1680.826022] [<ffffffff810dc0c9>] ? percpu_down_read+0x79/0xa0
[ 1680.826023] [<ffffffff81081b57>] ? copy_process+0x5b7/0x1e40
[ 1680.826023] [<ffffffff81081b57>] copy_process+0x5b7/0x1e40
[ 1680.826025] [<ffffffff8108354e>] _do_fork+0x7e/0x760
[ 1680.826026] [<ffffffff811dc09e>] ? __might_fault+0x4e/0xb0
[ 1680.826028] [<ffffffff81003044>] ? lockdep_sys_exit_thunk+0x12/0x14
[ 1680.826029] [<ffffffff81083cd9>] SyS_clone+0x19/0x20
[ 1680.826030] [<ffffffff81665db6>] system_call_fast_compare_end+0xc/0x70
[ 1680.826031] 1 lock held by local_resource_/1775:
[ 1680.826033] #0: (&cgroup_threadgroup_rwsem){++++++}, at: [<ffffffff81081b57>] copy_process+0x5b7/0x1e40
[ 1680.826072] INFO: task run:5120 blocked for more than 120 seconds.
[ 1680.826073] Tainted: G O 4.4.6 #2
[ 1680.826073] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1680.826075] run D ffff882ff4367ce0 0 5120 5117 0x00000000
[ 1680.826077] ffff882ff4367ce0 0000000000000007 0000000000000006 ffff88181edd6a58
[ 1680.826078] ffff8818111c6400 ffff882ff452a180 ffff882ff4368000 0000000000000246
[ 1680.826080] ffffffff81c69f08 ffff882ff452a180 00000000ffffffff ffff882ff4367cf8
[ 1680.826080] Call Trace:
[ 1680.826082] [<ffffffff8165f42c>] schedule+0x3c/0x90
[ 1680.826083] [<ffffffff8165f775>] schedule_preempt_disabled+0x15/0x20
[ 1680.826085] [<ffffffff8166117c>] mutex_lock_nested+0x18c/0x3e0
[ 1680.826086] [<ffffffff811371f0>] ? cgroup_kn_lock_live+0x50/0x1d0
[ 1680.826087] [<ffffffff811371f0>] ? cgroup_kn_lock_live+0x50/0x1d0
[ 1680.826089] [<ffffffff811371f0>] cgroup_kn_lock_live+0x50/0x1d0
[ 1680.826090] [<ffffffff81137208>] ? cgroup_kn_lock_live+0x68/0x1d0
[ 1680.826091] [<ffffffff8113ac12>] __cgroup_procs_write+0x52/0x460
[ 1680.826093] [<ffffffff8113b031>] cgroup_tasks_write+0x11/0x20
[ 1680.826094] [<ffffffff81136a5e>] cgroup_file_write+0x3e/0x1c0
[ 1680.826096] [<ffffffff812ba611>] kernfs_fop_write+0x141/0x190
[ 1680.826097] [<ffffffff8122f748>] __vfs_write+0x18/0x40
[ 1680.826098] [<ffffffff8122fdbc>] vfs_write+0xac/0x1a0
[ 1680.826100] [<ffffffff81251606>] ? __fget_light+0x66/0x90
[ 1680.826101] [<ffffffff81230b19>] SyS_write+0x49/0xb0
[ 1680.826102] [<ffffffff81665db6>] system_call_fast_compare_end+0xc/0x70
[ 1680.826103] 3 locks held by run/5120:
[ 1680.826107] #0: (sb_writers#8){.+.+.+}, at: [<ffffffff81232b01>] __sb_start_write+0xd1/0xf0
[ 1680.826109] #1: (&of->mutex){+.+.+.}, at: [<ffffffff812ba536>] kernfs_fop_write+0x66/0x190
[ 1680.826112] #2: (cgroup_mutex){+.+.+.}, at: [<ffffffff811371f0>] cgroup_kn_lock_live+0x50/0x1d0

Tejun <t...@kernel.org> helped us look into this issue. There is a patch for 4.6, which works for 4.4 too:
http://lkml.kernel.org/g/20160415191719.gk12...@htj.duckdns.org

The deadlock is easy to trigger when services brought up by systemd use cgroups directly. It's a major showstopper.
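
For anyone reading a trace like the one above, here is a minimal Python sketch that groups the blocked tasks by the locks lockdep lists for them. It is only an illustration and was not part of our debugging; the function names locks_by_task and tasks_by_lock are made up, and the regular expressions assume the exact "INFO: task ... blocked" and "#N: (lock){...}" wording seen in this report. As far as we understand, lockdep also lists the lock a task is currently blocked on, so a lock appearing here does not prove that the task owns it.

```python
import re
from collections import defaultdict

# Matches either the header of a hung-task report
# ("INFO: task systemd:1 blocked for more than ...")
# or one entry of the "locks held by" list ("#1: (cgroup_mutex){+.+.+.}, at: ...").
TOKEN_RE = re.compile(
    r"INFO: task (?P<comm>\S+):(?P<pid>\d+) blocked for more than"
    r"|#\d+:\s+\((?P<lock>.+?)\)\{"
)


def locks_by_task(log_text):
    """Map "comm:pid" to the lock names listed in that task's report."""
    result = {}
    current = None
    for match in TOKEN_RE.finditer(log_text):
        if match.group("comm") is not None:
            # A new report starts; later lock entries belong to this task.
            current = f"{match.group('comm')}:{match.group('pid')}"
            result[current] = []
        elif current is not None:
            result[current].append(match.group("lock"))
    return result


def tasks_by_lock(log_text):
    """Invert locks_by_task(): map each lock name to the tasks listing it."""
    grouped = defaultdict(list)
    for task, locks in locks_by_task(log_text).items():
        for lock in locks:
            grouped[lock].append(task)
    return dict(grouped)
```

Fed the trace above as one string, this should put systemd, systemd-journal, both kworkers and run under cgroup_mutex, and kthreadd, sshd, gmond, python and local_resource_ under &cgroup_threadgroup_rwsem, which matches the two groups of stuck tasks visible in the call traces.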