Public bug reported:

A customer reported that a CVM (cloud virtual machine) crashed while kubelet
was updating a container service on Ubuntu Xenial. The panic log is shown
below. We found a patch in Linux mainline (commit
33c35aa4817864e056fd772230b0c6b552e36ea2) that fixes this bug, but it has not
been merged into ubuntu-xenial.git yet.

Is there a plan to merge it?

----------------------panic log-----------------------------
[2018-02-02 10:21:48][4397731.721563] BUG: unable to handle kernel paging 
request at 000000010000005c
[2018-02-02 10:40:50][4397731.722666] IP: css_clear_dir+0x5/0x70
[2018-02-02 10:40:50][4397731.723261] PGD a12b067 
[2018-02-02 10:40:50][4397731.723261] PUD 0 
[2018-02-02 10:40:50][4397731.723628] 
[2018-02-02 10:40:50][4397731.724004] Oops: 0000 [#1] SMP
[2018-02-02 10:40:50][4397731.724004] Modules linked in: xt_statistic
nf_conntrack_netlink ebt_ip ebtable_filter ebtables veth xt_set ip_set_hash_net
ip_set nfnetlink xt_nat xt_recent xt_mark ipt_REJECT
nf_reject_ipv4 xt_tcpudp xt_comment ipt_MASQUERADE nf_nat_masquerade_ipv4
xfrm_user xfrm_algo iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4
xt_addrtype iptable_filter ip_tables xt_conntrack x_tables
nf_nat nf_conntrack br_netfilter bridge stp llc aufs ppdev sb_edac edac_core
crct10dif_pclmul crc32_pclmul ghash_clmulni_intel joydev input_leds
serio_raw parport_pc parport i2c_piix4 mac_hid ib_iser rdma_cm
iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi
scsi_transport_iscsi autofs4 btrfs raid10 raid456 async_raid6_recov
async_memcpy async_pq async_xor async_tx xor raid6_pq
libcrc32c raid1 raid0 multipath
[2018-02-02 10:40:50][4397731.724004]  linear cirrus ttm drm_kms_helper
syscopyarea sysfillrect sysimgblt aesni_intel fb_sys_fops aes_x86_64
crypto_simd cryptd glue_helper psmouse virtio_blk virtio_net
drm pata_acpi floppy
[2018-02-02 10:40:50][4397731.724004] CPU: 0 PID: 23347 Comm: kubelet Not 
tainted 4.10.0-32-generic #36~16.04.1-Ubuntu
[2018-02-02 10:40:50][4397731.724004] Hardware name: Bochs Bochs, BIOS Bochs 
01/01/2011
[2018-02-02 10:40:50][4397731.724004] task: ffff92abde590000 task.stack: 
ffffbaa94165c000
[2018-02-02 10:40:50][4397731.724004] RIP: 0010:css_clear_dir+0x5/0x70
[2018-02-02 10:40:50][4397731.724004] RSP: 0018:ffffbaa94165fe10 EFLAGS: 
00010206
[2018-02-02 10:40:50][4397731.724004] RAX: 000047fd40005d7b RBX: 
00000000ffffffe8 RCX: ffff92abffc0fcec
[2018-02-02 10:40:50][4397731.724004] RDX: ffffffff9b070800 RSI: 
0000000000000206 RDI: 00000000ffffffe8
[2018-02-02 10:40:50][4397731.724004] RBP: ffffbaa94165fe20 R08: 
00000000c8b18701 R09: 0000000180220017
[2018-02-02 10:40:50][4397731.724004] R10: ffff92abc8b187f8 R11: 
ffff92abf7751d00 R12: ffff92abd5601000
[2018-02-02 10:40:50][4397731.724004] R13: 0000000000000000 R14: 
ffff92abd5601150 R15: 0000000000000000
[2018-02-02 10:40:50][4397731.724004] FS:  00007f6f92ffd700(0000) 
GS:ffff92abffc00000(0000) knlGS:0000000000000000
[2018-02-02 10:40:50][4397731.724004] CS:  0010 DS: 0000 ES: 0000 CR0: 
0000000080050033
[2018-02-02 10:40:50][4397731.724004] CR2: 000000010000005c CR3: 
00000000280cb000 CR4: 00000000000406f0
[2018-02-02 10:40:50][4397731.724004] Call Trace:
[2018-02-02 10:40:50][4397731.724004]  ? kill_css+0x12/0x60
[2018-02-02 10:40:50][4397731.724004]  cgroup_destroy_locked+0xa5/0xf0
[2018-02-02 10:40:50][4397731.724004]  cgroup_rmdir+0x2c/0x90
[2018-02-02 10:40:50][4397731.724004]  kernfs_iop_rmdir+0x4d/0x80
[2018-02-02 10:40:50][4397731.724004]  vfs_rmdir+0xb4/0x130
[2018-02-02 10:40:50][4397731.724004]  do_rmdir+0x1c7/0x1e0
[2018-02-02 10:40:50][4397731.724004]  SyS_unlinkat+0x22/0x30
[2018-02-02 10:40:50][4397731.724004]  entry_SYSCALL_64_fastpath+0x1e/0xad
[2018-02-02 10:40:50][4397731.724004] RIP: 0033:0x481bd4
[2018-02-02 10:40:50][4397731.724004] RSP: 002b:000000c422893af0 EFLAGS: 
00000246 ORIG_RAX: 0000000000000107
[2018-02-02 10:40:50][4397731.724004] RAX: ffffffffffffffda RBX: 
0000000000000000 RCX: 0000000000481bd4
[2018-02-02 10:40:50][4397731.724004] RDX: 0000000000000200 RSI: 
000000c421c7ef00 RDI: ffffffffffffff9c
[2018-02-02 10:40:50][4397731.724004] RBP: 000000c422893bc0 R08: 
0000000000000000 R09: 0000000000000000
[2018-02-02 10:40:50][4397731.724004] R10: 0000000000000000 R11: 
0000000000000246 R12: 00000000000000ce
[2018-02-02 10:40:50][4397731.724004] R13: 00000000ffffffee R14: 
0000000000001740 R15: 0000000000000055
[2018-02-02 10:40:50][4397731.724004] Code: fd ff ff 85 c0 41 89 c6 0f 84 5b fd
ff ff eb 83 4d 89 fc e9 0f ff ff ff e8 d9 37 f6 ff 66 0f 1f 84 00 00 00 00 00
0f 1f 44 00 00 <8b> 47 74 a8 08 74 5d 55 83 e0 f7 48 89 e5
41 55 41 54 53 89 47
[2018-02-02 10:40:50][4397731.724004] RIP: css_clear_dir+0x5/0x70 RSP: 
ffffbaa94165fe10
[2018-02-02 10:40:50][4397731.724004] CR2: 000000010000005c


----------------------patch in linux.git----------------------------
commit 33c35aa4817864e056fd772230b0c6b552e36ea2
Author: Waiman Long <long...@redhat.com>
Date:   Mon May 15 09:34:06 2017 -0400

    cgroup: Prevent kill_css() from being called more than once
    
    The kill_css() function may be called more than once under the condition
    that the css was killed but not physically removed yet followed by the
    removal of the cgroup that is hosting the css. This patch prevents any
    harm from being done when that happens.
    
    Signed-off-by: Waiman Long <long...@redhat.com>
    Signed-off-by: Tejun Heo <t...@kernel.org>
    Cc: sta...@vger.kernel.org # v4.5+

diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index c3c9a0e1b3c9..8d4e85eae42c 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -4265,6 +4265,11 @@ static void kill_css(struct cgroup_subsys_state *css)
 {
        lockdep_assert_held(&cgroup_mutex);

+       if (css->flags & CSS_DYING)
+               return;
+
+       css->flags |= CSS_DYING;
+
        /*
         * This must happen before css is disassociated with its cgroup.
         * See seq_css() for details.

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: Incomplete


** Tags: zesty

https://bugs.launchpad.net/bugs/1748342

Title:
  cgroup: remove cgroup directory leading kernel crash in kill_css
