[Bug 1021471] Re: stuck on mutex_lock creating a new network namespace when starting a container
After some testing, I think this is not an LXC-specific issue; it's probably related to the kernel CLONE_NEWNET code. If we run a test like this:

- sudo ./reproducer
- ctrl+c
- sudo ./reproducer (wait for a while)
- dmesg | grep unregister

we still get the same error message. It looks like the first reproducer run didn't release the loopback device.

-- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1021471 Title: clone() hang when creating new network namespace (dmesg show unregister_netdevice: waiting for lo to become free. Usage count = 2) To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1021471/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1021471] Re: stuck on mutex_lock creating a new network namespace when starting a container
I can reproduce this as Stephane mentioned, but I only get a message like "unregister_netdevice: waiting for lo to become free. Usage count = 2". There are no other oops messages like the mutex_lock() one, and I think that oops appeared because lxc-start was blocked for too long. So the subject of this bug should probably be changed.
[Bug 1021471] Re: stuck on mutex_lock creating a new network namespace when starting a container
I reproduced this on the first run of my lxc-ized buildbot setup script on a quantal host, so it's likely to hit real users.
[Bug 1021471] Re: stuck on mutex_lock creating a new network namespace when starting a container
I have finally been able to reproduce this, but it takes me much longer than it does Stephane.
[Bug 1021471] Re: stuck on mutex_lock creating a new network namespace when starting a container
** Tags added: kernel-da-key kernel-key
[Bug 1021471] Re: stuck on mutex_lock creating a new network namespace when starting a container
The following seems pretty reliable to me:

- gcc reproducer.c -o reproducer (using the paste.ubuntu.com code above)
- sudo ./reproducer
- ctrl+c
- lxc-start -n
- dmesg | grep unregister

Reproducing it that way is very reliable here, though the result is slightly different. Using this reproducer, the container will usually hang at startup for a few minutes, then eventually succeed in booting. When hitting the bug without the reproducer, it would usually hang indefinitely (where indefinitely > 10 minutes).
Re: [Bug 1021471] Re: stuck on mutex_lock creating a new network namespace when starting a container
For some reason I've still never seen this. Do you have a recipe by which, after a reboot, you can reproduce this 100% of the time?
[Bug 1021471] Re: stuck on mutex_lock creating a new network namespace when starting a container
The last time I saw this happening was 5 minutes ago on a Lenovo x230 (no legacy BIOS), running:

Linux castiana 3.5.0-13-generic #14-Ubuntu SMP Wed Aug 29 16:48:44 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

As Jean-Baptiste says, this bug is extremely annoying: anyone using LXC who hits it (that part seems quite random) won't be able to work until they power cycle the system. I would appreciate it if someone could actually look at this.
[Bug 1021471] Re: stuck on mutex_lock creating a new network namespace when starting a container
Looking around for this bug after hitting it myself a few more times, I found http://lists.debian.org/debian-kernel/2012/05/msg00494.html which mentions a similar behaviour. I extracted the C example and built it: http://paste.ubuntu.com/1182799/

Running it indeed triggered the issue here; any subsequent call to lxc-start just hangs. When running lxc-start under strace, I'm getting:

stat("/home/stgraber/data/vm/lxc/lib/precise-gui-i386/rootfs", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
open("/home/stgraber/data/vm/lxc/lib/precise-gui-i386/rootfs.hold", O_RDWR|O_CREAT, 0600) = 17
clone(

So it looks like, whatever the issue is, it triggers when trying to clone(CLONE_NEWNET). Hope that helps point in the right direction.

** Changed in: linux (Ubuntu) Status: Confirmed => Triaged
[Bug 1021471] Re: stuck on mutex_lock creating a new network namespace when starting a container
I can reproduce it very reliably on my system after shutting down an LXC container with poweroff from inside the container. I'm setting this to High because the container then cannot be started again without restarting the host system, and the host won't shut down, waiting forever for lo to become free. Only SysRq helps in that case.

** Changed in: linux (Ubuntu) Importance: Medium => High

** Tags added: rls-q-incoming
[Bug 1021471] Re: stuck on mutex_lock creating a new network namespace when starting a container
On my somewhat lagged quantal, I have been seeing similar issues:

Linux clint-MacBookPro 3.5.0-8-generic #8-Ubuntu SMP Sat Aug 4 04:42:28 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

[194038.144050] unregister_netdevice: waiting for lo to become free. Usage count = 1
[194040.576173] INFO: task lxc-start:23872 blocked for more than 120 seconds.
[194040.576178] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[194040.576180] lxc-start D 88014fd13980 0 23872 1 0x
[194040.576186] 880116909cc0 0086 880090ad2e00 880116909fd8
[194040.576192] 880116909fd8 880116909fd8 88014483 880090ad2e00
[194040.576197] 880116909cc0 81ca91a0 880090ad2e00 81ca91a4
[194040.576202] Call Trace:
[194040.576212] [] schedule+0x29/0x70
[194040.576217] [] schedule_preempt_disabled+0xe/0x10
[194040.576221] [] __mutex_lock_slowpath+0xd7/0x150
[194040.576225] [] mutex_lock+0x2a/0x50
[194040.576230] [] copy_net_ns+0x71/0x100
[194040.576236] [] create_new_namespaces+0xdb/0x190
[194040.576239] [] copy_namespaces+0x8c/0xd0
[194040.576245] [] copy_process.part.22+0x902/0x1520
[194040.576249] [] do_fork+0x135/0x390
[194040.576254] [] ? vfs_write+0x105/0x180
[194040.576258] [] sys_clone+0x28/0x30
[194040.576263] [] stub_clone+0x13/0x20
[194040.576267] [] ? system_call_fastpath+0x16/0x1b
[194048.384149] unregister_netdevice: waiting for lo to become free. Usage count = 1
[194058.624071] unregister_netdevice: waiting for lo to become free. Usage count = 1
[194068.864079] unregister_netdevice: waiting for lo to become free. Usage count = 1
[194079.104158] unregister_netdevice: waiting for lo to become free. Usage count = 1
[194089.344152] unregister_netdevice: waiting for lo to become free. Usage count = 1
[194099.584105] unregister_netdevice: waiting for lo to become free. Usage count = 1
[194109.824044] unregister_netdevice: waiting for lo to become free. Usage count = 1
[194120.064158] unregister_netdevice: waiting for lo to become free. Usage count = 1
[194130.304148] unregister_netdevice: waiting for lo to become free. Usage count = 1
[194140.544146] unregister_netdevice: waiting for lo to become free. Usage count = 1
[194150.784065] unregister_netdevice: waiting for lo to become free. Usage count = 1
[194160.576246] INFO: task lxc-start:23872 blocked for more than 120 seconds.
[194160.576251] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[194160.576253] lxc-start D 88014fd13980 0 23872 1 0x
[194160.576259] 880116909cc0 0086 880090ad2e00 880116909fd8
[194160.576265] 880116909fd8 880116909fd8 88014483 880090ad2e00
[194160.576270] 880116909cc0 81ca91a0 880090ad2e00 81ca91a4
[194160.576275] Call Trace:
[194160.576286] [] schedule+0x29/0x70
[194160.576290] [] schedule_preempt_disabled+0xe/0x10
[194160.576294] [] __mutex_lock_slowpath+0xd7/0x150
[194160.576299] [] mutex_lock+0x2a/0x50
[194160.576304] [] copy_net_ns+0x71/0x100
[194160.576309] [] create_new_namespaces+0xdb/0x190
[194160.576313] [] copy_namespaces+0x8c/0xd0
[194160.576318] [] copy_process.part.22+0x902/0x1520
[194160.576322] [] do_fork+0x135/0x390
[194160.576327] [] ? vfs_write+0x105/0x180
[194160.576332] [] sys_clone+0x28/0x30
[194160.576337] [] stub_clone+0x13/0x20
[194160.576341] [] ? system_call_fastpath+0x16/0x1b
[194161.024151] unregister_netdevice: waiting for lo to become free. Usage count = 1

I've been creating/destroying a lot of LXC containers, so it's possible the veths created for them are causing some issues. I also have a ton of network-interface-security jobs running, suggesting that they're being added but not removed:

network-interface-security (network-interface/vethx3SWbR) start/running
network-interface-security (network-interface/vethWUOSpt) start/running
network-interface-security (network-interface/veth90RDZM) start/running
network-interface-security (network-interface/vethCdnGSx) start/running
network-interface-security (network-interface/vetha8REFc) start/running
network-interface-security (network-interface/veth8yrXSC) start/running
network-interface-security (network-interface/vethvtEy9P) start/running

These issues are blocking some LXC work I'm doing, so I'm going to try upgrading, which may take me out of the 'affected' category; I've apt-cloned so we can get back to this state if need be.

** Changed in: linux (Ubuntu) Status: Incomplete => Confirmed
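[Editorial note: a quick way to look for the leaked-interface symptom described above is a sketch like the following. It assumes iproute2's `ip` and upstart's `initctl` are present, which depends on the system; both checks degrade gracefully when the tools are missing.]

```shell
# Count veth interfaces still present on the host; leaked ones accumulate
# as containers are created and destroyed.
veth_count=$(ip -o link 2>/dev/null | grep -c 'veth')
echo "veth interfaces: $veth_count"

# List any lingering network-interface-security jobs (upstart-only;
# initctl does not exist on non-upstart systems).
if command -v initctl >/dev/null 2>&1; then
    initctl list | grep network-interface-security
else
    echo "initctl not available"
fi
```

On an affected host, the veth count keeps growing across container restarts even though the containers themselves are gone.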
[Bug 1021471] Re: stuck on mutex_lock creating a new network namespace when starting a container
I'm quite surprised that with all of these tests I haven't hit the mutex_lock bug again, though; it was definitely happening on that machine. Maybe some other fix resolved it, or I'm just not exercising the exact code path that triggers it.
[Bug 1021471] Re: stuck on mutex_lock creating a new network namespace when starting a container
Restarted the same test with the default I/O scheduler and after a few hours got the same crash again:

Jul 19 16:58:20 lantea kernel: [14707.004394] general protection fault: [#1] SMP
Jul 19 16:58:20 lantea kernel: [14707.008026]
Jul 19 16:58:20 lantea kernel: [14707.008026] Pid: 20505, comm: dbus-daemon Not tainted 3.5.0-5-generic #5-Ubuntu/945GSE
Jul 19 16:58:20 lantea kernel: [14707.008026] EIP: 0060:[] EFLAGS: 00010286 CPU: 0
Jul 19 16:58:20 lantea kernel: [14707.008026] EIP is at unix_stream_recvmsg+0x4eb/0x680
Jul 19 16:58:20 lantea kernel: [14707.008026] EAX: EBX: f28cc180 ECX: f18f1d40 EDX:
Jul 19 16:58:20 lantea kernel: [14707.008026] ESI: EDI: edd6c240 EBP: f18f1d68 ESP: f18f1ccc
Jul 19 16:58:20 lantea kernel: [14707.008026] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Jul 19 16:58:20 lantea kernel: [14707.008026] CR0: 80050033 CR2: b7702130 CR3: 318d6000 CR4: 07e0
Jul 19 16:58:20 lantea kernel: [14707.008026] DR0: DR1: DR2: DR3:
Jul 19 16:58:20 lantea kernel: [14707.008026] DR6: 0ff0 DR7: 0400
Jul 19 16:58:20 lantea kernel: [14707.008026] Process dbus-daemon (pid: 20505, ti=f18f task=f48dd8d0 task.ti=f18f)
Jul 19 16:58:20 lantea kernel: [14707.008026] Stack:
Jul 19 16:58:20 lantea kernel: [14707.008026] c15bfa3d f18f1cdc edd6e688 f18f1d40 c14c419c f48dd8d0 f48dd8d0
Jul 19 16:58:20 lantea kernel: [14707.008026] edd6c3f8 0001 f18f1d7c edd6c288
Jul 19 16:58:20 lantea kernel: [14707.008026] ec7ff500 f18f1f4c edd6c420 f4acd400
Jul 19 16:58:20 lantea kernel: [14707.008026] Call Trace:
Jul 19 16:58:20 lantea kernel: [14707.008026] [] ? _raw_spin_lock_irqsave+0x2d/0x40
Jul 19 16:58:20 lantea kernel: [14707.008026] [] ? skb_queue_tail+0x3c/0x50
Jul 19 16:58:20 lantea kernel: [14707.008026] [] ? aa_revalidate_sk+0x83/0x90
Jul 19 16:58:20 lantea kernel: [14707.008026] [] sock_recvmsg+0xcc/0x100
Jul 19 16:58:20 lantea kernel: [14707.008026] [] ? __pollwait+0xd0/0xd0
Jul 19 16:58:20 lantea kernel: [14707.008026] [] ? _copy_from_user+0x41/0x60
Jul 19 16:58:20 lantea kernel: [14707.008026] [] ? verify_iovec+0x3f/0xb0
Jul 19 16:58:20 lantea kernel: [14707.008026] [] ? sock_sendmsg_nosec+0xf0/0xf0
Jul 19 16:58:20 lantea kernel: [14707.008026] [] __sys_recvmsg+0x110/0x1d0
Jul 19 16:58:20 lantea kernel: [14707.008026] [] ? sock_sendmsg_nosec+0xf0/0xf0
Jul 19 16:58:20 lantea kernel: [14707.008026] [] ? trigger_load_balance+0x4f/0x1c0
Jul 19 16:58:20 lantea kernel: [14707.008026] [] ? scheduler_tick+0xda/0x100
Jul 19 16:58:20 lantea kernel: [14707.008026] [] ? timerqueue_add+0x58/0xb0
Jul 19 16:58:20 lantea kernel: [14707.008026] [] ? ktime_get+0x65/0xf0
Jul 19 16:58:20 lantea kernel: [14707.008026] [] ? lapic_next_event+0x1b/0x20
Jul 19 16:58:20 lantea kernel: [14707.008026] [] sys_recvmsg+0x3b/0x60
Jul 19 16:58:20 lantea kernel: [14707.008026] [] sys_socketcall+0x28b/0x2d0
Jul 19 16:58:20 lantea kernel: [14707.008026] [] ? smp_apic_timer_interrupt+0x5e/0x8d
Jul 19 16:58:20 lantea kernel: [14707.008026] [] ? sys_clock_gettime+0x48/0x70
Jul 19 16:58:20 lantea kernel: [14707.008026] [] sysenter_do_call+0x12/0x28
Jul 19 16:58:20 lantea kernel: [14707.008026] Code: 6c ff ff ff 89 8d 74 ff ff ff 74 03 f0 ff 00 8b 95 74 ff ff ff 89 02 8b 95 6c ff ff ff 85 d2 0f 84 2d 01 00 00 8b 95 6c ff ff ff ff 02 89 55 84 8b 55 84 8b 8d 74 ff ff ff 89 51 04 8b 95 6c
Jul 19 16:58:20 lantea kernel: [14707.008026] EIP: [] unix_stream_recvmsg+0x4eb/0x680 SS:ESP 0068:f18f1ccc
Jul 19 16:58:20 lantea kernel: [14707.537680] ---[ end trace 8455671fd435d7f5 ]---
Jul 19 16:58:21 lantea kernel: [14707.548025] [] put_files_struct+0x75/0xc0
Jul 19 16:58:21 lantea kernel: [14707.548025] [] exit_files+0x46/0x60
Jul 19 16:58:21 lantea kernel: [14707.548025] [] do_exit+0x14a/0x7a0
Jul 19 16:58:21 lantea kernel: [14707.548025] [] ? print_oops_end_marker+0x2f/0x40
Jul 19 16:58:21 lantea kernel: [14707.548025] [] oops_end+0x8d/0xd0
Jul 19 16:58:21 lantea kernel: [14707.548025] [] die+0x54/0x80
Jul 19 16:58:21 lantea kernel: [14707.548025] [] do_general_protection+0x102/0x180
Jul 19 16:58:21 lantea kernel: [14707.548025] [] ? kfree+0xcc/0xf0
Jul 19 16:58:21 lantea kernel: [14707.548025] [] ? skb_free_head+0x45/0x50
Jul 19 16:58:21 lantea kernel: [14707.548025] [] ? do_trap+0xd0/0xd0
Jul 19 16:58:21 lantea kernel: [14707.548025] [] error_code+0x67/0x6c
Jul 19 16:58:21 lantea kernel: [14707.548025] [] ? proto_register+0x19b/0x210
Jul 19 16:58:21 lantea kernel: [14707.548025] [] ? unix_stream_recvmsg+0x4eb/0x680
Jul 19 16:58:21 lantea kernel: [14707.548025] [] ? _raw_spin_lock_irqsave+0x2d/0x40
Jul 19 16:58:21 lantea kernel: [14707.548025] [] ? skb_queue_tail+0x3c/0x50
Jul 19 16:58:21 lantea kernel: [14707.548025] [] ? aa_revalidate_sk+0x83/0x90
Jul 19 16:58:21 lantea kernel: [14707.548025] []
[Bug 1021471] Re: stuck on mutex_lock creating a new network namespace when starting a container
Not much luck reproducing at the moment with an up-to-date quantal, though running with the deadline scheduler and two containers rebooting in a loop, I eventually hit this:

Jul 19 07:22:34 lantea kernel: [46965.795778] ---[ end trace c212400a9b13d700 ]---
Jul 19 07:22:35 lantea kernel: [46965.809353] general protection fault: [#2] SMP
Jul 19 07:22:35 lantea kernel: [46965.812019] Modules linked in: veth ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 ip_tables x_tables 8021q garp bridge stp llc snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_pcm coretemp microcode snd_seq_midi snd_rawmidi psmouse serio_raw snd_seq_midi_event lpc_ich snd_seq snd_timer snd_seq_device i915 bonding rfcomm bnep bluetooth parport_pc ppdev mac_hid snd drm_kms_helper drm i2c_algo_bit soundcore snd_page_alloc video lp parport hid_generic usbhid hid r8169 floppy
Jul 19 07:22:35 lantea kernel: [46965.812019]
Jul 19 07:22:35 lantea kernel: [46965.812019] Pid: 11839, comm: initctl Tainted: G D 3.5.0-5-generic #5-Ubuntu/945GSE
Jul 19 07:22:35 lantea kernel: [46965.812019] EIP: 0060:[] EFLAGS: 00010286 CPU: 0
Jul 19 07:22:35 lantea kernel: [46965.812019] EIP is at unix_destruct_scm+0x53/0x90
Jul 19 07:22:35 lantea kernel: [46965.812019] EAX: EBX: f71740c0 ECX: EDX:
Jul 19 07:22:35 lantea kernel: [46965.812019] ESI: e0a828c8 EDI: f71740c0 EBP: e0a89adc ESP: e0a89abc
Jul 19 07:22:35 lantea kernel: [46965.812019] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Jul 19 07:22:35 lantea kernel: [46965.812019] CR0: 80050033 CR2: b7606fb8 CR3: 01968000 CR4: 07e0
Jul 19 07:22:35 lantea kernel: [46965.812019] DR0: DR1: DR2: DR3:
Jul 19 07:22:35 lantea kernel: [46965.812019] DR6: 0ff0 DR7: 0400
Jul 19 07:22:35 lantea kernel: [46965.812019] Process initctl (pid: 11839, ti=e0a88000 task=f34e6580 task.ti=e0a88000)
Jul 19 07:22:35 lantea kernel: [46965.812019] Stack:
Jul 19 07:22:35 lantea kernel: [46965.812019] f71740c0
Jul 19 07:22:35 lantea kernel: [46965.812019] e0a89ae8 c14c45d3 f71740c0 e0a89af4 c14c43d0 0001 e0a89b0c c14c4486
Jul 19 07:22:35 lantea kernel: [46965.812019] c154bc6f 0001 e0a828c8 f71740c0 e0a89b38 c154bc6f e0a80ae0
Jul 19 07:22:35 lantea kernel: [46965.812019] Call Trace:
Jul 19 07:22:35 lantea kernel: [46965.812019] [] skb_release_head_state+0x43/0xc0
Jul 19 07:22:35 lantea kernel: [46965.812019] [] __kfree_skb+0x10/0x90
Jul 19 07:22:35 lantea kernel: [46965.812019] [] kfree_skb+0x36/0x80
Jul 19 07:22:35 lantea kernel: [46965.812019] [] ? unix_release_sock+0x13f/0x240
Jul 19 07:22:35 lantea kernel: [46965.812019] [] unix_release_sock+0x13f/0x240
Jul 19 07:22:35 lantea kernel: [46965.812019] [] unix_release+0x1f/0x30
Jul 19 07:22:35 lantea kernel: [46965.812019] [] sock_release+0x20/0x70
Jul 19 07:22:35 lantea kernel: [46965.812019] [] sock_close+0x17/0x30
Jul 19 07:22:35 lantea kernel: [46965.812019] [] fput+0xe6/0x210
Jul 19 07:22:35 lantea kernel: [46965.812019] [] filp_close+0x54/0x80
Jul 19 07:22:35 lantea kernel: [46965.812019] [] put_files_struct+0x75/0xc0
Jul 19 07:22:35 lantea kernel: [46965.812019] [] exit_files+0x46/0x60
Jul 19 07:22:35 lantea kernel: [46965.812019] [] do_exit+0x14a/0x7a0
Jul 19 07:22:35 lantea kernel: [46965.812019] [] ? print_oops_end_marker+0x2f/0x40
Jul 19 07:22:35 lantea kernel: [46965.812019] [] oops_end+0x8d/0xd0
Jul 19 07:22:35 lantea kernel: [46965.812019] [] die+0x54/0x80
Jul 19 07:22:35 lantea kernel: [46965.812019] [] do_general_protection+0x102/0x180
Jul 19 07:22:35 lantea kernel: [46965.812019] [] ? default_wake_function+0x10/0x20
Jul 19 07:22:35 lantea kernel: [46965.812019] [] ? pollwake+0x62/0x70
Jul 19 07:22:35 lantea kernel: [46965.812019] [] ? do_trap+0xd0/0xd0
Jul 19 07:22:35 lantea kernel: [46965.812019] [] error_code+0x67/0x6c
Jul 19 07:22:35 lantea kernel: [46965.812019] [] ? unix_stream_recvmsg+0x4eb/0x680
Jul 19 07:22:35 lantea kernel: [46965.812019] [] ? aa_revalidate_sk+0x83/0x90
Jul 19 07:22:35 lantea kernel: [46965.812019] [] sock_recvmsg+0xcc/0x100
Jul 19 07:22:35 lantea kernel: [46965.812019] [] ? _copy_from_user+0x41/0x60
Jul 19 07:22:35 lantea kernel: [46965.812019] [] ? verify_iovec+0x3f/0xb0
Jul 19 07:22:35 lantea kernel: [46965.812019] [] ? sock_sendmsg_nosec+0xf0/0xf0
Jul 19 07:22:35 lantea kernel: [46965.812019] [] __sys_recvmsg+0x110/0x1d0
Jul 19 07:22:35 lantea kernel: [46965.812019] [] ? sock_sendmsg_nosec+0xf0/0xf0
Jul 19 07:22:35 lantea kernel: [46965.812019] [] ? trigger_load_balance+0x4f/0x1c0
Jul 19 07:22:35 lantea kernel: [46965.812019] [] ? __dequeue_entity+0x25/0x40
Jul 19 07:22:35 lantea kernel: [46965.812019] [] ? __switch_to+0xbc/0x260
Jul 19 07:22:35 lantea kernel: [46965.812019] [] ? finish_task_switch+0x41/0xc0
Jul 19 07:22:35 lantea kernel: [46965.8120
[Bug 1021471] Re: stuck on mutex_lock creating a new network namespace when starting a container
Hmmm, that's very hard for us to analyze. We did see a similar oops before, caused by heavy workload and the CFQ block I/O scheduler. Could you run a test for us? Change your default block I/O scheduler from CFQ to deadline and run LXC as usual, to verify whether the issue is gone. I'm just guessing, but I hope this helps. -Bryan

** Changed in: linux (Ubuntu) Status: Confirmed => Incomplete
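[Editorial note: for readers following this test, the I/O scheduler can be inspected and switched per device through sysfs. A hedged sketch follows; sda is an assumed device name, the write requires root, and on a minimal system /sys/block may list no disks.]

```shell
# Show each block device's available schedulers; the active one is in [brackets].
found=0
for f in /sys/block/*/queue/scheduler; do
    if [ -r "$f" ]; then
        printf '%s: %s\n' "$f" "$(cat "$f")"
        found=$((found + 1))
    fi
done
echo "checked $found device(s)"

# To switch one device (sda here) to deadline, as root:
#   echo deadline | sudo tee /sys/block/sda/queue/scheduler
# Or set it for all devices at boot with the kernel parameter: elevator=deadline
```

The change via sysfs takes effect immediately but does not persist across reboots; the boot parameter does.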
[Bug 1021471] Re: stuck on mutex_lock creating a new network namespace when starting a container
Sorry, I don't yet have a recipe for reproducing it. I did manage to get the system into the broken state again after I filed this bug report by suspending/resuming and starting/stopping/using containers as usual, but I can't trigger it on demand. As for heavy load: not at the point it breaks. The main container I use is one I'm working on some Launchpad changes in. I often run the test suite inside it, and I guess that is rather heavy.
[Bug 1021471] Re: stuck on mutex_lock creating a new network namespace when starting a container
Can you tell us how to reproduce this issue? From the dmesg kernel warning oops, I think it is not an lxc/cgroups-specific issue. It looks like lxc-start was blocked by something for a long time. Is there any heavy workload on your system? Thanks, -Bryan

** Changed in: linux (Ubuntu) Importance: Undecided => Medium

** Changed in: linux (Ubuntu) Assignee: (unassigned) => Bryan Wu (cooloney)
[Bug 1021471] Re: stuck on mutex_lock creating a new network namespace when starting a container
** Changed in: linux (Ubuntu) Status: New => Confirmed
[Bug 1021471] Re: 'stuck on mutex_lock creating a new network namespace when starting a container
** Summary changed:
- lxc-start sometimes stops starting containers
+ 'stuck on mutex_lock creating a new network namespace when starting a container

** Summary changed:
- 'stuck on mutex_lock creating a new network namespace when starting a container
+ stuck on mutex_lock creating a new network namespace when starting a container