Re: [lxc-users] lxd process using lots of CPU
On 2017-03-10 01:52, Stéphane Graber wrote:

> Do you see a flood of events if you run "lxc monitor --type=logging"?

Nope, just this:

# lxc monitor --type=logging
metadata:
  context: {}
  level: dbug
  message: 'New events listener: 9e429089-289b-4ab8-9965-069054e7371c'
timestamp: 2017-03-09T18:57:56.311175444+01:00
type: logging


Tomasz Chmielewski
https://lxadm.com
___
lxc-users mailing list
lxc-users@lists.linuxcontainers.org
http://lists.linuxcontainers.org/listinfo/lxc-users
Re: [lxc-users] lxd process using lots of CPU
On Fri, Mar 10, 2017 at 11:25:40AM +0900, Tomasz Chmielewski wrote:
> On 2017-03-10 03:16, Stéphane Graber wrote:
>
> > Hmm, then it matches another such report I've seen where some of the
> > threads are reported as using a lot of CPU, yet when trying to trace
> > them you don't actually see anything.
> >
> > Can you try to run "strace -p" on the various threads that are reported
> > as eating all your CPU?
> >
> > The similar report I got of this would just show them stuck on a futex,
> > which wouldn't explain the CPU use. And unfortunately it looked like
> > tracing the threads actually somehow fixed the CPU problem for that
> > user...
> >
> >
> > If you just want the problem gone, "systemctl restart lxd" should fix
> > things without interrupting your containers, but we'd definitely like to
> > figure this one out if we can.
>
> Yes, restarting lxd fixed it.
>
> stracing different threads was showing a similar output to what I've pasted
> before. Stuck in some kind of loop?

That'd be some kind of loop which doesn't involve any syscall, so certainly
possible but we don't exactly have many of those in the LXD code base and
none seem like good candidates.

We'd definitely love to fix this but short of having a reliable reproducer
or at least debug logs of the time at which the daemon started misbehaving,
it's going to be near impossible to track down...

--
Stéphane Graber
Ubuntu developer
http://www.ubuntu.com
Re: [lxc-users] lxd process using lots of CPU
On 2017-03-10 03:16, Stéphane Graber wrote:

> Hmm, then it matches another such report I've seen where some of the
> threads are reported as using a lot of CPU, yet when trying to trace
> them you don't actually see anything.
>
> Can you try to run "strace -p" on the various threads that are reported
> as eating all your CPU?
>
> The similar report I got of this would just show them stuck on a futex,
> which wouldn't explain the CPU use. And unfortunately it looked like
> tracing the threads actually somehow fixed the CPU problem for that
> user...
>
>
> If you just want the problem gone, "systemctl restart lxd" should fix
> things without interrupting your containers, but we'd definitely like to
> figure this one out if we can.

Yes, restarting lxd fixed it.

stracing different threads was showing a similar output to what I've pasted
before. Stuck in some kind of loop?


Tomasz Chmielewski
https://lxadm.com
Re: [lxc-users] lxd process using lots of CPU
On Fri, Mar 10, 2017 at 03:01:02AM +0900, Tomasz Chmielewski wrote:
> On 2017-03-10 01:52, Stéphane Graber wrote:
>
> > Do you see a flood of events if you run "lxc monitor --type=logging"?
>
> Nope, just this:
>
> # lxc monitor --type=logging
> metadata:
>   context: {}
>   level: dbug
>   message: 'New events listener: 9e429089-289b-4ab8-9965-069054e7371c'
> timestamp: 2017-03-09T18:57:56.311175444+01:00
> type: logging

Hmm, then it matches another such report I've seen where some of the
threads are reported as using a lot of CPU, yet when trying to trace
them you don't actually see anything.

Can you try to run "strace -p" on the various threads that are reported
as eating all your CPU?

The similar report I got of this would just show them stuck on a futex,
which wouldn't explain the CPU use. And unfortunately it looked like
tracing the threads actually somehow fixed the CPU problem for that
user...

If you just want the problem gone, "systemctl restart lxd" should fix
things without interrupting your containers, but we'd definitely like to
figure this one out if we can.

--
Stéphane Graber
Ubuntu developer
http://www.ubuntu.com
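For anyone following along, the thread IDs to hand to "strace -p" can also be read straight from procfs instead of htop. What follows is only a rough sketch, assuming a Linux /proc layout; the function names thread_cpu_ticks and busiest_threads are made up for illustration and are not part of LXD, htop or strace.

```python
import os


def thread_cpu_ticks(pid):
    """Return {tid: utime + stime in clock ticks} for every thread of pid."""
    usage = {}
    task_dir = f"/proc/{pid}/task"
    for tid in os.listdir(task_dir):
        try:
            with open(f"{task_dir}/{tid}/stat") as stat_file:
                stat = stat_file.read()
        except OSError:
            continue  # the thread exited while we were iterating
        # The command name (field 2) may contain spaces or parentheses,
        # so split after the last ')' before cutting the rest into fields.
        fields = stat.rsplit(")", 1)[1].split()
        # fields[0] is the state (field 3), so utime (field 14) is
        # fields[11] and stime (field 15) is fields[12].
        usage[int(tid)] = int(fields[11]) + int(fields[12])
    return usage


def busiest_threads(pid, count=5):
    """Return the `count` threads of pid with the most accumulated CPU time."""
    usage = thread_cpu_ticks(pid)
    ranked = sorted(usage.items(), key=lambda item: item[1], reverse=True)
    return ranked[:count]


if __name__ == "__main__":
    # Demonstrated on this script's own process; for the case in this
    # thread one would pass the PID of the lxd daemon instead.
    for tid, ticks in busiest_threads(os.getpid()):
        print(tid, ticks)
```

The numbers are cumulative, like the TIME+ column in htop, so sampling twice and comparing would be needed to see which threads are busy right now.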
Re: [lxc-users] lxd process using lots of CPU
On Thu, Mar 09, 2017 at 11:01:34PM +0900, Tomasz Chmielewski wrote:
> On a server with several ~idlish containers:
>
>
>   PID USER  PRI NI  VIRT   RES   SHR S CPU% MEM%    TIME+ Command
> 19104 root   20  0 2548M 44132 15236 S 140.  0.0 58h03:17 /usr/bin/lxd --group lxd --logfile=/var/log/lxd/lxd.log
> 24966 root   20  0 2548M 44132 15236 S 18.2  0.0  2h45:36 /usr/bin/lxd --group lxd --logfile=/var/log/lxd/lxd.log
> 19162 root   20  0 2548M 44132 15236 S 17.5  0.0  3h31:49 /usr/bin/lxd --group lxd --logfile=/var/log/lxd/lxd.log
> 19120 root   20  0 2548M 44132 15236 S 16.2  0.0  3h16:11 /usr/bin/lxd --group lxd --logfile=/var/log/lxd/lxd.log
> 19244 root   20  0 2548M 44132 15236 S 11.0  0.0  1h48:56 /usr/bin/lxd --group lxd --logfile=/var/log/lxd/lxd.log
> 19123 root   20  0 2548M 44132 15236 S 11.0  0.0  3h34:42 /usr/bin/lxd --group lxd --logfile=/var/log/lxd/lxd.log
> 19243 root   20  0 2548M 44132 15236 S 10.4  0.0  1h06:13 /usr/bin/lxd --group lxd --logfile=/var/log/lxd/lxd.log
> 14962 root   20  0 2548M 44132 15236 R 10.4  0.0  3h17:27 /usr/bin/lxd --group lxd --logfile=/var/log/lxd/lxd.log
> 19356 root   20  0 2548M 44132 15236 S  9.7  0.0  2h16:44 /usr/bin/lxd --group lxd --logfile=/var/log/lxd/lxd.log
> 19161 root   20  0 2548M 44132 15236 R  9.7  0.0  1h26:40 /usr/bin/lxd --group lxd --logfile=/var/log/lxd/lxd.log
> 19126 root   20  0 2548M 44132 15236 R  9.1  0.0 22:11.20 /usr/bin/lxd --group lxd --logfile=/var/log/lxd/lxd.log
> 19115 root   20  0 2548M 44132 15236 R  8.4  0.0  2h55:21 /usr/bin/lxd --group lxd --logfile=/var/log/lxd/lxd.log
>   693 root   20  0 2548M 44132 15236 R  8.4  0.0  2h28:02 /usr/bin/lxd --group lxd --logfile=/var/log/lxd/lxd.log
>
>
> That's actually one lxd process with many threads; view from htop.
>
> Expected?
>
> ii  liblxc1     2.0.7-0ubuntu1~16.04.1  amd64  Linux Containers userspace tools (library)
> ii  lxc-common  2.0.7-0ubuntu1~16.04.1  amd64  Linux Containers userspace tools (common tools)
> ii  lxcfs       2.0.6-0ubuntu1~16.04.1  amd64  FUSE based filesystem for LXC
> ii  lxd         2.0.9-0ubuntu1~16.04.2  amd64  Container hypervisor based on LXC - daemon
> ii  lxd-client  2.0.9-0ubuntu1~16.04.2  amd64  Container hypervisor based on LXC - client
>
>
>
> strace of the process mainly shows:
>
> [pid 19124] poll([{fd=28, events=POLLIN|POLLPRI|POLLERR|POLLHUP|0x2000}], 1, -1
> [pid 19120] poll([{fd=13, events=POLLIN|POLLPRI|POLLERR|POLLHUP|0x2000}], 1, -1
> [pid 19124] <... poll resumed> ) = 1 ([{fd=28, revents=POLLNVAL}])
> [pid 19120] <... poll resumed> ) = 1 ([{fd=13, revents=POLLNVAL}])
> [pid 19124] poll([{fd=28, events=POLLIN|POLLPRI|POLLERR|POLLHUP|0x2000}], 1, -1
> [pid 19120] poll([{fd=13, events=POLLIN|POLLPRI|POLLERR|POLLHUP|0x2000}], 1, -1
> [pid 19124] <... poll resumed> ) = 1 ([{fd=28, revents=POLLNVAL}])
> [pid 19120] <... poll resumed> ) = 1 ([{fd=13, revents=POLLNVAL}])
> [pid 19124] poll([{fd=28, events=POLLIN|POLLPRI|POLLERR|POLLHUP|0x2000}], 1, -1
> [pid 19120] poll([{fd=13, events=POLLIN|POLLPRI|POLLERR|POLLHUP|0x2000}], 1, -1
> [pid 19124] <... poll resumed> ) = 1 ([{fd=28, revents=POLLNVAL}])
> [pid 19120] <... poll resumed> ) = 1 ([{fd=13, revents=POLLNVAL}])
> [pid 19124] poll([{fd=28, events=POLLIN|POLLPRI|POLLERR|POLLHUP|0x2000}], 1, -1
> [pid 19120] poll([{fd=13, events=POLLIN|POLLPRI|POLLERR|POLLHUP|0x2000}], 1, -1
> [pid 19124] <... poll resumed> ) = 1 ([{fd=28, revents=POLLNVAL}])
> [pid 19120] <... poll resumed> ) = 1 ([{fd=13, revents=POLLNVAL}])
> [pid 19124] poll([{fd=28, events=POLLIN|POLLPRI|POLLERR|POLLHUP|0x2000}], 1, -1
> [pid 19120] poll([{fd=13, events=POLLIN|POLLPRI|POLLERR|POLLHUP|0x2000}], 1, -1
> [pid 19124] <... poll resumed> ) = 1 ([{fd=28, revents=POLLNVAL}])
> [pid 19120] <... poll resumed> ) = 1 ([{fd=13, revents=POLLNVAL}])
> [pid 19124] poll([{fd=28, events=POLLIN|POLLPRI|POLLERR|POLLHUP|0x2000}], 1, -1
> [pid 19120] poll([{fd=13, events=POLLIN|POLLPRI|POLLERR|POLLHUP|0x2000}], 1, -1
> [pid 19124] <... poll resumed> ) = 1 ([{fd=28, revents=POLLNVAL}])
> [pid 19120] <... poll resumed> ) = 1 ([{fd=13, revents=POLLNVAL}])
> [pid 19124] poll([{fd=28, events=POLLIN|POLLPRI|POLLERR|POLLHUP|0x2000}], 1, -1
> [pid 19120] poll([{fd=13, events=POLLIN|POLLPRI|POLLERR|POLLHUP|0x2000}], 1, -1
>
>
> Tomasz Chmielewski
> https://lxadm.com

Do you see a flood of events if you run "lxc monitor --type=logging"?

--
Stéphane Graber
Ubuntu developer
http://www.ubuntu.com
[lxc-users] lxd process using lots of CPU
On a server with several ~idlish containers:


  PID USER  PRI NI  VIRT   RES   SHR S CPU% MEM%    TIME+ Command
19104 root   20  0 2548M 44132 15236 S 140.  0.0 58h03:17 /usr/bin/lxd --group lxd --logfile=/var/log/lxd/lxd.log
24966 root   20  0 2548M 44132 15236 S 18.2  0.0  2h45:36 /usr/bin/lxd --group lxd --logfile=/var/log/lxd/lxd.log
19162 root   20  0 2548M 44132 15236 S 17.5  0.0  3h31:49 /usr/bin/lxd --group lxd --logfile=/var/log/lxd/lxd.log
19120 root   20  0 2548M 44132 15236 S 16.2  0.0  3h16:11 /usr/bin/lxd --group lxd --logfile=/var/log/lxd/lxd.log
19244 root   20  0 2548M 44132 15236 S 11.0  0.0  1h48:56 /usr/bin/lxd --group lxd --logfile=/var/log/lxd/lxd.log
19123 root   20  0 2548M 44132 15236 S 11.0  0.0  3h34:42 /usr/bin/lxd --group lxd --logfile=/var/log/lxd/lxd.log
19243 root   20  0 2548M 44132 15236 S 10.4  0.0  1h06:13 /usr/bin/lxd --group lxd --logfile=/var/log/lxd/lxd.log
14962 root   20  0 2548M 44132 15236 R 10.4  0.0  3h17:27 /usr/bin/lxd --group lxd --logfile=/var/log/lxd/lxd.log
19356 root   20  0 2548M 44132 15236 S  9.7  0.0  2h16:44 /usr/bin/lxd --group lxd --logfile=/var/log/lxd/lxd.log
19161 root   20  0 2548M 44132 15236 R  9.7  0.0  1h26:40 /usr/bin/lxd --group lxd --logfile=/var/log/lxd/lxd.log
19126 root   20  0 2548M 44132 15236 R  9.1  0.0 22:11.20 /usr/bin/lxd --group lxd --logfile=/var/log/lxd/lxd.log
19115 root   20  0 2548M 44132 15236 R  8.4  0.0  2h55:21 /usr/bin/lxd --group lxd --logfile=/var/log/lxd/lxd.log
  693 root   20  0 2548M 44132 15236 R  8.4  0.0  2h28:02 /usr/bin/lxd --group lxd --logfile=/var/log/lxd/lxd.log


That's actually one lxd process with many threads; view from htop.

Expected?

ii  liblxc1     2.0.7-0ubuntu1~16.04.1  amd64  Linux Containers userspace tools (library)
ii  lxc-common  2.0.7-0ubuntu1~16.04.1  amd64  Linux Containers userspace tools (common tools)
ii  lxcfs       2.0.6-0ubuntu1~16.04.1  amd64  FUSE based filesystem for LXC
ii  lxd         2.0.9-0ubuntu1~16.04.2  amd64  Container hypervisor based on LXC - daemon
ii  lxd-client  2.0.9-0ubuntu1~16.04.2  amd64  Container hypervisor based on LXC - client



strace of the process mainly shows:

[pid 19124] poll([{fd=28, events=POLLIN|POLLPRI|POLLERR|POLLHUP|0x2000}], 1, -1
[pid 19120] poll([{fd=13, events=POLLIN|POLLPRI|POLLERR|POLLHUP|0x2000}], 1, -1
[pid 19124] <... poll resumed> ) = 1 ([{fd=28, revents=POLLNVAL}])
[pid 19120] <... poll resumed> ) = 1 ([{fd=13, revents=POLLNVAL}])
[pid 19124] poll([{fd=28, events=POLLIN|POLLPRI|POLLERR|POLLHUP|0x2000}], 1, -1
[pid 19120] poll([{fd=13, events=POLLIN|POLLPRI|POLLERR|POLLHUP|0x2000}], 1, -1
[pid 19124] <... poll resumed> ) = 1 ([{fd=28, revents=POLLNVAL}])
[pid 19120] <... poll resumed> ) = 1 ([{fd=13, revents=POLLNVAL}])
[pid 19124] poll([{fd=28, events=POLLIN|POLLPRI|POLLERR|POLLHUP|0x2000}], 1, -1
[pid 19120] poll([{fd=13, events=POLLIN|POLLPRI|POLLERR|POLLHUP|0x2000}], 1, -1
[pid 19124] <... poll resumed> ) = 1 ([{fd=28, revents=POLLNVAL}])
[pid 19120] <... poll resumed> ) = 1 ([{fd=13, revents=POLLNVAL}])
[pid 19124] poll([{fd=28, events=POLLIN|POLLPRI|POLLERR|POLLHUP|0x2000}], 1, -1
[pid 19120] poll([{fd=13, events=POLLIN|POLLPRI|POLLERR|POLLHUP|0x2000}], 1, -1
[pid 19124] <... poll resumed> ) = 1 ([{fd=28, revents=POLLNVAL}])
[pid 19120] <... poll resumed> ) = 1 ([{fd=13, revents=POLLNVAL}])
[pid 19124] poll([{fd=28, events=POLLIN|POLLPRI|POLLERR|POLLHUP|0x2000}], 1, -1
[pid 19120] poll([{fd=13, events=POLLIN|POLLPRI|POLLERR|POLLHUP|0x2000}], 1, -1
[pid 19124] <... poll resumed> ) = 1 ([{fd=28, revents=POLLNVAL}])
[pid 19120] <... poll resumed> ) = 1 ([{fd=13, revents=POLLNVAL}])
[pid 19124] poll([{fd=28, events=POLLIN|POLLPRI|POLLERR|POLLHUP|0x2000}], 1, -1
[pid 19120] poll([{fd=13, events=POLLIN|POLLPRI|POLLERR|POLLHUP|0x2000}], 1, -1
[pid 19124] <... poll resumed> ) = 1 ([{fd=28, revents=POLLNVAL}])
[pid 19120] <... poll resumed> ) = 1 ([{fd=13, revents=POLLNVAL}])
[pid 19124] poll([{fd=28, events=POLLIN|POLLPRI|POLLERR|POLLHUP|0x2000}], 1, -1
[pid 19120] poll([{fd=13, events=POLLIN|POLLPRI|POLLERR|POLLHUP|0x2000}], 1, -1


Tomasz Chmielewski
https://lxadm.com
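A note on the trace above: poll() is called with a timeout of -1 (block forever) yet returns at once with revents=POLLNVAL, which is what poll() reports for a descriptor that is not open. The sketch below reproduces only that syscall behaviour. It assumes the descriptors in the trace had been closed while something kept polling them, which nothing in this thread confirms, and it is not LXD code; the function name poll_closed_fd is made up for illustration.

```python
import os
import select


def poll_closed_fd(rounds=3):
    """Poll an already-closed descriptor and return (fd, list of poll results)."""
    read_fd, write_fd = os.pipe()
    poller = select.poll()
    # Same event mask as in the trace, minus the unnamed 0x2000 bit.
    poller.register(
        read_fd,
        select.POLLIN | select.POLLPRI | select.POLLERR | select.POLLHUP,
    )
    # Close both ends while the descriptor is still registered.
    os.close(read_fd)
    os.close(write_fd)
    results = []
    for _ in range(rounds):
        # A 1000 ms timeout is used here only as a safety net; the call
        # returns immediately because the descriptor is invalid.
        results.append(poller.poll(1000))
    return read_fd, results


if __name__ == "__main__":
    fd, results = poll_closed_fd()
    for events in results:
        print(fd, events, events[0][1] == select.POLLNVAL)
```

A loop that retries poll() without noticing POLLNVAL never blocks, so it would spin on a syscall in the way the trace shows.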