Re: [Lxc-users] lxc-destroy does not destroy cgroup
Quoting Arie Skliarouk (sklia...@gmail.com):
> I don't have the /cgroup directory mounted. Somehow, the directory is
> mounted automatically onto /sys/fs/cgroup:
>
> root@mf:~# df | grep cgroup
> cgroup 12368328 0 12368328 0% /sys/fs/cgroup
> root@mf:~# ls /sys/fs/cgroup/
> blkio cpu cpuacct cpuset devices freezer memory net_cls perf_event
>
> Each subdirectory of the above contains a directory per container with
> knobs that are specific to the resource:
>
> root@mf:~# ls /sys/fs/cgroup/cpu/dev
> cgroup.clone_children cgroup.procs cpu.rt_runtime_us notify_on_release
> cgroup.event_control cpu.rt_period_us cpu.shares tasks
> root@mf:~#
>
> Could well be this is because of the 3.0.0-12-server kernel. I don't see

No, userspace does the mounting. I.e., in Ubuntu the cgroup-lite or cgroup-bin packages both do it.

> how I can rename a stuck cgroup easily in this situation. Any advice?

You can build an lxc with my patch (until Daniel has a chance to apply it), but in the meantime you can make a script 'move_cgroup.sh' along the lines of:

    #!/bin/sh
    # Move a stuck cgroup out of the way in every mounted hierarchy.
    if [ $# -lt 1 ]; then
        echo "Usage: $0 cgroup-name"
        echo "Moves the cgroup-name out of the way."
        exit 1
    fi
    g=$1
    # -u only generates a unique name (nothing is created); used as a suffix.
    t=`mktemp -u cg.XXXXXX`
    for d in /sys/fs/cgroup/*; do
        # Skip hierarchies in which this cgroup does not exist.
        [ -d "$d/$g" ] || continue
        mv "$d/$g" "$d/$g.$t"
    done

Note that this doesn't clean anything up, so if there are hung tasks those will still be around. A script to list details of each task in the hung cgroup would be pretty simple too, and useful - if you write one, you might send it here for inclusion in lxc!

> BTW, I once had /cgroup mounted from fstab like this:
>
> none /cgroup cgroup defaults 0 0
>
> It grouped all settings into a per-container directory nicely, but the
> server failed to boot with that.

Yes, once early userspace has mounted the /sys/fs/cgroup/* hierarchies, that fstab entry would cause trouble. But if you remove the package doing the cgroup mounting, you should be able to go back to using this fstab entry.

-serge

___
Lxc-users mailing list
Lxc-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/lxc-users
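The task-listing script Serge asks for could look roughly like the sketch below. It assumes the layout shown in this thread (one directory per subsystem under /sys/fs/cgroup, each holding a per-container directory with a `tasks` file). The function name `list_cgroup_tasks` and its optional second argument are made up for illustration and are not part of lxc.

```shell
#!/bin/sh
# list_cgroup_tasks NAME [ROOT]
# Hypothetical helper: print "hierarchy pid state command" for every task
# still listed in cgroup NAME. ROOT defaults to /sys/fs/cgroup.
list_cgroup_tasks() {
    name=$1
    root=${2:-/sys/fs/cgroup}
    for d in "$root"/*; do
        tasks="$d/$name/tasks"
        # Skip hierarchies that do not contain this cgroup.
        [ -f "$tasks" ] || continue
        while read -r pid; do
            # State and command name come from /proc; the task may have
            # exited in the meantime, in which case "?" is printed.
            state=$(awk '/^State:/ {print $2}' "/proc/$pid/status" 2>/dev/null)
            comm=$(cat "/proc/$pid/comm" 2>/dev/null)
            printf '%s %s %s %s\n' "${d##*/}" "$pid" "${state:-?}" "${comm:-?}"
        done < "$tasks"
    done
}
```

Calling `list_cgroup_tasks master` prints one line per remaining task and nothing when every `tasks` file is empty. A state of D (uninterruptible sleep) would point at a task that cannot be killed.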
Re: [Lxc-users] lxc-destroy does not destroy cgroup
What could be worse than a cgroup that is not deleted by lxc-destroy? Why, the inability to create a cgroup using lxc-create!

Seriously, the host machine cannot start vservers anymore. This is after one of the cgroups got stuck in the unremovable state.

With these issues it becomes harder and harder for me to justify LXC to my boss...

-- Arie

On Tue, Dec 13, 2011 at 00:01, Serge Hallyn serge.hal...@canonical.com wrote:
> Quoting Gordon Henderson (gor...@drogon.net):
>> On Thu, 8 Dec 2011, Arie Skliarouk wrote:
>>> When I tried to restart the vserver, it did not come up. Long story
>>> short, I found that lxc-destroy did not destroy the cgroup of the same
>>> name as the server. The cgroup remains visible in the
>>> /sys/fs/cgroup/cpu/master directory. The tasks file is empty though.
>>
>> And just now, I've had the same thing happen - a container failed to
>> start and it left its body in /cgroup - with empty tasks.
>
> The patch I sent out on Friday should help handle that more gracefully -
> it moves the cgroup out of the way so a new container can start. You'll
> need to clean the old one up by hand if you care to, though lxc could
> easily provide a tool to clean it up (and move and analyze any tasks left
> running in the cgroup, though I suspect in most cases there are none).
>
> -serge
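For the manual cleanup Serge mentions, a first step is finding which cgroups have been left behind with no tasks. The sketch below assumes the /sys/fs/cgroup/<subsystem>/<name> layout described in this thread and only looks one level deep. `find_empty_cgroups` is a hypothetical name, not an lxc tool. Note that it reads the `tasks` file instead of testing its size, since cgroup control files report a size of 0 even when they have content.

```shell
#!/bin/sh
# find_empty_cgroups [ROOT]
# Hypothetical helper: print every first-level cgroup directory under ROOT
# (default /sys/fs/cgroup) whose tasks file lists no tasks.
find_empty_cgroups() {
    root=${1:-/sys/fs/cgroup}
    for tasks in "$root"/*/*/tasks; do
        # An unmatched glob stays literal; skip it.
        [ -f "$tasks" ] || continue
        # Read the file rather than stat it: cgroupfs reports size 0.
        if [ -z "$(cat "$tasks" 2>/dev/null)" ]; then
            printf '%s\n' "${tasks%/tasks}"
        fi
    done
}
```

An empty `tasks` file does not by itself prove the cgroup is stale (a container that is just starting, or one with child cgroups, could look the same), so the output is a list of candidates to inspect, not a list to delete blindly.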
Re: [Lxc-users] lxc-destroy does not destroy cgroup
On Dec 18, 2011 1:09 PM, Jérôme Petazzoni jerome.petazz...@dotcloud.com wrote:
> If that happens, just try to terminate the other processes running in the
> cgroup, rename it (mv /cgroup/mylittlecontainer /cgroup/broken) and
> restart it. This saved me a few reboots already :-)

You should be able to remove an empty cgroup with `rmdir`.

-- C Anthony
Re: [Lxc-users] lxc-destroy does not destroy cgroup
On 12/18/2011 11:56 AM, C Anthony Risinger wrote:
> On Dec 18, 2011 1:09 PM, Jérôme Petazzoni jerome.petazz...@dotcloud.com wrote:
>> If that happens, just try to terminate the other processes running in
>> the cgroup, rename it (mv /cgroup/mylittlecontainer /cgroup/broken) and
>> restart it. This saved me a few reboots already :-)
>
> You should be able to remove an empty cgroup with `rmdir`.

Sorry, I forgot to mention:
- if you can't remove the cgroup because it's not empty,
- if the process remaining in the cgroup cannot be killed (for instance, because it's in uninterruptible sleep state),
- if you can't move the said process into a different cgroup (with e.g. echo $PID > /cgroup/othercgroup/tasks),
- ...
then you can still rename the old cgroup.
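The fallback order discussed above (move leftover tasks, try `rmdir`, rename as a last resort) can be sketched as a single function. This is only an illustration under stated assumptions: `retire_cgroup` and its suffix argument are invented names, and it operates on one hierarchy directory at a time. It does not try to kill anything, as that step is left to the admin.

```shell
#!/bin/sh
# retire_cgroup DIR [SUFFIX]
# Hypothetical helper: get cgroup directory DIR out of the way.
# SUFFIX defaults to "broken.<pid of this shell>".
retire_cgroup() {
    cg=$1
    suffix=${2:-broken.$$}
    parent=${cg%/*}
    [ -d "$cg" ] || { echo "no such cgroup: $cg" >&2; return 1; }

    # Step 1: try to move any remaining tasks into the parent cgroup.
    # Failures (e.g. a task that has already exited) are ignored.
    if [ -r "$cg/tasks" ] && [ -w "$parent/tasks" ]; then
        for pid in $(cat "$cg/tasks"); do
            echo "$pid" > "$parent/tasks" 2>/dev/null
        done
    fi

    # Step 2: a cgroup with no tasks can be removed with rmdir.
    if rmdir "$cg" 2>/dev/null; then
        echo "removed $cg"
        return 0
    fi

    # Step 3: otherwise rename it so the original name is free again.
    mv "$cg" "$cg.$suffix" && echo "renamed $cg -> $cg.$suffix"
}
```

With the layout from earlier in the thread this would be called once per hierarchy, e.g. `for d in /sys/fs/cgroup/*; do [ -d "$d/master" ] && retire_cgroup "$d/master"; done`.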
Re: [Lxc-users] lxc-destroy does not destroy cgroup
On Thu, 8 Dec 2011, Arie Skliarouk wrote:
> When I tried to restart the vserver, it did not come up. Long story short,
> I found that lxc-destroy did not destroy the cgroup of the same name as
> the server. The cgroup remains visible in the /sys/fs/cgroup/cpu/master
> directory. The tasks file is empty though.

And just now, I've had the same thing happen - a container failed to start and it left its body in /cgroup - with empty tasks.

This is latest greatest - kernel 3.1.4, lxc 0.7.5, Debian squeeze (kernel and lxc compiled).

It may well have been my own fault - trying to start a container whose disk image was NFS mounted, and I got the error:

  mount.nfs: an incorrect mount option was specified

and lxc-start hung. So I may be doing something bogus anyway, however... (like e.g. trying to bind-mount /proc, /sys, /dev/pts, etc. into an NFS-mounted directory?)

Gordon
Re: [Lxc-users] lxc-destroy does not destroy cgroup
> When I tried to restart the vserver, it did not come up. Long story short,
> I found that lxc-destroy did not destroy the cgroup of the same name as
> the server. The cgroup remains visible in the /sys/fs/cgroup/cpu/master
> directory. The tasks file is empty though. I had to rename the container
> to be able to start it.

About 18 hours after the event, the physical machine locked up hard, without any message in dmesg or on its console. Before that, the machine worked pretty hard for about 60 days without a hitch.

My gut feeling is that it is related to the stale cgroup somehow.

> Out of curiosity, what kernel are you running? I'm on 2.6.35, but looking
> at some of the later ones now...

I use kernel 3.0.0-12-server amd64 as packaged in Ubuntu 11.10. I had problems with earlier kernels as they locked up the machine every week or so.

-- Arie
Re: [Lxc-users] lxc-destroy does not destroy cgroup
On Sun, 11 Dec 2011, Arie Skliarouk wrote:
>> When I tried to restart the vserver, it did not come up. Long story
>> short, I found that lxc-destroy did not destroy the cgroup of the same
>> name as the server. The cgroup remains visible in the
>> /sys/fs/cgroup/cpu/master directory. The tasks file is empty though. I
>> had to rename the container to be able to start it.
>
> About 18 hours after the event, the physical machine locked up hard,
> without any message in dmesg or on its console. Before that, the machine
> worked pretty hard for about 60 days without a hitch.

Ouch. And oddly enough, I had a hard lockup a few days ago myself that needed a power cycle.

> My gut feeling is that it is related to the stale cgroup somehow.
>
>> Out of curiosity, what kernel are you running? I'm on 2.6.35, but
>> looking at some of the later ones now...
>
> I use kernel 3.0.0-12-server amd64 as packaged in Ubuntu 11.10. I had
> problems with earlier kernels as they locked up the machine every week or
> so.

My base is Debian Stable, but I custom compile the kernels to match hardware. I've put the latest greatest on a test server to see how it fares - so far so good, but there's no real load on it.

I think it would be good for more people to start to post their experiences with LXC though - who knows how many people are using it - any big companies using it in anger (as opposed to KVM, Xen, etc.) and so on. (Or small companies with big installations!)

I have 2 areas of application for it - one is hosted Asterisk PBXs, and for that it seems to work really well, but the run-time environment is very carefully controlled - it basically runs sendmail, sshd, apache+php and asterisk and nothing else.

The other application I use them for is more on the management side - for running general purpose LAMP servers in - mostly to make sure I can relatively quickly move an image from one server to another to cover hardware issues or a temporary/permanent increase (or reduction!) in available resources...

I don't consider my own use big by any means at all though.

Gordon
[Lxc-users] lxc-destroy does not destroy cgroup
Hi,

Most of the time lxc-destroy works properly, removing the cgroup with the same name as the container.

Today something strange happened on one of my vservers - suddenly it stopped responding to requests and any attempt to connect just hung (as if the connection was successful, but no data was coming through).

Checking dmesg on the host machine revealed that the vserver got into an out-of-memory situation:

[304880.371274] Memory cgroup out of memory: Kill process 1959 (init) score 1 or sacrifice child
[304880.403765] Killed process 10638 (apache2) total-vm:40608kB, anon-rss:12kB, file-rss:4088kB
[304881.719832] bash invoked oom-killer: gfp_mask=0xd0, order=0, oom_adj=0, oom_score_adj=0
...
[304881.719965] Task in /master killed as a result of limit of /master
[304881.719970] memory: usage 976564kB, limit 976564kB, failcnt 60487010
...
[304881.835887] [ 8938] 20081 8938 2625 90 6 0 0 sshd
[304881.835897] Memory cgroup out of memory: Kill process 1959 (init) score 1 or sacrifice child
[304881.836836] Killed process 19680 (apache2) total-vm:41372kB, anon-rss:0kB, file-rss:4504kB
[304884.298748] bash invoked oom-killer: gfp_mask=0xd0, order=0, oom_adj=0, oom_score_adj=0
...
[304884.414478] Memory cgroup out of memory: Kill process 1959 (init) score 1 or sacrifice child
[304884.415428] Killed process 1959 (init) total-vm:3188kB, anon-rss:0kB, file-rss:476kB

Note that the last process that got killed was init. IMHO it should be the last process to be killed, immediately after sshd, but that's a minor problem.

When I tried to restart the vserver, it did not come up. Long story short, I found that lxc-destroy did not destroy the cgroup of the same name as the server. The cgroup remains visible in the /sys/fs/cgroup/cpu/master directory. The tasks file is empty though. I had to rename the container to be able to start it.

All this on ubuntu 11.04, 3.0.0-12-server amd64.

Thoughts, comments?

-- Arie
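The usage, limit and failcnt figures in the dmesg excerpt can also be read directly from the memory cgroup before the OOM killer gets involved. The sketch below assumes the memory hierarchy is mounted at /sys/fs/cgroup/memory and uses the cgroup v1 files memory.usage_in_bytes, memory.limit_in_bytes and memory.failcnt. The function name `mem_cgroup_report` is hypothetical.

```shell
#!/bin/sh
# mem_cgroup_report NAME [ROOT]
# Hypothetical helper: print memory usage, limit (both in kB) and failcnt
# for cgroup NAME. ROOT defaults to /sys/fs/cgroup/memory.
mem_cgroup_report() {
    name=$1
    dir=${2:-/sys/fs/cgroup/memory}/$name
    for f in memory.usage_in_bytes memory.limit_in_bytes memory.failcnt; do
        [ -r "$dir/$f" ] || { echo "cannot read $dir/$f" >&2; return 1; }
    done
    usage=$(cat "$dir/memory.usage_in_bytes")
    limit=$(cat "$dir/memory.limit_in_bytes")
    failcnt=$(cat "$dir/memory.failcnt")
    # The kernel reports bytes; convert to kB to match the dmesg output.
    printf '%s: usage=%skB limit=%skB failcnt=%s\n' \
        "$name" "$((usage / 1024))" "$((limit / 1024))" "$failcnt"
}
```

Running `mem_cgroup_report master` periodically (from cron, say) would show usage approaching the limit and failcnt climbing, which is an early hint that a container is about to hit the situation in the log above.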
Re: [Lxc-users] lxc-destroy does not destroy cgroup
On Thu, 8 Dec 2011, Arie Skliarouk wrote:
> When I tried to restart the vserver, it did not come up. Long story short,
> I found that lxc-destroy did not destroy the cgroup of the same name as
> the server. The cgroup remains visible in the /sys/fs/cgroup/cpu/master
> directory. The tasks file is empty though. I had to rename the container
> to be able to start it.

Did you remember to stop it first?

> All this on ubuntu 11.04, 3.0.0-12-server amd64.
>
> Thoughts, comments?

Very very similar to what I experience from time to time. (Posted about it recently with zero response.) My more drastic solution is to reboot the host, though I have gotten away with lxc-stop then a start.

I've now stopped using memory limits in containers and for the time being will let them swap (or share more memory with other containers and swap if needed) - they're mostly well behaved though.

I don't have a solution I'm afraid.

Gordon
Re: [Lxc-users] lxc-destroy does not destroy cgroup
On Thu, Dec 8, 2011 at 14:05, Gordon Henderson gor...@drogon.net wrote:
> On Thu, 8 Dec 2011, Arie Skliarouk wrote:
>> When I tried to restart the vserver, it did not come up. Long story
>> short, I found that lxc-destroy did not destroy the cgroup of the same
>> name as the server. The cgroup remains visible in the
>> /sys/fs/cgroup/cpu/master directory. The tasks file is empty though. I
>> had to rename the container to be able to start it.
>
> Did you remember to stop it first?

Of course! It is part of the vserver stop script.

>> All this on ubuntu 11.04, 3.0.0-12-server amd64.
>>
>> Thoughts, comments?
>
> Very very similar to what I experience from time to time. (Posted about
> it recently with zero response.) My more drastic solution is to reboot
> the host, though I have gotten away with lxc-stop then a start.

Well, with 65 running containers (24GB of RAM) it is easier to rename the vserver :)

> I've now stopped using memory limits in containers and for the time being
> will let them swap (or share more memory with other containers and swap
> if needed) - they're mostly well behaved though.

My vservers do not behave well and require restrictions.

BTW, do you know how I can restrict the number of running processes in a container (like in openvz)?

-- Arie
Re: [Lxc-users] lxc-destroy does not destroy cgroup
On Thu, 8 Dec 2011, Arie Skliarouk wrote:
> On Thu, Dec 8, 2011 at 14:05, Gordon Henderson gor...@drogon.net wrote:
>> On Thu, 8 Dec 2011, Arie Skliarouk wrote:
>>> When I tried to restart the vserver, it did not come up. Long story
>>> short, I found that lxc-destroy did not destroy the cgroup of the same
>>> name as the server. The cgroup remains visible in the
>>> /sys/fs/cgroup/cpu/master directory. The tasks file is empty though. I
>>> had to rename the container to be able to start it.
>>
>> Did you remember to stop it first?
>
> Of course! It is part of the vserver stop script.

Just checking!

>>> All this on ubuntu 11.04, 3.0.0-12-server amd64.
>>>
>>> Thoughts, comments?
>>
>> Very very similar to what I experience from time to time. (Posted about
>> it recently with zero response.) My more drastic solution is to reboot
>> the host, though I have gotten away with lxc-stop then a start.
>
> Well, with 65 running containers (24GB of RAM) it is easier to rename the
> vserver :)

Yes. I can see that a system restart might irritate a few other people!

Out of curiosity, what kernel are you running? I'm on 2.6.35, but looking at some of the later ones now...

>> I've now stopped using memory limits in containers and for the time
>> being will let them swap (or share more memory with other containers and
>> swap if needed) - they're mostly well behaved though.
>
> My vservers do not behave well and require restrictions.

OK.

> BTW, do you know how I can restrict the number of running processes in a
> container (like in openvz)?

No idea I'm afraid. I guess some sort of super limit passed into the container's init (via setrlimit()?) is what's needed...

Gordon