Re: [Lxc-users] lxc-destroy does not destroy cgroup

2011-12-20 Thread Serge Hallyn
Quoting Arie Skliarouk (sklia...@gmail.com):
 I don't have the /cgroup directory mounted. Somehow, the directory is
 mounted automatically onto the /sys/fs/cgroup
 
 root@mf:~# df | grep cgroup
 cgroup  12368328  0  12368328  0% /sys/fs/cgroup
 root@mf:~# ls /sys/fs/cgroup/
 blkio  cpu  cpuacct  cpuset  devices  freezer  memory  net_cls  perf_event
 
 Each subdirectory of the above contains a directory per container with knobs
 that are specific to the resource:
 
 root@mf:~# ls /sys/fs/cgroup/cpu/dev
 cgroup.clone_children  cgroup.procs      cpu.rt_runtime_us  notify_on_release
 cgroup.event_control   cpu.rt_period_us  cpu.shares         tasks
 root@mf:~#
 
 Could well be this is because of the 3.0.0-12-server kernel. I don't see

No, userspace does the mounting.  For example, on Ubuntu both the cgroup-lite
and cgroup-bin packages do it.

 how I can rename a stuck cgroup easily in this situation. Any advice?

You can build an lxc with my patch (until Daniel has a chance to apply it),
but in the meantime you can make a script 'move_cgroup.sh' along the lines
of:

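#!/bin/sh
# Move a stuck cgroup out of the way in every mounted hierarchy.
if [ $# -lt 1 ]; then
    echo "Usage: $0 cgroup-name"
    echo "  Moves the cgroup-name out of the way."
    exit 1
fi
g=$1

# mktemp -u only prints a unique name; the template needs trailing X's.
t=$(mktemp -u cg.XXXXXX)
for d in /sys/fs/cgroup/*; do
    # Skip hierarchies that do not contain this cgroup.
    [ -d "$d/$g" ] || continue
    mv "$d/$g" "$d/$g.$t"
done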

Note that doesn't clean anything up, so if there are hung tasks those will
still be around.  A script to list details of each task in the hung cgroup
would be pretty simple too, and useful - if you write one, you might send
it here for inclusion in lxc!
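
A minimal sketch of the task-listing script suggested above, assuming the per-controller layout under /sys/fs/cgroup shown earlier in this thread. The names CGROUP_ROOT and list_cgroup_tasks are made up for illustration and are not part of lxc; the state and command name are read from /proc/PID/status.

```sh
#!/bin/sh
# Sketch only: CGROUP_ROOT and list_cgroup_tasks are hypothetical names.
# Assumes one directory per controller, as under /sys/fs/cgroup.
CGROUP_ROOT=${CGROUP_ROOT:-/sys/fs/cgroup}

# Print "controller pid state command" for every task left in cgroup $1.
list_cgroup_tasks() (
    g=$1
    for d in "$CGROUP_ROOT"/*; do
        # Skip hierarchies that do not contain this cgroup.
        [ -f "$d/$g/tasks" ] || continue
        while read -r pid; do
            # State and Name come from /proc/PID/status; a task that has
            # already exited is reported with "?" placeholders.
            state=$(sed -n 's/^State:[[:space:]]*//p' "/proc/$pid/status" 2>/dev/null)
            name=$(sed -n 's/^Name:[[:space:]]*//p' "/proc/$pid/status" 2>/dev/null)
            echo "${d##*/} $pid ${state:-?} ${name:-?}"
        done < "$d/$g/tasks"
    done
)
```

Calling `list_cgroup_tasks master` would print one line per leftover task in each hierarchy, which makes it easy to spot a process stuck in state D (uninterruptible sleep).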

 BTW, I once had /cgroup mounted from fstab like this:
 
 none /cgroup cgroup defaults 0 0
 
 It grouped all settings into a per-container directory nicely, but the server
 failed to boot with that.

Yes, once early userspace has mounted the /sys/fs/cgroup/*, that fstab
entry would cause trouble.  But if you remove the package doing the
cgroup mounting, you should be able to go back to using this fstab
entry.
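
Before switching between the package and the fstab entry, it may help to check which cgroup filesystems are already mounted. This is a small sketch that reads /proc/mounts; cgroup_mounts is a hypothetical helper name, not an existing tool.

```sh
#!/bin/sh
# Sketch only: cgroup_mounts is a hypothetical helper name.
# Prints "mountpoint options" for each cgroup filesystem listed in a
# mounts table ($1, defaulting to /proc/mounts).
cgroup_mounts() {
    # Fields in /proc/mounts: device mountpoint fstype options dump pass
    awk '$3 == "cgroup" { print $2, $4 }' "${1:-/proc/mounts}"
}
```

With per-controller mounts this prints one line per controller (the controller name appears among the options); with a single combined mount such as /cgroup it prints one line whose options name every controller.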

-serge

--
Write once. Port to many.
Get the SDK and tools to simplify cross-platform app development. Create 
new or port existing apps to sell to consumers worldwide. Explore the 
Intel AppUpSM program developer opportunity. appdeveloper.intel.com/join
http://p.sf.net/sfu/intel-appdev
___
Lxc-users mailing list
Lxc-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/lxc-users


Re: [Lxc-users] lxc-destroy does not destroy cgroup

2011-12-18 Thread Arie Skliarouk
What could be worse than a cgroup not being deleted by lxc-destroy? Why,
the inability to create a cgroup using lxc-create!

Seriously, the host machine cannot start vservers anymore. This is after
one of the cgroups got stuck in the unremovable state.

With these issues it becomes harder and harder for me to justify LXC to my
boss...

--
Arie

On Tue, Dec 13, 2011 at 00:01, Serge Hallyn serge.hal...@canonical.com wrote:

 Quoting Gordon Henderson (gor...@drogon.net):
  On Thu, 8 Dec 2011, Arie Skliarouk wrote:
 
   When I tried to restart the vserver, it did not come up. Long story
 short,
   I found that lxc-destroy did not destroy the cgroup of the same name
 as the
   server. The cgroup remains visible in the /sys/fs/cgroup/cpu/master
   directory. The tasks file is empty though.
 
  And just now, I've had the same thing happen - a container failed to
  start and it left its body in /cgroup - with empty tasks.

 The patch I sent out on Friday should help handle that more gracefully -
 it moves the cgroup out of the way so a new container can start.  You'll
 need to clean the old one up by hand if you care to, though lxc could
 easily provide a tool to clean it up (and move and analyze any tasks left
 running in the cgroup, though I suspect in most cases there are none).

 -serge





Re: [Lxc-users] lxc-destroy does not destroy cgroup

2011-12-18 Thread C Anthony Risinger
On Dec 18, 2011 1:09 PM, Jérôme Petazzoni jerome.petazz...@dotcloud.com
wrote:

 If that happens, just try to terminate the other processes running in the
cgroup, rename it (mv /cgroup/mylittlecontainer /cgroup/broken) and
restart it.
 This saved me a few reboots already :-)

You should be able to remove an empty cgroup with `rmdir`.
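
A hedged sketch of that check-then-remove step: it only calls rmdir when the tasks file lists no PIDs. The function names are hypothetical. On a cgroup filesystem rmdir removes the control files along with the directory, but it can still fail, for instance when child cgroups exist.

```sh
#!/bin/sh
# Sketch only: cgroup_is_empty and remove_empty_cgroup are hypothetical names.

# Succeeds when the cgroup directory $1 has no PIDs in its tasks file.
cgroup_is_empty() {
    [ -f "$1/tasks" ] && ! grep -q '[0-9]' "$1/tasks"
}

# Remove cgroup directory $1 only when no tasks remain in it.
# Any error from rmdir (for example, child cgroups still present)
# is passed through to the caller.
remove_empty_cgroup() {
    if cgroup_is_empty "$1"; then
        rmdir "$1"
    else
        echo "$1 is missing or still has tasks; not removing" >&2
        return 1
    fi
}
```

With the layout discussed in this thread, the call would look like `remove_empty_cgroup /sys/fs/cgroup/cpu/master`, repeated for each controller directory.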

--

C Anthony


Re: [Lxc-users] lxc-destroy does not destroy cgroup

2011-12-18 Thread Jérôme Petazzoni

On 12/18/2011 11:56 AM, C Anthony Risinger wrote:


On Dec 18, 2011 1:09 PM, Jérôme Petazzoni jerome.petazz...@dotcloud.com wrote:


 If that happens, just try to terminate the other processes running 
in the cgroup, rename it (mv /cgroup/mylittlecontainer 
/cgroup/broken) and restart it.

 This saved me a few reboots already :-)

You should be able to remove an empty cgroup with `rmdir`.



Sorry, I forgot to mention:
- if you can't remove the cgroup because it's not empty,
- if the process remaining in the cgroup cannot be killed (for instance, 
because it's in uninterruptible sleep state),
- if you can't move the said process to a different cgroup (with e.g.
echo $PID > /cgroup/othercgroup/tasks),

- ... then you can still rename the old cgroup.
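
The "move the process to a different cgroup" step can be sketched as a loop that writes each PID to the destination's tasks file, one PID per write. move_tasks is a hypothetical name, and the paths are whatever your cgroup mount uses.

```sh
#!/bin/sh
# Sketch only: move_tasks is a hypothetical helper name.
# Move every task listed in cgroup directory $1 into cgroup directory $2
# by writing each PID, one per write, to the destination tasks file.
move_tasks() (
    src=$1 dst=$2
    failed=0
    # Read the PID list once up front, then move the tasks one by one.
    for pid in $(cat "$src/tasks"); do
        # The kernel rejects the write if the task cannot be moved.
        if ! echo "$pid" > "$dst/tasks"; then
            echo "could not move $pid" >&2
            failed=1
        fi
    done
    exit "$failed"
)
```

If this returns non-zero for some PID, renaming the old cgroup as described above remains the fallback.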




Re: [Lxc-users] lxc-destroy does not destroy cgroup

2011-12-12 Thread Gordon Henderson
On Thu, 8 Dec 2011, Arie Skliarouk wrote:

 When I tried to restart the vserver, it did not come up. Long story short,
 I found that lxc-destroy did not destroy the cgroup of the same name as the
 server. The cgroup remains visible in the /sys/fs/cgroup/cpu/master
 directory. The tasks file is empty though.

And just now, I've had the same thing happen - a container failed to
start and it left its body in /cgroup - with empty tasks.

This is latest & greatest - kernel 3.1.4, lxc 0.7.5, Debian squeeze
(kernel & lxc compiled)

It may well have been my own fault - trying to start a container whose
disk image was NFS mounted and I got the error:

mount.nfs: an incorrect mount option was specified

and lxc-start hung. So I may be doing something bogus anyway, however...

(like e.g. trying to bind-mount /proc, /sys, /dev/pts, etc. into an nfs 
mounted directory?)

Gordon



Re: [Lxc-users] lxc-destroy does not destroy cgroup

2011-12-11 Thread Arie Skliarouk

 When I tried to restart the vserver, it did not come up. Long story short,
 I found that lxc-destroy did not destroy the cgroup of the same name as the
 server. The cgroup remains visible in the /sys/fs/cgroup/cpu/master
 directory. The tasks file is empty though.

 I had to rename the container to be able to start it.


About 18 hours after the event, the physical machine locked up hard.
Without any message in dmesg or on its console. Before that, the machine
worked pretty hard for about 60 days without a hitch.

My gut feeling is that it is related to the stale cgroup somehow.

Out of curiosity, what kernel are you running? I'm on 2.6.35, but looking
 at some of the later ones now...


I use kernel 3.0.0-12-server amd64 as packaged in the ubuntu 11.10. I had
problems with earlier kernels as they locked up the machine every week or
so.

--
Arie


Re: [Lxc-users] lxc-destroy does not destroy cgroup

2011-12-11 Thread Gordon Henderson
On Sun, 11 Dec 2011, Arie Skliarouk wrote:


 When I tried to restart the vserver, it did not come up. Long story short,
 I found that lxc-destroy did not destroy the cgroup of the same name as the
 server. The cgroup remains visible in the /sys/fs/cgroup/cpu/master
 directory. The tasks file is empty though.

 I had to rename the container to be able to start it.


 About 18 hours after the event, the physical machine locked up hard.
 Without any message in dmesg or on its console. Before that, the machine
 worked pretty hard for about 60 days without a hitch.

Ouch. And oddly enough, I had a hard-lockup a few days ago myself that 
needed a power cycle.

 My gut feeling is that it is related to the stale cgroup somehow.

 Out of curiosity, what kernel are you running? I'm on 2.6.35, but looking
 at some of the later ones now...

 I use kernel 3.0.0-12-server amd64 as packaged in the ubuntu 11.10. I had
 problems with earlier kernels as they locked up the machine every week or
 so.

My base is Debian Stable, but I custom compile the kernels to match 
hardware. I've put the latest & greatest on a test server to see how it
fares - so far so good, but there's no real load on it.

I think it would be good for more people to start to post their 
experiences with LXC though - who knows how many people are using it - any 
big companies using it in anger (as opposed to KVM, XEN, etc.) and so 
on. (or small companies with big installations!)

I have 2 areas of application for it - one is hosted Asterisk PBXs, and 
for that it seems to work really well, but the run-time environment is 
very carefully controlled - it basically runs sendmail, sshd, apache+php 
and asterisk and nothing else. The other application I use them for is
more of a management side - for running general purpose LAMP servers in -
mostly to make sure I can relatively quickly move an image from one server
to another to cover hardware issues or a temporary/permanent increase (or
reduction!) in available resources... I don't consider my own use big by
any means at all though..

Gordon



[Lxc-users] lxc-destroy does not destroy cgroup

2011-12-08 Thread Arie Skliarouk
Hi,

Most of the time the lxc-destroy works properly, removing the cgroup with
the same name as the container.

Today something strange happened on one of my vservers - suddenly it
stopped responding to requests and any attempt to connect just hung (as
if the connection was successful, but no data was coming through).
Checking dmesg on the host machine revealed that the vserver got into
an out-of-memory situation:

[304880.371274] Memory cgroup out of memory: Kill process 1959 (init) score 1 or sacrifice child
[304880.403765] Killed process 10638 (apache2) total-vm:40608kB, anon-rss:12kB, file-rss:4088kB
[304881.719832] bash invoked oom-killer: gfp_mask=0xd0, order=0, oom_adj=0, oom_score_adj=0
...
[304881.719965] Task in /master killed as a result of limit of /master
[304881.719970] memory: usage 976564kB, limit 976564kB, failcnt 60487010
...
[304881.835887] [ 8938] 20081  8938 2625   90   6 0 0 sshd
[304881.835897] Memory cgroup out of memory: Kill process 1959 (init) score 1 or sacrifice child
[304881.836836] Killed process 19680 (apache2) total-vm:41372kB, anon-rss:0kB, file-rss:4504kB
[304884.298748] bash invoked oom-killer: gfp_mask=0xd0, order=0, oom_adj=0, oom_score_adj=0
...
[304884.414478] Memory cgroup out of memory: Kill process 1959 (init) score 1 or sacrifice child
[304884.415428] Killed process 1959 (init) total-vm:3188kB, anon-rss:0kB, file-rss:476kB

Note that the last process that got killed was init. IMHO it should be the
last process to be killed, immediately after sshd, but that's a minor
problem.
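
The usage, limit and failcnt figures in the log can also be read directly from the container's memory cgroup before the OOM killer gets involved. This is a rough sketch, assuming the cgroup v1 memory controller files memory.usage_in_bytes, memory.limit_in_bytes and memory.failcnt; mem_cgroup_report is a hypothetical helper name.

```sh
#!/bin/sh
# Sketch only: mem_cgroup_report is a hypothetical helper name.
# Print usage, limit and failure count for memory cgroup directory $1,
# in the same units (kB) as the kernel log lines.
mem_cgroup_report() (
    dir=$1
    usage=$(cat "$dir/memory.usage_in_bytes") || exit 1
    limit=$(cat "$dir/memory.limit_in_bytes") || exit 1
    failcnt=$(cat "$dir/memory.failcnt") || exit 1
    echo "usage $(($usage / 1024))kB, limit $(($limit / 1024))kB, failcnt $failcnt"
)
```

On a host laid out like this one, the call would be something like `mem_cgroup_report /sys/fs/cgroup/memory/master`; a failcnt that keeps climbing means the container is repeatedly hitting its limit.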

When I tried to restart the vserver, it did not come up. Long story short,
I found that lxc-destroy did not destroy the cgroup of the same name as the
server. The cgroup remains visible in the /sys/fs/cgroup/cpu/master
directory. The tasks file is empty though.

I had to rename the container to be able to start it.

All this on ubuntu 11.04, 3.0.0-12-server amd64. Thoughts, comments?

--
Arie


Re: [Lxc-users] lxc-destroy does not destroy cgroup

2011-12-08 Thread Gordon Henderson
On Thu, 8 Dec 2011, Arie Skliarouk wrote:

 When I tried to restart the vserver, it did not come up. Long story short,
 I found that lxc-destroy did not destroy the cgroup of the same name as the
 server. The cgroup remains visible in the /sys/fs/cgroup/cpu/master
 directory. The tasks file is empty though.

 I had to rename the container to be able to start it.

Did you remember to stop it first?

 All this on ubuntu 11.04, 3.0.0-12-server amd64. Thoughts, comments?

Very very similar to what I experience from time to time. (I posted about
it recently with zero response.) My more drastic solution is to
reboot the host, but I have gotten away with lxc-stop then a start.

I've now stopped using memory limits in containers and for the time being 
will let them swap (or share more memory with other containers and swap 
if needed) - they're mostly well behaved though.

I don't have a solution I'm afraid.

Gordon



Re: [Lxc-users] lxc-destroy does not destroy cgroup

2011-12-08 Thread Arie Skliarouk
On Thu, Dec 8, 2011 at 14:05, Gordon Henderson gor...@drogon.net wrote:

 On Thu, 8 Dec 2011, Arie Skliarouk wrote:

  When I tried to restart the vserver, it did not come up. Long story
 short, I found that lxc-destroy did not destroy the cgroup of the same name
 as the server. The cgroup remains visible in the /sys/fs/cgroup/cpu/master
 directory. The tasks file is empty though.

   I had to rename the container to be able to start it.

 Did you remember to stop it first?


Of course! It is part of the vserver stop script.


  All this on ubuntu 11.04, 3.0.0-12-server amd64. Thoughts, comments?

 Very very similar to what I experience from time to time. (Posted about
 recently with zero response) Although my more drastic solution is to reboot
 the host, but I have gotten away with lxc-stop then a start.


Well, with 65 running containers (24GB of RAM) it is easier to rename the
vserver :)

I've now stopped using memory limits in containers and for the time being
 will let them swap (or share more memory with other containers and swap if
 needed) - they're mostly well behaved though.


My vservers do not behave well and require restrictions.

BTW, do you know how I can restrict the number of running processes in a
container (like in openvz)?

--
Arie


Re: [Lxc-users] lxc-destroy does not destroy cgroup

2011-12-08 Thread Gordon Henderson
On Thu, 8 Dec 2011, Arie Skliarouk wrote:

 On Thu, Dec 8, 2011 at 14:05, Gordon Henderson gor...@drogon.net wrote:

 On Thu, 8 Dec 2011, Arie Skliarouk wrote:

 When I tried to restart the vserver, it did not come up. Long story
 short, I found that lxc-destroy did not destroy the cgroup of the same name
 as the server. The cgroup remains visible in the /sys/fs/cgroup/cpu/master
 directory. The tasks file is empty though.

  I had to rename the container to be able to start it.

 Did you remember to stop it first?

 Of course! It is part of the vserver stop script.

Just checking!

 All this on ubuntu 11.04, 3.0.0-12-server amd64. Thoughts, comments?

 Very very similar to what I experience from time to time. (Posted about
 recently with zero response) Although my more drastic solution is to reboot
 the host, but I have gotten away with lxc-stop then a start.

 Well, with 65 running containers (24GB of RAM) it is easier to rename the
 vserver :)

Yes. I can see that a system restart might irritate a few other people!

Out of curiosity, what kernel are you running? I'm on 2.6.35, but looking 
at some of the later ones now...

 I've now stopped using memory limits in containers and for the time being
 will let them swap (or share more memory with other containers and swap if
 needed) - they're mostly well behaved though.

 My vservers do not behave well and require restrictions.

OK.

 BTW, do you know how I can restrict the number of running processes in a
 container (like in openvz)?

No idea I'm afraid. I guess some sort of super limit passed into the
container's init (via setrlimit() ?) is what's needed...

Gordon
