Re: [Lxc-users] Zombie container

2011-02-15 Thread Brian K. White
On 2/14/2011 6:50 PM, Trent W. Buck wrote:
> Daniel Lezcano  writes:
>
>> As a quick fix, I suggest you look what application created the new
>> namespace. Launch your container and then look at
>> /cgroup/blackbird/1234/tasks and look for the command line associated
>> with the pid in this file. I suspect vsftpd could be the culprit. If
>> this is the case, there is an option to disable the namespace
>> creation.
>
> Or, of course, pick a different application :-)
>
> If it is vsftpd, I *strongly* recommend switching to SFTP (part of SSH)
> for writes, and HTTP for reads.  http://mywiki.wooledge.org/FtpMustDie

Well, of course, but what's that got to do with LXC or the namespace 
trick that vsftpd happens to use?

Your observations, which everyone already knows, show that the ftp 
protocol is problematic. Granted, but so what?

The discussion here is how to get all commonly used tools working within 
containers, using lxc, that are currently used outside of containers, 
not what tools to use.

3 things:

1) The vsftpd problem is not a problem with the ftp protocol. Apache or 
any other service or app that meets your religious or aesthetic approval 
might have the same or similar problem at any time. Here we are only 
interested in containerizing anything that currently is done on 
traditional servers. For better or for worse, FTP is widely used on 
traditional servers, and specifically vsftpd is. And so the discussion 
is about how to use vsftpd within a container, not whether to use ftp.

2) As if everyone has any choice in the matter anyway, since most use of 
any communication protocol, such as ftp, involves two different parties, 
not yourself at both ends. Even if you were so gauche as to try to 
dictate internal IT policies and procedures and technologies to your own 
customers and vendors, you still don't get to dictate to 2nd or more 
removed customers and vendors of your own customers and vendors. So when 
_big honking global bank/manufacturer/retailer/shipper/etc_ says they 
will ftp to you or you to them, you just *&^*7 do it.

Oh you can offer the alternatives, and occasionally you get lucky, but 
that doesn't remove the need to make ftp work. Same goes for every other 
commonly used technology that you don't happen to personally like.

3) What makes http so special only for reading and sftp so special only 
for writing? Depending on my security needs and other factors I 
routinely use http for writing and/or sftp for reading. I also use rsync 
(native, not via ssh or rsh) for both reading and writing in many 
situations where most people use ftp or sftp or http. Conversely I never 
use nfs and only use samba extremely rarely, but I'm sure these 
technologies are perfectly justifiable and required for other people in 
other situations. Choice of tool is completely dependent on the job at 
hand and it's utterly silly to try to say what should and should not be 
used except within the context of a specific job, and then the answer 
only applies to that one specific job in that one specific context.

-- 
bkw

--
The ultimate all-in-one performance toolkit: Intel(R) Parallel Studio XE:
Pinpoint memory and threading errors before they happen.
Find and fix more than 250 security defects in the development cycle.
Locate bottlenecks in serial and parallel code that limit performance.
http://p.sf.net/sfu/intel-dev2devfeb
___
Lxc-users mailing list
Lxc-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/lxc-users


Re: [Lxc-users] Zombie container

2011-02-15 Thread Milan Zamazal
> "DL" == Daniel Lezcano  writes:

DL> * simply do rm -rf /cgroup/blackbird (don't care about the
DL> errors).
>> 
>> This fails with "Operation not permitted" and the problem
>> persists.

DL> Did you try to remove the directories as root after the container
DL> exited?

Yes.

DL> It is not a kernel problem, it's the expected behavior but
DL> unfortunately the cgroup automatic creation does not really fit
DL> with the namespace concept. This is why the ns_cgroup will be
DL> removed in the next kernel version in order to manage the cgroup
DL> consistently.

OK, I simply have to live with the problem (it's not fatal) until then.





Re: [Lxc-users] Zombie container

2011-02-15 Thread Daniel Lezcano
On 02/15/2011 10:17 AM, Milan Zamazal wrote:
>> "DL" == Daniel Lezcano  writes:
>  DL>  It is probable you have an application creating new namespaces
>  DL>  in the container. That's triggering a new cgroup creation which
>  DL>  is nested with the container's one. This is a kernel feature
>  DL>  (removed for the next kernel version).
>
> Thank you for explanation.
>
> By watching when these subdirectories get created I discovered the
> problem appears when I run `fusermount -u'.
>
>  DL>* simply do rm -rf /cgroup/blackbird (don't care about the
>  DL>errors).
>
> This fails with "Operation not permitted" and the problem persists.

Did you try to remove the directories as root after the container exited?

>  DL>  Launch your container and then look at
>  DL>  /cgroup/blackbird/1234/tasks and look for the command line
>  DL>  associated with the pid in this file.
>
> The `tasks' file is empty.  But it must be fusermount or something
> related to its invocation.

Ok. Interesting.

>  DL>  Hope that helps.
>
> Thank you for help.  Now I know what creates the problem, but I still
> don't know how to safely prevent it or remedy it.  Maybe it's a kernel
> problem (I use standard kernel 2.6.32 from Debian)?

It is not a kernel problem, it's the expected behavior but unfortunately 
the cgroup automatic creation does not really fit with the namespace 
concept. This is why the ns_cgroup will be removed in the next kernel 
version in order to manage the cgroup consistently.

http://git.kernel.org/?p=linux/kernel/git/sfr/linux-next.git;a=blob;f=Documentation/feature-removal-schedule.txt;h=ada3db8fc9f6307b0b9b51b503353a96b995b62d;hb=b7bbcc2b04070ebd77c827e8ebbd08a5b7493004








Re: [Lxc-users] Zombie container

2011-02-15 Thread Milan Zamazal
> "DL" == Daniel Lezcano  writes:

DL> It is probable you have an application creating new namespaces
DL> in the container. That's triggering a new cgroup creation which
DL> is nested with the container's one. This is a kernel feature
DL> (removed for the next kernel version).

Thank you for the explanation.

By watching when these subdirectories get created I discovered the
problem appears when I run `fusermount -u'.
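
One way to repeat this observation is to snapshot the numerically named subdirectories before and after running a suspect command. The following is only a sketch in Python; the function names `nested_cgroups` and `new_since` are made up for illustration and are not part of lxc.

```python
import os


def nested_cgroups(cgroup_dir):
    """Return the set of numerically named subdirectories of cgroup_dir."""
    return {
        name
        for name in os.listdir(cgroup_dir)
        if name.isdigit() and os.path.isdir(os.path.join(cgroup_dir, name))
    }


def new_since(before, cgroup_dir):
    """Return subdirectories that appeared after the `before` snapshot."""
    # Names are digit strings, so sort them by numeric value.
    return sorted(nested_cgroups(cgroup_dir) - before, key=int)
```

Taking `before = nested_cgroups("/cgroup/blackbird")`, running the suspect command (here `fusermount -u`) inside the container, and then calling `new_since(before, "/cgroup/blackbird")` should list any nested cgroups created in between.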

DL>   * simply do rm -rf /cgroup/blackbird (don't care about the
DL>   errors).

This fails with "Operation not permitted" and the problem persists.

DL> Launch your container and then look at
DL> /cgroup/blackbird/1234/tasks and look for the command line
DL> associated with the pid in this file.

The `tasks' file is empty.  But it must be fusermount or something
related to its invocation.

DL> Hope that helps.

Thank you for the help.  Now I know what creates the problem, but I still
don't know how to safely prevent it or remedy it.  Maybe it's a kernel
problem (I use standard kernel 2.6.32 from Debian)?





Re: [Lxc-users] Zombie container

2011-02-14 Thread Miroslav Lednicky, AVONET, s.r.o.
On 15.2.2011 00:50, Trent W. Buck wrote:
> Daniel Lezcano  writes:
>
>> As a quick fix, I suggest you look what application created the new
>> namespace. Launch your container and then look at
>> /cgroup/blackbird/1234/tasks and look for the command line associated
>> with the pid in this file. I suspect vsftpd could be the culprit. If
>> this is the case, there is an option to disable the namespace
>> creation.
>
> Or, of course, pick a different application :-)
>
> If it is vsftpd, I *strongly* recommend switching to SFTP (part of SSH)
> for writes, and HTTP for reads.  http://mywiki.wooledge.org/FtpMustDie


If it is vsftpd, you can add:

isolate=NO
isolate_network=NO

to /etc/vsftpd.conf

and all will be OK.
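
To verify that an existing configuration carries both overrides, a small check along these lines may help. It is a sketch in Python; the function name is hypothetical, and it assumes the usual `option=value` syntax with `#` comment lines and that a later line overrides an earlier one.

```python
def missing_isolation_overrides(conf_text):
    """Return the names among isolate / isolate_network not set to NO."""
    wanted = ("isolate", "isolate_network")
    seen = {}
    for line in conf_text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not option=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        seen[key] = value.strip()  # assumption: the last occurrence wins
    return sorted(k for k in wanted if seen.get(k, "").upper() != "NO")
```

Passing the contents of /etc/vsftpd.conf to the function returns an empty list when both options are set to NO, and otherwise the names that still need to be added.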

Miroslav.

-- 
Miroslav Lednicky, AVONET, s.r.o.



Re: [Lxc-users] Zombie container

2011-02-14 Thread Trent W. Buck
Daniel Lezcano  writes:

> As a quick fix, I suggest you look what application created the new
> namespace. Launch your container and then look at
> /cgroup/blackbird/1234/tasks and look for the command line associated
> with the pid in this file. I suspect vsftpd could be the culprit. If
> this is the case, there is an option to disable the namespace
> creation.

Or, of course, pick a different application :-)

If it is vsftpd, I *strongly* recommend switching to SFTP (part of SSH)
for writes, and HTTP for reads.  http://mywiki.wooledge.org/FtpMustDie




Re: [Lxc-users] Zombie container

2011-02-14 Thread Daniel Lezcano
On 02/14/2011 07:33 PM, Milan Zamazal wrote:
> On a Debian 6.0 machine, I've got a certain container that can't be
> started again once it was stopped:
>
>   # lxc-start -n blackbird
>   lxc-start: Device or resource busy - failed to remove previous cgroup '/cgroup/blackbird'
>   lxc-start: failed to spawn 'blackbird'
>   lxc-start: Device or resource busy - failed to remove cgroup '/cgroup/blackbird'
>
> The container seems to be stopped completely but the /cgroup/blackbird/
> directory is indeed non-empty.  There are some subdirectories with
> numeric names there but I can't find any processes with such numbers in
> the system nor any other processes related to the container.  The only
> way to get rid of it is to reboot the host.
>
> Is there a way to force removal of the cgroup?  Or is there a way to
> find out what keeps the cgroup busy?

It is probable you have an application creating new namespaces in the 
container. That's triggering a new cgroup creation which is nested with 
the container's one. This is a kernel feature (removed for the next 
kernel version).

There are several solutions:

  * fix this behavior in lxc where we will recursively remove the cgroup 
directories
  * simply do rm -rf /cgroup/blackbird (don't care about the errors).
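
The recursive removal in the first option can be sketched as follows. On the cgroup filesystem the control files (tasks and so on) cannot be unlinked, and a cgroup is removed by calling rmdir on its directory, so the sketch only removes directories, deepest first. It is written in Python for illustration, the function name is hypothetical, and it is not how lxc itself does it. A cgroup that still holds tasks, or that the kernel still references, will most likely keep failing with "Device or resource busy".

```python
import os


def remove_cgroup_tree(root):
    """Remove nested cgroup directories bottom-up; return the failures."""
    failures = []
    # topdown=False yields the deepest directories first, so children are
    # attempted before their parents.
    for dirpath, _dirnames, _filenames in os.walk(root, topdown=False):
        try:
            os.rmdir(dirpath)  # never tries to unlink the control files
        except OSError as exc:
            failures.append((dirpath, exc.strerror))
    return failures
```

Calling `remove_cgroup_tree("/cgroup/blackbird")` as root after the container has exited returns an empty list on success, or the directories that could not be removed together with the reason.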

As a quick fix, I suggest you check which application created the new 
namespace. Launch your container and then look at 
/cgroup/blackbird/1234/tasks and look for the command line associated 
with the pid in this file. I suspect vsftpd could be the culprit. If 
this is the case, there is an option to disable the namespace creation.

http://www.mail-archive.com/lxc-users@lists.sourceforge.net/msg01110.html
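
The lookup described above, from the pids in a tasks file to their command lines, could look roughly like this. It is a sketch in Python; the function name and the `proc_root` parameter are illustrative assumptions, and 1234 stands for whatever numeric subdirectory shows up.

```python
import os


def describe_tasks(cgroup_dir, proc_root="/proc"):
    """Return (pid, command line) pairs for the pids in <cgroup_dir>/tasks."""
    with open(os.path.join(cgroup_dir, "tasks")) as f:
        pids = [line.strip() for line in f if line.strip()]
    results = []
    for pid in pids:
        try:
            with open(os.path.join(proc_root, pid, "cmdline"), "rb") as f:
                raw = f.read()
        except OSError:
            # The process may have exited between the two reads.
            results.append((int(pid), None))
            continue
        # /proc/<pid>/cmdline separates the arguments with NUL bytes.
        args = [a.decode(errors="replace") for a in raw.split(b"\0") if a]
        results.append((int(pid), " ".join(args)))
    return results
```

For example, `describe_tasks("/cgroup/blackbird/1234")` run on the host while the container is up returns one pair per task. An empty list means the tasks file is empty, and the culprit then has to be found some other way.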

Hope that helps.

   -- Daniel



[Lxc-users] Zombie container

2011-02-14 Thread Milan Zamazal
On a Debian 6.0 machine, I've got a certain container that can't be
started again once it was stopped:

  # lxc-start -n blackbird
  lxc-start: Device or resource busy - failed to remove previous cgroup '/cgroup/blackbird'
  lxc-start: failed to spawn 'blackbird'
  lxc-start: Device or resource busy - failed to remove cgroup '/cgroup/blackbird'

The container seems to be stopped completely but the /cgroup/blackbird/
directory is indeed non-empty.  There are some subdirectories with
numeric names there but I can't find any processes with such numbers in
the system nor any other processes related to the container.  The only
way to get rid of it is to reboot the host.

Is there a way to force removal of the cgroup?  Or is there a way to
find out what keeps the cgroup busy?


