Re: [libvirt] [RFC][PATCH] lxc: fix for ns cgroups subsystem

2009-05-20 Thread Daniel Veillard
On Wed, May 20, 2009 at 12:27:58AM +0900, Ryota Ozaki wrote:
 Apologies for not responding. I've come down with a cold
 (not flu) for several days...
 
 On Tue, May 19, 2009 at 10:34 PM, Daniel Veillard veill...@redhat.com wrote:
[...]
   Just to make sure, this is not ready, right ?
  We freeze for libvirt-0.6.4 this week-end,
 
 Yes, not completed yet, but continue working on. So I
 hope to be in time.

  Okay, thanks !

Daniel

-- 
Daniel Veillard  | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
dan...@veillard.com  | Rpmfind RPM search engine http://rpmfind.net/
http://veillard.com/ | virtualization library  http://libvirt.org/

--
Libvir-list mailing list
Libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [RFC][PATCH] lxc: fix for ns cgroups subsystem

2009-05-19 Thread Daniel Veillard
On Mon, May 11, 2009 at 06:01:07PM +0900, Ryota Ozaki wrote:
 Hi,
 I've updated the patch. The change includes support for multiple mount
 points of cgroups that I didn't cope with in the previous patch.
 
 Through the work, I found a bit messy problem. Current lxc controller writes
 pid in a 'tasks' file multiple times if one mount point has multiple 
 subsystems.
 It is bad because the first write changes the cgroups path of a controller, 
 and
 then the second write points a missing file like
 $CGROUPS_MOUNTPOINT/path_to_domain/path_to_domain/tasks where
 the correct file is $CGROUPS_MOUNTPOINT/path_to_domain/tasks.
 I did workaround this problem with a tricky way by truncating the
 duplicated path.
 We probably need a more feasible solution.
 
 Note that this patch intends a proof of concept to be clear the correct way
 to go and thus it remains a couple of awkward codes. Of course the codes
 should be removed if we find the correct way.

  Just to make sure, this is not ready, right ?
We freeze for libvirt-0.6.4 this week-end,

Daniel

-- 
Daniel Veillard  | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
dan...@veillard.com  | Rpmfind RPM search engine http://rpmfind.net/
http://veillard.com/ | virtualization library  http://libvirt.org/

--
Libvir-list mailing list
Libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [RFC][PATCH] lxc: fix for ns cgroups subsystem

2009-05-16 Thread Serge E. Hallyn
Quoting Ryota Ozaki (ozaki.ry...@gmail.com):
 I've updated the patch. The change includes support for multiple mount
 points of cgroups that I didn't cope with in the previous patch.
 
 Through the work, I found a bit messy problem. Current lxc controller writes
 pid in a 'tasks' file multiple times if one mount point has multiple 
 subsystems.
 It is bad because the first write changes the cgroups path of a controller, 
 and
 then the second write points a missing file like
 $CGROUPS_MOUNTPOINT/path_to_domain/path_to_domain/tasks where
 the correct file is $CGROUPS_MOUNTPOINT/path_to_domain/tasks.
 I did workaround this problem with a tricky way by truncating the
 duplicated path.
 We probably need a more feasible solution.

Seems like the loop in virCgroupAddTask() should just keep track
of which hierarchies it has already written to.  Just a small
temporary hash table or for that matter a small array that you
walk linearly...

-serge

--
Libvir-list mailing list
Libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [RFC][PATCH] lxc: fix for ns cgroups subsystem

2009-05-11 Thread Ryota Ozaki
Hi,

On Mon, May 11, 2009 at 10:26 AM, Ryota Ozaki ozaki.ry...@gmail.com wrote:
 Hi,

 On Sun, May 10, 2009 at 8:49 AM, Ryota Ozaki ozaki.ry...@gmail.com wrote:
 Hi Serge and Daniel,

 On Sat, May 9, 2009 at 4:03 AM, Serge E. Hallyn se...@us.ibm.com wrote:
 Quoting Daniel P. Berrange (berra...@redhat.com):
 On Fri, May 08, 2009 at 08:34:12AM -0500, Serge E. Hallyn wrote:
  Quoting Ryota Ozaki (ozaki.ry...@gmail.com):
   Hi Serge,
  
   On Fri, May 8, 2009 at 11:48 AM, Serge E. Hallyn se...@us.ibm.com 
   wrote:
IIUC, the real problem is that src/cgroup.c assumes that the
cgroup name should be $CGROUP_MOUNTPOINT/groupname.  But of
course if the ns cgroup is enabled, then the unshare(CLONE_NEWNS)
to create a new namespace in which to mount the new devpts
locks the driver under $CGROUP_MOUNTPOINT/pid_of_driver/
or somesuch.
   
If this fixes the problem I have no objections, but it seems
more fragile than perhaps trying to teach src/cgroup.c to
consider it's current cgroup as a starting point.
  
   hmm, I don't know why the assumption is bad and how the approach
   you are suggesting helps the ns problem.
 
  To be clear, the asssumption is that the driver starts in the
  root cgroup, i.e. it's pid is listed in $CGROUP_MOUNTPOINT/tasks.
  And that it can create $CGROUP_MOUNTPOINT/groupname and move
  itself into $CGROUP_MOUNTPOINT/groupname/tasks.
 
  So, the assumption is bad because when the driver does a
  unshare(CLONE_NEWNS), it gets moved into $CGROUP_MOUNTPOINT/X,
  and after that can only move itself into
  $CGROUP_MOUNTPOINT/X/groupname.
 
  Even with your patch, it's possible for the lxc driver to have
  been started under say $CGROUP_MOUNTPOINT/libvir or
  $CGROUP_MOUNTPOINT/username through libcgroup/PAM for instance,
  in which case your patch would be insufficient.

 Indeed, we can't assume we're in the root group at any time. Our
 general plan is that libvirt itself uses 3 levels of cgroup
 starting at the cgroup that libvirtd was placed in by the admin
 of the host, then a per-driver group, then per-guest groups

   $LIBVIRTD_ROOT_CGROUP
      |
      +- lxc
      |    |
      |    +- LXC-GUEST-1
      |    +- LXC-GUEST-2
      |    +- LXC-GUEST-3
      |    +- ...
      |
      +- qemu
           |
           +- QEMU-GUEST-1
           +- QEMU-GUEST-2
           +- QEMU-GUEST-3
           +- ...

 $LIBVIRTD_ROOT_CGROUP, may be the actaul root mount point for
 the cgroup controller in question, or it may be a sub-directory
 that the admin chose to put it in. Also, if running libvirtd
 as a normal user, the admin may have created per-user account
 cgroups and so libvirtd would end up in there. So we have to
 be prepared for anything wrt initial libvirtd cgroup root.

 NB The code for putting QEMU in a cgroup isn't merged yet.

 That sounds good.  I just don't think the code currently does
 it.  To do the right thing, IIUC, virCgroupPathOfGroup() should
 parse the /proc/pid/cgroup contents of the 'controller' process,
 and insert that between the virCgroupGetMount(controller)
 result and the group name.

 Or something...

 (right?)

 thanks,
 -serge


 OK I probably understood the problem and wrote a patch for that
 according to the Serge's suggestion.

 I expect this patch fixes the problem, however unfortunately
 I've not tried it yet under the situation you are worrying and
 just done regression tests. Because I don't know easy way to
 produce the situation. If you know that, please let me know...

 I found a way to go. lxc-unshare helps me and unveils my second
 path is broken...

  ozaki-r


 Thanks,
  ozaki-r


I've updated the patch. The change includes support for multiple mount
points of cgroups that I didn't cope with in the previous patch.

Through the work, I found a bit messy problem. Current lxc controller writes
pid in a 'tasks' file multiple times if one mount point has multiple subsystems.
It is bad because the first write changes the cgroups path of a controller, and
then the second write points a missing file like
$CGROUPS_MOUNTPOINT/path_to_domain/path_to_domain/tasks where
the correct file is $CGROUPS_MOUNTPOINT/path_to_domain/tasks.
I did workaround this problem with a tricky way by truncating the
duplicated path.
We probably need a more feasible solution.

Note that this patch intends a proof of concept to be clear the correct way
to go and thus it remains a couple of awkward codes. Of course the codes
should be removed if we find the correct way.

Thanks,
  ozaki-r

From 473183a77fbdb002d55acfed0d8f9bd485f548cd Mon Sep 17 00:00:00 2001
From: Ryota Ozaki ozaki.ry...@gmail.com
Date: Mon, 11 May 2009 15:18:47 +0900
Subject: [PATCH] [v2] cgroups: address libvirtd's bad assumption for
cgroups hierarchy

---
 src/cgroup.c |   87 -
 1 files changed, 85 insertions(+), 2 deletions(-)

diff --git a/src/cgroup.c b/src/cgroup.c
index 50517e2..58ba588 100644
--- a/src/cgroup.c
+++ 

Re: [libvirt] [RFC][PATCH] lxc: fix for ns cgroups subsystem

2009-05-10 Thread Ryota Ozaki
Hi,

On Sun, May 10, 2009 at 8:49 AM, Ryota Ozaki ozaki.ry...@gmail.com wrote:
 Hi Serge and Daniel,

 On Sat, May 9, 2009 at 4:03 AM, Serge E. Hallyn se...@us.ibm.com wrote:
 Quoting Daniel P. Berrange (berra...@redhat.com):
 On Fri, May 08, 2009 at 08:34:12AM -0500, Serge E. Hallyn wrote:
  Quoting Ryota Ozaki (ozaki.ry...@gmail.com):
   Hi Serge,
  
   On Fri, May 8, 2009 at 11:48 AM, Serge E. Hallyn se...@us.ibm.com 
   wrote:
IIUC, the real problem is that src/cgroup.c assumes that the
cgroup name should be $CGROUP_MOUNTPOINT/groupname.  But of
course if the ns cgroup is enabled, then the unshare(CLONE_NEWNS)
to create a new namespace in which to mount the new devpts
locks the driver under $CGROUP_MOUNTPOINT/pid_of_driver/
or somesuch.
   
If this fixes the problem I have no objections, but it seems
more fragile than perhaps trying to teach src/cgroup.c to
consider it's current cgroup as a starting point.
  
   hmm, I don't know why the assumption is bad and how the approach
   you are suggesting helps the ns problem.
 
  To be clear, the asssumption is that the driver starts in the
  root cgroup, i.e. it's pid is listed in $CGROUP_MOUNTPOINT/tasks.
  And that it can create $CGROUP_MOUNTPOINT/groupname and move
  itself into $CGROUP_MOUNTPOINT/groupname/tasks.
 
  So, the assumption is bad because when the driver does a
  unshare(CLONE_NEWNS), it gets moved into $CGROUP_MOUNTPOINT/X,
  and after that can only move itself into
  $CGROUP_MOUNTPOINT/X/groupname.
 
  Even with your patch, it's possible for the lxc driver to have
  been started under say $CGROUP_MOUNTPOINT/libvir or
  $CGROUP_MOUNTPOINT/username through libcgroup/PAM for instance,
  in which case your patch would be insufficient.

 Indeed, we can't assume we're in the root group at any time. Our
 general plan is that libvirt itself uses 3 levels of cgroup
 starting at the cgroup that libvirtd was placed in by the admin
 of the host, then a per-driver group, then per-guest groups

   $LIBVIRTD_ROOT_CGROUP
      |
      +- lxc
      |    |
      |    +- LXC-GUEST-1
      |    +- LXC-GUEST-2
      |    +- LXC-GUEST-3
      |    +- ...
      |
      +- qemu
           |
           +- QEMU-GUEST-1
           +- QEMU-GUEST-2
           +- QEMU-GUEST-3
           +- ...

 $LIBVIRTD_ROOT_CGROUP, may be the actaul root mount point for
 the cgroup controller in question, or it may be a sub-directory
 that the admin chose to put it in. Also, if running libvirtd
 as a normal user, the admin may have created per-user account
 cgroups and so libvirtd would end up in there. So we have to
 be prepared for anything wrt initial libvirtd cgroup root.

 NB The code for putting QEMU in a cgroup isn't merged yet.

 That sounds good.  I just don't think the code currently does
 it.  To do the right thing, IIUC, virCgroupPathOfGroup() should
 parse the /proc/pid/cgroup contents of the 'controller' process,
 and insert that between the virCgroupGetMount(controller)
 result and the group name.

 Or something...

 (right?)

 thanks,
 -serge


 OK I probably understood the problem and wrote a patch for that
 according to the Serge's suggestion.

 I expect this patch fixes the problem, however unfortunately
 I've not tried it yet under the situation you are worrying and
 just done regression tests. Because I don't know easy way to
 produce the situation. If you know that, please let me know...

I found a way to go. lxc-unshare helps me and unveils my second
path is broken...

  ozaki-r


 Thanks,
  ozaki-r

--
Libvir-list mailing list
Libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [RFC][PATCH] lxc: fix for ns cgroups subsystem

2009-05-09 Thread Ryota Ozaki
Hi Serge and Daniel,

On Sat, May 9, 2009 at 4:03 AM, Serge E. Hallyn se...@us.ibm.com wrote:
 Quoting Daniel P. Berrange (berra...@redhat.com):
 On Fri, May 08, 2009 at 08:34:12AM -0500, Serge E. Hallyn wrote:
  Quoting Ryota Ozaki (ozaki.ry...@gmail.com):
   Hi Serge,
  
   On Fri, May 8, 2009 at 11:48 AM, Serge E. Hallyn se...@us.ibm.com 
   wrote:
IIUC, the real problem is that src/cgroup.c assumes that the
cgroup name should be $CGROUP_MOUNTPOINT/groupname.  But of
course if the ns cgroup is enabled, then the unshare(CLONE_NEWNS)
to create a new namespace in which to mount the new devpts
locks the driver under $CGROUP_MOUNTPOINT/pid_of_driver/
or somesuch.
   
If this fixes the problem I have no objections, but it seems
more fragile than perhaps trying to teach src/cgroup.c to
consider it's current cgroup as a starting point.
  
   hmm, I don't know why the assumption is bad and how the approach
   you are suggesting helps the ns problem.
 
  To be clear, the asssumption is that the driver starts in the
  root cgroup, i.e. it's pid is listed in $CGROUP_MOUNTPOINT/tasks.
  And that it can create $CGROUP_MOUNTPOINT/groupname and move
  itself into $CGROUP_MOUNTPOINT/groupname/tasks.
 
  So, the assumption is bad because when the driver does a
  unshare(CLONE_NEWNS), it gets moved into $CGROUP_MOUNTPOINT/X,
  and after that can only move itself into
  $CGROUP_MOUNTPOINT/X/groupname.
 
  Even with your patch, it's possible for the lxc driver to have
  been started under say $CGROUP_MOUNTPOINT/libvir or
  $CGROUP_MOUNTPOINT/username through libcgroup/PAM for instance,
  in which case your patch would be insufficient.

 Indeed, we can't assume we're in the root group at any time. Our
 general plan is that libvirt itself uses 3 levels of cgroup
 starting at the cgroup that libvirtd was placed in by the admin
 of the host, then a per-driver group, then per-guest groups

   $LIBVIRTD_ROOT_CGROUP
      |
      +- lxc
      |    |
      |    +- LXC-GUEST-1
      |    +- LXC-GUEST-2
      |    +- LXC-GUEST-3
      |    +- ...
      |
      +- qemu
           |
           +- QEMU-GUEST-1
           +- QEMU-GUEST-2
           +- QEMU-GUEST-3
           +- ...

 $LIBVIRTD_ROOT_CGROUP, may be the actaul root mount point for
 the cgroup controller in question, or it may be a sub-directory
 that the admin chose to put it in. Also, if running libvirtd
 as a normal user, the admin may have created per-user account
 cgroups and so libvirtd would end up in there. So we have to
 be prepared for anything wrt initial libvirtd cgroup root.

 NB The code for putting QEMU in a cgroup isn't merged yet.

 That sounds good.  I just don't think the code currently does
 it.  To do the right thing, IIUC, virCgroupPathOfGroup() should
 parse the /proc/pid/cgroup contents of the 'controller' process,
 and insert that between the virCgroupGetMount(controller)
 result and the group name.

 Or something...

 (right?)

 thanks,
 -serge


OK I probably understood the problem and wrote a patch for that
according to the Serge's suggestion.

I expect this patch fixes the problem, however unfortunately
I've not tried it yet under the situation you are worrying and
just done regression tests. Because I don't know easy way to
produce the situation. If you know that, please let me know...

Thanks,
  ozaki-r

From 713be8125b5cbd3b60919afe708f9c788206a572 Mon Sep 17 00:00:00 2001
From: Ryota Ozaki ozaki.ry...@gmail.com
Date: Sun, 10 May 2009 08:32:41 +0900
Subject: [PATCH] cgroups: address libvirtd's bad assumption for
cgroups hierarchy

---
 src/cgroup.c |   78 --
 1 files changed, 70 insertions(+), 8 deletions(-)

diff --git a/src/cgroup.c b/src/cgroup.c
index 50517e2..e02c37f 100644
--- a/src/cgroup.c
+++ b/src/cgroup.c
@@ -117,11 +117,60 @@ int virCgroupHaveSupport(void)
 return 0;
 }

+static int virCgroupGetCurrentPath(int pid, char **curpath)
+{
+char buf[PATH_MAX+1024];
+char *path = NULL;
+int rc = 0;
+int len;
+int fd;
+char *pos = NULL;
+
+if (virAsprintf(path, /proc/%d/cgroup, pid) == -1) {
+rc = -ENOMEM;
+goto error;
+}
+
+fd = open(path, O_RDONLY);
+if (fd  0) {
+DEBUG(Unable to open %s: %m, path);
+rc = -errno;
+goto error_open;
+}
+
+len = saferead(fd, buf, sizeof(buf));
+if (len = 0) {
+rc = -errno;
+goto error_saferead;
+}
+buf[len-1] = '¥0'; /* overwrite '¥n' */
+
+pos = strstr(buf, :/);
+if (pos == NULL) {
+rc = -1;
+goto error_strstr;
+}
+
+pos += 2; /* skip ':/' */
+if (virAsprintf(curpath, %s, pos) == -1) {
+rc = -ENOMEM;
+}
+
+error_strstr:
+error_saferead:
+close(fd);
+error_open:
+VIR_FREE(path);
+error:
+return rc;
+}
+
 static int virCgroupPathOfGroup(const char *group,
 const char *controller,
  

Re: [libvirt] [RFC][PATCH] lxc: fix for ns cgroups subsystem

2009-05-08 Thread Ryota Ozaki
Hi Serge,

On Fri, May 8, 2009 at 11:48 AM, Serge E. Hallyn se...@us.ibm.com wrote:
 IIUC, the real problem is that src/cgroup.c assumes that the
 cgroup name should be $CGROUP_MOUNTPOINT/groupname.  But of
 course if the ns cgroup is enabled, then the unshare(CLONE_NEWNS)
 to create a new namespace in which to mount the new devpts
 locks the driver under $CGROUP_MOUNTPOINT/pid_of_driver/
 or somesuch.

 If this fixes the problem I have no objections, but it seems
 more fragile than perhaps trying to teach src/cgroup.c to
 consider it's current cgroup as a starting point.

hmm, I don't know why the assumption is bad and how the approach
you are suggesting helps the ns problem.

Thanks,
  ozaki-r


 -serge

 Quoting Ryota Ozaki (ozaki.ry...@gmail.com):
 From 46531182708dc3eb132b14ce2f23fbc639430176 Mon Sep 17 00:00:00 2001
 From: Ryota Ozaki ozaki.ry...@gmail.com
 Date: Fri, 8 May 2009 05:31:03 +0900
 Subject: [PATCH] lxc: fix for ns cgroups subsystem

 lxc does not work if ns cgroups subsystem is enabled because
 of two factors; one is that ns has a special rule to create
 a group[*] unlike other subsystems and the other is lxc
 controller creates a new namespace for /dev/pts prior to
 create a new group for a domain. Unfortunately the new
 namespace breaks the rule of ns and that prevents a lxc
 controller from creating a new group.

 This patch addresses the problem by creating a new group
 before creating a new namespace (i.e. call unshare syscall).

 Note that this patch is only for the case ns is enabled and
 current code works well if it disabled. However, I think
 this patch makes sense because not just a few users know
 much about cgroups and likely to enable all of subsystems
 without notions (i.e. mount cgroups without any options).

 [*] 
 http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=kernel/ns_cgroup.c;hb=HEAD
 ---
  src/lxc_controller.c |    6 +++---
  1 files changed, 3 insertions(+), 3 deletions(-)

 diff --git a/src/lxc_controller.c b/src/lxc_controller.c
 index e0fb05d..1231817 100644
 --- a/src/lxc_controller.c
 +++ b/src/lxc_controller.c
 @@ -458,6 +458,9 @@ lxcControllerRun(virDomainDefPtr def,
          goto cleanup;
      }

 +    if (lxcSetContainerResources(def)  0)
 +        goto cleanup;
 +
      root = virDomainGetRootFilesystem(def);

      /*
 @@ -543,9 +546,6 @@ lxcControllerRun(virDomainDefPtr def,
      }


 -    if (lxcSetContainerResources(def)  0)
 -        goto cleanup;
 -
      if ((container = lxcContainerStart(def,
                                         nveths,
                                         veths,
 --
 1.6.0.6

 --
 Libvir-list mailing list
 Libvir-list@redhat.com
 https://www.redhat.com/mailman/listinfo/libvir-list


--
Libvir-list mailing list
Libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [RFC][PATCH] lxc: fix for ns cgroups subsystem

2009-05-08 Thread Serge E. Hallyn
Quoting Ryota Ozaki (ozaki.ry...@gmail.com):
 Hi Serge,
 
 On Fri, May 8, 2009 at 11:48 AM, Serge E. Hallyn se...@us.ibm.com wrote:
  IIUC, the real problem is that src/cgroup.c assumes that the
  cgroup name should be $CGROUP_MOUNTPOINT/groupname.  But of
  course if the ns cgroup is enabled, then the unshare(CLONE_NEWNS)
  to create a new namespace in which to mount the new devpts
  locks the driver under $CGROUP_MOUNTPOINT/pid_of_driver/
  or somesuch.
 
  If this fixes the problem I have no objections, but it seems
  more fragile than perhaps trying to teach src/cgroup.c to
  consider it's current cgroup as a starting point.
 
 hmm, I don't know why the assumption is bad and how the approach
 you are suggesting helps the ns problem.

To be clear, the asssumption is that the driver starts in the
root cgroup, i.e. it's pid is listed in $CGROUP_MOUNTPOINT/tasks.
And that it can create $CGROUP_MOUNTPOINT/groupname and move
itself into $CGROUP_MOUNTPOINT/groupname/tasks.

So, the assumption is bad because when the driver does a
unshare(CLONE_NEWNS), it gets moved into $CGROUP_MOUNTPOINT/X,
and after that can only move itself into
$CGROUP_MOUNTPOINT/X/groupname.

Even with your patch, it's possible for the lxc driver to have
been started under say $CGROUP_MOUNTPOINT/libvir or
$CGROUP_MOUNTPOINT/username through libcgroup/PAM for instance,
in which case your patch would be insufficient.

thanks,
-serge

PS
The point of the ns cgroup is to prevent even privileged tasks in a
resource group from escaping that resource group.  FWIW this can
currently also be done using selinux/smack, and eventually should
be accomplished using user namespaces.  At that point we should
seriously consider removing the movement restriction.

--
Libvir-list mailing list
Libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [RFC][PATCH] lxc: fix for ns cgroups subsystem

2009-05-08 Thread Serge E. Hallyn
Quoting Daniel P. Berrange (berra...@redhat.com):
 On Fri, May 08, 2009 at 08:34:12AM -0500, Serge E. Hallyn wrote:
  Quoting Ryota Ozaki (ozaki.ry...@gmail.com):
   Hi Serge,
   
   On Fri, May 8, 2009 at 11:48 AM, Serge E. Hallyn se...@us.ibm.com wrote:
IIUC, the real problem is that src/cgroup.c assumes that the
cgroup name should be $CGROUP_MOUNTPOINT/groupname.  But of
course if the ns cgroup is enabled, then the unshare(CLONE_NEWNS)
to create a new namespace in which to mount the new devpts
locks the driver under $CGROUP_MOUNTPOINT/pid_of_driver/
or somesuch.
   
If this fixes the problem I have no objections, but it seems
more fragile than perhaps trying to teach src/cgroup.c to
consider it's current cgroup as a starting point.
   
   hmm, I don't know why the assumption is bad and how the approach
   you are suggesting helps the ns problem.
  
  To be clear, the asssumption is that the driver starts in the
  root cgroup, i.e. it's pid is listed in $CGROUP_MOUNTPOINT/tasks.
  And that it can create $CGROUP_MOUNTPOINT/groupname and move
  itself into $CGROUP_MOUNTPOINT/groupname/tasks.
  
  So, the assumption is bad because when the driver does a
  unshare(CLONE_NEWNS), it gets moved into $CGROUP_MOUNTPOINT/X,
  and after that can only move itself into
  $CGROUP_MOUNTPOINT/X/groupname.
  
  Even with your patch, it's possible for the lxc driver to have
  been started under say $CGROUP_MOUNTPOINT/libvir or
  $CGROUP_MOUNTPOINT/username through libcgroup/PAM for instance,
  in which case your patch would be insufficient.
 
 Indeed, we can't assume we're in the root group at any time. Our
 general plan is that libvirt itself uses 3 levels of cgroup
 starting at the cgroup that libvirtd was placed in by the admin
 of the host, then a per-driver group, then per-guest groups
 
   $LIBVIRTD_ROOT_CGROUP
  |
  +- lxc
  ||
  |+- LXC-GUEST-1
  |+- LXC-GUEST-2
  |+- LXC-GUEST-3
  |+- ...
  |
  +- qemu
   |
   +- QEMU-GUEST-1
   +- QEMU-GUEST-2
   +- QEMU-GUEST-3
   +- ...
 
 $LIBVIRTD_ROOT_CGROUP, may be the actaul root mount point for
 the cgroup controller in question, or it may be a sub-directory
 that the admin chose to put it in. Also, if running libvirtd
 as a normal user, the admin may have created per-user account
 cgroups and so libvirtd would end up in there. So we have to
 be prepared for anything wrt initial libvirtd cgroup root.
 
 NB The code for putting QEMU in a cgroup isn't merged yet.

That sounds good.  I just don't think the code currently does
it.  To do the right thing, IIUC, virCgroupPathOfGroup() should
parse the /proc/pid/cgroup contents of the 'controller' process,
and insert that between the virCgroupGetMount(controller)
result and the group name.

Or something...

(right?)

thanks,
-serge

--
Libvir-list mailing list
Libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [RFC][PATCH] lxc: fix for ns cgroups subsystem

2009-05-07 Thread Ryota Ozaki
Hi,

lxc does not work if ns cgroups subsystem is enabled because
of two factors; one is that ns has a special rule to create
a group[*] unlike other subsystems and the other is lxc
controller creates a new namespace for /dev/pts prior to
create a new group for a domain. Unfortunately the new
namespace breaks the rule of ns and that prevents a lxc
controller from creating a new group.

This patch addresses the problem by creating a new group
before creating a new namespace (i.e. call unshare syscall).

Note that this patch is only for the case ns is enabled and
current code works well if it disabled. However, I think
this patch makes sense because not just a few users know
much about cgroups and likely to enable all of subsystems
without notions (i.e. mount cgroups without any options).

[*] 
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=kernel/ns_cgroup.c;hb=HEAD

Thanks,
  ozaki-r

Signed-off-by: Ryota Ozaki ozaki.ry...@gmail.com

From 46531182708dc3eb132b14ce2f23fbc639430176 Mon Sep 17 00:00:00 2001
From: Ryota Ozaki ozaki.ry...@gmail.com
Date: Fri, 8 May 2009 05:31:03 +0900
Subject: [PATCH] lxc: fix for ns cgroups subsystem

lxc does not work if ns cgroups subsystem is enabled because
of two factors; one is that ns has a special rule to create
a group[*] unlike other subsystems and the other is lxc
controller creates a new namespace for /dev/pts prior to
create a new group for a domain. Unfortunately the new
namespace breaks the rule of ns and that prevents a lxc
controller from creating a new group.

This patch addresses the problem by creating a new group
before creating a new namespace (i.e. call unshare syscall).

Note that this patch is only for the case ns is enabled and
current code works well if it disabled. However, I think
this patch makes sense because not just a few users know
much about cgroups and likely to enable all of subsystems
without notions (i.e. mount cgroups without any options).

[*] 
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=kernel/ns_cgroup.c;hb=HEAD
---
 src/lxc_controller.c |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/lxc_controller.c b/src/lxc_controller.c
index e0fb05d..1231817 100644
--- a/src/lxc_controller.c
+++ b/src/lxc_controller.c
@@ -458,6 +458,9 @@ lxcControllerRun(virDomainDefPtr def,
 goto cleanup;
 }

+if (lxcSetContainerResources(def)  0)
+goto cleanup;
+
 root = virDomainGetRootFilesystem(def);

 /*
@@ -543,9 +546,6 @@ lxcControllerRun(virDomainDefPtr def,
 }


-if (lxcSetContainerResources(def)  0)
-goto cleanup;
-
 if ((container = lxcContainerStart(def,
nveths,
veths,
-- 
1.6.0.6

--
Libvir-list mailing list
Libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [RFC][PATCH] lxc: fix for ns cgroups subsystem

2009-05-07 Thread Serge E. Hallyn
IIUC, the real problem is that src/cgroup.c assumes that the
cgroup name should be $CGROUP_MOUNTPOINT/groupname.  But of
course if the ns cgroup is enabled, then the unshare(CLONE_NEWNS)
to create a new namespace in which to mount the new devpts
locks the driver under $CGROUP_MOUNTPOINT/pid_of_driver/
or somesuch.

If this fixes the problem I have no objections, but it seems
more fragile than perhaps trying to teach src/cgroup.c to
consider it's current cgroup as a starting point.

-serge

Quoting Ryota Ozaki (ozaki.ry...@gmail.com):
 From 46531182708dc3eb132b14ce2f23fbc639430176 Mon Sep 17 00:00:00 2001
 From: Ryota Ozaki ozaki.ry...@gmail.com
 Date: Fri, 8 May 2009 05:31:03 +0900
 Subject: [PATCH] lxc: fix for ns cgroups subsystem
 
 lxc does not work if ns cgroups subsystem is enabled because
 of two factors; one is that ns has a special rule to create
 a group[*] unlike other subsystems and the other is lxc
 controller creates a new namespace for /dev/pts prior to
 create a new group for a domain. Unfortunately the new
 namespace breaks the rule of ns and that prevents a lxc
 controller from creating a new group.
 
 This patch addresses the problem by creating a new group
 before creating a new namespace (i.e. call unshare syscall).
 
 Note that this patch is only for the case ns is enabled and
 current code works well if it disabled. However, I think
 this patch makes sense because not just a few users know
 much about cgroups and likely to enable all of subsystems
 without notions (i.e. mount cgroups without any options).
 
 [*] 
 http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=kernel/ns_cgroup.c;hb=HEAD
 ---
  src/lxc_controller.c |6 +++---
  1 files changed, 3 insertions(+), 3 deletions(-)
 
 diff --git a/src/lxc_controller.c b/src/lxc_controller.c
 index e0fb05d..1231817 100644
 --- a/src/lxc_controller.c
 +++ b/src/lxc_controller.c
 @@ -458,6 +458,9 @@ lxcControllerRun(virDomainDefPtr def,
  goto cleanup;
  }
 
 +if (lxcSetContainerResources(def)  0)
 +goto cleanup;
 +
  root = virDomainGetRootFilesystem(def);
 
  /*
 @@ -543,9 +546,6 @@ lxcControllerRun(virDomainDefPtr def,
  }
 
 
 -if (lxcSetContainerResources(def)  0)
 -goto cleanup;
 -
  if ((container = lxcContainerStart(def,
 nveths,
 veths,
 -- 
 1.6.0.6
 
 --
 Libvir-list mailing list
 Libvir-list@redhat.com
 https://www.redhat.com/mailman/listinfo/libvir-list

--
Libvir-list mailing list
Libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list