[lxc-devel] [lxc/lxc] 7d40e5: Update Japanese pam_cgfs(8) to reflect lack of sup...
68 Date: 2020-12-04 (Fri, 04 Dec 2020) Changed paths: M src/lxc/cgroups/cgfsng.c Log Message: --- cgfsng: adjust log level to warn instead of error Signed-off-by: lifeng68 Commit: dcc39fcae63c1b406e12448d826f5c3aea572cb8 https://github.com/lxc/lxc/commit/dcc39fcae63c1b406e12448d826f5c3aea572cb8 Author: Christian Brauner Date: 2020-12-04 (Fri, 04 Dec 2020) Changed paths: M src/lxc/file_utils.c M src/lxc/file_utils.h M src/lxc/parse.c Log Message: --- parse: rework config parsing routine Signed-off-by: Christian Brauner Commit: 1e9e5816d1756f9a0bb1cd4460094928f712665f https://github.com/lxc/lxc/commit/1e9e5816d1756f9a0bb1cd4460094928f712665f Author: Christian Brauner Date: 2020-12-04 (Fri, 04 Dec 2020) Changed paths: M src/lxc/conf.c Log Message: --- conf: switch to fd_to_fd() when copying mountinfo Closes: #3580. Link: https://bugzilla.kernel.org/show_bug.cgi?id=209971 Suggested-by: Joan Bruguera Signed-off-by: Christian Brauner Commit: 1c7c31b56847e4aef2ce7ecab1b6bd53cddd1a50 https://github.com/lxc/lxc/commit/1c7c31b56847e4aef2ce7ecab1b6bd53cddd1a50 Author: Christian Brauner Date: 2020-12-04 (Fri, 04 Dec 2020) Changed paths: M src/lxc/file_utils.c Log Message: --- file_utils: fix config file parsing We accidently used the "bytes_to_write" variable after we've written all the bytes at which point it is guaranteed to be 0. Let's use the "bytes_read" variable instead. Signed-off-by: Christian Brauner Commit: b70ddc2efe8e66f725eef25d48b76935ce987876 https://github.com/lxc/lxc/commit/b70ddc2efe8e66f725eef25d48b76935ce987876 Author: Christian Brauner Date: 2020-12-04 (Fri, 04 Dec 2020) Changed paths: M src/lxc/commands_utils.c M src/lxc/state.c Log Message: --- commands_utils: fix lxc-wait Closes: #3570 Fixes: 7792a5b60f79 ("commands: add additional check to lxc_cmd_sock_get_state()") Signed-off-by: Christian Brauner Commit: 92bc70903c8e9ca920503bcf288934a9e8f12e1f https://github.com/lxc/lxc/commit/92bc70903c8e9ca920503bcf288934a9e8f12e1f Author: Tycho Andersen Date: 2020-12-04 (Fri, 04 Dec 2020) Changed paths: M src/lxc/network.c Log Message: --- network: fix LXC_NET_NONE cleanup We have a case where we have a nested container with LXC_NET_NONE run inside a container that's *also* got no network namespace (run by lxc-usernsexec). The "am I root" check in this function then does not suffice, since the euid of the task is 0 but it does not have privilege over its network namespace, and thus cannot do any of the restore operations: lxc foo 20201201232059.271 TRACEnetwork - network.c:lxc_restore_phys_nics_to_netns:3299 - Moving physical network devices back to parent network namespace lxc foo 20201201232059.271 ERRORnetwork - network.c:lxc_restore_phys_nics_to_netns:3307 - Operation not permitted - Failed to enter network namespace lxc foo 20201201232059.271 ERRORstart - start.c:__lxc_start:2045 - Failed to move physical network devices back to parent network namespace Let's check that we indeed did clone the network namespace, and thus have things to restore to their correct namespace before attempting to actually restore them. I suspect it's possible we can also get rid of some of the network namespace preservation stuff in start.c in the LXC_NET_NONE case. Signed-off-by: Tycho Andersen Compare: https://github.com/lxc/lxc/compare/7bae22f73db9...92bc70903c8e ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
Re: [lxc-devel] LXC snapshot using overlayfs fsfreeze
Hi Amir, On Tue, Apr 11, 2017 at 01:37:53PM +0300, Amir Goldstein wrote: > On Mon, Apr 10, 2017 at 5:20 PM, Tycho Andersen wrote: > > Hi Amir, > > > > On Sat, Apr 08, 2017 at 09:35:01PM +0200, Amir Goldstein wrote: > >> [moving this discussion over from fsdevel to containers list and > >> changing the title] > >> > >> On Tue, Apr 4, 2017 at 9:07 PM, Tycho Andersen wrote: > >> > On Tue, Apr 04, 2017 at 09:59:16PM +0300, Amir Goldstein wrote: > >> >> On Tue, Apr 4, 2017 at 9:01 PM, Tycho Andersen wrote: > >> >> > On Tue, Apr 04, 2017 at 12:47:52PM -0500, Serge E. Hallyn wrote: > >> >> >> > Would lxc-snapshot gain anything from the ability to fsfreeze an > >> >> >> > overlay > >> >> >> > mount? > >> >> >> > >> >> >> lxc-snapshot only works on stopped containers. 'lxc snapshot' can > >> >> >> do live > >> >> >> snapshots using criu. Tycho, does that do anything right now to > >> >> >> freeze the > >> >> >> fs? > >> >> > > >> >> > Not that I'm aware of (CRIU might, but we don't in liblxc). > >> >> > > >> >> >> I'm not sure that freezing all the tasks is necessarily enough to > >> >> >> settle > >> >> >> the fs, but I assume you're doing something about that already? > >> >> > > >> >> > I suspect it's not, but we're not doing anything besides freezing the > >> >> > tasks. In fact, we freeze the tasks by using the freezer cgroup, > >> >> > which itself is buggy, since the freezer cgroup can race with various > >> >> > filesystems. So, freezing tasks is hard, and I haven't even thought > >> >> > about how to freeze the fs for real :) > >> >> > > >> >> > But in any case, an fs freezing primitive does sound useful for > >> >> > checkpoint restore, assuming that we're right and freezing the tasks > >> >> > is simply not enough. > >> >> > > >> >> > >> >> So I already asked Pavel that question and he said that freezing > >> >> the tasks is enough. I am not convinced it is really enough to bring > >> >> a file system image (i.e. underlying blockdev) to a quiescent state, > >> >> but I think it may be enough for getting a stable view of the mounted > >> >> file system, so the files could be dumped somewhere. > >> >> I am guessing is what lxc snapshot does? > >> > > >> > Yes, lxc snapshot is basically just a frontend for CRIU. > >> > > >> >> I still didn't understand wrt lxc snapshot, is there a use case for > >> >> taking live snapshots without using CRIU? (because freezer cgroup > >> >> mentioned races or whatnot?). > >> > > >> > No, I think CRIU is the only project that will ever attempt to do > >> > checkpoint restore this way ;-). > >> > >> I don't doubt that. > >> > >> My question is whether it is interesting to snapshot a live container fs > >> without having to checkpoint not restore at all. > >> > >> > CRIU supports two different ways of > >> > freezing tasks: one using the freezer cgroup and one without. The one > >> > without doesn't work against fork bombs very well, and the one with > >> > doesn't work because of some filesystems. So it's mostly a container > >> > engine implementation choice which to use. > >> > > >> >> It's definitely possible with btrfs and if my overlayfs freeze patches > >> >> are not terribly wrong, then it should be easy with overlayfs as well. > >> >> Does lxc snapshot already support live snapshot of btrfs container? > >> > > >> > Yes, it does. It freezes the tasks via the cgroup freezer and then > >> > does a btrfs snapshot of the filesystem once the tasks are frozen. > >> > > >> > >> So what I am not sure is if there are use cases where criu cannot be > >> used or maybe there are reasons not to use it. and for these cases > >> if it may be interesting to support snapshot of the storage by: > >> - fsfreeze -f > >> - copy upper dir > >> - fsfreeze -u > > > > I don't see a reas
Re: [lxc-devel] LXC snapshot using overlayfs fsfreeze
Hi Amir, On Sat, Apr 08, 2017 at 09:35:01PM +0200, Amir Goldstein wrote: > [moving this discussion over from fsdevel to containers list and > changing the title] > > On Tue, Apr 4, 2017 at 9:07 PM, Tycho Andersen wrote: > > On Tue, Apr 04, 2017 at 09:59:16PM +0300, Amir Goldstein wrote: > >> On Tue, Apr 4, 2017 at 9:01 PM, Tycho Andersen wrote: > >> > On Tue, Apr 04, 2017 at 12:47:52PM -0500, Serge E. Hallyn wrote: > >> >> > Would lxc-snapshot gain anything from the ability to fsfreeze an > >> >> > overlay > >> >> > mount? > >> >> > >> >> lxc-snapshot only works on stopped containers. 'lxc snapshot' can do > >> >> live > >> >> snapshots using criu. Tycho, does that do anything right now to freeze > >> >> the > >> >> fs? > >> > > >> > Not that I'm aware of (CRIU might, but we don't in liblxc). > >> > > >> >> I'm not sure that freezing all the tasks is necessarily enough to settle > >> >> the fs, but I assume you're doing something about that already? > >> > > >> > I suspect it's not, but we're not doing anything besides freezing the > >> > tasks. In fact, we freeze the tasks by using the freezer cgroup, > >> > which itself is buggy, since the freezer cgroup can race with various > >> > filesystems. So, freezing tasks is hard, and I haven't even thought > >> > about how to freeze the fs for real :) > >> > > >> > But in any case, an fs freezing primitive does sound useful for > >> > checkpoint restore, assuming that we're right and freezing the tasks > >> > is simply not enough. > >> > > >> > >> So I already asked Pavel that question and he said that freezing > >> the tasks is enough. I am not convinced it is really enough to bring > >> a file system image (i.e. underlying blockdev) to a quiescent state, > >> but I think it may be enough for getting a stable view of the mounted > >> file system, so the files could be dumped somewhere. > >> I am guessing is what lxc snapshot does? > > > > Yes, lxc snapshot is basically just a frontend for CRIU. > > > >> I still didn't understand wrt lxc snapshot, is there a use case for > >> taking live snapshots without using CRIU? (because freezer cgroup > >> mentioned races or whatnot?). > > > > No, I think CRIU is the only project that will ever attempt to do > > checkpoint restore this way ;-). > > I don't doubt that. > > My question is whether it is interesting to snapshot a live container fs > without having to checkpoint not restore at all. > > > CRIU supports two different ways of > > freezing tasks: one using the freezer cgroup and one without. The one > > without doesn't work against fork bombs very well, and the one with > > doesn't work because of some filesystems. So it's mostly a container > > engine implementation choice which to use. > > > >> It's definitely possible with btrfs and if my overlayfs freeze patches > >> are not terribly wrong, then it should be easy with overlayfs as well. > >> Does lxc snapshot already support live snapshot of btrfs container? > > > > Yes, it does. It freezes the tasks via the cgroup freezer and then > > does a btrfs snapshot of the filesystem once the tasks are frozen. > > > > So what I am not sure is if there are use cases where criu cannot be > used or maybe there are reasons not to use it. and for these cases > if it may be interesting to support snapshot of the storage by: > - fsfreeze -f > - copy upper dir > - fsfreeze -u I don't see a reason for it, but perhaps I'm not being very imaginative. Without the memory state, the potentially inconsistent fs state doesn't seem very helpful. Cheers, Tycho ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
Re: [lxc-devel] LXD live migration fail, help
Hi, On Fri, Mar 04, 2016 at 10:31:11AM +0900, 디케이 wrote: > I failed simple live migration 2.0.0.0 beta4. > so I installed source code, now LXD version is* 2.0.0.rc1.* > > I create a one container (name: psychosomatic-joella) and apply profile > 'migratable' to container > and copy profile to remote server . > > and then I tried Live migration again, but still fail. > > also still error logging as follow : > lxc 20160304022641.124 ERRORlxc_criu - criu.c:criu_version_ok:348 - > must have criu 1.9 or greater to checkpoint/restore > > Why? LXD still ask a criu 1.9 or greater ??? > *Is there any way to criu upgrade to 1.9??? where can I get criu 1.9??* As Stéphane mentioned in his reply to you yesterday, CRIU 2.0 will be tagged Monday, but for now you can build from source. Tycho > Thanks in advance. > ___ > lxc-devel mailing list > lxc-devel@lists.linuxcontainers.org > http://lists.linuxcontainers.org/listinfo/lxc-devel ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
Re: [lxc-devel] [PATCH] c/r: remember to chown the cgroup path (correctly)
On Thu, Jan 14, 2016 at 07:43:20PM +, Serge Hallyn wrote: > Quoting Tycho Andersen (tycho.ander...@canonical.com): > > On Thu, Jan 14, 2016 at 09:28:07AM +, Serge Hallyn wrote: > > > Quoting Tycho Andersen (tycho.ander...@canonical.com): > > > > On Wed, Jan 13, 2016 at 09:47:50PM +, Serge Hallyn wrote: > > > > > Quoting Tycho Andersen (tycho.ander...@canonical.com): > > > > > > 1. remember to chown the cgroup path when migrating a container > > > > > > 2. when restoring the cgroup path, try to compute the euid for root > > > > > > vs. > > > > > >using geteuid(); geteuid works for start, but it doesn't work for > > > > > >migration since we're still real root at that point. > > > > > > > > > > > > Signed-off-by: Tycho Andersen > > > > > > --- > > > > > > src/lxc/cgmanager.c | 6 +- > > > > > > src/lxc/criu.c | 5 + > > > > > > 2 files changed, 10 insertions(+), 1 deletion(-) > > > > > > > > > > > > diff --git a/src/lxc/cgmanager.c b/src/lxc/cgmanager.c > > > > > > index 357182a..54e6912 100644 > > > > > > --- a/src/lxc/cgmanager.c > > > > > > +++ b/src/lxc/cgmanager.c > > > > > > @@ -488,7 +488,11 @@ static bool chown_cgroup(const char > > > > > > *cgroup_path, struct lxc_conf *conf) > > > > > > return true; > > > > > > > > > > > > data.cgroup_path = cgroup_path; > > > > > > - data.origuid = geteuid(); > > > > > > + data.origuid = mapped_hostid(0, conf, ID_TYPE_UID); > > > > > > now, when starting a container, this happens in the > > > parent task in original uid. So the geteuid() returns 1000, > > > mapped_hostid(0, conf, ID_TYPE_UID) something like 10. > > > > Are you sure? Wouldn't it chmod everything to 1000 instead of 10 > > in that case and be all screwed up? > > No I'm not sure :) I was lazily asking you to add an fprintf and verify :) You're right! I think this may actually be the wrong fix, > > Cgmanager does that automatically. When you chown cgroup freezer:/lxc/c1, > cgmanager chowns the directory itself (so you can create new cgroups), and > the procs and tasks files, so you can move tasks in. It does not change > the other files so you can't escape your limits. > > And right, that is why the parent dir must be -1:10, so you can't chown > the files. This gets sabotaged when, as uid 1000 on the host, you map hostuid > 1000 into the container. But parent cgroups (like systemd user scopes) will > at > least still confine you. Ok. Let's drop this patch for now and I'll see about figuring out another way to do what I need here. Tycho ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
Re: [lxc-devel] [PATCH] c/r: remember to chown the cgroup path (correctly)
On Thu, Jan 14, 2016 at 09:28:07AM +, Serge Hallyn wrote: > Quoting Tycho Andersen (tycho.ander...@canonical.com): > > On Wed, Jan 13, 2016 at 09:47:50PM +, Serge Hallyn wrote: > > > Quoting Tycho Andersen (tycho.ander...@canonical.com): > > > > 1. remember to chown the cgroup path when migrating a container > > > > 2. when restoring the cgroup path, try to compute the euid for root vs. > > > >using geteuid(); geteuid works for start, but it doesn't work for > > > >migration since we're still real root at that point. > > > > > > > > Signed-off-by: Tycho Andersen > > > > --- > > > > src/lxc/cgmanager.c | 6 +- > > > > src/lxc/criu.c | 5 + > > > > 2 files changed, 10 insertions(+), 1 deletion(-) > > > > > > > > diff --git a/src/lxc/cgmanager.c b/src/lxc/cgmanager.c > > > > index 357182a..54e6912 100644 > > > > --- a/src/lxc/cgmanager.c > > > > +++ b/src/lxc/cgmanager.c > > > > @@ -488,7 +488,11 @@ static bool chown_cgroup(const char *cgroup_path, > > > > struct lxc_conf *conf) > > > > return true; > > > > > > > > data.cgroup_path = cgroup_path; > > > > - data.origuid = geteuid(); > > > > + data.origuid = mapped_hostid(0, conf, ID_TYPE_UID); > > now, when starting a container, this happens in the > parent task in original uid. So the geteuid() returns 1000, > mapped_hostid(0, conf, ID_TYPE_UID) something like 10. Are you sure? Wouldn't it chmod everything to 1000 instead of 10 in that case and be all screwed up? > This is probably ok - but did you run all the lxc tests against > this to make sure? You can still run 'lxc-cgroup' etc as an > unpriv user? Well, it does change the behavior in ways I don't quite understand yet, so that's probably worth talking about. Before this series (the 644 one and this patch), I get: /sys/fs/cgroup/cpuset/lxc/unpriv $ ls -alh total 0 drwxrwxr-x 2 root 10 0 Jan 14 06:25 ./ drwxr-xr-x 3 root root 0 Jan 14 06:25 ../ -rw-r--r-- 1 root root 0 Jan 14 06:25 cgroup.clone_children -rwxrwxr-x 1 root 10 0 Jan 14 06:26 cgroup.procs* -rw-r--r-- 1 root root 0 Jan 14 06:25 cpuset.cpu_exclusive -rw-r--r-- 1 root root 0 Jan 14 06:25 cpuset.cpus -r--r--r-- 1 root root 0 Jan 14 06:25 cpuset.effective_cpus -r--r--r-- 1 root root 0 Jan 14 06:25 cpuset.effective_mems -rw-r--r-- 1 root root 0 Jan 14 06:25 cpuset.mem_exclusive -rw-r--r-- 1 root root 0 Jan 14 06:25 cpuset.mem_hardwall -rw-r--r-- 1 root root 0 Jan 14 06:25 cpuset.memory_migrate -r--r--r-- 1 root root 0 Jan 14 06:25 cpuset.memory_pressure -rw-r--r-- 1 root root 0 Jan 14 06:25 cpuset.memory_spread_page -rw-r--r-- 1 root root 0 Jan 14 06:25 cpuset.memory_spread_slab -rw-r--r-- 1 root root 0 Jan 14 06:25 cpuset.mems -rw-r--r-- 1 root root 0 Jan 14 06:25 cpuset.sched_load_balance -rw-r--r-- 1 root root 0 Jan 14 06:25 cpuset.sched_relax_domain_level -rw-r--r-- 1 root root 0 Jan 14 06:25 notify_on_release -rwxrwxr-x 1 root 10 0 Jan 14 06:25 tasks* and after: /sys/fs/cgroup/cpuset/lxc/unpriv $ ls -alh total 0 drwxrwxr-x 2 10 10 0 Jan 14 06:27 ./ drwxr-xr-x 3 root root 0 Jan 14 06:27 ../ -rw-r--r-- 1 root root 0 Jan 14 06:27 cgroup.clone_children -rw-rw-r-- 1 10 10 0 Jan 14 06:27 cgroup.procs -rw-r--r-- 1 root root 0 Jan 14 06:27 cpuset.cpu_exclusive -rw-r--r-- 1 root root 0 Jan 14 06:27 cpuset.cpus -r--r--r-- 1 root root 0 Jan 14 06:27 cpuset.effective_cpus -r--r--r-- 1 root root 0 Jan 14 06:27 cpuset.effective_mems -rw-r--r-- 1 root root 0 Jan 14 06:27 cpuset.mem_exclusive -rw-r--r-- 1 root root 0 Jan 14 06:27 cpuset.mem_hardwall -rw-r--r-- 1 root root 0 Jan 14 06:27 cpuset.memory_migrate -r--r--r-- 1 root root 0 Jan 14 06:27 cpuset.memory_pressure -rw-r--r-- 1 root root 0 Jan 14 06:27 cpuset.memory_spread_page -rw-r--r-- 1 root root 0 Jan 14 06:27 cpuset.memory_spread_slab -rw-r--r-- 1 root root 0 Jan 14 06:27 cpuset.mems -rw-r--r-- 1 root root 0 Jan 14 06:27 cpuset.sched_load_balance -rw-r--r-- 1 root root 0 Jan 14 06:27 cpuset.sched_relax_domain_level -rw-r--r-- 1 root root 0 Jan 14 06:27 notify_on_release -rw-rw-r-- 1 10 10 0 Jan 14 06:27 tasks However, it's not clear really /why/ from this patch the uid on tasks and cgroup.progs gets changed as well (probably some mask bug? it was ID_TYPE_UID before but changing the gid? not sure). So, what's the correct behavior here? Should we leave the uid on the files as real root, or chmod those to container root as well? (if it's the latter, i.e. the behavior introduced by this patch, it's probably still worth understanding
Re: [lxc-devel] [PATCH] c/r: remember to chown the cgroup path (correctly)
On Wed, Jan 13, 2016 at 09:47:50PM +, Serge Hallyn wrote: > Quoting Tycho Andersen (tycho.ander...@canonical.com): > > 1. remember to chown the cgroup path when migrating a container > > 2. when restoring the cgroup path, try to compute the euid for root vs. > >using geteuid(); geteuid works for start, but it doesn't work for > >migration since we're still real root at that point. > > > > Signed-off-by: Tycho Andersen > > --- > > src/lxc/cgmanager.c | 6 +- > > src/lxc/criu.c | 5 + > > 2 files changed, 10 insertions(+), 1 deletion(-) > > > > diff --git a/src/lxc/cgmanager.c b/src/lxc/cgmanager.c > > index 357182a..54e6912 100644 > > --- a/src/lxc/cgmanager.c > > +++ b/src/lxc/cgmanager.c > > @@ -488,7 +488,11 @@ static bool chown_cgroup(const char *cgroup_path, > > struct lxc_conf *conf) > > return true; > > > > data.cgroup_path = cgroup_path; > > - data.origuid = geteuid(); > > + data.origuid = mapped_hostid(0, conf, ID_TYPE_UID); > > + if (data.origuid < 0) { > > Can you confirm that this does not break > > sudo lxc-create -t download -n x1 -- -d ubuntu -r trusty -a amd64 > sudo lxc-start -n x1 > > Because in that case I think we have no mappings, and mapped_hostid() will > return -1. You can't see it in the patch, but just above this is a lxc_list_empty() test, and this whole path isn't executed if lxc_list_empty() is true, so I think it should be ok. Tycho > > + ERROR("failed to get mapped root id"); > > + return false; > > + } > > > > /* Unpriv users can't chown it themselves, so chown from > > * a child namespace mapping both our own and the target uid > > diff --git a/src/lxc/criu.c b/src/lxc/criu.c > > index 6ef4905..f442612 100644 > > --- a/src/lxc/criu.c > > +++ b/src/lxc/criu.c > > @@ -466,6 +466,11 @@ void do_restore(struct lxc_container *c, int pipe, > > char *directory, bool verbose > > goto out_fini_handler; > > } > > > > + if (!cgroup_chown(handler)) { > > + ERROR("failed creating groups"); > > + goto out_fini_handler; > > + } > > + > > if (!restore_net_info(c)) { > > ERROR("failed restoring network info"); > > goto out_fini_handler; > > -- > > 2.6.4 > > > > ___ > > lxc-devel mailing list > > lxc-devel@lists.linuxcontainers.org > > http://lists.linuxcontainers.org/listinfo/lxc-devel > ___ > lxc-devel mailing list > lxc-devel@lists.linuxcontainers.org > http://lists.linuxcontainers.org/listinfo/lxc-devel ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
Re: [lxc-devel] [PATCH] .gitignore: add sparclinux make output
On Wed, Jan 13, 2016 at 10:20:12PM +, Serge Hallyn wrote: > Quoting Wim Coekaerts (wim.coekae...@oracle.com): > > On 1/13/16 1:50 PM, Serge Hallyn wrote: > > >Quoting Tycho Andersen (tycho.ander...@canonical.com): > > >>Signed-off-by: Tycho Andersen > > >Acked-by: Serge E. Hallyn > > > > > >>--- > > >> .gitignore | 1 + > > >> 1 file changed, 1 insertion(+) > > >> > > >>diff --git a/.gitignore b/.gitignore > > >>index 5e4912c..58e5dea 100644 > > >>--- a/.gitignore > > >>+++ b/.gitignore > > >>@@ -41,6 +41,7 @@ templates/lxc-opensuse > > >> templates/lxc-oracle > > >> templates/lxc-plamo > > >> templates/lxc-slackware > > >>+templates/lxc-sparclinux > > >> templates/lxc-sshd > > >> templates/lxc-ubuntu > > >> templates/lxc-ubuntu-cloud > > ugh do I feel stupid ! > > nonsense - we all do it. Indeed, please don't feel stupid :) Tycho ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] [PATCH] .gitignore: add sparclinux make output
Signed-off-by: Tycho Andersen --- .gitignore | 1 + 1 file changed, 1 insertion(+) diff --git a/.gitignore b/.gitignore index 5e4912c..58e5dea 100644 --- a/.gitignore +++ b/.gitignore @@ -41,6 +41,7 @@ templates/lxc-opensuse templates/lxc-oracle templates/lxc-plamo templates/lxc-slackware +templates/lxc-sparclinux templates/lxc-sshd templates/lxc-ubuntu templates/lxc-ubuntu-cloud -- 2.6.4 ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] [PATCH] c/r: remember to chown the cgroup path (correctly)
1. remember to chown the cgroup path when migrating a container 2. when restoring the cgroup path, try to compute the euid for root vs. using geteuid(); geteuid works for start, but it doesn't work for migration since we're still real root at that point. Signed-off-by: Tycho Andersen --- src/lxc/cgmanager.c | 6 +- src/lxc/criu.c | 5 + 2 files changed, 10 insertions(+), 1 deletion(-) diff --git a/src/lxc/cgmanager.c b/src/lxc/cgmanager.c index 357182a..54e6912 100644 --- a/src/lxc/cgmanager.c +++ b/src/lxc/cgmanager.c @@ -488,7 +488,11 @@ static bool chown_cgroup(const char *cgroup_path, struct lxc_conf *conf) return true; data.cgroup_path = cgroup_path; - data.origuid = geteuid(); + data.origuid = mapped_hostid(0, conf, ID_TYPE_UID); + if (data.origuid < 0) { + ERROR("failed to get mapped root id"); + return false; + } /* Unpriv users can't chown it themselves, so chown from * a child namespace mapping both our own and the target uid diff --git a/src/lxc/criu.c b/src/lxc/criu.c index 6ef4905..f442612 100644 --- a/src/lxc/criu.c +++ b/src/lxc/criu.c @@ -466,6 +466,11 @@ void do_restore(struct lxc_container *c, int pipe, char *directory, bool verbose goto out_fini_handler; } + if (!cgroup_chown(handler)) { + ERROR("failed creating groups"); + goto out_fini_handler; + } + if (!restore_net_info(c)) { ERROR("failed restoring network info"); goto out_fini_handler; -- 2.6.4 ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] [PATCH] cgmanager: don't make tasks + cgroup.procs +x
No reason for these to be +x, and it looks weird. Signed-off-by: Tycho Andersen --- src/lxc/cgmanager.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/lxc/cgmanager.c b/src/lxc/cgmanager.c index 5596285..357182a 100644 --- a/src/lxc/cgmanager.c +++ b/src/lxc/cgmanager.c @@ -509,9 +509,9 @@ static bool chown_cgroup(const char *cgroup_path, struct lxc_conf *conf) for (i = 0; slist[i]; i++) { if (!lxc_cgmanager_chmod(slist[i], cgroup_path, "", 0775)) return false; - if (!lxc_cgmanager_chmod(slist[i], cgroup_path, "tasks", 0775)) + if (!lxc_cgmanager_chmod(slist[i], cgroup_path, "tasks", 0664)) return false; - if (!lxc_cgmanager_chmod(slist[i], cgroup_path, "cgroup.procs", 0775)) + if (!lxc_cgmanager_chmod(slist[i], cgroup_path, "cgroup.procs", 0664)) return false; } -- 2.6.4 ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
Re: [lxc-devel] On the road to LXC 2.0.0
On Mon, Dec 21, 2015 at 05:12:12PM -0500, Stéphane Graber wrote: > Hey everyone, > > So you may have noticed I just tagged LXC 2.0.0 beta1. > > The current plan is as follow: > - LXC 2.0.0 beta2 next week (28th) > - LXC 2.0.0 rc1 the week after (4th) Can we potentially push this back? I don't have a lot of working days between now and the 4th (1.5, to be exact), but would like some time to play with 2.0 to make sure it has everything needed for p.haul. Thanks, Tycho ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] [PATCH 2/2] c/r: bump criu patchlevel for --lsm-profile
This option is only available in recent master of criu, so let's require that since we're using it. Signed-off-by: Tycho Andersen --- src/lxc/criu.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/src/lxc/criu.h b/src/lxc/criu.h index b7d241b..e35f98a 100644 --- a/src/lxc/criu.h +++ b/src/lxc/criu.h @@ -29,10 +29,10 @@ // We require either the criu major/minor version, or the criu GITID if criu // was built from git. -#define CRIU_VERSION "1.8" +#define CRIU_VERSION "1.9" -#define CRIU_GITID_VERSION "1.7" -#define CRIU_GITID_PATCHLEVEL 371 +#define CRIU_GITID_VERSION "1.8" +#define CRIU_GITID_PATCHLEVEL 21 struct criu_opts { /* The type of criu invocation, one of "dump" or "restore" */ -- 2.6.2 ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] [PATCH 1/2] c/r: use --lsm-profile if provided
Since we can rename a container on a migrate, let's tell CRIU to use the LSM profile name the user has specified. This change is motivated by LXD, which sets an LSM profile name based on the container name, so if a user changes the name of a container during migration, the old profile name (that criu has saved) won't exist on the new host. Signed-off-by: Tycho Andersen --- src/lxc/criu.c | 21 +++-- 1 file changed, 19 insertions(+), 2 deletions(-) diff --git a/src/lxc/criu.c b/src/lxc/criu.c index 5ca4f9f..c30fa33 100644 --- a/src/lxc/criu.c +++ b/src/lxc/criu.c @@ -89,8 +89,10 @@ void exec_criu(struct criu_opts *opts) static_args++; } else if (strcmp(opts->action, "restore") == 0) { /* --root $(lxc_mount_point) --restore-detached -* --restore-sibling --pidfile $foo --cgroup-root $foo */ - static_args += 8; +* --restore-sibling --pidfile $foo --cgroup-root $foo +* --lsm-profile apparmor:whatever +*/ + static_args += 10; } else { return; } @@ -184,6 +186,7 @@ void exec_criu(struct criu_opts *opts) } else if (strcmp(opts->action, "restore") == 0) { void *m; int additional; + struct lxc_conf *lxc_conf = opts->c->lxc_conf; DECLARE_ARG("--root"); DECLARE_ARG(opts->c->lxc_conf->rootfs.mount); @@ -194,6 +197,20 @@ void exec_criu(struct criu_opts *opts) DECLARE_ARG("--cgroup-root"); DECLARE_ARG(opts->cgroup_path); + if (lxc_conf->lsm_aa_profile || lxc_conf->lsm_se_context) { + + if (lxc_conf->lsm_aa_profile) + ret = snprintf(buf, sizeof(buf), "apparmor:%s", lxc_conf->lsm_aa_profile); + else + ret = snprintf(buf, sizeof(buf), "selinux:%s", lxc_conf->lsm_se_context); + + if (ret < 0 || ret >= sizeof(buf)) + goto err; + + DECLARE_ARG("--lsm-profile"); + DECLARE_ARG(buf); + } + additional = lxc_list_len(&opts->c->lxc_conf->network) * 2; m = realloc(argv, (argc + additional + 1) * sizeof(*argv)); -- 2.6.2 ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
Re: [lxc-devel] [PATCH 1/3] cgroup: add cgroup_escape() call
On Wed, Dec 09, 2015 at 02:58:20AM +, Serge Hallyn wrote: > Quoting Tycho Andersen (tycho.ander...@canonical.com): > > + bool ret = true, cgm_connected = false; > > Sorry, can you rename this disconnect_cgm or cgm_needs_disconnect ? Sure, see attached. Tycho >From c378426d560a8a071656f08e7f27bf5fbe9b17ad Mon Sep 17 00:00:00 2001 From: Tycho Andersen Date: Mon, 7 Dec 2015 17:07:05 -0700 Subject: [PATCH 1/4] cgroup: add cgroup_escape() call We'll use this in the next patch to escape to the root cgroup before we exec criu. v2: s/cgm_connected/cmg_needs_disconnect/g Signed-off-by: Tycho Andersen Acked-by: Serge E. Hallyn --- src/lxc/cgfs.c | 50 ++ src/lxc/cgmanager.c | 19 --- src/lxc/cgroup.c| 7 +++ src/lxc/cgroup.h| 2 ++ 4 files changed, 75 insertions(+), 3 deletions(-) diff --git a/src/lxc/cgfs.c b/src/lxc/cgfs.c index d65f2d7..94b3d87 100644 --- a/src/lxc/cgfs.c +++ b/src/lxc/cgfs.c @@ -2343,6 +2343,55 @@ static const char *cgfs_canonical_path(void *hdata) return path; } +static bool cgfs_escape(void) +{ + struct cgroup_meta_data *md; + int i; + bool ret = false; + + md = lxc_cgroup_load_meta(); + if (!md) + return false; + + for (i = 1; i <= md->maximum_hierarchy; i++) { + struct cgroup_hierarchy *h = md->hierarchies[i]; + struct cgroup_mount_point *mp; + char *tasks; + FILE *f; + int written; + + if (!h) { + WARN("not escaping hierarchy %d", i); + continue; + } + + mp = lxc_cgroup_find_mount_point(h, "/", true); + if (!mp) + goto out; + + tasks = cgroup_to_absolute_path(mp, "/", "tasks"); + if (!tasks) + goto out; + + f = fopen(tasks, "a"); + free(tasks); + if (!f) + goto out; + + written = fprintf(f, "%d\n", getpid()); + fclose(f); + if (written < 0) { + SYSERROR("writing tasks failed\n"); + goto out; + } + } + + ret = true; +out: + lxc_cgroup_put_meta(md); + return ret; +} + static bool cgfs_unfreeze(void *hdata) { struct cgfs_data *d = hdata; @@ -2408,6 +2457,7 @@ static struct cgroup_ops cgfs_ops = { .create_legacy = cgfs_create_legacy, .get_cgroup = cgfs_get_cgroup, .canonical_path = cgfs_canonical_path, + .escape = cgfs_escape, .get = lxc_cgroupfs_get, .set = lxc_cgroupfs_set, .unfreeze = cgfs_unfreeze, diff --git a/src/lxc/cgmanager.c b/src/lxc/cgmanager.c index 05df0da..b82be59 100644 --- a/src/lxc/cgmanager.c +++ b/src/lxc/cgmanager.c @@ -312,13 +312,22 @@ static bool lxc_cgmanager_create(const char *controller, const char *cgroup_path * be in "/lxc/c1" rather than "/user//c1" * called internally with connection already open */ -static bool lxc_cgmanager_escape(void) +static bool cgm_escape(void) { - bool ret = true; + bool ret = true, cgm_needs_disconnect = false; pid_t me = getpid(); char **slist = subsystems; int i; + if (!cgroup_manager) { + if (!cgm_dbus_connect()) { + ERROR("Error connecting to cgroup manager"); + return false; + } + cgm_needs_disconnect = true; + } + + if (cgm_all_controllers_same) slist = subsystems_inone; @@ -335,6 +344,9 @@ static bool lxc_cgmanager_escape(void) } } + if (cgm_needs_disconnect) + cgm_dbus_disconnect(); + return ret; } @@ -1307,7 +1319,7 @@ struct cgroup_ops *cgm_ops_init(void) goto err1; // root; try to escape to root cgroup - if (geteuid() == 0 && !lxc_cgmanager_escape()) + if (geteuid() == 0 && !cgm_escape()) goto err2; cgm_dbus_disconnect(); @@ -1524,6 +1536,7 @@ static struct cgroup_ops cgmanager_ops = { .create_legacy = NULL, .get_cgroup = cgm_get_cgroup, .canonical_path = cgm_canonical_path, + .escape = cgm_escape, .get = cgm_get, .set = cgm_set, .unfreeze = cgm_unfreeze, diff --git a/src/lxc/cgroup.c b/src/lxc/cgroup.c index b1c764f..aeb8a58 100644 --- a/src/lxc/cgroup.c +++ b/src/lxc/cgroup.c @@ -109,6 +109,13 @@ const char *cgroup_get_cgroup(struct lxc_handler *handler, const char *subsystem return NULL; } +bool cgroup_escape(void) +{ + if (ops) + return ops->escape(); + return false; +} + const char *cgroup_canonical_path(struct lxc_handler *handler) { if (geteuid()) { diff --git a/src/lxc/cgroup.h b/src/lxc/cgroup.h index 7704c04..e3d3ce4 100644 --- a/src/lxc/cgroup.h +++ b/src/lxc/cgroup.h @@ -47,6 +47,7 @@ struct cgroup_ops { bool (*create_legacy)(void *hdata, pid_t pid); const char *(*get_cgroup)(void *hdata, const char *subsystem); const char *(*canonical_path)(void *hdata); + bool (*escape)(void); int (*set)(const char *filename, const char *value, const char *name, const char *lxcpath); int (*get)(const char *filename, char *value, size_t len, const char *name, const char *lxcpath); bool (*unfreeze)(void *hdata); @@ -71,6 +72,7 @@ extern void cgroup_cleanup(struct lxc_handler *handler); extern bool cgroup_create_legacy(struct lxc
Re: [lxc-devel] [PATCH 1/3] cgroup: add cgroup_escape() call
On Tue, Dec 08, 2015 at 04:08:09PM -0700, Tycho Andersen wrote: > Whoops, forgot the --compose flag. I should mention that this is on top of my already sent patches about adding the ->migrate API call. It probably doesn't matter except for patch 3, and even that git can probably figure out, but it's worth mentioning. Tycho ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] [PATCH 2/3] c/r: escape cgroups before exec()ing criu
Signed-off-by: Tycho Andersen --- src/lxc/criu.c | 12 1 file changed, 12 insertions(+) diff --git a/src/lxc/criu.c b/src/lxc/criu.c index c0ce965..062289f 100644 --- a/src/lxc/criu.c +++ b/src/lxc/criu.c @@ -56,6 +56,18 @@ void exec_criu(struct criu_opts *opts) char buf[4096]; + /* If we are currently in a cgroup /foo/bar, and the container is in a +* cgroup /lxc/foo, lxcfs will give us an ENOENT if some task in the +* container has an open fd that points to one of the cgroup files +* (systemd always opens its "root" cgroup). So, let's escape to the +* /actual/ root cgroup so that lxcfs thinks criu has enough rights to +* see all cgroups. +*/ + if (!cgroup_escape()) { + ERROR("failed to escape cgroups"); + return; + } + /* The command line always looks like: * criu $(action) --tcp-established --file-locks --link-remap --force-irmap \ * --manage-cgroups action-script foo.sh -D $(directory) \ -- 2.6.2 ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] [PATCH 1/3] cgroup: add cgroup_escape() call
We'll use this in the next patch to escape to the root cgroup before we exec criu. Signed-off-by: Tycho Andersen --- src/lxc/cgfs.c | 50 ++ src/lxc/cgmanager.c | 19 --- src/lxc/cgroup.c| 7 +++ src/lxc/cgroup.h| 2 ++ 4 files changed, 75 insertions(+), 3 deletions(-) diff --git a/src/lxc/cgfs.c b/src/lxc/cgfs.c index d65f2d7..94b3d87 100644 --- a/src/lxc/cgfs.c +++ b/src/lxc/cgfs.c @@ -2343,6 +2343,55 @@ static const char *cgfs_canonical_path(void *hdata) return path; } +static bool cgfs_escape(void) +{ + struct cgroup_meta_data *md; + int i; + bool ret = false; + + md = lxc_cgroup_load_meta(); + if (!md) + return false; + + for (i = 1; i <= md->maximum_hierarchy; i++) { + struct cgroup_hierarchy *h = md->hierarchies[i]; + struct cgroup_mount_point *mp; + char *tasks; + FILE *f; + int written; + + if (!h) { + WARN("not escaping hierarchy %d", i); + continue; + } + + mp = lxc_cgroup_find_mount_point(h, "/", true); + if (!mp) + goto out; + + tasks = cgroup_to_absolute_path(mp, "/", "tasks"); + if (!tasks) + goto out; + + f = fopen(tasks, "a"); + free(tasks); + if (!f) + goto out; + + written = fprintf(f, "%d\n", getpid()); + fclose(f); + if (written < 0) { + SYSERROR("writing tasks failed\n"); + goto out; + } + } + + ret = true; +out: + lxc_cgroup_put_meta(md); + return ret; +} + static bool cgfs_unfreeze(void *hdata) { struct cgfs_data *d = hdata; @@ -2408,6 +2457,7 @@ static struct cgroup_ops cgfs_ops = { .create_legacy = cgfs_create_legacy, .get_cgroup = cgfs_get_cgroup, .canonical_path = cgfs_canonical_path, + .escape = cgfs_escape, .get = lxc_cgroupfs_get, .set = lxc_cgroupfs_set, .unfreeze = cgfs_unfreeze, diff --git a/src/lxc/cgmanager.c b/src/lxc/cgmanager.c index 05df0da..04adc69 100644 --- a/src/lxc/cgmanager.c +++ b/src/lxc/cgmanager.c @@ -312,13 +312,22 @@ static bool lxc_cgmanager_create(const char *controller, const char *cgroup_path * be in "/lxc/c1" rather than "/user//c1" * called internally with connection already open */ -static bool lxc_cgmanager_escape(void) +static bool cgm_escape(void) { - bool ret = true; + bool ret = true, cgm_connected = false; pid_t me = getpid(); char **slist = subsystems; int i; + if (!cgroup_manager) { + if (!cgm_dbus_connect()) { + ERROR("Error connecting to cgroup manager"); + return false; + } + cgm_connected = true; + } + + if (cgm_all_controllers_same) slist = subsystems_inone; @@ -335,6 +344,9 @@ static bool lxc_cgmanager_escape(void) } } + if (cgm_connected) + cgm_dbus_disconnect(); + return ret; } @@ -1307,7 +1319,7 @@ struct cgroup_ops *cgm_ops_init(void) goto err1; // root; try to escape to root cgroup - if (geteuid() == 0 && !lxc_cgmanager_escape()) + if (geteuid() == 0 && !cgm_escape()) goto err2; cgm_dbus_disconnect(); @@ -1524,6 +1536,7 @@ static struct cgroup_ops cgmanager_ops = { .create_legacy = NULL, .get_cgroup = cgm_get_cgroup, .canonical_path = cgm_canonical_path, + .escape = cgm_escape, .get = cgm_get, .set = cgm_set, .unfreeze = cgm_unfreeze, diff --git a/src/lxc/cgroup.c b/src/lxc/cgroup.c index b1c764f..aeb8a58 100644 --- a/src/lxc/cgroup.c +++ b/src/lxc/cgroup.c @@ -109,6 +109,13 @@ const char *cgroup_get_cgroup(struct lxc_handler *handler, const char *subsystem return NULL; } +bool cgroup_escape(void) +{ + if (ops) + return ops->escape(); + return false; +} + const char *cgroup_canonical_path(struct lxc_handler *handler) { if (geteuid()) { diff --git a/src/lxc/cgroup.h b/src/lxc/cgroup.h index 7704c04..e3d3ce4 100644 --- a/src/lxc/cgroup.h +++ b/src/lxc/cgroup.h @@ -47,6 +47,7 @@ struct cgroup_ops { bool (*create_legacy)(void *hdata, pid_t pid); const char *(*get_cgroup)(void *hdata, const char *subsystem); const char *(*canonical_path)(void *hdata); + bool (*escape)(void); int (*set)(const char *filename, const char *value, const cha
[lxc-devel] [PATCH 3/3] c/r: add more logging when restore fails
Signed-off-by: Tycho Andersen --- src/lxc/criu.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/src/lxc/criu.c b/src/lxc/criu.c index 062289f..da909a9 100644 --- a/src/lxc/criu.c +++ b/src/lxc/criu.c @@ -525,6 +525,7 @@ void do_restore(struct lxc_container *c, int pipe, char *directory, bool verbose if (WIFEXITED(status)) { if (WEXITSTATUS(status)) { + ERROR("criu process exited %d\n", WEXITSTATUS(status)); goto out_fini_handler; } else { int ret; @@ -544,8 +545,10 @@ void do_restore(struct lxc_container *c, int pipe, char *directory, bool verbose goto out_fini_handler; } - if (lxc_set_state(c->name, handler, RUNNING)) + if (lxc_set_state(c->name, handler, RUNNING)) { + ERROR("error setting running state after restore"); goto out_fini_handler; + } } } else { ERROR("CRIU was killed with signal %d\n", WTERMSIG(status)); -- 2.6.2 ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] [PATCH] c/r: remove random line continuations
No idea how these got there, but let's get rid of them since they're weird. Signed-off-by: Tycho Andersen --- src/lxc/criu.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/src/lxc/criu.c b/src/lxc/criu.c index 695a763..a0c3a16 100644 --- a/src/lxc/criu.c +++ b/src/lxc/criu.c @@ -175,9 +175,9 @@ void exec_criu(struct criu_opts *opts) additional = lxc_list_len(&opts->c->lxc_conf->network) * 2; - m = realloc(argv, (argc + additional + 1) * sizeof(*argv)); \ - if (!m) \ - goto err; \ + m = realloc(argv, (argc + additional + 1) * sizeof(*argv)); + if (!m) + goto err; argv = m; lxc_list_for_each(it, &opts->c->lxc_conf->network) { -- 2.5.0 ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
Re: [lxc-devel] [PATCH 3/3] c/r: add a new ->migrate API call
On Wed, Dec 02, 2015 at 02:30:54PM -0700, Tycho Andersen wrote: > This patch adds a new ->migrate API call with three commands: Derp, here's a new version with a small but obvious bug fixed. Tycho >From 9c4c750b9305ee81e11fa399057d6f8d1d7eb1e2 Mon Sep 17 00:00:00 2001 From: Tycho Andersen Date: Mon, 30 Nov 2015 15:14:22 -0700 Subject: [PATCH 3/3] c/r: add a new ->migrate API call This patch adds a new ->migrate API call with three commands: MIGRATE_DUMP: this is basically just ->checkpoint() MIGRATE_RESTORE: this is just ->restore() MIGRATE_PRE_DUMP: this can be used to invoke criu's pre-dump command on the container. A small addition to the (pre-)dump commands is the ability to specify a previous partial dump directory, so that one can use a pre-dump of a container. Finally, this new API call uses a structure to pass options so that it can be easily extended in the future (e.g. to CRIU's --leave-frozen option in the future, for potentially smarter failure handling on restore). v2: remember to flip the return code for legacy ->checkpoint and ->restore calls Signed-off-by: Tycho Andersen --- src/lxc/criu.c | 155 +++-- src/lxc/criu.h | 9 ++- src/lxc/lxccontainer.c | 140 src/lxc/lxccontainer.h | 33 +++ 4 files changed, 237 insertions(+), 100 deletions(-) diff --git a/src/lxc/criu.c b/src/lxc/criu.c index 695a763..c0ce965 100644 --- a/src/lxc/criu.c +++ b/src/lxc/criu.c @@ -64,12 +64,16 @@ void exec_criu(struct criu_opts *opts) * --enable-fs hugetlbfs --enable-fs tracefs * +1 for final NULL */ - if (strcmp(opts->action, "dump") == 0) { + if (strcmp(opts->action, "dump") == 0 || strcmp(opts->action, "pre-dump") == 0) { /* -t pid --freeze-cgroup /lxc/ct */ static_args += 4; - /* --leave-running */ - if (!opts->stop) + /* --prev-images-dir */ + if (opts->predump_dir) + static_args += 2; + + /* --leave-running (only for final dump) */ + if (strcmp(opts->action, "dump") == 0 && !opts->stop) static_args++; } else if (strcmp(opts->action, "restore") == 0) { /* --root $(lxc_mount_point) --restore-detached @@ -133,13 +137,12 @@ void exec_criu(struct criu_opts *opts) if (opts->verbose) DECLARE_ARG("-vv"); - if (strcmp(opts->action, "dump") == 0) { + if (strcmp(opts->action, "dump") == 0 || strcmp(opts->action, "pre-dump") == 0) { char pid[32], *freezer_relative; if (sprintf(pid, "%d", opts->c->init_pid(opts->c)) < 0) goto err; - DECLARE_ARG("-t"); DECLARE_ARG(pid); @@ -158,7 +161,13 @@ void exec_criu(struct criu_opts *opts) DECLARE_ARG("--freeze-cgroup"); DECLARE_ARG(log); - if (!opts->stop) + if (opts->predump_dir) { + DECLARE_ARG("--prev-images-dir"); + DECLARE_ARG(opts->predump_dir); + } + + /* only for final dump */ + if (strcmp(opts->action, "dump") == 0 && !opts->stop) DECLARE_ARG("--leave-running"); } else if (strcmp(opts->action, "restore") == 0) { void *m; @@ -402,6 +411,8 @@ out_unlock: return !has_error; } +// do_restore never returns, the calling process is used as the +// monitor process. do_restore calls exit() if it fails. void do_restore(struct lxc_container *c, int pipe, char *directory, bool verbose) { pid_t pid; @@ -560,3 +571,135 @@ out: exit(1); } + +/* do one of either predump or a regular dump */ +static bool do_dump(struct lxc_container *c, char *mode, char *directory, + bool stop, bool verbose, char *predump_dir) +{ + pid_t pid; + + if (!criu_ok(c)) + return false; + + if (mkdir_p(directory, 0700) < 0) + return false; + + pid = fork(); + if (pid < 0) { + SYSERROR("fork failed"); + return false; + } + + if (pid == 0) { + struct criu_opts os; + + os.action = mode; + os.directory = directory; + os.c = c; + os.stop = stop; + os.verbose = verbose; + os.predump_dir = predump_dir; + + /* exec_criu() returning is an error */ + exec_criu(&os); + exit(1); + } else { + int status; + pid_t w = waitpid(pid, &status, 0); + if (w == -1) { + SYSERROR("waitpid"); + return false; + } + + if (WIFEXITED(status)) { + if (WEXITSTATUS(status)) { +ERROR("dump failed with %d\n", WEXITSTATUS(status)); +return false; + } + + return true; + } else if (WIFSIGNALED(status)) { + ERROR("dump signaled with %d\n", WTERMSIG(status)); + return false; + } else { + ERROR("unknown dump exit %d\n", status); + return false; + } + } +} + +bool pre_dump(struct lxc_container *c, char *directory, bool verbose, char *predump_dir) +{ + return do_dump(c, "pre-dump", directory, false, verbose, pr
[lxc-devel] [PATCH 1/3] api wrapper: only reset the current config if this call set it
Instead of *always* resetting the current_config to null, we should only reset it if this API call set it. This allows nesting of API calls, e.g. c->checkpoint() can pass stuff into criu.c, which can call c->init_pid() and not lose the ability to log stuff afterwards. Signed-off-by: Tycho Andersen --- src/lxc/lxccontainer.c | 48 1 file changed, 40 insertions(+), 8 deletions(-) diff --git a/src/lxc/lxccontainer.c b/src/lxc/lxccontainer.c index 69816da..613e894 100644 --- a/src/lxc/lxccontainer.c +++ b/src/lxc/lxccontainer.c @@ -345,9 +345,17 @@ out: static rettype fnname(struct lxc_container *c) \ { \ rettype ret;\ - current_config = c ? c->lxc_conf : NULL;\ + bool reset_config = false; \ + \ + if (!current_config && c && c->lxc_conf) { \ + current_config = c->lxc_conf; \ + reset_config = true;\ + } \ + \ ret = do_##fnname(c); \ - current_config = NULL; \ + if (reset_config) \ + current_config = NULL; \ + \ return ret; \ } @@ -355,9 +363,17 @@ static rettype fnname(struct lxc_container *c) \ static rettype fnname(struct lxc_container *c, t1 a1) \ { \ rettype ret;\ - current_config = c ? c->lxc_conf : NULL;\ + bool reset_config = false; \ + \ + if (!current_config && c && c->lxc_conf) { \ + current_config = c->lxc_conf; \ + reset_config = true;\ + } \ + \ ret = do_##fnname(c, a1); \ - current_config = NULL; \ + if (reset_config) \ + current_config = NULL; \ + \ return ret; \ } @@ -365,9 +381,17 @@ static rettype fnname(struct lxc_container *c, t1 a1) \ static rettype fnname(struct lxc_container *c, t1 a1, t2 a2) \ { \ rettype ret;\ - current_config = c ? c->lxc_conf : NULL;\ + bool reset_config = false; \ + \ + if (!current_config && c && c->lxc_conf) { \ + current_config = c->lxc_conf; \ + reset_config = true;\ + } \ + \ ret = do_##fnname(c, a1, a2); \ - current_config = NULL; \ + if (reset_config) \ + current_config = NULL; \ + \ return ret; \ } @@ -375,9 +399,17 @@ static rettype fnname(struct lxc_container *c, t1 a1, t2 a2) \ static rettype fnname(struct lxc_container *c, t1 a1, t2 a2, t3 a3)\ { \ rettype ret;
[lxc-devel] [PATCH 3/3] c/r: add a new ->migrate API call
This patch adds a new ->migrate API call with three commands: MIGRATE_DUMP: this is basically just ->checkpoint() MIGRATE_RESTORE: this is just ->restore() MIGRATE_PRE_DUMP: this can be used to invoke criu's pre-dump command on the container. A small addition to the (pre-)dump commands is the ability to specify a previous partial dump directory, so that one can use a pre-dump of a container. Finally, this new API call uses a structure to pass options so that it can be easily extended in the future (e.g. to CRIU's --leave-frozen option in the future, for potentially smarter failure handling on restore). Signed-off-by: Tycho Andersen --- src/lxc/criu.c | 155 +++-- src/lxc/criu.h | 9 ++- src/lxc/lxccontainer.c | 140 src/lxc/lxccontainer.h | 33 +++ 4 files changed, 237 insertions(+), 100 deletions(-) diff --git a/src/lxc/criu.c b/src/lxc/criu.c index 695a763..c0ce965 100644 --- a/src/lxc/criu.c +++ b/src/lxc/criu.c @@ -64,12 +64,16 @@ void exec_criu(struct criu_opts *opts) * --enable-fs hugetlbfs --enable-fs tracefs * +1 for final NULL */ - if (strcmp(opts->action, "dump") == 0) { + if (strcmp(opts->action, "dump") == 0 || strcmp(opts->action, "pre-dump") == 0) { /* -t pid --freeze-cgroup /lxc/ct */ static_args += 4; - /* --leave-running */ - if (!opts->stop) + /* --prev-images-dir */ + if (opts->predump_dir) + static_args += 2; + + /* --leave-running (only for final dump) */ + if (strcmp(opts->action, "dump") == 0 && !opts->stop) static_args++; } else if (strcmp(opts->action, "restore") == 0) { /* --root $(lxc_mount_point) --restore-detached @@ -133,13 +137,12 @@ void exec_criu(struct criu_opts *opts) if (opts->verbose) DECLARE_ARG("-vv"); - if (strcmp(opts->action, "dump") == 0) { + if (strcmp(opts->action, "dump") == 0 || strcmp(opts->action, "pre-dump") == 0) { char pid[32], *freezer_relative; if (sprintf(pid, "%d", opts->c->init_pid(opts->c)) < 0) goto err; - DECLARE_ARG("-t"); DECLARE_ARG(pid); @@ -158,7 +161,13 @@ void exec_criu(struct criu_opts *opts) DECLARE_ARG("--freeze-cgroup"); DECLARE_ARG(log); - if (!opts->stop) + if (opts->predump_dir) { + DECLARE_ARG("--prev-images-dir"); + DECLARE_ARG(opts->predump_dir); + } + + /* only for final dump */ + if (strcmp(opts->action, "dump") == 0 && !opts->stop) DECLARE_ARG("--leave-running"); } else if (strcmp(opts->action, "restore") == 0) { void *m; @@ -402,6 +411,8 @@ out_unlock: return !has_error; } +// do_restore never returns, the calling process is used as the +// monitor process. do_restore calls exit() if it fails. void do_restore(struct lxc_container *c, int pipe, char *directory, bool verbose) { pid_t pid; @@ -560,3 +571,135 @@ out: exit(1); } + +/* do one of either predump or a regular dump */ +static bool do_dump(struct lxc_container *c, char *mode, char *directory, + bool stop, bool verbose, char *predump_dir) +{ + pid_t pid; + + if (!criu_ok(c)) + return false; + + if (mkdir_p(directory, 0700) < 0) + return false; + + pid = fork(); + if (pid < 0) { + SYSERROR("fork failed"); + return false; + } + + if (pid == 0) { + struct criu_opts os; + + os.action = mode; + os.directory = directory; + os.c = c; + os.stop = stop; + os.verbose = verbose; + os.predump_dir = predump_dir; + + /* exec_criu() returning is an error */ + exec_criu(&os); + exit(1); + } else { + int status; + pid_t w = waitpid(pid, &status, 0); + if (w == -1) { + SYSERROR("waitpid"); + return false; + } + + if (WIFEXITED(status)) { + if (WEXITSTATUS(status)) { + ERROR("dump failed with %d\n", WEXITSTATUS(status)); + return false; +
[lxc-devel] [PATCH 2/3] c/r: bump criu version requirements
Since we're relying on 1.8 for the seccomp stuff, let's refuse to use anything lower than that. Signed-off-by: Tycho Andersen --- src/lxc/criu.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/src/lxc/criu.h b/src/lxc/criu.h index df63625..9714d17 100644 --- a/src/lxc/criu.h +++ b/src/lxc/criu.h @@ -29,10 +29,10 @@ // We require either the criu major/minor version, or the criu GITID if criu // was built from git. -#define CRIU_VERSION "1.6" +#define CRIU_VERSION "1.8" -#define CRIU_GITID_VERSION "1.5" -#define CRIU_GITID_PATCHLEVEL 133 +#define CRIU_GITID_VERSION "1.7" +#define CRIU_GITID_PATCHLEVEL 371 struct criu_opts { /* The type of criu invocation, one of "dump" or "restore" */ -- 2.6.2 ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] a new API for c/r for lxc 2.0
Hi all, Here's a new, hopefully more extensible, API for doing checkpoint/restore (and related) operations on a container. There are a few things landing soon in CRIU that I'd like to integrate, but I haven't specified them yet because they're not landed, and also because it will be a good test as to whether or not this extensibility actually works. The basic idea is similar to the kernel's bpf() syscall, where a variable length struct is allowed, as long as the extra bits of the struct past what the current liblxc understands are zeroed out (i.e. some reasonable default value). Thoughts welcome, Tycho ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
Re: [lxc-devel] [PATCH] log: use the right size for timestamp formatting
On Tue, Dec 01, 2015 at 08:58:26AM -0700, Tycho Andersen wrote: > Signed-off-by: Tycho Andersen Ah, derp. Some other random debug crap got in here. Here's a better patch. Tycho >From 0f65015bed38b2dc03d1245a4e82afff5a01c426 Mon Sep 17 00:00:00 2001 From: Tycho Andersen Date: Tue, 1 Dec 2015 08:59:30 -0700 Subject: [PATCH] log: use the right size for timestamp formatting v2: get rid of extra debug crap Signed-off-by: Tycho Andersen --- src/lxc/log.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/lxc/log.c b/src/lxc/log.c index f44a1d9..20be5ac 100644 --- a/src/lxc/log.c +++ b/src/lxc/log.c @@ -90,7 +90,7 @@ static int log_append_logfile(const struct lxc_log_appender *appender, return 0; t = localtime(&event->timestamp.tv_sec); - strftime(date, sizeof(LXC_LOG_DATEFOMAT_SIZE), "%Y%m%d%H%M%S", t); + strftime(date, sizeof(date), "%Y%m%d%H%M%S", t); ms = event->timestamp.tv_usec / 1000; n = snprintf(buffer, sizeof(buffer), "%15s %10s.%03d %-8s %s - %s:%s:%d - ", -- 2.6.2 ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] [PATCH] log: use the right size for timestamp formatting
Signed-off-by: Tycho Andersen --- src/lxc/confile.c | 2 ++ src/lxc/criu.c| 7 ++- src/lxc/log.c | 2 +- 3 files changed, 5 insertions(+), 6 deletions(-) diff --git a/src/lxc/confile.c b/src/lxc/confile.c index c2eaaa6..9ed9e38 100644 --- a/src/lxc/confile.c +++ b/src/lxc/confile.c @@ -2119,6 +2119,8 @@ static int lxc_get_cgroup_entry(struct lxc_conf *c, char *retv, int inlen, int all = 0; struct lxc_list *it; + ERROR("TYCHO: lxc_get_cgroup_entry %s %s\n", retv, key); + if (!retv) inlen = 0; else diff --git a/src/lxc/criu.c b/src/lxc/criu.c index 695a763..759db69 100644 --- a/src/lxc/criu.c +++ b/src/lxc/criu.c @@ -348,13 +348,10 @@ bool criu_ok(struct lxc_container *c) return false; } - if (c->lxc_conf->tty != 0) { - ERROR("lxc.tty must be 0\n"); - return false; - } - + ERROR("TYCHO: testing %d cgs\n", lxc_list_len(&c->lxc_conf->cgroup)); lxc_list_for_each(it, &c->lxc_conf->cgroup) { struct lxc_cgroup *cg = it->elem; + ERROR("TYCHO: subsystem: %s, value: %s\n", cg->subsystem, cg->value); if (strcmp(cg->subsystem, "devices.deny") == 0 && strcmp(cg->value, "c 5:1 rwm") == 0) { diff --git a/src/lxc/log.c b/src/lxc/log.c index f44a1d9..20be5ac 100644 --- a/src/lxc/log.c +++ b/src/lxc/log.c @@ -90,7 +90,7 @@ static int log_append_logfile(const struct lxc_log_appender *appender, return 0; t = localtime(&event->timestamp.tv_sec); - strftime(date, sizeof(LXC_LOG_DATEFOMAT_SIZE), "%Y%m%d%H%M%S", t); + strftime(date, sizeof(date), "%Y%m%d%H%M%S", t); ms = event->timestamp.tv_usec / 1000; n = snprintf(buffer, sizeof(buffer), "%15s %10s.%03d %-8s %s - %s:%s:%d - ", -- 2.6.2 ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
Re: [lxc-devel] [PATCH] c/r: tell CRIU about cgproxy sockets
On Tue, Nov 17, 2015 at 01:13:40PM -0700, Tycho Andersen wrote: > CRIU needs to be told about connections to "external" sockets (in > particular, that it is ok to dump containers with connections to these > sockets). Since cgproxy connects to the bind mounted socket provided by the > lxc runtime, we should (as the runtime) also tell CRIU that this is an > allowable external socket. > > This is mostly correct in the sense that the protocol is stateless, so a > simple reconnect should be enough to restore things. However, if cgproxy > was in the middle of making a request that is half over the wire (or > waiting for a response to a request), this request will fail and cgproxy > will probably give some weird errors. Hmm, actually drop this patch for a bit. Something is racy here and I'm not sure what. Tycho ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] [PATCH] c/r: tell CRIU about cgproxy sockets
CRIU needs to be told about connections to "external" sockets (in particular, that it is ok to dump containers with connections to these sockets). Since cgproxy connects to the bind mounted socket provided by the lxc runtime, we should (as the runtime) also tell CRIU that this is an allowable external socket. This is mostly correct in the sense that the protocol is stateless, so a simple reconnect should be enough to restore things. However, if cgproxy was in the middle of making a request that is half over the wire (or waiting for a response to a request), this request will fail and cgproxy will probably give some weird errors. Signed-off-by: Tycho Andersen --- src/lxc/criu.c | 97 +- 1 file changed, 96 insertions(+), 1 deletion(-) diff --git a/src/lxc/criu.c b/src/lxc/criu.c index 695a763..b10d58f 100644 --- a/src/lxc/criu.c +++ b/src/lxc/criu.c @@ -47,14 +47,97 @@ lxc_log_define(lxc_criu, lxc); +#define CGMANAGER_SOCK "/sys/fs/cgroup/cgmanager/sock" +int find_cgm_inodes(pid_t init, char **ino_arg) +{ + int ret = -1, rootns; + FILE *f; + + /* We're interested in the inodes for the connections to the cgmanager +* socket from the inside of the container, which means they are in the +* container's /proc/net/unix; we setns there to look. +*/ + rootns = open("/proc/self/ns/net", O_RDONLY); + if (rootns < 0) { + SYSERROR("opening own namespace"); + return -1; + } + + if (!switch_to_ns(init, "net")) + goto out; + + f = fopen("/proc/net/unix", "r"); + if (!f) { + SYSERROR("opening /proc/net/unix"); + goto out_ns; + } + + /* skip the header line */ + while(!feof(f) && fgetc(f) != '\n') + ; + + while(1) { + int scanned; + char path[PATH_MAX], *tmp; + ino_t inode; + + scanned = fscanf(f, "%*x: %*x %*d %*d %*d %*d %lu", &inode); + if (scanned != 1) { + if (feof(f)) + break; + ERROR("problem scanning for members (%d)\n", scanned); + goto out_f; + } + + /* no path means it's not what we're looking for */ + if (!fgets(path, sizeof(path), f)) + continue; + + /* chop off the \n */ + path[strlen(path) - 1] = 0; + + /* start from after the delimiter */ + if (strcmp(path + 1, CGMANAGER_SOCK)) + continue; + + if (!*ino_arg) { + if (asprintf(&tmp, "--ext-unix-sk=%lu", inode) < 0) + goto out_f; + } else { + if (asprintf(&tmp, "%s,%lu", *ino_arg, inode) < 0) + goto out_f; + } + + *ino_arg = tmp; + } + + ret = 0; + +out_f: + fclose(f); +out_ns: + if (setns(rootns, 0) < 0) { + SYSERROR("error returning to init net ns"); + ret = -1; + } +out: + close(rootns); + return ret; +} + void exec_criu(struct criu_opts *opts) { char **argv, log[PATH_MAX]; int static_args = 22, argc = 0, i, ret; int netnr = 0; struct lxc_list *it; + char *cgm_ino_arg = NULL; char buf[4096]; + pid_t init = opts->c->init_pid(opts->c); + + if (init < 0) + return; /* The command line always looks like: * criu $(action) --tcp-established --file-locks --link-remap --force-irmap \ @@ -71,6 +154,14 @@ void exec_criu(struct criu_opts *opts) /* --leave-running */ if (!opts->stop) static_args++; + + ret = find_cgm_inodes(init, &cgm_ino_arg); + if (ret < 0) + return; + + if (cgm_ino_arg != NULL) + static_args++; + } else if (strcmp(opts->action, "restore") == 0) { /* --root $(lxc_mount_point) --restore-detached * --restore-sibling --pidfile $foo --cgroup-root $foo */ @@ -136,7 +227,7 @@ void exec_criu(struct criu_opts *opts) if (strcmp(opts->action, "dump") == 0) { char pid[32], *freezer_relative; - if (sprintf(pid, "%d", opts->c->init_pid(opts->c)) < 0) + if (sprintf(pid, "%d", init) < 0) goto err; @@ -160,6 +251,10 @@ void exec_criu(struct criu_opts *opts)
[lxc-devel] [PATCH v3] don't truncate environment sometimes in setproctitle
Instead, let's just allocate new space for the proctitle to live and point the kernel at that. v2: take out testing hunk v3: check return from realloc Signed-off-by: Tycho Andersen Acked-by: Serge E. Hallyn --- src/lxc/utils.c | 30 -- 1 file changed, 12 insertions(+), 18 deletions(-) diff --git a/src/lxc/utils.c b/src/lxc/utils.c index dac6418..ad9b0a2 100644 --- a/src/lxc/utils.c +++ b/src/lxc/utils.c @@ -1349,6 +1349,7 @@ char *get_template_path(const char *t) */ int setproctitle(char *title) { + static char *proctitle = NULL; char buf[2048], *tmp; FILE *f; int i, len, ret = 0; @@ -1413,28 +1414,21 @@ int setproctitle(char *title) * want to have room for it. */ len = strlen(title) + 1; - /* We're truncating the environment, so we should use at most the -* length of the argument + environment for the title. */ - if (len > env_end - arg_start) { - arg_end = env_end; - len = env_end - arg_start; - title[len-1] = '\0'; - } else { - /* Only truncate the environment if we're actually going to -* overwrite part of it. */ - if (len >= arg_end - arg_start) { - env_start = env_end; - } - - arg_end = arg_start + len; - - /* check overflow */ - if (arg_end < len || arg_end < arg_start) { + /* If we don't have enough room by just overwriting the old proctitle, +* let's allocate a new one. +*/ + if (len > arg_end - arg_start) { + void *m; + m = realloc(proctitle, len); + if (!m) return -1; - } + proctitle = m; + arg_start = (unsigned long) proctitle; } + arg_end = arg_start + len; + brk_val = syscall(__NR_brk, 0); prctl_map = (struct prctl_mm_map) { -- 2.5.0 ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] [PATCH v2] don't truncate environment sometimes in setproctitle
Instead, let's just allocate new space for the proctitle to live and point the kernel at that. v2: take out testing hunk Signed-off-by: Tycho Andersen --- src/lxc/utils.c | 29 + 1 file changed, 9 insertions(+), 20 deletions(-) diff --git a/src/lxc/utils.c b/src/lxc/utils.c index dac6418..a5ec2ec 100644 --- a/src/lxc/utils.c +++ b/src/lxc/utils.c @@ -1349,6 +1349,7 @@ char *get_template_path(const char *t) */ int setproctitle(char *title) { + static char *proctitle = NULL; char buf[2048], *tmp; FILE *f; int i, len, ret = 0; @@ -1413,28 +1414,16 @@ int setproctitle(char *title) * want to have room for it. */ len = strlen(title) + 1; - /* We're truncating the environment, so we should use at most the -* length of the argument + environment for the title. */ - if (len > env_end - arg_start) { - arg_end = env_end; - len = env_end - arg_start; - title[len-1] = '\0'; - } else { - /* Only truncate the environment if we're actually going to -* overwrite part of it. */ - if (len >= arg_end - arg_start) { - env_start = env_end; - } - - arg_end = arg_start + len; - - /* check overflow */ - if (arg_end < len || arg_end < arg_start) { - return -1; - } - + /* If we don't have enough room by just overwriting the old proctitle, +* let's allocate a new one. +*/ + if (len > arg_end - arg_start) { + proctitle = realloc(proctitle, len); + arg_start = (unsigned long) proctitle; } + arg_end = arg_start + len; + brk_val = syscall(__NR_brk, 0); prctl_map = (struct prctl_mm_map) { -- 2.5.0 ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
Re: [lxc-devel] [PATCH] don't truncate environment sometimes in setproctitle
On Mon, Nov 16, 2015 at 02:49:56PM -0700, Tycho Andersen wrote: > Instead, let's just allocate new space for the proctitle to live and point > the kernel at that. > > Signed-off-by: Tycho Andersen > --- > src/lxc/lxccontainer.c | 2 +- > src/lxc/utils.c| 29 + > 2 files changed, 10 insertions(+), 21 deletions(-) > > diff --git a/src/lxc/lxccontainer.c b/src/lxc/lxccontainer.c > index 5207255..c46d4f0 100644 > --- a/src/lxc/lxccontainer.c > +++ b/src/lxc/lxccontainer.c > @@ -762,7 +762,7 @@ static bool do_lxcapi_start(struct lxc_container *c, int > useinit, char * const a >* characters; all that it means is that the proctitle will be >* ugly. Similarly, we also don't care if setproctitle() >* fails. */ > - snprintf(title, sizeof(title), "[lxc monitor] %s %s", > c->config_path, c->name); > + snprintf(title, sizeof(title), "[lxc monitor this is a really > long proctitle > 01234567890123456789012345678901234567890012345678901234567890123456789012345678901234567890123456789012345678901234567890] > %s %s", c->config_path, c->name); Heh. Will resend without this hunk. Tycho > INFO("Attempting to set proc title to %s", title); > setproctitle(title); > > diff --git a/src/lxc/utils.c b/src/lxc/utils.c > index dac6418..a5ec2ec 100644 > --- a/src/lxc/utils.c > +++ b/src/lxc/utils.c > @@ -1349,6 +1349,7 @@ char *get_template_path(const char *t) > */ > int setproctitle(char *title) > { > + static char *proctitle = NULL; > char buf[2048], *tmp; > FILE *f; > int i, len, ret = 0; > @@ -1413,28 +1414,16 @@ int setproctitle(char *title) >* want to have room for it. */ > len = strlen(title) + 1; > > - /* We're truncating the environment, so we should use at most the > - * length of the argument + environment for the title. */ > - if (len > env_end - arg_start) { > - arg_end = env_end; > - len = env_end - arg_start; > - title[len-1] = '\0'; > - } else { > - /* Only truncate the environment if we're actually going to > - * overwrite part of it. */ > - if (len >= arg_end - arg_start) { > - env_start = env_end; > - } > - > - arg_end = arg_start + len; > - > - /* check overflow */ > - if (arg_end < len || arg_end < arg_start) { > - return -1; > - } > - > + /* If we don't have enough room by just overwriting the old proctitle, > + * let's allocate a new one. > + */ > + if (len > arg_end - arg_start) { > + proctitle = realloc(proctitle, len); > + arg_start = (unsigned long) proctitle; > } > > + arg_end = arg_start + len; > + > brk_val = syscall(__NR_brk, 0); > > prctl_map = (struct prctl_mm_map) { > -- > 2.5.0 > ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] [PATCH] don't truncate environment sometimes in setproctitle
Instead, let's just allocate new space for the proctitle to live and point the kernel at that. Signed-off-by: Tycho Andersen --- src/lxc/lxccontainer.c | 2 +- src/lxc/utils.c| 29 + 2 files changed, 10 insertions(+), 21 deletions(-) diff --git a/src/lxc/lxccontainer.c b/src/lxc/lxccontainer.c index 5207255..c46d4f0 100644 --- a/src/lxc/lxccontainer.c +++ b/src/lxc/lxccontainer.c @@ -762,7 +762,7 @@ static bool do_lxcapi_start(struct lxc_container *c, int useinit, char * const a * characters; all that it means is that the proctitle will be * ugly. Similarly, we also don't care if setproctitle() * fails. */ - snprintf(title, sizeof(title), "[lxc monitor] %s %s", c->config_path, c->name); + snprintf(title, sizeof(title), "[lxc monitor this is a really long proctitle 01234567890123456789012345678901234567890012345678901234567890123456789012345678901234567890123456789012345678901234567890] %s %s", c->config_path, c->name); INFO("Attempting to set proc title to %s", title); setproctitle(title); diff --git a/src/lxc/utils.c b/src/lxc/utils.c index dac6418..a5ec2ec 100644 --- a/src/lxc/utils.c +++ b/src/lxc/utils.c @@ -1349,6 +1349,7 @@ char *get_template_path(const char *t) */ int setproctitle(char *title) { + static char *proctitle = NULL; char buf[2048], *tmp; FILE *f; int i, len, ret = 0; @@ -1413,28 +1414,16 @@ int setproctitle(char *title) * want to have room for it. */ len = strlen(title) + 1; - /* We're truncating the environment, so we should use at most the -* length of the argument + environment for the title. */ - if (len > env_end - arg_start) { - arg_end = env_end; - len = env_end - arg_start; - title[len-1] = '\0'; - } else { - /* Only truncate the environment if we're actually going to -* overwrite part of it. */ - if (len >= arg_end - arg_start) { - env_start = env_end; - } - - arg_end = arg_start + len; - - /* check overflow */ - if (arg_end < len || arg_end < arg_start) { - return -1; - } - + /* If we don't have enough room by just overwriting the old proctitle, +* let's allocate a new one. +*/ + if (len > arg_end - arg_start) { + proctitle = realloc(proctitle, len); + arg_start = (unsigned long) proctitle; } + arg_end = arg_start + len; + brk_val = syscall(__NR_brk, 0); prctl_map = (struct prctl_mm_map) { -- 2.5.0 ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
Re: [lxc-devel] [PATCH] get rid of fancy proctitles
On Mon, Nov 16, 2015 at 09:00:24PM +, Serge Hallyn wrote: > Hey Tycho, > > I'm not sure what the best way to handle this is, but I dont' like > having an ephemeral pastebin link in the changelog. Can you just stick > the test program in the commit msg? Actually I think we can drop this one for now, there may be a better way. Thanks, Tycho > Quoting Tycho Andersen (tycho.ander...@canonical.com): > > It turns out clobbering the environ kernel stack does actually affect the > > kernel environment (viz [1]), regardless of what previous behavior or man > > pages said. For the 1.1 series, we should simply not try to clobber > > proctitles. > > > > [1]: http://paste.ubuntu.com/13300018/ > > > > Signed-off-by: Tycho Andersen > > --- > > src/lxc/criu.c | 9 > > src/lxc/lxccontainer.c | 9 > > src/lxc/utils.c| 123 > > - > > src/lxc/utils.h| 1 - > > 4 files changed, 142 deletions(-) > > > > diff --git a/src/lxc/criu.c b/src/lxc/criu.c > > index 7ee6cbe..a76f02d 100644 > > --- a/src/lxc/criu.c > > +++ b/src/lxc/criu.c > > @@ -465,7 +465,6 @@ void do_restore(struct lxc_container *c, int pipe, char > > *directory, bool verbose > > goto out_fini_handler; > > } else { > > int ret; > > - char title[2048]; > > > > pid_t w = waitpid(pid, &status, 0); > > if (w == -1) { > > @@ -511,14 +510,6 @@ void do_restore(struct lxc_container *c, int pipe, > > char *directory, bool verbose > > goto out_fini_handler; > > } > > > > - /* > > -* See comment in lxcapi_start; we don't care if these > > -* fail because it's just a beauty thing. We just > > -* assign the return here to silence potential. > > -*/ > > - ret = snprintf(title, sizeof(title), "[lxc monitor] %s %s", > > c->config_path, c->name); > > - ret = setproctitle(title); > > - > > ret = lxc_poll(c->name, handler); > > if (ret) > > lxc_abort(c->name, handler); > > diff --git a/src/lxc/lxccontainer.c b/src/lxc/lxccontainer.c > > index 11d1822..503925e 100644 > > --- a/src/lxc/lxccontainer.c > > +++ b/src/lxc/lxccontainer.c > > @@ -689,7 +689,6 @@ static bool do_lxcapi_start(struct lxc_container *c, > > int useinit, char * const a > > * while container is running... > > */ > > if (daemonize) { > > - char title[2048]; > > lxc_monitord_spawn(c->config_path); > > > > pid_t pid = fork(); > > @@ -704,14 +703,6 @@ static bool do_lxcapi_start(struct lxc_container *c, > > int useinit, char * const a > > return wait_on_daemonized_start(c, pid); > > } > > > > - /* We don't really care if this doesn't print all the > > -* characters; all that it means is that the proctitle will be > > -* ugly. Similarly, we also don't care if setproctitle() > > -* fails. */ > > - snprintf(title, sizeof(title), "[lxc monitor] %s %s", > > c->config_path, c->name); > > - INFO("Attempting to set proc title to %s", title); > > - setproctitle(title); > > - > > /* second fork to be reparented by init */ > > pid = fork(); > > if (pid < 0) { > > diff --git a/src/lxc/utils.c b/src/lxc/utils.c > > index d9e769d..3f83865 100644 > > --- a/src/lxc/utils.c > > +++ b/src/lxc/utils.c > > @@ -1341,129 +1341,6 @@ char *get_template_path(const char *t) > > } > > > > /* > > - * Sets the process title to the specified title. Note: > > - * 1. this function requires root to succeed > > - * 2. it clears /proc/self/environ > > - * 3. it may not succed (e.g. if title is longer than /proc/self/environ > > + > > - * the original title) > > - */ > > -int setproctitle(char *title) > > -{ > > - char buf[2048], *tmp; > > - FILE *f; > > - int i, len, ret = 0; > > - > > - /* We don't really need to know all of this stuff, but unfortunately > > -* PR_SET_MM_MAP requires us to set it all at once, so we have to > > -* figure it out anyway. > > -*/ > > -
[lxc-devel] [PATCH] get rid of fancy proctitles
It turns out clobbering the environ kernel stack does actually affect the kernel environment (viz [1]), regardless of what previous behavior or man pages said. For the 1.1 series, we should simply not try to clobber proctitles. [1]: http://paste.ubuntu.com/13300018/ Signed-off-by: Tycho Andersen --- src/lxc/criu.c | 9 src/lxc/lxccontainer.c | 9 src/lxc/utils.c| 123 - src/lxc/utils.h| 1 - 4 files changed, 142 deletions(-) diff --git a/src/lxc/criu.c b/src/lxc/criu.c index 7ee6cbe..a76f02d 100644 --- a/src/lxc/criu.c +++ b/src/lxc/criu.c @@ -465,7 +465,6 @@ void do_restore(struct lxc_container *c, int pipe, char *directory, bool verbose goto out_fini_handler; } else { int ret; - char title[2048]; pid_t w = waitpid(pid, &status, 0); if (w == -1) { @@ -511,14 +510,6 @@ void do_restore(struct lxc_container *c, int pipe, char *directory, bool verbose goto out_fini_handler; } - /* -* See comment in lxcapi_start; we don't care if these -* fail because it's just a beauty thing. We just -* assign the return here to silence potential. -*/ - ret = snprintf(title, sizeof(title), "[lxc monitor] %s %s", c->config_path, c->name); - ret = setproctitle(title); - ret = lxc_poll(c->name, handler); if (ret) lxc_abort(c->name, handler); diff --git a/src/lxc/lxccontainer.c b/src/lxc/lxccontainer.c index 11d1822..503925e 100644 --- a/src/lxc/lxccontainer.c +++ b/src/lxc/lxccontainer.c @@ -689,7 +689,6 @@ static bool do_lxcapi_start(struct lxc_container *c, int useinit, char * const a * while container is running... */ if (daemonize) { - char title[2048]; lxc_monitord_spawn(c->config_path); pid_t pid = fork(); @@ -704,14 +703,6 @@ static bool do_lxcapi_start(struct lxc_container *c, int useinit, char * const a return wait_on_daemonized_start(c, pid); } - /* We don't really care if this doesn't print all the -* characters; all that it means is that the proctitle will be -* ugly. Similarly, we also don't care if setproctitle() -* fails. */ - snprintf(title, sizeof(title), "[lxc monitor] %s %s", c->config_path, c->name); - INFO("Attempting to set proc title to %s", title); - setproctitle(title); - /* second fork to be reparented by init */ pid = fork(); if (pid < 0) { diff --git a/src/lxc/utils.c b/src/lxc/utils.c index d9e769d..3f83865 100644 --- a/src/lxc/utils.c +++ b/src/lxc/utils.c @@ -1341,129 +1341,6 @@ char *get_template_path(const char *t) } /* - * Sets the process title to the specified title. Note: - * 1. this function requires root to succeed - * 2. it clears /proc/self/environ - * 3. it may not succed (e.g. if title is longer than /proc/self/environ + - * the original title) - */ -int setproctitle(char *title) -{ - char buf[2048], *tmp; - FILE *f; - int i, len, ret = 0; - - /* We don't really need to know all of this stuff, but unfortunately -* PR_SET_MM_MAP requires us to set it all at once, so we have to -* figure it out anyway. -*/ - unsigned long start_data, end_data, start_brk, start_code, end_code, - start_stack, arg_start, arg_end, env_start, env_end, - brk_val; - struct prctl_mm_map prctl_map; - - f = fopen_cloexec("/proc/self/stat", "r"); - if (!f) { - return -1; - } - - tmp = fgets(buf, sizeof(buf), f); - fclose(f); - if (!tmp) { - return -1; - } - - /* Skip the first 25 fields, column 26-28 are start_code, end_code, -* and start_stack */ - tmp = strchr(buf, ' '); - for (i = 0; i < 24; i++) { - if (!tmp) - return -1; - tmp = strchr(tmp+1, ' '); - } - if (!tmp) - return -1; - - i = sscanf(tmp, "%lu %lu %lu", &start_code, &end_code, &start_stack); - if (i != 3) - return -1; - - /* Skip the next 19 fields, column 45-51 are start_data to arg_end */ - for (i = 0; i < 19; i++) { - if (!tmp) - return -1; - tmp = strchr(tmp+1, ' '); - } - - if (!tmp) - return -1; - - i = sscanf(tmp, &
[lxc-devel] [PATCH] utils: dialback setproctitle failure message
This isn't in any way fatal, so let's only warn about it with INFO, not ERROR. Signed-off-by: Tycho Andersen --- src/lxc/utils.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/lxc/utils.c b/src/lxc/utils.c index d9e769d..dac6418 100644 --- a/src/lxc/utils.c +++ b/src/lxc/utils.c @@ -1458,7 +1458,7 @@ int setproctitle(char *title) if (ret == 0) strcpy((char*)arg_start, title); else - SYSERROR("setting cmdline failed"); + INFO("setting cmdline failed - %s", strerror(errno)); return ret; } -- 2.6.2 ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] [PATCH] c/r: use freezer to seize tasks
Instead of relying on the old ptrace loop, we should instead put all the tasks in the container into the freezer. This will stop them all at the same time, preventing fork bombs from causing criu to infinite loop (and is also simply a lot faster). Note that this uses --freeze-cgroup which isn't in criu 1.7, so it should only go into master. Signed-off-by: Tycho Andersen --- src/lxc/criu.c | 24 +--- 1 file changed, 21 insertions(+), 3 deletions(-) diff --git a/src/lxc/criu.c b/src/lxc/criu.c index 7ee6cbe..695a763 100644 --- a/src/lxc/criu.c +++ b/src/lxc/criu.c @@ -37,6 +37,7 @@ #include "bdev.h" #include "cgroup.h" #include "conf.h" +#include "commands.h" #include "criu.h" #include "log.h" #include "lxc.h" @@ -64,8 +65,8 @@ void exec_criu(struct criu_opts *opts) * +1 for final NULL */ if (strcmp(opts->action, "dump") == 0) { - /* -t pid */ - static_args += 2; + /* -t pid --freeze-cgroup /lxc/ct */ + static_args += 4; /* --leave-running */ if (!opts->stop) @@ -133,13 +134,30 @@ void exec_criu(struct criu_opts *opts) DECLARE_ARG("-vv"); if (strcmp(opts->action, "dump") == 0) { - char pid[32]; + char pid[32], *freezer_relative; if (sprintf(pid, "%d", opts->c->init_pid(opts->c)) < 0) goto err; + DECLARE_ARG("-t"); DECLARE_ARG(pid); + + freezer_relative = lxc_cmd_get_cgroup_path(opts->c->name, + opts->c->config_path, + "freezer"); + if (!freezer_relative) { + ERROR("failed getting freezer path"); + goto err; + } + + ret = snprintf(log, sizeof(log), "/sys/fs/cgroup/freezer/%s", freezer_relative); + if (ret < 0 || ret >= sizeof(log)) + goto err; + + DECLARE_ARG("--freeze-cgroup"); + DECLARE_ARG(log); + if (!opts->stop) DECLARE_ARG("--leave-running"); } else if (strcmp(opts->action, "restore") == 0) { -- 2.6.2 ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] [PATCH] define PR_SET_MM_MAP & friends if necessary
PR_SET_MM_MAP only went in to the kernel at 3.18 (or 3.19), so we need to define these for kernels before then. If there was an error, the code simply logs the failure and continues on. Also, we can drop the PR_SET_MM_otherstuff contstants since those were dropped in 93525c00c76b2804c46cf3c275d610ebe71cb4be. Signed-off-by: Tycho Andersen --- src/lxc/utils.c | 33 +++-- 1 file changed, 19 insertions(+), 14 deletions(-) diff --git a/src/lxc/utils.c b/src/lxc/utils.c index d592243..d9e769d 100644 --- a/src/lxc/utils.c +++ b/src/lxc/utils.c @@ -51,20 +51,25 @@ #define PR_SET_MM 35 #endif -#ifndef PR_SET_MM_ARG_START -#define PR_SET_MM_ARG_START 8 -#endif - -#ifndef PR_SET_MM_ARG_END -#define PR_SET_MM_ARG_END 9 -#endif - -#ifndef PR_SET_MM_ENV_START -#define PR_SET_MM_ENV_START 10 -#endif - -#ifndef PR_SET_MM_ENV_END -#define PR_SET_MM_ENV_END 11 +#ifndef PR_SET_MM_MAP +#define PR_SET_MM_MAP 14 + +struct prctl_mm_map { +uint64_t start_code; +uint64_t end_code; +uint64_t start_data; +uint64_t end_data; +uint64_t start_brk; +uint64_t brk; +uint64_t start_stack; +uint64_t arg_start; +uint64_t arg_end; +uint64_t env_start; +uint64_t env_end; +uint64_t *auxv; +uint32_t auxv_size; +uint32_t exe_fd; +}; #endif #ifndef O_PATH -- 2.6.2 ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] [PATCH] c/r: don't require a veth link to c/r
veths can be unconnected in the container's config, and we should handle this case. Signed-off-by: Tycho Andersen --- src/lxc/criu.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/src/lxc/criu.c b/src/lxc/criu.c index 798036a..7ee6cbe 100644 --- a/src/lxc/criu.c +++ b/src/lxc/criu.c @@ -178,7 +178,10 @@ void exec_criu(struct criu_opts *opts) veth = n->priv.veth_attr.pair; - ret = snprintf(buf, sizeof(buf), "%s=%s@%s", eth, veth, n->link); + if (n->link) + ret = snprintf(buf, sizeof(buf), "%s=%s@%s", eth, veth, n->link); + else + ret = snprintf(buf, sizeof(buf), "%s=%s", eth, veth); if (ret < 0 || ret >= sizeof(buf)) goto err; -- 2.6.2 ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] [PATCH v2] use PR_SET_MM_MAP instead of PR_SET_MM
PR_SET_MM_MAP can be called as non-root, which we are in the unprivileged (or nested) case. Also, let's not do the strcpy() for the new cmdline until after we're sure the prctl succeeded. This means that even if it does fail, we won't mutilate the command line like we did before, it just won't be as pretty. v2: remember to chop off bits of the string that are too long Signed-off-by: Tycho Andersen --- src/lxc/utils.c | 70 +++-- 1 file changed, 58 insertions(+), 12 deletions(-) diff --git a/src/lxc/utils.c b/src/lxc/utils.c index 01774c0..d592243 100644 --- a/src/lxc/utils.c +++ b/src/lxc/utils.c @@ -1347,7 +1347,15 @@ int setproctitle(char *title) char buf[2048], *tmp; FILE *f; int i, len, ret = 0; - unsigned long arg_start, arg_end, env_start, env_end; + + /* We don't really need to know all of this stuff, but unfortunately +* PR_SET_MM_MAP requires us to set it all at once, so we have to +* figure it out anyway. +*/ + unsigned long start_data, end_data, start_brk, start_code, end_code, + start_stack, arg_start, arg_end, env_start, env_end, + brk_val; + struct prctl_mm_map prctl_map; f = fopen_cloexec("/proc/self/stat", "r"); if (!f) { @@ -1360,23 +1368,42 @@ int setproctitle(char *title) return -1; } - /* Skip the first 47 fields, column 48-51 are ARG_START and -* ARG_END. */ + /* Skip the first 25 fields, column 26-28 are start_code, end_code, +* and start_stack */ tmp = strchr(buf, ' '); - for (i = 0; i < 46; i++) { + for (i = 0; i < 24; i++) { if (!tmp) return -1; tmp = strchr(tmp+1, ' '); } - if (!tmp) return -1; - i = sscanf(tmp, "%lu %lu %lu %lu", &arg_start, &arg_end, &env_start, &env_end); - if (i != 4) { + i = sscanf(tmp, "%lu %lu %lu", &start_code, &end_code, &start_stack); + if (i != 3) return -1; + + /* Skip the next 19 fields, column 45-51 are start_data to arg_end */ + for (i = 0; i < 19; i++) { + if (!tmp) + return -1; + tmp = strchr(tmp+1, ' '); } + if (!tmp) + return -1; + + i = sscanf(tmp, "%lu %lu %lu %lu %lu %lu %lu", + &start_data, + &end_data, + &start_brk, + &arg_start, + &arg_end, + &env_start, + &env_end); + if (i != 7) + return -1; + /* Include the null byte here, because in the calculations below we * want to have room for it. */ len = strlen(title) + 1; @@ -1386,6 +1413,7 @@ int setproctitle(char *title) if (len > env_end - arg_start) { arg_end = env_end; len = env_end - arg_start; + title[len-1] = '\0'; } else { /* Only truncate the environment if we're actually going to * overwrite part of it. */ @@ -1402,12 +1430,30 @@ int setproctitle(char *title) } - strcpy((char*)arg_start, title); + brk_val = syscall(__NR_brk, 0); - ret |= prctl(PR_SET_MM, PR_SET_MM_ARG_START, arg_start, 0, 0); - ret |= prctl(PR_SET_MM, PR_SET_MM_ARG_END, arg_end, 0, 0); - ret |= prctl(PR_SET_MM, PR_SET_MM_ENV_START, env_start, 0, 0); - ret |= prctl(PR_SET_MM, PR_SET_MM_ENV_END, env_end, 0, 0); + prctl_map = (struct prctl_mm_map) { + .start_code = start_code, + .end_code = end_code, + .start_stack = start_stack, + .start_data = start_data, + .end_data = end_data, + .start_brk = start_brk, + .brk = brk_val, + .arg_start = arg_start, + .arg_end = arg_end, + .env_start = env_start, + .env_end = env_end, + .auxv = NULL, + .auxv_size = 0, + .exe_fd = -1, + }; + + ret = prctl(PR_SET_MM, PR_SET_MM_MAP, (long) &prctl_map, sizeof(prctl_map), 0); + if (ret == 0) + strcpy((char*)arg_start, title); + else + SYSERROR("setting cmdline failed"); return ret; } -- 2.5.0 ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] [PATCH] use PR_SET_MM_MAP instead of PR_SET_MM
PR_SET_MM_MAP can be called as non-root, which we are in the unprivileged (or nested) case. Also, let's not do the strcpy() for the new cmdline until after we're sure the prctl succeeded. This means that even if it does fail, we won't mutilate the command line like we did before, it just won't be as pretty. Signed-off-by: Tycho Andersen --- src/lxc/utils.c | 69 +++-- 1 file changed, 57 insertions(+), 12 deletions(-) diff --git a/src/lxc/utils.c b/src/lxc/utils.c index 01774c0..fc6bb41 100644 --- a/src/lxc/utils.c +++ b/src/lxc/utils.c @@ -1347,7 +1347,15 @@ int setproctitle(char *title) char buf[2048], *tmp; FILE *f; int i, len, ret = 0; - unsigned long arg_start, arg_end, env_start, env_end; + + /* We don't really need to know all of this stuff, but unfortunately +* PR_SET_MM_MAP requires us to set it all at once, so we have to +* figure it out anyway. +*/ + unsigned long start_data, end_data, start_brk, start_code, end_code, + start_stack, arg_start, arg_end, env_start, env_end, + brk_val; + struct prctl_mm_map prctl_map; f = fopen_cloexec("/proc/self/stat", "r"); if (!f) { @@ -1360,23 +1368,42 @@ int setproctitle(char *title) return -1; } - /* Skip the first 47 fields, column 48-51 are ARG_START and -* ARG_END. */ + /* Skip the first 25 fields, column 26-28 are start_code, end_code, +* and start_stack */ tmp = strchr(buf, ' '); - for (i = 0; i < 46; i++) { + for (i = 0; i < 24; i++) { if (!tmp) return -1; tmp = strchr(tmp+1, ' '); } - if (!tmp) return -1; - i = sscanf(tmp, "%lu %lu %lu %lu", &arg_start, &arg_end, &env_start, &env_end); - if (i != 4) { + i = sscanf(tmp, "%lu %lu %lu", &start_code, &end_code, &start_stack); + if (i != 3) return -1; + + /* Skip the next 19 fields, column 45-51 are start_data to arg_end */ + for (i = 0; i < 19; i++) { + if (!tmp) + return -1; + tmp = strchr(tmp+1, ' '); } + if (!tmp) + return -1; + + i = sscanf(tmp, "%lu %lu %lu %lu %lu %lu %lu", + &start_data, + &end_data, + &start_brk, + &arg_start, + &arg_end, + &env_start, + &env_end); + if (i != 7) + return -1; + /* Include the null byte here, because in the calculations below we * want to have room for it. */ len = strlen(title) + 1; @@ -1402,12 +1429,30 @@ int setproctitle(char *title) } - strcpy((char*)arg_start, title); + brk_val = syscall(__NR_brk, 0); - ret |= prctl(PR_SET_MM, PR_SET_MM_ARG_START, arg_start, 0, 0); - ret |= prctl(PR_SET_MM, PR_SET_MM_ARG_END, arg_end, 0, 0); - ret |= prctl(PR_SET_MM, PR_SET_MM_ENV_START, env_start, 0, 0); - ret |= prctl(PR_SET_MM, PR_SET_MM_ENV_END, env_end, 0, 0); + prctl_map = (struct prctl_mm_map) { + .start_code = start_code, + .end_code = end_code, + .start_stack = start_stack, + .start_data = start_data, + .end_data = end_data, + .start_brk = start_brk, + .brk = brk_val, + .arg_start = arg_start, + .arg_end = arg_end, + .env_start = env_start, + .env_end = env_end, + .auxv = NULL, + .auxv_size = 0, + .exe_fd = -1, + }; + + ret = prctl(PR_SET_MM, PR_SET_MM_MAP, (long) &prctl_map, sizeof(prctl_map), 0); + if (ret == 0) + strcpy((char*)arg_start, title); + else + SYSERROR("setting cmdline failed"); return ret; } -- 2.5.0 ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
Re: [lxc-devel] odd freezer cgroup behavior
On Wed, Oct 28, 2015 at 11:55:32AM +0100, Christian Brauner wrote: > On Wed, Oct 28, 2015 at 04:08:09PM +0900, Tycho Andersen wrote: > > Hi all, > > > > I'm seeing some (what seems to me to be) odd behavior, where only a > > task's init process is in its freezer cgroup: > > > > firedrill:~ sudo lxc-info -n proposed > > Name: proposed > > State: RUNNING > > PID:10959 > > IP: 10.0.3.176 > > IP: 10.0.4.1 > > CPU use:2.77 seconds > > BlkIO use: 0 bytes > > Memory use: 24.42 MiB > > KMem use: 0 bytes > > Link: vethM6Q3GG > > TX bytes: 1.01 KiB > > RX bytes: 3.66 KiB > > Total bytes: 4.67 KiB > > firedrill:~ cat /sys/fs/cgroup/freezer/lxc/proposed/tasks > > 10959 > > > > I vaguely recall something like this before, but I don't remember how it was > > resolved. Surely all of the tasks in the container should live in the > > freezer > > cgroup? > > > > Note that this doesn't seem to happen with LXD (even with privileged > > containers). Haven't had time to investigate further. > > Could this be by any chance related to a kernel bug? I don't see this > behaviour > at all (neither priv nor unpriv): I suppose so; I'm on, Linux firedrill 4.2.0-10-generic #12-Ubuntu SMP Tue Sep 15 19:43:01 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux (i.e. ubuntu wily with the latest kernel + lxc). It seems odd to me that the children aren't here, but perhaps I'm misunderstanding how freezer is supposed to work. Tycho > Archlinux > Kernel 4.2.5 > lxc 1.1.4 > lxcfs 0.11 > cgmanager 0.39 > ___ > lxc-devel mailing list > lxc-devel@lists.linuxcontainers.org > http://lists.linuxcontainers.org/listinfo/lxc-devel ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] odd freezer cgroup behavior
Hi all, I'm seeing some (what seems to me to be) odd behavior, where only a task's init process is in its freezer cgroup: firedrill:~ sudo lxc-info -n proposed Name: proposed State: RUNNING PID:10959 IP: 10.0.3.176 IP: 10.0.4.1 CPU use:2.77 seconds BlkIO use: 0 bytes Memory use: 24.42 MiB KMem use: 0 bytes Link: vethM6Q3GG TX bytes: 1.01 KiB RX bytes: 3.66 KiB Total bytes: 4.67 KiB firedrill:~ cat /sys/fs/cgroup/freezer/lxc/proposed/tasks 10959 I vaguely recall something like this before, but I don't remember how it was resolved. Surely all of the tasks in the container should live in the freezer cgroup? Note that this doesn't seem to happen with LXD (even with privileged containers). Haven't had time to investigate further. Thanks, Tycho ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] [PATCH] cmds: fix abstract socket length problem
Since we want to use null-terminated abstract sockets, let's compute the length of them correctly. Signed-off-by: Tycho Andersen --- src/lxc/commands.c | 14 -- 1 file changed, 12 insertions(+), 2 deletions(-) diff --git a/src/lxc/commands.c b/src/lxc/commands.c index b70ee72..a807da3 100644 --- a/src/lxc/commands.c +++ b/src/lxc/commands.c @@ -279,7 +279,12 @@ static int lxc_cmd(const char *name, struct lxc_cmd_rr *cmd, int *stopped, *stopped = 0; - len = sizeof(path)-1; + /* -2 here because this is an abstract unix socket so it needs a +* leading \0, and we null terminate, so it needs a trailing \0. +* Although null termination isn't required by the API, we do it anyway +* because we print the sockname out sometimes. +*/ + len = sizeof(path)-2; if (fill_sock_name(offset, len, name, lxcpath, hashed_sock_name)) return -1; @@ -972,7 +977,12 @@ int lxc_cmd_init(const char *name, struct lxc_handler *handler, char *offset = &path[1]; int len; - len = sizeof(path)-1; + /* -2 here because this is an abstract unix socket so it needs a +* leading \0, and we null terminate, so it needs a trailing \0. +* Although null termination isn't required by the API, we do it anyway +* because we print the sockname out sometimes. +*/ + len = sizeof(path)-2; if (fill_sock_name(offset, len, name, lxcpath, NULL)) return -1; -- 2.1.4 ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
Re: [lxc-devel] [PATCH 1/3] lxc-checkconfig: add some more config options
On Mon, Sep 14, 2015 at 02:17:03PM +, Serge Hallyn wrote: > Quoting Tycho Andersen (tycho.ander...@canonical.com): > > Here's some more config options that we do actually require to be able to > > boot containers. > > > > Signed-off-by: Tycho Andersen > > Acked-by: Serge E. Hallyn > > I wonder if it's worth augmenting the script to know about > missing, required, and a new optional flag. Things like c/r > and probably even ipv6 could be marked optional. Yeah, I thought about doing this but then I got lazy. I suppose we could just put (optional) after things that are, I can send a patch on top of this one to do that. Tycho > Probably not worth our time, though, as this script is mainly an > informative thing... > > > --- > > src/lxc/lxc-checkconfig.in | 19 +++ > > 1 file changed, 19 insertions(+) > > > > diff --git a/src/lxc/lxc-checkconfig.in b/src/lxc/lxc-checkconfig.in > > index a4c3ecb..247b70f 100644 > > --- a/src/lxc/lxc-checkconfig.in > > +++ b/src/lxc/lxc-checkconfig.in > > @@ -111,6 +111,25 @@ echo "--- Misc ---" > > echo -n "Veth pair device: " && is_enabled CONFIG_VETH > > echo -n "Macvlan: " && is_enabled CONFIG_MACVLAN > > echo -n "Vlan: " && is_enabled CONFIG_VLAN_8021Q > > +echo -n "Bridges: " && is_enabled CONFIG_BRIDGE > > +echo -n "Advanced netfilter: " && is_enabled CONFIG_NETFILTER_ADVANCED > > +echo -n "CONFIG_NF_NAT_IPV4: " && is_enabled CONFIG_NF_NAT_IPV4 > > +echo -n "CONFIG_NF_NAT_IPV6: " && is_enabled CONFIG_NF_NAT_IPV6 > > +echo -n "CONFIG_IP_NF_TARGET_MASQUERADE: " && is_enabled > > CONFIG_IP_NF_TARGET_MASQUERADE > > +echo -n "CONFIG_IP6_NF_TARGET_MASQUERADE: " && is_enabled > > CONFIG_IP6_NF_TARGET_MASQUERADE > > +echo -n "CONFIG_NETFILTER_XT_TARGET_CHECKSUM: " && is_enabled > > CONFIG_NETFILTER_XT_TARGET_CHECKSUM > > + > > +echo > > +echo "--- Checkpoint/Restore ---" > > +echo -n "checkpoint restore: " && is_enabled CONFIG_CHECKPOINT_RESTORE > > +echo -n "CONFIG_FHANDLE: " && is_enabled CONFIG_FHANDLE > > +echo -n "CONFIG_EVENTFD: " && is_enabled CONFIG_EVENTFD > > +echo -n "CONFIG_EPOLL: " && is_enabled CONFIG_EPOLL > > +echo -n "CONFIG_UNIX_DIAG: " && is_enabled CONFIG_UNIX_DIAG > > +echo -n "CONFIG_INET_DIAG: " && is_enabled CONFIG_INET_DIAG > > +echo -n "CONFIG_PACKET_DIAG: " && is_enabled CONFIG_PACKET_DIAG > > +echo -n "CONFIG_NETLINK_DIAG: " && is_enabled CONFIG_NETLINK_DIAG > > + > > echo -n "File capabilities: " && \ > > ( [ "${KVER_MAJOR}" = 2 ] && [ ${KVER_MINOR} -lt 33 ] && \ > > is_enabled CONFIG_SECURITY_FILE_CAPABILITIES ) || \ > > -- > > 2.1.4 > > > > ___ > > lxc-devel mailing list > > lxc-devel@lists.linuxcontainers.org > > http://lists.linuxcontainers.org/listinfo/lxc-devel > ___ > lxc-devel mailing list > lxc-devel@lists.linuxcontainers.org > http://lists.linuxcontainers.org/listinfo/lxc-devel ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] [PATCH 2/3] gitignore: add Korean man page output
Signed-off-by: Tycho Andersen --- .gitignore | 3 +++ 1 file changed, 3 insertions(+) diff --git a/.gitignore b/.gitignore index 35d89f2..9c60c92 100644 --- a/.gitignore +++ b/.gitignore @@ -135,6 +135,9 @@ doc/ja/*.5 doc/ja/*.7 doc/ja/legacy/*.1 doc/legacy/*.1 +doc/ko/*.1 +doc/ko/*.5 +doc/ko/*.7 doc/manpage.links doc/manpage.refs doc/api/html/* -- 2.1.4 ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] [PATCH 3/3] gitignore: add strange lxc@.service file
I have no idea what this file is, but the build system seems to be generating it, so let's ignore it. Signed-off-by: Tycho Andersen --- .gitignore | 1 + 1 file changed, 1 insertion(+) diff --git a/.gitignore b/.gitignore index 9c60c92..5ec52c5 100644 --- a/.gitignore +++ b/.gitignore @@ -122,6 +122,7 @@ config/init/common/lxc-net config/init/systemd/lxc-autostart-helper config/init/systemd/lxc-net.service config/init/systemd/lxc.service +config/init/systemd/lxc@.service config/init/sysvinit/lxc config/init/sysvinit/lxc-containers config/init/sysvinit/lxc-net -- 2.1.4 ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] [PATCH 1/3] lxc-checkconfig: add some more config options
Here's some more config options that we do actually require to be able to boot containers. Signed-off-by: Tycho Andersen --- src/lxc/lxc-checkconfig.in | 19 +++ 1 file changed, 19 insertions(+) diff --git a/src/lxc/lxc-checkconfig.in b/src/lxc/lxc-checkconfig.in index a4c3ecb..247b70f 100644 --- a/src/lxc/lxc-checkconfig.in +++ b/src/lxc/lxc-checkconfig.in @@ -111,6 +111,25 @@ echo "--- Misc ---" echo -n "Veth pair device: " && is_enabled CONFIG_VETH echo -n "Macvlan: " && is_enabled CONFIG_MACVLAN echo -n "Vlan: " && is_enabled CONFIG_VLAN_8021Q +echo -n "Bridges: " && is_enabled CONFIG_BRIDGE +echo -n "Advanced netfilter: " && is_enabled CONFIG_NETFILTER_ADVANCED +echo -n "CONFIG_NF_NAT_IPV4: " && is_enabled CONFIG_NF_NAT_IPV4 +echo -n "CONFIG_NF_NAT_IPV6: " && is_enabled CONFIG_NF_NAT_IPV6 +echo -n "CONFIG_IP_NF_TARGET_MASQUERADE: " && is_enabled CONFIG_IP_NF_TARGET_MASQUERADE +echo -n "CONFIG_IP6_NF_TARGET_MASQUERADE: " && is_enabled CONFIG_IP6_NF_TARGET_MASQUERADE +echo -n "CONFIG_NETFILTER_XT_TARGET_CHECKSUM: " && is_enabled CONFIG_NETFILTER_XT_TARGET_CHECKSUM + +echo +echo "--- Checkpoint/Restore ---" +echo -n "checkpoint restore: " && is_enabled CONFIG_CHECKPOINT_RESTORE +echo -n "CONFIG_FHANDLE: " && is_enabled CONFIG_FHANDLE +echo -n "CONFIG_EVENTFD: " && is_enabled CONFIG_EVENTFD +echo -n "CONFIG_EPOLL: " && is_enabled CONFIG_EPOLL +echo -n "CONFIG_UNIX_DIAG: " && is_enabled CONFIG_UNIX_DIAG +echo -n "CONFIG_INET_DIAG: " && is_enabled CONFIG_INET_DIAG +echo -n "CONFIG_PACKET_DIAG: " && is_enabled CONFIG_PACKET_DIAG +echo -n "CONFIG_NETLINK_DIAG: " && is_enabled CONFIG_NETLINK_DIAG + echo -n "File capabilities: " && \ ( [ "${KVER_MAJOR}" = 2 ] && [ ${KVER_MINOR} -lt 33 ] && \ is_enabled CONFIG_SECURITY_FILE_CAPABILITIES ) || \ -- 2.1.4 ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] [PATCH] c/r: enable tracefs
tracefs is a new filesystem that can be mounted by users. Only the options and fs name need to be passed to restore the state, so we can use criu's auto fs feature. Signed-off-by: Tycho Andersen --- src/lxc/criu.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/src/lxc/criu.c b/src/lxc/criu.c index e939b37..bd6ecac 100644 --- a/src/lxc/criu.c +++ b/src/lxc/criu.c @@ -49,7 +49,7 @@ lxc_log_define(lxc_criu, lxc); void exec_criu(struct criu_opts *opts) { char **argv, log[PATH_MAX]; - int static_args = 20, argc = 0, i, ret; + int static_args = 22, argc = 0, i, ret; int netnr = 0; struct lxc_list *it; @@ -60,7 +60,7 @@ void exec_criu(struct criu_opts *opts) * --manage-cgroups action-script foo.sh -D $(directory) \ * -o $(directory)/$(action).log --ext-mount-map auto * --enable-external-sharing --enable-external-masters -* --enable-fs hugetlbfs +* --enable-fs hugetlbfs --enable-fs tracefs * +1 for final NULL */ if (strcmp(opts->action, "dump") == 0) { @@ -122,6 +122,8 @@ void exec_criu(struct criu_opts *opts) DECLARE_ARG("--enable-external-masters"); DECLARE_ARG("--enable-fs"); DECLARE_ARG("hugetlbfs"); + DECLARE_ARG("--enable-fs"); + DECLARE_ARG("tracefs"); DECLARE_ARG("-D"); DECLARE_ARG(opts->directory); DECLARE_ARG("-o"); -- 2.1.4 ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
Re: [lxc-devel] [PATCH 2/3] c/r: get rid of dump_net_info()
On Wed, Aug 12, 2015 at 04:20:29PM +, Serge Hallyn wrote: > Quoting Tycho Andersen (tycho.ander...@canonical.com): > > On Wed, Aug 12, 2015 at 03:28:07PM +, Serge Hallyn wrote: > > > Quoting Tycho Andersen (tycho.ander...@canonical.com): > > > > On Wed, Aug 12, 2015 at 03:09:22PM +, Serge Hallyn wrote: > > > > > Quoting Tycho Andersen (tycho.ander...@canonical.com): > > > > > > On Wed, Aug 12, 2015 at 02:54:14PM +, Serge Hallyn wrote: > > > > > > > Quoting Tycho Andersen (tycho.ander...@canonical.com): > > > > > > > > This was originally used to propagate the bridge and veth names > > > > > > > > across > > > > > > > > hosts, but now we extract both from the container's config > > > > > > > > file, and > > > > > > > > > > > > > > Is that the right thing to do? > > > > > > > > > > > > I don't know, actually. I did it so that you could switch bridges > > > > > > across hosts (and because the code is considerably simpler); it does > > > > > > keep the MAC and all that the same, so in theory it should Just > > > > > > Work. > > > > > > Someone who knows more about networks can probably correct me if I'm > > > > > > wrong, though :) > > > > > > > > > > I do prefer simple, in general. > > > > > > > > > > My concern with this is that any change to the running container which > > > > > was not reflected in the config will not be retained. I wonder if it > > > > > is better to fix dump_net_info() to give the information you actually > > > > > want. > > > > > > > > Yeah, it is true that if you change the container-side veth's name > > > > it'll screw things up, so that's probably good to save. I think it is > > > > > > Or if you use lxc-device to add a new nic, or the container simply creates > > > a new veth pair internally for use by its application. > > > > So the second case should be handled correctly (there's nothing > > outward facing, so nothing can change relative to the config file). > > The first case could also cause problems since there won't > > necessarily be config entries for these devices. > > Oh, right - this is only for 'leaked' interfaces to the host, got it. > > So especially in that case my ack stands and let's go with this for now :) > > > > > useful to be able to change the host bridge the container is attached > > > > to, and in principle it shouldn't screw anything up. So I'm not quite > > > > sure how to deal with that case. > > > > > > > > (Note that none of this matters in LXD land, since LXD is managing all > > > > the configuration for you and you can't change it. It is an issue with > > > > lxc-checkpoint and the config file lxc, though.) > > > > > > Well if we simply don't know what the right thing is, then doing the > > > simplest thing for now is reasonable. > > > > Ok. I think there's some stuff here that could still bite people, but > > I'm not sure that it's a super high priority right now. I'll see about > > fixing it "soon" :) > > Well if you do that we need to first have a discussion about what exactly > migration is meant to be. In lxc, the 'running' container is one thing, > the configured container is something different. To the point that you > can have running containers with no config. So the question is what > are we trying to migrate. The running container, I guess. So really the > configuration stored on disk should be ignored. The configuration we > get from the API's save_config should be used. (Is it?) Right now it uses whatever the config in memory has set up. In the case of the lxc-checkpoint tool, this is the configuration on disk. In the case of LXD, it's whatever LXD sets before it calls ->checkpoint. An advantage of this behavior for lxc-checkpoint is that it allows users to change things like which bridge the container is on in their config file (in the LXD case, if the config was modified in LXD between the two hosts, then the bridge would change as well). If we keep this state in a separate location (i.e. the checkpoint directory), we won't be able to do this, which I think is not as nice. There are a few things that we probably should think about though, for example, what happens if the bridge in the configuration doesn't exist? Perhaps we should save the bridge name, but let the config file's version override it? Tycho ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
Re: [lxc-devel] [PATCH 2/3] c/r: get rid of dump_net_info()
On Wed, Aug 12, 2015 at 03:28:07PM +, Serge Hallyn wrote: > Quoting Tycho Andersen (tycho.ander...@canonical.com): > > On Wed, Aug 12, 2015 at 03:09:22PM +, Serge Hallyn wrote: > > > Quoting Tycho Andersen (tycho.ander...@canonical.com): > > > > On Wed, Aug 12, 2015 at 02:54:14PM +, Serge Hallyn wrote: > > > > > Quoting Tycho Andersen (tycho.ander...@canonical.com): > > > > > > This was originally used to propagate the bridge and veth names > > > > > > across > > > > > > hosts, but now we extract both from the container's config file, and > > > > > > > > > > Is that the right thing to do? > > > > > > > > I don't know, actually. I did it so that you could switch bridges > > > > across hosts (and because the code is considerably simpler); it does > > > > keep the MAC and all that the same, so in theory it should Just Work. > > > > Someone who knows more about networks can probably correct me if I'm > > > > wrong, though :) > > > > > > I do prefer simple, in general. > > > > > > My concern with this is that any change to the running container which > > > was not reflected in the config will not be retained. I wonder if it > > > is better to fix dump_net_info() to give the information you actually > > > want. > > > > Yeah, it is true that if you change the container-side veth's name > > it'll screw things up, so that's probably good to save. I think it is > > Or if you use lxc-device to add a new nic, or the container simply creates > a new veth pair internally for use by its application. So the second case should be handled correctly (there's nothing outward facing, so nothing can change relative to the config file). The first case could also cause problems since there won't necessarily be config entries for these devices. > > useful to be able to change the host bridge the container is attached > > to, and in principle it shouldn't screw anything up. So I'm not quite > > sure how to deal with that case. > > > > (Note that none of this matters in LXD land, since LXD is managing all > > the configuration for you and you can't change it. It is an issue with > > lxc-checkpoint and the config file lxc, though.) > > Well if we simply don't know what the right thing is, then doing the > simplest thing for now is reasonable. Ok. I think there's some stuff here that could still bite people, but I'm not sure that it's a super high priority right now. I'll see about fixing it "soon" :) Tycho ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
Re: [lxc-devel] [PATCH 2/3] c/r: get rid of dump_net_info()
On Wed, Aug 12, 2015 at 03:09:22PM +, Serge Hallyn wrote: > Quoting Tycho Andersen (tycho.ander...@canonical.com): > > On Wed, Aug 12, 2015 at 02:54:14PM +, Serge Hallyn wrote: > > > Quoting Tycho Andersen (tycho.ander...@canonical.com): > > > > This was originally used to propagate the bridge and veth names across > > > > hosts, but now we extract both from the container's config file, and > > > > > > Is that the right thing to do? > > > > I don't know, actually. I did it so that you could switch bridges > > across hosts (and because the code is considerably simpler); it does > > keep the MAC and all that the same, so in theory it should Just Work. > > Someone who knows more about networks can probably correct me if I'm > > wrong, though :) > > I do prefer simple, in general. > > My concern with this is that any change to the running container which > was not reflected in the config will not be retained. I wonder if it > is better to fix dump_net_info() to give the information you actually > want. Yeah, it is true that if you change the container-side veth's name it'll screw things up, so that's probably good to save. I think it is useful to be able to change the host bridge the container is attached to, and in principle it shouldn't screw anything up. So I'm not quite sure how to deal with that case. (Note that none of this matters in LXD land, since LXD is managing all the configuration for you and you can't change it. It is an issue with lxc-checkpoint and the config file lxc, though.) Tycho ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
Re: [lxc-devel] [PATCH 3/3] c/r: allow empty networks to be checkpointed/restored
On Wed, Aug 12, 2015 at 03:05:19PM +, Serge Hallyn wrote: > Quoting Tycho Andersen (tycho.ander...@canonical.com): > > Empty networks don't have anything (besides lo) for us to dump and restore, > > so we should allow these as well. > > > > Reported-by: Dietmar Maurer > > Signed-off-by: Tycho Andersen > > Acked-by: Serge E. Hallyn > > Do you know whether the distinction between none and empty will be > retained after restart? The distinction being that none => the container is in the host's net ns, and empty => the container is in its own net ns with no devices? That should be preserved. Is there something else? Tycho ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
Re: [lxc-devel] [PATCH 2/3] c/r: get rid of dump_net_info()
On Wed, Aug 12, 2015 at 02:54:14PM +, Serge Hallyn wrote: > Quoting Tycho Andersen (tycho.ander...@canonical.com): > > This was originally used to propagate the bridge and veth names across > > hosts, but now we extract both from the container's config file, and > > Is that the right thing to do? I don't know, actually. I did it so that you could switch bridges across hosts (and because the code is considerably simpler); it does keep the MAC and all that the same, so in theory it should Just Work. Someone who knows more about networks can probably correct me if I'm wrong, though :) Tycho ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] [PATCH 2/3] c/r: get rid of dump_net_info()
This was originally used to propagate the bridge and veth names across hosts, but now we extract both from the container's config file, and nothing reads the files that dump_net_info() writes, so let's just get rid of them. Signed-off-by: Tycho Andersen --- src/lxc/criu.c | 51 -- src/lxc/criu.h | 2 -- src/lxc/lxccontainer.c | 3 --- 3 files changed, 56 deletions(-) diff --git a/src/lxc/criu.c b/src/lxc/criu.c index e939b37..e1282e4 100644 --- a/src/lxc/criu.c +++ b/src/lxc/criu.c @@ -340,57 +340,6 @@ bool criu_ok(struct lxc_container *c) return true; } -bool dump_net_info(struct lxc_container *c, char *directory) -{ - int netnr; - struct lxc_list *it; - - netnr = 0; - lxc_list_for_each(it, &c->lxc_conf->network) { - char *veth = NULL, *bridge = NULL, veth_path[PATH_MAX], eth[128]; - struct lxc_netdev *n = it->elem; - bool has_error = true; - int pret; - - pret = snprintf(veth_path, PATH_MAX, "lxc.network.%d.veth.pair", netnr); - if (pret < 0 || pret >= PATH_MAX) - goto out; - - veth = c->get_running_config_item(c, veth_path); - if (!veth) { - /* criu_ok() checks that all interfaces are -* LXC_NET{VETH,NONE}, and VETHs should have this -* config */ - assert(n->type == LXC_NET_NONE); - break; - } - - bridge = c->get_running_config_item(c, veth_path); - if (!bridge) - goto out; - - pret = snprintf(veth_path, PATH_MAX, "%s/veth%d", directory, netnr); - if (pret < 0 || pret >= PATH_MAX || print_to_file(veth_path, veth) < 0) - goto out; - - if (n->name) { - if (strlen(n->name) >= 128) - goto out; - strncpy(eth, n->name, 128); - } else - sprintf(eth, "eth%d", netnr); - - has_error = false; -out: - free(veth); - free(bridge); - if (has_error) - return false; - } - - return true; -} - static bool restore_net_info(struct lxc_container *c) { struct lxc_list *it; diff --git a/src/lxc/criu.h b/src/lxc/criu.h index 1f65e47..df63625 100644 --- a/src/lxc/criu.h +++ b/src/lxc/criu.h @@ -61,8 +61,6 @@ void exec_criu(struct criu_opts *opts); * dump. */ bool criu_ok(struct lxc_container *c); -bool dump_net_info(struct lxc_container *c, char *directory); - // do_restore never returns, the calling process is used as the // monitor process. do_restore calls exit() if it fails. void do_restore(struct lxc_container *c, int pipe, char *directory, bool verbose); diff --git a/src/lxc/lxccontainer.c b/src/lxc/lxccontainer.c index 1c103e8..14ae796 100644 --- a/src/lxc/lxccontainer.c +++ b/src/lxc/lxccontainer.c @@ -3733,9 +3733,6 @@ static bool do_lxcapi_checkpoint(struct lxc_container *c, char *directory, bool return false; } - if (!dump_net_info(c, directory)) - return false; - pid = fork(); if (pid < 0) return false; -- 2.1.4 ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] [PATCH 3/3] c/r: allow empty networks to be checkpointed/restored
Empty networks don't have anything (besides lo) for us to dump and restore, so we should allow these as well. Reported-by: Dietmar Maurer Signed-off-by: Tycho Andersen --- src/lxc/criu.c | 14 +- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/src/lxc/criu.c b/src/lxc/criu.c index e1282e4..88dc8cd 100644 --- a/src/lxc/criu.c +++ b/src/lxc/criu.c @@ -164,6 +164,9 @@ void exec_criu(struct criu_opts *opts) char eth[128], *veth; struct lxc_netdev *n = it->elem; + if (n->type != LXC_NET_VETH) + continue; + if (n->name) { if (strlen(n->name) >= sizeof(eth)) goto err; @@ -304,7 +307,12 @@ bool criu_ok(struct lxc_container *c) /* We only know how to restore containers with veth networks. */ lxc_list_for_each(it, &c->lxc_conf->network) { struct lxc_netdev *n = it->elem; - if (n->type != LXC_NET_VETH && n->type != LXC_NET_NONE) { + switch(n->type) { + case LXC_NET_VETH: + case LXC_NET_NONE: + case LXC_NET_EMPTY: + break; + default: ERROR("Found network that is not VETH or NONE\n"); return false; } @@ -351,6 +359,10 @@ static bool restore_net_info(struct lxc_container *c) lxc_list_for_each(it, &c->lxc_conf->network) { struct lxc_netdev *netdev = it->elem; char template[IFNAMSIZ]; + + if (netdev->type != LXC_NET_VETH) + continue; + snprintf(template, sizeof(template), "vethXX"); if (!netdev->priv.veth_attr.pair) -- 2.1.4 ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] [PATCH 1/3] gitignore: add TAGS files
Somehow our `make tags` target generates TAGS and not tags, so let's ignore that too. Signed-off-by: Tycho Andersen --- .gitignore | 1 + 1 file changed, 1 insertion(+) diff --git a/.gitignore b/.gitignore index 3cb5bfb..35d89f2 100644 --- a/.gitignore +++ b/.gitignore @@ -150,5 +150,6 @@ patches *.orig *.rej tags +TAGS doc/api/doxygen_sqlite3.db -- 2.1.4 ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
Re: [lxc-devel] automatic user id mapping?
Hi Harald, On Thu, Jul 30, 2015 at 03:42:49PM +0200, Harald Dunkel wrote: > Hi folks, > > I need local resource limits inside each container. Problem: > UID conflicts between containers. Setting lxc.id_map is *highly* > painful, because it just moves the burden to the admin. > > An automatic solution to avoid UID and GID conflicts between > containers would be very helpful. Is there hope? Did I miss > something in the documentation? Right now with LXC/LXD that's your best bet. We do have per-container ID maps on the TODO list for LXD [1], and that will probably come with a nice way of allocating separate ranges if you like. [1]: https://github.com/lxc/lxd/issues/164 Tycho > > Regards > Harri > ___ > lxc-devel mailing list > lxc-devel@lists.linuxcontainers.org > http://lists.linuxcontainers.org/listinfo/lxc-devel ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
Re: [lxc-devel] [PATCH] lxc-checkpoint: add pre-checkpoint
On Tue, Jun 30, 2015 at 05:09:40PM +0300, Ruslan Kuprieiev wrote: > Hi Tycho, > > On 06/30/2015 04:50 PM, Tycho Andersen wrote: > >Hey Ruslan, > > > >On Fri, Jun 26, 2015 at 11:24:32AM +0300, Ruslan Kuprieiev wrote: > >>Drop this one, please. > >I'm assuming you're probably going to send another version at some > >point, a question below. > > > >>>diff --git a/src/lxc/lxccontainer.h b/src/lxc/lxccontainer.h > >>>index d60e19a..1faded2 100644 > >>>--- a/src/lxc/lxccontainer.h > >>>+++ b/src/lxc/lxccontainer.h > >>>@@ -773,7 +773,7 @@ struct lxc_container { > >>>* \return \c true on success, else \c false. > >>>* present at compile time). > >>>*/ > >>>- bool (*checkpoint)(struct lxc_container *c, char *directory, bool stop, > >>>bool verbose); > >>>+ bool (*checkpoint)(struct lxc_container *c, char *directory, char > >>>*prev_dir, bool stop, bool verbose); > >Here we're making an ABI change, and I'm not sure what the protocol in > >LXC for this (Stéphane or Serge can tell us I'm sure :). Whatever the > >case, we'll have to do some tap dancing here. It may (?) be worth > >turning this into an argument struct with version information to avoid > >this in the future, depending on how we solve this. > > Neither am I happy about changing abi in such a way. That's one of the > reasons > why I asked to drop this patch for now =). > > But looks like there is no other way but to change abi, as current arguments > for do_checkpoint do not satisfy our needs, if we want teach lxc-checkpoint > to do anything more advanced than just plain dump. Yes, unfortunately I didn't do a good job of future proofing the API. > Struct for options would be nice, I agree. But I actually thought about > using > libcriu's options to set\get options without adding any additional mediator > structures. Though, It would require modifying libcriu to actually return > void * > with options in it as well as adding criu_get-ers and fixing criu_set-ers to > be > able to pass options struct as an argument. I see, and then just porting liblxc to use libcriu and allowing users to pass in a pre-initialized opaque criu options pointer? That also seems reasonable to me. FWIW, I think the only way to avoid an ABI break here would be to implement ->checkpoint2(), which kind of sucks. I'm not sure how much work it is to bump the soname, but maybe that could be a temporary solution. If it's not too hard, then we could do the above now and not have to deal with it again. Tycho > >Anyway, I just thought I'd get the discussion started. > > > >Tycho > >___ > >lxc-devel mailing list > >lxc-devel@lists.linuxcontainers.org > >http://lists.linuxcontainers.org/listinfo/lxc-devel > > ___ > lxc-devel mailing list > lxc-devel@lists.linuxcontainers.org > http://lists.linuxcontainers.org/listinfo/lxc-devel ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
Re: [lxc-devel] [PATCH] lxc-checkpoint: add pre-checkpoint
Hey Ruslan, On Fri, Jun 26, 2015 at 11:24:32AM +0300, Ruslan Kuprieiev wrote: > Drop this one, please. I'm assuming you're probably going to send another version at some point, a question below. > >diff --git a/src/lxc/lxccontainer.h b/src/lxc/lxccontainer.h > >index d60e19a..1faded2 100644 > >--- a/src/lxc/lxccontainer.h > >+++ b/src/lxc/lxccontainer.h > >@@ -773,7 +773,7 @@ struct lxc_container { > > * \return \c true on success, else \c false. > > * present at compile time). > > */ > >-bool (*checkpoint)(struct lxc_container *c, char *directory, bool stop, > >bool verbose); > >+bool (*checkpoint)(struct lxc_container *c, char *directory, char > >*prev_dir, bool stop, bool verbose); Here we're making an ABI change, and I'm not sure what the protocol in LXC for this (Stéphane or Serge can tell us I'm sure :). Whatever the case, we'll have to do some tap dancing here. It may (?) be worth turning this into an argument struct with version information to avoid this in the future, depending on how we solve this. Anyway, I just thought I'd get the discussion started. Tycho ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
Re: [lxc-devel] live migrate and /dev/lxd
On Mon, Jun 22, 2015 at 06:41:16AM +0200, Dietmar Maurer wrote: > > On dump, this would probably just be having the plugin query LXD to > > make sure that all requests for the container were flushed so that > > CRIU would at least dump the socket buffers for the kernel. On > > restore, the plugin would tell /dev/lxd to reconnect in some sort of > > "previously connected" mode, and then hand that FD back to CRIU as the > > fd that the application sees when it is resumed. As long as LXD and > > the plugin know how to negotiate restoring the external resource's > > state, everything should Just Work. > > That may work, but sounds quite complex to me. All of migration is quite complex :) > So what kind of events > to you plan to sent to the container (what for)? Perhaps hardware hotplug events or cloud-init userdata updates or something; I'm not sure we've really thought about all the ways we plan to use it yet. Tycho ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
Re: [lxc-devel] live migrate and /dev/lxd
Hi Dietmar, On Sat, Jun 20, 2015 at 05:45:51PM +0200, Dietmar Maurer wrote: > Hi all, > > I just detected the /dev/lxd specification: > > https://github.com/lxc/lxd/blob/master/specs/dev-lxd.md > > and saw that event API: > > --- > /1.0/events > GET > > Description: event interface > Return: websocket upgrade (similar to /1.0/events on main API) > --- > > This makes it possible to open a websocket connection from > running container to the host? Yes. Note however that the initial version of /dev/lxd was just landed on Friday, and isn't in any released version of LXD yet and the events API is currently not implemented in the code that did land, so it may be a while before this problem is a reality. > I just wonder how you would live migrate such connection? Good question. CRIU has a plugin mechanism, which we can use to dump special "external" resources (i.e. resources connected to stuff outside the container's namespaces, unix sockets in this case). What we'd have to do is write a CRIU plugin to dump and restore the live connections to /dev/lxd. On dump, this would probably just be having the plugin query LXD to make sure that all requests for the container were flushed so that CRIU would at least dump the socket buffers for the kernel. On restore, the plugin would tell /dev/lxd to reconnect in some sort of "previously connected" mode, and then hand that FD back to CRIU as the fd that the application sees when it is resumed. As long as LXD and the plugin know how to negotiate restoring the external resource's state, everything should Just Work. Tycho ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
Re: [lxc-devel] [PATCH] daemonized start: exit children on failure, don't return
On Thu, Jun 11, 2015 at 04:11:48AM +, Serge Hallyn wrote: > When starting a daemonized container, only the original parent > thread should return to the caller. The first forked child > immediately exits after forking, but the grandparent child > was in some places returning on error - causing a second instance > of the calling function. > > Signed-off-by: Serge Hallyn Acked-by: Tycho Andersen > --- > src/lxc/lxccontainer.c | 10 +++--- > 1 file changed, 7 insertions(+), 3 deletions(-) > > diff --git a/src/lxc/lxccontainer.c b/src/lxc/lxccontainer.c > index 7708a8c..1c103e8 100644 > --- a/src/lxc/lxccontainer.c > +++ b/src/lxc/lxccontainer.c > @@ -712,19 +712,19 @@ static bool do_lxcapi_start(struct lxc_container *c, > int useinit, char * const a > pid = fork(); > if (pid < 0) { > SYSERROR("Error doing dual-fork"); > - return false; > + exit(1); > } > if (pid != 0) > exit(0); > /* like daemon(), chdir to / and redirect 0,1,2 to /dev/null */ > if (chdir("/")) { > SYSERROR("Error chdir()ing to /."); > - return false; > + exit(1); > } > lxc_check_inherited(conf, true, -1); > if (null_stdfds() < 0) { > ERROR("failed to close fds"); > - return false; > + exit(1); > } > setsid(); > } else { > @@ -742,6 +742,8 @@ static bool do_lxcapi_start(struct lxc_container *c, int > useinit, char * const a > if (pid_fp == NULL) { > SYSERROR("Failed to create pidfile '%s' for '%s'", >c->pidfile, c->name); > + if (daemonize) > + exit(1); > return false; > } > > @@ -749,6 +751,8 @@ static bool do_lxcapi_start(struct lxc_container *c, int > useinit, char * const a > SYSERROR("Failed to write '%s'", c->pidfile); > fclose(pid_fp); > pid_fp = NULL; > + if (daemonize) > + exit(1); > return false; > } > > -- > 2.1.4 > > ___ > lxc-devel mailing list > lxc-devel@lists.linuxcontainers.org > http://lists.linuxcontainers.org/listinfo/lxc-devel ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] [PATCH 4/4] uniformly nullify std fds
In various places throughout the code, we want to "nullify" the std fds, opening them to /dev/null or zero or so. Instead, let's unify this code and do it in such a way that Coverity (probably) won't complain. v2: use /dev/null for stdin as well v3: add a comment about use of C's short circuiting v4: axe comment, check errors on dup2, s/quiet/need_null_stdfds Reported-by: Coverity Signed-off-by: Tycho Andersen --- src/lxc/bdev.c | 8 ++-- src/lxc/lxccontainer.c | 21 +++-- src/lxc/monitor.c | 8 ++-- src/lxc/start.c| 10 ++ src/lxc/utils.c| 21 + src/lxc/utils.h| 1 + 6 files changed, 35 insertions(+), 34 deletions(-) diff --git a/src/lxc/bdev.c b/src/lxc/bdev.c index 53465b1..520652c 100644 --- a/src/lxc/bdev.c +++ b/src/lxc/bdev.c @@ -224,12 +224,8 @@ static int do_mkfs(const char *path, const char *fstype) // If the file is not a block device, we don't want mkfs to ask // us about whether to proceed. - close(0); - close(1); - close(2); - open("/dev/zero", O_RDONLY); - open("/dev/null", O_RDWR); - open("/dev/null", O_RDWR); + if (null_stdfds() < 0) + exit(1); execlp("mkfs", "mkfs", "-t", fstype, path, NULL); exit(1); } diff --git a/src/lxc/lxccontainer.c b/src/lxc/lxccontainer.c index 445cc22..7708a8c 100644 --- a/src/lxc/lxccontainer.c +++ b/src/lxc/lxccontainer.c @@ -722,12 +722,10 @@ static bool do_lxcapi_start(struct lxc_container *c, int useinit, char * const a return false; } lxc_check_inherited(conf, true, -1); - close(0); - close(1); - close(2); - open("/dev/zero", O_RDONLY); - open("/dev/null", O_RDWR); - open("/dev/null", O_RDWR); + if (null_stdfds() < 0) { + ERROR("failed to close fds"); + return false; + } setsid(); } else { if (!am_single_threaded()) { @@ -956,7 +954,7 @@ static char *lxcbasename(char *path) return p; } -static bool create_run_template(struct lxc_container *c, char *tpath, bool quiet, +static bool create_run_template(struct lxc_container *c, char *tpath, bool need_null_stdfds, char *const argv[]) { pid_t pid; @@ -978,13 +976,8 @@ static bool create_run_template(struct lxc_container *c, char *tpath, bool quiet char **newargv; struct lxc_conf *conf = c->lxc_conf; - if (quiet) { - close(0); - close(1); - close(2); - open("/dev/zero", O_RDONLY); - open("/dev/null", O_RDWR); - open("/dev/null", O_RDWR); + if (need_null_stdfds && null_stdfds() < 0) { + exit(1); } src = c->lxc_conf->rootfs.path; diff --git a/src/lxc/monitor.c b/src/lxc/monitor.c index 0e56f75..144b16e 100644 --- a/src/lxc/monitor.c +++ b/src/lxc/monitor.c @@ -329,12 +329,8 @@ int lxc_monitord_spawn(const char *lxcpath) exit(EXIT_FAILURE); } lxc_check_inherited(NULL, true, pipefd[1]); - close(0); - close(1); - close(2); - open("/dev/null", O_RDONLY); - open("/dev/null", O_RDWR); - open("/dev/null", O_RDWR); + if (null_stdfds() < 0) + exit(EXIT_FAILURE); close(pipefd[0]); sprintf(pipefd_str, "%d", pipefd[1]); execvp(args[0], args); diff --git a/src/lxc/start.c b/src/lxc/start.c index 71cd9ef..6eded61 100644 --- a/src/lxc/start.c +++ b/src/lxc/start.c @@ -762,14 +762,8 @@ static int do_start(void *data) close(handler->sigfd); - if (handler->backgrounded) { - close(0); - close(1); - close(2); - open("/dev/zero", O_RDONLY); - open("/dev/null", O_RDWR); - open("/dev/null", O_RDWR); - } + if (handler->backgrounded && null_stdfds() < 0) + goto out_warn_father; /* after this call, we are in error because this * ops should not return as it execs */ diff --git a/src/lxc/utils.c b/src/lxc/utils.c index 467bc1b..7ced314 100644 --- a/src/lxc/utils.c +++ b/src/lxc/utils.c @@ -1445,3 +1445,24 @@ domount: INFO("Mounted /proc in container for security transition"); return 1; } + +int null_stdfds(void) +{ + int fd, ret = -1; + +
Re: [lxc-devel] [PATCH v3 4/4] uniformly nullify std fds
On Wed, Jun 10, 2015 at 12:03:08PM +, Serge Hallyn wrote: > Quoting Tycho Andersen (tycho.ander...@canonical.com): > > In various places throughout the code, we want to "nullify" the std fds, > > opening them to /dev/null or zero or so. Instead, let's unify this code and > > do it in such a way that Coverity (probably) won't complain. > > > > v2: use /dev/null for stdin as well > > v3: add a comment about use of C's short circuiting > > > > Reported-by: Coverity > > Signed-off-by: Tycho Andersen > > Acked-by: Serge E. Hallyn > > Two comments below, please resend with my ack. > > > --- > > src/lxc/bdev.c | 8 ++-- > > src/lxc/lxccontainer.c | 24 +++- > > src/lxc/monitor.c | 8 ++-- > > src/lxc/start.c| 10 ++ > > src/lxc/utils.c| 16 > > src/lxc/utils.h| 1 + > > 6 files changed, 34 insertions(+), 33 deletions(-) > > > > diff --git a/src/lxc/bdev.c b/src/lxc/bdev.c > > index 53465b1..520652c 100644 > > --- a/src/lxc/bdev.c > > +++ b/src/lxc/bdev.c > > @@ -224,12 +224,8 @@ static int do_mkfs(const char *path, const char > > *fstype) > > > > // If the file is not a block device, we don't want mkfs to ask > > // us about whether to proceed. > > - close(0); > > - close(1); > > - close(2); > > - open("/dev/zero", O_RDONLY); > > - open("/dev/null", O_RDWR); > > - open("/dev/null", O_RDWR); > > + if (null_stdfds() < 0) > > + exit(1); > > execlp("mkfs", "mkfs", "-t", fstype, path, NULL); > > exit(1); > > } > > diff --git a/src/lxc/lxccontainer.c b/src/lxc/lxccontainer.c > > index 445cc22..a0dd2a2 100644 > > --- a/src/lxc/lxccontainer.c > > +++ b/src/lxc/lxccontainer.c > > @@ -722,12 +722,10 @@ static bool do_lxcapi_start(struct lxc_container *c, > > int useinit, char * const a > > return false; > > } > > lxc_check_inherited(conf, true, -1); > > - close(0); > > - close(1); > > - close(2); > > - open("/dev/zero", O_RDONLY); > > - open("/dev/null", O_RDWR); > > - open("/dev/null", O_RDWR); > > + if (null_stdfds() < 0) { > > + ERROR("failed to close fds"); > > + return false; > > + } > > setsid(); > > } else { > > if (!am_single_threaded()) { > > @@ -978,13 +976,13 @@ static bool create_run_template(struct lxc_container > > *c, char *tpath, bool quiet > > char **newargv; > > struct lxc_conf *conf = c->lxc_conf; > > > > - if (quiet) { > > - close(0); > > - close(1); > > - close(2); > > - open("/dev/zero", O_RDONLY); > > - open("/dev/null", O_RDWR); > > - open("/dev/null", O_RDWR); > > + /* > > +* Here, we're taking advantage of C's short circuiting of > > +* conditions: we should only fail if quiet is set and > > +* null_stdfds fails. > > +*/ > > No, please drop that comment ^. > > > + if (quiet && null_stdfds() < 0) { > > + exit(1); > > } > > > > src = c->lxc_conf->rootfs.path; > > diff --git a/src/lxc/monitor.c b/src/lxc/monitor.c > > index 0e56f75..144b16e 100644 > > --- a/src/lxc/monitor.c > > +++ b/src/lxc/monitor.c > > @@ -329,12 +329,8 @@ int lxc_monitord_spawn(const char *lxcpath) > > exit(EXIT_FAILURE); > > } > > lxc_check_inherited(NULL, true, pipefd[1]); > > - close(0); > > - close(1); > > - close(2); > > - open("/dev/null", O_RDONLY); > > - open("/dev/null", O_RDWR); > > - open("/dev/null", O_RDWR); > > + if (null_stdfds() < 0) > > + exit(EXIT_FAILURE); > > close(pipefd[0]); > > sprintf(pipefd_str, "%d", pipefd[1]); > > execvp(args[0], args); > > diff --git a/src/lxc/start.c b/src/lxc/start.c > > index 71cd9ef..6eded61 100644 > > --- a/src/lxc/start.c > > +++ b/src/lxc/start.c >
Re: [lxc-devel] [PATCH v3 4/4] uniformly nullify std fds
On Tue, Jun 09, 2015 at 07:51:07PM +0200, Robert Vogelgesang wrote: > Hi Tycho, > > thank you for the updated patch, but you missed my intention, see below. > > On Tue, Jun 09, 2015 at 10:09:28AM -0600, Tycho Andersen wrote: > > In various places throughout the code, we want to "nullify" the std fds, > > opening them to /dev/null or zero or so. Instead, let's unify this code and > > do it in such a way that Coverity (probably) won't complain. > > > > v2: use /dev/null for stdin as well > > v3: add a comment about use of C's short circuiting > > > > Reported-by: Coverity > > Signed-off-by: Tycho Andersen > > --- > > src/lxc/bdev.c | 8 ++-- > > src/lxc/lxccontainer.c | 24 +++- > > src/lxc/monitor.c | 8 ++-- > > src/lxc/start.c| 10 ++ > > src/lxc/utils.c| 16 > > src/lxc/utils.h| 1 + > > 6 files changed, 34 insertions(+), 33 deletions(-) > > > > diff --git a/src/lxc/bdev.c b/src/lxc/bdev.c > > index 53465b1..520652c 100644 > > --- a/src/lxc/bdev.c > > +++ b/src/lxc/bdev.c > > @@ -224,12 +224,8 @@ static int do_mkfs(const char *path, const char > > *fstype) > > > > // If the file is not a block device, we don't want mkfs to ask > > // us about whether to proceed. > > - close(0); > > - close(1); > > - close(2); > > - open("/dev/zero", O_RDONLY); > > - open("/dev/null", O_RDWR); > > - open("/dev/null", O_RDWR); > > + if (null_stdfds() < 0) > > + exit(1); > > execlp("mkfs", "mkfs", "-t", fstype, path, NULL); > > exit(1); > > } > > diff --git a/src/lxc/lxccontainer.c b/src/lxc/lxccontainer.c > > index 445cc22..a0dd2a2 100644 > > --- a/src/lxc/lxccontainer.c > > +++ b/src/lxc/lxccontainer.c > > @@ -722,12 +722,10 @@ static bool do_lxcapi_start(struct lxc_container *c, > > int useinit, char * const a > > return false; > > } > > lxc_check_inherited(conf, true, -1); > > - close(0); > > - close(1); > > - close(2); > > - open("/dev/zero", O_RDONLY); > > - open("/dev/null", O_RDWR); > > - open("/dev/null", O_RDWR); > > + if (null_stdfds() < 0) { > > + ERROR("failed to close fds"); > > + return false; > > + } > > setsid(); > > } else { > > if (!am_single_threaded()) { > > @@ -978,13 +976,13 @@ static bool create_run_template(struct lxc_container > > *c, char *tpath, bool quiet > > char **newargv; > > struct lxc_conf *conf = c->lxc_conf; > > > > - if (quiet) { > > - close(0); > > - close(1); > > - close(2); > > - open("/dev/zero", O_RDONLY); > > - open("/dev/null", O_RDWR); > > - open("/dev/null", O_RDWR); > > + /* > > +* Here, we're taking advantage of C's short circuiting of > > +* conditions: we should only fail if quiet is set and > > +* null_stdfds fails. > > +*/ > > + if (quiet && null_stdfds() < 0) { > > + exit(1); > > My concern is not about someone not understanding short circuiting. > There should be a warning against reversing the order of the two parts. > > Short circuiting is rather common, but that "quiet" means to close > fds, is unusual and not obvious. > > If someone would change that to > > > + if (null_stdfds() < 0 && quiet) { > > then null_stdfds() is called independently of whether quiet is set or > not, and that would clearly be a change of behaviour. This could lead > to lost error messages in cases when someone is actually looking > for such messages. > > Thank you for your patience with a nitpicker. :-) Can you write exactly what you'd like the comment to say? I think this is a fairly common C idiom, so I doubt someone would switch the order. Tycho > Robert > > > > } > > > > src = c->lxc_conf->rootfs.path; > > diff --git a/src/lxc/monitor.c b/src/lxc/monitor.c > > index 0e56f75..144b16e 1006
[lxc-devel] [PATCH v2 4/4] uniformly nullify std fds
In various places throughout the code, we want to "nullify" the std fds, opening them to /dev/null or zero or so. Instead, let's unify this code and do it in such a way that Coverity (probably) won't complain. v2: use /dev/null for stdin as well Reported-by: Coverity Signed-off-by: Tycho Andersen --- src/lxc/bdev.c | 8 ++-- src/lxc/lxccontainer.c | 19 ++- src/lxc/monitor.c | 8 ++-- src/lxc/start.c| 10 ++ src/lxc/utils.c| 16 src/lxc/utils.h| 1 + 6 files changed, 29 insertions(+), 33 deletions(-) diff --git a/src/lxc/bdev.c b/src/lxc/bdev.c index 53465b1..520652c 100644 --- a/src/lxc/bdev.c +++ b/src/lxc/bdev.c @@ -224,12 +224,8 @@ static int do_mkfs(const char *path, const char *fstype) // If the file is not a block device, we don't want mkfs to ask // us about whether to proceed. - close(0); - close(1); - close(2); - open("/dev/zero", O_RDONLY); - open("/dev/null", O_RDWR); - open("/dev/null", O_RDWR); + if (null_stdfds() < 0) + exit(1); execlp("mkfs", "mkfs", "-t", fstype, path, NULL); exit(1); } diff --git a/src/lxc/lxccontainer.c b/src/lxc/lxccontainer.c index 445cc22..18fceeb 100644 --- a/src/lxc/lxccontainer.c +++ b/src/lxc/lxccontainer.c @@ -722,12 +722,10 @@ static bool do_lxcapi_start(struct lxc_container *c, int useinit, char * const a return false; } lxc_check_inherited(conf, true, -1); - close(0); - close(1); - close(2); - open("/dev/zero", O_RDONLY); - open("/dev/null", O_RDWR); - open("/dev/null", O_RDWR); + if (null_stdfds() < 0) { + ERROR("failed to close fds"); + return false; + } setsid(); } else { if (!am_single_threaded()) { @@ -978,13 +976,8 @@ static bool create_run_template(struct lxc_container *c, char *tpath, bool quiet char **newargv; struct lxc_conf *conf = c->lxc_conf; - if (quiet) { - close(0); - close(1); - close(2); - open("/dev/zero", O_RDONLY); - open("/dev/null", O_RDWR); - open("/dev/null", O_RDWR); + if (quiet && null_stdfds() < 0) { + exit(1); } src = c->lxc_conf->rootfs.path; diff --git a/src/lxc/monitor.c b/src/lxc/monitor.c index 0e56f75..144b16e 100644 --- a/src/lxc/monitor.c +++ b/src/lxc/monitor.c @@ -329,12 +329,8 @@ int lxc_monitord_spawn(const char *lxcpath) exit(EXIT_FAILURE); } lxc_check_inherited(NULL, true, pipefd[1]); - close(0); - close(1); - close(2); - open("/dev/null", O_RDONLY); - open("/dev/null", O_RDWR); - open("/dev/null", O_RDWR); + if (null_stdfds() < 0) + exit(EXIT_FAILURE); close(pipefd[0]); sprintf(pipefd_str, "%d", pipefd[1]); execvp(args[0], args); diff --git a/src/lxc/start.c b/src/lxc/start.c index 71cd9ef..6eded61 100644 --- a/src/lxc/start.c +++ b/src/lxc/start.c @@ -762,14 +762,8 @@ static int do_start(void *data) close(handler->sigfd); - if (handler->backgrounded) { - close(0); - close(1); - close(2); - open("/dev/zero", O_RDONLY); - open("/dev/null", O_RDWR); - open("/dev/null", O_RDWR); - } + if (handler->backgrounded && null_stdfds() < 0) + goto out_warn_father; /* after this call, we are in error because this * ops should not return as it execs */ diff --git a/src/lxc/utils.c b/src/lxc/utils.c index 467bc1b..42dd38f 100644 --- a/src/lxc/utils.c +++ b/src/lxc/utils.c @@ -1445,3 +1445,19 @@ domount: INFO("Mounted /proc in container for security transition"); return 1; } + +int null_stdfds(void) +{ + int fd; + + fd = open("/dev/null", O_RDWR); + if (fd < 0) + return -1; + + dup2(fd, 0); + dup2(fd, 1); + dup2(fd, 2); + close(fd); + + return 0; +} diff --git a/src/lxc/utils.h b/src/lxc/utils.h index 6bd05e0..ee12dde 100644 --- a/src/lxc/utils.h +++ b/src/lxc/utils.h @@ -280,4 +280,5 @@ int is_dir(const char *path); char *get_template_path(const char *t); int setproctitle(char *title); int mount_proc_if_needed(const char *rootfs); +int null_stdfds(void); #endif /* __LXC_UTILS_H */ -- 2.1.4 ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] [PATCH v3 4/4] uniformly nullify std fds
In various places throughout the code, we want to "nullify" the std fds, opening them to /dev/null or zero or so. Instead, let's unify this code and do it in such a way that Coverity (probably) won't complain. v2: use /dev/null for stdin as well v3: add a comment about use of C's short circuiting Reported-by: Coverity Signed-off-by: Tycho Andersen --- src/lxc/bdev.c | 8 ++-- src/lxc/lxccontainer.c | 24 +++- src/lxc/monitor.c | 8 ++-- src/lxc/start.c| 10 ++ src/lxc/utils.c| 16 src/lxc/utils.h| 1 + 6 files changed, 34 insertions(+), 33 deletions(-) diff --git a/src/lxc/bdev.c b/src/lxc/bdev.c index 53465b1..520652c 100644 --- a/src/lxc/bdev.c +++ b/src/lxc/bdev.c @@ -224,12 +224,8 @@ static int do_mkfs(const char *path, const char *fstype) // If the file is not a block device, we don't want mkfs to ask // us about whether to proceed. - close(0); - close(1); - close(2); - open("/dev/zero", O_RDONLY); - open("/dev/null", O_RDWR); - open("/dev/null", O_RDWR); + if (null_stdfds() < 0) + exit(1); execlp("mkfs", "mkfs", "-t", fstype, path, NULL); exit(1); } diff --git a/src/lxc/lxccontainer.c b/src/lxc/lxccontainer.c index 445cc22..a0dd2a2 100644 --- a/src/lxc/lxccontainer.c +++ b/src/lxc/lxccontainer.c @@ -722,12 +722,10 @@ static bool do_lxcapi_start(struct lxc_container *c, int useinit, char * const a return false; } lxc_check_inherited(conf, true, -1); - close(0); - close(1); - close(2); - open("/dev/zero", O_RDONLY); - open("/dev/null", O_RDWR); - open("/dev/null", O_RDWR); + if (null_stdfds() < 0) { + ERROR("failed to close fds"); + return false; + } setsid(); } else { if (!am_single_threaded()) { @@ -978,13 +976,13 @@ static bool create_run_template(struct lxc_container *c, char *tpath, bool quiet char **newargv; struct lxc_conf *conf = c->lxc_conf; - if (quiet) { - close(0); - close(1); - close(2); - open("/dev/zero", O_RDONLY); - open("/dev/null", O_RDWR); - open("/dev/null", O_RDWR); + /* +* Here, we're taking advantage of C's short circuiting of +* conditions: we should only fail if quiet is set and +* null_stdfds fails. +*/ + if (quiet && null_stdfds() < 0) { + exit(1); } src = c->lxc_conf->rootfs.path; diff --git a/src/lxc/monitor.c b/src/lxc/monitor.c index 0e56f75..144b16e 100644 --- a/src/lxc/monitor.c +++ b/src/lxc/monitor.c @@ -329,12 +329,8 @@ int lxc_monitord_spawn(const char *lxcpath) exit(EXIT_FAILURE); } lxc_check_inherited(NULL, true, pipefd[1]); - close(0); - close(1); - close(2); - open("/dev/null", O_RDONLY); - open("/dev/null", O_RDWR); - open("/dev/null", O_RDWR); + if (null_stdfds() < 0) + exit(EXIT_FAILURE); close(pipefd[0]); sprintf(pipefd_str, "%d", pipefd[1]); execvp(args[0], args); diff --git a/src/lxc/start.c b/src/lxc/start.c index 71cd9ef..6eded61 100644 --- a/src/lxc/start.c +++ b/src/lxc/start.c @@ -762,14 +762,8 @@ static int do_start(void *data) close(handler->sigfd); - if (handler->backgrounded) { - close(0); - close(1); - close(2); - open("/dev/zero", O_RDONLY); - open("/dev/null", O_RDWR); - open("/dev/null", O_RDWR); - } + if (handler->backgrounded && null_stdfds() < 0) + goto out_warn_father; /* after this call, we are in error because this * ops should not return as it execs */ diff --git a/src/lxc/utils.c b/src/lxc/utils.c index 467bc1b..42dd38f 100644 --- a/src/lxc/utils.c +++ b/src/lxc/utils.c @@ -1445,3 +1445,19 @@ domount: INFO("Mounted /proc in container for security transition"); return 1; } + +int null_stdfds(void) +{ + int fd; + + fd = open("/dev/null", O_RDWR); + if (fd < 0) + return -1; + + dup2(fd, 0); + dup2(fd, 1); + dup2(fd, 2); + close(fd); + +
Re: [lxc-devel] [PATCH v2 4/4] uniformly nullify std fds
On Tue, Jun 09, 2015 at 10:04:21AM -0600, Tycho Andersen wrote: > In various places throughout the code, we want to "nullify" the std fds, > opening them to /dev/null or zero or so. Instead, let's unify this code and > do it in such a way that Coverity (probably) won't complain. > > v2: use /dev/null for stdin as well Oh, whoops, I realized I forgot the extra comment... > Reported-by: Coverity > Signed-off-by: Tycho Andersen > --- > src/lxc/bdev.c | 8 ++-- > src/lxc/lxccontainer.c | 19 ++- > src/lxc/monitor.c | 8 ++-- > src/lxc/start.c| 10 ++ > src/lxc/utils.c| 16 > src/lxc/utils.h| 1 + > 6 files changed, 29 insertions(+), 33 deletions(-) > > diff --git a/src/lxc/bdev.c b/src/lxc/bdev.c > index 53465b1..520652c 100644 > --- a/src/lxc/bdev.c > +++ b/src/lxc/bdev.c > @@ -224,12 +224,8 @@ static int do_mkfs(const char *path, const char *fstype) > > // If the file is not a block device, we don't want mkfs to ask > // us about whether to proceed. > - close(0); > - close(1); > - close(2); > - open("/dev/zero", O_RDONLY); > - open("/dev/null", O_RDWR); > - open("/dev/null", O_RDWR); > + if (null_stdfds() < 0) > + exit(1); > execlp("mkfs", "mkfs", "-t", fstype, path, NULL); > exit(1); > } > diff --git a/src/lxc/lxccontainer.c b/src/lxc/lxccontainer.c > index 445cc22..18fceeb 100644 > --- a/src/lxc/lxccontainer.c > +++ b/src/lxc/lxccontainer.c > @@ -722,12 +722,10 @@ static bool do_lxcapi_start(struct lxc_container *c, > int useinit, char * const a > return false; > } > lxc_check_inherited(conf, true, -1); > - close(0); > - close(1); > - close(2); > - open("/dev/zero", O_RDONLY); > - open("/dev/null", O_RDWR); > - open("/dev/null", O_RDWR); > + if (null_stdfds() < 0) { > + ERROR("failed to close fds"); > + return false; > + } > setsid(); > } else { > if (!am_single_threaded()) { > @@ -978,13 +976,8 @@ static bool create_run_template(struct lxc_container *c, > char *tpath, bool quiet > char **newargv; > struct lxc_conf *conf = c->lxc_conf; > > - if (quiet) { > - close(0); > - close(1); > - close(2); > - open("/dev/zero", O_RDONLY); > - open("/dev/null", O_RDWR); > - open("/dev/null", O_RDWR); > + if (quiet && null_stdfds() < 0) { > + exit(1); > } > > src = c->lxc_conf->rootfs.path; > diff --git a/src/lxc/monitor.c b/src/lxc/monitor.c > index 0e56f75..144b16e 100644 > --- a/src/lxc/monitor.c > +++ b/src/lxc/monitor.c > @@ -329,12 +329,8 @@ int lxc_monitord_spawn(const char *lxcpath) > exit(EXIT_FAILURE); > } > lxc_check_inherited(NULL, true, pipefd[1]); > - close(0); > - close(1); > - close(2); > - open("/dev/null", O_RDONLY); > - open("/dev/null", O_RDWR); > - open("/dev/null", O_RDWR); > + if (null_stdfds() < 0) > + exit(EXIT_FAILURE); > close(pipefd[0]); > sprintf(pipefd_str, "%d", pipefd[1]); > execvp(args[0], args); > diff --git a/src/lxc/start.c b/src/lxc/start.c > index 71cd9ef..6eded61 100644 > --- a/src/lxc/start.c > +++ b/src/lxc/start.c > @@ -762,14 +762,8 @@ static int do_start(void *data) > > close(handler->sigfd); > > - if (handler->backgrounded) { > - close(0); > - close(1); > - close(2); > - open("/dev/zero", O_RDONLY); > - open("/dev/null", O_RDWR); > - open("/dev/null", O_RDWR); > - } > + if (handler->backgrounded && null_stdfds() < 0) > + goto out_warn_father; > > /* after this call, we are in error because this >* ops should not return as it execs */ > diff --git a/src/lxc/utils.c b/src/lxc/utils.c > index 467bc1b..42dd38f 100644 > --- a/src/lxc/utils.c > +++ b/src/lxc/utils.c > @@ -1445,3 +1445,19 @@ domount: > INFO("Mounted /proc in container for sec
Re: [lxc-devel] [PATCH 4/4] uniformly nullify std fds
On Tue, Jun 09, 2015 at 11:48:05AM +0200, Robert Vogelgesang wrote: > Hi, > > On Mon, Jun 08, 2015 at 07:59:54PM -0600, Tycho Andersen wrote: > > In various places throughout the code, we want to "nullify" the std fds, > > opening them to /dev/null or zero or so. Instead, let's unify this code and > > do it in such a way that Coverity (probably) won't complain. > > > > Reported-by: Coverity > > Signed-off-by: Tycho Andersen > > --- > > src/lxc/bdev.c | 8 ++-- > > src/lxc/lxccontainer.c | 19 ++- > > src/lxc/monitor.c | 8 ++-- > > src/lxc/start.c| 10 ++ > > src/lxc/utils.c| 22 ++ > > src/lxc/utils.h| 1 + > > 6 files changed, 35 insertions(+), 33 deletions(-) > > > > diff --git a/src/lxc/bdev.c b/src/lxc/bdev.c > > index 53465b1..520652c 100644 > > --- a/src/lxc/bdev.c > > +++ b/src/lxc/bdev.c > > @@ -224,12 +224,8 @@ static int do_mkfs(const char *path, const char > > *fstype) > > > > // If the file is not a block device, we don't want mkfs to ask > > // us about whether to proceed. > > - close(0); > > - close(1); > > - close(2); > > - open("/dev/zero", O_RDONLY); > > - open("/dev/null", O_RDWR); > > - open("/dev/null", O_RDWR); > > + if (null_stdfds() < 0) > > + exit(1); > > execlp("mkfs", "mkfs", "-t", fstype, path, NULL); > > exit(1); > > } > > diff --git a/src/lxc/lxccontainer.c b/src/lxc/lxccontainer.c > > index 445cc22..18fceeb 100644 > > --- a/src/lxc/lxccontainer.c > > +++ b/src/lxc/lxccontainer.c > > @@ -722,12 +722,10 @@ static bool do_lxcapi_start(struct lxc_container *c, > > int useinit, char * const a > > return false; > > } > > lxc_check_inherited(conf, true, -1); > > - close(0); > > - close(1); > > - close(2); > > - open("/dev/zero", O_RDONLY); > > - open("/dev/null", O_RDWR); > > - open("/dev/null", O_RDWR); > > + if (null_stdfds() < 0) { > > + ERROR("failed to close fds"); > > + return false; > > + } > > setsid(); > > } else { > > if (!am_single_threaded()) { > > @@ -978,13 +976,8 @@ static bool create_run_template(struct lxc_container > > *c, char *tpath, bool quiet > > char **newargv; > > struct lxc_conf *conf = c->lxc_conf; > > > > - if (quiet) { > > - close(0); > > - close(1); > > - close(2); > > - open("/dev/zero", O_RDONLY); > > - open("/dev/null", O_RDWR); > > - open("/dev/null", O_RDWR); > > + if (quiet && null_stdfds() < 0) { > > + exit(1); > > Here, maybe a comment should be added to say that "quiet" means we > should close the fds by calling null_stdfds(), and not the other way > around, that we should exit if null_stdfds() fails. Alternatively I can nest the ifs, but doesn't the order of the arguments in the && imply this? > > } > > > > src = c->lxc_conf->rootfs.path; > > diff --git a/src/lxc/monitor.c b/src/lxc/monitor.c > > index 0e56f75..144b16e 100644 > > --- a/src/lxc/monitor.c > > +++ b/src/lxc/monitor.c > > @@ -329,12 +329,8 @@ int lxc_monitord_spawn(const char *lxcpath) > > exit(EXIT_FAILURE); > > } > > lxc_check_inherited(NULL, true, pipefd[1]); > > - close(0); > > - close(1); > > - close(2); > > - open("/dev/null", O_RDONLY); > > - open("/dev/null", O_RDWR); > > - open("/dev/null", O_RDWR); > > + if (null_stdfds() < 0) > > + exit(EXIT_FAILURE); > > close(pipefd[0]); > > sprintf(pipefd_str, "%d", pipefd[1]); > > execvp(args[0], args); > > diff --git a/src/lxc/start.c b/src/lxc/start.c > > index 71cd9ef..6eded61 100644 > > --- a/src/lxc/start.c > > +++ b/src/lxc/start.c > > @@ -762,14 +762,8 @@ static int do_start(void *data) > > > > close(handler->sigfd); > > > > - if (handler->backgrounded) { > > -
[lxc-devel] [PATCH 1/4] c/r: use fclose instead of close
We're leaking the FILE* here while closing the underlying fd; let's just close the file and thus close both. Reported-by: Coverity Signed-off-by: Tycho Andersen --- src/lxc/criu.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/lxc/criu.c b/src/lxc/criu.c index 667f5d9..adcc626 100644 --- a/src/lxc/criu.c +++ b/src/lxc/criu.c @@ -279,11 +279,11 @@ static bool criu_version_ok() goto version_error; version_match: - close(pipes[0]); + fclose(f); return true; version_error: - close(pipes[0]); + fclose(f); ERROR("must have criu " CRIU_VERSION " or greater to checkpoint/restore\n"); return false; } -- 2.1.4 ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] [PATCH 2/4] c/r: remove unused variable mnts
Reported-by: Coverity Signed-off-by: Tycho Andersen --- src/lxc/criu.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/src/lxc/criu.c b/src/lxc/criu.c index adcc626..e939b37 100644 --- a/src/lxc/criu.c +++ b/src/lxc/criu.c @@ -54,7 +54,6 @@ void exec_criu(struct criu_opts *opts) struct lxc_list *it; char buf[4096]; - FILE *mnts = NULL; /* The command line always looks like: * criu $(action) --tcp-established --file-locks --link-remap --force-irmap \ @@ -189,8 +188,6 @@ void exec_criu(struct criu_opts *opts) #undef DECLARE_ARG execv(argv[0], argv); err: - if (mnts) - fclose(mnts); for (i = 0; argv[i]; i++) free(argv[i]); free(argv); -- 2.1.4 ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] [PATCH 4/4] uniformly nullify std fds
In various places throughout the code, we want to "nullify" the std fds, opening them to /dev/null or zero or so. Instead, let's unify this code and do it in such a way that Coverity (probably) won't complain. Reported-by: Coverity Signed-off-by: Tycho Andersen --- src/lxc/bdev.c | 8 ++-- src/lxc/lxccontainer.c | 19 ++- src/lxc/monitor.c | 8 ++-- src/lxc/start.c| 10 ++ src/lxc/utils.c| 22 ++ src/lxc/utils.h| 1 + 6 files changed, 35 insertions(+), 33 deletions(-) diff --git a/src/lxc/bdev.c b/src/lxc/bdev.c index 53465b1..520652c 100644 --- a/src/lxc/bdev.c +++ b/src/lxc/bdev.c @@ -224,12 +224,8 @@ static int do_mkfs(const char *path, const char *fstype) // If the file is not a block device, we don't want mkfs to ask // us about whether to proceed. - close(0); - close(1); - close(2); - open("/dev/zero", O_RDONLY); - open("/dev/null", O_RDWR); - open("/dev/null", O_RDWR); + if (null_stdfds() < 0) + exit(1); execlp("mkfs", "mkfs", "-t", fstype, path, NULL); exit(1); } diff --git a/src/lxc/lxccontainer.c b/src/lxc/lxccontainer.c index 445cc22..18fceeb 100644 --- a/src/lxc/lxccontainer.c +++ b/src/lxc/lxccontainer.c @@ -722,12 +722,10 @@ static bool do_lxcapi_start(struct lxc_container *c, int useinit, char * const a return false; } lxc_check_inherited(conf, true, -1); - close(0); - close(1); - close(2); - open("/dev/zero", O_RDONLY); - open("/dev/null", O_RDWR); - open("/dev/null", O_RDWR); + if (null_stdfds() < 0) { + ERROR("failed to close fds"); + return false; + } setsid(); } else { if (!am_single_threaded()) { @@ -978,13 +976,8 @@ static bool create_run_template(struct lxc_container *c, char *tpath, bool quiet char **newargv; struct lxc_conf *conf = c->lxc_conf; - if (quiet) { - close(0); - close(1); - close(2); - open("/dev/zero", O_RDONLY); - open("/dev/null", O_RDWR); - open("/dev/null", O_RDWR); + if (quiet && null_stdfds() < 0) { + exit(1); } src = c->lxc_conf->rootfs.path; diff --git a/src/lxc/monitor.c b/src/lxc/monitor.c index 0e56f75..144b16e 100644 --- a/src/lxc/monitor.c +++ b/src/lxc/monitor.c @@ -329,12 +329,8 @@ int lxc_monitord_spawn(const char *lxcpath) exit(EXIT_FAILURE); } lxc_check_inherited(NULL, true, pipefd[1]); - close(0); - close(1); - close(2); - open("/dev/null", O_RDONLY); - open("/dev/null", O_RDWR); - open("/dev/null", O_RDWR); + if (null_stdfds() < 0) + exit(EXIT_FAILURE); close(pipefd[0]); sprintf(pipefd_str, "%d", pipefd[1]); execvp(args[0], args); diff --git a/src/lxc/start.c b/src/lxc/start.c index 71cd9ef..6eded61 100644 --- a/src/lxc/start.c +++ b/src/lxc/start.c @@ -762,14 +762,8 @@ static int do_start(void *data) close(handler->sigfd); - if (handler->backgrounded) { - close(0); - close(1); - close(2); - open("/dev/zero", O_RDONLY); - open("/dev/null", O_RDWR); - open("/dev/null", O_RDWR); - } + if (handler->backgrounded && null_stdfds() < 0) + goto out_warn_father; /* after this call, we are in error because this * ops should not return as it execs */ diff --git a/src/lxc/utils.c b/src/lxc/utils.c index 467bc1b..da9548f 100644 --- a/src/lxc/utils.c +++ b/src/lxc/utils.c @@ -1445,3 +1445,25 @@ domount: INFO("Mounted /proc in container for security transition"); return 1; } + +int null_stdfds(void) +{ + int fd; + + fd = open("/dev/zero", O_RDONLY); + if (fd < 0) + return -1; + + dup2(fd, 0); + close(fd); + + fd = open("/dev/null", O_WRONLY); + if (fd < 0) + return -1; + + dup2(fd, 1); + dup2(fd, 2); + close(fd); + + return 0; +} diff --git a/src/lxc/utils.h b/src/lxc/utils.h index 6bd05e0..ee12dde 100644 --- a/src/lxc/utils.h +++ b/src/lxc/utils.h @@ -280,4 +280,5 @@ int is_dir(const char *path); char *get_template_p
[lxc-devel] [PATCH 3/4] move utils.h #endif to end of file
Signed-off-by: Tycho Andersen --- src/lxc/utils.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/lxc/utils.h b/src/lxc/utils.h index e9e07d9..6bd05e0 100644 --- a/src/lxc/utils.h +++ b/src/lxc/utils.h @@ -268,7 +268,6 @@ extern bool dir_exists(const char *path); #define FNV1A_64_INIT ((uint64_t)0xcbf29ce484222325ULL) uint64_t fnv_64a_buf(void *buf, size_t len, uint64_t hval); -#endif int detect_shared_rootfs(void); int detect_ramfs_rootfs(void); @@ -281,3 +280,4 @@ int is_dir(const char *path); char *get_template_path(const char *t); int setproctitle(char *title); int mount_proc_if_needed(const char *rootfs); +#endif /* __LXC_UTILS_H */ -- 2.1.4 ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] [PATCH] don't hardcode the path to criu when checking versions
We use the right path when actually execing criu to checkpoint and restore, but when checking versions we didn't. Let's use the right path. Reported-by: Dietmar Maurer Signed-off-by: Tycho Andersen --- src/lxc/criu.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/src/lxc/criu.c b/src/lxc/criu.c index c331adf..c6b1863 100644 --- a/src/lxc/criu.c +++ b/src/lxc/criu.c @@ -223,13 +223,15 @@ static bool criu_version_ok() if (pid == 0) { char *args[] = { "criu", "--version", NULL }; + char *path; close(pipes[0]); close(STDERR_FILENO); if (dup2(pipes[1], STDOUT_FILENO) < 0) exit(1); - execv("/usr/local/sbin/criu", args); + path = on_path("criu", NULL); + execv(path, args); exit(1); } else { FILE *f; -- 2.1.4 ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
Re: [lxc-devel] hardcoded criu path?
On Wed, Jun 03, 2015 at 06:28:12AM +0200, Dietmar Maurer wrote: > in criu.c, line 232: > > execv("/usr/local/sbin/criu", args); Heh. Whoops. I'll send a patch for that ASAP. Tycho > this fails on my system because criu is installed in /usr/sbin. > > ___ > lxc-devel mailing list > lxc-devel@lists.linuxcontainers.org > http://lists.linuxcontainers.org/listinfo/lxc-devel ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] [PATCH] c/r: remember to clean up pidfile
When restoring, we didn't clean up the pidfile that criu uses to pass us the init pid on error or success; let's do that. Signed-off-by: Tycho Andersen --- src/lxc/criu.c | 5 + 1 file changed, 5 insertions(+) diff --git a/src/lxc/criu.c b/src/lxc/criu.c index 1913473..d45c96c 100644 --- a/src/lxc/criu.c +++ b/src/lxc/criu.c @@ -526,6 +526,9 @@ void do_restore(struct lxc_container *c, int pipe, char *directory, bool verbose ret = fscanf(f, "%d", (int*) &handler->pid); fclose(f); + if (unlink(pidfile) < 0 && errno != ENOENT) + SYSERROR("unlinking pidfile failed"); + if (ret != 1) { ERROR("reading restore pid failed"); goto out_fini_handler; @@ -556,6 +559,8 @@ void do_restore(struct lxc_container *c, int pipe, char *directory, bool verbose out_fini_handler: lxc_fini(c->name, handler); + if (unlink(pidfile) < 0 && errno != ENOENT) + SYSERROR("unlinking pidfile failed"); out: if (pipe >= 0) { -- 2.1.4 ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] [PATCH] c/r: complain when criu isn't exec()'d correctly
Signed-off-by: Tycho Andersen --- src/lxc/criu.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/lxc/criu.c b/src/lxc/criu.c index 1913473..5fbe7cb 100644 --- a/src/lxc/criu.c +++ b/src/lxc/criu.c @@ -239,6 +239,7 @@ static bool criu_version_ok() close(pipes[1]); if (wait_for_pid(pid) < 0) { close(pipes[0]); + SYSERROR("execing criu failed, is it installed?"); return false; } -- 2.1.4 ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] [PATCH] c/r: check for criu images in the checkpoint directory
CRIU can get confused if there are two dumps that are written to the same directory, so we make some minimal effort to prevent people from doing this. This is a better alternative than forcing liblxc to create the directory, since it is mostly race free (and neither solution is bullet proof anyway if someone rsyncs some bad images over the top of the good ones). Signed-off-by: Tycho Andersen --- src/lxc/lxccontainer.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/src/lxc/lxccontainer.c b/src/lxc/lxccontainer.c index dbcee99..8999f44 100644 --- a/src/lxc/lxccontainer.c +++ b/src/lxc/lxccontainer.c @@ -3683,6 +3683,7 @@ static bool do_lxcapi_checkpoint(struct lxc_container *c, char *directory, bool { pid_t pid; int status; + char path[PATH_MAX]; if (!criu_ok(c)) return false; @@ -3690,6 +3691,15 @@ static bool do_lxcapi_checkpoint(struct lxc_container *c, char *directory, bool if (mkdir(directory, 0700) < 0 && errno != EEXIST) return false; + status = snprintf(path, sizeof(path), "%s/inventory.img", directory); + if (status < 0 || status >= sizeof(path)) + return false; + + if (access(path, F_OK) == 0) { + ERROR("please use a fresh directory for the dump directory\n"); + return false; + } + if (!dump_net_info(c, directory)) return false; -- 2.1.4 ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
Re: [lxc-devel] [PATCH] c/r: force users to provide a fresh directory for criu images
On Wed, Apr 22, 2015 at 10:26:17PM +, Serge Hallyn wrote: > Quoting Tycho Andersen (tycho.ander...@canonical.com): > > On Wed, Apr 22, 2015 at 01:16:08PM -0600, Tycho Andersen wrote: > > > CRIU can get confused if more than one c/r is done in the same directory, > > > so we > > > should require lxcapi so that it refuses to dump to a directory with criu > > > images already in it. > > > > Hmm, actually I'm not sure this is the best way to do this. If users > > (e.g. LXD) use some sort of mechanism to generate a temporary > > directory name we could race if the same name is tried twice in a row. > > ? > > You mean if they use something like mkstemp and two threads get the > same name from it? (that shouldn't happen) The problem is that we pass liblxc the dirname, but this patch means that liblxc creates the directory. mkstemp (and go's equivalent, ioutil.TempDir) works because it opens the file, so presumably they test to see that it doesn't exist. However, we'd have to remove the directory after TempDir creates it, because liblxc wants to create it. That could get us into a situation where another TempDir call might generate the same name, in between when we remove it and when liblxc creates it. Tycho > > Instead, maybe it's just best to stat() for the key criu image file > > (inventory.img), and die if it's there. > > > > Thoughts? > > > > > This won't help if the user copies a checkpoint over another, but at > > > least the > > > common case will be caught. > > > > > > Signed-off-by: Tycho Andersen > > > --- > > > src/lxc/lxccontainer.c | 7 ++- > > > 1 file changed, 6 insertions(+), 1 deletion(-) > > > > > > diff --git a/src/lxc/lxccontainer.c b/src/lxc/lxccontainer.c > > > index dbcee99..4a290f1 100644 > > > --- a/src/lxc/lxccontainer.c > > > +++ b/src/lxc/lxccontainer.c > > > @@ -3687,8 +3687,13 @@ static bool do_lxcapi_checkpoint(struct > > > lxc_container *c, char *directory, bool > > > if (!criu_ok(c)) > > > return false; > > > > > > - if (mkdir(directory, 0700) < 0 && errno != EEXIST) > > > + if (mkdir(directory, 0700) < 0) { > > > + if (errno == EEXIST) > > > + ERROR("please use a new directory for criu state"); > > > + else > > > + SYSERROR("mkdir failed"); > > > return false; > > > + } > > > > > > if (!dump_net_info(c, directory)) > > > return false; > > > -- > > > 2.1.4 > > > > > ___ > > lxc-devel mailing list > > lxc-devel@lists.linuxcontainers.org > > http://lists.linuxcontainers.org/listinfo/lxc-devel > ___ > lxc-devel mailing list > lxc-devel@lists.linuxcontainers.org > http://lists.linuxcontainers.org/listinfo/lxc-devel ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
Re: [lxc-devel] [PATCH] c/r: force users to provide a fresh directory for criu images
On Wed, Apr 22, 2015 at 01:16:08PM -0600, Tycho Andersen wrote: > CRIU can get confused if more than one c/r is done in the same directory, so > we > should require lxcapi so that it refuses to dump to a directory with criu > images already in it. Hmm, actually I'm not sure this is the best way to do this. If users (e.g. LXD) use some sort of mechanism to generate a temporary directory name we could race if the same name is tried twice in a row. Instead, maybe it's just best to stat() for the key criu image file (inventory.img), and die if it's there. Thoughts? > This won't help if the user copies a checkpoint over another, but at least the > common case will be caught. > > Signed-off-by: Tycho Andersen > --- > src/lxc/lxccontainer.c | 7 ++- > 1 file changed, 6 insertions(+), 1 deletion(-) > > diff --git a/src/lxc/lxccontainer.c b/src/lxc/lxccontainer.c > index dbcee99..4a290f1 100644 > --- a/src/lxc/lxccontainer.c > +++ b/src/lxc/lxccontainer.c > @@ -3687,8 +3687,13 @@ static bool do_lxcapi_checkpoint(struct lxc_container > *c, char *directory, bool > if (!criu_ok(c)) > return false; > > - if (mkdir(directory, 0700) < 0 && errno != EEXIST) > + if (mkdir(directory, 0700) < 0) { > + if (errno == EEXIST) > + ERROR("please use a new directory for criu state"); > + else > + SYSERROR("mkdir failed"); > return false; > + } > > if (!dump_net_info(c, directory)) > return false; > -- > 2.1.4 > ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] [PATCH] c/r: force users to provide a fresh directory for criu images
CRIU can get confused if more than one c/r is done in the same directory, so we should require lxcapi so that it refuses to dump to a directory with criu images already in it. This won't help if the user copies a checkpoint over another, but at least the common case will be caught. Signed-off-by: Tycho Andersen --- src/lxc/lxccontainer.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/src/lxc/lxccontainer.c b/src/lxc/lxccontainer.c index dbcee99..4a290f1 100644 --- a/src/lxc/lxccontainer.c +++ b/src/lxc/lxccontainer.c @@ -3687,8 +3687,13 @@ static bool do_lxcapi_checkpoint(struct lxc_container *c, char *directory, bool if (!criu_ok(c)) return false; - if (mkdir(directory, 0700) < 0 && errno != EEXIST) + if (mkdir(directory, 0700) < 0) { + if (errno == EEXIST) + ERROR("please use a new directory for criu state"); + else + SYSERROR("mkdir failed"); return false; + } if (!dump_net_info(c, directory)) return false; -- 2.1.4 ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
Re: [lxc-devel] [PATCH 2/2] c/r: re-open fds after clone()
On Tue, Apr 21, 2015 at 03:18:16PM +, Serge Hallyn wrote: > Quoting Tycho Andersen (tycho.ander...@canonical.com): > > If we don't re-open these after clone, the init process has a pointer to the > > parent's /dev/{zero,null}. CRIU seese these and wants to dump the parent's > > mount namespace, which is unnecessary. Instead, we should just re-open > > stdin/out/err after we do the clone and pivot root, to ensure that we have > > pointers to the devcies in init's rootfs instead of the host's. > > > > v2: Only close fds if the container was daemonized. This didn't turn out as > > nicely as described on the list because lxc_start() doesn't actually > > have > > the struct lxc_container, > > No, but lxc_container has a pointer to the handler. I was suggesting adding a > flag to the handler and (un/)setting that in lxcapi_start. The handler is allocated in __lxc_start, though, so lxcapi_start doesn't know about it. I suppose we could move the handler allocation up as an alternative. Tycho > ___ > lxc-devel mailing list > lxc-devel@lists.linuxcontainers.org > http://lists.linuxcontainers.org/listinfo/lxc-devel ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] [PATCH 2/2] c/r: re-open fds after clone()
If we don't re-open these after clone, the init process has a pointer to the parent's /dev/{zero,null}. CRIU seese these and wants to dump the parent's mount namespace, which is unnecessary. Instead, we should just re-open stdin/out/err after we do the clone and pivot root, to ensure that we have pointers to the devcies in init's rootfs instead of the host's. v2: Only close fds if the container was daemonized. This didn't turn out as nicely as described on the list because lxc_start() doesn't actually have the struct lxc_container, so it cant see the flag. Instead, we just pass it down everywhere. Signed-off-by: Tycho Andersen --- src/lxc/execute.c | 4 ++-- src/lxc/lxc.h | 22 +- src/lxc/lxc_execute.c | 2 +- src/lxc/lxccontainer.c | 10 ++ src/lxc/start.c| 17 ++--- src/lxc/start.h| 3 ++- 6 files changed, 34 insertions(+), 24 deletions(-) diff --git a/src/lxc/execute.c b/src/lxc/execute.c index a0f7ff1..d14092a 100644 --- a/src/lxc/execute.c +++ b/src/lxc/execute.c @@ -111,7 +111,7 @@ static struct lxc_operations execute_start_ops = { }; int lxc_execute(const char *name, char *const argv[], int quiet, - struct lxc_conf *conf, const char *lxcpath) + struct lxc_conf *conf, const char *lxcpath, bool backgrounded) { struct execute_args args = { .argv = argv, @@ -122,5 +122,5 @@ int lxc_execute(const char *name, char *const argv[], int quiet, return -1; conf->is_execute = 1; - return __lxc_start(name, conf, &execute_start_ops, &args, lxcpath); + return __lxc_start(name, conf, &execute_start_ops, &args, lxcpath, backgrounded); } diff --git a/src/lxc/lxc.h b/src/lxc/lxc.h index dd1048c..b1b858d 100644 --- a/src/lxc/lxc.h +++ b/src/lxc/lxc.h @@ -27,6 +27,7 @@ extern "C" { #endif +#include #include #include #include @@ -44,24 +45,27 @@ struct lxc_arguments; /* * Start the specified command inside a system container - * @name : the name of the container - * @argv : an array of char * corresponding to the commande line - * @conf : configuration + * @name : the name of the container + * @argv : an array of char * corresponding to the commande line + * @conf : configuration + * @backgrounded : whether or not the container is daemonized * Returns 0 on success, < 0 otherwise */ extern int lxc_start(const char *name, char *const argv[], struct lxc_conf *conf, -const char *lxcpath); +const char *lxcpath, bool backgrounded); /* * Start the specified command inside an application container - * @name : the name of the container - * @argv : an array of char * corresponding to the commande line - * @quiet: if != 0 then lxc-init won't produce any output - * @conf : configuration + * @name : the name of the container + * @argv : an array of char * corresponding to the commande line + * @quiet: if != 0 then lxc-init won't produce any output + * @conf : configuration + * @backgrounded : whether or not the container is daemonized * Returns 0 on success, < 0 otherwise */ extern int lxc_execute(const char *name, char *const argv[], int quiet, - struct lxc_conf *conf, const char *lxcpath); + struct lxc_conf *conf, const char *lxcpath, + bool backgrounded); /* * Open the monitoring mechanism for a specific container diff --git a/src/lxc/lxc_execute.c b/src/lxc/lxc_execute.c index b602793..4f1e1f6 100644 --- a/src/lxc/lxc_execute.c +++ b/src/lxc/lxc_execute.c @@ -139,7 +139,7 @@ int main(int argc, char *argv[]) if (lxc_config_define_load(&defines, conf)) return 1; - ret = lxc_execute(my_args.name, my_args.argv, my_args.quiet, conf, my_args.lxcpath[0]); + ret = lxc_execute(my_args.name, my_args.argv, my_args.quiet, conf, my_args.lxcpath[0], false); lxc_conf_free(conf); diff --git a/src/lxc/lxccontainer.c b/src/lxc/lxccontainer.c index 0ca5b9f..d49426f 100644 --- a/src/lxc/lxccontainer.c +++ b/src/lxc/lxccontainer.c @@ -584,7 +584,7 @@ static bool lxcapi_start(struct lxc_container *c, int useinit, char * const argv container_mem_unlock(c); if (useinit) { - ret = lxc_execute(c->name, argv, 1, conf, c->config_path); + ret = lxc_execute(c->name, argv, 1, conf, c->config_path, daemonize); return ret == 0 ? true : false; } @@ -642,12 +642,6 @@ static bool lxcapi_start(struct lxc_container *c, int useinit, char * const argv return false; } lxc_check_inherited(conf, true, -1); - close(0); - close(1); -
Re: [lxc-devel] [PATCH 2/2] c/r: re-open fds after clone()
On Mon, Apr 20, 2015 at 05:06:02PM +, Serge Hallyn wrote: > Quoting Tycho Andersen (tycho.ander...@canonical.com): > > If we don't re-open these after clone, the init process has a pointer to the > > parent's /dev/{zero,null}. CRIU seese these and wants to dump the parent's > > mount namespace, which is unnecessary. Instead, we should just re-open > > stdin/out/err after we do the clone and pivot root, to ensure that we have > > pointers to the devcies in init's rootfs instead of the host's. > > > > Signed-off-by: Tycho Andersen > > AFAICT you're switching this from doing it sometimes to doing it > always. That will break foreground containers. Oh, whoops, good point. > I guess you may have to add a 'backgrounded' boolean to the > src/lxc/start.h:lxc_handler struct, set in lxcapi_start, and > checked in do_start to decide whether to do this. Sounds good. Tycho > > --- > > src/lxc/lxccontainer.c | 6 -- > > src/lxc/start.c| 7 +++ > > 2 files changed, 7 insertions(+), 6 deletions(-) > > > > diff --git a/src/lxc/lxccontainer.c b/src/lxc/lxccontainer.c > > index 0ca5b9f..2a536ed 100644 > > --- a/src/lxc/lxccontainer.c > > +++ b/src/lxc/lxccontainer.c > > @@ -642,12 +642,6 @@ static bool lxcapi_start(struct lxc_container *c, int > > useinit, char * const argv > > return false; > > } > > lxc_check_inherited(conf, true, -1); > > - close(0); > > - close(1); > > - close(2); > > - open("/dev/zero", O_RDONLY); > > - open("/dev/null", O_RDWR); > > - open("/dev/null", O_RDWR); > > setsid(); > > } else { > > if (!am_single_threaded()) { > > diff --git a/src/lxc/start.c b/src/lxc/start.c > > index d615375..6939826 100644 > > --- a/src/lxc/start.c > > +++ b/src/lxc/start.c > > @@ -759,6 +759,13 @@ static int do_start(void *data) > > > > close(handler->sigfd); > > > > + close(0); > > + close(1); > > + close(2); > > + open("/dev/zero", O_RDONLY); > > + open("/dev/null", O_RDWR); > > + open("/dev/null", O_RDWR); > > + > > /* after this call, we are in error because this > > * ops should not return as it execs */ > > handler->ops->start(handler, handler->data); > > -- > > 2.1.4 > > > > ___ > > lxc-devel mailing list > > lxc-devel@lists.linuxcontainers.org > > http://lists.linuxcontainers.org/listinfo/lxc-devel > ___ > lxc-devel mailing list > lxc-devel@lists.linuxcontainers.org > http://lists.linuxcontainers.org/listinfo/lxc-devel ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] [PATCH 2/2] c/r: re-open fds after clone()
If we don't re-open these after clone, the init process has a pointer to the parent's /dev/{zero,null}. CRIU seese these and wants to dump the parent's mount namespace, which is unnecessary. Instead, we should just re-open stdin/out/err after we do the clone and pivot root, to ensure that we have pointers to the devcies in init's rootfs instead of the host's. Signed-off-by: Tycho Andersen --- src/lxc/lxccontainer.c | 6 -- src/lxc/start.c| 7 +++ 2 files changed, 7 insertions(+), 6 deletions(-) diff --git a/src/lxc/lxccontainer.c b/src/lxc/lxccontainer.c index 0ca5b9f..2a536ed 100644 --- a/src/lxc/lxccontainer.c +++ b/src/lxc/lxccontainer.c @@ -642,12 +642,6 @@ static bool lxcapi_start(struct lxc_container *c, int useinit, char * const argv return false; } lxc_check_inherited(conf, true, -1); - close(0); - close(1); - close(2); - open("/dev/zero", O_RDONLY); - open("/dev/null", O_RDWR); - open("/dev/null", O_RDWR); setsid(); } else { if (!am_single_threaded()) { diff --git a/src/lxc/start.c b/src/lxc/start.c index d615375..6939826 100644 --- a/src/lxc/start.c +++ b/src/lxc/start.c @@ -759,6 +759,13 @@ static int do_start(void *data) close(handler->sigfd); + close(0); + close(1); + close(2); + open("/dev/zero", O_RDONLY); + open("/dev/null", O_RDWR); + open("/dev/null", O_RDWR); + /* after this call, we are in error because this * ops should not return as it execs */ handler->ops->start(handler, handler->data); -- 2.1.4 ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] c/r: support for systemd-based guests
Hi all, Here are some patches we need to support dumping systemd based guests. There will be more patches coming and things still don't work yet, but we definitely need at least these. Tycho ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] [PATCH 1/2] c/r: enable hugetlbfs in criu
In vivid containers hugetlbfs is mounted, but it is not one of the hardcoded fses in criu, so we need to tell criu that it is okay to automount it. Signed-off-by: Tycho Andersen --- src/lxc/criu.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/src/lxc/criu.c b/src/lxc/criu.c index 12b3be9..1a356b2 100644 --- a/src/lxc/criu.c +++ b/src/lxc/criu.c @@ -49,7 +49,7 @@ lxc_log_define(lxc_criu, lxc); void exec_criu(struct criu_opts *opts) { char **argv, log[PATH_MAX]; - int static_args = 18, argc = 0, i, ret; + int static_args = 20, argc = 0, i, ret; int netnr = 0; struct lxc_list *it; @@ -61,6 +61,7 @@ void exec_criu(struct criu_opts *opts) * --manage-cgroups action-script foo.sh -D $(directory) \ * -o $(directory)/$(action).log --ext-mount-map auto * --enable-external-sharing --enable-external-masters +* --enable-fs hugetlbfs * +1 for final NULL */ if (strcmp(opts->action, "dump") == 0) { @@ -120,6 +121,8 @@ void exec_criu(struct criu_opts *opts) DECLARE_ARG("auto"); DECLARE_ARG("--enable-external-sharing"); DECLARE_ARG("--enable-external-masters"); + DECLARE_ARG("--enable-fs"); + DECLARE_ARG("hugetlbfs"); DECLARE_ARG("-D"); DECLARE_ARG(opts->directory); DECLARE_ARG("-o"); -- 2.1.4 ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] [PATCH] c/r: check version of criu
Note that we allow both a tagged version or a git build that has sufficient patches for the features we require. v2: close criu's stderr too Signed-off-by: Tycho Andersen Acked-by: Serge E. Hallyn --- src/lxc/criu.c | 90 ++ 1 file changed, 90 insertions(+) diff --git a/src/lxc/criu.c b/src/lxc/criu.c index 043db36..ca8344f 100644 --- a/src/lxc/criu.c +++ b/src/lxc/criu.c @@ -193,6 +193,93 @@ err: free(argv); } +/* + * Check to see if the criu version is recent enough for all the features we + * use. This version allows either CRIU_VERSION or (CRIU_GITID_VERSION and + * CRIU_GITID_PATCHLEVEL) to work, enabling users building from git to c/r + * things potentially before a version is released with a particular feature. + * + * The intent is that when criu development slows down, we can drop this, but + * for now we shouldn't attempt to c/r with versions that we know won't work. + */ +static bool criu_version_ok() +{ + int pipes[2]; + pid_t pid; + + if (pipe(pipes) < 0) { + SYSERROR("pipe() failed"); + return false; + } + + pid = fork(); + if (pid < 0) { + SYSERROR("fork() failed"); + return false; + } + + if (pid == 0) { + char *args[] = { "criu", "--version", NULL }; + close(pipes[0]); + + close(STDERR_FILENO); + if (dup2(pipes[1], STDOUT_FILENO) < 0) + exit(1); + + execv("/usr/local/sbin/criu", args); + exit(1); + } else { + FILE *f; + char version[1024]; + int patch; + + close(pipes[1]); + if (wait_for_pid(pid) < 0) { + close(pipes[0]); + return false; + } + + f = fdopen(pipes[0], "r"); + if (!f) { + close(pipes[0]); + return false; + } + + if (fscanf(f, "Version: %1024[^\n]s", version) != 1) + goto version_error; + + if (fgetc(f) != '\n') + goto version_error; + + if (strcmp(version, CRIU_VERSION) >= 0) + goto version_match; + + if (fscanf(f, "GitID: v%1024[^-]s", version) != 1) + goto version_error; + + if (fgetc(f) != '-') + goto version_error; + + if (fscanf(f, "%d", &patch) != 1) + goto version_error; + + if (strcmp(version, CRIU_GITID_VERSION) < 0) + goto version_error; + + if (patch < CRIU_GITID_PATCHLEVEL) + goto version_error; + +version_match: + close(pipes[0]); + return true; + +version_error: + close(pipes[0]); + ERROR("must have criu " CRIU_VERSION " or greater to checkpoint/restore\n"); + return false; + } +} + /* Check and make sure the container has a configuration that we know CRIU can * dump. */ bool criu_ok(struct lxc_container *c) @@ -200,6 +287,9 @@ bool criu_ok(struct lxc_container *c) struct lxc_list *it; bool found_deny_rule = false; + if (!criu_version_ok()) + return false; + if (geteuid()) { ERROR("Must be root to checkpoint\n"); return false; -- 2.1.4 ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
Re: [lxc-devel] [PATCH 2/2] c/r: check version of criu
On Fri, Apr 17, 2015 at 04:28:33PM +, Serge Hallyn wrote: > Quoting Tycho Andersen (tycho.ander...@canonical.com): > > Note that we allow both a tagged version or a git build that has sufficient > > patches for the features we require. > > > > Signed-off-by: Tycho Andersen > > Acked-by: Serge E. Hallyn > > just one q below, > > > --- > > src/lxc/criu.c | 89 > > ++ > > 1 file changed, 89 insertions(+) > > > > diff --git a/src/lxc/criu.c b/src/lxc/criu.c > > index 043db36..12b3be9 100644 > > --- a/src/lxc/criu.c > > +++ b/src/lxc/criu.c > > @@ -193,6 +193,92 @@ err: > > free(argv); > > } > > > > +/* > > + * Check to see if the criu version is recent enough for all the features > > we > > + * use. This version allows either CRIU_VERSION or (CRIU_GITID_VERSION and > > + * CRIU_GITID_PATCHLEVEL) to work, enabling users building from git to c/r > > + * things potentially before a version is released with a particular > > feature. > > + * > > + * The intent is that when criu development slows down, we can drop this, > > but > > + * for now we shouldn't attempt to c/r with versions that we know won't > > work. > > + */ > > +static bool criu_version_ok() > > +{ > > + int pipes[2]; > > + pid_t pid; > > + > > + if (pipe(pipes) < 0) { > > + SYSERROR("pipe() failed"); > > + return false; > > + } > > + > > + pid = fork(); > > + if (pid < 0) { > > + SYSERROR("fork() failed"); > > + return false; > > + } > > + > > + if (pid == 0) { > > + char *args[] = { "criu", "--version", NULL }; > > + close(pipes[0]); > > + > > + if (dup2(pipes[1], STDOUT_FILENO) < 0) > > + exit(1); > > Should you also close stderr? Yes, that's probably a good idea. I'll resend with that. Tycho > > + execv("/usr/local/sbin/criu", args); > > + exit(1); > > + } else { > > + FILE *f; > > + char version[1024]; > > + int patch; > > + > > + close(pipes[1]); > > + if (wait_for_pid(pid) < 0) { > > + close(pipes[0]); > > + return false; > > + } > > + > > + f = fdopen(pipes[0], "r"); > > + if (!f) { > > + close(pipes[0]); > > + return false; > > + } > > + > > + if (fscanf(f, "Version: %1024[^\n]s", version) != 1) > > + goto version_error; > > + > > + if (fgetc(f) != '\n') > > + goto version_error; > > + > > + if (strcmp(version, CRIU_VERSION) >= 0) > > + goto version_match; > > + > > + if (fscanf(f, "GitID: v%1024[^-]s", version) != 1) > > + goto version_error; > > + > > + if (fgetc(f) != '-') > > + goto version_error; > > + > > + if (fscanf(f, "%d", &patch) != 1) > > + goto version_error; > > + > > + if (strcmp(version, CRIU_GITID_VERSION) < 0) > > + goto version_error; > > + > > + if (patch < CRIU_GITID_PATCHLEVEL) > > + goto version_error; > > + > > +version_match: > > + close(pipes[0]); > > + return true; > > + > > +version_error: > > + close(pipes[0]); > > + ERROR("must have criu " CRIU_VERSION " or greater to > > checkpoint/restore\n"); > > + return false; > > + } > > +} > > + > > /* Check and make sure the container has a configuration that we know CRIU > > can > > * dump. */ > > bool criu_ok(struct lxc_container *c) > > @@ -200,6 +286,9 @@ bool criu_ok(struct lxc_container *c) > > struct lxc_list *it; > > bool found_deny_rule = false; > > > > + if (!criu_version_ok()) > > + return false; > > + > > if (geteuid()) { > > ERROR("Must be root to checkpoint\n"); > > return false; > > -- > > 2.1.4 > > > > ___ > > lxc-devel mailing list > > lxc-devel@lists.linuxcontainers.org > > http://lists.linuxcontainers.org/listinfo/lxc-devel > ___ > lxc-devel mailing list > lxc-devel@lists.linuxcontainers.org > http://lists.linuxcontainers.org/listinfo/lxc-devel ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] [PATCH 1/2] c/r: move criu code to its own file
Trying to cage the beast that is lxccontainer.c. Signed-off-by: Tycho Andersen --- src/lxc/Makefile.am| 4 +- src/lxc/criu.c | 477 + src/lxc/criu.h | 70 src/lxc/lxccontainer.c | 454 +- 4 files changed, 551 insertions(+), 454 deletions(-) create mode 100644 src/lxc/criu.c create mode 100644 src/lxc/criu.h diff --git a/src/lxc/Makefile.am b/src/lxc/Makefile.am index d8e460b..2731843 100644 --- a/src/lxc/Makefile.am +++ b/src/lxc/Makefile.am @@ -21,7 +21,8 @@ noinst_HEADERS = \ namespace.h \ start.h \ state.h \ - utils.h + utils.h \ + criu.h if IS_BIONIC noinst_HEADERS += \ @@ -75,6 +76,7 @@ liblxc_so_SOURCES = \ state.c state.h \ log.c log.h \ attach.c attach.h \ + criu.c criu.h \ \ network.c network.h \ nl.c nl.h \ diff --git a/src/lxc/criu.c b/src/lxc/criu.c new file mode 100644 index 000..043db36 --- /dev/null +++ b/src/lxc/criu.c @@ -0,0 +1,477 @@ +/* + * lxc: linux Container library + * + * Copyright © 2014-2015 Canonical Ltd. + * + * Authors: + * Tycho Andersen + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ +#define _GNU_SOURCE +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "config.h" + +#include "bdev.h" +#include "cgroup.h" +#include "conf.h" +#include "criu.h" +#include "log.h" +#include "lxc.h" +#include "lxclock.h" +#include "network.h" +#include "utils.h" + +lxc_log_define(lxc_criu, lxc); + +void exec_criu(struct criu_opts *opts) +{ + char **argv, log[PATH_MAX]; + int static_args = 18, argc = 0, i, ret; + int netnr = 0; + struct lxc_list *it; + + char buf[4096]; + FILE *mnts = NULL; + + /* The command line always looks like: +* criu $(action) --tcp-established --file-locks --link-remap --force-irmap \ +* --manage-cgroups action-script foo.sh -D $(directory) \ +* -o $(directory)/$(action).log --ext-mount-map auto +* --enable-external-sharing --enable-external-masters +* +1 for final NULL */ + + if (strcmp(opts->action, "dump") == 0) { + /* -t pid */ + static_args += 2; + + /* --leave-running */ + if (!opts->stop) + static_args++; + } else if (strcmp(opts->action, "restore") == 0) { + /* --root $(lxc_mount_point) --restore-detached +* --restore-sibling --pidfile $foo --cgroup-root $foo */ + static_args += 8; + } else { + return; + } + + if (opts->verbose) + static_args++; + + ret = snprintf(log, PATH_MAX, "%s/%s.log", opts->directory, opts->action); + if (ret < 0 || ret >= PATH_MAX) { + ERROR("logfile name too long\n"); + return; + } + + argv = malloc(static_args * sizeof(*argv)); + if (!argv) + return; + + memset(argv, 0, static_args * sizeof(*argv)); + +#define DECLARE_ARG(arg) \ + do {\ + if (arg == NULL) { \ + ERROR("Got NULL argument for criu");\ + goto err; \ + } \ + argv[argc++] = strdup(arg); \ + if (!argv[argc-1]) \ + goto err; \ + } while (0) + + argv[argc++] = on_path("criu", NULL); + if (!argv[argc-1]) { + ERROR("Couldn't find criu binary\n"); + goto err; + } + + DECLARE_ARG(opts->action); + DECLARE_ARG("--tcp-established"); + DECLARE_ARG("--file-locks"); +
[lxc-devel] [PATCH 2/2] c/r: check version of criu
Note that we allow both a tagged version or a git build that has sufficient patches for the features we require. Signed-off-by: Tycho Andersen --- src/lxc/criu.c | 89 ++ 1 file changed, 89 insertions(+) diff --git a/src/lxc/criu.c b/src/lxc/criu.c index 043db36..12b3be9 100644 --- a/src/lxc/criu.c +++ b/src/lxc/criu.c @@ -193,6 +193,92 @@ err: free(argv); } +/* + * Check to see if the criu version is recent enough for all the features we + * use. This version allows either CRIU_VERSION or (CRIU_GITID_VERSION and + * CRIU_GITID_PATCHLEVEL) to work, enabling users building from git to c/r + * things potentially before a version is released with a particular feature. + * + * The intent is that when criu development slows down, we can drop this, but + * for now we shouldn't attempt to c/r with versions that we know won't work. + */ +static bool criu_version_ok() +{ + int pipes[2]; + pid_t pid; + + if (pipe(pipes) < 0) { + SYSERROR("pipe() failed"); + return false; + } + + pid = fork(); + if (pid < 0) { + SYSERROR("fork() failed"); + return false; + } + + if (pid == 0) { + char *args[] = { "criu", "--version", NULL }; + close(pipes[0]); + + if (dup2(pipes[1], STDOUT_FILENO) < 0) + exit(1); + + execv("/usr/local/sbin/criu", args); + exit(1); + } else { + FILE *f; + char version[1024]; + int patch; + + close(pipes[1]); + if (wait_for_pid(pid) < 0) { + close(pipes[0]); + return false; + } + + f = fdopen(pipes[0], "r"); + if (!f) { + close(pipes[0]); + return false; + } + + if (fscanf(f, "Version: %1024[^\n]s", version) != 1) + goto version_error; + + if (fgetc(f) != '\n') + goto version_error; + + if (strcmp(version, CRIU_VERSION) >= 0) + goto version_match; + + if (fscanf(f, "GitID: v%1024[^-]s", version) != 1) + goto version_error; + + if (fgetc(f) != '-') + goto version_error; + + if (fscanf(f, "%d", &patch) != 1) + goto version_error; + + if (strcmp(version, CRIU_GITID_VERSION) < 0) + goto version_error; + + if (patch < CRIU_GITID_PATCHLEVEL) + goto version_error; + +version_match: + close(pipes[0]); + return true; + +version_error: + close(pipes[0]); + ERROR("must have criu " CRIU_VERSION " or greater to checkpoint/restore\n"); + return false; + } +} + /* Check and make sure the container has a configuration that we know CRIU can * dump. */ bool criu_ok(struct lxc_container *c) @@ -200,6 +286,9 @@ bool criu_ok(struct lxc_container *c) struct lxc_list *it; bool found_deny_rule = false; + if (!criu_version_ok()) + return false; + if (geteuid()) { ERROR("Must be root to checkpoint\n"); return false; -- 2.1.4 ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] [PATCH 1/2] c/r: rework external mountpoint handling v4
CRIU now supports autodetection of external mounts via the --ext-mount-map auto --enable-external-sharing --enable-external-masters options, so we don't need to explicitly pass the cgmanager mount or any of the mounts from the config. This also means that lxcfs mounts (since they are bind mounts from outside the container) are autodetected, meaning that c/r of containers using lxcfs works. A further advantage of this patch is that it addresses some of the ugliness that was in the exec_criu() function. There are other criu options that will allow us to trim this even further, though. Finally, with --enable-external-masters, criu understands slave mounts in the container with shared mounts in the peer group that are outside the namespace. This allows containers on a systemd host to be dumped and restored correctly. However, these options have just landed in criu trunk today, and the next tagged release will be 1.6 on June 1, so we should avoid merging this into any stable releases until then. v2: remount / as private before bind mounting the container's directory for criu. The problem here is that if / is mounted as shared, even if we unshare() the /var/lib/lxc/rootfs mountpoint propagates outside of our mount namespace, which is bad, since we don't want to leak mounts. In particular, this leak confuses criu the second time it goes to checkpoint the container. v3: whoops, we really want / as MS_SLAVE | MS_REC here, to match what start does v4: rebase onto master for revert of logging patch Signed-off-by: Tycho Andersen Acked-by: Serge E. Hallyn --- src/lxc/lxccontainer.c | 91 +++--- 1 file changed, 20 insertions(+), 71 deletions(-) diff --git a/src/lxc/lxccontainer.c b/src/lxc/lxccontainer.c index e2586de..4d73b4c 100644 --- a/src/lxc/lxccontainer.c +++ b/src/lxc/lxccontainer.c @@ -3520,18 +3520,18 @@ struct criu_opts { static void exec_criu(struct criu_opts *opts) { char **argv, log[PATH_MAX]; - int static_args = 14, argc = 0, i, ret; + int static_args = 18, argc = 0, i, ret; int netnr = 0; struct lxc_list *it; - struct mntent mntent; char buf[4096]; FILE *mnts = NULL; /* The command line always looks like: * criu $(action) --tcp-established --file-locks --link-remap --force-irmap \ * --manage-cgroups action-script foo.sh -D $(directory) \ -* -o $(directory)/$(action).log +* -o $(directory)/$(action).log --ext-mount-map auto +* --enable-external-sharing --enable-external-masters * +1 for final NULL */ if (strcmp(opts->action, "dump") == 0) { @@ -3558,11 +3558,6 @@ static void exec_criu(struct criu_opts *opts) return; } - // We need to tell criu where cgmanager's socket is bind mounted from - // if it exists since it's external. - if (cgroup_driver() == CGMANAGER) - static_args+=2; - argv = malloc(static_args * sizeof(*argv)); if (!argv) return; @@ -3592,6 +3587,10 @@ static void exec_criu(struct criu_opts *opts) DECLARE_ARG("--link-remap"); DECLARE_ARG("--force-irmap"); DECLARE_ARG("--manage-cgroups"); + DECLARE_ARG("--ext-mount-map"); + DECLARE_ARG("auto"); + DECLARE_ARG("--enable-external-sharing"); + DECLARE_ARG("--enable-external-masters"); DECLARE_ARG("--action-script"); DECLARE_ARG(DATADIR "/lxc/lxc-restore-net"); DECLARE_ARG("-D"); @@ -3602,35 +3601,9 @@ static void exec_criu(struct criu_opts *opts) if (opts->verbose) DECLARE_ARG("-vv"); - /* -* Note: this macro is not intended to be called unless argc is equal -* to the length of the array; there is nothing that keeps track of the -* length of the array besides the location in the code that this is -* called. (Yes this is bad, and we should fix it.) -*/ -#define RESIZE_ARGS(additional) \ - do { \ - void *m; \ - if (additional < 0) { \ - ERROR("resizing by negative amount"); \ - goto err; \ - } else if (additional == 0) \ - continue; \ - \ - m = realloc(argv,
Re: [lxc-devel] [PATCH] c/r: rework external mountpoint handling v2
On Wed, Apr 15, 2015 at 04:39:10PM +, Serge Hallyn wrote: > Quoting Tycho Andersen (tycho.ander...@canonical.com): > > On Wed, Apr 15, 2015 at 04:19:54PM +, Serge Hallyn wrote: > > > Quoting Tycho Andersen (tycho.ander...@canonical.com): > > > > On Wed, Apr 15, 2015 at 03:48:10PM +, Serge Hallyn wrote: > > > > > Quoting Tycho Andersen (tycho.ander...@canonical.com): > > > > > > CRIU now supports autodetection of external mounts via the > > > > > > --ext-mount-map auto > > > > > > --enable-external-sharing --enable-external-masters options, so we > > > > > > don't need > > > > > > to explicitly pass the cgmanager mount or any of the mounts from > > > > > > the config. > > > > > > This also means that lxcfs mounts (since they are bind mounts from > > > > > > outside the > > > > > > container) are autodetected, meaning that c/r of containers using > > > > > > lxcfs works. > > > > > > > > > > > > A further advantage of this patch is that it addresses some of the > > > > > > ugliness > > > > > > that was in the exec_criu() function. There are other criu options > > > > > > that will > > > > > > allow us to trim this even further, though. > > > > > > > > > > > > Finally, with --enable-external-masters, criu understands slave > > > > > > mounts in the > > > > > > container with shared mounts in the peer group that are outside the > > > > > > namespace. > > > > > > This allows containers on a systemd host to be dumped and restored > > > > > > correctly. > > > > > > > > > > > > However, these options have just landed in criu trunk today, and > > > > > > the next > > > > > > tagged release will be 1.6 on June 1, so we should avoid merging > > > > > > this into any > > > > > > > > > > Ouch, June 1, that's a ways away. > > > > > > > > > > > stable releases until then. > > > > > > > > > > Should there be a lxc_check_criu_version() fn, updated in cases like > > > > > these, > > > > > which ensures that criu is new enough version for the lxc version? > > > > > > > > Yes, I was thinking about this, the only question is how to detect it; > > > > I can fork() and just pass --version, but then on every checkpoint or > > > > restore we have an extra fork() in the critical path. Unfortunately, I > > > > > > The critical path of checkpoint/restore? Isn't that as speedy as > > > mollasses > > > now? > > > > A fair point :) > > > > > You could also cache the result in a global variable (-1 by default) > > > so only the first lxcapi_checkpoint or lxcapi_restart will invoke the > > > penalty. > > > > Right, but in the case of e.g. checkpoint it will only ever be called > > That's "/usr/bin/lxc-checkpoint". But something like lxd using the lxc > API may benefit. Ah, good point. > > once, because as soon as you invoke checkpoint the monitor process > > Oh I was thinking the check would be done in the lxcapi_checkpoint, not > in the container monitor. Maybe I'm misunderstanding the current > c/r architecture. Actually it might be me misunderstanding. I thought lxcapi_checkpoint was run from the monitor process, but I just realized that it's not. Sorry for the noise. Tycho > Note also that when criu development slows down we can probably drop that > check. > ___ > lxc-devel mailing list > lxc-devel@lists.linuxcontainers.org > http://lists.linuxcontainers.org/listinfo/lxc-devel ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
Re: [lxc-devel] [PATCH] c/r: rework external mountpoint handling v2
On Wed, Apr 15, 2015 at 04:19:54PM +, Serge Hallyn wrote: > Quoting Tycho Andersen (tycho.ander...@canonical.com): > > On Wed, Apr 15, 2015 at 03:48:10PM +, Serge Hallyn wrote: > > > Quoting Tycho Andersen (tycho.ander...@canonical.com): > > > > CRIU now supports autodetection of external mounts via the > > > > --ext-mount-map auto > > > > --enable-external-sharing --enable-external-masters options, so we > > > > don't need > > > > to explicitly pass the cgmanager mount or any of the mounts from the > > > > config. > > > > This also means that lxcfs mounts (since they are bind mounts from > > > > outside the > > > > container) are autodetected, meaning that c/r of containers using lxcfs > > > > works. > > > > > > > > A further advantage of this patch is that it addresses some of the > > > > ugliness > > > > that was in the exec_criu() function. There are other criu options that > > > > will > > > > allow us to trim this even further, though. > > > > > > > > Finally, with --enable-external-masters, criu understands slave mounts > > > > in the > > > > container with shared mounts in the peer group that are outside the > > > > namespace. > > > > This allows containers on a systemd host to be dumped and restored > > > > correctly. > > > > > > > > However, these options have just landed in criu trunk today, and the > > > > next > > > > tagged release will be 1.6 on June 1, so we should avoid merging this > > > > into any > > > > > > Ouch, June 1, that's a ways away. > > > > > > > stable releases until then. > > > > > > Should there be a lxc_check_criu_version() fn, updated in cases like > > > these, > > > which ensures that criu is new enough version for the lxc version? > > > > Yes, I was thinking about this, the only question is how to detect it; > > I can fork() and just pass --version, but then on every checkpoint or > > restore we have an extra fork() in the critical path. Unfortunately, I > > The critical path of checkpoint/restore? Isn't that as speedy as mollasses > now? A fair point :) > You could also cache the result in a global variable (-1 by default) > so only the first lxcapi_checkpoint or lxcapi_restart will invoke the > penalty. Right, but in the case of e.g. checkpoint it will only ever be called once, because as soon as you invoke checkpoint the monitor process dies. Anyway, I think I'll just check it in the critical path. When one extra fork is a performance bottleneck for migration I'll probably be rotting away peacefully in a pine box somewhere :) Tycho > > think that may be our best option right now. If you're ok with the > > forks (I am), I can send a patch for a version check. > > > > > Given the heavy criu development I do NOT think we should fallback to > > > different behavior on older versions - just require the minversion we > > > need and fail with a nice informational message otherwise. > > > > Yep, agreed. > > > > > > v2: remount / as private before bind mounting the container's directory > > > > for > > > > > > It's probably a whacky enough case that lxc+criu would fail anyway, but > > > it *is* possible for / to not be a mountpoint. Init could chroot you > > > instead of pivot-rooting. In that case you'd have to bind-mount / onto > > > itself before doing the MS_PRIVATE remount. > > > > > > I'm also kind of surprised it worked without a MS_REMOUNT flag, but > > > > > > > Yes, in fact criu doesn't pass MS_REMOUNT most of the time, nor does > > LXC when re-mounting / for startup. > > > > > > criu. The problem here is that if / is mounted as shared, even if we > > > > unshare() the /var/lib/lxc/rootfs mountpoint propagates outside of > > > > our > > > > mount namespace, which is bad, since we don't want to leak mounts. > > > > In > > > > particular, this leak confuses criu the second time it goes to > > > > checkpoint > > > > the container. > > > > > > > > Signed-off-by: Tycho Andersen > > > > > > Acked-by: Serge E. Hallyn > > > > Thanks, although there is a v3 of this patch that I see you replied > > to, I'll try and address those concerns there. > > > > > But where do we keep this? We need a github cronjob to post merge request > > > on a given date. > > > > I was thinking we could just merge it into master since we have the > > stable-1.1 branch that we're tagging 1.1 stable release from. > > > > Tycho > > ___ > > lxc-devel mailing list > > lxc-devel@lists.linuxcontainers.org > > http://lists.linuxcontainers.org/listinfo/lxc-devel > ___ > lxc-devel mailing list > lxc-devel@lists.linuxcontainers.org > http://lists.linuxcontainers.org/listinfo/lxc-devel ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
Re: [lxc-devel] [PATCH 1/2] c/r: rework external mountpoint handling v3
On Wed, Apr 15, 2015 at 03:57:32PM +, Serge Hallyn wrote: > Quoting Tycho Andersen (tycho.ander...@canonical.com): > > CRIU now supports autodetection of external mounts via the --ext-mount-map > > auto > > --enable-external-sharing --enable-external-masters options, so we don't > > need > > to explicitly pass the cgmanager mount or any of the mounts from the config. > > This also means that lxcfs mounts (since they are bind mounts from outside > > the > > container) are autodetected, meaning that c/r of containers using lxcfs > > works. > > > > A further advantage of this patch is that it addresses some of the ugliness > > that was in the exec_criu() function. There are other criu options that will > > allow us to trim this even further, though. > > > > Finally, with --enable-external-masters, criu understands slave mounts in > > the > > container with shared mounts in the peer group that are outside the > > namespace. > > This allows containers on a systemd host to be dumped and restored > > correctly. > > > > However, these options have just landed in criu trunk today, and the next > > tagged release will be 1.6 on June 1, so we should avoid merging this into > > any > > stable releases until then. > > > > v2: remount / as private before bind mounting the container's directory for > > criu. The problem here is that if / is mounted as shared, even if we > > unshare() the /var/lib/lxc/rootfs mountpoint propagates outside of our > > mount namespace, which is bad, since we don't want to leak mounts. In > > particular, this leak confuses criu the second time it goes to > > checkpoint > > the container. > > > > v3: whoops, we really want / as MS_SLAVE | MS_REC here, to match what start > > does > > Can you justify using MS_SLAVE? (I'm not opposed, just wondering whether > there are specific reasons) Because that's how setup_rootfs() does it for a regular start, so I wanted to match that on restore: https://github.com/lxc/lxc/blob/master/src/lxc/conf.c#L1256 (which also incedentally doesn't use MS_REMOUNT :) > We probably do want MS_SLAVE so that any umount -l from the host is less > likely to leave a busy mount in the container pinning a device on the host. Yep, agreed, either private or slave makes sense, and I figured it's probably best to have start/restore match. > re MS_REC I was thinking you were doing this bc / was going to be bind-mounted > into the container, but that's not the case, is it? If you're doing this > so we can mount to our heart's content later on without worrying about the > host, then MS_REC is needed. Right, it's the latter. We don't want to pollute the host's mount ns just because of sharing. Also, I see you acked v2, if the above is ok can you ack this one instead? :) Tycho > > Signed-off-by: Tycho Andersen > > --- > > src/lxc/lxccontainer.c | 90 > > +++--- > > 1 file changed, 20 insertions(+), 70 deletions(-) > > > > diff --git a/src/lxc/lxccontainer.c b/src/lxc/lxccontainer.c > > index 3c3ff33..db11947 100644 > > --- a/src/lxc/lxccontainer.c > > +++ b/src/lxc/lxccontainer.c > > @@ -3708,18 +3708,18 @@ struct criu_opts { > > static void exec_criu(struct criu_opts *opts) > > { > > char **argv, log[PATH_MAX]; > > - int static_args = 14, argc = 0, i, ret; > > + int static_args = 18, argc = 0, i, ret; > > int netnr = 0; > > struct lxc_list *it; > > > > - struct mntent mntent; > > char buf[4096]; > > FILE *mnts = NULL; > > > > /* The command line always looks like: > > * criu $(action) --tcp-established --file-locks --link-remap > > --force-irmap \ > > * --manage-cgroups action-script foo.sh -D $(directory) \ > > -* -o $(directory)/$(action).log > > +* -o $(directory)/$(action).log --ext-mount-map auto > > +* --enable-external-sharing --enable-external-masters > > * +1 for final NULL */ > > > > if (strcmp(opts->action, "dump") == 0) { > > @@ -3746,11 +3746,6 @@ static void exec_criu(struct criu_opts *opts) > > return; > > } > > > > - // We need to tell criu where cgmanager's socket is bind mounted from > > - // if it exists since it's external. > > - if (cgroup_driver() == CGMANAGER) > > - static_args+=2; > > - > > argv = malloc(static_args * sizeof(*argv)); > > if (!argv) &g
Re: [lxc-devel] [PATCH] c/r: rework external mountpoint handling v2
On Wed, Apr 15, 2015 at 03:48:10PM +, Serge Hallyn wrote: > Quoting Tycho Andersen (tycho.ander...@canonical.com): > > CRIU now supports autodetection of external mounts via the --ext-mount-map > > auto > > --enable-external-sharing --enable-external-masters options, so we don't > > need > > to explicitly pass the cgmanager mount or any of the mounts from the config. > > This also means that lxcfs mounts (since they are bind mounts from outside > > the > > container) are autodetected, meaning that c/r of containers using lxcfs > > works. > > > > A further advantage of this patch is that it addresses some of the ugliness > > that was in the exec_criu() function. There are other criu options that will > > allow us to trim this even further, though. > > > > Finally, with --enable-external-masters, criu understands slave mounts in > > the > > container with shared mounts in the peer group that are outside the > > namespace. > > This allows containers on a systemd host to be dumped and restored > > correctly. > > > > However, these options have just landed in criu trunk today, and the next > > tagged release will be 1.6 on June 1, so we should avoid merging this into > > any > > Ouch, June 1, that's a ways away. > > > stable releases until then. > > Should there be a lxc_check_criu_version() fn, updated in cases like these, > which ensures that criu is new enough version for the lxc version? Yes, I was thinking about this, the only question is how to detect it; I can fork() and just pass --version, but then on every checkpoint or restore we have an extra fork() in the critical path. Unfortunately, I think that may be our best option right now. If you're ok with the forks (I am), I can send a patch for a version check. > Given the heavy criu development I do NOT think we should fallback to > different behavior on older versions - just require the minversion we > need and fail with a nice informational message otherwise. Yep, agreed. > > v2: remount / as private before bind mounting the container's directory for > > It's probably a whacky enough case that lxc+criu would fail anyway, but > it *is* possible for / to not be a mountpoint. Init could chroot you > instead of pivot-rooting. In that case you'd have to bind-mount / onto > itself before doing the MS_PRIVATE remount. > > I'm also kind of surprised it worked without a MS_REMOUNT flag, but Yes, in fact criu doesn't pass MS_REMOUNT most of the time, nor does LXC when re-mounting / for startup. > > criu. The problem here is that if / is mounted as shared, even if we > > unshare() the /var/lib/lxc/rootfs mountpoint propagates outside of our > > mount namespace, which is bad, since we don't want to leak mounts. In > > particular, this leak confuses criu the second time it goes to > > checkpoint > > the container. > > > > Signed-off-by: Tycho Andersen > > Acked-by: Serge E. Hallyn Thanks, although there is a v3 of this patch that I see you replied to, I'll try and address those concerns there. > But where do we keep this? We need a github cronjob to post merge request > on a given date. I was thinking we could just merge it into master since we have the stable-1.1 branch that we're tagging 1.1 stable release from. Tycho ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] [PATCH 1/2] c/r: rework external mountpoint handling v3
CRIU now supports autodetection of external mounts via the --ext-mount-map auto --enable-external-sharing --enable-external-masters options, so we don't need to explicitly pass the cgmanager mount or any of the mounts from the config. This also means that lxcfs mounts (since they are bind mounts from outside the container) are autodetected, meaning that c/r of containers using lxcfs works. A further advantage of this patch is that it addresses some of the ugliness that was in the exec_criu() function. There are other criu options that will allow us to trim this even further, though. Finally, with --enable-external-masters, criu understands slave mounts in the container with shared mounts in the peer group that are outside the namespace. This allows containers on a systemd host to be dumped and restored correctly. However, these options have just landed in criu trunk today, and the next tagged release will be 1.6 on June 1, so we should avoid merging this into any stable releases until then. v2: remount / as private before bind mounting the container's directory for criu. The problem here is that if / is mounted as shared, even if we unshare() the /var/lib/lxc/rootfs mountpoint propagates outside of our mount namespace, which is bad, since we don't want to leak mounts. In particular, this leak confuses criu the second time it goes to checkpoint the container. v3: whoops, we really want / as MS_SLAVE | MS_REC here, to match what start does Signed-off-by: Tycho Andersen --- src/lxc/lxccontainer.c | 90 +++--- 1 file changed, 20 insertions(+), 70 deletions(-) diff --git a/src/lxc/lxccontainer.c b/src/lxc/lxccontainer.c index 3c3ff33..db11947 100644 --- a/src/lxc/lxccontainer.c +++ b/src/lxc/lxccontainer.c @@ -3708,18 +3708,18 @@ struct criu_opts { static void exec_criu(struct criu_opts *opts) { char **argv, log[PATH_MAX]; - int static_args = 14, argc = 0, i, ret; + int static_args = 18, argc = 0, i, ret; int netnr = 0; struct lxc_list *it; - struct mntent mntent; char buf[4096]; FILE *mnts = NULL; /* The command line always looks like: * criu $(action) --tcp-established --file-locks --link-remap --force-irmap \ * --manage-cgroups action-script foo.sh -D $(directory) \ -* -o $(directory)/$(action).log +* -o $(directory)/$(action).log --ext-mount-map auto +* --enable-external-sharing --enable-external-masters * +1 for final NULL */ if (strcmp(opts->action, "dump") == 0) { @@ -3746,11 +3746,6 @@ static void exec_criu(struct criu_opts *opts) return; } - // We need to tell criu where cgmanager's socket is bind mounted from - // if it exists since it's external. - if (cgroup_driver() == CGMANAGER) - static_args+=2; - argv = malloc(static_args * sizeof(*argv)); if (!argv) return; @@ -3780,6 +3775,10 @@ static void exec_criu(struct criu_opts *opts) DECLARE_ARG("--link-remap"); DECLARE_ARG("--force-irmap"); DECLARE_ARG("--manage-cgroups"); + DECLARE_ARG("--ext-mount-map"); + DECLARE_ARG("auto"); + DECLARE_ARG("--enable-external-sharing"); + DECLARE_ARG("--enable-external-masters"); DECLARE_ARG("--action-script"); DECLARE_ARG(DATADIR "/lxc/lxc-restore-net"); DECLARE_ARG("-D"); @@ -3790,35 +3789,9 @@ static void exec_criu(struct criu_opts *opts) if (opts->verbose) DECLARE_ARG("-vv"); - /* -* Note: this macro is not intended to be called unless argc is equal -* to the length of the array; there is nothing that keeps track of the -* length of the array besides the location in the code that this is -* called. (Yes this is bad, and we should fix it.) -*/ -#define RESIZE_ARGS(additional) \ - do { \ - void *m; \ - if (additional < 0) { \ - ERROR("resizing by negative amount"); \ - goto err; \ - } else if (additional == 0) \ - continue; \ - \ - m = realloc(argv, (argc + additional + 1) * sizeof(*argv)); \ - if (!m)
[lxc-devel] [PATCH] don't compare unsigned values as negative ones
Instead, check that the result is larger than its parts. Signed-off-by: Tycho Andersen --- src/lxc/utils.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/src/lxc/utils.c b/src/lxc/utils.c index 084b556..fe71e9a 100644 --- a/src/lxc/utils.c +++ b/src/lxc/utils.c @@ -1645,12 +1645,13 @@ int setproctitle(char *title) env_start = env_end; } + arg_end = arg_start + len; + /* check overflow */ - if (arg_start + len < 0) { + if (arg_end < len || arg_end < arg_start) { return -1; } - arg_end = arg_start + len; } strcpy((char*)arg_start, title); -- 2.1.0 ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
Re: [lxc-devel] [PATCH 6/6] c/r: use mkstemp to get a pidfile name
On Mon, Apr 13, 2015 at 04:52:37PM -0500, Stéphane Graber wrote: > On Mon, Apr 13, 2015 at 06:07:05PM +0000, Tycho Andersen wrote: > > This is more secure than tempnam(). > > No such thing as mkstemp on Android unfortunately. Yep, sorry. I think we can drop both the temp file patches and just take the other four. Neither of them are really security related, other than getting rid of coverity complaints. Tycho > > > > Signed-off-by: Tycho Andersen > > --- > > src/lxc/lxccontainer.c | 30 +++--- > > 1 file changed, 27 insertions(+), 3 deletions(-) > > > > diff --git a/src/lxc/lxccontainer.c b/src/lxc/lxccontainer.c > > index 5b96b8c..8424cf6 100644 > > --- a/src/lxc/lxccontainer.c > > +++ b/src/lxc/lxccontainer.c > > @@ -4128,12 +4128,30 @@ out_unlock: > > static void do_restore(struct lxc_container *c, int pipe, char *directory, > > bool verbose) > > { > > pid_t pid; > > - char pidfile[L_tmpnam]; > > + char pidfile[sizeof(P_tmpdir) + 25]; > > struct lxc_handler *handler; > > - int status; > > + int status, ret; > > + > > + ret = snprintf(pidfile, sizeof(pidfile), "%s/lxc_criu_pidfile.XX", > > P_tmpdir); > > + if (ret < 0 || ret >= sizeof(pidfile)) > > + goto out; > > + > > + /* > > +* Here, we simply use mkstemp to acquire a secure tmpfile name. CRIU > > +* tries to create the pidfile with O_CREAT | O_EXCL, so we need to > > +* remove it before calling criu. > > +*/ > > + ret = mkstemp(pidfile); > > + if (ret < 0) { > > + SYSERROR("failed to create pidfile"); > > + goto out; > > + } > > > > - if (!tmpnam(pidfile)) > > + close(ret); > > + if (remove(pidfile) < 0) { > > + SYSERROR("failed to remove pidfile"); > > goto out; > > + } > > > > handler = lxc_init(c->name, c->lxc_conf, c->config_path); > > if (!handler) > > @@ -4231,6 +4249,12 @@ static void do_restore(struct lxc_container *c, int > > pipe, char *directory, bool > > > > ret = fscanf(f, "%d", (int*) &handler->pid); > > fclose(f); > > + > > + if (remove(pidfile) < 0) { > > + SYSERROR("failed to remove pidfile"); > > + goto out_fini_handler; > > + } > > + > > if (ret != 1) { > > ERROR("reading restore pid failed"); > > goto out_fini_handler; > > -- > > 2.1.4 > > > > ___ > > lxc-devel mailing list > > lxc-devel@lists.linuxcontainers.org > > http://lists.linuxcontainers.org/listinfo/lxc-devel > > -- > Stéphane Graber > Ubuntu developer > http://www.ubuntu.com > ___ > lxc-devel mailing list > lxc-devel@lists.linuxcontainers.org > http://lists.linuxcontainers.org/listinfo/lxc-devel ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel