Re: [lxc-devel] [PATCH] c/r: remember to chown the cgroup path (correctly)
On Wed, Jan 13, 2016 at 09:47:50PM +, Serge Hallyn wrote: > Quoting Tycho Andersen (tycho.ander...@canonical.com): > > 1. remember to chown the cgroup path when migrating a container > > 2. when restoring the cgroup path, try to compute the euid for root vs. > >using geteuid(); geteuid works for start, but it doesn't work for > >migration since we're still real root at that point. > > > > Signed-off-by: Tycho Andersen> > --- > > src/lxc/cgmanager.c | 6 +- > > src/lxc/criu.c | 5 + > > 2 files changed, 10 insertions(+), 1 deletion(-) > > > > diff --git a/src/lxc/cgmanager.c b/src/lxc/cgmanager.c > > index 357182a..54e6912 100644 > > --- a/src/lxc/cgmanager.c > > +++ b/src/lxc/cgmanager.c > > @@ -488,7 +488,11 @@ static bool chown_cgroup(const char *cgroup_path, > > struct lxc_conf *conf) > > return true; > > > > data.cgroup_path = cgroup_path; > > - data.origuid = geteuid(); > > + data.origuid = mapped_hostid(0, conf, ID_TYPE_UID); > > + if (data.origuid < 0) { > > Can you confirm that this does not break > > sudo lxc-create -t download -n x1 -- -d ubuntu -r trusty -a amd64 > sudo lxc-start -n x1 > > Because in that case I think we have no mappings, and mapped_hostid() will > return -1. You can't see it in the patch, but just above this is a lxc_list_empty() test, and this whole path isn't executed if lxc_list_empty() is true, so I think it should be ok. Tycho > > + ERROR("failed to get mapped root id"); > > + return false; > > + } > > > > /* Unpriv users can't chown it themselves, so chown from > > * a child namespace mapping both our own and the target uid > > diff --git a/src/lxc/criu.c b/src/lxc/criu.c > > index 6ef4905..f442612 100644 > > --- a/src/lxc/criu.c > > +++ b/src/lxc/criu.c > > @@ -466,6 +466,11 @@ void do_restore(struct lxc_container *c, int pipe, > > char *directory, bool verbose > > goto out_fini_handler; > > } > > > > + if (!cgroup_chown(handler)) { > > + ERROR("failed creating groups"); > > + goto out_fini_handler; > > + } > > + > > if (!restore_net_info(c)) { > > ERROR("failed restoring network info"); > > goto out_fini_handler; > > -- > > 2.6.4 > > > > ___ > > lxc-devel mailing list > > lxc-devel@lists.linuxcontainers.org > > http://lists.linuxcontainers.org/listinfo/lxc-devel > ___ > lxc-devel mailing list > lxc-devel@lists.linuxcontainers.org > http://lists.linuxcontainers.org/listinfo/lxc-devel ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
Re: [lxc-devel] [PATCH] c/r: remember to chown the cgroup path (correctly)
Quoting Tycho Andersen (tycho.ander...@canonical.com): > On Wed, Jan 13, 2016 at 09:47:50PM +, Serge Hallyn wrote: > > Quoting Tycho Andersen (tycho.ander...@canonical.com): > > > 1. remember to chown the cgroup path when migrating a container > > > 2. when restoring the cgroup path, try to compute the euid for root vs. > > >using geteuid(); geteuid works for start, but it doesn't work for > > >migration since we're still real root at that point. > > > > > > Signed-off-by: Tycho Andersen> > > --- > > > src/lxc/cgmanager.c | 6 +- > > > src/lxc/criu.c | 5 + > > > 2 files changed, 10 insertions(+), 1 deletion(-) > > > > > > diff --git a/src/lxc/cgmanager.c b/src/lxc/cgmanager.c > > > index 357182a..54e6912 100644 > > > --- a/src/lxc/cgmanager.c > > > +++ b/src/lxc/cgmanager.c > > > @@ -488,7 +488,11 @@ static bool chown_cgroup(const char *cgroup_path, > > > struct lxc_conf *conf) > > > return true; > > > > > > data.cgroup_path = cgroup_path; > > > - data.origuid = geteuid(); > > > + data.origuid = mapped_hostid(0, conf, ID_TYPE_UID); now, when starting a container, this happens in the parent task in original uid. So the geteuid() returns 1000, mapped_hostid(0, conf, ID_TYPE_UID) something like 10. This is probably ok - but did you run all the lxc tests against this to make sure? You can still run 'lxc-cgroup' etc as an unpriv user? > > > + if (data.origuid < 0) { > > > > Can you confirm that this does not break > > > > sudo lxc-create -t download -n x1 -- -d ubuntu -r trusty -a amd64 > > sudo lxc-start -n x1 > > > > Because in that case I think we have no mappings, and mapped_hostid() will > > return -1. > > You can't see it in the patch, but just above this is a D'oh! I even looked for such a thing this morning but missed it. > lxc_list_empty() test, and this whole path isn't executed if > lxc_list_empty() is true, so I think it should be ok. > > Tycho > > > > + ERROR("failed to get mapped root id"); > > > + return false; > > > + } > > > > > > /* Unpriv users can't chown it themselves, so chown from > > >* a child namespace mapping both our own and the target uid > > > diff --git a/src/lxc/criu.c b/src/lxc/criu.c > > > index 6ef4905..f442612 100644 > > > --- a/src/lxc/criu.c > > > +++ b/src/lxc/criu.c > > > @@ -466,6 +466,11 @@ void do_restore(struct lxc_container *c, int pipe, > > > char *directory, bool verbose > > > goto out_fini_handler; > > > } > > > > > > + if (!cgroup_chown(handler)) { > > > + ERROR("failed creating groups"); > > > + goto out_fini_handler; > > > + } > > > + > > > if (!restore_net_info(c)) { > > > ERROR("failed restoring network info"); > > > goto out_fini_handler; > > > -- > > > 2.6.4 > > > > > > ___ > > > lxc-devel mailing list > > > lxc-devel@lists.linuxcontainers.org > > > http://lists.linuxcontainers.org/listinfo/lxc-devel > > ___ > > lxc-devel mailing list > > lxc-devel@lists.linuxcontainers.org > > http://lists.linuxcontainers.org/listinfo/lxc-devel > ___ > lxc-devel mailing list > lxc-devel@lists.linuxcontainers.org > http://lists.linuxcontainers.org/listinfo/lxc-devel ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
Re: [lxc-devel] [PATCH] .gitignore: add sparclinux make output
On Wed, Jan 13, 2016 at 10:20:12PM +, Serge Hallyn wrote: > Quoting Wim Coekaerts (wim.coekae...@oracle.com): > > On 1/13/16 1:50 PM, Serge Hallyn wrote: > > >Quoting Tycho Andersen (tycho.ander...@canonical.com): > > >>Signed-off-by: Tycho Andersen> > >Acked-by: Serge E. Hallyn > > > > > >>--- > > >> .gitignore | 1 + > > >> 1 file changed, 1 insertion(+) > > >> > > >>diff --git a/.gitignore b/.gitignore > > >>index 5e4912c..58e5dea 100644 > > >>--- a/.gitignore > > >>+++ b/.gitignore > > >>@@ -41,6 +41,7 @@ templates/lxc-opensuse > > >> templates/lxc-oracle > > >> templates/lxc-plamo > > >> templates/lxc-slackware > > >>+templates/lxc-sparclinux > > >> templates/lxc-sshd > > >> templates/lxc-ubuntu > > >> templates/lxc-ubuntu-cloud > > ugh do I feel stupid ! > > nonsense - we all do it. Indeed, please don't feel stupid :) Tycho ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
Re: [lxc-devel] [PATCH v2] safe_mount: Handle mounting proc and refactor
On 14.01.2016 01:09, Serge Hallyn wrote: > Quoting Bogdan Purcareata (bogdan.purcare...@nxp.com): >> On 11.01.2016 20:59, Serge Hallyn wrote: >>> Quoting Bogdan Purcareata (bogdan.purcare...@nxp.com): The safe_mount primitive will mount the fs in the new container environment by using file descriptors referred in /proc/self/fd. However, when the mounted filesystem is proc itself, it will have been previously unmounted, therefore resulting in an error when searching for these file descriptors. This only happens when there's no container rootfs prefix (commonly with lxc-execute). Implement the support for this use case as well, by doing the mount based on the full path. Refactor the whole function in order to remove duplicated code checks and improve readability. Changes since v1: - In order to address CVE-2015-1335, still check if the destination is not a symlink. Do the mount only if the destination file descriptor exists. Signed-off-by: Bogdan Purcareata--- src/lxc/utils.c | 49 - 1 file changed, 28 insertions(+), 21 deletions(-) diff --git a/src/lxc/utils.c b/src/lxc/utils.c index d9e769d..c53711a 100644 --- a/src/lxc/utils.c +++ b/src/lxc/utils.c @@ -1644,9 +1644,9 @@ out: int safe_mount(const char *src, const char *dest, const char *fstype, unsigned long flags, const void *data, const char *rootfs) { - int srcfd = -1, destfd, ret, saved_errno; + int srcfd = -1, destfd = -1, ret = 0; char srcbuf[50], destbuf[50]; // only needs enough for /proc/self/fd/ - const char *mntsrc = src; + const char *mntsrc = src, *mntdest = dest; if (!rootfs) rootfs = ""; @@ -1655,45 +1655,52 @@ int safe_mount(const char *src, const char *dest, const char *fstype, if (flags & MS_BIND && src && src[0] != '/') { INFO("this is a relative bind mount"); srcfd = open_without_symlink(src, NULL); - if (srcfd < 0) - return srcfd; + if (srcfd < 0) { + ret = srcfd; + goto out; + } ret = snprintf(srcbuf, 50, "/proc/self/fd/%d", srcfd); if (ret < 0 || ret > 50) { - close(srcfd); ERROR("Out of memory"); - return -EINVAL; + ret = -EINVAL; + goto out_src; } mntsrc = srcbuf; } destfd = open_without_symlink(dest, rootfs); if (destfd < 0) { - if (srcfd != -1) - close(srcfd); - return destfd; + ret = destfd; + goto out_src; } ret = snprintf(destbuf, 50, "/proc/self/fd/%d", destfd); if (ret < 0 || ret > 50) { - if (srcfd != -1) - close(srcfd); - close(destfd); ERROR("Out of memory"); - return -EINVAL; + ret = -EINVAL; + goto out_dest; } - ret = mount(mntsrc, destbuf, fstype, flags, data); - saved_errno = errno; - if (srcfd != -1) - close(srcfd); - close(destfd); + /* make sure the destination descriptor exists */ + if (access(destbuf, F_OK) == 0) + mntdest = destbuf; >>> First, if we're going to shortcut I'd prefer to say "if /proc/self >>> does not exist then skip this check" fo rnow. >>> But can we think of any way to still do this check? >>> >>> What exactly are the cases? >>> >>> 1. lxc-execute, lxc-init tries to mount /proc. We should be able >>> to simply have lxc always mount /proc before the pivot_root, so >>> we can properly do this check. >>> >>> what use-cases will break if we demand /proc to exist in the >>> container? (We can add an option to umount /proc in lxc-init, >>> but the directory would have to exist) >> That's my use case - the failing function is tmp_proc_mount, thus happening > > Wait, is your rootfs NULL or not? Yep, it's NULL - plain "lxc-execute -n foo -f empty.conf -- bash". > We can handle that case specially. > > mount_proc_if_needed() is only called from tmp_proc_mount(), which is only > called from lxc_setup(). If rootfs is NULL, then all bets are off anyway > and we can just call mount(). If rootfs is not NULL, then we're mounting > in a separate container rootfs and we should use safe_mount(), but we > expect the real /proc to then exist. In that case, would the following fix be more
[lxc-devel] [RFC 0/2] Feature: --start-frozen
Patch 1: This adds the possibility to start a container in a frozen state via a --start-frozen parameter added to lxc-start. I'm not sure the implementation is ideal. There's probably some reusable cgroup code somewhere, but as far as I could tell it is mostly meant for the monitor, so I don't know which parts I can use within the child process. This can be useful for when you want to get a fully prepared and mounted namespace without running the container itself while having the changes happening in hooks applied. In our case mount hooks, in order to provide ways to push/pull files to/from the container or mount it without things like multiple-mount-protection preventing you from firing up the container before it's unmounted again. We're still weighing some other ways we could go but I thought I'd get some input about this patch since it doesn't seem to be too specific to go upstream. I've considered a few alternatives to this but I'm not sure if there are any good ones. I've considered an lxc.cgroup.freezer.state config entry, but that happens too early and lxc-start won't exit until the container is unfrozen. Same problem with a start-hook, beside the fact that it must be present in the container. (In theory one could open a handle to an executable which waits for a signal and pass /proc/self/fd/$handle as hook, but --daemonize always closes them.) The semantics of the the start hook is another part to consider. (Ignore? Add a parameter? Run later? Split?) Patch 2: comment fixups For the new function I used 'set_' instead of 'want_' as prefix to make their purpose more obvious. Should I change it to 'want_' or maybe change the cleanup patch to also rename the other functions to 'set_*'? Wolfgang Bumiller (2): lxc-start: added --start-frozen cleanup: lxc_container::want_* comment descriptions doc/lxc-start.sgml.in | 12 src/lxc/arguments.h| 3 +++ src/lxc/conf.h | 1 + src/lxc/lxc_start.c| 7 +++ src/lxc/lxccontainer.c | 16 src/lxc/lxccontainer.h | 19 ++- src/lxc/start.c| 17 + 7 files changed, 70 insertions(+), 5 deletions(-) -- 2.1.4 ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] [RFC 1/2] lxc-start: added --start-frozen
Add the possibility to start a container in a frozen state. Signed-off-by: Wolfgang Bumiller--- doc/lxc-start.sgml.in | 12 src/lxc/arguments.h| 3 +++ src/lxc/conf.h | 1 + src/lxc/lxc_start.c| 7 +++ src/lxc/lxccontainer.c | 16 src/lxc/lxccontainer.h | 10 ++ src/lxc/start.c| 17 + 7 files changed, 66 insertions(+) diff --git a/doc/lxc-start.sgml.in b/doc/lxc-start.sgml.in index 1770ac2..46bfcc7 100644 --- a/doc/lxc-start.sgml.in +++ b/doc/lxc-start.sgml.in @@ -59,6 +59,7 @@ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA -s KEY=VAL -C --share-[net|ipc|uts] name|pid + --start-frozen command @@ -245,6 +246,17 @@ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + + + --start-frozen + + + + Freeze the container before starting its init process. + + + + diff --git a/src/lxc/arguments.h b/src/lxc/arguments.h index 38cb0da..79a4758 100644 --- a/src/lxc/arguments.h +++ b/src/lxc/arguments.h @@ -81,6 +81,9 @@ struct lxc_arguments { /* close fds from parent? */ int close_all_fds; + /* freeze before starting init */ + int start_frozen; + /* lxc-create */ char *bdevtype, *configfile, *template; char *fstype; diff --git a/src/lxc/conf.h b/src/lxc/conf.h index b0274ec..5c84f5f 100644 --- a/src/lxc/conf.h +++ b/src/lxc/conf.h @@ -312,6 +312,7 @@ struct lxc_conf { struct lxc_rootfs rootfs; char *ttydir; int close_all_fds; + int start_frozen; struct lxc_list hooks[NUM_LXC_HOOKS]; char *lsm_aa_profile; diff --git a/src/lxc/lxc_start.c b/src/lxc/lxc_start.c index 6b942ac..73e27e3 100644 --- a/src/lxc/lxc_start.c +++ b/src/lxc/lxc_start.c @@ -53,6 +53,7 @@ #define OPT_SHARE_NET OPT_USAGE+1 #define OPT_SHARE_IPC OPT_USAGE+2 #define OPT_SHARE_UTS OPT_USAGE+3 +#define OPT_START_FROZEN OPT_USAGE+4 lxc_log_define(lxc_start_ui, lxc); @@ -154,6 +155,7 @@ static int my_parser(struct lxc_arguments* args, int c, char* arg) case OPT_SHARE_NET: args->share_ns[LXC_NS_NET] = arg; break; case OPT_SHARE_IPC: args->share_ns[LXC_NS_IPC] = arg; break; case OPT_SHARE_UTS: args->share_ns[LXC_NS_UTS] = arg; break; + case OPT_START_FROZEN: args->start_frozen = 1; break; } return 0; } @@ -170,6 +172,7 @@ static const struct option my_longopts[] = { {"share-net", required_argument, 0, OPT_SHARE_NET}, {"share-ipc", required_argument, 0, OPT_SHARE_IPC}, {"share-uts", required_argument, 0, OPT_SHARE_UTS}, + {"start-frozen", no_argument, 0, OPT_START_FROZEN}, LXC_COMMON_OPTIONS }; @@ -193,6 +196,7 @@ Options :\n\ Note: --daemon implies --close-all-fds\n\ -s, --define KEY=VAL Assign VAL to configuration variable KEY\n\ --share-[net|ipc|uts]=NAME Share a namespace with another container or pid\n\ + --start-frozen Freeze the container before starting its init\n\ ", .options = my_longopts, .parser= my_parser, @@ -335,6 +339,9 @@ int main(int argc, char *argv[]) if (my_args.close_all_fds) c->want_close_all_fds(c, true); + if (my_args.start_frozen) + c->set_start_frozen(c, true); + if (args == default_args) err = c->start(c, 0, NULL) ? 0 : 1; else diff --git a/src/lxc/lxccontainer.c b/src/lxc/lxccontainer.c index 802a930..166f507 100644 --- a/src/lxc/lxccontainer.c +++ b/src/lxc/lxccontainer.c @@ -590,6 +590,21 @@ static bool do_lxcapi_want_close_all_fds(struct lxc_container *c, bool state) WRAP_API_1(bool, lxcapi_want_close_all_fds, bool) +static bool do_lxcapi_set_start_frozen(struct lxc_container *c, bool state) +{ + if (!c || !c->lxc_conf) + return false; + if (container_mem_lock(c)) { + ERROR("Error getting mem lock"); + return false; + } + c->lxc_conf->start_frozen = state; + container_mem_unlock(c); + return true; +} + +WRAP_API_1(bool, lxcapi_set_start_frozen, bool) + static bool do_lxcapi_wait(struct lxc_container *c, const char *state, int timeout) { int ret; @@ -4074,6 +4089,7 @@ struct lxc_container *lxc_container_new(const char *name, const char *configpath c->load_config = lxcapi_load_config; c->want_daemonize = lxcapi_want_daemonize; c->want_close_all_fds = lxcapi_want_close_all_fds; + c->set_start_frozen = lxcapi_set_start_frozen; c->start = lxcapi_start; c->startl = lxcapi_startl; c->stop = lxcapi_stop; diff --git a/src/lxc/lxccontainer.h b/src/lxc/lxccontainer.h index 6d155a1..454f029 100644 --- a/src/lxc/lxccontainer.h +++
[lxc-devel] [RFC 2/2] cleanup: lxc_container::want_* comment descriptions
They change a value and return true on success rather than fetching the value as the comments previously suggested. Signed-off-by: Wolfgang Bumiller--- src/lxc/lxccontainer.h | 9 - 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/src/lxc/lxccontainer.h b/src/lxc/lxccontainer.h index 454f029..3ab258e 100644 --- a/src/lxc/lxccontainer.h +++ b/src/lxc/lxccontainer.h @@ -222,25 +222,24 @@ struct lxc_container { bool (*stop)(struct lxc_container *c); /*! -* \brief Determine if the container wants to run disconnected +* \brief Change whether the container wants to run disconnected * from the terminal. * * \param c Container. * \param state Value for the daemonize bit (0 or 1). * -* \return \c true if container wants to be daemonised, else \c false. +* \return \c true on success, else \c false. */ bool (*want_daemonize)(struct lxc_container *c, bool state); /*! -* \brief Determine whether container wishes all file descriptors +* \brief Change whether the container wishes all file descriptors * to be closed on startup. * * \param c Container. * \param state Value for the close_all_fds bit (0 or 1). * -* \return \c true if container wants all file descriptors closed, -* else \c false. +* \return \c true on success, else \c false. */ bool (*want_close_all_fds)(struct lxc_container *c, bool state); -- 2.1.4 ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
Re: [lxc-devel] [PATCH] c/r: remember to chown the cgroup path (correctly)
On Thu, Jan 14, 2016 at 09:28:07AM +, Serge Hallyn wrote: > Quoting Tycho Andersen (tycho.ander...@canonical.com): > > On Wed, Jan 13, 2016 at 09:47:50PM +, Serge Hallyn wrote: > > > Quoting Tycho Andersen (tycho.ander...@canonical.com): > > > > 1. remember to chown the cgroup path when migrating a container > > > > 2. when restoring the cgroup path, try to compute the euid for root vs. > > > >using geteuid(); geteuid works for start, but it doesn't work for > > > >migration since we're still real root at that point. > > > > > > > > Signed-off-by: Tycho Andersen> > > > --- > > > > src/lxc/cgmanager.c | 6 +- > > > > src/lxc/criu.c | 5 + > > > > 2 files changed, 10 insertions(+), 1 deletion(-) > > > > > > > > diff --git a/src/lxc/cgmanager.c b/src/lxc/cgmanager.c > > > > index 357182a..54e6912 100644 > > > > --- a/src/lxc/cgmanager.c > > > > +++ b/src/lxc/cgmanager.c > > > > @@ -488,7 +488,11 @@ static bool chown_cgroup(const char *cgroup_path, > > > > struct lxc_conf *conf) > > > > return true; > > > > > > > > data.cgroup_path = cgroup_path; > > > > - data.origuid = geteuid(); > > > > + data.origuid = mapped_hostid(0, conf, ID_TYPE_UID); > > now, when starting a container, this happens in the > parent task in original uid. So the geteuid() returns 1000, > mapped_hostid(0, conf, ID_TYPE_UID) something like 10. Are you sure? Wouldn't it chmod everything to 1000 instead of 10 in that case and be all screwed up? > This is probably ok - but did you run all the lxc tests against > this to make sure? You can still run 'lxc-cgroup' etc as an > unpriv user? Well, it does change the behavior in ways I don't quite understand yet, so that's probably worth talking about. Before this series (the 644 one and this patch), I get: /sys/fs/cgroup/cpuset/lxc/unpriv $ ls -alh total 0 drwxrwxr-x 2 root 10 0 Jan 14 06:25 ./ drwxr-xr-x 3 root root 0 Jan 14 06:25 ../ -rw-r--r-- 1 root root 0 Jan 14 06:25 cgroup.clone_children -rwxrwxr-x 1 root 10 0 Jan 14 06:26 cgroup.procs* -rw-r--r-- 1 root root 0 Jan 14 06:25 cpuset.cpu_exclusive -rw-r--r-- 1 root root 0 Jan 14 06:25 cpuset.cpus -r--r--r-- 1 root root 0 Jan 14 06:25 cpuset.effective_cpus -r--r--r-- 1 root root 0 Jan 14 06:25 cpuset.effective_mems -rw-r--r-- 1 root root 0 Jan 14 06:25 cpuset.mem_exclusive -rw-r--r-- 1 root root 0 Jan 14 06:25 cpuset.mem_hardwall -rw-r--r-- 1 root root 0 Jan 14 06:25 cpuset.memory_migrate -r--r--r-- 1 root root 0 Jan 14 06:25 cpuset.memory_pressure -rw-r--r-- 1 root root 0 Jan 14 06:25 cpuset.memory_spread_page -rw-r--r-- 1 root root 0 Jan 14 06:25 cpuset.memory_spread_slab -rw-r--r-- 1 root root 0 Jan 14 06:25 cpuset.mems -rw-r--r-- 1 root root 0 Jan 14 06:25 cpuset.sched_load_balance -rw-r--r-- 1 root root 0 Jan 14 06:25 cpuset.sched_relax_domain_level -rw-r--r-- 1 root root 0 Jan 14 06:25 notify_on_release -rwxrwxr-x 1 root 10 0 Jan 14 06:25 tasks* and after: /sys/fs/cgroup/cpuset/lxc/unpriv $ ls -alh total 0 drwxrwxr-x 2 10 10 0 Jan 14 06:27 ./ drwxr-xr-x 3 root root 0 Jan 14 06:27 ../ -rw-r--r-- 1 root root 0 Jan 14 06:27 cgroup.clone_children -rw-rw-r-- 1 10 10 0 Jan 14 06:27 cgroup.procs -rw-r--r-- 1 root root 0 Jan 14 06:27 cpuset.cpu_exclusive -rw-r--r-- 1 root root 0 Jan 14 06:27 cpuset.cpus -r--r--r-- 1 root root 0 Jan 14 06:27 cpuset.effective_cpus -r--r--r-- 1 root root 0 Jan 14 06:27 cpuset.effective_mems -rw-r--r-- 1 root root 0 Jan 14 06:27 cpuset.mem_exclusive -rw-r--r-- 1 root root 0 Jan 14 06:27 cpuset.mem_hardwall -rw-r--r-- 1 root root 0 Jan 14 06:27 cpuset.memory_migrate -r--r--r-- 1 root root 0 Jan 14 06:27 cpuset.memory_pressure -rw-r--r-- 1 root root 0 Jan 14 06:27 cpuset.memory_spread_page -rw-r--r-- 1 root root 0 Jan 14 06:27 cpuset.memory_spread_slab -rw-r--r-- 1 root root 0 Jan 14 06:27 cpuset.mems -rw-r--r-- 1 root root 0 Jan 14 06:27 cpuset.sched_load_balance -rw-r--r-- 1 root root 0 Jan 14 06:27 cpuset.sched_relax_domain_level -rw-r--r-- 1 root root 0 Jan 14 06:27 notify_on_release -rw-rw-r-- 1 10 10 0 Jan 14 06:27 tasks However, it's not clear really /why/ from this patch the uid on tasks and cgroup.progs gets changed as well (probably some mask bug? it was ID_TYPE_UID before but changing the gid? not sure). So, what's the correct behavior here? Should we leave the uid on the files as real root, or chmod those to container root as well? (if it's the latter, i.e. the behavior introduced by this patch, it's probably still worth understanding if the former works, just to make sure that there are no other latent bugs; I don't mind figuring that out) Tycho > > > > + if (data.origuid < 0) { > > > > > > Can you confirm that this does not break > > > > > > sudo lxc-create -t download -n x1 -- -d ubuntu -r trusty -a amd64 > > > sudo lxc-start -n x1 > > > > > >
[lxc-devel] Swap Accounting :Query
I am using the most recent version of lxcfs, commit: 17f9a5a9d647467e3858fa751e40cc7c022dd475 When I spawn a container with the settings... lxc.cgroup.memory.limit_in_bytes = 256M lxc.cgroup.memory.memsw.limit_in_bytes = 512M ... I find that inside the container, we have: [root@test ~]# free -m total used free sharedbuffers cached Mem: 256 24231 6 0 20 -/+ buffers/cache: 3252 Swap: 1023 0 1023 As you can see, the Total swap should have been 256MB, but it is being shown as ~1GB (same as host). Basically, my memsw setting is being ignored. Now, when I revert the patch https://github.com/lxc/lxcfs/commit/a2de34bad72fb1b30af05d2f39a2b7bc6f20e42d Then I see the correct usage: [root@test ~]# free -m total used free sharedbuffers cached Mem: 256 24231 6 0 20 -/+ buffers/cache: 3252 Swap: 256 0256 But off course, this brings back the issue reported at https://github.com/lxc/lxcfs/issues/60 To fix both the problems, I proposed a patch at: https://lists.linuxcontainers.org/pipermail/lxc-devel/2016-January/013214.html which works well for me when host machine is CentOS 7. But fails to work when the host machine is Ubuntu 15.10 I followed: https://github.com/torvalds/linux/blob/v4.3/mm/memcontrol.c#L4037 https://github.com/torvalds/linux/blob/v4.3/mm/memcontrol.c#L2893 https://github.com/torvalds/linux/blob/v4.3/include/linux/page_counter.h#L21 To find out that the default value for /sys/fs/cgroup/memory/lxc/test/memory.memsw.limit_in_bytes is RES_LIMIT, which is PAGE_COUNTER_MAX * PAGE_SIZE, which is equal to 9223372036854775807 (LONG_MAX) in case of CentOS7 (kernel 3.10.0) and 9223372036854771712 in case of Ubuntu14.04 (kernel 4.2.0). So assumption of LONG_MAX always is obviously wrong. Only if somehow lxcfs knew that memsw.limit is not set in the container configuration...? -- Nehal J Wani ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] [PATCH v2] Fix btrfs bus error on sparc on snapshot delete
The following patch fixes memory alignment and endianness issue while doing a snapshot deletion with btrfs as a backing store on platform such as sparc. The implementation is taken from btrfs-progs. Changes since v1: - include for bswap definition - include defined function name as a comment above BTRFS_SETGET_STACK_FUNCS Signed-off-by: Thomas Tanaka--- src/lxc/bdev/lxcbtrfs.c | 28 +++- src/lxc/bdev/lxcbtrfs.h | 45 + 2 files changed, 60 insertions(+), 13 deletions(-) diff --git a/src/lxc/bdev/lxcbtrfs.c b/src/lxc/bdev/lxcbtrfs.c index 2db7c87..0f8c1fc 100644 --- a/src/lxc/bdev/lxcbtrfs.c +++ b/src/lxc/bdev/lxcbtrfs.c @@ -565,7 +565,7 @@ static int btrfs_recursive_destroy(const char *path) int fd; struct btrfs_ioctl_search_args args; struct btrfs_ioctl_search_key *sk = - struct btrfs_ioctl_search_header *sh; + struct btrfs_ioctl_search_header sh; struct btrfs_root_ref *ref; struct my_btrfs_tree *tree; int ret, i; @@ -573,6 +573,7 @@ static int btrfs_recursive_destroy(const char *path) int name_len; char *name; char *tmppath; + u64 dir_id; fd = open(path, O_RDONLY); if (fd < 0) { @@ -624,21 +625,22 @@ static int btrfs_recursive_destroy(const char *path) off = 0; for (i = 0; i < sk->nr_items; i++) { - sh = (struct btrfs_ioctl_search_header *)(args.buf + off); - off += sizeof(*sh); + memcpy(, args.buf + off, sizeof(sh)); + off += sizeof(sh); /* * A backref key with the name and dirid of the parent * comes followed by the reoot ref key which has the * name of the child subvol in question. */ - if (sh->objectid != root_id && sh->type == BTRFS_ROOT_BACKREF_KEY) { + if (sh.objectid != root_id && sh.type == BTRFS_ROOT_BACKREF_KEY) { ref = (struct btrfs_root_ref *)(args.buf + off); - name_len = ref->name_len; + name_len = btrfs_stack_root_ref_name_len(ref); name = (char *)(ref + 1); - tmppath = get_btrfs_subvol_path(fd, sh->offset, - ref->dirid, name, name_len); - if (!add_btrfs_tree_node(tree, sh->objectid, - sh->offset, name, + dir_id = btrfs_stack_root_ref_dirid(ref); + tmppath = get_btrfs_subvol_path(fd, sh.offset, + dir_id, name, name_len); + if (!add_btrfs_tree_node(tree, sh.objectid, + sh.offset, name, name_len, tmppath)) { ERROR("Out of memory"); free_btrfs_tree(tree); @@ -648,15 +650,15 @@ static int btrfs_recursive_destroy(const char *path) } free(tmppath); } - off += sh->len; + off += sh.len; /* * record the mins in sk so we can make sure the * next search doesn't repeat this root */ - sk->min_objectid = sh->objectid; - sk->min_type = sh->type; - sk->min_offset = sh->offset; + sk->min_objectid = sh.objectid; + sk->min_type = sh.type; + sk->min_offset = sh.offset; } sk->nr_items = 4096; sk->min_offset++; diff --git a/src/lxc/bdev/lxcbtrfs.h b/src/lxc/bdev/lxcbtrfs.h index e0adb7a..3b2742f 100644 --- a/src/lxc/bdev/lxcbtrfs.h +++ b/src/lxc/bdev/lxcbtrfs.h @@ -28,6 +28,7 @@ #include /* __le64, __l32 ... */ #include #include +#include typedef uint8_t u8; typedef uint16_t u16; @@ -317,6 +318,50 @@ struct btrfs_ioctl_ino_lookup_args { #define BTRFS_LAST_FREE_OBJECTID -256ULL #define BTRFS_FIRST_CHUNK_TREE_OBJECTID 256ULL +/* + * The followings are macro for correctly getting member of + * structures in both low and big endian platforms as per + * btrfs-progs + */ +#ifdef __CHECKER__ +#define __force__attribute__((force)) +#else +#define __force +#endif + +#if __BYTE_ORDER == __BIG_ENDIAN +#define cpu_to_le64(x) ((__force __le64)(u64)(bswap_64(x))) +#define le64_to_cpu(x) ((__force
Re: [lxc-devel] [PATCH v2] Fix btrfs bus error on sparc on snapshot delete
Quoting Thomas Tanaka (thomas.tan...@oracle.com): > The following patch fixes memory alignment and endianness > issue while doing a snapshot deletion with btrfs as a > backing store on platform such as sparc. > > The implementation is taken from btrfs-progs. > > Changes since v1: > - include for bswap definition > - include defined function name as a comment above BTRFS_SETGET_STACK_FUNCS > > Signed-off-by: Thomas TanakaThanks! Acked-by: Serge E. Hallyn > --- > src/lxc/bdev/lxcbtrfs.c | 28 +++- > src/lxc/bdev/lxcbtrfs.h | 45 + > 2 files changed, 60 insertions(+), 13 deletions(-) > > diff --git a/src/lxc/bdev/lxcbtrfs.c b/src/lxc/bdev/lxcbtrfs.c > index 2db7c87..0f8c1fc 100644 > --- a/src/lxc/bdev/lxcbtrfs.c > +++ b/src/lxc/bdev/lxcbtrfs.c > @@ -565,7 +565,7 @@ static int btrfs_recursive_destroy(const char *path) > int fd; > struct btrfs_ioctl_search_args args; > struct btrfs_ioctl_search_key *sk = > - struct btrfs_ioctl_search_header *sh; > + struct btrfs_ioctl_search_header sh; > struct btrfs_root_ref *ref; > struct my_btrfs_tree *tree; > int ret, i; > @@ -573,6 +573,7 @@ static int btrfs_recursive_destroy(const char *path) > int name_len; > char *name; > char *tmppath; > + u64 dir_id; > > fd = open(path, O_RDONLY); > if (fd < 0) { > @@ -624,21 +625,22 @@ static int btrfs_recursive_destroy(const char *path) > > off = 0; > for (i = 0; i < sk->nr_items; i++) { > - sh = (struct btrfs_ioctl_search_header *)(args.buf + > off); > - off += sizeof(*sh); > + memcpy(, args.buf + off, sizeof(sh)); > + off += sizeof(sh); > /* >* A backref key with the name and dirid of the parent >* comes followed by the reoot ref key which has the >* name of the child subvol in question. >*/ > - if (sh->objectid != root_id && sh->type == > BTRFS_ROOT_BACKREF_KEY) { > + if (sh.objectid != root_id && sh.type == > BTRFS_ROOT_BACKREF_KEY) { > ref = (struct btrfs_root_ref *)(args.buf + off); > - name_len = ref->name_len; > + name_len = btrfs_stack_root_ref_name_len(ref); > name = (char *)(ref + 1); > - tmppath = get_btrfs_subvol_path(fd, sh->offset, > - ref->dirid, name, name_len); > - if (!add_btrfs_tree_node(tree, sh->objectid, > - sh->offset, name, > + dir_id = btrfs_stack_root_ref_dirid(ref); > + tmppath = get_btrfs_subvol_path(fd, sh.offset, > + dir_id, name, name_len); > + if (!add_btrfs_tree_node(tree, sh.objectid, > + sh.offset, name, > name_len, tmppath)) { > ERROR("Out of memory"); > free_btrfs_tree(tree); > @@ -648,15 +650,15 @@ static int btrfs_recursive_destroy(const char *path) > } > free(tmppath); > } > - off += sh->len; > + off += sh.len; > > /* >* record the mins in sk so we can make sure the >* next search doesn't repeat this root >*/ > - sk->min_objectid = sh->objectid; > - sk->min_type = sh->type; > - sk->min_offset = sh->offset; > + sk->min_objectid = sh.objectid; > + sk->min_type = sh.type; > + sk->min_offset = sh.offset; > } > sk->nr_items = 4096; > sk->min_offset++; > diff --git a/src/lxc/bdev/lxcbtrfs.h b/src/lxc/bdev/lxcbtrfs.h > index e0adb7a..3b2742f 100644 > --- a/src/lxc/bdev/lxcbtrfs.h > +++ b/src/lxc/bdev/lxcbtrfs.h > @@ -28,6 +28,7 @@ > #include /* __le64, __l32 ... */ > #include > #include > +#include > > typedef uint8_t u8; > typedef uint16_t u16; > @@ -317,6 +318,50 @@ struct btrfs_ioctl_ino_lookup_args { > #define BTRFS_LAST_FREE_OBJECTID -256ULL > #define BTRFS_FIRST_CHUNK_TREE_OBJECTID 256ULL > > +/* > + * The followings are macro for correctly getting member of > + * structures in both low and big endian platforms as per > + * btrfs-progs > + */ > +#ifdef
Re: [lxc-devel] [PATCH] add lxc-copy to see_also.sgml.in
Quoting Christian Brauner (christian.brau...@mailbox.org): > Signed-off-by: Christian BraunerAcked-by: Serge E. Hallyn > --- > doc/see_also.sgml.in | 5 + > 1 file changed, 5 insertions(+) > > diff --git a/doc/see_also.sgml.in b/doc/see_also.sgml.in > index 4954e8e..3b3ecd7 100644 > --- a/doc/see_also.sgml.in > +++ b/doc/see_also.sgml.in > @@ -38,6 +38,11 @@ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, > MA 02110-1301 USA >, > > > + lxc-copy > + 1 > + , > + > + > lxc-destroy > 1 >, > -- > 2.7.0 > > ___ > lxc-devel mailing list > lxc-devel@lists.linuxcontainers.org > http://lists.linuxcontainers.org/listinfo/lxc-devel ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] [PATCH] add lxc-copy to see_also.sgml.in
Signed-off-by: Christian Brauner--- doc/see_also.sgml.in | 5 + 1 file changed, 5 insertions(+) diff --git a/doc/see_also.sgml.in b/doc/see_also.sgml.in index 4954e8e..3b3ecd7 100644 --- a/doc/see_also.sgml.in +++ b/doc/see_also.sgml.in @@ -38,6 +38,11 @@ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA , + lxc-copy + 1 + , + + lxc-destroy 1 , -- 2.7.0 ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] [lxc/lxc] 685062: avoid printing null string in error message
Branch: refs/heads/master Home: https://github.com/lxc/lxc Commit: 685062d6ff3eccf1b43abecbab95c213c22428e0 https://github.com/lxc/lxc/commit/685062d6ff3eccf1b43abecbab95c213c22428e0 Author: Serge HallynDate: 2016-01-14 (Thu, 14 Jan 2016) Changed paths: M src/lxc/conf.c Log Message: --- avoid printing null string in error message Show the ifindex in case it's useful Signed-off-by: Serge Hallyn ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
Re: [lxc-devel] [PATCH] check for btrfs fs in should_default_to_snapshot
Quoting Christian Brauner (christian.brau...@mailbox.org): > Check if we're really on a btrfs filesystem before we call btrfs_same_fs(). > Otherwise we will report misleading errors although everything went fine. > > Signed-off-by: Christian BraunerAcked-by: Serge E. Hallyn > --- > src/lxc/bdev/lxcbtrfs.c | 2 +- > src/lxc/bdev/lxcoverlay.c | 2 +- > src/lxc/lxccontainer.c| 4 > 3 files changed, 6 insertions(+), 2 deletions(-) > > diff --git a/src/lxc/bdev/lxcbtrfs.c b/src/lxc/bdev/lxcbtrfs.c > index 2db7c87..80b22fd 100644 > --- a/src/lxc/bdev/lxcbtrfs.c > +++ b/src/lxc/bdev/lxcbtrfs.c > @@ -40,7 +40,7 @@ > #include "lxcrsync.h" > #include "utils.h" > > -lxc_log_define(btrfs, lxc); > +lxc_log_define(lxcbtrfs, lxc); > > /* defined in lxccontainer.c: needs to become common helper */ > extern char *dir_new_path(char *src, const char *oldname, const char *name, > diff --git a/src/lxc/bdev/lxcoverlay.c b/src/lxc/bdev/lxcoverlay.c > index d06d24f..4d30f3c 100644 > --- a/src/lxc/bdev/lxcoverlay.c > +++ b/src/lxc/bdev/lxcoverlay.c > @@ -36,7 +36,7 @@ > #include "lxcrsync.h" > #include "utils.h" > > -lxc_log_define(overlay, lxc); > +lxc_log_define(lxcoverlay, lxc); > > static char *ovl_name; > > diff --git a/src/lxc/lxccontainer.c b/src/lxc/lxccontainer.c > index 490a59d..7f2a390 100644 > --- a/src/lxc/lxccontainer.c > +++ b/src/lxc/lxccontainer.c > @@ -2833,6 +2833,10 @@ bool should_default_to_snapshot(struct lxc_container > *c0, > > snprintf(p0, l0, "%s/%s", c0->config_path, c0->name); > snprintf(p1, l1, "%s/%s", c1->config_path, c1->name); > + > + if (!is_btrfs_fs(p0) || !is_btrfs_fs(p1)) > + return false; > + > return btrfs_same_fs(p0, p1) == 0; > } > > -- > 2.7.0 > > ___ > lxc-devel mailing list > lxc-devel@lists.linuxcontainers.org > http://lists.linuxcontainers.org/listinfo/lxc-devel ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
[lxc-devel] [PATCH] Fix btrfs bus error on sparc on snapshot delete
The following patch fixes memory alignment and endianness issue while doing a snapshot deletion with btrfs as a backing store on platform such as sparc. The implementation is taken from btrfs-progs. Signed-off-by: Thomas Tanaka--- src/lxc/bdev/lxcbtrfs.c | 28 +++- src/lxc/bdev/lxcbtrfs.h | 39 +++ 2 files changed, 54 insertions(+), 13 deletions(-) diff --git a/src/lxc/bdev/lxcbtrfs.c b/src/lxc/bdev/lxcbtrfs.c index 2db7c87..0f8c1fc 100644 --- a/src/lxc/bdev/lxcbtrfs.c +++ b/src/lxc/bdev/lxcbtrfs.c @@ -565,7 +565,7 @@ static int btrfs_recursive_destroy(const char *path) int fd; struct btrfs_ioctl_search_args args; struct btrfs_ioctl_search_key *sk = - struct btrfs_ioctl_search_header *sh; + struct btrfs_ioctl_search_header sh; struct btrfs_root_ref *ref; struct my_btrfs_tree *tree; int ret, i; @@ -573,6 +573,7 @@ static int btrfs_recursive_destroy(const char *path) int name_len; char *name; char *tmppath; + u64 dir_id; fd = open(path, O_RDONLY); if (fd < 0) { @@ -624,21 +625,22 @@ static int btrfs_recursive_destroy(const char *path) off = 0; for (i = 0; i < sk->nr_items; i++) { - sh = (struct btrfs_ioctl_search_header *)(args.buf + off); - off += sizeof(*sh); + memcpy(, args.buf + off, sizeof(sh)); + off += sizeof(sh); /* * A backref key with the name and dirid of the parent * comes followed by the reoot ref key which has the * name of the child subvol in question. */ - if (sh->objectid != root_id && sh->type == BTRFS_ROOT_BACKREF_KEY) { + if (sh.objectid != root_id && sh.type == BTRFS_ROOT_BACKREF_KEY) { ref = (struct btrfs_root_ref *)(args.buf + off); - name_len = ref->name_len; + name_len = btrfs_stack_root_ref_name_len(ref); name = (char *)(ref + 1); - tmppath = get_btrfs_subvol_path(fd, sh->offset, - ref->dirid, name, name_len); - if (!add_btrfs_tree_node(tree, sh->objectid, - sh->offset, name, + dir_id = btrfs_stack_root_ref_dirid(ref); + tmppath = get_btrfs_subvol_path(fd, sh.offset, + dir_id, name, name_len); + if (!add_btrfs_tree_node(tree, sh.objectid, + sh.offset, name, name_len, tmppath)) { ERROR("Out of memory"); free_btrfs_tree(tree); @@ -648,15 +650,15 @@ static int btrfs_recursive_destroy(const char *path) } free(tmppath); } - off += sh->len; + off += sh.len; /* * record the mins in sk so we can make sure the * next search doesn't repeat this root */ - sk->min_objectid = sh->objectid; - sk->min_type = sh->type; - sk->min_offset = sh->offset; + sk->min_objectid = sh.objectid; + sk->min_type = sh.type; + sk->min_offset = sh.offset; } sk->nr_items = 4096; sk->min_offset++; diff --git a/src/lxc/bdev/lxcbtrfs.h b/src/lxc/bdev/lxcbtrfs.h index e0adb7a..4ea2b4a 100644 --- a/src/lxc/bdev/lxcbtrfs.h +++ b/src/lxc/bdev/lxcbtrfs.h @@ -317,6 +317,45 @@ struct btrfs_ioctl_ino_lookup_args { #define BTRFS_LAST_FREE_OBJECTID -256ULL #define BTRFS_FIRST_CHUNK_TREE_OBJECTID 256ULL +/* The followings are macro for correctly getting member of structures + * in both low and big endian platforms as per btrfs-progs + */ +#ifdef __CHECKER__ +#define __force__attribute__((force)) +#else +#define __force +#endif + +#if __BYTE_ORDER == __BIG_ENDIAN +#define cpu_to_le64(x) ((__force __le64)(u64)(bswap_64(x))) +#define le64_to_cpu(x) ((__force u64)(__le64)(bswap_64(x))) +#define cpu_to_le32(x) ((__force __le32)(u32)(bswap_32(x))) +#define le32_to_cpu(x) ((__force u32)(__le32)(bswap_32(x))) +#define cpu_to_le16(x) ((__force __le16)(u16)(bswap_16(x))) +#define le16_to_cpu(x) ((__force u16)(__le16)(bswap_16(x))) +#else
Re: [lxc-devel] [PATCH v2] safe_mount: Handle mounting proc and refactor
Quoting Bogdan Purcareata (bogdan.purcare...@nxp.com): > On 14.01.2016 01:09, Serge Hallyn wrote: > > Quoting Bogdan Purcareata (bogdan.purcare...@nxp.com): > >> On 11.01.2016 20:59, Serge Hallyn wrote: > >>> Quoting Bogdan Purcareata (bogdan.purcare...@nxp.com): > The safe_mount primitive will mount the fs in the new container > environment by using file descriptors referred in /proc/self/fd. > However, when the mounted filesystem is proc itself, it will have > been previously unmounted, therefore resulting in an error when > searching for these file descriptors. This only happens when there's > no container rootfs prefix (commonly with lxc-execute). > > Implement the support for this use case as well, by doing the mount > based on the full path. > > Refactor the whole function in order to remove duplicated code checks > and improve readability. > > Changes since v1: > - In order to address CVE-2015-1335, still check if the destination is > not a symlink. Do the mount only if the destination file descriptor > exists. > > Signed-off-by: Bogdan Purcareata> --- > src/lxc/utils.c | 49 - > 1 file changed, 28 insertions(+), 21 deletions(-) > > diff --git a/src/lxc/utils.c b/src/lxc/utils.c > index d9e769d..c53711a 100644 > --- a/src/lxc/utils.c > +++ b/src/lxc/utils.c > @@ -1644,9 +1644,9 @@ out: > int safe_mount(const char *src, const char *dest, const char *fstype, > unsigned long flags, const void *data, const char > *rootfs) > { > -int srcfd = -1, destfd, ret, saved_errno; > +int srcfd = -1, destfd = -1, ret = 0; > char srcbuf[50], destbuf[50]; // only needs enough for > /proc/self/fd/ > -const char *mntsrc = src; > +const char *mntsrc = src, *mntdest = dest; > > if (!rootfs) > rootfs = ""; > @@ -1655,45 +1655,52 @@ int safe_mount(const char *src, const char > *dest, const char *fstype, > if (flags & MS_BIND && src && src[0] != '/') { > INFO("this is a relative bind mount"); > srcfd = open_without_symlink(src, NULL); > -if (srcfd < 0) > -return srcfd; > +if (srcfd < 0) { > +ret = srcfd; > +goto out; > +} > ret = snprintf(srcbuf, 50, "/proc/self/fd/%d", srcfd); > if (ret < 0 || ret > 50) { > -close(srcfd); > ERROR("Out of memory"); > -return -EINVAL; > +ret = -EINVAL; > +goto out_src; > } > mntsrc = srcbuf; > } > > destfd = open_without_symlink(dest, rootfs); > if (destfd < 0) { > -if (srcfd != -1) > -close(srcfd); > -return destfd; > +ret = destfd; > +goto out_src; > } > > ret = snprintf(destbuf, 50, "/proc/self/fd/%d", destfd); > if (ret < 0 || ret > 50) { > -if (srcfd != -1) > -close(srcfd); > -close(destfd); > ERROR("Out of memory"); > -return -EINVAL; > +ret = -EINVAL; > +goto out_dest; > } > > -ret = mount(mntsrc, destbuf, fstype, flags, data); > -saved_errno = errno; > -if (srcfd != -1) > -close(srcfd); > -close(destfd); > +/* make sure the destination descriptor exists */ > +if (access(destbuf, F_OK) == 0) > +mntdest = destbuf; > >>> First, if we're going to shortcut I'd prefer to say "if /proc/self > >>> does not exist then skip this check" fo rnow. > >>> But can we think of any way to still do this check? > >>> > >>> What exactly are the cases? > >>> > >>> 1. lxc-execute, lxc-init tries to mount /proc. We should be able > >>> to simply have lxc always mount /proc before the pivot_root, so > >>> we can properly do this check. > >>> > >>> what use-cases will break if we demand /proc to exist in the > >>> container? (We can add an option to umount /proc in lxc-init, > >>> but the directory would have to exist) > >> That's my use case - the failing function is tmp_proc_mount, thus happening > > > > Wait, is your rootfs NULL or not? > > Yep, it's NULL - plain "lxc-execute -n foo -f
Re: [lxc-devel] [PATCH] c/r: remember to chown the cgroup path (correctly)
Quoting Tycho Andersen (tycho.ander...@canonical.com): > On Thu, Jan 14, 2016 at 09:28:07AM +, Serge Hallyn wrote: > > Quoting Tycho Andersen (tycho.ander...@canonical.com): > > > On Wed, Jan 13, 2016 at 09:47:50PM +, Serge Hallyn wrote: > > > > Quoting Tycho Andersen (tycho.ander...@canonical.com): > > > > > 1. remember to chown the cgroup path when migrating a container > > > > > 2. when restoring the cgroup path, try to compute the euid for root > > > > > vs. > > > > >using geteuid(); geteuid works for start, but it doesn't work for > > > > >migration since we're still real root at that point. > > > > > > > > > > Signed-off-by: Tycho Andersen> > > > > --- > > > > > src/lxc/cgmanager.c | 6 +- > > > > > src/lxc/criu.c | 5 + > > > > > 2 files changed, 10 insertions(+), 1 deletion(-) > > > > > > > > > > diff --git a/src/lxc/cgmanager.c b/src/lxc/cgmanager.c > > > > > index 357182a..54e6912 100644 > > > > > --- a/src/lxc/cgmanager.c > > > > > +++ b/src/lxc/cgmanager.c > > > > > @@ -488,7 +488,11 @@ static bool chown_cgroup(const char > > > > > *cgroup_path, struct lxc_conf *conf) > > > > > return true; > > > > > > > > > > data.cgroup_path = cgroup_path; > > > > > - data.origuid = geteuid(); > > > > > + data.origuid = mapped_hostid(0, conf, ID_TYPE_UID); > > > > now, when starting a container, this happens in the > > parent task in original uid. So the geteuid() returns 1000, > > mapped_hostid(0, conf, ID_TYPE_UID) something like 10. > > Are you sure? Wouldn't it chmod everything to 1000 instead of 10 > in that case and be all screwed up? No I'm not sure :) I was lazily asking you to add an fprintf and verify :) > > This is probably ok - but did you run all the lxc tests against > > this to make sure? You can still run 'lxc-cgroup' etc as an > > unpriv user? > > Well, it does change the behavior in ways I don't quite understand > yet, so that's probably worth talking about. Before this series (the > 644 one and this patch), I get: > > /sys/fs/cgroup/cpuset/lxc/unpriv $ ls -alh > total 0 > drwxrwxr-x 2 root 10 0 Jan 14 06:25 ./ > drwxr-xr-x 3 root root 0 Jan 14 06:25 ../ > -rw-r--r-- 1 root root 0 Jan 14 06:25 cgroup.clone_children > -rwxrwxr-x 1 root 10 0 Jan 14 06:26 cgroup.procs* > -rw-r--r-- 1 root root 0 Jan 14 06:25 cpuset.cpu_exclusive > -rw-r--r-- 1 root root 0 Jan 14 06:25 cpuset.cpus > -r--r--r-- 1 root root 0 Jan 14 06:25 cpuset.effective_cpus > -r--r--r-- 1 root root 0 Jan 14 06:25 cpuset.effective_mems > -rw-r--r-- 1 root root 0 Jan 14 06:25 cpuset.mem_exclusive > -rw-r--r-- 1 root root 0 Jan 14 06:25 cpuset.mem_hardwall > -rw-r--r-- 1 root root 0 Jan 14 06:25 cpuset.memory_migrate > -r--r--r-- 1 root root 0 Jan 14 06:25 cpuset.memory_pressure > -rw-r--r-- 1 root root 0 Jan 14 06:25 cpuset.memory_spread_page > -rw-r--r-- 1 root root 0 Jan 14 06:25 cpuset.memory_spread_slab > -rw-r--r-- 1 root root 0 Jan 14 06:25 cpuset.mems > -rw-r--r-- 1 root root 0 Jan 14 06:25 cpuset.sched_load_balance > -rw-r--r-- 1 root root 0 Jan 14 06:25 cpuset.sched_relax_domain_level > -rw-r--r-- 1 root root 0 Jan 14 06:25 notify_on_release > -rwxrwxr-x 1 root 10 0 Jan 14 06:25 tasks* > > and after: > > /sys/fs/cgroup/cpuset/lxc/unpriv $ ls -alh > total 0 > drwxrwxr-x 2 10 10 0 Jan 14 06:27 ./ > drwxr-xr-x 3 root root 0 Jan 14 06:27 ../ > -rw-r--r-- 1 root root 0 Jan 14 06:27 cgroup.clone_children > -rw-rw-r-- 1 10 10 0 Jan 14 06:27 cgroup.procs > -rw-r--r-- 1 root root 0 Jan 14 06:27 cpuset.cpu_exclusive > -rw-r--r-- 1 root root 0 Jan 14 06:27 cpuset.cpus > -r--r--r-- 1 root root 0 Jan 14 06:27 cpuset.effective_cpus > -r--r--r-- 1 root root 0 Jan 14 06:27 cpuset.effective_mems > -rw-r--r-- 1 root root 0 Jan 14 06:27 cpuset.mem_exclusive > -rw-r--r-- 1 root root 0 Jan 14 06:27 cpuset.mem_hardwall > -rw-r--r-- 1 root root 0 Jan 14 06:27 cpuset.memory_migrate > -r--r--r-- 1 root root 0 Jan 14 06:27 cpuset.memory_pressure > -rw-r--r-- 1 root root 0 Jan 14 06:27 cpuset.memory_spread_page > -rw-r--r-- 1 root root 0 Jan 14 06:27 cpuset.memory_spread_slab > -rw-r--r-- 1 root root 0 Jan 14 06:27 cpuset.mems > -rw-r--r-- 1 root root 0 Jan 14 06:27 cpuset.sched_load_balance > -rw-r--r-- 1 root root 0 Jan 14 06:27 cpuset.sched_relax_domain_level > -rw-r--r-- 1 root root 0 Jan 14 06:27 notify_on_release > -rw-rw-r-- 1 10 10 0 Jan 14 06:27 tasks > > However, it's not clear really /why/ from this patch the uid on tasks and > cgroup.progs gets changed as well (probably some mask bug? it was ID_TYPE_UID > before but changing the gid? not sure). Cgmanager does that automatically. When you chown cgroup freezer:/lxc/c1, cgmanager chowns the directory itself (so you can create new cgroups), and the procs and tasks files, so you can move tasks in. It does not change the other files so you
[lxc-devel] [PATCH] check for btrfs fs in should_default_to_snapshot
Check if we're really on a btrfs filesystem before we call btrfs_same_fs(). Otherwise we will report misleading errors although everything went fine. Signed-off-by: Christian Brauner--- src/lxc/bdev/lxcbtrfs.c | 2 +- src/lxc/bdev/lxcoverlay.c | 2 +- src/lxc/lxccontainer.c| 4 3 files changed, 6 insertions(+), 2 deletions(-) diff --git a/src/lxc/bdev/lxcbtrfs.c b/src/lxc/bdev/lxcbtrfs.c index 2db7c87..80b22fd 100644 --- a/src/lxc/bdev/lxcbtrfs.c +++ b/src/lxc/bdev/lxcbtrfs.c @@ -40,7 +40,7 @@ #include "lxcrsync.h" #include "utils.h" -lxc_log_define(btrfs, lxc); +lxc_log_define(lxcbtrfs, lxc); /* defined in lxccontainer.c: needs to become common helper */ extern char *dir_new_path(char *src, const char *oldname, const char *name, diff --git a/src/lxc/bdev/lxcoverlay.c b/src/lxc/bdev/lxcoverlay.c index d06d24f..4d30f3c 100644 --- a/src/lxc/bdev/lxcoverlay.c +++ b/src/lxc/bdev/lxcoverlay.c @@ -36,7 +36,7 @@ #include "lxcrsync.h" #include "utils.h" -lxc_log_define(overlay, lxc); +lxc_log_define(lxcoverlay, lxc); static char *ovl_name; diff --git a/src/lxc/lxccontainer.c b/src/lxc/lxccontainer.c index 490a59d..7f2a390 100644 --- a/src/lxc/lxccontainer.c +++ b/src/lxc/lxccontainer.c @@ -2833,6 +2833,10 @@ bool should_default_to_snapshot(struct lxc_container *c0, snprintf(p0, l0, "%s/%s", c0->config_path, c0->name); snprintf(p1, l1, "%s/%s", c1->config_path, c1->name); + + if (!is_btrfs_fs(p0) || !is_btrfs_fs(p1)) + return false; + return btrfs_same_fs(p0, p1) == 0; } -- 2.7.0 ___ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel
Re: [lxc-devel] [PATCH] Fix btrfs bus error on sparc on snapshot delete
Quoting Thomas Tanaka (thomas.tan...@oracle.com): > The following patch fixes memory alignment and endianness > issue while doing a snapshot deletion with btrfs as a > backing store on platform such as sparc. > > The implementation is taken from btrfs-progs. Hi, thanks, this looks nice. I'm worried though that the #defines hide some of the function definitions. Can you add a brief comment listing the defined fns names above their definitions in the .h file? Where is bswap_64 defined? > Signed-off-by: Thomas Tanaka> --- > src/lxc/bdev/lxcbtrfs.c | 28 +++- > src/lxc/bdev/lxcbtrfs.h | 39 +++ > 2 files changed, 54 insertions(+), 13 deletions(-) > > diff --git a/src/lxc/bdev/lxcbtrfs.c b/src/lxc/bdev/lxcbtrfs.c > index 2db7c87..0f8c1fc 100644 > --- a/src/lxc/bdev/lxcbtrfs.c > +++ b/src/lxc/bdev/lxcbtrfs.c > @@ -565,7 +565,7 @@ static int btrfs_recursive_destroy(const char *path) > int fd; > struct btrfs_ioctl_search_args args; > struct btrfs_ioctl_search_key *sk = > - struct btrfs_ioctl_search_header *sh; > + struct btrfs_ioctl_search_header sh; > struct btrfs_root_ref *ref; > struct my_btrfs_tree *tree; > int ret, i; > @@ -573,6 +573,7 @@ static int btrfs_recursive_destroy(const char *path) > int name_len; > char *name; > char *tmppath; > + u64 dir_id; > > fd = open(path, O_RDONLY); > if (fd < 0) { > @@ -624,21 +625,22 @@ static int btrfs_recursive_destroy(const char *path) > > off = 0; > for (i = 0; i < sk->nr_items; i++) { > - sh = (struct btrfs_ioctl_search_header *)(args.buf + > off); > - off += sizeof(*sh); > + memcpy(, args.buf + off, sizeof(sh)); > + off += sizeof(sh); > /* >* A backref key with the name and dirid of the parent >* comes followed by the reoot ref key which has the >* name of the child subvol in question. >*/ > - if (sh->objectid != root_id && sh->type == > BTRFS_ROOT_BACKREF_KEY) { > + if (sh.objectid != root_id && sh.type == > BTRFS_ROOT_BACKREF_KEY) { > ref = (struct btrfs_root_ref *)(args.buf + off); > - name_len = ref->name_len; > + name_len = btrfs_stack_root_ref_name_len(ref); > name = (char *)(ref + 1); > - tmppath = get_btrfs_subvol_path(fd, sh->offset, > - ref->dirid, name, name_len); > - if (!add_btrfs_tree_node(tree, sh->objectid, > - sh->offset, name, > + dir_id = btrfs_stack_root_ref_dirid(ref); > + tmppath = get_btrfs_subvol_path(fd, sh.offset, > + dir_id, name, name_len); > + if (!add_btrfs_tree_node(tree, sh.objectid, > + sh.offset, name, > name_len, tmppath)) { > ERROR("Out of memory"); > free_btrfs_tree(tree); > @@ -648,15 +650,15 @@ static int btrfs_recursive_destroy(const char *path) > } > free(tmppath); > } > - off += sh->len; > + off += sh.len; > > /* >* record the mins in sk so we can make sure the >* next search doesn't repeat this root >*/ > - sk->min_objectid = sh->objectid; > - sk->min_type = sh->type; > - sk->min_offset = sh->offset; > + sk->min_objectid = sh.objectid; > + sk->min_type = sh.type; > + sk->min_offset = sh.offset; > } > sk->nr_items = 4096; > sk->min_offset++; > diff --git a/src/lxc/bdev/lxcbtrfs.h b/src/lxc/bdev/lxcbtrfs.h > index e0adb7a..4ea2b4a 100644 > --- a/src/lxc/bdev/lxcbtrfs.h > +++ b/src/lxc/bdev/lxcbtrfs.h > @@ -317,6 +317,45 @@ struct btrfs_ioctl_ino_lookup_args { > #define BTRFS_LAST_FREE_OBJECTID -256ULL > #define BTRFS_FIRST_CHUNK_TREE_OBJECTID 256ULL > > +/* The followings are macro for correctly getting member of structures > + * in both low and big endian platforms as per btrfs-progs > + */ > +#ifdef __CHECKER__ > +#define __force__attribute__((force)) > +#else > +#define __force > +#endif > + > +#if __BYTE_ORDER == __BIG_ENDIAN >