Re: [lxc-devel] [PATCH] c/r: remember to chown the cgroup path (correctly)

2016-01-14 Thread Tycho Andersen
On Wed, Jan 13, 2016 at 09:47:50PM +, Serge Hallyn wrote:
> Quoting Tycho Andersen (tycho.ander...@canonical.com):
> > 1. remember to chown the cgroup path when migrating a container
> > 2. when restoring the cgroup path, try to compute the euid for root vs.
> >using geteuid(); geteuid works for start, but it doesn't work for
> >migration since we're still real root at that point.
> > 
> > Signed-off-by: Tycho Andersen 
> > ---
> >  src/lxc/cgmanager.c | 6 +-
> >  src/lxc/criu.c  | 5 +
> >  2 files changed, 10 insertions(+), 1 deletion(-)
> > 
> > diff --git a/src/lxc/cgmanager.c b/src/lxc/cgmanager.c
> > index 357182a..54e6912 100644
> > --- a/src/lxc/cgmanager.c
> > +++ b/src/lxc/cgmanager.c
> > @@ -488,7 +488,11 @@ static bool chown_cgroup(const char *cgroup_path, 
> > struct lxc_conf *conf)
> > return true;
> >  
> > data.cgroup_path = cgroup_path;
> > -   data.origuid = geteuid();
> > +   data.origuid = mapped_hostid(0, conf, ID_TYPE_UID);
> > +   if (data.origuid < 0) {
> 
> Can you confirm that this does not break
> 
> sudo lxc-create -t download -n x1 -- -d ubuntu -r trusty -a amd64
> sudo lxc-start -n x1
> 
> Because in that case I think we have no mappings, and mapped_hostid() will
> return -1.

You can't see it in the patch, but just above this is a
lxc_list_empty() test, and this whole path isn't executed if
lxc_list_empty() is true, so I think it should be ok.

Tycho

> > +   ERROR("failed to get mapped root id");
> > +   return false;
> > +   }
> >  
> > /* Unpriv users can't chown it themselves, so chown from
> >  * a child namespace mapping both our own and the target uid
> > diff --git a/src/lxc/criu.c b/src/lxc/criu.c
> > index 6ef4905..f442612 100644
> > --- a/src/lxc/criu.c
> > +++ b/src/lxc/criu.c
> > @@ -466,6 +466,11 @@ void do_restore(struct lxc_container *c, int pipe, 
> > char *directory, bool verbose
> > goto out_fini_handler;
> > }
> >  
> > +   if (!cgroup_chown(handler)) {
> > +   ERROR("failed creating groups");
> > +   goto out_fini_handler;
> > +   }
> > +
> > if (!restore_net_info(c)) {
> > ERROR("failed restoring network info");
> > goto out_fini_handler;
> > -- 
> > 2.6.4
> > 
> > ___
> > lxc-devel mailing list
> > lxc-devel@lists.linuxcontainers.org
> > http://lists.linuxcontainers.org/listinfo/lxc-devel
> ___
> lxc-devel mailing list
> lxc-devel@lists.linuxcontainers.org
> http://lists.linuxcontainers.org/listinfo/lxc-devel
___
lxc-devel mailing list
lxc-devel@lists.linuxcontainers.org
http://lists.linuxcontainers.org/listinfo/lxc-devel


Re: [lxc-devel] [PATCH] c/r: remember to chown the cgroup path (correctly)

2016-01-14 Thread Serge Hallyn
Quoting Tycho Andersen (tycho.ander...@canonical.com):
> On Wed, Jan 13, 2016 at 09:47:50PM +, Serge Hallyn wrote:
> > Quoting Tycho Andersen (tycho.ander...@canonical.com):
> > > 1. remember to chown the cgroup path when migrating a container
> > > 2. when restoring the cgroup path, try to compute the euid for root vs.
> > >using geteuid(); geteuid works for start, but it doesn't work for
> > >migration since we're still real root at that point.
> > > 
> > > Signed-off-by: Tycho Andersen 
> > > ---
> > >  src/lxc/cgmanager.c | 6 +-
> > >  src/lxc/criu.c  | 5 +
> > >  2 files changed, 10 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/src/lxc/cgmanager.c b/src/lxc/cgmanager.c
> > > index 357182a..54e6912 100644
> > > --- a/src/lxc/cgmanager.c
> > > +++ b/src/lxc/cgmanager.c
> > > @@ -488,7 +488,11 @@ static bool chown_cgroup(const char *cgroup_path, 
> > > struct lxc_conf *conf)
> > >   return true;
> > >  
> > >   data.cgroup_path = cgroup_path;
> > > - data.origuid = geteuid();
> > > + data.origuid = mapped_hostid(0, conf, ID_TYPE_UID);

now, when starting a container, this happens in the
parent task in original uid.  So the geteuid() returns 1000,
mapped_hostid(0, conf, ID_TYPE_UID) something like 10.

This is probably ok - but did you run all the lxc tests against
this to make sure?  You can still run 'lxc-cgroup' etc as an
unpriv user?

> > > + if (data.origuid < 0) {
> > 
> > Can you confirm that this does not break
> > 
> > sudo lxc-create -t download -n x1 -- -d ubuntu -r trusty -a amd64
> > sudo lxc-start -n x1
> > 
> > Because in that case I think we have no mappings, and mapped_hostid() will
> > return -1.
> 
> You can't see it in the patch, but just above this is a

D'oh!  I even looked for such a thing this morning but missed it.

> lxc_list_empty() test, and this whole path isn't executed if
> lxc_list_empty() is true, so I think it should be ok.
> 
> Tycho
> 
> > > + ERROR("failed to get mapped root id");
> > > + return false;
> > > + }
> > >  
> > >   /* Unpriv users can't chown it themselves, so chown from
> > >* a child namespace mapping both our own and the target uid
> > > diff --git a/src/lxc/criu.c b/src/lxc/criu.c
> > > index 6ef4905..f442612 100644
> > > --- a/src/lxc/criu.c
> > > +++ b/src/lxc/criu.c
> > > @@ -466,6 +466,11 @@ void do_restore(struct lxc_container *c, int pipe, 
> > > char *directory, bool verbose
> > >   goto out_fini_handler;
> > >   }
> > >  
> > > + if (!cgroup_chown(handler)) {
> > > + ERROR("failed creating groups");
> > > + goto out_fini_handler;
> > > + }
> > > +
> > >   if (!restore_net_info(c)) {
> > >   ERROR("failed restoring network info");
> > >   goto out_fini_handler;
> > > -- 
> > > 2.6.4
> > > 
> > > ___
> > > lxc-devel mailing list
> > > lxc-devel@lists.linuxcontainers.org
> > > http://lists.linuxcontainers.org/listinfo/lxc-devel
> > ___
> > lxc-devel mailing list
> > lxc-devel@lists.linuxcontainers.org
> > http://lists.linuxcontainers.org/listinfo/lxc-devel
> ___
> lxc-devel mailing list
> lxc-devel@lists.linuxcontainers.org
> http://lists.linuxcontainers.org/listinfo/lxc-devel
___
lxc-devel mailing list
lxc-devel@lists.linuxcontainers.org
http://lists.linuxcontainers.org/listinfo/lxc-devel


Re: [lxc-devel] [PATCH] .gitignore: add sparclinux make output

2016-01-14 Thread Tycho Andersen
On Wed, Jan 13, 2016 at 10:20:12PM +, Serge Hallyn wrote:
> Quoting Wim Coekaerts (wim.coekae...@oracle.com):
> > On 1/13/16 1:50 PM, Serge Hallyn wrote:
> > >Quoting Tycho Andersen (tycho.ander...@canonical.com):
> > >>Signed-off-by: Tycho Andersen 
> > >Acked-by: Serge E. Hallyn 
> > >
> > >>---
> > >>  .gitignore | 1 +
> > >>  1 file changed, 1 insertion(+)
> > >>
> > >>diff --git a/.gitignore b/.gitignore
> > >>index 5e4912c..58e5dea 100644
> > >>--- a/.gitignore
> > >>+++ b/.gitignore
> > >>@@ -41,6 +41,7 @@ templates/lxc-opensuse
> > >>  templates/lxc-oracle
> > >>  templates/lxc-plamo
> > >>  templates/lxc-slackware
> > >>+templates/lxc-sparclinux
> > >>  templates/lxc-sshd
> > >>  templates/lxc-ubuntu
> > >>  templates/lxc-ubuntu-cloud
> > ugh do I feel stupid !
> 
> nonsense - we all do it.

Indeed, please don't feel stupid :)

Tycho
___
lxc-devel mailing list
lxc-devel@lists.linuxcontainers.org
http://lists.linuxcontainers.org/listinfo/lxc-devel


Re: [lxc-devel] [PATCH v2] safe_mount: Handle mounting proc and refactor

2016-01-14 Thread Bogdan Purcareata
On 14.01.2016 01:09, Serge Hallyn wrote:
> Quoting Bogdan Purcareata (bogdan.purcare...@nxp.com):
>> On 11.01.2016 20:59, Serge Hallyn wrote:
>>> Quoting Bogdan Purcareata (bogdan.purcare...@nxp.com):
 The safe_mount primitive will mount the fs in the new container
 environment by using file descriptors referred in /proc/self/fd.
 However, when the mounted filesystem is proc itself, it will have
 been previously unmounted, therefore resulting in an error when
 searching for these file descriptors. This only happens when there's
 no container rootfs prefix (commonly with lxc-execute).

 Implement the support for this use case as well, by doing the mount
 based on the full path.

 Refactor the whole function in order to remove duplicated code checks
 and improve readability.

 Changes since v1:
 - In order to address CVE-2015-1335, still check if the destination is
 not a symlink. Do the mount only if the destination file descriptor
 exists.

 Signed-off-by: Bogdan Purcareata 
 ---
src/lxc/utils.c | 49 -
1 file changed, 28 insertions(+), 21 deletions(-)

 diff --git a/src/lxc/utils.c b/src/lxc/utils.c
 index d9e769d..c53711a 100644
 --- a/src/lxc/utils.c
 +++ b/src/lxc/utils.c
 @@ -1644,9 +1644,9 @@ out:
int safe_mount(const char *src, const char *dest, const char *fstype,
unsigned long flags, const void *data, const char 
 *rootfs)
{
 -  int srcfd = -1, destfd, ret, saved_errno;
 +  int srcfd = -1, destfd = -1, ret = 0;
char srcbuf[50], destbuf[50]; // only needs enough for 
 /proc/self/fd/
 -  const char *mntsrc = src;
 +  const char *mntsrc = src, *mntdest = dest;

if (!rootfs)
rootfs = "";
 @@ -1655,45 +1655,52 @@ int safe_mount(const char *src, const char *dest, 
 const char *fstype,
if (flags & MS_BIND && src && src[0] != '/') {
INFO("this is a relative bind mount");
srcfd = open_without_symlink(src, NULL);
 -  if (srcfd < 0)
 -  return srcfd;
 +  if (srcfd < 0) {
 +  ret = srcfd;
 +  goto out;
 +  }
ret = snprintf(srcbuf, 50, "/proc/self/fd/%d", srcfd);
if (ret < 0 || ret > 50) {
 -  close(srcfd);
ERROR("Out of memory");
 -  return -EINVAL;
 +  ret = -EINVAL;
 +  goto out_src;
}
mntsrc = srcbuf;
}

destfd = open_without_symlink(dest, rootfs);
if (destfd < 0) {
 -  if (srcfd != -1)
 -  close(srcfd);
 -  return destfd;
 +  ret = destfd;
 +  goto out_src;
}

ret = snprintf(destbuf, 50, "/proc/self/fd/%d", destfd);
if (ret < 0 || ret > 50) {
 -  if (srcfd != -1)
 -  close(srcfd);
 -  close(destfd);
ERROR("Out of memory");
 -  return -EINVAL;
 +  ret = -EINVAL;
 +  goto out_dest;
}

 -  ret = mount(mntsrc, destbuf, fstype, flags, data);
 -  saved_errno = errno;
 -  if (srcfd != -1)
 -  close(srcfd);
 -  close(destfd);
 +  /* make sure the destination descriptor exists */
 +  if (access(destbuf, F_OK) == 0)
 +  mntdest = destbuf;
>>> First, if we're going to shortcut I'd prefer to say "if /proc/self
>>> does not exist then skip this check" fo rnow.
>>> But can we think of any way to still do this check?
>>>
>>> What exactly are the cases?
>>>
>>> 1. lxc-execute, lxc-init tries to mount /proc.  We should be able
>>> to simply have lxc always mount /proc before the pivot_root, so
>>> we can properly do this check.
>>>
>>> what use-cases will break if we demand /proc to exist in the
>>> container?  (We can add an option to umount /proc in lxc-init,
>>> but the directory would have to exist)
>> That's my use case - the failing function is tmp_proc_mount, thus happening
>
> Wait, is your rootfs NULL or not?

Yep, it's NULL - plain "lxc-execute -n foo -f empty.conf -- bash".

> We can handle that case specially.
>
> mount_proc_if_needed() is only called from tmp_proc_mount(), which is only
> called from lxc_setup().  If rootfs is NULL, then all bets are off anyway
> and we can just call mount().  If rootfs is not NULL, then we're mounting
> in a separate container rootfs and we should use safe_mount(), but we
> expect the real /proc to then exist.

In that case, would the following fix be more 

[lxc-devel] [RFC 0/2] Feature: --start-frozen

2016-01-14 Thread Wolfgang Bumiller
Patch 1:

This adds the possibility to start a container in a frozen state via
a --start-frozen parameter added to lxc-start.

I'm not sure the implementation is ideal. There's probably some
reusable cgroup code somewhere, but as far as I could tell it is
mostly meant for the monitor, so I don't know which parts I can use
within the child process.

This can be useful for when you want to get a fully prepared and
mounted namespace without running the container itself while having
the changes happening in hooks applied. In our case mount hooks, in
order to provide ways to push/pull files to/from the container or
mount it without things like multiple-mount-protection preventing you
from firing up the container before it's unmounted again.
We're still weighing some other ways we could go but I thought I'd get
some input about this patch since it doesn't seem to be too specific
to go upstream.

I've considered a few alternatives to this but I'm not sure if there
are any good ones.
I've considered an lxc.cgroup.freezer.state config entry, but that
happens too early and lxc-start won't exit until the container is
unfrozen.  Same problem with a start-hook, beside the fact that it
must be present in the container. (In theory one could open a handle
to an executable which waits for a signal and pass
/proc/self/fd/$handle as hook, but --daemonize always closes them.)

The semantics of the the start hook is another part to consider.
(Ignore? Add a parameter? Run later? Split?)

Patch 2: comment fixups

For the new function I used 'set_' instead of 'want_' as prefix to
make their purpose more obvious. Should I change it to 'want_' or
maybe change the cleanup patch to also rename the other functions to
'set_*'?

Wolfgang Bumiller (2):
  lxc-start: added --start-frozen
  cleanup: lxc_container::want_* comment descriptions

 doc/lxc-start.sgml.in  | 12 
 src/lxc/arguments.h|  3 +++
 src/lxc/conf.h |  1 +
 src/lxc/lxc_start.c|  7 +++
 src/lxc/lxccontainer.c | 16 
 src/lxc/lxccontainer.h | 19 ++-
 src/lxc/start.c| 17 +
 7 files changed, 70 insertions(+), 5 deletions(-)

-- 
2.1.4


___
lxc-devel mailing list
lxc-devel@lists.linuxcontainers.org
http://lists.linuxcontainers.org/listinfo/lxc-devel


[lxc-devel] [RFC 1/2] lxc-start: added --start-frozen

2016-01-14 Thread Wolfgang Bumiller
Add the possibility to start a container in a frozen state.

Signed-off-by: Wolfgang Bumiller 
---
 doc/lxc-start.sgml.in  | 12 
 src/lxc/arguments.h|  3 +++
 src/lxc/conf.h |  1 +
 src/lxc/lxc_start.c|  7 +++
 src/lxc/lxccontainer.c | 16 
 src/lxc/lxccontainer.h | 10 ++
 src/lxc/start.c| 17 +
 7 files changed, 66 insertions(+)

diff --git a/doc/lxc-start.sgml.in b/doc/lxc-start.sgml.in
index 1770ac2..46bfcc7 100644
--- a/doc/lxc-start.sgml.in
+++ b/doc/lxc-start.sgml.in
@@ -59,6 +59,7 @@ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 
02110-1301 USA
   -s KEY=VAL
   -C
   --share-[net|ipc|uts] 
name|pid
+  --start-frozen
   command
 
   
@@ -245,6 +246,17 @@ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, 
MA 02110-1301 USA

   
 
+  
+   
+ --start-frozen
+   
+   
+ 
+   Freeze the container before starting its init process.
+ 
+   
+  
+
 
 
   
diff --git a/src/lxc/arguments.h b/src/lxc/arguments.h
index 38cb0da..79a4758 100644
--- a/src/lxc/arguments.h
+++ b/src/lxc/arguments.h
@@ -81,6 +81,9 @@ struct lxc_arguments {
/* close fds from parent? */
int close_all_fds;
 
+   /* freeze before starting init */
+   int start_frozen;
+
/* lxc-create */
char *bdevtype, *configfile, *template;
char *fstype;
diff --git a/src/lxc/conf.h b/src/lxc/conf.h
index b0274ec..5c84f5f 100644
--- a/src/lxc/conf.h
+++ b/src/lxc/conf.h
@@ -312,6 +312,7 @@ struct lxc_conf {
struct lxc_rootfs rootfs;
char *ttydir;
int close_all_fds;
+   int start_frozen;
struct lxc_list hooks[NUM_LXC_HOOKS];
 
char *lsm_aa_profile;
diff --git a/src/lxc/lxc_start.c b/src/lxc/lxc_start.c
index 6b942ac..73e27e3 100644
--- a/src/lxc/lxc_start.c
+++ b/src/lxc/lxc_start.c
@@ -53,6 +53,7 @@
 #define OPT_SHARE_NET OPT_USAGE+1
 #define OPT_SHARE_IPC OPT_USAGE+2
 #define OPT_SHARE_UTS OPT_USAGE+3
+#define OPT_START_FROZEN OPT_USAGE+4
 
 lxc_log_define(lxc_start_ui, lxc);
 
@@ -154,6 +155,7 @@ static int my_parser(struct lxc_arguments* args, int c, 
char* arg)
case OPT_SHARE_NET: args->share_ns[LXC_NS_NET] = arg; break;
case OPT_SHARE_IPC: args->share_ns[LXC_NS_IPC] = arg; break;
case OPT_SHARE_UTS: args->share_ns[LXC_NS_UTS] = arg; break;
+   case OPT_START_FROZEN: args->start_frozen = 1; break;
}
return 0;
 }
@@ -170,6 +172,7 @@ static const struct option my_longopts[] = {
{"share-net", required_argument, 0, OPT_SHARE_NET},
{"share-ipc", required_argument, 0, OPT_SHARE_IPC},
{"share-uts", required_argument, 0, OPT_SHARE_UTS},
+   {"start-frozen", no_argument, 0, OPT_START_FROZEN},
LXC_COMMON_OPTIONS
 };
 
@@ -193,6 +196,7 @@ Options :\n\
  Note: --daemon implies --close-all-fds\n\
   -s, --define KEY=VAL   Assign VAL to configuration variable KEY\n\
   --share-[net|ipc|uts]=NAME Share a namespace with another container or 
pid\n\
+  --start-frozen Freeze the container before starting its init\n\
 ",
.options   = my_longopts,
.parser= my_parser,
@@ -335,6 +339,9 @@ int main(int argc, char *argv[])
if (my_args.close_all_fds)
c->want_close_all_fds(c, true);
 
+   if (my_args.start_frozen)
+   c->set_start_frozen(c, true);
+
if (args == default_args)
err = c->start(c, 0, NULL) ? 0 : 1;
else
diff --git a/src/lxc/lxccontainer.c b/src/lxc/lxccontainer.c
index 802a930..166f507 100644
--- a/src/lxc/lxccontainer.c
+++ b/src/lxc/lxccontainer.c
@@ -590,6 +590,21 @@ static bool do_lxcapi_want_close_all_fds(struct 
lxc_container *c, bool state)
 
 WRAP_API_1(bool, lxcapi_want_close_all_fds, bool)
 
+static bool do_lxcapi_set_start_frozen(struct lxc_container *c, bool state)
+{
+   if (!c || !c->lxc_conf)
+   return false;
+   if (container_mem_lock(c)) {
+   ERROR("Error getting mem lock");
+   return false;
+   }
+   c->lxc_conf->start_frozen = state;
+   container_mem_unlock(c);
+   return true;
+}
+
+WRAP_API_1(bool, lxcapi_set_start_frozen, bool)
+
 static bool do_lxcapi_wait(struct lxc_container *c, const char *state, int 
timeout)
 {
int ret;
@@ -4074,6 +4089,7 @@ struct lxc_container *lxc_container_new(const char *name, 
const char *configpath
c->load_config = lxcapi_load_config;
c->want_daemonize = lxcapi_want_daemonize;
c->want_close_all_fds = lxcapi_want_close_all_fds;
+   c->set_start_frozen = lxcapi_set_start_frozen;
c->start = lxcapi_start;
c->startl = lxcapi_startl;
c->stop = lxcapi_stop;
diff --git a/src/lxc/lxccontainer.h b/src/lxc/lxccontainer.h
index 6d155a1..454f029 100644
--- a/src/lxc/lxccontainer.h
+++ 

[lxc-devel] [RFC 2/2] cleanup: lxc_container::want_* comment descriptions

2016-01-14 Thread Wolfgang Bumiller
They change a value and return true on success rather than
fetching the value as the comments previously suggested.

Signed-off-by: Wolfgang Bumiller 
---
 src/lxc/lxccontainer.h | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/src/lxc/lxccontainer.h b/src/lxc/lxccontainer.h
index 454f029..3ab258e 100644
--- a/src/lxc/lxccontainer.h
+++ b/src/lxc/lxccontainer.h
@@ -222,25 +222,24 @@ struct lxc_container {
bool (*stop)(struct lxc_container *c);
 
/*!
-* \brief Determine if the container wants to run disconnected
+* \brief Change whether the container wants to run disconnected
 * from the terminal.
 *
 * \param c Container.
 * \param state Value for the daemonize bit (0 or 1).
 *
-* \return \c true if container wants to be daemonised, else \c false.
+* \return \c true on success, else \c false.
 */
bool (*want_daemonize)(struct lxc_container *c, bool state);
 
/*!
-* \brief Determine whether container wishes all file descriptors
+* \brief Change whether the container wishes all file descriptors
 *  to be closed on startup.
 *
 * \param c Container.
 * \param state Value for the close_all_fds bit (0 or 1).
 *
-* \return \c true if container wants all file descriptors closed,
-*  else \c false.
+* \return \c true on success, else \c false.
 */
bool (*want_close_all_fds)(struct lxc_container *c, bool state);
 
-- 
2.1.4


___
lxc-devel mailing list
lxc-devel@lists.linuxcontainers.org
http://lists.linuxcontainers.org/listinfo/lxc-devel


Re: [lxc-devel] [PATCH] c/r: remember to chown the cgroup path (correctly)

2016-01-14 Thread Tycho Andersen
On Thu, Jan 14, 2016 at 09:28:07AM +, Serge Hallyn wrote:
> Quoting Tycho Andersen (tycho.ander...@canonical.com):
> > On Wed, Jan 13, 2016 at 09:47:50PM +, Serge Hallyn wrote:
> > > Quoting Tycho Andersen (tycho.ander...@canonical.com):
> > > > 1. remember to chown the cgroup path when migrating a container
> > > > 2. when restoring the cgroup path, try to compute the euid for root vs.
> > > >using geteuid(); geteuid works for start, but it doesn't work for
> > > >migration since we're still real root at that point.
> > > > 
> > > > Signed-off-by: Tycho Andersen 
> > > > ---
> > > >  src/lxc/cgmanager.c | 6 +-
> > > >  src/lxc/criu.c  | 5 +
> > > >  2 files changed, 10 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/src/lxc/cgmanager.c b/src/lxc/cgmanager.c
> > > > index 357182a..54e6912 100644
> > > > --- a/src/lxc/cgmanager.c
> > > > +++ b/src/lxc/cgmanager.c
> > > > @@ -488,7 +488,11 @@ static bool chown_cgroup(const char *cgroup_path, 
> > > > struct lxc_conf *conf)
> > > > return true;
> > > >  
> > > > data.cgroup_path = cgroup_path;
> > > > -   data.origuid = geteuid();
> > > > +   data.origuid = mapped_hostid(0, conf, ID_TYPE_UID);
> 
> now, when starting a container, this happens in the
> parent task in original uid.  So the geteuid() returns 1000,
> mapped_hostid(0, conf, ID_TYPE_UID) something like 10.

Are you sure? Wouldn't it chmod everything to 1000 instead of 10
in that case and be all screwed up?

> This is probably ok - but did you run all the lxc tests against
> this to make sure?  You can still run 'lxc-cgroup' etc as an
> unpriv user?

Well, it does change the behavior in ways I don't quite understand
yet, so that's probably worth talking about. Before this series (the
644 one and this patch), I get:

/sys/fs/cgroup/cpuset/lxc/unpriv $ ls -alh
total 0
drwxrwxr-x 2 root 10 0 Jan 14 06:25 ./
drwxr-xr-x 3 root root   0 Jan 14 06:25 ../
-rw-r--r-- 1 root root   0 Jan 14 06:25 cgroup.clone_children
-rwxrwxr-x 1 root 10 0 Jan 14 06:26 cgroup.procs*
-rw-r--r-- 1 root root   0 Jan 14 06:25 cpuset.cpu_exclusive
-rw-r--r-- 1 root root   0 Jan 14 06:25 cpuset.cpus
-r--r--r-- 1 root root   0 Jan 14 06:25 cpuset.effective_cpus
-r--r--r-- 1 root root   0 Jan 14 06:25 cpuset.effective_mems
-rw-r--r-- 1 root root   0 Jan 14 06:25 cpuset.mem_exclusive
-rw-r--r-- 1 root root   0 Jan 14 06:25 cpuset.mem_hardwall
-rw-r--r-- 1 root root   0 Jan 14 06:25 cpuset.memory_migrate
-r--r--r-- 1 root root   0 Jan 14 06:25 cpuset.memory_pressure
-rw-r--r-- 1 root root   0 Jan 14 06:25 cpuset.memory_spread_page
-rw-r--r-- 1 root root   0 Jan 14 06:25 cpuset.memory_spread_slab
-rw-r--r-- 1 root root   0 Jan 14 06:25 cpuset.mems
-rw-r--r-- 1 root root   0 Jan 14 06:25 cpuset.sched_load_balance
-rw-r--r-- 1 root root   0 Jan 14 06:25 cpuset.sched_relax_domain_level
-rw-r--r-- 1 root root   0 Jan 14 06:25 notify_on_release
-rwxrwxr-x 1 root 10 0 Jan 14 06:25 tasks*

and after:

/sys/fs/cgroup/cpuset/lxc/unpriv $ ls -alh
total 0
drwxrwxr-x 2 10 10 0 Jan 14 06:27 ./
drwxr-xr-x 3 root   root   0 Jan 14 06:27 ../
-rw-r--r-- 1 root   root   0 Jan 14 06:27 cgroup.clone_children
-rw-rw-r-- 1 10 10 0 Jan 14 06:27 cgroup.procs
-rw-r--r-- 1 root   root   0 Jan 14 06:27 cpuset.cpu_exclusive
-rw-r--r-- 1 root   root   0 Jan 14 06:27 cpuset.cpus
-r--r--r-- 1 root   root   0 Jan 14 06:27 cpuset.effective_cpus
-r--r--r-- 1 root   root   0 Jan 14 06:27 cpuset.effective_mems
-rw-r--r-- 1 root   root   0 Jan 14 06:27 cpuset.mem_exclusive
-rw-r--r-- 1 root   root   0 Jan 14 06:27 cpuset.mem_hardwall
-rw-r--r-- 1 root   root   0 Jan 14 06:27 cpuset.memory_migrate
-r--r--r-- 1 root   root   0 Jan 14 06:27 cpuset.memory_pressure
-rw-r--r-- 1 root   root   0 Jan 14 06:27 cpuset.memory_spread_page
-rw-r--r-- 1 root   root   0 Jan 14 06:27 cpuset.memory_spread_slab
-rw-r--r-- 1 root   root   0 Jan 14 06:27 cpuset.mems
-rw-r--r-- 1 root   root   0 Jan 14 06:27 cpuset.sched_load_balance
-rw-r--r-- 1 root   root   0 Jan 14 06:27 cpuset.sched_relax_domain_level
-rw-r--r-- 1 root   root   0 Jan 14 06:27 notify_on_release
-rw-rw-r-- 1 10 10 0 Jan 14 06:27 tasks

However, it's not clear really /why/ from this patch the uid on tasks and
cgroup.progs gets changed as well (probably some mask bug? it was ID_TYPE_UID 
before but changing the gid? not sure).

So, what's the correct behavior here? Should we leave the uid on the files as
real root, or chmod those to container root as well? (if it's the latter, i.e.
the behavior introduced by this patch, it's probably still worth understanding
if the former works, just to make sure that there are no other latent bugs; I
don't mind figuring that out)

Tycho

> > > > +   if (data.origuid < 0) {
> > > 
> > > Can you confirm that this does not break
> > > 
> > > sudo lxc-create -t download -n x1 -- -d ubuntu -r trusty -a amd64
> > > sudo lxc-start -n x1
> > > 
> > > 

[lxc-devel] Swap Accounting :Query

2016-01-14 Thread Nehal J Wani
I am using the most recent version of lxcfs, commit:
17f9a5a9d647467e3858fa751e40cc7c022dd475

When I spawn a container with the settings...

lxc.cgroup.memory.limit_in_bytes = 256M
lxc.cgroup.memory.memsw.limit_in_bytes = 512M

... I find that inside the container, we have:

[root@test ~]# free -m
 total   used   free sharedbuffers cached
Mem:   256 24231  6  0 20
-/+ buffers/cache:  3252
Swap: 1023  0   1023

As you can see, the Total swap should have been 256MB, but it is being
shown as ~1GB (same as host). Basically, my memsw setting is being
ignored.

Now, when I revert the patch
https://github.com/lxc/lxcfs/commit/a2de34bad72fb1b30af05d2f39a2b7bc6f20e42d

Then I see the correct usage:
[root@test ~]# free -m
 total   used   free sharedbuffers cached
Mem:   256 24231  6  0 20
-/+ buffers/cache:  3252
Swap:  256  0256

But off course, this brings back the issue reported at
https://github.com/lxc/lxcfs/issues/60

To fix both the problems, I proposed a patch at:
https://lists.linuxcontainers.org/pipermail/lxc-devel/2016-January/013214.html
which works well for me when host machine is CentOS 7. But fails to
work when the host machine is Ubuntu 15.10

I followed:
https://github.com/torvalds/linux/blob/v4.3/mm/memcontrol.c#L4037
https://github.com/torvalds/linux/blob/v4.3/mm/memcontrol.c#L2893
https://github.com/torvalds/linux/blob/v4.3/include/linux/page_counter.h#L21
To find out that the default value for
/sys/fs/cgroup/memory/lxc/test/memory.memsw.limit_in_bytes is
RES_LIMIT, which is PAGE_COUNTER_MAX * PAGE_SIZE, which is equal to
9223372036854775807 (LONG_MAX) in case of CentOS7 (kernel 3.10.0) and
9223372036854771712 in case of Ubuntu14.04 (kernel 4.2.0). So
assumption of LONG_MAX always is obviously wrong.

Only if somehow lxcfs knew that memsw.limit is not set in the
container configuration...?

-- 
Nehal J Wani
___
lxc-devel mailing list
lxc-devel@lists.linuxcontainers.org
http://lists.linuxcontainers.org/listinfo/lxc-devel


[lxc-devel] [PATCH v2] Fix btrfs bus error on sparc on snapshot delete

2016-01-14 Thread Thomas Tanaka
The following patch fixes memory alignment and endianness
issue while doing a snapshot deletion with btrfs as a
backing store on platform such as sparc.

The implementation is taken from btrfs-progs.

Changes since v1:
- include  for bswap definition
- include defined function name as a comment above BTRFS_SETGET_STACK_FUNCS

Signed-off-by: Thomas Tanaka 
---
 src/lxc/bdev/lxcbtrfs.c |   28 +++-
 src/lxc/bdev/lxcbtrfs.h |   45 +
 2 files changed, 60 insertions(+), 13 deletions(-)

diff --git a/src/lxc/bdev/lxcbtrfs.c b/src/lxc/bdev/lxcbtrfs.c
index 2db7c87..0f8c1fc 100644
--- a/src/lxc/bdev/lxcbtrfs.c
+++ b/src/lxc/bdev/lxcbtrfs.c
@@ -565,7 +565,7 @@ static int btrfs_recursive_destroy(const char *path)
int fd;
struct btrfs_ioctl_search_args args;
struct btrfs_ioctl_search_key *sk = 
-   struct btrfs_ioctl_search_header *sh;
+   struct btrfs_ioctl_search_header sh;
struct btrfs_root_ref *ref;
struct my_btrfs_tree *tree;
int ret, i;
@@ -573,6 +573,7 @@ static int btrfs_recursive_destroy(const char *path)
int name_len;
char *name;
char *tmppath;
+   u64 dir_id;
 
fd = open(path, O_RDONLY);
if (fd < 0) {
@@ -624,21 +625,22 @@ static int btrfs_recursive_destroy(const char *path)
 
off = 0;
for (i = 0; i < sk->nr_items; i++) {
-   sh = (struct btrfs_ioctl_search_header *)(args.buf + 
off);
-   off += sizeof(*sh);
+   memcpy(, args.buf + off, sizeof(sh));
+   off += sizeof(sh);
/*
 * A backref key with the name and dirid of the parent
 * comes followed by the reoot ref key which has the
 * name of the child subvol in question.
 */
-   if (sh->objectid != root_id && sh->type == 
BTRFS_ROOT_BACKREF_KEY) {
+   if (sh.objectid != root_id && sh.type == 
BTRFS_ROOT_BACKREF_KEY) {
ref = (struct btrfs_root_ref *)(args.buf + off);
-   name_len = ref->name_len;
+   name_len = btrfs_stack_root_ref_name_len(ref);
name = (char *)(ref + 1);
-   tmppath = get_btrfs_subvol_path(fd, sh->offset,
-   ref->dirid, name, name_len);
-   if (!add_btrfs_tree_node(tree, sh->objectid,
-   sh->offset, name,
+   dir_id = btrfs_stack_root_ref_dirid(ref);
+   tmppath = get_btrfs_subvol_path(fd, sh.offset,
+   dir_id, name, name_len);
+   if (!add_btrfs_tree_node(tree, sh.objectid,
+   sh.offset, name,
name_len, tmppath)) {
ERROR("Out of memory");
free_btrfs_tree(tree);
@@ -648,15 +650,15 @@ static int btrfs_recursive_destroy(const char *path)
}
free(tmppath);
}
-   off += sh->len;
+   off += sh.len;
 
/*
 * record the mins in sk so we can make sure the
 * next search doesn't repeat this root
 */
-   sk->min_objectid = sh->objectid;
-   sk->min_type = sh->type;
-   sk->min_offset = sh->offset;
+   sk->min_objectid = sh.objectid;
+   sk->min_type = sh.type;
+   sk->min_offset = sh.offset;
}
sk->nr_items = 4096;
sk->min_offset++;
diff --git a/src/lxc/bdev/lxcbtrfs.h b/src/lxc/bdev/lxcbtrfs.h
index e0adb7a..3b2742f 100644
--- a/src/lxc/bdev/lxcbtrfs.h
+++ b/src/lxc/bdev/lxcbtrfs.h
@@ -28,6 +28,7 @@
 #include  /* __le64, __l32 ... */
 #include 
 #include 
+#include 
 
 typedef uint8_t u8;
 typedef uint16_t u16;
@@ -317,6 +318,50 @@ struct btrfs_ioctl_ino_lookup_args {
 #define BTRFS_LAST_FREE_OBJECTID -256ULL
 #define BTRFS_FIRST_CHUNK_TREE_OBJECTID 256ULL
 
+/*
+ * The followings are macro for correctly getting member of
+ * structures in both low and big endian platforms as per
+ * btrfs-progs
+ */
+#ifdef __CHECKER__
+#define __force__attribute__((force))
+#else
+#define __force
+#endif
+
+#if __BYTE_ORDER == __BIG_ENDIAN
+#define cpu_to_le64(x) ((__force __le64)(u64)(bswap_64(x)))
+#define le64_to_cpu(x) ((__force 

Re: [lxc-devel] [PATCH v2] Fix btrfs bus error on sparc on snapshot delete

2016-01-14 Thread Serge Hallyn
Quoting Thomas Tanaka (thomas.tan...@oracle.com):
> The following patch fixes memory alignment and endianness
> issue while doing a snapshot deletion with btrfs as a
> backing store on platform such as sparc.
> 
> The implementation is taken from btrfs-progs.
> 
> Changes since v1:
> - include  for bswap definition
> - include defined function name as a comment above BTRFS_SETGET_STACK_FUNCS
> 
> Signed-off-by: Thomas Tanaka 

Thanks!

Acked-by: Serge E. Hallyn 

> ---
>  src/lxc/bdev/lxcbtrfs.c |   28 +++-
>  src/lxc/bdev/lxcbtrfs.h |   45 +
>  2 files changed, 60 insertions(+), 13 deletions(-)
> 
> diff --git a/src/lxc/bdev/lxcbtrfs.c b/src/lxc/bdev/lxcbtrfs.c
> index 2db7c87..0f8c1fc 100644
> --- a/src/lxc/bdev/lxcbtrfs.c
> +++ b/src/lxc/bdev/lxcbtrfs.c
> @@ -565,7 +565,7 @@ static int btrfs_recursive_destroy(const char *path)
>   int fd;
>   struct btrfs_ioctl_search_args args;
>   struct btrfs_ioctl_search_key *sk = 
> - struct btrfs_ioctl_search_header *sh;
> + struct btrfs_ioctl_search_header sh;
>   struct btrfs_root_ref *ref;
>   struct my_btrfs_tree *tree;
>   int ret, i;
> @@ -573,6 +573,7 @@ static int btrfs_recursive_destroy(const char *path)
>   int name_len;
>   char *name;
>   char *tmppath;
> + u64 dir_id;
>  
>   fd = open(path, O_RDONLY);
>   if (fd < 0) {
> @@ -624,21 +625,22 @@ static int btrfs_recursive_destroy(const char *path)
>  
>   off = 0;
>   for (i = 0; i < sk->nr_items; i++) {
> - sh = (struct btrfs_ioctl_search_header *)(args.buf + 
> off);
> - off += sizeof(*sh);
> + memcpy(, args.buf + off, sizeof(sh));
> + off += sizeof(sh);
>   /*
>* A backref key with the name and dirid of the parent
>* comes followed by the reoot ref key which has the
>* name of the child subvol in question.
>*/
> - if (sh->objectid != root_id && sh->type == 
> BTRFS_ROOT_BACKREF_KEY) {
> + if (sh.objectid != root_id && sh.type == 
> BTRFS_ROOT_BACKREF_KEY) {
>   ref = (struct btrfs_root_ref *)(args.buf + off);
> - name_len = ref->name_len;
> + name_len = btrfs_stack_root_ref_name_len(ref);
>   name = (char *)(ref + 1);
> - tmppath = get_btrfs_subvol_path(fd, sh->offset,
> - ref->dirid, name, name_len);
> - if (!add_btrfs_tree_node(tree, sh->objectid,
> - sh->offset, name,
> + dir_id = btrfs_stack_root_ref_dirid(ref);
> + tmppath = get_btrfs_subvol_path(fd, sh.offset,
> + dir_id, name, name_len);
> + if (!add_btrfs_tree_node(tree, sh.objectid,
> + sh.offset, name,
>   name_len, tmppath)) {
>   ERROR("Out of memory");
>   free_btrfs_tree(tree);
> @@ -648,15 +650,15 @@ static int btrfs_recursive_destroy(const char *path)
>   }
>   free(tmppath);
>   }
> - off += sh->len;
> + off += sh.len;
>  
>   /*
>* record the mins in sk so we can make sure the
>* next search doesn't repeat this root
>*/
> - sk->min_objectid = sh->objectid;
> - sk->min_type = sh->type;
> - sk->min_offset = sh->offset;
> + sk->min_objectid = sh.objectid;
> + sk->min_type = sh.type;
> + sk->min_offset = sh.offset;
>   }
>   sk->nr_items = 4096;
>   sk->min_offset++;
> diff --git a/src/lxc/bdev/lxcbtrfs.h b/src/lxc/bdev/lxcbtrfs.h
> index e0adb7a..3b2742f 100644
> --- a/src/lxc/bdev/lxcbtrfs.h
> +++ b/src/lxc/bdev/lxcbtrfs.h
> @@ -28,6 +28,7 @@
>  #include  /* __le64, __l32 ... */
>  #include 
>  #include 
> +#include 
>  
>  typedef uint8_t u8;
>  typedef uint16_t u16;
> @@ -317,6 +318,50 @@ struct btrfs_ioctl_ino_lookup_args {
>  #define BTRFS_LAST_FREE_OBJECTID -256ULL
>  #define BTRFS_FIRST_CHUNK_TREE_OBJECTID 256ULL
>  
> +/*
> + * The followings are macro for correctly getting member of
> + * structures in both low and big endian platforms as per
> + * btrfs-progs
> + */
> +#ifdef 

Re: [lxc-devel] [PATCH] add lxc-copy to see_also.sgml.in

2016-01-14 Thread Serge Hallyn
Quoting Christian Brauner (christian.brau...@mailbox.org):
> Signed-off-by: Christian Brauner 

Acked-by: Serge E. Hallyn 

> ---
>  doc/see_also.sgml.in | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/doc/see_also.sgml.in b/doc/see_also.sgml.in
> index 4954e8e..3b3ecd7 100644
> --- a/doc/see_also.sgml.in
> +++ b/doc/see_also.sgml.in
> @@ -38,6 +38,11 @@ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, 
> MA 02110-1301 USA
>,
>  
>
> + lxc-copy
> + 1
> +  ,
> +
> +  
>   lxc-destroy
>   1
>,
> -- 
> 2.7.0
> 
> ___
> lxc-devel mailing list
> lxc-devel@lists.linuxcontainers.org
> http://lists.linuxcontainers.org/listinfo/lxc-devel
___
lxc-devel mailing list
lxc-devel@lists.linuxcontainers.org
http://lists.linuxcontainers.org/listinfo/lxc-devel


[lxc-devel] [PATCH] add lxc-copy to see_also.sgml.in

2016-01-14 Thread Christian Brauner
Signed-off-by: Christian Brauner 
---
 doc/see_also.sgml.in | 5 +
 1 file changed, 5 insertions(+)

diff --git a/doc/see_also.sgml.in b/doc/see_also.sgml.in
index 4954e8e..3b3ecd7 100644
--- a/doc/see_also.sgml.in
+++ b/doc/see_also.sgml.in
@@ -38,6 +38,11 @@ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, 
MA 02110-1301 USA
   ,
 
   
+   lxc-copy
+   1
+  ,
+
+  
lxc-destroy
1
   ,
-- 
2.7.0

___
lxc-devel mailing list
lxc-devel@lists.linuxcontainers.org
http://lists.linuxcontainers.org/listinfo/lxc-devel


[lxc-devel] [lxc/lxc] 685062: avoid printing null string in error message

2016-01-14 Thread GitHub
  Branch: refs/heads/master
  Home:   https://github.com/lxc/lxc
  Commit: 685062d6ff3eccf1b43abecbab95c213c22428e0
  https://github.com/lxc/lxc/commit/685062d6ff3eccf1b43abecbab95c213c22428e0
  Author: Serge Hallyn 
  Date:   2016-01-14 (Thu, 14 Jan 2016)

  Changed paths:
M src/lxc/conf.c

  Log Message:
  ---
  avoid printing null string in error message

Show the ifindex in case it's useful

Signed-off-by: Serge Hallyn 


___
lxc-devel mailing list
lxc-devel@lists.linuxcontainers.org
http://lists.linuxcontainers.org/listinfo/lxc-devel


Re: [lxc-devel] [PATCH] check for btrfs fs in should_default_to_snapshot

2016-01-14 Thread Serge Hallyn
Quoting Christian Brauner (christian.brau...@mailbox.org):
> Check if we're really on a btrfs filesystem before we call btrfs_same_fs().
> Otherwise we will report misleading errors although everything went fine.
> 
> Signed-off-by: Christian Brauner 

Acked-by: Serge E. Hallyn 

> ---
>  src/lxc/bdev/lxcbtrfs.c   | 2 +-
>  src/lxc/bdev/lxcoverlay.c | 2 +-
>  src/lxc/lxccontainer.c| 4 
>  3 files changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/src/lxc/bdev/lxcbtrfs.c b/src/lxc/bdev/lxcbtrfs.c
> index 2db7c87..80b22fd 100644
> --- a/src/lxc/bdev/lxcbtrfs.c
> +++ b/src/lxc/bdev/lxcbtrfs.c
> @@ -40,7 +40,7 @@
>  #include "lxcrsync.h"
>  #include "utils.h"
>  
> -lxc_log_define(btrfs, lxc);
> +lxc_log_define(lxcbtrfs, lxc);
>  
>  /* defined in lxccontainer.c: needs to become common helper */
>  extern char *dir_new_path(char *src, const char *oldname, const char *name,
> diff --git a/src/lxc/bdev/lxcoverlay.c b/src/lxc/bdev/lxcoverlay.c
> index d06d24f..4d30f3c 100644
> --- a/src/lxc/bdev/lxcoverlay.c
> +++ b/src/lxc/bdev/lxcoverlay.c
> @@ -36,7 +36,7 @@
>  #include "lxcrsync.h"
>  #include "utils.h"
>  
> -lxc_log_define(overlay, lxc);
> +lxc_log_define(lxcoverlay, lxc);
>  
>  static char *ovl_name;
>  
> diff --git a/src/lxc/lxccontainer.c b/src/lxc/lxccontainer.c
> index 490a59d..7f2a390 100644
> --- a/src/lxc/lxccontainer.c
> +++ b/src/lxc/lxccontainer.c
> @@ -2833,6 +2833,10 @@ bool should_default_to_snapshot(struct lxc_container 
> *c0,
>  
>   snprintf(p0, l0, "%s/%s", c0->config_path, c0->name);
>   snprintf(p1, l1, "%s/%s", c1->config_path, c1->name);
> +
> + if (!is_btrfs_fs(p0) || !is_btrfs_fs(p1))
> + return false;
> +
>   return btrfs_same_fs(p0, p1) == 0;
>  }
>  
> -- 
> 2.7.0
> 
> ___
> lxc-devel mailing list
> lxc-devel@lists.linuxcontainers.org
> http://lists.linuxcontainers.org/listinfo/lxc-devel
___
lxc-devel mailing list
lxc-devel@lists.linuxcontainers.org
http://lists.linuxcontainers.org/listinfo/lxc-devel


[lxc-devel] [PATCH] Fix btrfs bus error on sparc on snapshot delete

2016-01-14 Thread Thomas Tanaka
The following patch fixes memory alignment and endianness
issue while doing a snapshot deletion with btrfs as a
backing store on platform such as sparc.

The implementation is taken from btrfs-progs.

Signed-off-by: Thomas Tanaka 
---
 src/lxc/bdev/lxcbtrfs.c |   28 +++-
 src/lxc/bdev/lxcbtrfs.h |   39 +++
 2 files changed, 54 insertions(+), 13 deletions(-)

diff --git a/src/lxc/bdev/lxcbtrfs.c b/src/lxc/bdev/lxcbtrfs.c
index 2db7c87..0f8c1fc 100644
--- a/src/lxc/bdev/lxcbtrfs.c
+++ b/src/lxc/bdev/lxcbtrfs.c
@@ -565,7 +565,7 @@ static int btrfs_recursive_destroy(const char *path)
int fd;
struct btrfs_ioctl_search_args args;
struct btrfs_ioctl_search_key *sk = 
-   struct btrfs_ioctl_search_header *sh;
+   struct btrfs_ioctl_search_header sh;
struct btrfs_root_ref *ref;
struct my_btrfs_tree *tree;
int ret, i;
@@ -573,6 +573,7 @@ static int btrfs_recursive_destroy(const char *path)
int name_len;
char *name;
char *tmppath;
+   u64 dir_id;
 
fd = open(path, O_RDONLY);
if (fd < 0) {
@@ -624,21 +625,22 @@ static int btrfs_recursive_destroy(const char *path)
 
off = 0;
for (i = 0; i < sk->nr_items; i++) {
-   sh = (struct btrfs_ioctl_search_header *)(args.buf + 
off);
-   off += sizeof(*sh);
+   memcpy(, args.buf + off, sizeof(sh));
+   off += sizeof(sh);
/*
 * A backref key with the name and dirid of the parent
 * comes followed by the reoot ref key which has the
 * name of the child subvol in question.
 */
-   if (sh->objectid != root_id && sh->type == 
BTRFS_ROOT_BACKREF_KEY) {
+   if (sh.objectid != root_id && sh.type == 
BTRFS_ROOT_BACKREF_KEY) {
ref = (struct btrfs_root_ref *)(args.buf + off);
-   name_len = ref->name_len;
+   name_len = btrfs_stack_root_ref_name_len(ref);
name = (char *)(ref + 1);
-   tmppath = get_btrfs_subvol_path(fd, sh->offset,
-   ref->dirid, name, name_len);
-   if (!add_btrfs_tree_node(tree, sh->objectid,
-   sh->offset, name,
+   dir_id = btrfs_stack_root_ref_dirid(ref);
+   tmppath = get_btrfs_subvol_path(fd, sh.offset,
+   dir_id, name, name_len);
+   if (!add_btrfs_tree_node(tree, sh.objectid,
+   sh.offset, name,
name_len, tmppath)) {
ERROR("Out of memory");
free_btrfs_tree(tree);
@@ -648,15 +650,15 @@ static int btrfs_recursive_destroy(const char *path)
}
free(tmppath);
}
-   off += sh->len;
+   off += sh.len;
 
/*
 * record the mins in sk so we can make sure the
 * next search doesn't repeat this root
 */
-   sk->min_objectid = sh->objectid;
-   sk->min_type = sh->type;
-   sk->min_offset = sh->offset;
+   sk->min_objectid = sh.objectid;
+   sk->min_type = sh.type;
+   sk->min_offset = sh.offset;
}
sk->nr_items = 4096;
sk->min_offset++;
diff --git a/src/lxc/bdev/lxcbtrfs.h b/src/lxc/bdev/lxcbtrfs.h
index e0adb7a..4ea2b4a 100644
--- a/src/lxc/bdev/lxcbtrfs.h
+++ b/src/lxc/bdev/lxcbtrfs.h
@@ -317,6 +317,45 @@ struct btrfs_ioctl_ino_lookup_args {
 #define BTRFS_LAST_FREE_OBJECTID -256ULL
 #define BTRFS_FIRST_CHUNK_TREE_OBJECTID 256ULL
 
+/* The followings are macro for correctly getting member of structures
+ * in both low and big endian platforms as per btrfs-progs 
+ */
+#ifdef __CHECKER__
+#define __force__attribute__((force))
+#else
+#define __force
+#endif
+
+#if __BYTE_ORDER == __BIG_ENDIAN
+#define cpu_to_le64(x) ((__force __le64)(u64)(bswap_64(x)))
+#define le64_to_cpu(x) ((__force u64)(__le64)(bswap_64(x)))
+#define cpu_to_le32(x) ((__force __le32)(u32)(bswap_32(x)))
+#define le32_to_cpu(x) ((__force u32)(__le32)(bswap_32(x)))
+#define cpu_to_le16(x) ((__force __le16)(u16)(bswap_16(x)))
+#define le16_to_cpu(x) ((__force u16)(__le16)(bswap_16(x)))
+#else

Re: [lxc-devel] [PATCH v2] safe_mount: Handle mounting proc and refactor

2016-01-14 Thread Serge Hallyn
Quoting Bogdan Purcareata (bogdan.purcare...@nxp.com):
> On 14.01.2016 01:09, Serge Hallyn wrote:
> > Quoting Bogdan Purcareata (bogdan.purcare...@nxp.com):
> >> On 11.01.2016 20:59, Serge Hallyn wrote:
> >>> Quoting Bogdan Purcareata (bogdan.purcare...@nxp.com):
>  The safe_mount primitive will mount the fs in the new container
>  environment by using file descriptors referred in /proc/self/fd.
>  However, when the mounted filesystem is proc itself, it will have
>  been previously unmounted, therefore resulting in an error when
>  searching for these file descriptors. This only happens when there's
>  no container rootfs prefix (commonly with lxc-execute).
> 
>  Implement the support for this use case as well, by doing the mount
>  based on the full path.
> 
>  Refactor the whole function in order to remove duplicated code checks
>  and improve readability.
> 
>  Changes since v1:
>  - In order to address CVE-2015-1335, still check if the destination is
>  not a symlink. Do the mount only if the destination file descriptor
>  exists.
> 
>  Signed-off-by: Bogdan Purcareata 
>  ---
> src/lxc/utils.c | 49 -
> 1 file changed, 28 insertions(+), 21 deletions(-)
> 
>  diff --git a/src/lxc/utils.c b/src/lxc/utils.c
>  index d9e769d..c53711a 100644
>  --- a/src/lxc/utils.c
>  +++ b/src/lxc/utils.c
>  @@ -1644,9 +1644,9 @@ out:
> int safe_mount(const char *src, const char *dest, const char *fstype,
>   unsigned long flags, const void *data, const char 
>  *rootfs)
> {
>  -int srcfd = -1, destfd, ret, saved_errno;
>  +int srcfd = -1, destfd = -1, ret = 0;
>   char srcbuf[50], destbuf[50]; // only needs enough for 
>  /proc/self/fd/
>  -const char *mntsrc = src;
>  +const char *mntsrc = src, *mntdest = dest;
> 
>   if (!rootfs)
>   rootfs = "";
>  @@ -1655,45 +1655,52 @@ int safe_mount(const char *src, const char 
>  *dest, const char *fstype,
>   if (flags & MS_BIND && src && src[0] != '/') {
>   INFO("this is a relative bind mount");
>   srcfd = open_without_symlink(src, NULL);
>  -if (srcfd < 0)
>  -return srcfd;
>  +if (srcfd < 0) {
>  +ret = srcfd;
>  +goto out;
>  +}
>   ret = snprintf(srcbuf, 50, "/proc/self/fd/%d", srcfd);
>   if (ret < 0 || ret > 50) {
>  -close(srcfd);
>   ERROR("Out of memory");
>  -return -EINVAL;
>  +ret = -EINVAL;
>  +goto out_src;
>   }
>   mntsrc = srcbuf;
>   }
> 
>   destfd = open_without_symlink(dest, rootfs);
>   if (destfd < 0) {
>  -if (srcfd != -1)
>  -close(srcfd);
>  -return destfd;
>  +ret = destfd;
>  +goto out_src;
>   }
> 
>   ret = snprintf(destbuf, 50, "/proc/self/fd/%d", destfd);
>   if (ret < 0 || ret > 50) {
>  -if (srcfd != -1)
>  -close(srcfd);
>  -close(destfd);
>   ERROR("Out of memory");
>  -return -EINVAL;
>  +ret = -EINVAL;
>  +goto out_dest;
>   }
> 
>  -ret = mount(mntsrc, destbuf, fstype, flags, data);
>  -saved_errno = errno;
>  -if (srcfd != -1)
>  -close(srcfd);
>  -close(destfd);
>  +/* make sure the destination descriptor exists */
>  +if (access(destbuf, F_OK) == 0)
>  +mntdest = destbuf;
> >>> First, if we're going to shortcut I'd prefer to say "if /proc/self
> >>> does not exist then skip this check" fo rnow.
> >>> But can we think of any way to still do this check?
> >>>
> >>> What exactly are the cases?
> >>>
> >>> 1. lxc-execute, lxc-init tries to mount /proc.  We should be able
> >>> to simply have lxc always mount /proc before the pivot_root, so
> >>> we can properly do this check.
> >>>
> >>> what use-cases will break if we demand /proc to exist in the
> >>> container?  (We can add an option to umount /proc in lxc-init,
> >>> but the directory would have to exist)
> >> That's my use case - the failing function is tmp_proc_mount, thus happening
> >
> > Wait, is your rootfs NULL or not?
> 
> Yep, it's NULL - plain "lxc-execute -n foo -f 

Re: [lxc-devel] [PATCH] c/r: remember to chown the cgroup path (correctly)

2016-01-14 Thread Serge Hallyn
Quoting Tycho Andersen (tycho.ander...@canonical.com):
> On Thu, Jan 14, 2016 at 09:28:07AM +, Serge Hallyn wrote:
> > Quoting Tycho Andersen (tycho.ander...@canonical.com):
> > > On Wed, Jan 13, 2016 at 09:47:50PM +, Serge Hallyn wrote:
> > > > Quoting Tycho Andersen (tycho.ander...@canonical.com):
> > > > > 1. remember to chown the cgroup path when migrating a container
> > > > > 2. when restoring the cgroup path, try to compute the euid for root 
> > > > > vs.
> > > > >using geteuid(); geteuid works for start, but it doesn't work for
> > > > >migration since we're still real root at that point.
> > > > > 
> > > > > Signed-off-by: Tycho Andersen 
> > > > > ---
> > > > >  src/lxc/cgmanager.c | 6 +-
> > > > >  src/lxc/criu.c  | 5 +
> > > > >  2 files changed, 10 insertions(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/src/lxc/cgmanager.c b/src/lxc/cgmanager.c
> > > > > index 357182a..54e6912 100644
> > > > > --- a/src/lxc/cgmanager.c
> > > > > +++ b/src/lxc/cgmanager.c
> > > > > @@ -488,7 +488,11 @@ static bool chown_cgroup(const char 
> > > > > *cgroup_path, struct lxc_conf *conf)
> > > > >   return true;
> > > > >  
> > > > >   data.cgroup_path = cgroup_path;
> > > > > - data.origuid = geteuid();
> > > > > + data.origuid = mapped_hostid(0, conf, ID_TYPE_UID);
> > 
> > now, when starting a container, this happens in the
> > parent task in original uid.  So the geteuid() returns 1000,
> > mapped_hostid(0, conf, ID_TYPE_UID) something like 10.
> 
> Are you sure? Wouldn't it chmod everything to 1000 instead of 10
> in that case and be all screwed up?

No I'm not sure :)  I was lazily asking you to add an fprintf and verify :)

> > This is probably ok - but did you run all the lxc tests against
> > this to make sure?  You can still run 'lxc-cgroup' etc as an
> > unpriv user?
> 
> Well, it does change the behavior in ways I don't quite understand
> yet, so that's probably worth talking about. Before this series (the
> 644 one and this patch), I get:
> 
> /sys/fs/cgroup/cpuset/lxc/unpriv $ ls -alh
> total 0
> drwxrwxr-x 2 root 10 0 Jan 14 06:25 ./
> drwxr-xr-x 3 root root   0 Jan 14 06:25 ../
> -rw-r--r-- 1 root root   0 Jan 14 06:25 cgroup.clone_children
> -rwxrwxr-x 1 root 10 0 Jan 14 06:26 cgroup.procs*
> -rw-r--r-- 1 root root   0 Jan 14 06:25 cpuset.cpu_exclusive
> -rw-r--r-- 1 root root   0 Jan 14 06:25 cpuset.cpus
> -r--r--r-- 1 root root   0 Jan 14 06:25 cpuset.effective_cpus
> -r--r--r-- 1 root root   0 Jan 14 06:25 cpuset.effective_mems
> -rw-r--r-- 1 root root   0 Jan 14 06:25 cpuset.mem_exclusive
> -rw-r--r-- 1 root root   0 Jan 14 06:25 cpuset.mem_hardwall
> -rw-r--r-- 1 root root   0 Jan 14 06:25 cpuset.memory_migrate
> -r--r--r-- 1 root root   0 Jan 14 06:25 cpuset.memory_pressure
> -rw-r--r-- 1 root root   0 Jan 14 06:25 cpuset.memory_spread_page
> -rw-r--r-- 1 root root   0 Jan 14 06:25 cpuset.memory_spread_slab
> -rw-r--r-- 1 root root   0 Jan 14 06:25 cpuset.mems
> -rw-r--r-- 1 root root   0 Jan 14 06:25 cpuset.sched_load_balance
> -rw-r--r-- 1 root root   0 Jan 14 06:25 cpuset.sched_relax_domain_level
> -rw-r--r-- 1 root root   0 Jan 14 06:25 notify_on_release
> -rwxrwxr-x 1 root 10 0 Jan 14 06:25 tasks*
> 
> and after:
> 
> /sys/fs/cgroup/cpuset/lxc/unpriv $ ls -alh
> total 0
> drwxrwxr-x 2 10 10 0 Jan 14 06:27 ./
> drwxr-xr-x 3 root   root   0 Jan 14 06:27 ../
> -rw-r--r-- 1 root   root   0 Jan 14 06:27 cgroup.clone_children
> -rw-rw-r-- 1 10 10 0 Jan 14 06:27 cgroup.procs
> -rw-r--r-- 1 root   root   0 Jan 14 06:27 cpuset.cpu_exclusive
> -rw-r--r-- 1 root   root   0 Jan 14 06:27 cpuset.cpus
> -r--r--r-- 1 root   root   0 Jan 14 06:27 cpuset.effective_cpus
> -r--r--r-- 1 root   root   0 Jan 14 06:27 cpuset.effective_mems
> -rw-r--r-- 1 root   root   0 Jan 14 06:27 cpuset.mem_exclusive
> -rw-r--r-- 1 root   root   0 Jan 14 06:27 cpuset.mem_hardwall
> -rw-r--r-- 1 root   root   0 Jan 14 06:27 cpuset.memory_migrate
> -r--r--r-- 1 root   root   0 Jan 14 06:27 cpuset.memory_pressure
> -rw-r--r-- 1 root   root   0 Jan 14 06:27 cpuset.memory_spread_page
> -rw-r--r-- 1 root   root   0 Jan 14 06:27 cpuset.memory_spread_slab
> -rw-r--r-- 1 root   root   0 Jan 14 06:27 cpuset.mems
> -rw-r--r-- 1 root   root   0 Jan 14 06:27 cpuset.sched_load_balance
> -rw-r--r-- 1 root   root   0 Jan 14 06:27 cpuset.sched_relax_domain_level
> -rw-r--r-- 1 root   root   0 Jan 14 06:27 notify_on_release
> -rw-rw-r-- 1 10 10 0 Jan 14 06:27 tasks
> 
> However, it's not clear really /why/ from this patch the uid on tasks and
> cgroup.progs gets changed as well (probably some mask bug? it was ID_TYPE_UID 
> before but changing the gid? not sure).

Cgmanager does that automatically.  When you chown cgroup freezer:/lxc/c1,
cgmanager chowns the directory itself (so you can create new cgroups), and
the procs and tasks files, so you can move tasks in.  It does not change
the other files so you 

[lxc-devel] [PATCH] check for btrfs fs in should_default_to_snapshot

2016-01-14 Thread Christian Brauner
Check if we're really on a btrfs filesystem before we call btrfs_same_fs().
Otherwise we will report misleading errors although everything went fine.

Signed-off-by: Christian Brauner 
---
 src/lxc/bdev/lxcbtrfs.c   | 2 +-
 src/lxc/bdev/lxcoverlay.c | 2 +-
 src/lxc/lxccontainer.c| 4 
 3 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/src/lxc/bdev/lxcbtrfs.c b/src/lxc/bdev/lxcbtrfs.c
index 2db7c87..80b22fd 100644
--- a/src/lxc/bdev/lxcbtrfs.c
+++ b/src/lxc/bdev/lxcbtrfs.c
@@ -40,7 +40,7 @@
 #include "lxcrsync.h"
 #include "utils.h"
 
-lxc_log_define(btrfs, lxc);
+lxc_log_define(lxcbtrfs, lxc);
 
 /* defined in lxccontainer.c: needs to become common helper */
 extern char *dir_new_path(char *src, const char *oldname, const char *name,
diff --git a/src/lxc/bdev/lxcoverlay.c b/src/lxc/bdev/lxcoverlay.c
index d06d24f..4d30f3c 100644
--- a/src/lxc/bdev/lxcoverlay.c
+++ b/src/lxc/bdev/lxcoverlay.c
@@ -36,7 +36,7 @@
 #include "lxcrsync.h"
 #include "utils.h"
 
-lxc_log_define(overlay, lxc);
+lxc_log_define(lxcoverlay, lxc);
 
 static char *ovl_name;
 
diff --git a/src/lxc/lxccontainer.c b/src/lxc/lxccontainer.c
index 490a59d..7f2a390 100644
--- a/src/lxc/lxccontainer.c
+++ b/src/lxc/lxccontainer.c
@@ -2833,6 +2833,10 @@ bool should_default_to_snapshot(struct lxc_container *c0,
 
snprintf(p0, l0, "%s/%s", c0->config_path, c0->name);
snprintf(p1, l1, "%s/%s", c1->config_path, c1->name);
+
+   if (!is_btrfs_fs(p0) || !is_btrfs_fs(p1))
+   return false;
+
return btrfs_same_fs(p0, p1) == 0;
 }
 
-- 
2.7.0

___
lxc-devel mailing list
lxc-devel@lists.linuxcontainers.org
http://lists.linuxcontainers.org/listinfo/lxc-devel


Re: [lxc-devel] [PATCH] Fix btrfs bus error on sparc on snapshot delete

2016-01-14 Thread Serge Hallyn
Quoting Thomas Tanaka (thomas.tan...@oracle.com):
> The following patch fixes memory alignment and endianness
> issue while doing a snapshot deletion with btrfs as a
> backing store on platform such as sparc.
> 
> The implementation is taken from btrfs-progs.

Hi,

thanks, this looks nice.  I'm worried though that the #defines hide some of the
function definitions.  Can you add a brief comment listing the defined fns
names above their definitions in the .h file?

Where is bswap_64 defined?

> Signed-off-by: Thomas Tanaka 
> ---
>  src/lxc/bdev/lxcbtrfs.c |   28 +++-
>  src/lxc/bdev/lxcbtrfs.h |   39 +++
>  2 files changed, 54 insertions(+), 13 deletions(-)
> 
> diff --git a/src/lxc/bdev/lxcbtrfs.c b/src/lxc/bdev/lxcbtrfs.c
> index 2db7c87..0f8c1fc 100644
> --- a/src/lxc/bdev/lxcbtrfs.c
> +++ b/src/lxc/bdev/lxcbtrfs.c
> @@ -565,7 +565,7 @@ static int btrfs_recursive_destroy(const char *path)
>   int fd;
>   struct btrfs_ioctl_search_args args;
>   struct btrfs_ioctl_search_key *sk = 
> - struct btrfs_ioctl_search_header *sh;
> + struct btrfs_ioctl_search_header sh;
>   struct btrfs_root_ref *ref;
>   struct my_btrfs_tree *tree;
>   int ret, i;
> @@ -573,6 +573,7 @@ static int btrfs_recursive_destroy(const char *path)
>   int name_len;
>   char *name;
>   char *tmppath;
> + u64 dir_id;
>  
>   fd = open(path, O_RDONLY);
>   if (fd < 0) {
> @@ -624,21 +625,22 @@ static int btrfs_recursive_destroy(const char *path)
>  
>   off = 0;
>   for (i = 0; i < sk->nr_items; i++) {
> - sh = (struct btrfs_ioctl_search_header *)(args.buf + 
> off);
> - off += sizeof(*sh);
> + memcpy(, args.buf + off, sizeof(sh));
> + off += sizeof(sh);
>   /*
>* A backref key with the name and dirid of the parent
>* comes followed by the reoot ref key which has the
>* name of the child subvol in question.
>*/
> - if (sh->objectid != root_id && sh->type == 
> BTRFS_ROOT_BACKREF_KEY) {
> + if (sh.objectid != root_id && sh.type == 
> BTRFS_ROOT_BACKREF_KEY) {
>   ref = (struct btrfs_root_ref *)(args.buf + off);
> - name_len = ref->name_len;
> + name_len = btrfs_stack_root_ref_name_len(ref);
>   name = (char *)(ref + 1);
> - tmppath = get_btrfs_subvol_path(fd, sh->offset,
> - ref->dirid, name, name_len);
> - if (!add_btrfs_tree_node(tree, sh->objectid,
> - sh->offset, name,
> + dir_id = btrfs_stack_root_ref_dirid(ref);
> + tmppath = get_btrfs_subvol_path(fd, sh.offset,
> + dir_id, name, name_len);
> + if (!add_btrfs_tree_node(tree, sh.objectid,
> + sh.offset, name,
>   name_len, tmppath)) {
>   ERROR("Out of memory");
>   free_btrfs_tree(tree);
> @@ -648,15 +650,15 @@ static int btrfs_recursive_destroy(const char *path)
>   }
>   free(tmppath);
>   }
> - off += sh->len;
> + off += sh.len;
>  
>   /*
>* record the mins in sk so we can make sure the
>* next search doesn't repeat this root
>*/
> - sk->min_objectid = sh->objectid;
> - sk->min_type = sh->type;
> - sk->min_offset = sh->offset;
> + sk->min_objectid = sh.objectid;
> + sk->min_type = sh.type;
> + sk->min_offset = sh.offset;
>   }
>   sk->nr_items = 4096;
>   sk->min_offset++;
> diff --git a/src/lxc/bdev/lxcbtrfs.h b/src/lxc/bdev/lxcbtrfs.h
> index e0adb7a..4ea2b4a 100644
> --- a/src/lxc/bdev/lxcbtrfs.h
> +++ b/src/lxc/bdev/lxcbtrfs.h
> @@ -317,6 +317,45 @@ struct btrfs_ioctl_ino_lookup_args {
>  #define BTRFS_LAST_FREE_OBJECTID -256ULL
>  #define BTRFS_FIRST_CHUNK_TREE_OBJECTID 256ULL
>  
> +/* The followings are macro for correctly getting member of structures
> + * in both low and big endian platforms as per btrfs-progs 
> + */
> +#ifdef __CHECKER__
> +#define __force__attribute__((force))
> +#else
> +#define __force
> +#endif
> +
> +#if __BYTE_ORDER == __BIG_ENDIAN
>