Quoting Papp Tamas (tom...@martos.bme.hu):
> On 02/28/2012 04:13 PM, Serge Hallyn wrote:
> >Quoting Papp Tamas (tom...@martos.bme.hu):
> >>On 02/28/2012 01:20 AM, Serge Hallyn wrote:
> >>>Quoting Daniel Lezcano (daniel.lezc...@free.fr):
> >>>>Hi all,
> >>>>
> >>>>I will release a 0.8.0-rc1. I am looking for volunteers to test it :)
> >>>Worked fine for me.  Tested create and clone of ubuntu, ubuntu and
> >>>ubuntu-cloud images, with dir and lvm backing stores.  (And a run
> >>>of lp:~serge-hallyn/+junk/lxc-test)
> >>>
> >>>Note, because upstream kernel didn't much care about the
> >>>'mount -o remount,ro /' problem, I'm going to patch lxc to
> >>>pin open a '${rootfs}.hold' file, as long as the container
> >>>is running.  That will prevent the underlying fs from being
> >>>remounted ro.  (see
> >>>https://bugs.launchpad.net/ubuntu/+source/lxc/+bug/942325 for
> >>>details).  That'll buy us some time to find a better solution
> >>>in the kernel.
> >>>
> >>>
> >>Why can a container change mount options outside of its rootfs?
> >>Sorry for the stupid question:)
> >It's not a stupid question at all.
> >
> >The container isn't changing mount options outside of its rootfs.  There
> >are two places an fs can be marked readonly - in the mount itself, and in
> >the superblock.  When you make a bind mount, you are creating more mounts
> >(vfsmounts) using the same superblock.
> >
> >If you do
> >
> >     mount --bind / / # not needed in container bc it's already been done
> >     mount --bind -o remount,ro /
> >
> >then you are setting the readonly flag on the mount itself.  If you just do
> >
> >     mount -o remount,ro /
> >
> >then you are setting the readonly flag on the superblock, which will
> >force all other mounts of that superblock to also be readonly.
> >
> >Right now there is no way to prevent a container from doing that.  I sent
> >a patch to make the devices cgroup be consulted on that, so that it could
> >return -EPERM.  That was refused.  The two other options I'm considering
> >(and it wouldn't hurt to have both) are 1. to pass the remount flags to the
> >LSM (selinux or apparmor or smack) so that it can deny permission.  Right
> >now it can't do that (except for an all-or-nothing check on remount).  And 2.
> >to make it so that after doing
> >
> >     mount --bind / /
> >     mount --bind -o remount,ro /
> >     mount --bind -o remount,rw /
> >
> >any subsequent
> >
> >     mount -o remount,rw /
> >
> >would be refused (or automatically done only at the mount level).  I don't
> >think that should be hard to do at fs/namespace.c:do_remount().
> 
> 
> This may be too much for my brain:)
> 
> Anyway, could you make a deb package from it?

I've got it working for an ubuntu package, though we're in freeze right
now.  I intend to push the patch to my github tree tomorrow, and I've
pushed the package to ppa:serge-hallyn/virt (version 0.7.5-3ubuntu31,
should build in a few hours).  Meanwhile here is the actual patch for
now.

Tests fine for me.
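
To put the mount-versus-superblock distinction from the quoted explanation
into code form, here is a minimal sketch that calls mount(2) directly.  The
helper names remount_ro_flags() and remount_ro() are made up for
illustration and are not part of lxc; the sketch assumes Linux and a caller
with CAP_SYS_ADMIN.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/mount.h>

/*
 * Hypothetical helper (not lxc code): build the mount(2) flags for a
 * read-only remount.  With MS_BIND the change is limited to this one
 * mount (vfsmount); without it the superblock is remounted read-only,
 * which affects every mount that shares that superblock.
 */
unsigned long remount_ro_flags(bool mount_level_only)
{
	unsigned long flags = MS_REMOUNT | MS_RDONLY;

	if (mount_level_only)
		flags |= MS_BIND;
	return flags;
}

/*
 * Remount an existing mount point read-only.
 * Returns 0 on success, -1 with errno set on failure.
 * source and filesystemtype are ignored by the kernel for MS_REMOUNT.
 */
int remount_ro(const char *target, bool mount_level_only)
{
	return mount(NULL, target, NULL,
		     remount_ro_flags(mount_level_only), NULL);
}
```

Here remount_ro("/", true) corresponds to 'mount --bind -o remount,ro /'
and remount_ro("/", false) to 'mount -o remount,ro /'.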

Subject: lxc-start: if rootfs is a dir, pin the fs

Otherwise the container can remount the shared underlying fs readonly.

Index: lxc-dnsmasq/src/lxc/conf.c
===================================================================
--- lxc-dnsmasq.orig/src/lxc/conf.c     2012-02-28 19:13:01.400960000 +0000
+++ lxc-dnsmasq/src/lxc/conf.c  2012-02-28 20:05:45.538144907 +0000
@@ -445,6 +445,51 @@
        return mount_unknow_fs(rootfs, target, 0);
 }
 
+/*
+ * pin_rootfs
+ * if rootfs is a directory, then open ${rootfs}.hold for writing for the
+ * duration of the container run, to prevent the container from marking the
+ * underlying fs readonly on shutdown.
+ * return -1 on error.
+ * return -2 if nothing needed to be pinned.
+ * return an open fd (>=0) if we pinned it.
+ */
+int pin_rootfs(const char *rootfs)
+{
+       char absrootfs[MAXPATHLEN];
+       char absrootfspin[MAXPATHLEN];
+       struct stat s;
+       int ret, fd;
+
+       if (!realpath(rootfs, absrootfs)) {
+               SYSERROR("failed to get real path for '%s'", rootfs);
+               return -1;
+       }
+
+       if (access(absrootfs, F_OK)) {
+               SYSERROR("'%s' is not accessible", absrootfs);
+               return -1;
+       }
+
+       if (stat(absrootfs, &s)) {
+               SYSERROR("failed to stat '%s'", absrootfs);
+               return -1;
+       }
+
+       if (!S_ISDIR(s.st_mode))
+               return -2;
+
+       ret = snprintf(absrootfspin, MAXPATHLEN, "%s%s", absrootfs, ".hold");
+       if (ret >= MAXPATHLEN) {
+               SYSERROR("pathname too long for rootfs hold file");
+               return -1;
+       }
+
+       fd = open(absrootfspin, O_CREAT | O_RDWR, S_IWUSR|S_IRUSR);
+       INFO("opened %s as fd %d\n", absrootfspin, fd);
+       return fd;
+}
+
 static int mount_rootfs(const char *rootfs, const char *target)
 {
        char absrootfs[MAXPATHLEN];
Index: lxc-dnsmasq/src/lxc/conf.h
===================================================================
--- lxc-dnsmasq.orig/src/lxc/conf.h     2012-02-28 19:13:01.400960000 +0000
+++ lxc-dnsmasq/src/lxc/conf.h  2012-02-28 19:13:01.400960000 +0000
@@ -218,6 +218,8 @@
  */
 extern struct lxc_conf *lxc_conf_init(void);
 
+extern int pin_rootfs(const char *rootfs);
+
 extern int lxc_create_network(struct lxc_handler *handler);
 extern void lxc_delete_network(struct lxc_list *networks);
 extern int lxc_assign_network(struct lxc_list *networks, pid_t pid);
Index: lxc-dnsmasq/src/lxc/start.c
===================================================================
--- lxc-dnsmasq.orig/src/lxc/start.c    2012-02-28 19:13:01.400960000 +0000
+++ lxc-dnsmasq/src/lxc/start.c 2012-02-28 20:07:41.174882442 +0000
@@ -565,6 +565,7 @@
        int clone_flags;
        int failed_before_rename = 0;
        const char *name = handler->name;
+       int pinfd;
 
        if (lxc_sync_init(handler))
                return -1;
@@ -585,6 +586,17 @@
        }
 
 
+       /*
+        * if the rootfs is not a blockdev, prevent the container from
+        * marking it readonly.
+        */
+
+       pinfd = pin_rootfs(handler->conf->rootfs.path);
+       if (pinfd == -1) {
+               ERROR("failed to pin the container's rootfs");
+               goto out_abort;
+       }
+
        /* Create a process in a new set of namespaces */
        handler->pid = lxc_clone(do_start, handler, clone_flags);
        if (handler->pid < 0) {
@@ -627,6 +639,10 @@
        }
 
        lxc_sync_fini(handler);
+
+       if (pinfd >= 0)
+               close(pinfd);
+
        return 0;
 
 out_delete_net:

_______________________________________________
Lxc-users mailing list
Lxc-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/lxc-users
