Re: [PATCH ] Fix capability check to allow privileged CLONE_NEWUSER from nested user namespaces

2018-01-31 Thread Serge E. Hallyn
Quoting Srivatsa S. Bhat (sriva...@csail.mit.edu):
> From: Srivatsa S. Bhat 
> 
> The existing patch which disallows unprivileged CLONE_NEWUSER applies
> the check for CAP_SYS_ADMIN capability on the 'init_user_ns'
> namespace, which is not entirely correct. Consider the following sequence:
> 
> 1. A process with root privileges calls
>clone(child_fn, ..., CLONE_NEWUSER, ...) to create a new user namespace.
> 
> 2. child_fn, now running in the newly created user namespace enjoys the
>full set of capabilities in the new user namespace, but has lost
>its capabilities in the old user namespace (init_user_ns in this
>case).
> 
> 3. child_fn now calls
>clone(child_fn2, ..., CLONE_NEWUSER, ...) to create a new (nested)
>user namespace.
> 
> Step 3 should have succeeded because child_fn has full privileges

No, it should not.  If the host has unprivileged_userns_clone=false,
then it should require privilege against the init_user_ns to change
or bypass that.  Yes the userns was *created* by someone with that
privilege, but likely they did so precisely so that they could run
more sandboxed code.

> (including CAP_SYS_ADMIN) in its user namespace, but this step fails
> because the CAP_SYS_ADMIN capability is checked against init_user_ns,
> as opposed to child_fn's user namespace.
> 
> So fix this by checking for CAP_SYS_ADMIN using ns_capable() on the
> current task's user namespace.
> 
> This also helps the userns07 testcase from LTP
> (testcases/kernel/containers/userns/userns07.c) to pass when running
> with root privileges.

If you want to run that test (which checks for >=32 level nesting depth)
with privilege, surely you can just set unprivileged_userns_clone=true?

> Signed-off-by: Srivatsa S. Bhat 
> ---
> Applies on debian kernel's master branch.
> 
>  ...low-unprivileged-CLONE_NEWUSER-by-default.patch |   11 +++
>  1 file changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git 
> a/debian/patches/debian/add-sysctl-to-disallow-unprivileged-CLONE_NEWUSER-by-default.patch
>  
> b/debian/patches/debian/add-sysctl-to-disallow-unprivileged-CLONE_NEWUSER-by-default.patch
> index 55edbc7..fc978ea 100644
> --- 
> a/debian/patches/debian/add-sysctl-to-disallow-unprivileged-CLONE_NEWUSER-by-default.patch
> +++ 
> b/debian/patches/debian/add-sysctl-to-disallow-unprivileged-CLONE_NEWUSER-by-default.patch
> @@ -12,6 +12,9 @@ issues are found, we have a fail-safe.
>  
>  Signed-off-by: Serge Hallyn 
>  [bwh: Remove unneeded binary sysctl bits]
> +[Srivatsa: Fix capability checks when running nested user namespaces
> +by using ns_capable() on the current task's user namespace.]
> +Signed-off-by: Srivatsa S. Bhat 
>  ---
>  --- a/kernel/fork.c
>  +++ b/kernel/fork.c
> @@ -27,24 +30,24 @@ Signed-off-by: Serge Hallyn 
>   
>   /*
>* Minimum number of threads to boot the kernel
> -@@ -1550,6 +1555,10 @@ static __latent_entropy struct task_stru
> +@@ -1550,6 +1555,10 @@ static __latent_entropy struct task_struct 
> *copy_process(
>   if ((clone_flags & (CLONE_NEWUSER|CLONE_FS)) == 
> (CLONE_NEWUSER|CLONE_FS))
>   return ERR_PTR(-EINVAL);
>   
>  +if ((clone_flags & CLONE_NEWUSER) && !unprivileged_userns_clone)
> -+if (!capable(CAP_SYS_ADMIN))
> ++if (!ns_capable(current_user_ns(), CAP_SYS_ADMIN))
>  +return ERR_PTR(-EPERM);
>  +
>   /*
>* Thread groups must share signals as well, and detached threads
>* can only be started up within the thread group.
> -@@ -2343,6 +2352,12 @@ SYSCALL_DEFINE1(unshare, unsigned long,
> +@@ -2343,6 +2352,12 @@ SYSCALL_DEFINE1(unshare, unsigned long, unshare_flags)
>   if (unshare_flags & CLONE_NEWNS)
>   unshare_flags |= CLONE_FS;
>   
>  +if ((unshare_flags & CLONE_NEWUSER) && !unprivileged_userns_clone) {
>  +err = -EPERM;
> -+if (!capable(CAP_SYS_ADMIN))
> ++if (!ns_capable(current_user_ns(), CAP_SYS_ADMIN))
>  +goto bad_unshare_out;
>  +}
>  +



Bug#603944: Updated patch

2010-12-09 Thread Serge E. Hallyn
Here is a patch (against the ubuntu package, just as example)
which instead of doing a dumb retry loop, waits for udev.

=== modified file 'debian/changelog'
--- debian/changelog2010-04-26 15:17:47 +
+++ debian/changelog2010-12-08 21:44:32 +
@@ -1,3 +1,15 @@
+initramfs-tools (0.92bubuntu79) natty; urgency=low
+
+  * When using multipath, it is possible that mountroot() will race
+with udev's renaming of /dev/disk/by-uuid/{rootfs-uuid} from
+/dev/sd?? to /dev/mapper/something.  After multipath has grabbed
+the /dev/sd?? and until udev completes the rename, mounting
+/dev/disk/by-uuid/{rootfs-uuid} will fail with -EBUSY.  In that
+case, call 'udevsettle' to wait until udev has finished all its
+related actions. (Closes LP: #686832)
+
+ -- Serge Hallyn serge.hal...@ubuntu.com  Fri, 19 Nov 2010 12:19:43 -0600
+
 initramfs-tools (0.92bubuntu78) lucid; urgency=low
 
   * hooks/compcache: Escape $-expansions inside EOF (thanks, Eugene San;

=== modified file 'scripts/local'
--- scripts/local   2009-12-21 23:06:53 +
+++ scripts/local   2010-11-20 01:03:26 +
@@ -69,10 +69,19 @@
# FIXME This has no error checking
[ -n ${FSTYPE} ]  modprobe ${FSTYPE}
 
-   # FIXME This has no error checking
# Mount root
-   mount ${roflag} ${FSTYPE:+-t ${FSTYPE} }${ROOTFLAGS} ${ROOT} ${rootmnt}
-   mountroot_status=$?
+   tries=0
+   ret=1
+   while [ $tries -lt 2 -a $ret -ne 0 ]; do
+   mount ${roflag} ${FSTYPE:+-t ${FSTYPE} }${ROOTFLAGS} ${ROOT} 
${rootmnt}
+   ret=$?
+   if [ $ret -ne 0 ]; then
+   echo failed attempt $tries to mount $ROOT as root
+   udevadm settle
+   tries=$((tries+1))
+   fi
+   done
+   mountroot_status=$ret
if [ $LOOP ]; then
if [ $mountroot_status != 0 ]; then
if [ ${FSTYPE} = ntfs ] || [ ${FSTYPE} = vfat ]; then




-- 
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/20101209212028.ga24...@hallyn.com



Bug#603944: retry mounting of root

2010-11-18 Thread Serge E. Hallyn
Package: initramfs-tools
Version: 0.98

When using multipath, it is possible that mountroot() will race with
udev's renaming of /dev/disk/by-uuid/{rootfs-uuid} from /dev/sd?? to
/dev/mapper/something.  After multipath has grabbed the /dev/sd?? and
until udev completes the rename, mounting
/dev/disk/by-uuid/{rootfs-uuid} will fail with -EBUSY.

Here is a patch I've been using successfully for awhile.  Colin Watson
has suggested that:

 A poll/retry loop is generally a suboptimal way to do this kind of
 thing; what we really want is to wait for udev to tell us that it has
 finished with the event that triggered renaming of the device.

which does seem cleaner.

thanks,
-serge

diff -Nru initramfs-tools-0.98ubuntu2/debian/changelog 
initramfs-tools-0.98ubuntu3/debian/changelog
--- initramfs-tools-0.98ubuntu2/debian/changelog2010-08-20 
03:48:58.0 -0500
+++ initramfs-tools-0.98ubuntu3/debian/changelog2010-08-24 
22:31:31.0 -0500
@@ -1,3 +1,14 @@
+initramfs-tools (0.98ubuntu3) maverick; urgency=low
+
+  * Add retries to mountroot().  This is particularly needed when we
+use multipath, because it is possible that mountroot() will race
+with udev's renaming of /dev/disk/by-uuid/{rootfs-uuid} from
+/dev/sd?? to /dev/mapper/something.  After multipath has grabbed
+the /dev/sd?? and until udev completes the rename, mounting
+/dev/disk/by-uuid/{rootfs-uuid} will fail with -EBUSY.
+
+ -- Serge Hallyn serge.hal...@canonical.com  Tue, 24 Aug 2010 22:17:57 -0500
+
 initramfs-tools (0.98ubuntu2) maverick; urgency=low
 
   * The ramzswap device changed its interface so that the disk size needs to
diff -Nru initramfs-tools-0.98ubuntu2/scripts/local 
initramfs-tools-0.98ubuntu3/scripts/local
--- initramfs-tools-0.98ubuntu2/scripts/local   2010-08-20 03:48:58.0 
-0500
+++ initramfs-tools-0.98ubuntu3/scripts/local   2010-08-24 22:16:17.0 
-0500
@@ -86,10 +86,19 @@
# FIXME This has no error checking
[ -n ${FSTYPE} ]  modprobe ${FSTYPE}
 
-   # FIXME This has no error checking
# Mount root
-   mount ${roflag} ${FSTYPE:+-t ${FSTYPE} }${ROOTFLAGS} ${ROOT} ${rootmnt}
-   mountroot_status=$?
+   tries=0
+   ret=1
+   while [ $tries -lt 10 -a $ret -ne 0 ]; do
+   mount ${roflag} ${FSTYPE:+-t ${FSTYPE} }${ROOTFLAGS} ${ROOT} 
${rootmnt}
+   ret=$?
+   if [ $ret -ne 0 ]; then
+   echo failed attempt $tries to mount $ROOT as root
+   sleep 1
+   tries=$((tries+1))
+   fi
+   done
+   mountroot_status=$ret
if [ $LOOP ]; then
if [ $mountroot_status != 0 ]; then
if [ ${FSTYPE} = ntfs ] || [ ${FSTYPE} = vfat ]; then



-- 
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/20101118174249.ga24...@hallyn.com