Seth Forshee <[email protected]> writes:

> On Thu, Sep 24, 2015 at 04:14:33PM -0500, Eric W. Biederman wrote:
>> Seth Forshee <[email protected]> writes:
>> 
>> > Initially this will be used to eliminate the implicit MNT_NODEV
>> > flag for mounts from user namespaces. In the future it will also
>> > be used for translating ids and checking capabilities for
>> > filesystems mounted from user namespaces.
>> >
>> > s_user_ns is initialized in alloc_super() and is generally set to
>> > current_user_ns(). To avoid security and corruption issues, two
>> > additional mount checks are also added:
>> >
>> >  - do_new_mount() gains a check that the user has CAP_SYS_ADMIN
>> >    in current_user_ns().
>> >
>> >  - sget() will fail with EBUSY when the filesystem it's looking
>> >    for is already mounted from another user namespace.
>> >
>> > proc requires some special handling. The user namespace of
>> > current isn't appropriate when forking as a result of clone(2)
>> > with CLONE_NEWPID|CLONE_NEWUSER, as it will set s_user_ns to the
>> > namespace of the parent and make proc unmountable in the new user
>> > namespace. Instead, the user namespace which owns the new pid
>> > namespace is used. sget_userns() is added to allow passing in
>> > a namespace other than that of current, and sget becomes a
>> > wrapper around sget_userns() which passes current_user_ns().
>> 
>> Minor nits below.  I have fixed them up.
>> 
>> > Signed-off-by: Seth Forshee <[email protected]>
>> > ---
>> >  fs/namespace.c     |  3 +++
>> >  fs/proc/root.c     |  3 ++-
>> >  fs/super.c         | 38 +++++++++++++++++++++++++++++++++-----
>> >  include/linux/fs.h |  9 ++++++++-
>> >  4 files changed, 46 insertions(+), 7 deletions(-)
>> >
>> > diff --git a/fs/namespace.c b/fs/namespace.c
>> > index 0570729c87fd..d023a353dc63 100644
>> > --- a/fs/namespace.c
>> > +++ b/fs/namespace.c
>> > @@ -2381,6 +2381,9 @@ static int do_new_mount(struct path *path, const char *fstype, int flags,
>> >    struct vfsmount *mnt;
>> >    int err;
>> >  
>> > +  if (!ns_capable(current_user_ns(), CAP_SYS_ADMIN))
>> > +          return -EPERM;
>> > +
>> >    if (!fstype)
>> >            return -EINVAL;
>> >  
>> > diff --git a/fs/proc/root.c b/fs/proc/root.c
>> > index 361ab4ee42fc..4b302cbf13f9 100644
>> > --- a/fs/proc/root.c
>> > +++ b/fs/proc/root.c
>> > @@ -117,7 +117,8 @@ static struct dentry *proc_mount(struct file_system_type *fs_type,
>> >                    return ERR_PTR(-EPERM);
>> >    }
>> >  
>> > -  sb = sget(fs_type, proc_test_super, proc_set_super, flags, ns);
>> > +  sb = sget_userns(fs_type, proc_test_super, proc_set_super, flags,
>> > +                   ns->user_ns, ns);
>> >    if (IS_ERR(sb))
>> >            return ERR_CAST(sb);
>> >  
>> > diff --git a/fs/super.c b/fs/super.c
>> > index 954aeb80e202..42837da7d641 100644
>> > --- a/fs/super.c
>> > +++ b/fs/super.c
>> > @@ -33,6 +33,7 @@
>> >  #include <linux/cleancache.h>
>> >  #include <linux/fsnotify.h>
>> >  #include <linux/lockdep.h>
>> > +#include <linux/user_namespace.h>
>> >  #include "internal.h"
>> >  
>> >  
>> > @@ -163,6 +164,7 @@ static void destroy_super(struct super_block *s)
>> >  {
>> >    list_lru_destroy(&s->s_dentry_lru);
>> >    list_lru_destroy(&s->s_inode_lru);
>> > +  put_user_ns(s->s_user_ns);
>> >    security_sb_free(s);
>> >    WARN_ON(!list_empty(&s->s_mounts));
>> >    kfree(s->s_subtype);
>> > @@ -178,7 +180,8 @@ static void destroy_super(struct super_block *s)
>> >   *        Allocates and initializes a new &struct super_block.  alloc_super()
>> >   *        returns a pointer new superblock or %NULL if allocation had failed.
>> >   */
>> > -static struct super_block *alloc_super(struct file_system_type *type, int flags)
>> > +static struct super_block *alloc_super(struct file_system_type *type, int flags,
>> > +                                 struct user_namespace *user_ns)
>> >  {
>> >    struct super_block *s = kzalloc(sizeof(struct super_block),  GFP_USER);
>> >    static const struct super_operations default_op;
>> > @@ -246,6 +249,8 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags)
>> >    s->s_shrink.count_objects = super_cache_count;
>> >    s->s_shrink.batch = 1024;
>> >    s->s_shrink.flags = SHRINKER_NUMA_AWARE | SHRINKER_MEMCG_AWARE;
>> > +
>> > +  s->s_user_ns = get_user_ns(user_ns);
>> >    return s;
>> >  
>> >  fail:
>> > @@ -442,17 +447,17 @@ void generic_shutdown_super(struct super_block *sb)
>> >  EXPORT_SYMBOL(generic_shutdown_super);
>> >  
>> >  /**
>> > - *        sget    -       find or create a superblock
>> > + *        sget_userns -   find or create a superblock
>> >   *        @type:  filesystem type superblock should belong to
>> >   *        @test:  comparison callback
>> >   *        @set:   setup callback
>> >   *        @flags: mount flags
>> 
>> You don't mention the user namespace parameter here.  I have fixed that
>> as follows.
>> 
>>   + *     @user_ns: User namespace you need CAP_SYS_ADMIN over to mount this fs.
>
> Looks good, thanks. Seems I also missed it in alloc_super though.
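
For anyone following along, here is a rough user-space sketch of the
behaviour the commit message describes: a superblock takes a reference
on its owning user namespace, a lookup from a different namespace fails
with EBUSY, and sget() simply forwards the caller's namespace.  This is
only a model, not the code in the patch: the structs are cut-down
stand-ins, the model_* names and the fs_name string match (in place of
the real test() callback) are made up for illustration, and locking is
ignored entirely.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-ins for the kernel types; not the real definitions. */
struct user_namespace {
	int count;				/* reference count */
};

struct super_block {
	const char *fs_name;			/* hypothetical: replaces fs type + test() */
	struct user_namespace *s_user_ns;	/* namespace that owns this superblock */
	struct super_block *next;
};

static struct super_block *model_supers;	/* list of live superblocks */
static struct user_namespace *model_current_ns;	/* stands in for current_user_ns() */

static struct user_namespace *get_user_ns(struct user_namespace *ns)
{
	ns->count++;
	return ns;
}

static void put_user_ns(struct user_namespace *ns)
{
	assert(ns->count > 0);
	ns->count--;
}

/*
 * Find or create a superblock owned by user_ns.  On failure, return NULL
 * and store a negative errno value in *err.
 */
static struct super_block *model_sget_userns(const char *fs_name,
					     struct user_namespace *user_ns,
					     int *err)
{
	struct super_block *sb;

	*err = 0;
	for (sb = model_supers; sb; sb = sb->next) {
		if (strcmp(sb->fs_name, fs_name) != 0)
			continue;
		/* Already mounted from another user namespace: refuse. */
		if (sb->s_user_ns != user_ns) {
			*err = -EBUSY;
			return NULL;
		}
		return sb;
	}

	sb = calloc(1, sizeof(*sb));
	if (!sb) {
		*err = -ENOMEM;
		return NULL;
	}
	sb->fs_name = fs_name;
	sb->s_user_ns = get_user_ns(user_ns);	/* dropped in model_destroy_super() */
	sb->next = model_supers;
	model_supers = sb;
	return sb;
}

/* sget() as a thin wrapper that passes the caller's own namespace. */
static struct super_block *model_sget(const char *fs_name, int *err)
{
	return model_sget_userns(fs_name, model_current_ns, err);
}

/* Unlink the superblock and drop its namespace reference. */
static void model_destroy_super(struct super_block *sb)
{
	struct super_block **p;

	for (p = &model_supers; *p; p = &(*p)->next) {
		if (*p == sb) {
			*p = sb->next;
			break;
		}
	}
	put_user_ns(sb->s_user_ns);
	free(sb);
}
```

The proc case in the patch corresponds to calling the _userns variant
directly with the namespace that owns the pid namespace, rather than
going through the wrapper.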

FYI I have placed everything that has made it through my review in my
for-testing branch up on kernel.org.  So you can see what I have merged,
and the build test bots can look and see if they find anything to
complain about.

git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace.git for-testing

Eric
