Re: [patch 2/2] VFS: allow filesystem to override mknod capability checks

2007-10-01 Thread Miklos Szeredi
> On Monday September 24, [EMAIL PROTECTED] wrote:
> > From: Miklos Szeredi <[EMAIL PROTECTED]>
> > 
> > Add a new super block flag, that results in the VFS not checking if
> > the current process has enough privileges to do an mknod().
> > 
> > If this flag is set, all mounts for this super block will have the
> > "nodev" flag implied.
> > 
> > This is needed on filesystems, where an unprivileged user may be able
> > to create a device node, without causing security problems.
> > 
> > One such example is "mountlo" a loopback mount utility implemented
> > with fuse and UML, which runs as an unprivileged userspace process.
> > In this case the user does in fact have the right to create device
> > nodes within the filesystem image, as long as the user has write
> > access to the image.  Since the filesystem is mounted with "nodev",
> > adding device nodes is not a security concern.
> 
> I must admit that I don't feel very comfortable about this.  I wonder
> how many more flags we might be tempted to add to allow
> user-controlled filesystems to do interesting things.  Somehow I doubt
> this will be the last, so we should be very careful allowing it to be
> the first (or is it the second already?)

Or third ;)

Yes, I've always argued, that permission checking needs to be pushed
to the filesystem, since the VFS can't always do a good enough job.

My usual example is sshfs, where it is just impossible to know the
uid/gid mapping between the server and the client.  So any permission
checking based on uid or gid on the client simply can't work, the only
sane thing to do is to delegate the permission checking to the server.
Which works beautifully, since the server is an unprivileged process,
and everything automatically works out.

All the fuse kernel module has to do is to basically define
->permission() to always return success, and let the userspace
filesystem do the permission checking.

This works fine, except a couple of things, like checking the sticky
bit on a directory, and mknod().

> A more concrete comment on the patch:  Is it really necessary to
> introduce IS_MNT_NODEV??  Why not simply test both the flags
> (MS_MKNOD_NOCAP and MNT_NODEV) before allowing the mknod?

Because we need MNT_NODEV on _all_ mounts belonging to a superblock,
not just the one on which mknod was performed on, which would get
really ugly.  This way it's simple: if the MS_MKNOD_NOCAP is specified
for the super block, that implies, that devices can't be opened.

> Do we actually need a new flag?  Would not a combination of MS_NODEV
> and MS_SETUSER achieve the same thing (near enough)?

See above.

> Do you imagine this flag being set as a mount option (-o unprivmknod)
> or does the filesystem set it itself?

I imagine this flag to usually be set by the filesystem itself.  But
it could be a separate mount option.  I guess it could make sense in
some non-fuse cases as well.

> If the latter, maybe this test should be moved down into the
> filesystems ->mknod operation.  Most filesystems get
>  
>   if ((S_ISCHR(mode) || S_ISBLK(mode)) && !capable(CAP_MKNOD))
>   return -EPERM;
> 
> at the top of ->mknod.  fuse can do whatever it likes without
> bothering common code.

Yes, that's one of the options, but it would be a huge change, with
nasty security implications for past and future filesystems.

> According to fs.h, we only support 32 fs-independent mount-flags, and
> over half are in use.  I'm not convinced we should spend one on such a
> narrow requirement.

I think there's consensus on the need for a new mount API.  Not just
because the 32 bits for the flags will be exhausted sooner or later
anyway, but the current API is limited in lots of other ways.

Miklos
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 2/2] VFS: allow filesystem to override mknod capability checks

2007-10-01 Thread Miklos Szeredi
 On Monday September 24, [EMAIL PROTECTED] wrote:
  From: Miklos Szeredi [EMAIL PROTECTED]
  
  Add a new super block flag, that results in the VFS not checking if
  the current process has enough privileges to do an mknod().
  
  If this flag is set, all mounts for this super block will have the
  nodev flag implied.
  
  This is needed on filesystems, where an unprivileged user may be able
  to create a device node, without causing security problems.
  
  One such example is mountlo a loopback mount utility implemented
  with fuse and UML, which runs as an unprivileged userspace process.
  In this case the user does in fact have the right to create device
  nodes within the filesystem image, as long as the user has write
  access to the image.  Since the filesystem is mounted with nodev,
  adding device nodes is not a security concern.
 
 I must admit that I don't feel very comfortable about this.  I wonder
 how many more flags we might be tempted to add to allow
 user-controlled filesystems to do interesting things.  Somehow I doubt
 this will be the last, so we should be very careful allowing it to be
 the first (or is it the second already?)

Or third ;)

Yes, I've always argued, that permission checking needs to be pushed
to the filesystem, since the VFS can't always do a good enough job.

My usual example is sshfs, where it is just impossible to know the
uid/gid mapping between the server and the client.  So any permission
checking based on uid or gid on the client simply can't work, the only
sane thing to do is to delegate the permission checking to the server.
Which works beautifully, since the server is an unprivileged process,
and everything automatically works out.

All the fuse kernel module has to do is to basically define
-permission() to always return success, and let the userspace
filesystem do the permission checking.

This works fine, except a couple of things, like checking the sticky
bit on a directory, and mknod().

 A more concrete comment on the patch:  Is it really necessary to
 introduce IS_MNT_NODEV??  Why not simply test both the flags
 (MS_MKNOD_NOCAP and MNT_NODEV) before allowing the mknod?

Because we need MNT_NODEV on _all_ mounts belonging to a superblock,
not just the one on which mknod was performed on, which would get
really ugly.  This way it's simple: if the MS_MKNOD_NOCAP is specified
for the super block, that implies, that devices can't be opened.

 Do we actually need a new flag?  Would not a combination of MS_NODEV
 and MS_SETUSER achieve the same thing (near enough)?

See above.

 Do you imagine this flag being set as a mount option (-o unprivmknod)
 or does the filesystem set it itself?

I imagine this flag to usually be set by the filesystem itself.  But
it could be a separate mount option.  I guess it could make sense in
some non-fuse cases as well.

 If the latter, maybe this test should be moved down into the
 filesystems -mknod operation.  Most filesystems get
  
   if ((S_ISCHR(mode) || S_ISBLK(mode))  !capable(CAP_MKNOD))
   return -EPERM;
 
 at the top of -mknod.  fuse can do whatever it likes without
 bothering common code.

Yes, that's one of the options, but it would be a huge change, with
nasty security implications for past and future filesystems.

 According to fs.h, we only support 32 fs-independent mount-flags, and
 over half are in use.  I'm not convinced we should spend one on such a
 narrow requirement.

I think there's consensus on the need for a new mount API.  Not just
because the 32 bits for the flags will be exhausted sooner or later
anyway, but the current API is limited in lots of other ways.

Miklos
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 2/2] VFS: allow filesystem to override mknod capability checks

2007-09-27 Thread Neil Brown
On Monday September 24, [EMAIL PROTECTED] wrote:
> From: Miklos Szeredi <[EMAIL PROTECTED]>
> 
> Add a new super block flag, that results in the VFS not checking if
> the current process has enough privileges to do an mknod().
> 
> If this flag is set, all mounts for this super block will have the
> "nodev" flag implied.
> 
> This is needed on filesystems, where an unprivileged user may be able
> to create a device node, without causing security problems.
> 
> One such example is "mountlo" a loopback mount utility implemented
> with fuse and UML, which runs as an unprivileged userspace process.
> In this case the user does in fact have the right to create device
> nodes within the filesystem image, as long as the user has write
> access to the image.  Since the filesystem is mounted with "nodev",
> adding device nodes is not a security concern.

I must admit that I don't feel very comfortable about this.  I wonder
how many more flags we might be tempted to add to allow
user-controlled filesystems to do interesting things.  Somehow I doubt
this will be the last, so we should be very careful allowing it to be
the first (or is it the second already?)

A more concrete comment on the patch:  Is it really necessary to
introduce IS_MNT_NODEV??  Why not simply test both the flags
(MS_MKNOD_NOCAP and MNT_NODEV) before allowing the mknod?  That would
localise the change to where is it really relevant.

Do we actually need a new flag?  Would not a combination of MS_NODEV
and MS_SETUSER achieve the same thing (near enough)?

Do you imagine this flag being set as a mount option (-o unprivmknod)
or does the filesystem set it itself?
If the latter, maybe this test should be moved down into the
filesystems ->mknod operation.  Most filesystems get
 
if ((S_ISCHR(mode) || S_ISBLK(mode)) && !capable(CAP_MKNOD))
return -EPERM;

at the top of ->mknod.  fuse can do whatever it likes without
bothering common code.

According to fs.h, we only support 32 fs-independent mount-flags, and
over half are in use.  I'm not convinced we should spend one on such a
narrow requirement.

NeilBrown


> 
> Signed-off-by: Miklos Szeredi <[EMAIL PROTECTED]>
> ---
> 
> Index: linux/fs/namei.c
> ===
> --- linux.orig/fs/namei.c 2007-09-24 13:52:17.0 +0200
> +++ linux/fs/namei.c  2007-09-24 13:54:57.0 +0200
> @@ -1617,7 +1617,7 @@ int may_open(struct nameidata *nd, int a
>   if (S_ISFIFO(inode->i_mode) || S_ISSOCK(inode->i_mode)) {
>   flag &= ~O_TRUNC;
>   } else if (S_ISBLK(inode->i_mode) || S_ISCHR(inode->i_mode)) {
> - if (nd->mnt->mnt_flags & MNT_NODEV)
> + if (IS_MNT_NODEV(nd->mnt))
>   return -EACCES;
>  
>   flag &= ~O_TRUNC;
> @@ -1920,7 +1920,8 @@ int vfs_mknod(struct inode *dir, struct 
>   if (error)
>   return error;
>  
> - if ((S_ISCHR(mode) || S_ISBLK(mode)) && !capable(CAP_MKNOD))
> + if (!(dir->i_sb->s_flags & MS_MKNOD_NOCAP) &&
> + (S_ISCHR(mode) || S_ISBLK(mode)) && !capable(CAP_MKNOD))
>   return -EPERM;
>  
>   if (!dir->i_op || !dir->i_op->mknod)
> Index: linux/include/linux/fs.h
> ===
> --- linux.orig/include/linux/fs.h 2007-09-24 13:52:17.0 +0200
> +++ linux/include/linux/fs.h  2007-09-24 13:54:57.0 +0200
> @@ -130,6 +130,8 @@ extern int dir_notify_enable;
>  #define MS_SETUSER   (1<<23) /* set mnt_uid to current user */
>  #define MS_NOMNT (1<<24) /* don't allow unprivileged submounts */
>  #define MS_KERNMOUNT (1<<25) /* this is a kern_mount call */
> +#define MS_MKNOD_NOCAP   (1<<26) /* no capability check in mknod,
> +implies "nodev" */
>  #define MS_ACTIVE(1<<30)
>  #define MS_NOUSER(1<<31)
>  
> @@ -190,6 +192,10 @@ extern int dir_notify_enable;
>  #define IS_SWAPFILE(inode)   ((inode)->i_flags & S_SWAPFILE)
>  #define IS_PRIVATE(inode)((inode)->i_flags & S_PRIVATE)
>  
> +#define IS_MNT_NODEV(mnt)(((mnt)->mnt_flags & MNT_NODEV) || \
> + ((mnt)->mnt_sb->s_flags & MS_MKNOD_NOCAP))
> +
> +
>  /* the read-only stuff doesn't really belong here, but any other place is
> probably as bad and I don't want to create yet another include file. */
>  
> Index: linux/drivers/mtd/mtdsuper.c
> ===
> --- linux.orig/drivers/mtd/mtdsuper.c 2007-09-24 13:52:17.0 +0200
> +++ linux/drivers/mtd/mtdsuper.c  2007-09-24 13:54:57.0 +0200
> @@ -194,7 +194,7 @@ int get_sb_mtd(struct file_system_type *
>   if (!S_ISBLK(nd.dentry->d_inode->i_mode))
>   goto out;
>  
> - if (nd.mnt->mnt_flags & MNT_NODEV) {
> + if (IS_MNT_NODEV(nd.mnt)) {
>   ret = -EACCES;
>   goto out;
>   }
> Index: linux/fs/block_dev.c
> 

Re: [patch 2/2] VFS: allow filesystem to override mknod capability checks

2007-09-27 Thread Neil Brown
On Monday September 24, [EMAIL PROTECTED] wrote:
 From: Miklos Szeredi [EMAIL PROTECTED]
 
 Add a new super block flag, that results in the VFS not checking if
 the current process has enough privileges to do an mknod().
 
 If this flag is set, all mounts for this super block will have the
 nodev flag implied.
 
 This is needed on filesystems, where an unprivileged user may be able
 to create a device node, without causing security problems.
 
 One such example is mountlo a loopback mount utility implemented
 with fuse and UML, which runs as an unprivileged userspace process.
 In this case the user does in fact have the right to create device
 nodes within the filesystem image, as long as the user has write
 access to the image.  Since the filesystem is mounted with nodev,
 adding device nodes is not a security concern.

I must admit that I don't feel very comfortable about this.  I wonder
how many more flags we might be tempted to add to allow
user-controlled filesystems to do interesting things.  Somehow I doubt
this will be the last, so we should be very careful allowing it to be
the first (or is it the second already?)

A more concrete comment on the patch:  Is it really necessary to
introduce IS_MNT_NODEV??  Why not simply test both the flags
(MS_MKNOD_NOCAP and MNT_NODEV) before allowing the mknod?  That would
localise the change to where is it really relevant.

Do we actually need a new flag?  Would not a combination of MS_NODEV
and MS_SETUSER achieve the same thing (near enough)?

Do you imagine this flag being set as a mount option (-o unprivmknod)
or does the filesystem set it itself?
If the latter, maybe this test should be moved down into the
filesystems -mknod operation.  Most filesystems get
 
if ((S_ISCHR(mode) || S_ISBLK(mode))  !capable(CAP_MKNOD))
return -EPERM;

at the top of -mknod.  fuse can do whatever it likes without
bothering common code.

According to fs.h, we only support 32 fs-independent mount-flags, and
over half are in use.  I'm not convinced we should spend one on such a
narrow requirement.

NeilBrown


 
 Signed-off-by: Miklos Szeredi [EMAIL PROTECTED]
 ---
 
 Index: linux/fs/namei.c
 ===
 --- linux.orig/fs/namei.c 2007-09-24 13:52:17.0 +0200
 +++ linux/fs/namei.c  2007-09-24 13:54:57.0 +0200
 @@ -1617,7 +1617,7 @@ int may_open(struct nameidata *nd, int a
   if (S_ISFIFO(inode-i_mode) || S_ISSOCK(inode-i_mode)) {
   flag = ~O_TRUNC;
   } else if (S_ISBLK(inode-i_mode) || S_ISCHR(inode-i_mode)) {
 - if (nd-mnt-mnt_flags  MNT_NODEV)
 + if (IS_MNT_NODEV(nd-mnt))
   return -EACCES;
  
   flag = ~O_TRUNC;
 @@ -1920,7 +1920,8 @@ int vfs_mknod(struct inode *dir, struct 
   if (error)
   return error;
  
 - if ((S_ISCHR(mode) || S_ISBLK(mode))  !capable(CAP_MKNOD))
 + if (!(dir-i_sb-s_flags  MS_MKNOD_NOCAP) 
 + (S_ISCHR(mode) || S_ISBLK(mode))  !capable(CAP_MKNOD))
   return -EPERM;
  
   if (!dir-i_op || !dir-i_op-mknod)
 Index: linux/include/linux/fs.h
 ===
 --- linux.orig/include/linux/fs.h 2007-09-24 13:52:17.0 +0200
 +++ linux/include/linux/fs.h  2007-09-24 13:54:57.0 +0200
 @@ -130,6 +130,8 @@ extern int dir_notify_enable;
  #define MS_SETUSER   (123) /* set mnt_uid to current user */
  #define MS_NOMNT (124) /* don't allow unprivileged submounts */
  #define MS_KERNMOUNT (125) /* this is a kern_mount call */
 +#define MS_MKNOD_NOCAP   (126) /* no capability check in mknod,
 +implies nodev */
  #define MS_ACTIVE(130)
  #define MS_NOUSER(131)
  
 @@ -190,6 +192,10 @@ extern int dir_notify_enable;
  #define IS_SWAPFILE(inode)   ((inode)-i_flags  S_SWAPFILE)
  #define IS_PRIVATE(inode)((inode)-i_flags  S_PRIVATE)
  
 +#define IS_MNT_NODEV(mnt)(((mnt)-mnt_flags  MNT_NODEV) || \
 + ((mnt)-mnt_sb-s_flags  MS_MKNOD_NOCAP))
 +
 +
  /* the read-only stuff doesn't really belong here, but any other place is
 probably as bad and I don't want to create yet another include file. */
  
 Index: linux/drivers/mtd/mtdsuper.c
 ===
 --- linux.orig/drivers/mtd/mtdsuper.c 2007-09-24 13:52:17.0 +0200
 +++ linux/drivers/mtd/mtdsuper.c  2007-09-24 13:54:57.0 +0200
 @@ -194,7 +194,7 @@ int get_sb_mtd(struct file_system_type *
   if (!S_ISBLK(nd.dentry-d_inode-i_mode))
   goto out;
  
 - if (nd.mnt-mnt_flags  MNT_NODEV) {
 + if (IS_MNT_NODEV(nd.mnt)) {
   ret = -EACCES;
   goto out;
   }
 Index: linux/fs/block_dev.c
 ===
 --- linux.orig/fs/block_dev.c 2007-09-24 13:52:17.0 +0200
 +++ 

Re: [patch 2/2] VFS: allow filesystem to override mknod capability checks

2007-09-24 Thread Christoph Hellwig
On Mon, Sep 24, 2007 at 02:25:54PM +0200, Miklos Szeredi wrote:
> From: Miklos Szeredi <[EMAIL PROTECTED]>
> 
> Add a new super block flag, that results in the VFS not checking if
> the current process has enough privileges to do an mknod().
> 
> If this flag is set, all mounts for this super block will have the
> "nodev" flag implied.
> 
> This is needed on filesystems, where an unprivileged user may be able
> to create a device node, without causing security problems.
> 
> One such example is "mountlo" a loopback mount utility implemented
> with fuse and UML, which runs as an unprivileged userspace process.
> In this case the user does in fact have the right to create device
> nodes within the filesystem image, as long as the user has write
> access to the image.  Since the filesystem is mounted with "nodev",
> adding device nodes is not a security concern.

This one looks okay, but I'd prefer to not put it in until we actually
have proper non-privilegued mounts.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[patch 2/2] VFS: allow filesystem to override mknod capability checks

2007-09-24 Thread Miklos Szeredi
From: Miklos Szeredi <[EMAIL PROTECTED]>

Add a new super block flag, that results in the VFS not checking if
the current process has enough privileges to do an mknod().

If this flag is set, all mounts for this super block will have the
"nodev" flag implied.

This is needed on filesystems, where an unprivileged user may be able
to create a device node, without causing security problems.

One such example is "mountlo" a loopback mount utility implemented
with fuse and UML, which runs as an unprivileged userspace process.
In this case the user does in fact have the right to create device
nodes within the filesystem image, as long as the user has write
access to the image.  Since the filesystem is mounted with "nodev",
adding device nodes is not a security concern.

Signed-off-by: Miklos Szeredi <[EMAIL PROTECTED]>
---

Index: linux/fs/namei.c
===
--- linux.orig/fs/namei.c   2007-09-24 13:52:17.0 +0200
+++ linux/fs/namei.c2007-09-24 13:54:57.0 +0200
@@ -1617,7 +1617,7 @@ int may_open(struct nameidata *nd, int a
if (S_ISFIFO(inode->i_mode) || S_ISSOCK(inode->i_mode)) {
flag &= ~O_TRUNC;
} else if (S_ISBLK(inode->i_mode) || S_ISCHR(inode->i_mode)) {
-   if (nd->mnt->mnt_flags & MNT_NODEV)
+   if (IS_MNT_NODEV(nd->mnt))
return -EACCES;
 
flag &= ~O_TRUNC;
@@ -1920,7 +1920,8 @@ int vfs_mknod(struct inode *dir, struct 
if (error)
return error;
 
-   if ((S_ISCHR(mode) || S_ISBLK(mode)) && !capable(CAP_MKNOD))
+   if (!(dir->i_sb->s_flags & MS_MKNOD_NOCAP) &&
+   (S_ISCHR(mode) || S_ISBLK(mode)) && !capable(CAP_MKNOD))
return -EPERM;
 
if (!dir->i_op || !dir->i_op->mknod)
Index: linux/include/linux/fs.h
===
--- linux.orig/include/linux/fs.h   2007-09-24 13:52:17.0 +0200
+++ linux/include/linux/fs.h2007-09-24 13:54:57.0 +0200
@@ -130,6 +130,8 @@ extern int dir_notify_enable;
 #define MS_SETUSER (1<<23) /* set mnt_uid to current user */
 #define MS_NOMNT   (1<<24) /* don't allow unprivileged submounts */
 #define MS_KERNMOUNT   (1<<25) /* this is a kern_mount call */
+#define MS_MKNOD_NOCAP (1<<26) /* no capability check in mknod,
+  implies "nodev" */
 #define MS_ACTIVE  (1<<30)
 #define MS_NOUSER  (1<<31)
 
@@ -190,6 +192,10 @@ extern int dir_notify_enable;
 #define IS_SWAPFILE(inode) ((inode)->i_flags & S_SWAPFILE)
 #define IS_PRIVATE(inode)  ((inode)->i_flags & S_PRIVATE)
 
+#define IS_MNT_NODEV(mnt)  (((mnt)->mnt_flags & MNT_NODEV) || \
+   ((mnt)->mnt_sb->s_flags & MS_MKNOD_NOCAP))
+
+
 /* the read-only stuff doesn't really belong here, but any other place is
probably as bad and I don't want to create yet another include file. */
 
Index: linux/drivers/mtd/mtdsuper.c
===
--- linux.orig/drivers/mtd/mtdsuper.c   2007-09-24 13:52:17.0 +0200
+++ linux/drivers/mtd/mtdsuper.c2007-09-24 13:54:57.0 +0200
@@ -194,7 +194,7 @@ int get_sb_mtd(struct file_system_type *
if (!S_ISBLK(nd.dentry->d_inode->i_mode))
goto out;
 
-   if (nd.mnt->mnt_flags & MNT_NODEV) {
+   if (IS_MNT_NODEV(nd.mnt)) {
ret = -EACCES;
goto out;
}
Index: linux/fs/block_dev.c
===
--- linux.orig/fs/block_dev.c   2007-09-24 13:52:17.0 +0200
+++ linux/fs/block_dev.c2007-09-24 13:54:57.0 +0200
@@ -1408,7 +1408,7 @@ struct block_device *lookup_bdev(const c
if (!S_ISBLK(inode->i_mode))
goto fail;
error = -EACCES;
-   if (nd.mnt->mnt_flags & MNT_NODEV)
+   if (IS_MNT_NODEV(nd.mnt))
goto fail;
error = -ENOMEM;
bdev = bd_acquire(inode);
Index: linux/fs/namespace.c
===
--- linux.orig/fs/namespace.c   2007-09-24 13:52:17.0 +0200
+++ linux/fs/namespace.c2007-09-24 13:54:57.0 +0200
@@ -431,7 +431,6 @@ static int show_vfsmnt(struct seq_file *
};
static struct proc_fs_info mnt_info[] = {
{ MNT_NOSUID, ",nosuid" },
-   { MNT_NODEV, ",nodev" },
{ MNT_NOEXEC, ",noexec" },
{ MNT_NOATIME, ",noatime" },
{ MNT_NODIRATIME, ",nodiratime" },
@@ -459,6 +458,8 @@ static int show_vfsmnt(struct seq_file *
if (mnt->mnt_flags & fs_infop->flag)
seq_puts(m, fs_infop->str);
}
+   if (IS_MNT_NODEV(mnt))
+   seq_puts(m, ",nodev");
if (mnt->mnt_flags & MNT_USER)
seq_printf(m, 

[patch 2/2] VFS: allow filesystem to override mknod capability checks

2007-09-24 Thread Miklos Szeredi
From: Miklos Szeredi [EMAIL PROTECTED]

Add a new super block flag, that results in the VFS not checking if
the current process has enough privileges to do an mknod().

If this flag is set, all mounts for this super block will have the
nodev flag implied.

This is needed on filesystems, where an unprivileged user may be able
to create a device node, without causing security problems.

One such example is mountlo a loopback mount utility implemented
with fuse and UML, which runs as an unprivileged userspace process.
In this case the user does in fact have the right to create device
nodes within the filesystem image, as long as the user has write
access to the image.  Since the filesystem is mounted with nodev,
adding device nodes is not a security concern.

Signed-off-by: Miklos Szeredi [EMAIL PROTECTED]
---

Index: linux/fs/namei.c
===
--- linux.orig/fs/namei.c   2007-09-24 13:52:17.0 +0200
+++ linux/fs/namei.c2007-09-24 13:54:57.0 +0200
@@ -1617,7 +1617,7 @@ int may_open(struct nameidata *nd, int a
if (S_ISFIFO(inode-i_mode) || S_ISSOCK(inode-i_mode)) {
flag = ~O_TRUNC;
} else if (S_ISBLK(inode-i_mode) || S_ISCHR(inode-i_mode)) {
-   if (nd-mnt-mnt_flags  MNT_NODEV)
+   if (IS_MNT_NODEV(nd-mnt))
return -EACCES;
 
flag = ~O_TRUNC;
@@ -1920,7 +1920,8 @@ int vfs_mknod(struct inode *dir, struct 
if (error)
return error;
 
-   if ((S_ISCHR(mode) || S_ISBLK(mode))  !capable(CAP_MKNOD))
+   if (!(dir-i_sb-s_flags  MS_MKNOD_NOCAP) 
+   (S_ISCHR(mode) || S_ISBLK(mode))  !capable(CAP_MKNOD))
return -EPERM;
 
if (!dir-i_op || !dir-i_op-mknod)
Index: linux/include/linux/fs.h
===
--- linux.orig/include/linux/fs.h   2007-09-24 13:52:17.0 +0200
+++ linux/include/linux/fs.h2007-09-24 13:54:57.0 +0200
@@ -130,6 +130,8 @@ extern int dir_notify_enable;
 #define MS_SETUSER (123) /* set mnt_uid to current user */
 #define MS_NOMNT   (124) /* don't allow unprivileged submounts */
 #define MS_KERNMOUNT   (125) /* this is a kern_mount call */
+#define MS_MKNOD_NOCAP (126) /* no capability check in mknod,
+  implies nodev */
 #define MS_ACTIVE  (130)
 #define MS_NOUSER  (131)
 
@@ -190,6 +192,10 @@ extern int dir_notify_enable;
 #define IS_SWAPFILE(inode) ((inode)-i_flags  S_SWAPFILE)
 #define IS_PRIVATE(inode)  ((inode)-i_flags  S_PRIVATE)
 
+#define IS_MNT_NODEV(mnt)  (((mnt)-mnt_flags  MNT_NODEV) || \
+   ((mnt)-mnt_sb-s_flags  MS_MKNOD_NOCAP))
+
+
 /* the read-only stuff doesn't really belong here, but any other place is
probably as bad and I don't want to create yet another include file. */
 
Index: linux/drivers/mtd/mtdsuper.c
===
--- linux.orig/drivers/mtd/mtdsuper.c   2007-09-24 13:52:17.0 +0200
+++ linux/drivers/mtd/mtdsuper.c2007-09-24 13:54:57.0 +0200
@@ -194,7 +194,7 @@ int get_sb_mtd(struct file_system_type *
if (!S_ISBLK(nd.dentry-d_inode-i_mode))
goto out;
 
-   if (nd.mnt-mnt_flags  MNT_NODEV) {
+   if (IS_MNT_NODEV(nd.mnt)) {
ret = -EACCES;
goto out;
}
Index: linux/fs/block_dev.c
===
--- linux.orig/fs/block_dev.c   2007-09-24 13:52:17.0 +0200
+++ linux/fs/block_dev.c2007-09-24 13:54:57.0 +0200
@@ -1408,7 +1408,7 @@ struct block_device *lookup_bdev(const c
if (!S_ISBLK(inode-i_mode))
goto fail;
error = -EACCES;
-   if (nd.mnt-mnt_flags  MNT_NODEV)
+   if (IS_MNT_NODEV(nd.mnt))
goto fail;
error = -ENOMEM;
bdev = bd_acquire(inode);
Index: linux/fs/namespace.c
===
--- linux.orig/fs/namespace.c   2007-09-24 13:52:17.0 +0200
+++ linux/fs/namespace.c2007-09-24 13:54:57.0 +0200
@@ -431,7 +431,6 @@ static int show_vfsmnt(struct seq_file *
};
static struct proc_fs_info mnt_info[] = {
{ MNT_NOSUID, ,nosuid },
-   { MNT_NODEV, ,nodev },
{ MNT_NOEXEC, ,noexec },
{ MNT_NOATIME, ,noatime },
{ MNT_NODIRATIME, ,nodiratime },
@@ -459,6 +458,8 @@ static int show_vfsmnt(struct seq_file *
if (mnt-mnt_flags  fs_infop-flag)
seq_puts(m, fs_infop-str);
}
+   if (IS_MNT_NODEV(mnt))
+   seq_puts(m, ,nodev);
if (mnt-mnt_flags  MNT_USER)
seq_printf(m, ,user=%i, mnt-mnt_uid);
if (mnt-mnt_sb-s_op-show_options)
-
To unsubscribe from 

Re: [patch 2/2] VFS: allow filesystem to override mknod capability checks

2007-09-24 Thread Christoph Hellwig
On Mon, Sep 24, 2007 at 02:25:54PM +0200, Miklos Szeredi wrote:
 From: Miklos Szeredi [EMAIL PROTECTED]
 
 Add a new super block flag, that results in the VFS not checking if
 the current process has enough privileges to do an mknod().
 
 If this flag is set, all mounts for this super block will have the
 nodev flag implied.
 
 This is needed on filesystems, where an unprivileged user may be able
 to create a device node, without causing security problems.
 
 One such example is mountlo a loopback mount utility implemented
 with fuse and UML, which runs as an unprivileged userspace process.
 In this case the user does in fact have the right to create device
 nodes within the filesystem image, as long as the user has write
 access to the image.  Since the filesystem is mounted with nodev,
 adding device nodes is not a security concern.

This one looks okay, but I'd prefer to not put it in until we actually
have proper non-privilegued mounts.

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/