Mount bind filehandle (Was: Re: [RFC][2.6 patch] Allow creation of new namespaces during mount system call)

2005-04-21 Thread Jan Hudec
On Wed, Apr 20, 2005 at 18:09:21 +0100, Al Viro wrote:
 On Wed, Apr 20, 2005 at 09:51:26AM -0700, Ram wrote:
  Reading through the thread I assume the requirement is:
  
  1) A User being able to create his own VFS-mount environment 
  2) being able to use the same VFS-mount environment from 
  multiple login sessions.
  3) Being able to switch some processes to some other
VFS-mount environment.
 
 Excuse me, but could somebody give coherent rationale for such requirements?
 _Especially_ for joining existing group by completely unrelated process -
 something we don't do for any other component of process.

I think I can. And I think I can modify the proposal to something a bit
more sane.

The problem is: The mount should be accessible only by processes started
  by the authorized user, but not by other users, including root, who are
  capable of changing their uid to the authorized user's id.

The solution can be: The mount is only accessible to the process group
  of that user's session. That's easy -- the login process is created
  with new namespace.

  Now how to allow the user to have that mountpoint accessible from all
  sessions? Well, we can already pass open file descriptors for files
  inside that mount to unrelated processes over unix domain sockets. So
  what is left is being able to mount --bind a directory by its
  filehandle (since it does not have a path in our namespace).

  This allows creating a mount-agent, that would work similarly to
  ssh-agent or gpg-agent, but instead of passing secret keys, it would
  pass descriptors to that user's mount roots and its client would bind
  them. This agent can do whatever authentication is deemed appropriate
  for that mountpoint.
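
The descriptor-passing half of this idea already works with existing
interfaces. Below is a minimal sketch, assuming a Unix system and Python 3.9
or later (for socket.send_fds and socket.recv_fds). The function name
pass_directory_fd and the socket pair standing in for the agent and its
client are illustrative assumptions, not part of the proposal. The
bind-by-descriptor step itself is what the message proposes and is not shown.

```python
import os
import socket
import tempfile


def pass_directory_fd(path):
    """Send a directory descriptor over a Unix socket pair; return the received copy."""
    # The two ends stand in for a hypothetical mount-agent and its client.
    agent, client = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    dir_fd = os.open(path, os.O_RDONLY | os.O_DIRECTORY)
    try:
        # The descriptor travels as SCM_RIGHTS ancillary data next to a small payload.
        socket.send_fds(agent, [b"root"], [dir_fd])
        _data, fds, _flags, _addr = socket.recv_fds(client, 16, 1)
        if not fds:
            raise RuntimeError("no descriptor received")
        return fds[0]
    finally:
        # The receiver holds its own duplicate, so the sender's copy can go.
        os.close(dir_fd)
        agent.close()
        client.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        open(os.path.join(d, "secret.txt"), "w").close()
        fd = pass_directory_fd(d)
        print(os.listdir(fd))  # the receiver never needed the path
        os.close(fd)
```

The receiving side ends up with a working handle on the directory without
ever resolving a path to it, which is the property the agent idea relies on.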

  Note however, that it's really hard to protect something against root,
  because root can ptrace any process.

Well, being able to mount bind file descriptor might be useful for other
purposes as well and should not be hard to do. Unless you or somebody
finds a security problem with it, but I don't see any. Process having
a directory handle can chdir there anyway, so this does not add any extra
access.
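
The claim that a directory handle already grants access can be checked with
fchdir. This is a small sketch, assuming a Unix system; the helper name
list_after_fchdir is made up for illustration.

```python
import os
import tempfile


def list_after_fchdir(dir_fd):
    """Enter the directory behind dir_fd, list it, then restore the old cwd."""
    saved = os.open(".", os.O_RDONLY | os.O_DIRECTORY)
    try:
        os.fchdir(dir_fd)  # no pathname is involved in getting there
        return sorted(os.listdir("."))
    finally:
        os.fchdir(saved)
        os.close(saved)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        open(os.path.join(d, "a"), "w").close()
        fd = os.open(d, os.O_RDONLY | os.O_DIRECTORY)
        print(list_after_fchdir(fd))
        os.close(fd)
```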

---
 Jan 'Bulb' Hudec [EMAIL PROTECTED]


signature.asc
Description: Digital signature


Re: Mount bind filehandle (Was: Re: [RFC][2.6 patch] Allow creation of new namespaces during mount system call)

2005-04-21 Thread Christoph Hellwig
On Thu, Apr 21, 2005 at 09:33:20AM +0200, Jan Hudec wrote:
 I think I can. And I think I can modify the proposal to something a bit
 more sane.
 
 The problem is: The mount should be accessible only by processes started
   by the authorized user, but not by other user, including root, who is
   capable of changing their uid to the authorized user's id.
 
 The solution can be: The mount is only accessible to the process group
   of that user's session. That's easy -- the login process is created
   with new namespace.

That doesn't make sense.  A process with sufficient capabilities (aka root)
can do things including reading or modifying kernel memory and can
access your namespace always, no matter how difficult you're trying to make
it.

-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC][2.6 patch] Allow creation of new namespaces during mount system call

2005-04-21 Thread Christoph Hellwig
On Wed, Apr 20, 2005 at 05:23:10PM -0700, Bryan Henderson wrote:
 That assumes that everyone has the same stuff in the same places.  I.e.
 that there is a universal tree with different subset hidden from different
 processes.  But that is obviously a wrong approach - e.g. it loses ability
 to bind different stuff on the same place in different namespaces.
 
 Aren't you trying to boil another egg in my pot?  In Linux today, everyone 
 (every process on the same Linux system, that is) has the same stuff in 
 the same place.

No.  I'm running various different namespaces here, as part of a command to
set up chroots sanely.

 Not sure which reality you're talking about.  I don't think a directory 
 has a real absolute pathname, because I think the person who mounts the 
 filesystem that contains it chooses part of its absolute pathname for the 
 lifetime of the mount.  But as between multiple processes on the same 
 system at the same time, yeah, the directory has one name.

No.



Re: [RFC][2.6 patch] Allow creation of new namespaces during mount system call

2005-04-21 Thread Miklos Szeredi
 OK, I overlooked the problem of having to add commands to the shell and 
 everything else.  While there's plenty of precedent for this style 
 (current directory, ulimits, umask), I wouldn't like to extend it, even to 
 adding a command to Bash.  But it could follow the 'nice' and 'renice' 
 model.

OK.  The user could then script the mount to change visibility for all
current processes with his uid, and new programs would inherit the
visibility.

It would still not work for ftp-server style programs, which are
started from an independent daemon, yet after login run with the
user's credentials, and hence I would expect to run with the user's
namespace.

 Would it not be better if you could specify the visibility policy when
 mounting?  Something simple like the user-group-other permission
 model would do nicely.  That would also have the advantage of being
 bound to the mountpoint, not the process.
 
 I just don't think that gives you enough policy flexibility.  If processes 
 can control visibility on a per-process basis independent from the mount 
 action, they can use a much greater variety of policy, and do it in user 
 space.

If used in conjunction with CLONE_NEWNS it would have all the needed
flexibility.

 As for user-group-other, let me first point out that this whole namespace 
 discussion started when a design based on actual file permission bits, but 
 not a true implementation of Unix security (root didn't get carte blanche) 
 was found unpalatable by some.  So as you say, it would be something 
 _like_ the permission model, not a part of it.

Yes, that is what I meant.

 We've been straining against the limitations of user/group/other for 
 decades.  Sophisticated systems don't even use them for file permissions. 

But the non-sophisticated case is by far the most abundant.  And for
that the traditional UNIX permission model is not only good enough,
it is in fact _better_ than any sophisticated access control mechanism
because of its _simplicity_.
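
To make the user-group-other idea concrete, here is a hypothetical sketch of
a visibility check attached to a mount. Nothing like MountVisibility or
can_see exists in the kernel; the names, the reuse of the classic read bits,
and the absence of any special case for uid 0 are all assumptions made for
illustration.

```python
import stat
from dataclasses import dataclass


@dataclass(frozen=True)
class MountVisibility:
    """Hypothetical per-mount policy: an owner, a group and a classic mode."""
    owner_uid: int
    group_gid: int
    mode: int  # e.g. 0o700; only the three "read" bits are consulted


def can_see(policy, uid, gids):
    """Return True if a process with this uid and these gids may see the mount."""
    if uid == policy.owner_uid:
        bit = stat.S_IRUSR
    elif policy.group_gid in gids:
        bit = stat.S_IRGRP
    else:
        bit = stat.S_IROTH
    # Assumption: uid 0 gets no override, unlike ordinary file permissions.
    return bool(policy.mode & bit)


if __name__ == "__main__":
    private = MountVisibility(owner_uid=1000, group_gid=100, mode=0o700)
    print(can_see(private, 1000, {100}), can_see(private, 0, {0}))
```

The whole policy fits in one comparison chain, which is the simplicity being
argued for; anything richer would have to live elsewhere.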

Miklos


Re: Mount bind filehandle (Was: Re: [RFC][2.6 patch] Allow creation of new namespaces during mount system call)

2005-04-21 Thread Jan Hudec
On Thu, Apr 21, 2005 at 09:09:01 +0100, Christoph Hellwig wrote:
 On Thu, Apr 21, 2005 at 09:33:20AM +0200, Jan Hudec wrote:
  I think I can. And I think I can modify the proposal to something a bit
  more sane.
  
  The problem is: The mount should be accessible only by processes started
by the authorized user, but not by other user, including root, who is
capable of changing their uid to the authorized user's id.
  
  The solution can be: The mount is only accessible to the process group
of that user's session. That's easy -- the login process is created
with new namespace.
 
 That doesn't make sense.  A process with sufficient capabilities (aka root)
 can do things including reading or modifying kernel memory and can
 access your namespace always, no matter how difficult you're trying to make
 it.

Yes, I know. Actually, in the mail you cite, there was also written:
Note however, that it's really hard to protect something against root,
because root can ptrace any process.

So a determined attacker with root access will break in (actually, a
determined attacker with root access can read your ssh keys from your
running ssh session too -- the fact you fuse-mount it does not increase
his chances).

However, there are other reasons mentioned in this thread, why private
namespaces are useful. They can't be corrupted by misconfigured stuff,
don't confuse other (broken) stuff and such. And after all while the
proposal is inspired by this issue, it just means a generic extension to
bind mounts that could be useful for other applications. Sometimes
a program, for reliability or security reasons, needs to work with
directory handles -- and this is a way to reliably assign them a path
instead of finding out the current one, which can change under their
hands.
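
The point about paths changing under a program's hands can be illustrated
with descriptor-relative access. This is a sketch only, assuming a Unix
system where os.open accepts dir_fd; the helper name read_via_descriptor is
invented for the example.

```python
import os
import tempfile


def read_via_descriptor(dir_fd, name):
    """Read a file relative to an open directory, ignoring the directory's path."""
    fd = os.open(name, os.O_RDONLY, dir_fd=dir_fd)
    try:
        return os.read(fd, 4096)
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as top:
        old = os.path.join(top, "a")
        os.mkdir(old)
        with open(os.path.join(old, "note.txt"), "wb") as f:
            f.write(b"hello")
        handle = os.open(old, os.O_RDONLY | os.O_DIRECTORY)
        os.rename(old, os.path.join(top, "b"))  # the old path is now stale
        print(read_via_descriptor(handle, "note.txt"))
        os.close(handle)
```

The handle keeps working after the rename, while any path recorded earlier
would not; giving such a handle a stable path is what bind-by-descriptor
would add.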

---
 Jan 'Bulb' Hudec [EMAIL PROTECTED]




Re: [RFC][2.6 patch] Allow creation of new namespaces during mount system call

2005-04-21 Thread Bryan Henderson
It would still not work for ftp-server style programs,

True.  Users might want the mounts to show up to an ftp or not, and this 
handles only not.

If used in conjunction with CLONE_NEWNS it would have all the needed
flexibility.

I don't see how.  What if my policy is that processes with a certain 
process name (command) see the mount?  What if my policy is that users in 
a certain filesystem ACL can see it?  That's the kind of flexibility you 
can't get if the policy is set up via the mount() system call.

But the non-sophisticated case is by far the most abundant.  And for
that the traditional UNIX permission model is not only good enough,
it is in fact _better_ than any sophisticated access control mechanism
because of its _simplicity_.

Absolutely.  And that's why I speak of flexibility.  Let the simple users 
have their simple U-G-O and the more creative ones do something more 
complex.

I'm not opposed, by the way, to an implementation that just does U-G-O (or 
even just U) if it's done in a way amenable to future extension.

--
Bryan Henderson  IBM Almaden Research Center
San Jose CA  Filesystems



Re: [RFC][2.6 patch] Allow creation of new namespaces during mount system call

2005-04-21 Thread Jamie Lokier
Jan Hudec wrote:
 By the way, IIRC so far the root can access all kernel memory too via
 /dev/kmem. So the limiting of root's rights would have to be limited
 a bit more yet.

On some hardened systems, root is not allowed access to /dev/kmem.

-- Jamie


Hiding secrets from root (Was: Re: [RFC][2.6 patch] Allow creation of new namespaces during mount system call)

2005-04-21 Thread Jan Hudec
On Thu, Apr 21, 2005 at 19:44:56 +0100, Jamie Lokier wrote:
 Jan Hudec wrote:
  By the way, IIRC so far the root can access all kernel memory too via
  /dev/kmem. So the limiting of root's rights would have to be limited
  a bit more yet.
 
 On some hardened systems, root is not allowed access to /dev/kmem.

That sure makes sense. Still the secret keys must either never leave
kernel (which would need all the encryption, decryption and checking
code in kernel), or they must be protected in userland too. Which means
the process has to be protected against being ptraced or inspected
through /dev/mem.

---
 Jan 'Bulb' Hudec [EMAIL PROTECTED]




Re: [RFC][2.6 patch] Allow creation of new namespaces during mount system call

2005-04-21 Thread Ram
On Wed, 2005-04-20 at 17:42, Bryan Henderson wrote:
 Well I am not aware of issues that can arise if a user is allowed to
 change to some namespace for which it has permission to switch.
 
 I think I misunderstood your proposal.
 
 A user 'ram' creates a namespace 'n1' with a device node /dev/n1 having
 permission 700 owned by the user 'ram'. The user then tailors his
 namespace with a bunch of mount/umount/binds etc to meet his
 requirement.
 
 How does that address the setuid problem -- that a setuid program is 
 installed with the expectation that when it runs, certain names will 
 identify certain files (e.g. /etc/shadow)?  But also that certain other 
 names will identify a file of the invoker's choosing?

The new namespace 'n1', though created by the user 'ram', carries the
same restrictions that apply to 'ram'. So 'ram' will not be able to mount
something else on /bin or /sbin or any other directory that he does not
own, even though it's done in his own namespace. Hence I don't see how an
attacker would be able to substitute a malicious program for a genuine
setuid program. I hope this is the concern you were talking about, right?

RP


 
 Trying to understand your proposal to see how it could be used to solve
 the problem faced by the FUSE project.  Are you trying to use a single
 namespace with invisible mounts capability? 
 
 Essentially.  It's a compromise.  A user can customize his namespace, but 
 only within limits that preserve the integrity of the system.
 
 Technically, we have to admit it's not one namespace today or with 
 invisible mounts.  Because of the way mounts cover up mountpoints, it's 
 technically possible for two processes to see different files as the same 
 name, if one opened the directory before a mount and the other after. 
 Mounting over is a curse.
 
 --
 Bryan Henderson  IBM Almaden Research Center
 San Jose CA  Filesystems



Re: Hiding secrets from root (Was: Re: [RFC][2.6 patch] Allow creation of new namespaces during mount system call)

2005-04-21 Thread Jamie Lokier
Jan Hudec wrote:
 On Thu, Apr 21, 2005 at 19:44:56 +0100, Jamie Lokier wrote:
  Jan Hudec wrote:
   By the way, IIRC so far the root can access all kernel memory too via
   /dev/kmem. So the limiting of root's rights would have to be limited
   a bit more yet.
  
  On some hardened systems, root is not allowed access to /dev/kmem.
 
 That sure makes sense. Still the secret keys must either never leave
 kernel (which would need all the encryption, decryption and checking
 code in kernel), or they must be protected in userland too. Which means
 the process has to be protected against being ptraced or inspected
 through /dev/mem.

That's right.  Protecting users' private data from access by the
administrators on a multi-user system is, not surprisingly, hard.

-- Jamie


Re: [RFC][2.6 patch] Allow creation of new namespaces during mount system call

2005-04-20 Thread Jamie Lokier
Al Viro wrote:
 Most of the code is already there - do_fork() has to do such stuff anyway.
 So how about adding sys_unshare(flags) that would do that job?  Flags would
 correspond to those of clone(2), except that all these guys would be
 what do we unshare instead of what do we leave shared.

That would let a program split off into its own namespace, but that's
not really what's needed for FUSE.
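
For reference, here is a sketch of what a call of the proposed shape looks
like from user space. It assumes a Linux system whose C library exports
unshare() (a call of this form exists in later kernels as unshare(2)) and
takes the CLONE_NEWNS value from <linux/sched.h>. The helper name is
illustrative. Without sufficient privilege the call is expected to fail, in
which case the function simply returns False.

```python
import ctypes

CLONE_NEWNS = 0x00020000  # flag value from <linux/sched.h>


def try_unshare_mount_namespace():
    """Try to give the calling process its own mount namespace, without forking."""
    try:
        libc = ctypes.CDLL(None, use_errno=True)
        unshare = libc.unshare
    except (OSError, AttributeError):
        return False  # no usable unshare() in this C library
    unshare.argtypes = [ctypes.c_int]
    unshare.restype = ctypes.c_int
    # The flag names what to stop sharing, the reverse of its meaning for clone(2).
    return unshare(CLONE_NEWNS) == 0


if __name__ == "__main__":
    print("private mount namespace:", try_unshare_mount_namespace())
```

As the reply says, this only splits the caller off from everyone else; it
offers no way for a second login to join the namespace afterwards.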

For FUSE, what's needed is that a user can mount something, and the
mounted fs is visible only to that user, but it's visible to _all_ of
the user's processes.

For example, as a non-root user I should be able to mount an ftpfs
or sshfs on /home/jamie/mnt/remote in one shell window, and be able
to cd inside that directory from a different shell window.  But other
users, including root, should not be able to see the mounted content
of that directory.  (The mounted fs is really just an interface to a
program owned by me - a program running ftp or ssh in this case).

We think namespaces are a nice way to do that: making a user-owned
filesystem only visible to a user.  But the mechanism of CLONE_NEWNS
does not work, because it presumes namespace divisions are only
propagated over parent-child divisions, like environment variables.

What we really want is a mount point that propagates across all the
processes owned by one user, but is not there for other users.

The shared subtree mechanism seems like it has the basic ideas for this.

 b) I _really_ don't like the idea of messing with the parent.  Make it
 a shell builtin if you want to affect shell behaviour; the same reason
 why cd is a builtin and not an external command.

I agree.  That's just a poor hack to let a usermount program alter
the namespace of the shell it's called from.  It won't work in general.

 c) I would be really, really careful with implications of let user
 do whatever he wants - that certainly should include bindings and
 that can create heaps of fun for suid stuff.  More comments when
 I get around to digging through FUSE thread...

Probably the best thing to do for suid programs is this:

   - Have a namespace per user.  The user's namespace will be entered
 by the login program somehow.

   - All logins to the same user acquire the same per-user namespace.
 This isn't possible at the moment; it would be a kernel extension
 + administrative change to login.

   - Mounts done by the user are private to the per-user namespace.

   - Mounts done by the user are still restricted to directories writable
 by the user, so he cannot modify /etc or /usr - your /etc/shadow
 attack is not possible.

   - suid programs run by the user inherit the same per-user namespace.
 In other words, the per-user namespace is entered at login time
 (by the login program doing a syscall), not when changing uid.
 In this way, a user can pass a private pathname to an suid
 program, and the suid program will be able to operate on that
 file properly.
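
The restriction on where a user may mount could be sketched as below. This
is a hypothetical policy check, not an existing kernel or FUSE interface;
the name may_mount_on and the exact rule (the target must be a real
directory, owned by the user and owner-writable) are assumptions chosen to
match the description in the list above.

```python
import os
import stat
import tempfile


def may_mount_on(path, uid):
    """Hypothetical rule: allow only directories owned by uid and owner-writable."""
    st = os.lstat(path)  # lstat, so a symlink cannot redirect the check
    if not stat.S_ISDIR(st.st_mode):
        return False
    return st.st_uid == uid and bool(st.st_mode & stat.S_IWUSR)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(may_mount_on(d, os.getuid()))
        print(may_mount_on(d, os.getuid() + 1))
```

Under such a rule a directory like /etc would be refused for an ordinary
user because the ownership test fails.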

-- Jamie


Re: [RFC][2.6 patch] Allow creation of new namespaces during mount system call

2005-04-20 Thread Al Viro
On Wed, Apr 20, 2005 at 10:45:58AM +0100, Jamie Lokier wrote:
 For FUSE, what's needed is that a user can mount something, and the
 mounted fs is visible only to that user, but it's visible to _all_ of
 the user's processes.

So get that namespace as soon as you log in.
 
 We think namespaces are a nice way to do that: making a user-owned
 filesystem only visible to a user.  But the mechanism of CLONE_NEWNS
 does not work, because it presumes namespace divisions are only
 propagated over parent-child divisions, like environment variables.
 
 What we really want is a mount point that propagates across all the
 processes owned by one user, but is not there for other users.

This is almost certainly bogus.  Same user can easily want several
different environments set on the same box.

- Have a namespace per user.  The user's namespace will be entered
  by the login program somehow.

Trivial right now - just have libpam do that.
 
- All logins to the same user acquire the same per-user namespace.
  This isn't possible at the moment; it would be a kernel extension
  + administrative change to login.

No.  Identical setup at login time - sure.  Enforced _single_ namespace
is just plain wrong.  Moreover, all user's processes is the wrong answer
to practically any question (well, aside of what processes do you kill
when you get rid of luser's account).


Re: [RFC][2.6 patch] Allow creation of new namespaces during mount system call

2005-04-20 Thread Jamie Lokier
Al Viro wrote:
 On Wed, Apr 20, 2005 at 10:45:58AM +0100, Jamie Lokier wrote:
  For FUSE, what's needed is that a user can mount something, and the
  mounted fs is visible only to that user, but it's visible to _all_ of
  the user's processes.
 
 So get that namespace as soon as you log in.

Yes.  How?

It is usual to log in on multiple terminals.  You'd _usually_ want
virtual filesystems mounted in one terminal to be visible in others.

Hmm.  A fairly radical change to the login process would be able to do
it: have login pass the terminal file descriptor to a per-user process
which spawns user login shells from a common namespace.

That would work with the current CLONE_NEWNS mechanism :)

It's quite a radical change to the way logins are done, though.

  We think namespaces are a nice way to do that: making a user-owned
  filesystem only visible to a user.  But the mechanism of CLONE_NEWNS
  does not work, because it presumes namespace divisions are only
  propagated over parent-child divisions, like environment variables.
  
  What we really want is a mount point that propagates across all the
  processes owned by one user, but is not there for other users.
 
 This is almost certainly bogus.  Same user can easily want several
 different environments set on the same box.

I agree, but didn't say that before as the explanation is complicated
enough.

It shouldn't be literally per-user - it should be possible for a user
to have several environments _when_ they want that.  chroot-jail style
virtual server environments require that too.

But that shouldn't be the only option - because it would be horrible
to use.  If I login on multiple terminals, I normally want to mount
filesystems in /home/jamie/mnt on one terminal, and use them on another.

 - Have a namespace per user.  The user's namespace will be entered
   by the login program somehow.
 
 Trivial right now - just have libpam do that.

How can libpam join the user's existing namespace?

Having a separate usermount-namespace for each login of the same user
would not be nice to use.

 - All logins to the same user acquire the same per-user namespace.
   This isn't possible at the moment; it would be a kernel extension
   + administrative change to login.
 
 No.  Identical setup at login time - sure.  Enforced _single_ namespace
 is just plain wrong.  Moreover, all user's processes is the wrong answer
 to practically any question (well, aside of what processes do you kill
 when you get rid of luser's account).

I agree.  Certainly the kernel should not be looking at uid for this.
The actual determination of which processes share mounts should be
determined entirely by userspace.

But right now, the kernel doesn't provide a mechanism (that I know of)
whereby a new login shell can use the namespace of an existing process
- something that would be desirable if it were possible.

The idea I gave at the start of this mail - a daemon per user spawning
login shells using a shared namespace - would work using the available
CLONE_NEWNS mechanism.  But it doesn't fit very well with libpam, nor
with the usual tree relationship of parent-child processes when using
su and such.

-- Jamie


Re: [RFC][2.6 patch] Allow creation of new namespaces during mount system call

2005-04-20 Thread Al Viro
On Wed, Apr 20, 2005 at 01:03:40PM +0100, Jamie Lokier wrote:
 It shouldn't be literally per-user - it should be possible for a user
 to have several environments _when_ they want that.  chroot-jail style
 virtual server environments require that too.
 
 But that shouldn't be the only option - because it would be horrible
 to use.  If I login on multiple terminals, I normally want to mount
 filesystems in /home/jamie/mnt on one terminal, and use them on another.

And when you log in on several terminals you usually want same $PATH.
You don't do that by sharing VM between shell processes, do you?  Sure,
that would work with sufficient kernel-side hacks for joining thread
group and making e.g. bash multithreaded.  Nobody does it though - it
doesn't buy you anything really useful.
 
 How can libpam join the user's existing namespace?
 
 Having a separate usermount-namespace for each login of the same user
 would not be nice to use.

I don't see why.  _IF_ you can change the set of mounts after you log in,
there's no more need to do any kernel tricks for that stuff than you would
need for environment, etc.  If you can't - well, the last point where you
can get something set up is login with no changes afterwards.  In that case
everything is just as trivial...
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC][2.6 patch] Allow creation of new namespaces during mount system call

2005-04-20 Thread Jan Hudec
On Wed, Apr 20, 2005 at 10:45:58 +0100, Jamie Lokier wrote:
 Al Viro wrote:
  Most of the code is already there - do_fork() has to do such stuff anyway.
  So how about adding sys_unshare(flags) that would do that job?  Flags would
  correspond to those of clone(2), except that all these guys would be
  what do we unshare instead of what do we leave shared.
 
 That would let a program split off into its own namespace, but that's
 not really what's needed for FUSE.
 
 For FUSE, what's needed is that a user can mount something, and the
 mounted fs is visible only to that user, but it's visible to _all_ of
 the user's processes.

Including root's su to that user...
Keeping information in a process group is the *only* way to actually
lock out root. That is, except some kind of keyring in kernel.

---
 Jan 'Bulb' Hudec [EMAIL PROTECTED]




Re: [RFC][2.6 patch] Allow creation of new namespaces during mount system call

2005-04-20 Thread Eric Van Hensbergen
On 4/19/05, Al Viro [EMAIL PROTECTED] wrote:
 On Tue, Apr 19, 2005 at 06:53:29PM -0500, Eric Van Hensbergen wrote:
  On 4/19/05, Al Viro [EMAIL PROTECTED] wrote:
 
 a) ability to create a private namespace without forking anything - sure,
 that would be useful.  However, that's not something I would push into
 mount(2) (already overloaded to hell and back).


That's fair, just thought it might be easier to slide in than a new
system call.  Should have known better -- sliding things in is never a
good idea.

 
 There used to be a kinda-sorta agreement on a new syscall:
 unshare(bitmap)
 with arguments like those of clone(2).  That's not just for namespaces -
 e.g. you might legitimately want to unshare VM in a thread and leave the
 rest alone.  Or unshare -fs (i.e. uncouple cwd from the rest of group).
 
 Most of the code is already there - do_fork() has to do such stuff anyway.
 So how about adding sys_unshare(flags) that would do that job?  Flags would
 correspond to those of clone(2), except that all these guys would be
 what do we unshare instead of what do we leave shared.
 

Okay, I'll try to work something up today and post a straw man here.


 b) I _really_ don't like the idea of messing with the parent.  Make it
 a shell builtin if you want to affect shell behaviour; the same reason
 why cd is a builtin and not an external command.


Yeah, that was really slimy, just wanted something that would be more
accessible to end users without having to effect changes to lots of
applications.  A somewhat less slimy method would be to expose
something in /proc, but I suppose that is a much more theologically
charged option.
 
 c) I would be really, really careful with implications of let user
 do whatever he wants - that certainly should include bindings and
 that can create heaps of fun for suid stuff.  More comments when
 I get around to digging through FUSE thread...

I agree, but I want to understand all the issues and see if we can
work out a solution which gives the user some ability to leverage
mount/bind in private namespaces.  I am trying to get more of a Plan 9
like environment under Linux, but understand the existence of a root
uid will require a lot more restrictions.  See my comments in the
later portions of the FUSE thread on a user-mount-permissions file
that would allow admins to define policies on what and where users
could mount/bind (and with what flags).

   -eric


Re: [RFC][2.6 patch] Allow creation of new namespaces during mount system call

2005-04-20 Thread Eric Van Hensbergen
On 4/20/05, Eric Van Hensbergen [EMAIL PROTECTED] wrote:
 
 Yeah, that was really slimy, just wanted something that would be more
 accessible to end users without having to effect changes to lots of
 applications.  A somewhat less slimy method would be to expose
 something in /proc, but I suppose that is a much more theologically
 charged option.
 

I take it back, the more I think about it /proc is even more slimy. 
Forget I said it.

   -eric


Re: [RFC][2.6 patch] Allow creation of new namespaces during mount system call

2005-04-20 Thread Ram
On Wed, 2005-04-20 at 05:39, Al Viro wrote:
 On Wed, Apr 20, 2005 at 01:03:40PM +0100, Jamie Lokier wrote:
  It shouldn't be literally per-user - it should be possible for a user
  to have several environments _when_ they want that.  chroot-jail style
  virtual server environments require that too.
  
  But that shouldn't be the only option - because it would be horrible
  to use.  If I login on multiple terminals, I normally want to mount
  filesystems in /home/jamie/mnt on one terminal, and use them on another.
 
 And when you log in on several terminals you usually want same $PATH.
 You don't do that by sharing VM between shell processes, do you?  Sure,
 that would work with sufficient kernel-side hacks for joining thread
 group and making e.g. bash multithreaded.  Nobody does it though - it
 doesn't buy you anything really useful.
  
  How can libpam join the user's existing namespace?
  
  Having a separate usermount-namespace for each login of the same user
  would not be nice to use.
 
 I don't see why.  _IF_ you can change the set of mounts after you log in,
 there's no more need to do any kernel tricks for that stuff than you would
 need for environment, etc.  If you can't - well, the last point where you
 can get something set up is login with no changes afterwards.  In that case
 everything is just as trivial...

Reading through the thread I assume the requirement is:

1) A User being able to create his own VFS-mount environment 
2) being able to use the same VFS-mount environment from 
multiple login sessions.
3) Being able to switch some processes to some other
  VFS-mount environment.

How about making namespaces first-class objects with some associated
name or device in the device tree having owner/permissions etc.?  Any
process which forks off a namespace shall create the
device node for the namespace. If some other process wants to use
the same namespace, it can do so by attaching itself to the namespace
dynamically. Of course, child processes inherit the same namespace.


If such functionality existed, then a user could create his own
namespace if one does not exist, and if one exists he could attach to
that namespace.  I have not thought through this idea entirely, but it
seems to provide the desired functionality.

RP




Re: [RFC][2.6 patch] Allow creation of new namespaces during mount system call

2005-04-20 Thread Al Viro
On Wed, Apr 20, 2005 at 09:51:26AM -0700, Ram wrote:
 Reading through the thread I assume the requirement is:
 
 1) A User being able to create his own VFS-mount environment 
 2) being able to use the same VFS-mount environment from 
 multiple login sessions.
 3) Being able to switch some processes to some other
   VFS-mount environment.

Excuse me, but could somebody give coherent rationale for such requirements?
_Especially_ for joining existing group by completely unrelated process -
something we don't do for any other component of process.


Re: [RFC][2.6 patch] Allow creation of new namespaces during mount system call

2005-04-20 Thread Miklos Szeredi
  Reading through the thread I assume the requirement is:
  
  1) A User being able to create his own VFS-mount environment 
  2) being able to use the same VFS-mount environment from 
  multiple login sessions.
  3) Being able to switch some processes to some other
     VFS-mount environment.
 
 Excuse me, but could somebody give coherent rationale for such requirements?
 _Especially_ for joining existing group by completely unrelated process -
 something we don't do for any other component of process.

The user expects to see the same files in all sessions,
whether those be local logins, remote logins, ftp/scp/etc sessions.

If I'm remotely logged into server X from Y, and want to use scp to
copy some file from X to Y or vice versa, I will want my private
mounts to be visible from the scp.

A single global namespace makes perfect sense here.  Why do people
want private namespaces?  The usual reason given is that the global
namespace should not be polluted with private mounts.  Is that a
good reason?  I'm not sure.

Miklos


Re: [RFC][2.6 patch] Allow creation of new namespaces during mount system call

2005-04-20 Thread Eric Van Hensbergen
On 4/20/05, Al Viro wrote:
 
 Excuse me, but could somebody give coherent rationale for such requirements?
 _Especially_ for joining existing group by completely unrelated process -
 something we don't do for any other component of process.
 
 On Wed, Apr 20, 2005 at 09:51:26AM -0700, Ram wrote:
  Reading through the thread I assume the requirement is:
 
  1) A User being able to create his own VFS-mount environment
 

I think this enables a more secure environment for users when they
mount/bind their own file systems (be it network or synthetic).   It
helps prevent them from spoofing files and/or resources for users
other than themselves.  Security exposure for system files can be
limited by preventing users from mounting/binding over secure
locations and by enforcing nosuid/nosgid on user mounted file systems.
Since there is no way to get at the private namespace, its contents
are safe from otherwise privileged users of the system (I'm not sure
if I agree with this, but I think that is part of what sent us down
this thread in the first place).

  2) being able to use the same VFS-mount environment from
  multiple login sessions.
  3) Being able to switch some processes to some other
     VFS-mount environment.

These requirements are less clear to me - although it might be useful
to structure things in such a way that it is easy to recreate a
VFS-mount environment.  To a certain extent the information in
/proc/x/mounts gives this information, but I'm unclear on how this
might impact synthetic file systems.

Another alternative is to allow easy re-export of a constructed
VFS-mount environment so that it could be mounted/bound into other
environments.  The Plan 9 /srv (3) device kind of works this way.  I
assume that is some of the intent under the shared subtree stuff as
well.

Of course that doesn't answer the question of why you need to do these
things in the first place.  One case is when you've opened up a remote
file system in one window under a private namespace and realize you
need access to those files in another window -- opening up a new
connection to the server from the other window seems somewhat
heavyweight -- I imagine the same thing would be the case in a synthetic
file system case (like a user-space encrypted file system).

In Plan 9, the connection to the server (whether remote or local
synthetic) would be represented by a handle in the /srv device.  This
single instance can then be mounted onto a particular process'
namespace, and all that is required to share the instance is the /srv
handle.  The /srv handles can be per-user or just protected with
normal file system permissions (I guess this is sorta similar to Ram's
device suggestion).

All in all, I think these last two requirements represent a much
tougher problem and there very well may be a better solution to what
they are trying to do.  It's also worth noting (as I think someone did)
that this invalidates the private namespace's prevention of unauthorized
super-user access to user files (which is somewhat bogus anyway).

  -eric


Re: [RFC][2.6 patch] Allow creation of new namespaces during mount system call

2005-04-20 Thread Eric Van Hensbergen
On 4/20/05, Miklos Szeredi wrote:

 The user expects to see the same files in all sessions,
 whether those be local logins, remote logins, ftp/scp/etc sessions.


 A single global namespace makes perfect sense here.  Why do people
 want private namespaces?


I disagree with this; I think there are plenty of situations where I
may want to have several different namespaces for several different
sessions.  Once you unlock namespace manipulation for users, all sorts
of new models of how to interact with the system fall out of that
capability.

Not to mention the fact that if you accidentally screw up your private
name space, fixing it is as easy as exiting that shell (or closing
that window, etc.)  Could you imagine if your PATH environment
variable was shared between all your shells and logins?  A simple typo
when extending your PATH in a single shell would screw all your
sessions.

  -eric


Re: [RFC][2.6 patch] Allow creation of new namespaces during mount system call

2005-04-20 Thread Bryan Henderson
 But that shouldn't be the only option - because it would be horrible
 to use.  If I login on multiple terminals, I normally want to mount
 filesystems in /home/jamie/mnt on one terminal, and use them on another.

 And when you log in on several terminals you usually want same $PATH.
 You don't do that by sharing VM between shell processes, do you?

I share Al's view, and would expand:  You'd _like_ to be able to add
something to your namespace once and have it show up in multiple processes'
namespaces, but you wouldn't expect it, because Unix has been horrible to
use in that way forever.  I am frequently frustrated when I decide to
change my environment either by setting an environment variable or shell
variable or alias, and I have to do it separately in every existing shell.
And forget about the background jobs.  But at least it's consistent.  And
there are other times when I exploit the fact that I can set something
differently in different shells of the same user.

We do have a few areas where a group of processes can share the same 
kernel state, but it's always based on common ancestry.  It would take a 
major new concept to have a different kind of group of processes for 
namespace purposes, and then we probably wouldn't want to base it on uid, 
because uid means other things already.  Why tie them together?

--
Bryan Henderson  IBM Almaden Research Center
San Jose CA  Filesystems



Re: [RFC][2.6 patch] Allow creation of new namespaces during mount system call

2005-04-20 Thread Ram
On Wed, 2005-04-20 at 10:09, Al Viro wrote:
 On Wed, Apr 20, 2005 at 09:51:26AM -0700, Ram wrote:
  Reading through the thread I assume the requirement is:
  
  1) A User being able to create his own VFS-mount environment 
  2) being able to use the same VFS-mount environment from 
  multiple login sessions.
  3) Being able to switch some processes to some other
     VFS-mount environment.
 
 Excuse me, but could somebody give coherent rationale for such requirements?
 _Especially_ for joining existing group by completely unrelated process -
 something we don't do for any other component of process.

Would it be wrong to do (3) if it is access-controlled properly?  Currently
the only way to share a namespace is to inherit it, which is possible
only if the process is a descendant of the creator of the
namespace.

I extracted requirement (3) from this discussion:

 We think namespaces are a nice way to do that: making a user-owned
 filesystem only visible to a user.  But the mechanism of CLONE_NEWNS
 does not work, because it presumes namespace divisions are only
 propagated over parent-child divisions, like environment variables.
 
 What we really want is a mount point that propagates across all the
 processes owned by one user, but is not there for other users.

This is almost certainly bogus.  Same user can easily want several
different environments set on the same box.


RP


Re: [RFC][2.6 patch] Allow creation of new namespaces during mount system call

2005-04-20 Thread Miklos Szeredi
(Please don't post separately to different recipients, that makes
replying quite awkward.  Always reply to all, it's the Right Thing)

 I disagree with this, I think there are plenty of situations where I
 may want to have several different namespaces for several different
 sessions.  Once you unlock namespace manipulation for users, all sorts
 of new models of how to interact with the system fall out of that
 capability.

I agree fully with you.

What I was getting at is why people want to use private namespaces in
a way that contradicts the privateness of the namespace, i.e. share
it between sessions etc.  A global namespace makes sense in that case.

Miklos


Re: [RFC][2.6 patch] Allow creation of new namespaces during mount system call

2005-04-20 Thread Bryan Henderson
 How about making namespaces first-class objects, each with an associated
 name or device node in the device tree that has an owner, permissions,
 etc.?  Any process that forks off a namespace would create the device
 node for that namespace.  If some other process wants to use the same
 namespace, it can do so by attaching itself to the namespace
 dynamically.  Of course, child processes inherit the same namespace.

For the issues being discussed here, I don't think that's materially 
different from what we started with; it has the same issue concerning 
whether a user should be allowed to change his namespace and whether a 
process' namespace should change automatically when another process does 
something.

Here's one more proposal, kind of a compromise among various previous 
ones.

  - When you mount(), you say whether the names should be visible by 
default or not.  It takes system privilege to make them visible by 
default, but an ordinary user can mount a willing filesystem over a 
directory he's permitted to modify unconditionally, invisible by default

  - A process can explicitly request to see an invisible-by-default 
mounted filesystem.  Anyone can do this, but permissions on the root 
directory of the mount determine if he can actually see anything.

  - A process inherits the parent's namespace (i.e. sees the mounts the 
parent does).

This accomplishes:

  - not much of a philosophical break from where we are now.

  - users can mount their own stuff without system privilege.

  - no one, not even a fully permitted administrative process, sees user 
junk by default.

  - setuid programs see standard files where the system administrator put 
them.

  - setuid programs see user files where the user put them.

  - multiple processes, with or without the same uid, can see user-mounted 
files if they want.

  - a process can opt not to see user-mounted files, even if it has the 
same uid as processes that do.

I'm not saying how I would implement this; there's enough discussion over 
the desired result that I thought we should start there.
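
As an illustration only, the visibility rules in the proposal above can be
written down as a toy user-space model.  Nothing here is kernel code; the
names toy_mount, toy_process, toy_request_visibility, toy_sees and toy_fork
are made up for the sketch, and the permission check on the root directory
of the mount is left out.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_OPTINS 8

/* A mount is either visible to everyone by default (needs system
 * privilege in the proposal) or invisible until a process asks for it. */
struct toy_mount {
    int id;
    bool visible_by_default;
};

/* Per-process state: the invisible-by-default mounts this process has
 * explicitly asked to see. */
struct toy_process {
    int optins[TOY_MAX_OPTINS];
    size_t n_optins;
};

/* Explicit request to see a mount.  Returns false if the table is full. */
static bool toy_request_visibility(struct toy_process *p,
                                   const struct toy_mount *m)
{
    if (p->n_optins == TOY_MAX_OPTINS)
        return false;
    p->optins[p->n_optins++] = m->id;
    return true;
}

/* Does process p see mount m? */
static bool toy_sees(const struct toy_process *p, const struct toy_mount *m)
{
    if (m->visible_by_default)
        return true;
    for (size_t i = 0; i < p->n_optins; i++)
        if (p->optins[i] == m->id)
            return true;
    return false;
}

/* A child starts with a copy of the parent's view (third bullet). */
static struct toy_process toy_fork(const struct toy_process *parent)
{
    return *parent;
}
```

In this model an unrelated process with the same uid sees nothing of a user
mount until it calls toy_request_visibility itself, which is the "can opt
not to see user-mounted files" property from the list.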

--
Bryan Henderson  IBM Almaden Research Center
San Jose CA  Filesystems



Re: [RFC][2.6 patch] Allow creation of new namespaces during mount system call

2005-04-20 Thread Miklos Szeredi
 For the issues being discussed here, I don't think that's materially 
 different from what we started with; it has the same issue concerning 
 whether a user should be allowed to change his namespace and whether a 
 process' namespace should change automatically when another process does 
 something.
 
 Here's one more proposal, kind of a compromise among various previous 
 ones.
 
   - When you mount(), you say whether the names should be visible by 
 default or not.  It takes system privilege to make them visible by 
 default, but an ordinary user can mount a willing filesystem over a 
 directory he's permitted to modify unconditionally, invisible by default

I like the idea of invisible mountpoints.  It doesn't even sound as if
it would be hard to implement, although Al will surely find a million
reasons why it's problematic ;)

   - A process can explicitly request to see an invisible-by-default 
 mounted filesystem.  Anyone can do this, but permissions on the root 
 directory of the mount determine if he can actually see anything.

How would you request that the mountpoint be made visible from _any_
program?  It's not acceptable to expect every program to include a
menu, command, etc. to be able to modify the visibility of
mountpoints.

Would it not be better if you could specify the visibility policy when
mounting?  Something simple like the user-group-other permission
model would do nicely.  That would also have the advantage of being
bound to the mountpoint, not the process.

Miklos


Re: [RFC][2.6 patch] Allow creation of new namespaces during mount system call

2005-04-20 Thread Al Viro
On Wed, Apr 20, 2005 at 09:43:47PM +0100, Jamie Lokier wrote:
 Al Viro is right to point out that environment variables are not
 shared.  But he forgets that _files_ are shared.

So they are.  ~/.profile, for instance ;-)


Re: [RFC][2.6 patch] Allow creation of new namespaces during mount system call

2005-04-20 Thread Al Viro
On Wed, Apr 20, 2005 at 07:53:04PM +0200, Miklos Szeredi wrote:
 The user expects to see the same files in all sessions,
 whether those be local logins, remote logins, ftp/scp/etc sessions.

Umm...  You know, I would be *very* unhappy if I found that some process
on parcelfarce was able to see the contents of ~/.ssh/* from the laptop
I'm using right now.  Even more so if that would apply to random ftp
sewer I happen to use at the moment...
 
 If I'm remotely logged into server X from Y, and want to use scp to
 copy some file from X to Y or vice versa, I will want my private
 mounts to be visible from the scp.

Do you?  Really?  OK, so I've got ~/bin/ and ~/bin/arch/ in my path on
my boxen.  The latter has ~/bin/{i386,alpha,sparc,amd64,hppa,ppc} bound
on it - depending on the host I'm using.  Tell me, why would I want that
private mount to be visible when I log in from one box to another?  To
make sure that wrong binaries would be picked?
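
For reference, a host-specific bind like the one described above could be
set up along these lines.  This is a minimal sketch, not Al's actual setup:
the helper names arch_bin_source and bind_arch_bin are invented, and it
assumes the per-architecture directories are named exactly as uname(2)
reports the machine, which need not match names such as "i386" or "amd64".
mount(2) with MS_BIND normally needs privilege.

```c
#include <stdio.h>
#include <sys/mount.h>
#include <sys/utsname.h>

/* Build "<home>/bin/<machine>" into buf.
 * Returns 0 on success, -1 if the result does not fit. */
static int arch_bin_source(const char *home, const char *machine,
                           char *buf, size_t len)
{
    int n = snprintf(buf, len, "%s/bin/%s", home, machine);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Bind <home>/bin/<machine> over <home>/bin/arch for the running host.
 * Returns 0 on success, -1 on error (errno is set by uname/mount). */
static int bind_arch_bin(const char *home)
{
    struct utsname u;
    char src[4096], dst[4096];
    int n;

    if (uname(&u) != 0)
        return -1;
    if (arch_bin_source(home, u.machine, src, sizeof src) != 0)
        return -1;
    n = snprintf(dst, sizeof dst, "%s/bin/arch", home);
    if (n < 0 || (size_t)n >= sizeof dst)
        return -1;
    /* Bind mount: filesystem type and data are ignored for MS_BIND. */
    return mount(src, dst, NULL, MS_BIND, NULL);
}
```

Because the source directory depends on the host, the resulting mount is
only meaningful on the machine where it was made, which is the point of the
objection.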


Re: [RFC][2.6 patch] Allow creation of new namespaces during mount system call

2005-04-20 Thread Ram
On Wed, 2005-04-20 at 11:57, Bryan Henderson wrote:
 How about making namespaces first-class objects, each with an associated
 name or device node in the device tree that has an owner, permissions,
 etc.?  Any process that forks off a namespace would create the device
 node for that namespace.  If some other process wants to use the same
 namespace, it can do so by attaching itself to the namespace
 dynamically.  Of course, child processes inherit the same namespace.

 For the issues being discussed here, I don't think that's materially 
 different from what we started with; it has the same issue concerning 
 whether a user should be allowed to change his namespace and whether a 
 process' namespace should change automatically when another process does 
 something.

Well, I am not aware of issues that can arise if a user is allowed to
change to some namespace that he has permission to switch to.  I am
wondering what those could be.  Just for clarity, under the assumption that
there won't be any security issues with this proposal, I will run through
a typical case of how it can be used to meet the FUSE needs.

OK, let me explain how my proposal attempts to solve the problem:

A user 'ram' creates a namespace 'n1' with a device node /dev/n1 that has
permission 700 and is owned by 'ram'.  The user then tailors his
namespace with a bunch of mounts, umounts, binds, etc. to meet his
requirements.  All the processes forked off from this process see the
same setup.

Now 'ram' logs in to the same machine again from a different terminal.
He gets the default system namespace to begin with.  But he runs a shell
built-in command 'chnamespace n1', and from then on the shell is attached
to namespace 'n1' and sees the same environment as the other login.

This can be automated so that the login sees the 'n1' namespace by
default: the login process picks the 'n1' namespace right from the
beginning (if one exists) by looking at some configuration file like
/etc/passwd or maybe /etc/shadow.

Any other user who wants to attach to the same namespace can try to,
but if permission is denied he won't be able to.
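
The access check implied by the walkthrough above might look like the
following toy model.  It is only a sketch: struct ns_node and
ns_may_attach are hypothetical names, and the rule that attaching needs
both read and write permission on the node is an assumption made for the
example, not something the proposal specifies.  How a privileged user is
treated is deliberately left out.

```c
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical metadata of a namespace device node such as /dev/n1. */
struct ns_node {
    uid_t owner;
    gid_t group;
    mode_t mode;    /* e.g. 0700 in the walkthrough */
};

/* Classic owner/group/other selection: exactly one permission class
 * applies to the caller.  Attaching is assumed to need read and write. */
static bool ns_may_attach(const struct ns_node *n, uid_t uid, gid_t gid)
{
    if (uid == n->owner)
        return (n->mode & (S_IRUSR | S_IWUSR)) == (S_IRUSR | S_IWUSR);
    if (gid == n->group)
        return (n->mode & (S_IRGRP | S_IWGRP)) == (S_IRGRP | S_IWGRP);
    return (n->mode & (S_IROTH | S_IWOTH)) == (S_IROTH | S_IWOTH);
}
```

With mode 0700 only the owner passes the check; widening the mode to 0770
would let members of the node's group attach as well.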

 
 Here's one more proposal, kind of a compromise among various previous 
 ones.
 
   - When you mount(), you say whether the names should be visible by 
 default or not.  It takes system privilege to make them visible by 
 default, but an ordinary user can mount a willing filesystem over a 
 directory he's permitted to modify unconditionally, invisible by default
   - A process can explicitly request to see an invisible-by-default 
 mounted filesystem.  Anyone can do this, but permissions on the root 
 directory of the mount determine if he can actually see anything.
 
   - A process inherits the parent's namespace (i.e. sees the mounts the 
 parent does).

Trying to understand your proposal to see how it could be used to solve
the problem faced by the FUSE project.  Are you trying to use a single
namespace with invisible mounts capability? 

RP

 
 This accomplishes:
 
   - not much of a philosophical break from where we are now.
 
   - users can mount their own stuff without system privilege.
 
   - no one, not even a fully permitted administrative process, sees user 
 junk by default.
 
   - setuid programs see standard files where the system administrator put 
 them.
 
   - setuid programs see user files where the user put them.
 
   - multiple processes, with or without the same uid, can see user-mounted 
 files if they want.
 
   - a process can opt not to see user-mounted files, even if it has the 
 same uid as processes that do.
 
 I'm not saying how I would implement this; there's enough discussion over 
 the desired result that I thought we should start there.
 
 --
 Bryan Henderson  IBM Almaden Research Center
 San Jose CA  Filesystems
 



Re: [RFC][2.6 patch] Allow creation of new namespaces during mount system call

2005-04-20 Thread Jamie Lokier
Ram wrote:
  What we really want is a mount point that propagates across all the
  processes owned by one user, but is not there for other users.
 
 This is almost certainly bogus.  Same user can easily want several
 different environments set on the same box.

Yes of course.

The problem is the current mechanism _forces_ the user to have
different environments on the same box - there's no choice.

Which is, as Al says, just like environment variables.

But not like files - if I create a file called $HOME/foo, I expect
that I can access it from a different login.  I might want to have
different environments, but that's not the _default_ when dealing with
files.

The question is whether private user-mounts should, by default, behave
more like environment variables or more like files.
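
The difference between the two behaviours can be observed with a small
user-space experiment.  This is a sketch under simplifying assumptions: it
uses a parent and a forked child rather than two separate logins, and the
function names and the path passed in are placeholders chosen for the
example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if a variable set in a child process shows up in the
 * parent, 0 if it does not, -1 on error. */
static int env_set_in_child_reaches_parent(const char *name)
{
    pid_t pid;
    int status;

    unsetenv(name);
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Changes only the child's private copy of the environment. */
        setenv(name, "set-by-child", 1);
        _exit(0);
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return getenv(name) != NULL;
}

/* Returns 1 if a file created by a child process is visible to the
 * parent, 0 if it is not, -1 on error.  The file is removed afterwards. */
static int file_made_in_child_reaches_parent(const char *path)
{
    pid_t pid;
    int status;
    int seen;

    unlink(path);
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        FILE *f = fopen(path, "w");
        if (!f)
            _exit(1);
        fputs("hello\n", f);
        fclose(f);
        _exit(0);
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    seen = access(path, F_OK) == 0;
    unlink(path);
    return seen;
}
```

The environment change stays inside the process that made it, while the
file lives in the shared filesystem and is seen by everyone who can reach
the path.  A mount made in a private namespace currently behaves like the
former.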

-- Jamie


Re: [RFC][2.6 patch] Allow creation of new namespaces during mount system call

2005-04-20 Thread Jamie Lokier
Jan Hudec wrote:
  For FUSE, what's needed is that a user can mount something, and the
  mounted fs is visible only to that user, but it's visible to _all_ of
  the user's processes.
 
 Including root's su to that user...
 Keeping information in a process group is the *only* way to actually
 lock out root.

If root is _intending_ to view the user's data, it will succeed.

Keeping them in a process group won't prevent that: root can look at
the data accessible by any process (via ptrace or /dev/mem).

The problem most clearly illustrating the need for private user data
is sshfs, or anything which mounts over ssl/tls.

 That is, except some kind of keyring in kernel.

For secure user data, as in sshfs, that's the only real solution: a
keyring in kernel which cannot be accessed simply by calling su, and
which must be accessed to gain access to the mounted directory.

Which is no different from securing user data when scp+ssh-agent is used.

-- Jamie


Re: [RFC][2.6 patch] Allow creation of new namespaces during mount system call

2005-04-20 Thread Jamie Lokier
Al Viro wrote:
 On Wed, Apr 20, 2005 at 09:43:47PM +0100, Jamie Lokier wrote:
  Al Viro is right to point out that environment variables are not
  shared.  But he forgets that _files_ are shared.
 
 So they are.  ~/.profile, for instance ;-)

And ~/.cvspass.  Mysteriously, when I add something to .cvspass using
cvs login, I then have the same access in all my login windows... ;)

Yes it's a bit muddled.

-- Jamie



Re: [RFC][2.6 patch] Allow creation of new namespaces during mount system call

2005-04-20 Thread Jamie Lokier
Al Viro wrote:
  If I'm remotely logged into server X from Y, and want to use scp to
  copy some file from X to Y or vice versa, I will want my private
  mounts to be visible from the scp.
 
 Do you?  Really?  OK, so I've got ~/bin/ and ~/bin/arch/ in my path on
 my boxen.  The latter has ~/bin/{i386,alpha,sparc,amd64,hppa,ppc} bound
 on it - depending on the host I'm using.  Tell me, why would I want that
 private mount to be visible when I log in from one box to another?  To
 make sure that wrong binaries would be picked?

I believe the point is:

   1. A person is logged in from client Y to server X, and mounts
      something on $HOME/mnt/private (that's on X).

   2. On client Y, the person does scp X:mnt/private/secrets.txt .
      and wants it to work.

The second operation is a separate login from the first.

-- Jamie


Re: [RFC][2.6 patch] Allow creation of new namespaces during mount system call

2005-04-20 Thread Bryan Henderson
 Well, I am not aware of issues that can arise if a user is allowed to
 change to some namespace that he has permission to switch to.

I think I misunderstood your proposal.

 A user 'ram' creates a namespace 'n1' with a device node /dev/n1 that has
 permission 700 and is owned by 'ram'.  The user then tailors his
 namespace with a bunch of mounts, umounts, binds, etc. to meet his
 requirements.

How does that address the setuid problem -- that a setuid program is 
installed with the expectation that when it runs, certain names will 
identify certain files (e.g. /etc/shadow)?  But also that certain other 
names will identify a file of the invoker's choosing?

 Trying to understand your proposal to see how it could be used to solve
 the problem faced by the FUSE project.  Are you trying to use a single
 namespace with invisible mounts capability?

Essentially.  It's a compromise.  A user can customize his namespace, but 
only within limits that preserve the integrity of the system.

Technically, we have to admit it's not one namespace, today or with
invisible mounts.  Because of the way mounts cover up mountpoints, it's
technically possible for two processes to see different files under the
same name, if one opened the directory before a mount and the other after.
Mounting over is a curse.

--
Bryan Henderson  IBM Almaden Research Center
San Jose CA  Filesystems


[RFC][2.6 patch] Allow creation of new namespaces during mount system call

2005-04-19 Thread Eric Van Hensbergen
The motivation behind this patch is to make private namespaces more
accessible by allowing their creation at mount/bind time.

Based on some of the FUSE permissions discussions, I wanted to check
into modifying the mount system calls -- adding a flag which created a
new namespace for the resulting mount.  I quickly discovered that what
I typically wanted (for the case of running a mount command) was to
actually create a new namespace for the parent thread (typically the
shell), inherit that namespace, and then perform the mount.

It's not clear to me that both options are needed; cloning the parent's
namespace seems to be what you want most of the time.
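
The intended difference between the two options can be illustrated with a
toy user-space model.  This is only a sketch of the behaviour described in
the text above, not of the patch itself: toy_ns, toy_task, toy_clone_ns
and toy_mount are invented names, a namespace is reduced to a mount
counter, and the flag values are assumed to be bits 17 and 18 as in the
fs.h hunk of the patch.  The flags are hypothetical and must not be passed
to a real mount(2), where those bits may mean something else.

```c
#include <stddef.h>

/* Flag values as proposed in the patch; illustration only. */
#define MS_CLONE_NEWNS  (1UL << 17)   /* clone my namespace before mount */
#define MS_CLONE_NEWPNS (1UL << 18)   /* clone my parent's namespace */

#define TOY_MAX_NS 8

struct toy_ns   { int id; int nmounts; };
struct toy_task { struct toy_task *parent; struct toy_ns *ns; };

static struct toy_ns toy_pool[TOY_MAX_NS];
static size_t toy_used;

/* Copy a namespace into the pool; returns NULL when the pool is full. */
static struct toy_ns *toy_clone_ns(const struct toy_ns *old)
{
    struct toy_ns *n;

    if (toy_used == TOY_MAX_NS)
        return NULL;
    n = &toy_pool[toy_used++];
    *n = *old;
    n->id = (int)toy_used;
    return n;
}

/* Perform a "mount" in task t, optionally cloning a namespace first. */
static int toy_mount(struct toy_task *t, unsigned long flags)
{
    struct toy_ns *ns;

    if ((flags & MS_CLONE_NEWPNS) && t->parent) {
        /* New namespace for the parent (the shell), shared by the caller. */
        ns = toy_clone_ns(t->parent->ns);
        if (!ns)
            return -1;
        t->parent->ns = ns;
        t->ns = ns;
    } else if (flags & MS_CLONE_NEWNS) {
        /* New namespace for the caller only. */
        ns = toy_clone_ns(t->ns);
        if (!ns)
            return -1;
        t->ns = ns;
    }
    t->ns->nmounts++;
    return 0;
}
```

With the parent variant the mount outlives the mount command because the
shell ends up in the new namespace; with the plain variant it disappears
together with the calling process.  The model ignores tasks that share
their fs_struct with the parent.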

In order to minimize code impact I split the copy_namespace function;
perhaps the right long-term solution is to change its interface to
accommodate the changes.  Things look a bit more invasive because I moved
the copy_namespace function above do_mount.  The patch follows:

 fs/namespace.c     |  193
 include/linux/fs.h |    2
 2 files changed, 108 insertions(+), 87 deletions(-)

--- linux-2.5/include/linux/fs.h	2005-04-19 17:02:28.530152496 -0500
+++ newns-2.5/include/linux/fs.h	2005-04-19 17:03:52.619368992 -0500
@@ -103,6 +103,8 @@ extern int dir_notify_enable;
 #define MS_REC		16384
 #define MS_VERBOSE	32768
 #define MS_POSIXACL	(1<<16)	/* VFS does not apply the umask */
+#define MS_CLONE_NEWNS	(1<<17)	/* clone my namespace before mount */
+#define MS_CLONE_NEWPNS	(1<<18)	/* clone my  my parent namespace */
 #define MS_ACTIVE	(1<<30)
 #define MS_NOUSER	(1<<31)
 
--- linux-2.5/fs/namespace.c	2005-04-19 17:02:14.551277608 -0500
+++ newns-2.5/fs/namespace.c	2005-04-19 17:03:38.227556880 -0500
@@ -991,6 +991,104 @@ int copy_mount_options(const void __user
 	return 0;
 }
 
+int update_namespace(struct task_struct *tsk, struct namespace *new_ns )
+{
+	struct namespace *namespace = tsk->namespace;
+	struct vfsmount *rootmnt = NULL, *pwdmnt = NULL, *altrootmnt = NULL;
+	struct fs_struct *fs = tsk->fs;
+	struct vfsmount *p, *q;
+
+	if (!namespace)
+		return 0;
+
+	get_namespace(namespace);
+
+	if (!capable(CAP_SYS_ADMIN)) {
+		put_namespace(namespace);
+		return -EPERM;
+	}
+
+	down_write(&tsk->namespace->sem);
+	if(!new_ns) {
+		new_ns = kmalloc(sizeof(struct namespace), GFP_KERNEL);
+		if (!new_ns)
+			goto out;
+
+		atomic_set(&new_ns->count, 1);
+		init_rwsem(&new_ns->sem);
+		INIT_LIST_HEAD(&new_ns->list);
+
+		/* First pass: copy the tree topology */
+		new_ns->root = copy_tree(namespace->root, namespace->root->mnt_root);
+		if (!new_ns->root) {
+			up_write(&tsk->namespace->sem);
+			kfree(new_ns);
+			goto out;
+		}
+		spin_lock(&vfsmount_lock);
+		list_add_tail(&new_ns->list, &new_ns->root->mnt_list);
+		spin_unlock(&vfsmount_lock);
+	} else
+		get_namespace(new_ns);
+
+	/*
+	 * Second pass: switch the tsk->fs->* elements and mark new vfsmounts
+	 * as belonging to new namespace.  We have already acquired a private
+	 * fs_struct, so tsk->fs->lock is not needed.
+	 */
+	p = namespace->root;
+	q = new_ns->root;
+	while (p) {
+		q->mnt_namespace = new_ns;
+		if (fs) {
+			if (p == fs->rootmnt) {
+				rootmnt = p;
+				fs->rootmnt = mntget(q);
+			}
+			if (p == fs->pwdmnt) {
+				pwdmnt = p;
+				fs->pwdmnt = mntget(q);
+			}
+			if (p == fs->altrootmnt) {
+				altrootmnt = p;
+				fs->altrootmnt = mntget(q);
+			}
+		}
+		p = next_mnt(p, namespace->root);
+		q = next_mnt(q, new_ns->root);
+	}
+	up_write(&tsk->namespace->sem);
+
+	tsk->namespace = new_ns;
+
+	if (rootmnt)
+		mntput(rootmnt);
+	if (pwdmnt)
+		mntput(pwdmnt);
+	if (altrootmnt)
+		mntput(altrootmnt);
+
+	put_namespace(namespace);
+	return 0;
+
+out:
+	put_namespace(namespace);
+	return -ENOMEM;
+}
+
+int copy_namespace(int flags, struct task_struct *tsk)
+{
+	if (!tsk->namespace)
+		return 0;
+
+	if (!(flags & CLONE_NEWNS)) {
+		get_namespace(tsk->namespace);
+		return 0;
+	}
+
+	return update_namespace( tsk, NULL );
+}
+
 /*
  * Flags is a 32-bit value that allows up to 31 non-fs dependent flags to
  * be given to the mount() call (ie: read-only, 

Re: [RFC][2.6 patch] Allow creation of new namespaces during mount system call

2005-04-19 Thread Al Viro
On Tue, Apr 19, 2005 at 05:13:32PM -0500, Eric Van Hensbergen wrote:
 The motivation behind this patch is to make private namespaces more
 accessible by allowing their creation at mount/bind time.
 
 Based on some of the FUSE permissions discussions, I wanted to check
 into modifying the mount system calls -- adding a flag which created a
 new namespace for the resulting mount.  I quickly discovered that what
 I typically wanted (for the case of running a mount command) was to
 actually create a new namespace for the parent thread (typically the
 shell), inherit that namespace, and then perform the mount.
 
 It's not clear to me that both options are needed, cloning the parent's
 namespace seems to be what you want most of the time.
 
 In order to minimize code impact I split the copy_namespace function,
 perhaps the right long term solution is to change its interface to
 accommodate the changes.  Things look a bit more invasive as I moved
 the copy_namespace function above do_mount.  The patch follows:

*UGH*

So what happens to those who happen to share task->fs with the parent?
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC][2.6 patch] Allow creation of new namespaces during mount system call

2005-04-19 Thread Eric Van Hensbergen
On 4/19/05, Al Viro [EMAIL PROTECTED] wrote:
 On Tue, Apr 19, 2005 at 05:13:32PM -0500, Eric Van Hensbergen wrote:
  The motivation behind this patch is to make private namespaces more
  accessible by allowing their creation at mount/bind time.
 
 
 *UGH*
 
 So what happens to those who happen to share task->fs with the parent?
 

Okay, I'll admit to being a bit too hasty with pushing out that patch
- I was being particularly myopic looking for a solution only for a
command-line mount.  Are you generally opposed to new namespace
creation at mount time or just my slimy hack?  A shared task->fs seems
like something which could be easily checked against and disallowed.
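
To make the concern concrete, here is a small userspace model, not kernel
code.  The names fs_model, task_model, switch_root_naive and
switch_root_checked are invented for illustration; only the idea of a use
count on a shared fs_struct is taken from the discussion.  It sketches why
rewriting the root inside a shared structure is visible to every sharer, and
what a "check and disallow" guard might look like.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's fs_struct. */
struct fs_model {
	int count;           /* number of tasks sharing this structure */
	const char *rootmnt; /* stands in for the root vfsmount */
};

/* Hypothetical stand-in for task_struct. */
struct task_model {
	struct fs_model *fs;
};

/* Naive switch: rewrites the shared structure, so every sharer sees it. */
static void switch_root_naive(struct task_model *tsk, const char *new_root)
{
	assert(tsk != NULL && tsk->fs != NULL);
	tsk->fs->rootmnt = new_root;
}

/* Guarded switch: refuse with -EINVAL while the structure is shared. */
static int switch_root_checked(struct task_model *tsk, const char *new_root)
{
	assert(tsk != NULL && tsk->fs != NULL);
	if (tsk->fs->count > 1)
		return -EINVAL;
	tsk->fs->rootmnt = new_root;
	return 0;
}
```

With two tasks pointing at one fs_model whose count is 2, the naive variant
changes the root seen by both, while the guarded variant leaves it alone and
returns -EINVAL.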

   -eric


Re: [RFC][2.6 patch] Allow creation of new namespaces during mount system call

2005-04-19 Thread Al Viro
On Tue, Apr 19, 2005 at 06:53:29PM -0500, Eric Van Hensbergen wrote:
 On 4/19/05, Al Viro [EMAIL PROTECTED] wrote:
  On Tue, Apr 19, 2005 at 05:13:32PM -0500, Eric Van Hensbergen wrote:
   The motivation behind this patch is to make private namespaces more
   accessible by allowing their creation at mount/bind time.
  
  
  *UGH*
  
  So what happens to those who happen to share task->fs with the parent?
  
 
 Okay, I'll admit to being a bit too hasty with pushing out that patch
 - I was being particularly myopic looking for a solution only for a
 command-line mount.  Are you generally opposed to new namespace
 creation at mount time or just my slimy hack?  A shared task->fs seems
 like something which could be easily checked against and disallowed.

a) ability to create a private namespace without forking anything - sure,
that would be useful.  However, that's not something I would push into
mount(2) (already overloaded to hell and back).

There used to be a kinda-sorta agreement on a new syscall:
unshare(bitmap)
with arguments like those of clone(2).  That's not just for namespaces -
e.g. you might legitimately want to unshare VM in a thread and leave the
rest alone.  Or unshare ->fs (i.e. uncouple cwd from the rest of group).

Most of the code is already there - do_fork() has to do such stuff anyway.
So how about adding sys_unshare(flags) that would do that job?  Flags would
correspond to those of clone(2), except that all these guys would be
"what do we unshare" instead of "what do we leave shared".
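
The cwd case can be tried from userspace today, since mainline Linux later
gained an unshare(2) call of this shape.  The sketch below is not part of the
proposal in this thread: it assumes Linux with the glibc unshare() wrapper
from <sched.h>, and the names worker_result, worker and run_unshare_fs_demo
are made up for the example.  A thread started with pthread_create() shares
its cwd with the rest of the process; after unshare(CLONE_FS) its chdir()
should no longer be visible to the others.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <pthread.h>
#include <sched.h>
#include <string.h>
#include <unistd.h>

/* What the worker thread reports back (hypothetical helper type). */
struct worker_result {
	int err;            /* 0 on success, otherwise an errno value */
	char cwd[PATH_MAX]; /* the worker's cwd after its chdir() */
};

/* Detach this thread's fs state from the group, then move its cwd to "/". */
static void *worker(void *arg)
{
	struct worker_result *res = arg;

	res->err = 0;
	/* chdir() is skipped if unshare() fails, so the group is untouched. */
	if (unshare(CLONE_FS) != 0 || chdir("/") != 0 ||
	    getcwd(res->cwd, sizeof(res->cwd)) == NULL)
		res->err = errno;
	return NULL;
}

/*
 * Run the worker and compare the calling thread's cwd before and after.
 * Returns 1 if it is unchanged, 0 if it changed, -1 on error.
 */
static int run_unshare_fs_demo(struct worker_result *res)
{
	char before[PATH_MAX], after[PATH_MAX];
	pthread_t tid;

	assert(res != NULL);
	if (getcwd(before, sizeof(before)) == NULL)
		return -1;
	if (pthread_create(&tid, NULL, worker, res) != 0)
		return -1;
	if (pthread_join(tid, NULL) != 0)
		return -1;
	if (getcwd(after, sizeof(after)) == NULL)
		return -1;
	return strcmp(before, after) == 0;
}
```

Build with -pthread.  If unshare() is refused (for example by a sandbox),
res->err holds the errno value and the worker never changes directory.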


b) I _really_ don't like the idea of messing with the parent.  Make it
a shell builtin if you want to affect shell behaviour; the same reason
why cd is a builtin and not an external command.


c) I would be really, really careful with implications of "let user
do whatever he wants" - that certainly should include bindings and
that can create heaps of fun for suid stuff.  More comments when
I get around to digging through FUSE thread...