Re: [patch 0/8] unprivileged mount syscall
> >> Arn't there ways to escape chroot jails? Serge had pointed me to a URL > >> which showed chroots can be escaped. And if that is true than having all > >> user's private mount tree in the same namespace can be a security issue? > > > > No. In fact chrooting the user into /share/$USER will actually > > _grant_ a privilege to the user, instead of taking it away. It allows > > the user to modify it's root namespace, which it wouldn't be able to > > in the initial namespace. > > > > So even if the user could escape from the chroot (which I doubt), s/he > > would not be able to do any harm, since unprivileged mounting would be > > restricted to /share. Also /share/$USER should only have read/search > > permission for $USER or no permissions at all, which would mean, that > > other users' namespaces would be safe from tampering as well. > > A couple of points. > - chroot can be escaped, it is just a chdir for the root directory > it is not a security feature. The only security is that you have to > be root to call chdir. A carefully done namespace setup won't have > that issue. > > - While it may not violate security as far as what a user is allowed > to modify it may violate security as far as what a user is allowed > to see. I think that's just up to the permissions in the global namespace. In this example if you 'chmod 0 /share' there won't be anything for the user to see. > There are interesting per login cases as well such as allowing a > user to replicate their mount tree from another machine when they > log in. When /home is on a network filesystem this can be very > practical and can allow propagation of mounts across machines not > just across a single login session. Yeah, sounds interesting, but I think it's better to get the basics working first, and then we can start to think about the extras. Btw, there's nothing that prevents cloning the namespace _after_ chrooting into the per-user tree. 
That would still be simpler than doing it the other way round: first creating per-session namespaces and then setting up mount propagation between them. Miklos - To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
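The ordering mentioned above (chroot into the per-user tree first, clone the namespace afterwards) could look roughly like the dry-run sketch below. It only prints a command line and executes nothing. The function name session_cmd is hypothetical, and using unshare(1) from util-linux to do the clone is an assumption, not something proposed in the thread; running the printed command would need root.

```shell
#!/bin/sh
# Dry-run sketch: print, but do not run, a command that first chroots into
# the per-user tree and only then unshares the mount namespace.
# session_cmd is a hypothetical helper name.
session_cmd() {
    user=$1
    shell=${2:-/bin/sh}
    # chroot runs first; unshare --mount then clones the mount namespace
    # from inside the chroot before starting the shell.
    echo "chroot /share/$user unshare --mount $shell"
}

session_cmd alice
```

Because /share/$USER is a recursive bind of /, the unshare binary is expected to be reachable inside the chroot; that too is an assumption of this sketch.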
Re: [patch 0/8] unprivileged mount syscall
Miklos Szeredi <[EMAIL PROTECTED]> writes:

>> Arn't there ways to escape chroot jails? Serge had pointed me to a URL
>> which showed chroots can be escaped. And if that is true than having all
>> user's private mount tree in the same namespace can be a security issue?
>
> No. In fact chrooting the user into /share/$USER will actually
> _grant_ a privilege to the user, instead of taking it away. It allows
> the user to modify it's root namespace, which it wouldn't be able to
> in the initial namespace.
>
> So even if the user could escape from the chroot (which I doubt), s/he
> would not be able to do any harm, since unprivileged mounting would be
> restricted to /share. Also /share/$USER should only have read/search
> permission for $USER or no permissions at all, which would mean, that
> other users' namespaces would be safe from tampering as well.

A couple of points.
- chroot can be escaped; it is just a chdir for the root directory,
  not a security feature. The only security is that you have to
  be root to call chroot. A carefully done namespace setup won't have
  that issue.

- While it may not violate security as far as what a user is allowed
  to modify, it may violate security as far as what a user is allowed
  to see.

There are interesting per-login cases as well, such as allowing a
user to replicate their mount tree from another machine when they
log in. When /home is on a network filesystem this can be very
practical and can allow propagation of mounts across machines, not
just across a single login session.

Eric
Re: [patch 0/8] unprivileged mount syscall
> Arn't there ways to escape chroot jails? Serge had pointed me to a URL
> which showed chroots can be escaped. And if that is true than having all
> user's private mount tree in the same namespace can be a security issue?

No. In fact chrooting the user into /share/$USER will actually
_grant_ a privilege to the user, instead of taking it away. It allows
the user to modify its root namespace, which it wouldn't be able to do
in the initial namespace.

So even if the user could escape from the chroot (which I doubt), s/he
would not be able to do any harm, since unprivileged mounting would be
restricted to /share. Also, /share/$USER should only have read/search
permission for $USER or no permissions at all, which would mean that
other users' namespaces would be safe from tampering as well.

Miklos
Re: [patch 0/8] unprivileged mount syscall
On Fri, 2007-04-13 at 16:05 +0200, Miklos Szeredi wrote: > > > Thinking a bit more about this, I'm quite sure most users wouldn't > > > even want private namespaces. It would be enough to > > > > > > chroot /share/$USER > > > > > > and be done with it. > > > > > > Private namespaces are only good for keeping a bunch of mounts > > > referenced by a group of processes. But my guess is, that the natural > > > behavior for users is to see a persistent set of mounts. > > > > > > If for example they mount something on a remote machine, then log out > > > from the ssh session and later log back in, they would want to see > > > their previous mount still there. > > > > > > Miklos > > > > Agreed on desired behavior, but not on chroot sufficing. It actually > > sounds like you want exactly what was outlined in the OLS paper. > > > > Users still need to be in a different mounts namespace from the admin > > user so long as we consider the deluser and backup problems > > I don't think it matters, because /share/$USER duplicates a part or > the whole of the user's namespace. > > So backup would have to be taught about /share anyway, and deluser > operates on /home/$USER and not on /share/*, so there shouldn't be any > problem. > > There's actually very little difference between rbind+chroot, and > CLONE_NEWNS. In a private namespace: > > 1) when no more processes reference the namespace, the tree will be > disbanded > > 2) the mount tree won't be accessible from outside the namespace > > Wanting a persistent namespace contradicts 1). > > Wanting a per-user (as opposed to per-session) namespace contradicts > 2). The namespace _has_ to be accessible from outside, so that a new > session can access/copy it. As i mentioned in the previous mail, disbanding all the namespaces of a user will not disband his mount tree, because a mirror of the mount tree still continues to exist in /share/$USER in the admin namespace. 
And a new user session can always use this copy to create a namespace
that looks identical to the one that existed earlier.

> So both requirements point to the rbind/chroot solution.

Aren't there ways to escape chroot jails? Serge had pointed me to a URL
which showed that chroots can be escaped. And if that is true, then
having all users' private mount trees in the same namespace can be a
security issue?

RP

> Miklos
Re: [patch 0/8] unprivileged mount syscall
On Fri, 2007-04-13 at 13:58 +0200, Miklos Szeredi wrote: > > On Wed, 2007-04-11 at 12:44 +0200, Miklos Szeredi wrote: > > > > 1. clone the master namespace. > > > > > > > > 2. in the new namespace > > > > > > > > move the tree under /share/$me to / > > > > for each ($user, $what, $how) { > > > > move /share/$user/$what to /$what > > > > if ($how == slave) { > > > > make the mount tree under /$what as slave > > > > } > > > > } > > > > > > > > 3. in the new namespace make the tree under > > > >/share as private and unmount /share > > > > > > Thanks. I get the basic idea now: the namespace itself need not be > > > shared between the sessions, it is enough if "share" propagation is > > > set up between the different namespaces of a user. > > > > > > I don't yet see either in your or Viro's description how the trees > > > under /share/$USER are initialized. I guess they are recursively > > > bound from /, and are made slaves. > > > > yes. I suppose, when a userid is created one of the steps would be > > > > mount --rbind / /share/$USER > > mount --make-rslave /share/$USER > > mount --make-rshared /share/$USER > > Thinking a bit more about this, I'm quite sure most users wouldn't > even want private namespaces. It would be enough to > > chroot /share/$USER > > and be done with it. > > Private namespaces are only good for keeping a bunch of mounts > referenced by a group of processes. But my guess is, that the natural > behavior for users is to see a persistent set of mounts. > > If for example they mount something on a remote machine, then log out > from the ssh session and later log back in, they would want to see > their previous mount still there. They will continue see their previous mount tree. Even if all the namespaces belonging to the different sessions of the user get dismantled when all the sessions exit, the a mirror of those mount trees continue to exist under /share/$USER in the original namespace. So I don't think we have a issue. 
NOTE: when I say 'original namespace' I mean the admin namespace, the
first namespace that gets created when the machine boots.

RP

> Miklos
Re: [patch 0/8] unprivileged mount syscall
Quoting Miklos Szeredi ([EMAIL PROTECTED]): > > > > Agreed on desired behavior, but not on chroot sufficing. It actually > > > > sounds like you want exactly what was outlined in the OLS paper. > > > > > > > > Users still need to be in a different mounts namespace from the admin > > > > user so long as we consider the deluser and backup problems > > > > > > I don't think it matters, because /share/$USER duplicates a part or > > > the whole of the user's namespace. > > > > > > So backup would have to be taught about /share anyway, and deluser > > > operates on /home/$USER and not on /share/*, so there shouldn't be any > > > problem. > > > > In what I was thinking of, /share/$USER is bind mounted to > > ~$USER/share, so it would have to be done in a private namespace in > > order for deluser to not be tricked. > > But /share/$USER is surely not bind mounted to ~$USER/share in the > _global_ namespace, is it? I can't see any sense in that. No it's not, only in the private namespace. > > > There's actually very little difference between rbind+chroot, and > > > CLONE_NEWNS. In a private namespace: > > > > > > 1) when no more processes reference the namespace, the tree will be > > > disbanded > > > > > > 2) the mount tree won't be accessible from outside the namespace > > > > But it *can* be, if properly set up. That's part of the point of the > > example in the OLS paper. When a user logs in, sshd clones a new > > namespace, then bind-mounts /share/$USER into ~$USER/share. So assuming > > that /share/$USER was --make-shared'd, it and ~$USER are now in the > > same peer group, and any changes made by the user under ~$USER will > > be reflected back into /share/$USER. > > I acknowledge, that it can be done. My point was that it can be done > more simply _without_ using CLONE_NS. Seems like a matter of preference, but I see what you're saying. > > > Wanting a persistent namespace contradicts 1). > > > > Not necessarily, see above. 
> > > Wanting a per-user (as opposed to per-session) namespace contradicts
> > > 2). The namespace _has_ to be accessible from outside, so that a new
> > > session can access/copy it.
> >
> > Again, I *think* you are wrong that private namespace contradicts this
> > requirement.
>
> I'm not saying there's any contradiction, I'm saying rbind+chroot is a
> better fit.

Ok, I see.

> I haven't yet heard a single reason why a per-session namespace with
> parts shared per-user is better than just a per-user namespace.

In fact I suspect we could show that they are functionally equivalent
(for your purposes) by drawing the fs tree and peer groups from
current->fs->root on up for both methods.

And not using private namespaces leaves the admin (at least for now)
better able to diagnose the state of the system.

-serge
Re: [patch 0/8] unprivileged mount syscall
> > > Agreed on desired behavior, but not on chroot sufficing. It actually > > > sounds like you want exactly what was outlined in the OLS paper. > > > > > > Users still need to be in a different mounts namespace from the admin > > > user so long as we consider the deluser and backup problems > > > > I don't think it matters, because /share/$USER duplicates a part or > > the whole of the user's namespace. > > > > So backup would have to be taught about /share anyway, and deluser > > operates on /home/$USER and not on /share/*, so there shouldn't be any > > problem. > > In what I was thinking of, /share/$USER is bind mounted to > ~$USER/share, so it would have to be done in a private namespace in > order for deluser to not be tricked. But /share/$USER is surely not bind mounted to ~$USER/share in the _global_ namespace, is it? I can't see any sense in that. > > There's actually very little difference between rbind+chroot, and > > CLONE_NEWNS. In a private namespace: > > > > 1) when no more processes reference the namespace, the tree will be > > disbanded > > > > 2) the mount tree won't be accessible from outside the namespace > > But it *can* be, if properly set up. That's part of the point of the > example in the OLS paper. When a user logs in, sshd clones a new > namespace, then bind-mounts /share/$USER into ~$USER/share. So assuming > that /share/$USER was --make-shared'd, it and ~$USER are now in the > same peer group, and any changes made by the user under ~$USER will > be reflected back into /share/$USER. I acknowledge, that it can be done. My point was that it can be done more simply _without_ using CLONE_NS. > > Wanting a persistent namespace contradicts 1). > > Not necessarily, see above. > > > Wanting a per-user (as opposed to per-session) namespace contradicts > > 2). The namespace _has_ to be accessible from outside, so that a new > > session can access/copy it. > > Again, I *think* you are wrong that private namespace contradicts this > requirement. 
I'm not saying there's any contradiction; I'm saying rbind+chroot is a
better fit.

I haven't yet heard a single reason why a per-session namespace with
parts shared per-user is better than just a per-user namespace.

Miklos
Re: [patch 0/8] unprivileged mount syscall
> > Thinking a bit more about this, I'm quite sure most users wouldn't > > even want private namespaces. It would be enough to > > > > chroot /share/$USER > > > > and be done with it. > > I don't think so. How to you want to implement non-shared /tmp > directories? mount --bind /.tmp/$USER /share/$USER/tmp or whatever else this polyunsaturated thingy does within the cloned namespace. > The chroot is overkill in this case. What do you mean it's an overkill? clone(CLONE_NS) duplicates all the mounts, just as mount --rbind does. > > Private namespaces are only good for keeping a bunch of mounts > > referenced by a group of processes. But my guess is, that the natural > > behavior for users is to see a persistent set of mounts. > > > > If for example they mount something on a remote machine, then log out > > from the ssh session and later log back in, they would want to see > > their previous mount still there. > > They can mount to /mnt where the directory is shared ("mount > --make-shared /mnt") and visible and all namespaces. > > I think /share/$USER is an extreme example. You can found more > situations when private namespaces are nice solution. Private to a single login session? I'd like to hear examples. Thanks, Miklos - To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
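The non-shared /tmp answer given above (a bind mount under the per-user tree) can be sketched as a dry run that only prints commands. The function name private_tmp_cmds is hypothetical, and the mkdir step is an assumption added so that the bind source exists; only the mount --bind line comes from the mail.

```shell
#!/bin/sh
# Dry-run sketch: print, but do not run, the commands that give one user a
# private /tmp inside /share/$USER. private_tmp_cmds is a hypothetical name.
private_tmp_cmds() {
    user=$1
    # Assumption: the backing directory may not exist yet, so create it.
    echo "mkdir -p /.tmp/$user"
    # From the mail: bind the per-user directory over tmp in the user's tree.
    echo "mount --bind /.tmp/$user /share/$user/tmp"
}

private_tmp_cmds alice
```

Ownership and mode of /.tmp/$USER are left out on purpose; the thread does not specify them.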
Re: [patch 0/8] unprivileged mount syscall
Quoting Miklos Szeredi ([EMAIL PROTECTED]): > > > Thinking a bit more about this, I'm quite sure most users wouldn't > > > even want private namespaces. It would be enough to > > > > > > chroot /share/$USER > > > > > > and be done with it. > > > > > > Private namespaces are only good for keeping a bunch of mounts > > > referenced by a group of processes. But my guess is, that the natural > > > behavior for users is to see a persistent set of mounts. > > > > > > If for example they mount something on a remote machine, then log out > > > from the ssh session and later log back in, they would want to see > > > their previous mount still there. > > > > > > Miklos > > > > Agreed on desired behavior, but not on chroot sufficing. It actually > > sounds like you want exactly what was outlined in the OLS paper. > > > > Users still need to be in a different mounts namespace from the admin > > user so long as we consider the deluser and backup problems > > I don't think it matters, because /share/$USER duplicates a part or > the whole of the user's namespace. > > So backup would have to be taught about /share anyway, and deluser > operates on /home/$USER and not on /share/*, so there shouldn't be any > problem. In what I was thinking of, /share/$USER is bind mounted to ~$USER/share, so it would have to be done in a private namespace in order for deluser to not be tricked. > There's actually very little difference between rbind+chroot, and > CLONE_NEWNS. In a private namespace: > > 1) when no more processes reference the namespace, the tree will be > disbanded > > 2) the mount tree won't be accessible from outside the namespace But it *can* be, if properly set up. That's part of the point of the example in the OLS paper. When a user logs in, sshd clones a new namespace, then bind-mounts /share/$USER into ~$USER/share. 
So assuming that /share/$USER was --make-shared'd, it and ~$USER are now
in the same peer group, and any changes made by the user under ~$USER
will be reflected back into /share/$USER.

> Wanting a persistent namespace contradicts 1).

Not necessarily, see above.

> Wanting a per-user (as opposed to per-session) namespace contradicts
> 2). The namespace _has_ to be accessible from outside, so that a new
> session can access/copy it.

Again, I *think* you are wrong that private namespace contradicts this
requirement.

> So both requirements point to the rbind/chroot solution.

It all points to a combination of the two :-)

-serge
Re: [patch 0/8] unprivileged mount syscall
On Fri, Apr 13, 2007 at 01:58:59PM +0200, Miklos Szeredi wrote: > > On Wed, 2007-04-11 at 12:44 +0200, Miklos Szeredi wrote: > > > > 1. clone the master namespace. > > > > > > > > 2. in the new namespace > > > > > > > > move the tree under /share/$me to / > > > > for each ($user, $what, $how) { > > > > move /share/$user/$what to /$what > > > > if ($how == slave) { > > > > make the mount tree under /$what as slave > > > > } > > > > } > > > > > > > > 3. in the new namespace make the tree under > > > >/share as private and unmount /share > > > > > > Thanks. I get the basic idea now: the namespace itself need not be > > > shared between the sessions, it is enough if "share" propagation is > > > set up between the different namespaces of a user. > > > > > > I don't yet see either in your or Viro's description how the trees > > > under /share/$USER are initialized. I guess they are recursively > > > bound from /, and are made slaves. > > > > yes. I suppose, when a userid is created one of the steps would be > > > > mount --rbind / /share/$USER > > mount --make-rslave /share/$USER > > mount --make-rshared /share/$USER > > Thinking a bit more about this, I'm quite sure most users wouldn't > even want private namespaces. It would be enough to > > chroot /share/$USER > > and be done with it. I don't think so. How to you want to implement non-shared /tmp directories? The chroot is overkill in this case. See: http://www.coker.com.au/selinux/talks/sage-2006/PolyInstantiatedDirectories.html http://danwalsh.livejournal.com/ > Private namespaces are only good for keeping a bunch of mounts > referenced by a group of processes. But my guess is, that the natural > behavior for users is to see a persistent set of mounts. > > If for example they mount something on a remote machine, then log out > from the ssh session and later log back in, they would want to see > their previous mount still there. 
They can mount to /mnt where the directory is shared ("mount
--make-shared /mnt") and visible in all namespaces.

I think /share/$USER is an extreme example. You can find more
situations where private namespaces are a nice solution.

Karel

--
Karel Zak <[EMAIL PROTECTED]>
Re: [patch 0/8] unprivileged mount syscall
> > Thinking a bit more about this, I'm quite sure most users wouldn't > > even want private namespaces. It would be enough to > > > > chroot /share/$USER > > > > and be done with it. > > > > Private namespaces are only good for keeping a bunch of mounts > > referenced by a group of processes. But my guess is, that the natural > > behavior for users is to see a persistent set of mounts. > > > > If for example they mount something on a remote machine, then log out > > from the ssh session and later log back in, they would want to see > > their previous mount still there. > > > > Miklos > > Agreed on desired behavior, but not on chroot sufficing. It actually > sounds like you want exactly what was outlined in the OLS paper. > > Users still need to be in a different mounts namespace from the admin > user so long as we consider the deluser and backup problems I don't think it matters, because /share/$USER duplicates a part or the whole of the user's namespace. So backup would have to be taught about /share anyway, and deluser operates on /home/$USER and not on /share/*, so there shouldn't be any problem. There's actually very little difference between rbind+chroot, and CLONE_NEWNS. In a private namespace: 1) when no more processes reference the namespace, the tree will be disbanded 2) the mount tree won't be accessible from outside the namespace Wanting a persistent namespace contradicts 1). Wanting a per-user (as opposed to per-session) namespace contradicts 2). The namespace _has_ to be accessible from outside, so that a new session can access/copy it. So both requirements point to the rbind/chroot solution. Miklos - To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [patch 0/8] unprivileged mount syscall
Quoting Miklos Szeredi ([EMAIL PROTECTED]): > > On Wed, 2007-04-11 at 12:44 +0200, Miklos Szeredi wrote: > > > > 1. clone the master namespace. > > > > > > > > 2. in the new namespace > > > > > > > > move the tree under /share/$me to / > > > > for each ($user, $what, $how) { > > > > move /share/$user/$what to /$what > > > > if ($how == slave) { > > > > make the mount tree under /$what as slave > > > > } > > > > } > > > > > > > > 3. in the new namespace make the tree under > > > >/share as private and unmount /share > > > > > > Thanks. I get the basic idea now: the namespace itself need not be > > > shared between the sessions, it is enough if "share" propagation is > > > set up between the different namespaces of a user. > > > > > > I don't yet see either in your or Viro's description how the trees > > > under /share/$USER are initialized. I guess they are recursively > > > bound from /, and are made slaves. > > > > yes. I suppose, when a userid is created one of the steps would be > > > > mount --rbind / /share/$USER > > mount --make-rslave /share/$USER > > mount --make-rshared /share/$USER > > Thinking a bit more about this, I'm quite sure most users wouldn't > even want private namespaces. It would be enough to > > chroot /share/$USER > > and be done with it. > > Private namespaces are only good for keeping a bunch of mounts > referenced by a group of processes. But my guess is, that the natural > behavior for users is to see a persistent set of mounts. > > If for example they mount something on a remote machine, then log out > from the ssh session and later log back in, they would want to see > their previous mount still there. > > Miklos Agreed on desired behavior, but not on chroot sufficing. It actually sounds like you want exactly what was outlined in the OLS paper. 
Users still need to be in a different mounts namespace from the admin
user so long as we consider the deluser and backup problems to be
legitimate problems (well, so long as user mounts are allowed).

So, when they log in, pam gives them a new namespace and chroots them
into /share/$USER.

Assuming I'm thinking clearly :)

-serge
Re: [patch 0/8] unprivileged mount syscall
> On Wed, 2007-04-11 at 12:44 +0200, Miklos Szeredi wrote: > > > 1. clone the master namespace. > > > > > > 2. in the new namespace > > > > > > move the tree under /share/$me to / > > > for each ($user, $what, $how) { > > > move /share/$user/$what to /$what > > > if ($how == slave) { > > > make the mount tree under /$what as slave > > > } > > > } > > > > > > 3. in the new namespace make the tree under > > >/share as private and unmount /share > > > > Thanks. I get the basic idea now: the namespace itself need not be > > shared between the sessions, it is enough if "share" propagation is > > set up between the different namespaces of a user. > > > > I don't yet see either in your or Viro's description how the trees > > under /share/$USER are initialized. I guess they are recursively > > bound from /, and are made slaves. > > yes. I suppose, when a userid is created one of the steps would be > > mount --rbind / /share/$USER > mount --make-rslave /share/$USER > mount --make-rshared /share/$USER Thinking a bit more about this, I'm quite sure most users wouldn't even want private namespaces. It would be enough to chroot /share/$USER and be done with it. Private namespaces are only good for keeping a bunch of mounts referenced by a group of processes. But my guess is, that the natural behavior for users is to see a persistent set of mounts. If for example they mount something on a remote machine, then log out from the ssh session and later log back in, they would want to see their previous mount still there. Miklos - To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [patch 0/8] unprivileged mount syscall
Quoting Miklos Szeredi ([EMAIL PROTECTED]):
> > Not objecting to prctl(), but two other options would be
> >
> > 1. add a CLONE_NEW_NS_USERMNT flag - kind of ugly, but that is
> >    the time at which the ns is created, so in that sense it
> >    makes sense.
>
> Yes, I thought about this, but there's no easy way to set the flag for
> the initial namespace, and a second flag CLONE_NEW_NS_NOUSERMNT would
> be needed to turn off the flag.

Not mentioning it would 'turn it off' for the cloned ns, but the
default value for the initial namespace is still a problem.

> > 2. use the nsproxy container subsystem (see Paul Menage's
> >    containers patchset) to set this using, e.g.,
> >
> >        echo 1 > /containers/vserver1/mounts/usermount
>
> That again would lose some flexibility: only namespaces which
> are part of a container could be manipulated.

In the nsproxy subsystem, every namespace gets a container so long as
the nsproxy subsystem is mounted.

> Does that exclude the
> initial namespace?

No, the initial namespace is tied to the root dentry - so if, as my
example was assuming, you've done

    mount -t container -o ns none /containers

then to change the setting for the initial namespace you would

    echo 0 > /containers/mounts/usermount

> Also how would a process find out which vserver it is running in?

    cat /proc/$$/container

-serge
Re: [patch 0/8] unprivileged mount syscall
> Not objecting to prctl(), but two other options would be
>
> 1. add a CLONE_NEW_NS_USERMNT flag - kind of ugly, but that is
>    the time at which the ns is created, so in that sense it
>    makes sense.

Yes, I thought about this, but there's no easy way to set the flag for
the initial namespace, and a second flag CLONE_NEW_NS_NOUSERMNT would
be needed to turn off the flag.

> 2. use the nsproxy container subsystem (see Paul Menage's
>    containers patchset) to set this using, e.g.,
>
>        echo 1 > /containers/vserver1/mounts/usermount

That again would lose some flexibility: only namespaces which
are part of a container could be manipulated. Does that exclude the
initial namespace? Also how would a process find out which vserver it
is running in?

Thanks,
Miklos
Re: [patch 0/8] unprivileged mount syscall
Quoting Miklos Szeredi ([EMAIL PROTECTED]): > > It would be nice in general if we could avoid any sort of checks for > > (mnt->mnt_ns == init_nsproxy.mnt_ns). Maybe that won't be possible, > > but, taking the two listed examples: > > [snip] > > It's probably worthwile going after these problematic cases, and > fixing them, OTOH it's not easy to audit a complete system for holes > arising from user mounts in the global namespace. > > So why not move this decision out from the kernel? How about adding a > boolean flag to namespaces, which specifies whether unprivileged > mounts are allowed or not. This would give complete flexibility to > distro builders and sysadmins. > > The biggest problem I see is how to set this flag. There's no easy > way to represent namespaces in /proc or /sys, and this is sufficiently > obscure not to warrant a new syscall. Adding a new flag to prctl() > could do the trick. Does that sound OK? Not objecting to prctl(), but two other options would be 1. add a CLONE_NEW_NS_USERMNT flag - kind of ugly, but that is the time at which the ns is created, so in that sense it makes sense. 2. use the nsproxy container subsystem (see Paul Menage's containers patchset) to set this using, e.g., echo 1 > /containers/vserver1/mounts/usermount The prctl() method has a huge advantage of being implementable right now. -serge - To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [patch 0/8] unprivileged mount syscall
> It would be nice in general if we could avoid any sort of checks for
> (mnt->mnt_ns == init_nsproxy.mnt_ns). Maybe that won't be possible,
> but, taking the two listed examples:

[snip]

It's probably worthwhile going after these problematic cases, and
fixing them, OTOH it's not easy to audit a complete system for holes
arising from user mounts in the global namespace.

So why not move this decision out from the kernel? How about adding a
boolean flag to namespaces, which specifies whether unprivileged
mounts are allowed or not. This would give complete flexibility to
distro builders and sysadmins.

The biggest problem I see is how to set this flag. There's no easy
way to represent namespaces in /proc or /sys, and this is sufficiently
obscure not to warrant a new syscall. Adding a new flag to prctl()
could do the trick. Does that sound OK?

Thanks,
Miklos
Re: [patch 0/8] unprivileged mount syscall
On Wed, 2007-04-11 at 12:44 +0200, Miklos Szeredi wrote:
> > 1. clone the master namespace.
> >
> > 2. in the new namespace
> >
> >        move the tree under /share/$me to /
> >        for each ($user, $what, $how) {
> >            move /share/$user/$what to /$what
> >            if ($how == slave) {
> >                make the mount tree under /$what as slave
> >            }
> >        }
> >
> > 3. in the new namespace make the tree under
> >    /share as private and unmount /share
>
> Thanks. I get the basic idea now: the namespace itself need not be
> shared between the sessions, it is enough if "share" propagation is
> set up between the different namespaces of a user.
>
> I don't yet see either in your or Viro's description how the trees
> under /share/$USER are initialized. I guess they are recursively
> bound from /, and are made slaves.

yes. I suppose, when a userid is created one of the steps would be

    mount --rbind / /share/$USER
    mount --make-rslave /share/$USER
    mount --make-rshared /share/$USER

RP

> Miklos
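The three commands above can be wrapped in a small helper. This is a
minimal sketch, not something from the patchset: the function name
init_user_tree and the DRY_RUN switch are made up for illustration, and
/share/$user is assumed to exist already. With DRY_RUN=1 the commands
are only printed, since the real calls need root.

```shell
#!/bin/sh
# Sketch: initialise the per-user tree under /share as described above.
# init_user_tree and DRY_RUN are hypothetical names used for illustration.

run() {
    # Print the command in dry-run mode, otherwise execute it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

init_user_tree() {
    user=$1
    # Replicate the whole mount tree under /share/$user.
    run mount --rbind / "/share/$user"
    # Make the replica receive mount events from the original tree.
    run mount --make-rslave "/share/$user"
    # Make it shared as well, so later clones of it stay in sync.
    run mount --make-rshared "/share/$user"
}
```

Running it with DRY_RUN=1 and a user name prints the three mount(8)
invocations in order; a mount that is first made a slave and then
shared ends up as both slave and shared.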
Re: [patch 0/8] unprivileged mount syscall
Quoting Ian Kent ([EMAIL PROTECTED]):
> On Wed, 2007-04-11 at 09:26 -0500, Serge E. Hallyn wrote:
> > Quoting Ian Kent ([EMAIL PROTECTED]):
> > > On Wed, 2007-04-11 at 12:48 +0200, Miklos Szeredi wrote:
> > > > > > >>
> > > > > > >> - users can use bind mounts without having to
> > > > > > >>   pre-configure them in /etc/fstab
> > > > > > >>
> > > > > >
> > > > > > This is by far the biggest concern I see. I think the security
> > > > > > implication of allowing anyone to do bind mounts are poorly
> > > > > > understood.
> > > > >
> > > > > And especially so since there is no way for a filesystem module
> > > > > to veto such requests.
> > > >
> > > > The filesystem can't veto initial mounts based on destination either.
> > > > I don't think it's up to the filesystem to police bind/move mounts in
> > > > any way.
> > >
> > > But if a filesystem can't or the developer thinks that it shouldn't for
> > > some reason, support bind/move mounts then there should be a way for the
> >
> > Can you list some valid reasons why an fs could care where it is
> > mounted? The only thing I could think of is a stackable fs, but it
> > shouldn't care whether it is overlay-mounted or not.
>
> For my part, autofs and autofs4.

Ah, thanks. I can see I'm going to have to start using autofs to get
to know the implementation, because it seems clear we'll run into it
in the containers work again (beyond the struct pid conv) at some
point.

> Moving or binding isn't valid.
> I tried to design that limitation out of version 5 but wasn't able to.
> In time I probably can but couldn't continue to support older versions.

thanks,
-serge

> >
> > thanks,
> > -serge
> >
> > > filesystem to tell the kernel that.
> > >
> > > Surely a filesystem is in a good position to be able to decide if a
> > > mount request "for it" should be allowed to continue based on its "own
> > > situation and capabilities".
> > >
> > > Ian
Re: [patch 0/8] unprivileged mount syscall
On Wed, 2007-04-11 at 09:26 -0500, Serge E. Hallyn wrote:
> Quoting Ian Kent ([EMAIL PROTECTED]):
> > On Wed, 2007-04-11 at 12:48 +0200, Miklos Szeredi wrote:
> > > > > >>
> > > > > >> - users can use bind mounts without having to pre-configure them in
> > > > > >>   /etc/fstab
> > > > > >>
> > > > >
> > > > > This is by far the biggest concern I see. I think the security
> > > > > implication of allowing anyone to do bind mounts are poorly
> > > > > understood.
> > > >
> > > > And especially so since there is no way for a filesystem module to veto
> > > > such requests.
> > >
> > > The filesystem can't veto initial mounts based on destination either.
> > > I don't think it's up to the filesystem to police bind/move mounts in
> > > any way.
> >
> > But if a filesystem can't or the developer thinks that it shouldn't for
> > some reason, support bind/move mounts then there should be a way for the
>
> Can you list some valid reasons why an fs could care where it is
> mounted? The only thing I could think of is a stackable fs, but it
> shouldn't care whether it is overlay-mounted or not.

For my part, autofs and autofs4.

Moving or binding isn't valid.
I tried to design that limitation out of version 5 but wasn't able to.
In time I probably can but couldn't continue to support older versions.

>
> thanks,
> -serge
>
> > filesystem to tell the kernel that.
> >
> > Surely a filesystem is in a good position to be able to decide if a
> > mount request "for it" should be allowed to continue based on its "own
> > situation and capabilities".
> >
> > Ian
Re: [patch 0/8] unprivileged mount syscall
Quoting Ian Kent ([EMAIL PROTECTED]):
> On Wed, 2007-04-11 at 12:48 +0200, Miklos Szeredi wrote:
> > > > >>
> > > > >> - users can use bind mounts without having to pre-configure them in
> > > > >>   /etc/fstab
> > > > >>
> > > >
> > > > This is by far the biggest concern I see. I think the security
> > > > implication of allowing anyone to do bind mounts are poorly understood.
> > >
> > > And especially so since there is no way for a filesystem module to veto
> > > such requests.
> >
> > The filesystem can't veto initial mounts based on destination either.
> > I don't think it's up to the filesystem to police bind/move mounts in
> > any way.
>
> But if a filesystem can't or the developer thinks that it shouldn't for
> some reason, support bind/move mounts then there should be a way for the

Can you list some valid reasons why an fs could care where it is
mounted? The only thing I could think of is a stackable fs, but it
shouldn't care whether it is overlay-mounted or not.

thanks,
-serge

> filesystem to tell the kernel that.
>
> Surely a filesystem is in a good position to be able to decide if a
> mount request "for it" should be allowed to continue based on it's "own
> situation and capabilities".
>
> Ian
Re: [patch 0/8] unprivileged mount syscall
On Wed, 2007-04-11 at 12:48 +0200, Miklos Szeredi wrote:
> > > >>
> > > >> - users can use bind mounts without having to pre-configure them in
> > > >>   /etc/fstab
> > > >>
> > >
> > > This is by far the biggest concern I see. I think the security
> > > implication of allowing anyone to do bind mounts are poorly understood.
> >
> > And especially so since there is no way for a filesystem module to veto
> > such requests.
>
> The filesystem can't veto initial mounts based on destination either.
> I don't think it's up to the filesystem to police bind/move mounts in
> any way.

But if a filesystem can't, or the developer thinks that it shouldn't
for some reason, support bind/move mounts, then there should be a way
for the filesystem to tell the kernel that.

Surely a filesystem is in a good position to be able to decide if a
mount request "for it" should be allowed to continue based on its "own
situation and capabilities".

Ian
Re: [patch 0/8] unprivileged mount syscall
> > >>
> > >> - users can use bind mounts without having to pre-configure them in
> > >>   /etc/fstab
> > >>
> >
> > This is by far the biggest concern I see. I think the security
> > implication of allowing anyone to do bind mounts are poorly understood.
>
> And especially so since there is no way for a filesystem module to veto
> such requests.

The filesystem can't veto initial mounts based on destination either.
I don't think it's up to the filesystem to police bind/move mounts in
any way.

Miklos
Re: [patch 0/8] unprivileged mount syscall
> 1. clone the master namespace.
>
> 2. in the new namespace
>
>        move the tree under /share/$me to /
>        for each ($user, $what, $how) {
>            move /share/$user/$what to /$what
>            if ($how == slave) {
>                make the mount tree under /$what as slave
>            }
>        }
>
> 3. in the new namespace make the tree under
>    /share as private and unmount /share

Thanks. I get the basic idea now: the namespace itself need not be
shared between the sessions, it is enough if "share" propagation is
set up between the different namespaces of a user.

I don't yet see either in your or Viro's description how the trees
under /share/$USER are initialized. I guess they are recursively
bound from /, and are made slaves.

Miklos
Re: [patch 0/8] unprivileged mount syscall
> > This patchset adds support for keeping mount ownership information in
> > the kernel, and allow unprivileged mount(2) and umount(2) in certain
> > cases.
>
> Well, I'd like to feel all smart and point out some bugs, but the code
> all reads very nicely, seems to work as advertised, and while I won't
> have ltp results until tomorrow, boot test results in so far are all
> successful.
>
> Looks good.

Thanks for the review and testing!

Miklos
Re: [patch 0/8] unprivileged mount syscall
On Mon, Apr 09, 2007 at 10:46:25AM -0700, Ram Pai wrote:
> On Mon, 2007-04-09 at 12:07 -0500, Serge E. Hallyn wrote:
> > Quoting Miklos Szeredi ([EMAIL PROTECTED]):
> > > - need to set up mount propagation from global namespace to private
> > >   ones, mount(8) does not yet have options to configure propagation
> >
> > Hmm, I guess I get lost using my own little systems, and just assumed
> > that shared subtree functionality was making its way up into mount(8).
> > Ram, have you been working on that?
>
> It is in FC6. I don't know the status of upstream util-linux. I did
> submit the patch many times to Adrian Bunk (the then util-linux
> maintainer) and got no response. I have not pushed the patches to the
> new maintainer (Karel Zak?) though.

The "shared-subtree" patch has been applied:

http://git.kernel.org/?p=utils/util-linux-ng/util-linux-ng.git;a=commitdiff;h=389fbea536e4308d9475fa2a89e53e188ce8a0e3;hp=939a997de0c761d29fb7530976ca20da4898703a

    Karel

--
 Karel Zak <[EMAIL PROTECTED]>
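For checking what the propagation options actually did, kernels newer
than the ones discussed in this thread expose the propagation state in
the optional fields of /proc/self/mountinfo (shared:N, master:N,
unbindable; no tag means private). A rough sketch, with propagation_of
as a hypothetical helper name and the file argument added only so the
function can be tried on sample data:

```shell
#!/bin/sh
# Sketch: print "<mount point> <propagation tags>" for every mount.
# propagation_of is an illustrative name, not an existing tool.

propagation_of() {
    file=${1:-/proc/self/mountinfo}
    awk '{
        state = ""
        # Optional fields start at field 7 and end at the "-" separator.
        for (i = 7; i <= NF && $i != "-"; i++)
            state = state (state == "" ? "" : ",") $i
        # Field 5 is the mount point; no optional field means private.
        print $5, (state == "" ? "private" : state)
    }' "$file"
}
```

After something like "mount --make-rshared /share", the line for /share
would be expected to carry a shared:N tag, and a slave mount a master:N
tag. Mount points containing spaces appear with octal escapes in this
file and are printed as such.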
Re: [patch 0/8] unprivileged mount syscall
On Fri, 2007-04-06 at 16:16 -0700, H. Peter Anvin wrote:
> >>
> >> - users can use bind mounts without having to pre-configure them in
> >>   /etc/fstab
> >>
>
> This is by far the biggest concern I see. I think the security
> implication of allowing anyone to do bind mounts are poorly understood.

And especially so since there is no way for a filesystem module to veto
such requests.

Ian
Re: [patch 0/8] unprivileged mount syscall
On Mon, 2007-04-09 at 22:10 +0200, Miklos Szeredi wrote:
> > > The one in pam-0.99.6.3-29.1 in opensuse-10.2 is totally broken. Are
> > > you interested in the details? I can reproduce it, but forgot to note
> > > down the details of the brokenness.
> >
> > I don't know how far removed that is from the one being used by redhat,
> > but assuming it's the same, then redhat-lspp@redhat.com will be
> > very interested.
>
> OK.
>
> > > - user namespace setup: what if user has multiple sessions?
> > >
> > >   1) namespaces are shared? That's tricky because the session needs to
> > >      be a child of a namespace server, not of login. I'm not sure PAM
> > >      can handle this
> > >
> > >   2) or mounts are copied on login? That's not possible currently,
> > >      as there's no way to send a mount between namespaces. Also it's
> > >      tricky to make sure that new mounts are also shared
> >
> > See toward the end of the 'shared subtrees' OLS paper from last year for
> > a suggestion on how to let users effectively 'log in to' an existing
> > private mounts ns.
>
> This?
>
>   1. create a new namespace
>   2. bind /share/$USER to /share
>   3. for each pair ($who, $what) such that
>      /share/$USER/$who/$what exists, look
>      in /share/$who/allowed for "peer $what
>      $USER" or "slave $what $USER". If the
>      former is found, rbind /share/$who/$what
>      on /share/$USER/$who/$what; if the
>      latter is found, do the same and
>      follow with marking subtree under
>      /share/$USER/$who/$what as slave.
>   4. rbind /share/$USER to /share
>   5. mark subtree under /share as private.
>   6. umount -l /share
>
> Well, someone please explain using short words, because I don't
> understand at all.

I am trying to re-construct Viro's thoughts. I think the steps outlined
above, though not accurate, are still insightful.

The idea is -- there is one master namespace, which has under /share a
replica of the mount tree of namespaces belonging to all users.

For example, if there are two users A and B, then in the master
namespace under /share you will find /share/A and /share/B, each
reflecting the mount tree for the namespaces belonging to user-A and
user-B respectively.

Note: /share is a shared mount-tree, which means it can propagate mount
events.

Every time the user logs on to the machine, a new namespace is created
which is a clone of the master namespace. In this new namespace,
/share/$user is made the root of the namespace. Also, if other users
have made part of their namespace available to this user, then those
mounts are also brought under this namespace. And finally the entire
tree under /share is unmounted.

Note, though multiple namespaces can exist simultaneously for the same
user, the user is provided the illusion of per-process-namespace since
all the namespaces look identical.

I am trying to rewrite the steps outlined above, which may or may not
reflect Viro's thoughts, but certainly reflect my reconstruction of
Viro's thoughts.

1. clone the master namespace.

2. in the new namespace

       move the tree under /share/$me to /
       for each ($user, $what, $how) {
           move /share/$user/$what to /$what
           if ($how == slave) {
               make the mount tree under /$what as slave
           }
       }

3. in the new namespace make the tree under
   /share as private and unmount /share

RP

>
> Thanks,
> Miklos
Re: [patch 0/8] unprivileged mount syscall
Quoting Miklos Szeredi ([EMAIL PROTECTED]):
> This patchset adds support for keeping mount ownership information in
> the kernel, and allow unprivileged mount(2) and umount(2) in certain
> cases.

Well, I'd like to feel all smart and point out some bugs, but the code
all reads very nicely, seems to work as advertised, and while I won't
have ltp results until tomorrow, boot test results in so far are all
successful.

Looks good.

-serge

> This can be useful for the following reasons:
>
> - mount(8) can store ownership ("user=XY" option) in the kernel
>   instead, or in addition to storing it in /etc/mtab. For example if
>   private namespaces are used with mount propagations /etc/mtab
>   becomes unworkable, but using /proc/mounts works fine
>
> - fuse won't need a special suid-root mount/umount utility. Plain
>   umount(8) can easily be made to work with unprivileged fuse mounts
>
> - users can use bind mounts without having to pre-configure them in
>   /etc/fstab
>
> All this is done in a secure way, and unprivileged bind and fuse
> mounts are disabled by default and can be enabled through sysctl or
> /proc/sys.
>
> One thing that is missing from this series is the ability to restrict
> user mounts to private namespaces. The reason is that private
> namespaces have still not gained the momentum and support needed for
> painless user experience. So such a feature would not yet get enough
> attention and testing. However adding such an optional restriction
> can be done with minimal changes in the future, once private
> namespaces have matured.
>
> An earlier version of these patches have been discussed here:
>
>   http://lkml.org/lkml/2005/5/3/64
>
> --
Re: [patch 0/8] unprivileged mount syscall
Quoting Miklos Szeredi ([EMAIL PROTECTED]):
> > > One thing that is missing from this series is the ability to restrict
> > > user mounts to private namespaces. The reason is that private
> > > namespaces have still not gained the momentum and support needed for
> > > painless user experience. So such a feature would not yet get enough
> > > attention and testing. However adding such an optional restriction
> > > can be done with minimal changes in the future, once private
> > > namespaces have matured.
> >
> > What is the main reason for that feature? Would it be to prevent things
> > like login from being tricked by user mounts? Isn't it sufficient, in
> > fact, better, to require that the target of the mount be owned by the
> > user doing the mount?
>
> It's been discussed later in that thread. Basically you can fool a

I see now, sorry.

> lot of system programs (like backup) with mounting/binding in the
> global namespace. Restricting the destination doesn't always help.
>
> Miklos

It would be nice in general if we could avoid any sort of checks for
(mnt->mnt_ns == init_nsproxy.mnt_ns). Maybe that won't be possible,
but, taking the two listed examples:

    1. mount --bind / ~/bindns; (later) userdel hallyn

       I assume userdel does a simple stupid rm -rf without first
       umounting, then? So (1) it seems wise to have userdel umount
       anything under ~user first anyway, and (2) if $USER does a
       mount --bind from a source he doesn't own, should we make the
       resulting mount read-only? (realizing the read-only bind mount
       patches are still under development :) Or is that overly
       restrictive somehow for fuse?

    2. backups

       Is this just a 'he's going to fill up the whole disk' issue?
       Frankly, it seems wise to have cron or whatever is spawning
       the backup start in its own namespace right at boot.
       Generally when I think back on sites where I've dealt with
       backup, backups were done on a separate server which didn't
       allow user logins anyway, so it wouldn't be a problem. But I'm
       sure that's a limited (==erroneous) POV.

I do realize that the real problem with corner cases isn't addressing
these two little ones, but the fact that there are more we haven't
thought of. So are there any currently known use cases where requiring
a CLONE_NEWNS before user mounts is unacceptable?

thanks,
-serge
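The "umount anything under ~user first" idea from example 1 can be
sketched roughly as follows. The helper name mounts_under is made up
for illustration, the second argument exists only so the function can
be tried on a sample file, and nothing here is taken from userdel
itself.

```shell
#!/bin/sh
# Sketch: list mount points located under a directory so that they can
# be unmounted before the directory is removed.
# mounts_under is a hypothetical helper, not part of any existing tool.

mounts_under() {
    dir=${1%/}                  # strip one trailing slash, if any
    file=${2:-/proc/mounts}     # field 2 of each line is the mount point
    # Print the path length in front of each matching mount point, sort
    # longest first (a nested mount is always longer than its parent),
    # then drop the length again.
    awk -v d="$dir" '
        index($2, d "/") == 1 { print length($2), $2 }
    ' "$file" | sort -rn | cut -d" " -f2-
}
```

A caller could then run "mounts_under ~hallyn" and umount each printed
path in order before doing the rm -rf. Mount points containing spaces
show up with octal escapes in /proc/mounts and would need decoding
first, which this sketch does not attempt.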
Re: [patch 0/8] unprivileged mount syscall
> > One thing that is missing from this series is the ability to restrict
> > user mounts to private namespaces. The reason is that private
> > namespaces have still not gained the momentum and support needed for
> > painless user experience. So such a feature would not yet get enough
> > attention and testing. However adding such an optional restriction
> > can be done with minimal changes in the future, once private
> > namespaces have matured.
>
> What is the main reason for that feature? Would it be to prevent things
> like login from being tricked by user mounts? Isn't it sufficient, in
> fact, better, to require that the target of the mount be owned by the
> user doing the mount?

It's been discussed later in that thread. Basically you can fool a
lot of system programs (like backup) with mounting/binding in the
global namespace. Restricting the destination doesn't always help.

Miklos
Re: [patch 0/8] unprivileged mount syscall
> > The one in pam-0.99.6.3-29.1 in opensuse-10.2 is totally broken. Are
> > you interested in the details? I can reproduce it, but forgot to note
> > down the details of the brokenness.
>
> I don't know how far removed that is from the one being used by redhat,
> but assuming it's the same, then redhat-lspp@redhat.com will be
> very interested.

OK.

> > - user namespace setup: what if user has multiple sessions?
> >
> >   1) namespaces are shared? That's tricky because the session needs to
> >      be a child of a namespace server, not of login. I'm not sure PAM
> >      can handle this
> >
> >   2) or mounts are copied on login? That's not possible currently,
> >      as there's no way to send a mount between namespaces. Also it's
> >      tricky to make sure that new mounts are also shared
>
> See toward the end of the 'shared subtrees' OLS paper from last year for
> a suggestion on how to let users effectively 'log in to' an existing
> private mounts ns.

This?

  1. create a new namespace
  2. bind /share/$USER to /share
  3. for each pair ($who, $what) such that
     /share/$USER/$who/$what exists, look
     in /share/$who/allowed for "peer $what
     $USER" or "slave $what $USER". If the
     former is found, rbind /share/$who/$what
     on /share/$USER/$who/$what; if the
     latter is found, do the same and
     follow with marking subtree under
     /share/$USER/$who/$what as slave.
  4. rbind /share/$USER to /share
  5. mark subtree under /share as private.
  6. umount -l /share

Well, someone please explain using short words, because I don't
understand at all.

Thanks,
Miklos
Re: [patch 0/8] unprivileged mount syscall
Quoting Miklos Szeredi ([EMAIL PROTECTED]):
> This patchset adds support for keeping mount ownership information in
> the kernel, and allow unprivileged mount(2) and umount(2) in certain
> cases.
>
> This can be useful for the following reasons:
>
> - mount(8) can store ownership ("user=XY" option) in the kernel
>   instead, or in addition to storing it in /etc/mtab. For example if
>   private namespaces are used with mount propagations /etc/mtab
>   becomes unworkable, but using /proc/mounts works fine
>
> - fuse won't need a special suid-root mount/umount utility. Plain
>   umount(8) can easily be made to work with unprivileged fuse mounts
>
> - users can use bind mounts without having to pre-configure them in
>   /etc/fstab
>
> All this is done in a secure way, and unprivileged bind and fuse
> mounts are disabled by default and can be enabled through sysctl or
> /proc/sys.
>
> One thing that is missing from this series is the ability to restrict
> user mounts to private namespaces. The reason is that private
> namespaces have still not gained the momentum and support needed for
> painless user experience. So such a feature would not yet get enough
> attention and testing. However adding such an optional restriction
> can be done with minimal changes in the future, once private
> namespaces have matured.

What is the main reason for that feature? Would it be to prevent things
like login from being tricked by user mounts? Isn't it sufficient, in
fact, better, to require that the target of the mount be owned by the
user doing the mount?

-serge
(who's pretty sure he's missing something)

> An earlier version of these patches have been discussed here:
>
>   http://lkml.org/lkml/2005/5/3/64
>
> --
Re: [patch 0/8] unprivileged mount syscall
Ram Pai wrote:
> It is in FC6. I don't know the status of upstream util-linux. I did
> submit the patch many times to Adrian Bunk (the then util-linux
> maintainer) and got no response. I have not pushed the patches to the
> new maintainer (Karel Zak?) though.

Well, do that, then :)

Seriously. The whole point of util-linux-ng is to make forward progress.

    -hpa
Re: [patch 0/8] unprivileged mount syscall
On Mon, 2007-04-09 at 12:07 -0500, Serge E. Hallyn wrote:
> Quoting Miklos Szeredi ([EMAIL PROTECTED]):
> > - need to set up mount propagation from global namespace to private
> >   ones, mount(8) does not yet have options to configure propagation
>
> Hmm, I guess I get lost using my own little systems, and just assumed
> that shared subtree functionality was making its way up into mount(8).
> Ram, have you been working on that?

It is in FC6. I don't know the status of upstream util-linux. I did
submit the patch many times to Adrian Bunk (the then util-linux
maintainer) and got no response. I have not pushed the patches to the
new maintainer (Karel Zak?) though.

RP
Re: [patch 0/8] unprivileged mount syscall
Quoting Miklos Szeredi ([EMAIL PROTECTED]): > > > > > One thing that is missing from this series is the ability to restrict > > > > > user mounts to private namespaces. The reason is that private > > > > > namespaces have still not gained the momentum and support needed for > > > > > painless user experience. So such a feature would not yet get enough > > > > > attention and testing. However adding such an optional restriction > > > > > can be done with minimal changes in the future, once private > > > > > namespaces have matured. > > > > > > > > I suspect the people who developed and maintain nsproxy would disagree > > > > ;) > > > > > > Well, they better show me some working and simple-to-use userspace > > > code, because I've not seen anything like that related to mount > > > namespaces. > > > > If you mean to test/exploit them, see > > http://lxc.sourceforge.net/patches/2.6.20/2.6.20-lxc8/broken-out/tests/ > > > > Compile the ns_exec.c program and do > > > > ns_exec -m /bin/sh > > > > to get a shell in a new mounts namespace. > > Cool, thanks. This is a very nice utility for testing, but for the > end user rather useless: Well that depends on which end-user. Those wanting to create a vserver or checkpoint-restart job will want this, but clearly we have a long way to go for that upstream anyway. > - user starts up a private namespace in a shell, mounts something > > - then opens app from menu, tries to access mount, but the mount is > not there > > - user unhappy > > BTW, looking at -mm unshare() on namespace is not privileged any more. > Why is that? Or rather, what's the reason, that clone() is privileged > and unshare() is not? The check is still there - see kernel/nsproxy.c:unshare_nsproxy_namespaces(). > > > pam_namespace.so is one example of a non-working, but probably-not-too- > > > hard-to-fix one. > > > > Non-working? I sure hope the one used for LSPP certification is > > working... As is the ugly version I wrote 18 mounts ago and use on my > > laptop. 
> > The one in pam-0.99.6.3-29.1 in opensuse-10.2 is totally broken. Are > you interested in the details? I can reproduce it, but forgot to note > down the details of the brokenness. I don't know how far removed that is from the one being used by redhat, but assuming it's the same, then redhat-lspp@redhat.com will be very interested. > > > I'm just saying this is not yet something that Joe Blow would just > > > enable by ticking a box in their desktop setup wizard, and it would > > > all work flawlessly thereafter. There's still a _long_ way towards > > > that, and mostly in userspace. > > > > I'm not sure there's a that long a way to go, but clearly we need to be > > showing users what they can do, or they'll never work their way towards > > there. > > There _is_ a long way to go. Random things that spring to my mind: > > - using /etc/mtab is broken with private namespaces, using >/proc/mounts is missing various functionality, that /etc/mtab has, >for example the "user" option, which this patchset adds Agreed those need fixing. > - need to set up mount propagation from global namespace to private >ones, mount(8) does not yet have options to configure propagation Hmm, I guess I get lost using my own little systems, and just assumed that shared subtree functionality was making its way up into mount(8). Ram, have you been working on that? > - user namespace setup: what if user has multiple sessions? > >1) namespaces are shared? That's tricky because the session needs to >be a child of a namespace server, not of login. I'm not sure PAM >can handle this > >2) or mounts are copied on login? That's not possible currently, >as there's no way to send a mount between namespaces. Also it's >tricky to make sure that new mounts are also shared See toward the end of the 'shared subtrees' OLS paper from last year for a suggestion on how to let users effectively 'log in to' an existing private mounts ns. 
> > For instance, as you say, a user admin gui with a checkmark and text
> > boxes saying 'enter new namespace on login', 'create private /tmp',
> > and 'create private dmcrypted /home' would be trivial right now.
>
> Trivial modulo the above slightly non-trivial exemptions ;)

Ok, so it can use some very non-trivial fine-tuning...  But I've been
using the above - minus the trivial gui - for over a year without ever
worrying about any of these shortcomings.

> Miklos

-serge
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
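The propagation setup discussed above (global namespace to private ones)
comes down to a mount(2) call with one of the shared-subtree flags.  What
follows is only a rough sketch of that call, not anything taken from
mount(8) or from this patchset: the helper names propagation_flags() and
set_propagation() are made up for illustration, the flag values are the ones
from <linux/fs.h>, and the real call needs privilege.

```python
import ctypes
import os

# Flag values as defined in <linux/fs.h>.
MS_REC = 0x4000
MS_UNBINDABLE = 1 << 17
MS_PRIVATE = 1 << 18
MS_SLAVE = 1 << 19
MS_SHARED = 1 << 20

PROPAGATION = {
    "shared": MS_SHARED,
    "slave": MS_SLAVE,
    "private": MS_PRIVATE,
    "unbindable": MS_UNBINDABLE,
}


def propagation_flags(mode, recursive=False):
    """Build the mount(2) flags word for a propagation change."""
    flags = PROPAGATION[mode]
    if recursive:
        flags |= MS_REC  # apply to every mount below the target as well
    return flags


def set_propagation(mount_point, mode, recursive=False):
    """Hypothetical helper: change the propagation type of an existing mount.

    Source, filesystem type and data are ignored by the kernel for this
    kind of call, so they are passed as NULL.
    """
    libc = ctypes.CDLL(None, use_errno=True)
    libc.mount.argtypes = [ctypes.c_char_p, ctypes.c_char_p, ctypes.c_char_p,
                           ctypes.c_ulong, ctypes.c_void_p]
    libc.mount.restype = ctypes.c_int
    flags = propagation_flags(mode, recursive)
    if libc.mount(None, os.fsencode(mount_point), None, flags, None) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err), mount_point)
```

For example, a private namespace that should keep receiving mounts from the
global one, without sending its own back, could call
set_propagation("/", "slave", recursive=True) right after the namespace is
created.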
Re: [patch 0/8] unprivileged mount syscall
> > > > One thing that is missing from this series is the ability to restrict
> > > > user mounts to private namespaces.  The reason is that private
> > > > namespaces have still not gained the momentum and support needed for
> > > > painless user experience.  So such a feature would not yet get enough
> > > > attention and testing.  However adding such an optional restriction
> > > > can be done with minimal changes in the future, once private
> > > > namespaces have matured.
> > >
> > > I suspect the people who developed and maintain nsproxy would disagree ;)
> >
> > Well, they better show me some working and simple-to-use userspace
> > code, because I've not seen anything like that related to mount
> > namespaces.
>
> If you mean to test/exploit them, see
> http://lxc.sourceforge.net/patches/2.6.20/2.6.20-lxc8/broken-out/tests/
>
> Compile the ns_exec.c program and do
>
>     ns_exec -m /bin/sh
>
> to get a shell in a new mounts namespace.

Cool, thanks.  This is a very nice utility for testing, but for the
end user rather useless:

 - user starts up a private namespace in a shell, mounts something

 - then opens app from menu, tries to access mount, but the mount is
   not there

 - user unhappy

BTW, looking at -mm unshare() on namespace is not privileged any more.
Why is that?  Or rather, what's the reason, that clone() is privileged
and unshare() is not?

> > pam_namespace.so is one example of a non-working, but
> > probably-not-too-hard-to-fix one.
>
> Non-working?  I sure hope the one used for LSPP certification is
> working...  As is the ugly version I wrote 18 months ago and use on my
> laptop.

The one in pam-0.99.6.3-29.1 in opensuse-10.2 is totally broken.  Are
you interested in the details?  I can reproduce it, but forgot to note
down the details of the brokenness.

> > I'm just saying this is not yet something that Joe Blow would just
> > enable by ticking a box in their desktop setup wizard, and it would
> > all work flawlessly thereafter.
> > There's still a _long_ way towards
> > that, and mostly in userspace.
>
> I'm not sure there's that long a way to go, but clearly we need to be
> showing users what they can do, or they'll never work their way towards
> there.

There _is_ a long way to go.  Random things that spring to my mind:

 - using /etc/mtab is broken with private namespaces, using
   /proc/mounts is missing various functionality, that /etc/mtab has,
   for example the "user" option, which this patchset adds

 - need to set up mount propagation from global namespace to private
   ones, mount(8) does not yet have options to configure propagation

 - user namespace setup: what if user has multiple sessions?

   1) namespaces are shared?  That's tricky because the session needs to
      be a child of a namespace server, not of login.  I'm not sure PAM
      can handle this

   2) or mounts are copied on login?  That's not possible currently,
      as there's no way to send a mount between namespaces.  Also it's
      tricky to make sure that new mounts are also shared

> For instance, as you say, a user admin gui with a checkmark and text
> boxes saying 'enter new namespace on login', 'create private /tmp',
> and 'create private dmcrypted /home' would be trivial right now.

Trivial modulo the above slightly non-trivial exemptions ;)

Miklos
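To make the /etc/mtab versus /proc/mounts point concrete: once the kernel
itself reports the owner of a mount, userspace can recover it from the
per-namespace mount table instead of a global file.  The sketch below
assumes the owner shows up as a "user=XY" entry in the options column, which
is how the patchset description words it; the exact output format, the
sample table, and the function names are illustrative guesses, not taken
from the patches.

```python
# Hypothetical /proc/mounts-style table; devices, paths and users are made up.
SAMPLE_MOUNTS = """\
/dev/sda1 / ext3 rw 0 0
fusefs /home/alice/mnt fuse rw,nosuid,nodev,user=alice 0 0
/dev/sda1 /home/bob/down_under ext3 rw,user=bob 0 0
"""


def parse_mounts(text):
    """Split mount-table text into dicts with the first four columns."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mount_point, fs_type, options = fields[:4]
        entries.append({
            "device": device,
            "mount_point": mount_point,
            "fs_type": fs_type,
            "options": options.split(","),
        })
    return entries


def mount_owner(entry):
    """Return the name from a 'user=' option, or None for a system mount."""
    for option in entry["options"]:
        if option.startswith("user="):
            return option[len("user="):]
    return None


def mounts_owned_by(text, user):
    """List the mount points whose 'user=' option names the given user."""
    return [e["mount_point"] for e in parse_mounts(text)
            if mount_owner(e) == user]
```

Because /proc/mounts is generated per namespace, the same lookup would keep
working inside a private namespace, which is exactly where a shared
/etc/mtab goes stale.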
Re: [patch 0/8] unprivileged mount syscall
Quoting Miklos Szeredi ([EMAIL PROTECTED]):
> > > This patchset adds support for keeping mount ownership information in
> > > the kernel, and allow unprivileged mount(2) and umount(2) in certain
> > > cases.
> >
> > No replies, huh?
>
> All we need is a comment from Andrew, and the replies come flooding in ;)
>
> > My knowledge of the code which you're touching is not strong, and my spare
> > reviewing capacity is not high.  And this work does need close review by
> > people who are familiar with the code which you're changing.
> >
> > So could I suggest that you go for a dig through the git history, identify
> > some individuals who look like they know this code, then do a resend,
> > cc'ing those people?  Please also cc linux-kernel on that resend.
>
> OK.
>
> > > One thing that is missing from this series is the ability to restrict
> > > user mounts to private namespaces.  The reason is that private
> > > namespaces have still not gained the momentum and support needed for
> > > painless user experience.  So such a feature would not yet get enough
> > > attention and testing.  However adding such an optional restriction
> > > can be done with minimal changes in the future, once private
> > > namespaces have matured.
> >
> > I suspect the people who developed and maintain nsproxy would disagree ;)
>
> Well, they better show me some working and simple-to-use userspace
> code, because I've not seen anything like that related to mount
> namespaces.

If you mean to test/exploit them, see
http://lxc.sourceforge.net/patches/2.6.20/2.6.20-lxc8/broken-out/tests/

Compile the ns_exec.c program and do

    ns_exec -m /bin/sh

to get a shell in a new mounts namespace.

> pam_namespace.so is one example of a non-working, but
> probably-not-too-hard-to-fix one.

Non-working?  I sure hope the one used for LSPP certification is
working...  As is the ugly version I wrote 18 months ago and use on my
laptop.
> I'm just saying this is not yet something that Joe Blow would just
> enable by ticking a box in their desktop setup wizard, and it would
> all work flawlessly thereafter.  There's still a _long_ way towards
> that, and mostly in userspace.

I'm not sure there's that long a way to go, but clearly we need to be
showing users what they can do, or they'll never work their way towards
there.

For instance, as you say, a user admin gui with a checkmark and text
boxes saying 'enter new namespace on login', 'create private /tmp',
and 'create private dmcrypted /home' would be trivial right now.

-serge
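For readers without the lxc test tree at hand, the effect of
"ns_exec -m /bin/sh" can be approximated in a few lines.  This is not the
ns_exec.c source, which the thread does not reproduce; it is a minimal
sketch that assumes a Linux libc exposing unshare(2), and the function names
are invented here.  The caller normally needs CAP_SYS_ADMIN.

```python
import ctypes
import errno
import os

# CLONE_NEWNS from <linux/sched.h>: ask for a new mount namespace.
CLONE_NEWNS = 0x00020000


def unshare_mount_namespace():
    """Move the calling process into a private copy of its mount namespace.

    Raises OSError (typically EPERM) if the kernel refuses the request.
    """
    libc = ctypes.CDLL(None, use_errno=True)
    unshare = getattr(libc, "unshare", None)
    if unshare is None:
        raise OSError(errno.ENOSYS, "unshare(2) not available in this libc")
    if unshare(CLONE_NEWNS) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def exec_in_private_namespace(command):
    """Hypothetical stand-in for 'ns_exec -m': unshare, then exec command."""
    unshare_mount_namespace()
    # Mounts made by the exec'ed program are now made in the new namespace.
    os.execvp(command[0], command)
```

Calling exec_in_private_namespace(["/bin/sh"]) gives a shell whose later
mounts live in the new namespace; whether they also propagate elsewhere
depends on how the underlying mounts are marked.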
Re: [patch 0/8] unprivileged mount syscall
> On 4/6/07, H. Peter Anvin <[EMAIL PROTECTED]> wrote:
> > Jan Engelhardt wrote:
> > > On Apr 6 2007 16:16, H. Peter Anvin wrote:
> > > >> - users can use bind mounts without having to pre-configure them in
> > > >>   /etc/fstab
> > > >
> > > > This is by far the biggest concern I see.  I think the security
> > > > implications of allowing anyone to do bind mounts are poorly
> > > > understood.
> > >
> > > $ whoami
> > > miklos
> > > $ mount --bind / ~/down_under
> > >
> > > later that day:
> > > # userdel -r miklos
> >
> > Consider backups, for example.
>
> This is the reason why enforcing private namespaces for user mounts
> makes sense.  I think it catches many of these corner cases.

Yes, disabling user bind mounts in the global namespace makes sense.

Enabling user fuse mounts in the global namespace still works though,
even if a little kludgy.  All these nasty corner cases have been
thought through and validated by a lot of users.

Thanks,
Miklos
Re: [patch 0/8] unprivileged mount syscall
> > This patchset adds support for keeping mount ownership information in
> > the kernel, and allow unprivileged mount(2) and umount(2) in certain
> > cases.
>
> No replies, huh?

All we need is a comment from Andrew, and the replies come flooding in ;)

> My knowledge of the code which you're touching is not strong, and my spare
> reviewing capacity is not high.  And this work does need close review by
> people who are familiar with the code which you're changing.
>
> So could I suggest that you go for a dig through the git history, identify
> some individuals who look like they know this code, then do a resend,
> cc'ing those people?  Please also cc linux-kernel on that resend.

OK.

> > One thing that is missing from this series is the ability to restrict
> > user mounts to private namespaces.  The reason is that private
> > namespaces have still not gained the momentum and support needed for
> > painless user experience.  So such a feature would not yet get enough
> > attention and testing.  However adding such an optional restriction
> > can be done with minimal changes in the future, once private
> > namespaces have matured.
>
> I suspect the people who developed and maintain nsproxy would disagree ;)

Well, they better show me some working and simple-to-use userspace
code, because I've not seen anything like that related to mount
namespaces.

pam_namespace.so is one example of a non-working, but
probably-not-too-hard-to-fix one.

I'm just saying this is not yet something that Joe Blow would just
enable by ticking a box in their desktop setup wizard, and it would
all work flawlessly thereafter.  There's still a _long_ way towards
that, and mostly in userspace.

Thanks,
Miklos
Re: [patch 0/8] unprivileged mount syscall
On 4/6/07, H. Peter Anvin <[EMAIL PROTECTED]> wrote:
> Jan Engelhardt wrote:
> > On Apr 6 2007 16:16, H. Peter Anvin wrote:
> > >> - users can use bind mounts without having to pre-configure them in
> > >>   /etc/fstab
> > >
> > > This is by far the biggest concern I see.  I think the security
> > > implications of allowing anyone to do bind mounts are poorly
> > > understood.
> >
> > $ whoami
> > miklos
> > $ mount --bind / ~/down_under
> >
> > later that day:
> > # userdel -r miklos
>
> Consider backups, for example.

This is the reason why enforcing private namespaces for user mounts
makes sense.  I think it catches many of these corner cases.

-eric
Re: [patch 0/8] unprivileged mount syscall
Jan Engelhardt wrote:
> On Apr 6 2007 16:16, H. Peter Anvin wrote:
> >> - users can use bind mounts without having to pre-configure them in
> >>   /etc/fstab
> >
> > This is by far the biggest concern I see.  I think the security
> > implications of allowing anyone to do bind mounts are poorly
> > understood.
>
> $ whoami
> miklos
> $ mount --bind / ~/down_under
>
> later that day:
> # userdel -r miklos
>
> So both the source (/) and target (~/down_under) directory must be owned
> by the user before --bind may succeed.
>
> There may be other implications hpa might want to fill us in on.

Consider backups, for example.

	-hpa
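The userdel and backup cases share one failure mode: a tool that walks a
directory tree follows a user's bind mount into whatever it points at.  One
defensive check, sketched below, is to read the mount table and refuse to
recurse into a tree that has mount points underneath it.  This is an
illustration only; the helper names and the sample table are invented, and
it is not something userdel or any backup tool is known to do.  The mount
table is used because comparing device numbers cannot spot a bind mount
taken from the same filesystem.

```python
import posixpath

# Hypothetical mount table after 'mount --bind / ~/down_under' by miklos.
SAMPLE_TABLE = """\
/dev/sda1 / ext3 rw 0 0
proc /proc proc rw 0 0
/dev/sda1 /home/miklos/down_under ext3 rw 0 0
"""


def mount_points_from_table(text):
    """Return the mount-point column of /proc/mounts-style text."""
    points = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            points.append(fields[1])
    return points


def mounts_beneath(directory, mount_points):
    """Return the mount points located at or below the given directory."""
    root = posixpath.normpath(directory)
    prefix = root.rstrip("/") + "/"
    found = []
    for point in mount_points:
        point = posixpath.normpath(point)
        if point == root or point.startswith(prefix):
            found.append(point)
    return found


def safe_to_recurse(directory, table_text):
    """True when no mount point sits at or below directory."""
    return not mounts_beneath(directory, mount_points_from_table(table_text))
```

On a live system the table text would come from /proc/self/mounts, so the
answer reflects the caller's own namespace.  That is also why confining user
mounts to private namespaces helps here: a root process in the global
namespace would not see the user's bind mount at all.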
Re: [patch 0/8] unprivileged mount syscall
On Apr 6 2007 16:16, H. Peter Anvin wrote:
>> - users can use bind mounts without having to pre-configure them in
>>   /etc/fstab
>
> This is by far the biggest concern I see.  I think the security
> implications of allowing anyone to do bind mounts are poorly understood.

$ whoami
miklos
$ mount --bind / ~/down_under

later that day:
# userdel -r miklos

So both the source (/) and target (~/down_under) directory must be owned
by the user before --bind may succeed.

There may be other implications hpa might want to fill us in on.

Regards,
Jan
--
Re: [patch 0/8] unprivileged mount syscall
> - users can use bind mounts without having to pre-configure them in
>   /etc/fstab

This is by far the biggest concern I see.  I think the security
implications of allowing anyone to do bind mounts are poorly understood.

	-hpa
Re: [patch 0/8] unprivileged mount syscall
On Wed, 04 Apr 2007 20:30:12 +0200 Miklos Szeredi <[EMAIL PROTECTED]> wrote:

> This patchset adds support for keeping mount ownership information in
> the kernel, and allow unprivileged mount(2) and umount(2) in certain
> cases.

No replies, huh?

My knowledge of the code which you're touching is not strong, and my spare
reviewing capacity is not high.  And this work does need close review by
people who are familiar with the code which you're changing.

So could I suggest that you go for a dig through the git history, identify
some individuals who look like they know this code, then do a resend,
cc'ing those people?  Please also cc linux-kernel on that resend.

> This can be useful for the following reasons:
>
> - mount(8) can store ownership ("user=XY" option) in the kernel
>   instead, or in addition to storing it in /etc/mtab.  For example if
>   private namespaces are used with mount propagations /etc/mtab
>   becomes unworkable, but using /proc/mounts works fine
>
> - fuse won't need a special suid-root mount/umount utility.  Plain
>   umount(8) can easily be made to work with unprivileged fuse mounts
>
> - users can use bind mounts without having to pre-configure them in
>   /etc/fstab
>
> All this is done in a secure way, and unprivileged bind and fuse
> mounts are disabled by default and can be enabled through sysctl or
> /proc/sys.
>
> One thing that is missing from this series is the ability to restrict
> user mounts to private namespaces.  The reason is that private
> namespaces have still not gained the momentum and support needed for
> painless user experience.  So such a feature would not yet get enough
> attention and testing.  However adding such an optional restriction
> can be done with minimal changes in the future, once private
> namespaces have matured.
I suspect the people who developed and maintain nsproxy would disagree ;)
Please also cc [EMAIL PROTECTED]

> An earlier version of these patches have been discussed here:
>
>   http://lkml.org/lkml/2005/5/3/64
>
> --