Re: [PATCH v2] ceph: fix posix ACL hooks
On Mon, Feb 03, 2014 at 10:31:27PM +, Al Viro wrote: > > > And the fact is, filesystems with hardlinks and path-name-based > > operations do exist. cifs with the unix extensions is one of them. > > Pox on Tridge... Actually you have to blame me for that. Tridge always *HATED* the UNIX extensions :-). -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] ceph: fix posix ACL hooks
On Tue, Feb 4, 2014 at 3:33 AM, Steven Whitehouse wrote: > > The other question that I have relating to that side of things, is why > security_inode_permission() is called from __inode_permission() rather > than from generic_permission() ? Maybe there is a good reason, but I > can't immediately see what it is at the moment. "generic_permission()" is just a helper that implements the default UNIX permissions, and won't necessarily even be called. A filesystem could decide not to call it at all, and in fact there are cases that don't (eg coda or the bad_inode case). The inode_permission() class of helpers, in contrast, is what gets called by the VFS layer itself. So if you want to catch all permission checks (and that would be security_inode_permission()) then you need to catch it there. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] ceph: fix posix ACL hooks
On Tue, Feb 04, 2014 at 11:33:35AM +, Steven Whitehouse wrote: > To diverge from that topic for a moment, this thread has also brought > together some discussion on another issue which I've been pondering > recently that of whether the inode operations for get/set_xattr > should take a dentry or not. I had thought that we'd come to the > conclusion that 9p made it impossible to swap the current dentry > argument for an inode, and I was about to send a patch for selinux > support on clustered fs on that basis. However the discussion in this > thread has made me wonder whether that really is the case or not Al, > can you confirm whether your xattr-experimental patches are still under > active consideration? My plan was to work on the 9p and cifs conversions using the d_find_alias hack we have in ceph right now. That means the base work could switch to passed in dentries or in case of 9p the per-inode fids easily. > The other question that I have relating to that side of things, is why > security_inode_permission() is called from __inode_permission() rather > than from generic_permission() ? Maybe there is a good reason, but I > can't immediately see what it is at the moment. Seems like almost everything of the security_* family is called from the VFS instead of the filesystem. There's also some very odd other behaviour in there, e.g. for the xattrs sets are handed to the filesystem first, and then the xattr layer calls into the security layer, which for reads the filesystems is never reached at all. > In response to the question elsewhere about GFS2 calling > gfs2_permission() after the vfs has already done its checks, that is > indeed down to needing to ensure that we have the cluster locks when > this check is called. More importantly to know that things haven't > changed since the VFS called the same function in case we've raced with > another node changing the permissions, for example. There are a number > of cases where we redo vfs level checks for this reason, Seems like we should be able to grab a cluster lock where we grab i_mutex in the namespace code to avoid having to redo all these checks. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] ceph: fix posix ACL hooks
On Mon, 2014-02-03 at 22:40 +, Al Viro wrote: > > >> +static int gfs2_vfs_permission(struct dentry *dentry, struct inode > > >> *inode, int mask) > > >> +{ > > >> + return gfs2_permission(inode, mask); > > >> +} > > > > > > Er... You do realize that callers of gfs2_permission() tend to have > > > the dentry in question, either directly or as ->d_parent of something > > > they have? > > > > Not true. Look closer. > > > > Look at gfs2_lookupi() in particular, and check how it is called. > > Yeowch... gfs2_ok_to_move() is particulary nasty... WTF do we need > it for and why is it not racy as hell? Well, the original intent was to prevent us from moving a directory into one of its subdirectories, and thus creating a loop. It is only called when the rename is moving a directory, and when the parent directories (source and destination) are different. I know there is a problem there, recently reported by Bruce Fields which he came across when looking at the d_splice_alias issue. The bug that Bruce found only shows up in the clustered case, and not in the single node case though. Which particular race are you worried about? This check is covered by the rename glock, which is basically a cluster wide version of the vfs level ->s_vfs_rename_mutex. To diverge from that topic for a moment, this thread has also brought together some discussion on another issue which I've been pondering recently that of whether the inode operations for get/set_xattr should take a dentry or not. I had thought that we'd come to the conclusion that 9p made it impossible to swap the current dentry argument for an inode, and I was about to send a patch for selinux support on clustered fs on that basis. However the discussion in this thread has made me wonder whether that really is the case or not Al, can you confirm whether your xattr-experimental patches are still under active consideration? The other question that I have relating to that side of things, is why security_inode_permission() is called from __inode_permission() rather than from generic_permission() ? Maybe there is a good reason, but I can't immediately see what it is at the moment. In response to the question elsewhere about GFS2 calling gfs2_permission() after the vfs has already done its checks, that is indeed down to needing to ensure that we have the cluster locks when this check is called. More importantly to know that things haven't changed since the VFS called the same function in case we've raced with another node changing the permissions, for example. There are a number of cases where we redo vfs level checks for this reason, Steve. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] ceph: fix posix ACL hooks
On Mon, Feb 3, 2014 at 2:40 PM, Al Viro wrote: > > Umm... Point, but that actually means that we get an extra pitfall for > filesystem writers here. foo_permission() passes dentry (now that it > has one) to foo_wank_a_lot(), with the latter using dentry->d_inode at > some point... I agree. The good news, though, is that in the RCU lookup case, we have that MAY_NOT_BLOCK thing, and most filesystems will have errored out for any complex operations. RCU lookup is special, and complicated, and sadly, the permissions checking is very much part of that. But for the really complex cases, at least we can punt. >> Look at gfs2_lookupi() in particular, and check how it is called. > > Yeowch... gfs2_ok_to_move() is particulary nasty... WTF do we need > it for and why is it not racy as hell? I don't know. And I suspect that for things like the journal index file lookup (which is actually worse - see gfs2_jindex_hold()) we don't really about the permissions, since this is just done at init_journal() time. So I think all of this is quite solvable for gfs2, it just wasn't the obvious kind of "we already have the dentry" case that every single other case I looked at was. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] ceph: fix posix ACL hooks
On Mon, Feb 03, 2014 at 02:12:00PM -0800, Linus Torvalds wrote: > >> -int afs_permission(struct inode *inode, int mask) > >> +int afs_permission(struct dentry *dentry, struct inode *inode, int mask) > > > > Oh, _lovely_. So not only do we pass dentry, the arguments are redundant > > as well. > > Note that *not* passing in inode would make the patch much bigger, > because now every filesystem would have to add the > >struct inode *inode = dentry->d_inode; > > at the top. > > Also, I'm not actually convinced it is redundant at all. Remember the > RCU lookup case? dentry->d_inode is not safe. Umm... Point, but that actually means that we get an extra pitfall for filesystem writers here. foo_permission() passes dentry (now that it has one) to foo_wank_a_lot(), with the latter using dentry->d_inode at some point... > >> +static int gfs2_vfs_permission(struct dentry *dentry, struct inode > >> *inode, int mask) > >> +{ > >> + return gfs2_permission(inode, mask); > >> +} > > > > Er... You do realize that callers of gfs2_permission() tend to have > > the dentry in question, either directly or as ->d_parent of something > > they have? > > Not true. Look closer. > > Look at gfs2_lookupi() in particular, and check how it is called. Yeowch... gfs2_ok_to_move() is particulary nasty... WTF do we need it for and why is it not racy as hell? -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] ceph: fix posix ACL hooks
On Mon, Feb 03, 2014 at 01:44:22PM -0800, Linus Torvalds wrote: > On Mon, Feb 3, 2014 at 1:39 PM, Al Viro wrote: > > > > If we really have hardlinks, the result of permission check would better > > be a function of inode itself - as in, "if it gives different results > > for two pathnames reachable for the same user, we have a bug". ^ > No. You're wrong. > > EVEN ON A UNIX FILESYSTEM THE PATH IS MEANINGFUL. > > Do this: create a hardlink in two different directories. Make the > *directory* permissions for one of the directories be something you > cannot traverse. Now try to check the permissions of the *same* inode > through those two paths. Notice how you get *different* results. > > Really. Yes. In one case we won't get to looking at the permissions of that thing at all. > Now, imagine that you are doing the same thing over a network. On the > server, there may be a single inode for the file, but when the client > gives the server a pathname, the two pathnames to that single inode > ARE NOT EQUIVALENT. Why do we pretend that those are links, in the first place? > And the fact is, filesystems with hardlinks and path-name-based > operations do exist. cifs with the unix extensions is one of them. Pox on Tridge... I really, really hate that change; I can buy "->getxattr() has inconvenient interface because of lousy protocol design", but spreading the same to ->permission(), with everything that will fall out of that... -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] ceph: fix posix ACL hooks
On Mon, Feb 3, 2014 at 1:59 PM, Al Viro wrote: > On Mon, Feb 03, 2014 at 01:03:32PM -0800, Linus Torvalds wrote: > >> - err = vfs_mkdir(path.dentry->d_inode, dentry, mode); >> + err = vfs_mkdir(path.dentry, dentry, mode); > > Pointless - path.dentry == dentry->d_parent anyway. Heh. It's no less redundant than it used to be. But if you want to clean up the vfs_xyzzy() ones further, I'm perfectly fine with that. >> - err = ll_vfs_rename(dir->d_inode, dchild_old, mnt, >> - dir->d_inode, dchild_new, mnt, NULL); >> + err = ll_vfs_rename(dir, dchild_old, mnt, >> + dir, dchild_new, mnt, NULL); > > > ... and again, that's completely pointless. Minimal patch.. I really didn't want to check what the heck lustre does with the insane ll_vfs thing. >> -int afs_permission(struct inode *inode, int mask) >> +int afs_permission(struct dentry *dentry, struct inode *inode, int mask) > > Oh, _lovely_. So not only do we pass dentry, the arguments are redundant > as well. Note that *not* passing in inode would make the patch much bigger, because now every filesystem would have to add the struct inode *inode = dentry->d_inode; at the top. Also, I'm not actually convinced it is redundant at all. Remember the RCU lookup case? dentry->d_inode is not safe. The RCU case actually does inode_permission(nd->path.dentry, nd->inode, ..) and here the difference between nd->inode and dentry->d_inode are relevant, I think. >> +static int gfs2_vfs_permission(struct dentry *dentry, struct inode *inode, >> int mask) >> +{ >> + return gfs2_permission(inode, mask); >> +} > > Er... You do realize that callers of gfs2_permission() tend to have > the dentry in question, either directly or as ->d_parent of something > they have? Not true. Look closer. Look at gfs2_lookupi() in particular, and check how it is called. I did hate that part of the patch, and I did mention the kinds of problems this will cause if the next phase passes in dentry to "generic_permission()". Linus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] ceph: fix posix ACL hooks
On Mon, Feb 03, 2014 at 01:03:32PM -0800, Linus Torvalds wrote: > - err = vfs_mkdir(path.dentry->d_inode, dentry, mode); > + err = vfs_mkdir(path.dentry, dentry, mode); Pointless - path.dentry == dentry->d_parent anyway. > - err = vfs_mknod(path.dentry->d_inode, dentry, mode, dev->devt); > + err = vfs_mknod(path.dentry, dentry, mode, dev->devt); Ditto. > @@ -237,7 +237,7 @@ static int dev_rmdir(const char *name) > return PTR_ERR(dentry); > if (dentry->d_inode) { > if (dentry->d_inode->i_private == &thread) > - err = vfs_rmdir(parent.dentry->d_inode, dentry); > + err = vfs_rmdir(parent.dentry, dentry); Ditto, with s/path/parent/ > @@ -324,7 +324,7 @@ static int handle_remove(const char *nodename, struct > device *dev) > mutex_lock(&dentry->d_inode->i_mutex); > notify_change(dentry, &newattrs, NULL); > mutex_unlock(&dentry->d_inode->i_mutex); > - err = vfs_unlink(parent.dentry->d_inode, dentry, NULL); > + err = vfs_unlink(parent.dentry, dentry, NULL); > if (!err || err == -ENOENT) > deleted = 1; > } And here as well. > +++ b/drivers/staging/lustre/lustre/lvfs/lvfs_linux.c > @@ -222,8 +222,8 @@ int lustre_rename(struct dentry *dir, struct vfsmount > *mnt, > if (IS_ERR(dchild_new)) > GOTO(put_old, err = PTR_ERR(dchild_new)); > > - err = ll_vfs_rename(dir->d_inode, dchild_old, mnt, > - dir->d_inode, dchild_new, mnt, NULL); > + err = ll_vfs_rename(dir, dchild_old, mnt, > + dir, dchild_new, mnt, NULL); ... and again, that's completely pointless. > -int afs_permission(struct inode *inode, int mask) > +int afs_permission(struct dentry *dentry, struct inode *inode, int mask) Oh, _lovely_. So not only do we pass dentry, the arguments are redundant as well. > -static inline int btrfs_may_create(struct inode *dir, struct dentry *child) > +static inline int btrfs_may_create(struct dentry *parent, struct dentry > *child) I'm fairly sure that it's also pointless, because parent is going to be, well, the parent. Of child. > +static int gfs2_vfs_permission(struct dentry *dentry, struct inode *inode, > int mask) > +{ > + return gfs2_permission(inode, mask); > +} Er... You do realize that callers of gfs2_permission() tend to have the dentry in question, either directly or as ->d_parent of something they have? I really hate the whole thing... ;-/ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] ceph: fix posix ACL hooks
On Mon, Feb 3, 2014 at 1:39 PM, Al Viro wrote: > > If we really have hardlinks, the result of permission check would better > be a function of inode itself - as in, "if it gives different results > for two pathnames reachable for the same user, we have a bug". No. You're wrong. EVEN ON A UNIX FILESYSTEM THE PATH IS MEANINGFUL. Do this: create a hardlink in two different directories. Make the *directory* permissions for one of the directories be something you cannot traverse. Now try to check the permissions of the *same* inode through those two paths. Notice how you get *different* results. Really. Now, imagine that you are doing the same thing over a network. On the server, there may be a single inode for the file, but when the client gives the server a pathname, the two pathnames to that single inode ARE NOT EQUIVALENT. And the fact is, filesystems with hardlinks and path-name-based operations do exist. cifs with the unix extensions is one of them. Al, face it, you're wrong this time. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] ceph: fix posix ACL hooks
On Mon, Feb 03, 2014 at 01:37:03PM -0800, Linus Torvalds wrote: > On Mon, Feb 3, 2014 at 1:31 PM, Al Viro wrote: > > > > Yes, and...? CIFS also doesn't have hardlinks, so _there_ d_find_alias() > > is just fine. > > Hmm? I'm pretty sure cifs can actually have hardlinks. *UGH* How the hell does it... Oh, right - samba on Unix server. I really wonder how well do Windows clients deal with those... Crap... I reall hate that prototype change ;-/ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] ceph: fix posix ACL hooks
On Mon, Feb 03, 2014 at 09:39:26PM +, Al Viro wrote: > Which fs are you talking about? For 9P it *is* a function of inode alone. > For CIFS there's no wrong dentry to pick - it doesn't have links to start > with. Except that apparently it does ;-/ Bugger... -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] ceph: fix posix ACL hooks
On Mon, Feb 03, 2014 at 01:31:19PM -0800, Linus Torvalds wrote: > If the protocol is path-based (and it happens, and it's actually the > *correct* thing to do for a network filesystem, rather than the > idiotic "file handle" crap that tries to emulate the unix inode > semantics in the protocol), then the inode is simply not sufficient. > > And no, d_find_alias() is not correct or sufficient either. It can > work in practice (and probably does perfectly fine 99.9% of the time), > but it can equally well give the *wrong* dentry: yes, the dentry it > returns would have been a valid dentry for somebody at some time, but > it might be a stale dentry *now*, and it might be the wrong dentry for > the current user (because the current user may not have permissions to > that particular path, even if the user has permissions through his > *own* path). > > So I really think you're *fundamentally* incorrect when you say > "result *is* a function of inode alone". Which fs are you talking about? For 9P it *is* a function of inode alone. For CIFS there's no wrong dentry to pick - it doesn't have links to start with. If we really have hardlinks, the result of permission check would better be a function of inode itself - as in, "if it gives different results for two pathnames reachable for the same user, we have a bug". -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] ceph: fix posix ACL hooks
On Mon, Feb 3, 2014 at 1:31 PM, Al Viro wrote: > > Yes, and...? CIFS also doesn't have hardlinks, so _there_ d_find_alias() > is just fine. Hmm? I'm pretty sure cifs can actually have hardlinks. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] ceph: fix posix ACL hooks
On Mon, Feb 03, 2014 at 09:31:53PM +, Al Viro wrote: > Yes, and...? CIFS also doesn't have hardlinks, so _there_ d_find_alias() > is just fine. It does have hardlinks, look at cifs_hardlink and functions called from it. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] ceph: fix posix ACL hooks
On Mon, Feb 03, 2014 at 01:24:47PM -0800, Christoph Hellwig wrote: > On Mon, Feb 03, 2014 at 09:19:55PM +, Al Viro wrote: > > Result *is* a function of inode alone; the problem with 9P is that we > > are caching FIDs in the wrong place. > > I don't think that's true for CIFS unfortunately, which is path based. Yes, and...? CIFS also doesn't have hardlinks, so _there_ d_find_alias() is just fine. 9P is actually trickier; there we need some massage, but for CIFS it's literally a matter of one function call. The problem with 9P is that the way we do it right now, you might pick the wrong dentry. _Some_ alias already bears the FID you are after, but if you end up picking the one that doesn't, you might very well end up being unable to obtain one either. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] ceph: fix posix ACL hooks
On Mon, Feb 3, 2014 at 1:19 PM, Al Viro wrote: > > Please, don't do that. Half of that is pointless (e.g. what you are > doing with vfs_rmdir() - if anything, we could get rid of the first > argument completely, it's always victim->d_parent->d_inode and we are > holding enough locks for that to be stable) and ->permission() is > just plain wrong. > > Result *is* a function of inode alone; the problem with 9P is that we > are caching FIDs in the wrong place. No, Al. It's *not* just a function of the inode alone. That's the point. Some network filesystems pass the *path* to the server. Any operation that needs to check something on the server needs the *dentry*, not the inode. This whole "the inode describes the file" mentality comes from old broken UNIX semantics. It's fundamentally true for local Unix filesystems, but that's it. It's not true in general. Sure, many network filesystems then emulate the local Unix filesystem behavior, so in practice, you get the unix semantics quite often. But it really is wrong. If the protocol is path-based (and it happens, and it's actually the *correct* thing to do for a network filesystem, rather than the idiotic "file handle" crap that tries to emulate the unix inode semantics in the protocol), then the inode is simply not sufficient. And no, d_find_alias() is not correct or sufficient either. It can work in practice (and probably does perfectly fine 99.9% of the time), but it can equally well give the *wrong* dentry: yes, the dentry it returns would have been a valid dentry for somebody at some time, but it might be a stale dentry *now*, and it might be the wrong dentry for the current user (because the current user may not have permissions to that particular path, even if the user has permissions through his *own* path). So I really think you're *fundamentally* incorrect when you say "result *is* a function of inode alone". Linus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] ceph: fix posix ACL hooks
On Mon, Feb 03, 2014 at 09:19:55PM +, Al Viro wrote: > Result *is* a function of inode alone; the problem with 9P is that we > are caching FIDs in the wrong place. I don't think that's true for CIFS unfortunately, which is path based. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] ceph: fix posix ACL hooks
On Mon, Feb 03, 2014 at 01:03:32PM -0800, Linus Torvalds wrote: > Now, to be honest, pushing it down one more level (to > generic_permission()) will actually start causing some trouble. In > particular, gfs2_permission() fundamentally does not have a dentry for > several of the callers. Looking over the gfs2 code the problem seems to be that it duplicates permissions checks from the may_{lookup,create,linkat,delete}, most likely because it needs cluster locking in place for them. The right fix seems to be to optionally call the filesystem from those. That being said I wonder how ocfs2 or network filesystems get away without that. > What do you think? I guess this patch could be split up into two: one > that does the "vfs_xyz()" helper functions, and another that does the > inode_permission() change. I tied them together mainly because I > started with the inode_permission() change, and that required the > vfs_xyz() change. The changes look good to me, and yes I think they should be split. I'll see if I can take this further, but doing something non-hacky in GFS2 would be the first step here. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] ceph: fix posix ACL hooks
On Mon, Feb 03, 2014 at 01:03:32PM -0800, Linus Torvalds wrote: > What do you think? I guess this patch could be split up into two: one > that does the "vfs_xyz()" helper functions, and another that does the > inode_permission() change. I tied them together mainly because I > started with the inode_permission() change, and that required the > vfs_xyz() change. Please, don't do that. Half of that is pointless (e.g. what you are doing with vfs_rmdir() - if anything, we could get rid of the first argument completely, it's always victim->d_parent->d_inode and we are holding enough locks for that to be stable) and ->permission() is just plain wrong. Result *is* a function of inode alone; the problem with 9P is that we are caching FIDs in the wrong place. What really happens is that protocol refers to objects by 32bit tokens, given by server out to client. Many of those can correspond to the same file; think of those as file descriptors. We are associating them with dentries, but if you have several links to the same file a FID acquired for either of them will do. What we ought to do, AFAICS, is moving these guys from dentry to inode; sure, to *get* one in the first place we need some dentry. But by the time ->setxattr() and friends get to see the inode, the caller has already done things that make sure that some FID of his exists. IOW, NAK. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] ceph: fix posix ACL hooks
On Mon, Feb 03, 2014 at 02:29:43AM -0800, Christoph Hellwig wrote: > On Thu, Jan 30, 2014 at 02:01:38PM -0800, Linus Torvalds wrote: > > In the end, all the original call-sites should have a dentry, and none > > of this is "fundamental". But you're right, it looks like an absolute > > nightmare to add the dentry pointer through the whole chain. Damn. > > > > So I'm not thrilled about it, but maybe that "d_find_alias(inode)" to > > find the dentry is good enough in practice. It feels very much > > incorrect (it could find a dentry with a path that you cannot actually > > access on the server, and result in user-visible errors), but I > > definitely see your argument. It may just not be worth the pain for > > this odd ceph case. > > It's not just ceph. 9p fundamentally needs it and I really want to > convert 9p to the new code so that we can get rid of the lower level > interfaces entirely and eventually move ACL dispatching entirely > into the VFS. The same d_find_alias hack should work for 9p as well, > although spreading this even more gets uglier and uglier. Similarly > for CIFS which pretends to understand the Posix ACL xattrs, but doesn't > use any of the infrastructure as it seems to rely on server side > enforcement. 9P is going to be fun to deal with; that's why I've ended up abandoning vfs.git#experimental-xattr last year. We probably want to move FIDs from dentries to inodes there, and rely in ->getxattr() et.al. upon having already done ->d_revalidate() on some dentry for that inode. Another pile of fun is fsnotify_xattr() call in __vfs_setxattr_noperm() and the whole misbegotten IMA/EVM mess ;-/ See #experimental-xattr - a lot of stuff in that direction is sitting there; might turn out to be useful. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] ceph: fix posix ACL hooks
On Thu, Jan 30, 2014 at 02:01:38PM -0800, Linus Torvalds wrote: > In the end, all the original call-sites should have a dentry, and none > of this is "fundamental". But you're right, it looks like an absolute > nightmare to add the dentry pointer through the whole chain. Damn. > > So I'm not thrilled about it, but maybe that "d_find_alias(inode)" to > find the dentry is good enough in practice. It feels very much > incorrect (it could find a dentry with a path that you cannot actually > access on the server, and result in user-visible errors), but I > definitely see your argument. It may just not be worth the pain for > this odd ceph case. It's not just ceph. 9p fundamentally needs it and I really want to convert 9p to the new code so that we can get rid of the lower level interfaces entirely and eventually move ACL dispatching entirely into the VFS. The same d_find_alias hack should work for 9p as well, although spreading this even more gets uglier and uglier. Similarly for CIFS which pretends to understand the Posix ACL xattrs, but doesn't use any of the infrastructure as it seems to rely on server side enforcement. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] ceph: fix posix ACL hooks
On Thu, 30 Jan 2014, Linus Torvalds wrote: > On Wed, Jan 29, 2014 at 11:54 PM, Christoph Hellwig > wrote: > > > > For ->set_acl that's fairly easily doable and I actually had a version > > doing that to be able to convert 9p. But for ->get_acl the path walking > > caller didn't seem easily feasible. ->get_acl actually is an invention > > of yours, so if you got a good idea to get the dentry to it I'd love > > to be able to pass it. > > Yeah, that's pretty annoying, largely because that path is also > RCU-walk aware, which does *not* need this all (because it will never > call down into the filesystem - if the acl isn't found in the cached > acl's, we just abort). > > And we're going through that very common "generic_permission()" thing > that in turn is also often called from the low-level filesystens, and > it's all fairly tightly integrated with __inode_permission() etc. > > In the end, all the original call-sites should have a dentry, and none > of this is "fundamental". But you're right, it looks like an absolute > nightmare to add the dentry pointer through the whole chain. Damn. > > So I'm not thrilled about it, but maybe that "d_find_alias(inode)" to > find the dentry is good enough in practice. It feels very much > incorrect (it could find a dentry with a path that you cannot actually > access on the server, and result in user-visible errors), but I > definitely see your argument. It may just not be worth the pain for > this odd ceph case. > > That said, if the ceph people decide to try to bite the bullet and do > the required conversions to pass the dentry to the permissions > functions, I think I'd take it unless it ends up being *too* horribly > messy. FWIW the dentry isn't useful in the get case; it's only on put that it is currently used. And now that I look closely, it is only being used by ceph_setattr to associate the update with the parent directory for the purposes of fsync(dirfd)... which is, I think, incorrect anyway (that should only flush out/wait for namespace modifications, not inode attr updates). So I think it's fine as is, and we'll clean this up later. I do have a couple patches on top of what's in your tree, though, that clean up a couple duplicated lines in your fix and apply Christoph's cleanup: git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client.git for-linus Thanks! sage -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] ceph: fix posix ACL hooks
On Wed, Jan 29, 2014 at 11:54 PM, Christoph Hellwig wrote: > > For ->set_acl that's fairly easily doable and I actually had a version > doing that to be able to convert 9p. But for ->get_acl the path walking > caller didn't seem easily feasible. ->get_acl actually is an invention > of yours, so if you got a good idea to get the dentry to it I'd love > to be able to pass it. Yeah, that's pretty annoying, largely because that path is also RCU-walk aware, which does *not* need this all (because it will never call down into the filesystem - if the acl isn't found in the cached acl's, we just abort). And we're going through that very common "generic_permission()" thing that in turn is also often called from the low-level filesystens, and it's all fairly tightly integrated with __inode_permission() etc. In the end, all the original call-sites should have a dentry, and none of this is "fundamental". But you're right, it looks like an absolute nightmare to add the dentry pointer through the whole chain. Damn. So I'm not thrilled about it, but maybe that "d_find_alias(inode)" to find the dentry is good enough in practice. It feels very much incorrect (it could find a dentry with a path that you cannot actually access on the server, and result in user-visible errors), but I definitely see your argument. It may just not be worth the pain for this odd ceph case. That said, if the ceph people decide to try to bite the bullet and do the required conversions to pass the dentry to the permissions functions, I think I'd take it unless it ends up being *too* horribly messy. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] ceph: fix posix ACL hooks
On Wed, Jan 29, 2014 at 11:09:18AM -0800, Linus Torvalds wrote: > So attached is the incremental diff of the patch by Sage and Ilya, and > I'll apply it (delayed a bit to see if I can get the sign-off from > Ilya), but I also think we should fix the (non-cached) ACL functions > that call down to the filesystem layer to also get the dentry. For ->set_acl that's fairly easily doable and I actually had a version doing that to be able to convert 9p. But for ->get_acl the path walking caller didn't seem easily feasible. ->get_acl actually is an invention of yours, so if you got a good idea to get the dentry to it I'd love to be able to pass it. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] ceph: fix posix ACL hooks
On Wed, Jan 29, 2014 at 6:37 PM, Ilya Dryomov wrote: > From: Sage Weil > > The merge of 7221fe4c2 raced with upstream changes in the generic POSIX > ACL code (2aeccbe95). Update Ceph to use the new helpers as well by > dropping the now-generic functions and setting the set_acl inode op. > > Signed-off-by: Sage Weil Signed-off-by: Ilya Dryomov Thanks, Ilya -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] ceph: fix posix ACL hooks
On Wed, Jan 29, 2014 at 8:37 AM, Ilya Dryomov wrote: > From: Sage Weil > > The merge of 7221fe4c2 raced with upstream changes in the generic POSIX > ACL code (2aeccbe95). Update Ceph to use the new helpers as well by > dropping the now-generic functions and setting the set_acl inode op. > > Signed-off-by: Sage Weil Ok, I already committed my untested (and broken) patch yesterday, because even a broken tree is better than a non-*compiling* tree for testing. I created a diff from my current tree to this, and will happily apply it, but the sign-off chain is now broken (Ilya didn't sign off on his changes to Sage's patch. Not that it really matters for what is really a one-liner change, but I thought I'd mention it, since another issue came up with this patch.. Ceph now does struct dentry *dentry = d_find_alias(inode); in its ceph_set_acl() function, and that's because the VFS layer doesn't pass down the dentry to the acl code any more. Christoph - that sounds like a misfeature in the new interfaces. I realize that for traditional unix filesystems, the path is irrelevant for an inode operation, but the thing is, from a Linux VFS standpoint and a conceptual standpoint, the dentry really is the more important and unique one, and some filesystems want the dentry because they are *correctly* designed to be about pathnames. Network filesystems that are pass file descriptors around (ie NFS etc) are not the "RightWay(tm)" of doing things, so I think that it is quite reasonable for ceph to want the *path* to the inode and not just the inode. So attached is the incremental diff of the patch by Sage and Ilya, and I'll apply it (delayed a bit to see if I can get the sign-off from Ilya), but I also think we should fix the (non-cached) ACL functions that call down to the filesystem layer to also get the dentry. We already do that for the xattr functions, just not for get_acl()/set_acl()/acl_chmod(). Christoph, Al? Yes, most filesystems will ignore the dentry pointer, but still.. Linus fs/ceph/acl.c | 9 - fs/ceph/dir.c | 1 + fs/ceph/inode.c | 2 ++ fs/ceph/super.h | 3 +++ 4 files changed, 10 insertions(+), 5 deletions(-) diff --git a/fs/ceph/acl.c b/fs/ceph/acl.c index f6911284c9bd..66d377a12f7c 100644 --- a/fs/ceph/acl.c +++ b/fs/ceph/acl.c @@ -107,14 +107,14 @@ struct posix_acl *ceph_get_acl(struct inode *inode, int type) return acl; } -static int ceph_set_acl(struct dentry *dentry, struct inode *inode, - struct posix_acl *acl, int type) +int ceph_set_acl(struct inode *inode, struct posix_acl *acl, int type) { int ret = 0, size = 0; const char *name = NULL; char *value = NULL; struct iattr newattrs; umode_t new_mode = inode->i_mode, old_mode = inode->i_mode; + struct dentry *dentry = d_find_alias(inode); if (acl) { ret = posix_acl_valid(acl); @@ -208,8 +208,7 @@ int ceph_init_acl(struct dentry *dentry, struct inode *inode, struct inode *dir) if (IS_POSIXACL(dir) && acl) { if (S_ISDIR(inode->i_mode)) { - ret = ceph_set_acl(dentry, inode, acl, - ACL_TYPE_DEFAULT); + ret = ceph_set_acl(inode, acl, ACL_TYPE_DEFAULT); if (ret) goto out_release; } @@ -217,7 +216,7 @@ int ceph_init_acl(struct dentry *dentry, struct inode *inode, struct inode *dir) if (ret < 0) goto out; else if (ret > 0) - ret = ceph_set_acl(dentry, inode, acl, ACL_TYPE_ACCESS); + ret = ceph_set_acl(inode, acl, ACL_TYPE_ACCESS); else cache_no_acl(inode); } else { diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c index 619616d585b0..6da4df84ba30 100644 --- a/fs/ceph/dir.c +++ b/fs/ceph/dir.c @@ -1303,6 +1303,7 @@ const struct inode_operations ceph_dir_iops = { .listxattr = ceph_listxattr, .removexattr = ceph_removexattr, .get_acl = ceph_get_acl, + .set_acl = ceph_set_acl, .mknod = ceph_mknod, .symlink = ceph_symlink, .mkdir = ceph_mkdir, diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c index 8b8b506636cc..32d519d8a2e2 100644 --- a/fs/ceph/inode.c +++ b/fs/ceph/inode.c @@ -97,6 +97,7 @@ const struct inode_operations ceph_file_iops = { .listxattr = ceph_listxattr, .removexattr = ceph_removexattr, .get_acl = ceph_get_acl, + .set_acl = ceph_set_acl, }; @@ -1616,6 +1617,7 @@ static const struct inode_operations ceph_symlink_iops = { .listxattr = ceph_listxattr, .removexattr = ceph_removexattr, .get_acl = ceph_get_acl, + .set_acl = ceph_set_acl, }; /* diff --git a/fs/ceph/super.h b/fs/ceph/super.h index 345933948b6e..aa260590f615 100644 --- a/fs/ceph/super.h
[PATCH v2] ceph: fix posix ACL hooks
From: Sage Weil The merge of 7221fe4c2 raced with upstream changes in the generic POSIX ACL code (2aeccbe95). Update Ceph to use the new helpers as well by dropping the now-generic functions and setting the set_acl inode op. Signed-off-by: Sage Weil --- v2: fixed the !CONFIG_CEPH_FS_POSIX_ACL case fs/ceph/acl.c | 112 +++ fs/ceph/dir.c |1 + fs/ceph/inode.c |5 ++- fs/ceph/super.h |6 +-- fs/ceph/xattr.c |5 ++- 5 files changed, 16 insertions(+), 113 deletions(-) diff --git a/fs/ceph/acl.c b/fs/ceph/acl.c index 64fddbc1d17b..66d377a12f7c 100644 --- a/fs/ceph/acl.c +++ b/fs/ceph/acl.c @@ -107,14 +107,14 @@ struct posix_acl *ceph_get_acl(struct inode *inode, int type) return acl; } -static int ceph_set_acl(struct dentry *dentry, struct inode *inode, - struct posix_acl *acl, int type) +int ceph_set_acl(struct inode *inode, struct posix_acl *acl, int type) { int ret = 0, size = 0; const char *name = NULL; char *value = NULL; struct iattr newattrs; umode_t new_mode = inode->i_mode, old_mode = inode->i_mode; + struct dentry *dentry = d_find_alias(inode); if (acl) { ret = posix_acl_valid(acl); @@ -208,16 +208,15 @@ int ceph_init_acl(struct dentry *dentry, struct inode *inode, struct inode *dir) if (IS_POSIXACL(dir) && acl) { if (S_ISDIR(inode->i_mode)) { - ret = ceph_set_acl(dentry, inode, acl, - ACL_TYPE_DEFAULT); + ret = ceph_set_acl(inode, acl, ACL_TYPE_DEFAULT); if (ret) goto out_release; } - ret = posix_acl_create(&acl, GFP_NOFS, &inode->i_mode); + ret = __posix_acl_create(&acl, GFP_NOFS, &inode->i_mode); if (ret < 0) goto out; else if (ret > 0) - ret = ceph_set_acl(dentry, inode, acl, ACL_TYPE_ACCESS); + ret = ceph_set_acl(inode, acl, ACL_TYPE_ACCESS); else cache_no_acl(inode); } else { @@ -229,104 +228,3 @@ out_release: out: return ret; } - -int ceph_acl_chmod(struct dentry *dentry, struct inode *inode) -{ - struct posix_acl *acl; - int ret = 0; - - if (S_ISLNK(inode->i_mode)) { - ret = -EOPNOTSUPP; - goto out; - } - - if (!IS_POSIXACL(inode)) - goto out; - - acl = ceph_get_acl(inode, ACL_TYPE_ACCESS); - if (IS_ERR_OR_NULL(acl)) { - ret = PTR_ERR(acl); - goto out; - } - - ret = posix_acl_chmod(&acl, GFP_KERNEL, inode->i_mode); - if (ret) - goto out; - ret = ceph_set_acl(dentry, inode, acl, ACL_TYPE_ACCESS); - posix_acl_release(acl); -out: - return ret; -} - -static int ceph_xattr_acl_get(struct dentry *dentry, const char *name, - void *value, size_t size, int type) -{ - struct posix_acl *acl; - int ret = 0; - - if (!IS_POSIXACL(dentry->d_inode)) - return -EOPNOTSUPP; - - acl = ceph_get_acl(dentry->d_inode, type); - if (IS_ERR(acl)) - return PTR_ERR(acl); - if (acl == NULL) - return -ENODATA; - - ret = posix_acl_to_xattr(&init_user_ns, acl, value, size); - posix_acl_release(acl); - - return ret; -} - -static int ceph_xattr_acl_set(struct dentry *dentry, const char *name, - const void *value, size_t size, int flags, int type) -{ - int ret = 0; - struct posix_acl *acl = NULL; - - if (!inode_owner_or_capable(dentry->d_inode)) { - ret = -EPERM; - goto out; - } - - if (!IS_POSIXACL(dentry->d_inode)) { - ret = -EOPNOTSUPP; - goto out; - } - - if (value) { - acl = posix_acl_from_xattr(&init_user_ns, value, size); - if (IS_ERR(acl)) { - ret = PTR_ERR(acl); - goto out; - } - - if (acl) { - ret = posix_acl_valid(acl); - if (ret) - goto out_release; - } - } - - ret = ceph_set_acl(dentry, dentry->d_inode, acl, type); - -out_release: - posix_acl_release(acl); -out: - return ret; -} - -const struct xattr_handler ceph_xattr_acl_default_handler = { - .prefix = POSIX_ACL_XATTR_DEFAULT, - .flags = ACL_TYPE_DEFAULT, - .get= ceph_xattr_acl_get, - .set= ceph_xattr_acl_set, -}; - -const struct xattr_handler ceph_xattr_acl_access_handler = { - .prefix = POSIX_ACL_XATTR_ACCESS, - .flags = ACL_TYPE_ACCESS, - .get