Re: [PATCH v2] ceph: fix posix ACL hooks

2014-02-06 Thread Jeremy Allison
On Mon, Feb 03, 2014 at 10:31:27PM +, Al Viro wrote:
> 
> > And the fact is, filesystems with hardlinks and path-name-based
> > operations do exist. cifs with the unix extensions is one of them.
> 
> Pox on Tridge...

Actually you have to blame me for that. Tridge always
*HATED* the UNIX extensions :-).
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] ceph: fix posix ACL hooks

2014-02-04 Thread Linus Torvalds
On Tue, Feb 4, 2014 at 3:33 AM, Steven Whitehouse  wrote:
>
> The other question that I have relating to that side of things, is why
> security_inode_permission() is called from __inode_permission() rather
> than from generic_permission() ? Maybe there is a good reason, but I
> can't immediately see what it is at the moment.

"generic_permission()" is just a helper that implements the default
UNIX permissions, and won't necessarily even be called. A filesystem
could decide not to call it at all, and in fact there are cases that
don't (eg coda or the bad_inode case).

The inode_permission() class of helpers, in contrast, is what gets
called by the VFS layer itself. So if you want to catch all permission
checks (and that would be security_inode_permission()) then you need
to catch it there.

 Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] ceph: fix posix ACL hooks

2014-02-04 Thread Christoph Hellwig
On Tue, Feb 04, 2014 at 11:33:35AM +, Steven Whitehouse wrote:
> To diverge from that topic for a moment, this thread has also brought
> together some discussion on another issue which I've been pondering
> recently that of whether the inode operations for get/set_xattr
> should take a dentry or not. I had thought that we'd come to the
> conclusion that 9p made it impossible to swap the current dentry
> argument for an inode, and I was about to send a patch for selinux
> support on clustered fs on that basis. However the discussion in this
> thread has made me wonder whether that really is the case or not Al,
> can you confirm whether your xattr-experimental patches are still under
> active consideration?

My plan was to work on the 9p and cifs conversions using the
d_find_alias hack we have in ceph right now.  That means the base work
could switch to passed in dentries or in case of 9p the per-inode fids
easily.

> The other question that I have relating to that side of things, is why
> security_inode_permission() is called from __inode_permission() rather
> than from generic_permission() ? Maybe there is a good reason, but I
> can't immediately see what it is at the moment.

Seems like almost everything of the security_* family is called from the
VFS instead of the filesystem.  There's also some very odd other
behaviour in there, e.g. for the xattrs sets are handed to the
filesystem first, and then the xattr layer calls into the security
layer, which for reads the filesystems is never reached at all.

> In response to the question elsewhere about GFS2 calling
> gfs2_permission() after the vfs has already done its checks, that is
> indeed down to needing to ensure that we have the cluster locks when
> this check is called. More importantly to know that things haven't
> changed since the VFS called the same function in case we've raced with
> another node changing the permissions, for example. There are a number
> of cases where we redo vfs level checks for this reason,

Seems like we should be able to grab a cluster lock where we grab
i_mutex in the namespace code to avoid having to redo all these checks.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] ceph: fix posix ACL hooks

2014-02-04 Thread Steven Whitehouse
On Mon, 2014-02-03 at 22:40 +, Al Viro wrote:
> > >> +static int gfs2_vfs_permission(struct dentry *dentry, struct inode 
> > >> *inode, int mask)
> > >> +{
> > >> + return gfs2_permission(inode, mask);
> > >> +}
> > >
> > > Er...  You do realize that callers of gfs2_permission() tend to have
> > > the dentry in question, either directly or as ->d_parent of something
> > > they have?
> > 
> > Not true. Look closer.
> > 
> > Look at gfs2_lookupi() in particular, and check how it is called.
> 
> Yeowch...  gfs2_ok_to_move() is particulary nasty...  WTF do we need
> it for and why is it not racy as hell?

Well, the original intent was to prevent us from moving a directory into
one of its subdirectories, and thus creating a loop. It is only called
when the rename is moving a directory, and when the parent directories
(source and destination) are different.

I know there is a problem there, recently reported by Bruce Fields which
he came across when looking at the d_splice_alias issue. The bug that
Bruce found only shows up in the clustered case, and not in the single
node case though.

Which particular race are you worried about? This check is covered by
the rename glock, which is basically a cluster wide version of the vfs
level ->s_vfs_rename_mutex.

To diverge from that topic for a moment, this thread has also brought
together some discussion on another issue which I've been pondering
recently that of whether the inode operations for get/set_xattr
should take a dentry or not. I had thought that we'd come to the
conclusion that 9p made it impossible to swap the current dentry
argument for an inode, and I was about to send a patch for selinux
support on clustered fs on that basis. However the discussion in this
thread has made me wonder whether that really is the case or not Al,
can you confirm whether your xattr-experimental patches are still under
active consideration?

The other question that I have relating to that side of things, is why
security_inode_permission() is called from __inode_permission() rather
than from generic_permission() ? Maybe there is a good reason, but I
can't immediately see what it is at the moment.

In response to the question elsewhere about GFS2 calling
gfs2_permission() after the vfs has already done its checks, that is
indeed down to needing to ensure that we have the cluster locks when
this check is called. More importantly to know that things haven't
changed since the VFS called the same function in case we've raced with
another node changing the permissions, for example. There are a number
of cases where we redo vfs level checks for this reason,

Steve.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] ceph: fix posix ACL hooks

2014-02-03 Thread Linus Torvalds
On Mon, Feb 3, 2014 at 2:40 PM, Al Viro  wrote:
>
> Umm...  Point, but that actually means that we get an extra pitfall for
> filesystem writers here.  foo_permission() passes dentry (now that it
> has one) to foo_wank_a_lot(), with the latter using dentry->d_inode at
> some point...

I agree.

The good news, though, is that in the RCU lookup case, we have that
MAY_NOT_BLOCK thing, and most filesystems will have errored out for
any complex operations.

RCU lookup is special, and complicated, and sadly, the permissions
checking is very much part of that. But for the really complex cases,
at least we can punt.

>> Look at gfs2_lookupi() in particular, and check how it is called.
>
> Yeowch...  gfs2_ok_to_move() is particulary nasty...  WTF do we need
> it for and why is it not racy as hell?

I don't know. And I suspect that for things like the journal index
file lookup (which is actually worse - see gfs2_jindex_hold()) we
don't really about the permissions, since this is just done at
init_journal() time.

So I think all of this is quite solvable for gfs2, it just wasn't the
obvious kind of "we already have the dentry" case that every single
other case I looked at was.

 Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] ceph: fix posix ACL hooks

2014-02-03 Thread Al Viro
On Mon, Feb 03, 2014 at 02:12:00PM -0800, Linus Torvalds wrote:

> >> -int afs_permission(struct inode *inode, int mask)
> >> +int afs_permission(struct dentry *dentry, struct inode *inode, int mask)
> >
> > Oh, _lovely_.  So not only do we pass dentry, the arguments are redundant
> > as well.
> 
> Note that *not* passing in inode would make the patch much bigger,
> because now every filesystem would have to add the
> 
>struct inode *inode = dentry->d_inode;
> 
> at the top.
> 
> Also, I'm not actually convinced it is redundant at all. Remember the
> RCU lookup case? dentry->d_inode is not safe.

Umm...  Point, but that actually means that we get an extra pitfall for
filesystem writers here.  foo_permission() passes dentry (now that it
has one) to foo_wank_a_lot(), with the latter using dentry->d_inode at
some point...

> >> +static int gfs2_vfs_permission(struct dentry *dentry, struct inode 
> >> *inode, int mask)
> >> +{
> >> + return gfs2_permission(inode, mask);
> >> +}
> >
> > Er...  You do realize that callers of gfs2_permission() tend to have
> > the dentry in question, either directly or as ->d_parent of something
> > they have?
> 
> Not true. Look closer.
> 
> Look at gfs2_lookupi() in particular, and check how it is called.

Yeowch...  gfs2_ok_to_move() is particulary nasty...  WTF do we need
it for and why is it not racy as hell?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] ceph: fix posix ACL hooks

2014-02-03 Thread Al Viro
On Mon, Feb 03, 2014 at 01:44:22PM -0800, Linus Torvalds wrote:
> On Mon, Feb 3, 2014 at 1:39 PM, Al Viro  wrote:
> >
> > If we really have hardlinks, the result of permission check would better
> > be a function of inode itself - as in, "if it gives different results
> > for two pathnames reachable for the same user, we have a bug".
  ^
> No. You're wrong.
> 
> EVEN ON A UNIX FILESYSTEM THE PATH IS MEANINGFUL.
> 
> Do this: create a hardlink in two different directories. Make the
> *directory* permissions for one of the directories be something you
> cannot traverse. Now try to check the permissions of the *same* inode
> through those two paths. Notice how you get *different* results.
> 
> Really.

Yes.  In one case we won't get to looking at the permissions of that thing
at all.
 
> Now, imagine that you are doing the same thing over a network. On the
> server, there may be a single inode for the file, but when the client
> gives the server a pathname, the two pathnames to that single inode
> ARE NOT EQUIVALENT.

Why do we pretend that those are links, in the first place?

> And the fact is, filesystems with hardlinks and path-name-based
> operations do exist. cifs with the unix extensions is one of them.

Pox on Tridge...

I really, really hate that change; I can buy "->getxattr() has inconvenient
interface because of lousy protocol design", but spreading the same to
->permission(), with everything that will fall out of that... 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] ceph: fix posix ACL hooks

2014-02-03 Thread Linus Torvalds
On Mon, Feb 3, 2014 at 1:59 PM, Al Viro  wrote:
> On Mon, Feb 03, 2014 at 01:03:32PM -0800, Linus Torvalds wrote:
>
>> - err = vfs_mkdir(path.dentry->d_inode, dentry, mode);
>> + err = vfs_mkdir(path.dentry, dentry, mode);
>
> Pointless - path.dentry == dentry->d_parent anyway.

Heh. It's no less redundant than it used to be.

But if you want to clean up the vfs_xyzzy() ones further, I'm
perfectly fine with that.

>> - err = ll_vfs_rename(dir->d_inode, dchild_old, mnt,
>> - dir->d_inode, dchild_new, mnt, NULL);
>> + err = ll_vfs_rename(dir, dchild_old, mnt,
>> + dir, dchild_new, mnt, NULL);
>
>
> ... and again, that's completely pointless.

Minimal patch.. I really didn't want to check what the heck lustre
does with the insane ll_vfs thing.

>> -int afs_permission(struct inode *inode, int mask)
>> +int afs_permission(struct dentry *dentry, struct inode *inode, int mask)
>
> Oh, _lovely_.  So not only do we pass dentry, the arguments are redundant
> as well.

Note that *not* passing in inode would make the patch much bigger,
because now every filesystem would have to add the

   struct inode *inode = dentry->d_inode;

at the top.

Also, I'm not actually convinced it is redundant at all. Remember the
RCU lookup case? dentry->d_inode is not safe.

The RCU case actually does

  inode_permission(nd->path.dentry, nd->inode, ..)

and here the difference between nd->inode and dentry->d_inode are
relevant, I think.

>> +static int gfs2_vfs_permission(struct dentry *dentry, struct inode *inode, 
>> int mask)
>> +{
>> + return gfs2_permission(inode, mask);
>> +}
>
> Er...  You do realize that callers of gfs2_permission() tend to have
> the dentry in question, either directly or as ->d_parent of something
> they have?

Not true. Look closer.

Look at gfs2_lookupi() in particular, and check how it is called.

I did hate that part of the patch, and I did mention the kinds of
problems this will cause if the next phase passes in dentry to
"generic_permission()".

Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] ceph: fix posix ACL hooks

2014-02-03 Thread Al Viro
On Mon, Feb 03, 2014 at 01:03:32PM -0800, Linus Torvalds wrote:

> - err = vfs_mkdir(path.dentry->d_inode, dentry, mode);
> + err = vfs_mkdir(path.dentry, dentry, mode);

Pointless - path.dentry == dentry->d_parent anyway.
 
> - err = vfs_mknod(path.dentry->d_inode, dentry, mode, dev->devt);
> + err = vfs_mknod(path.dentry, dentry, mode, dev->devt);

Ditto.

> @@ -237,7 +237,7 @@ static int dev_rmdir(const char *name)
>   return PTR_ERR(dentry);
>   if (dentry->d_inode) {
>   if (dentry->d_inode->i_private == &thread)
> - err = vfs_rmdir(parent.dentry->d_inode, dentry);
> + err = vfs_rmdir(parent.dentry, dentry);
 
 Ditto, with s/path/parent/
> @@ -324,7 +324,7 @@ static int handle_remove(const char *nodename, struct 
> device *dev)
>   mutex_lock(&dentry->d_inode->i_mutex);
>   notify_change(dentry, &newattrs, NULL);
>   mutex_unlock(&dentry->d_inode->i_mutex);
> - err = vfs_unlink(parent.dentry->d_inode, dentry, NULL);
> + err = vfs_unlink(parent.dentry, dentry, NULL);
>   if (!err || err == -ENOENT)
>   deleted = 1;
>   }
 
And here as well.
 
> +++ b/drivers/staging/lustre/lustre/lvfs/lvfs_linux.c
> @@ -222,8 +222,8 @@ int lustre_rename(struct dentry *dir, struct vfsmount 
> *mnt,
>   if (IS_ERR(dchild_new))
>   GOTO(put_old, err = PTR_ERR(dchild_new));
>  
> - err = ll_vfs_rename(dir->d_inode, dchild_old, mnt,
> - dir->d_inode, dchild_new, mnt, NULL);
> + err = ll_vfs_rename(dir, dchild_old, mnt,
> + dir, dchild_new, mnt, NULL);


... and again, that's completely pointless.

> -int afs_permission(struct inode *inode, int mask)
> +int afs_permission(struct dentry *dentry, struct inode *inode, int mask)

Oh, _lovely_.  So not only do we pass dentry, the arguments are redundant
as well.

> -static inline int btrfs_may_create(struct inode *dir, struct dentry *child)
> +static inline int btrfs_may_create(struct dentry *parent, struct dentry 
> *child)

I'm fairly sure that it's also pointless, because parent is going to be, well,
the parent.  Of child.

> +static int gfs2_vfs_permission(struct dentry *dentry, struct inode *inode, 
> int mask)
> +{
> + return gfs2_permission(inode, mask);
> +}

Er...  You do realize that callers of gfs2_permission() tend to have
the dentry in question, either directly or as ->d_parent of something
they have?


I really hate the whole thing... ;-/
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] ceph: fix posix ACL hooks

2014-02-03 Thread Linus Torvalds
On Mon, Feb 3, 2014 at 1:39 PM, Al Viro  wrote:
>
> If we really have hardlinks, the result of permission check would better
> be a function of inode itself - as in, "if it gives different results
> for two pathnames reachable for the same user, we have a bug".

No. You're wrong.

EVEN ON A UNIX FILESYSTEM THE PATH IS MEANINGFUL.

Do this: create a hardlink in two different directories. Make the
*directory* permissions for one of the directories be something you
cannot traverse. Now try to check the permissions of the *same* inode
through those two paths. Notice how you get *different* results.

Really.

Now, imagine that you are doing the same thing over a network. On the
server, there may be a single inode for the file, but when the client
gives the server a pathname, the two pathnames to that single inode
ARE NOT EQUIVALENT.

And the fact is, filesystems with hardlinks and path-name-based
operations do exist. cifs with the unix extensions is one of them.

Al, face it, you're wrong this time.

  Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] ceph: fix posix ACL hooks

2014-02-03 Thread Al Viro
On Mon, Feb 03, 2014 at 01:37:03PM -0800, Linus Torvalds wrote:
> On Mon, Feb 3, 2014 at 1:31 PM, Al Viro  wrote:
> >
> > Yes, and...?  CIFS also doesn't have hardlinks, so _there_ d_find_alias()
> > is just fine.
> 
> Hmm? I'm pretty sure cifs can actually have hardlinks.

*UGH*

How the hell does it...  Oh, right - samba on Unix server.  I really wonder
how well do Windows clients deal with those...

Crap...  I reall hate that prototype change ;-/
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] ceph: fix posix ACL hooks

2014-02-03 Thread Al Viro
On Mon, Feb 03, 2014 at 09:39:26PM +, Al Viro wrote:

> Which fs are you talking about?  For 9P it *is* a function of inode alone.
> For CIFS there's no wrong dentry to pick - it doesn't have links to start
> with.

Except that apparently it does ;-/  Bugger...
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] ceph: fix posix ACL hooks

2014-02-03 Thread Al Viro
On Mon, Feb 03, 2014 at 01:31:19PM -0800, Linus Torvalds wrote:

> If the protocol is path-based (and it happens, and it's actually the
> *correct* thing to do for a network filesystem, rather than the
> idiotic "file handle" crap that tries to emulate the unix inode
> semantics in the protocol), then the inode is simply not sufficient.
> 
> And no, d_find_alias() is not correct or sufficient either. It can
> work in practice (and probably does perfectly fine 99.9% of the time),
> but it can equally well give the *wrong* dentry: yes, the dentry it
> returns would have been a valid dentry for somebody at some time, but
> it might be a stale dentry *now*, and it might be the wrong dentry for
> the current user (because the current user may not have permissions to
> that particular path, even if the user has permissions through his
> *own* path).
> 
> So I really think you're *fundamentally* incorrect when you say
> "result *is* a function of inode alone".

Which fs are you talking about?  For 9P it *is* a function of inode alone.
For CIFS there's no wrong dentry to pick - it doesn't have links to start
with.

If we really have hardlinks, the result of permission check would better
be a function of inode itself - as in, "if it gives different results
for two pathnames reachable for the same user, we have a bug".
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] ceph: fix posix ACL hooks

2014-02-03 Thread Linus Torvalds
On Mon, Feb 3, 2014 at 1:31 PM, Al Viro  wrote:
>
> Yes, and...?  CIFS also doesn't have hardlinks, so _there_ d_find_alias()
> is just fine.

Hmm? I'm pretty sure cifs can actually have hardlinks.

 Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] ceph: fix posix ACL hooks

2014-02-03 Thread Christoph Hellwig
On Mon, Feb 03, 2014 at 09:31:53PM +, Al Viro wrote:
> Yes, and...?  CIFS also doesn't have hardlinks, so _there_ d_find_alias()
> is just fine.

It does have hardlinks, look at cifs_hardlink and functions called from
it.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] ceph: fix posix ACL hooks

2014-02-03 Thread Al Viro
On Mon, Feb 03, 2014 at 01:24:47PM -0800, Christoph Hellwig wrote:
> On Mon, Feb 03, 2014 at 09:19:55PM +, Al Viro wrote:
> > Result *is* a function of inode alone; the problem with 9P is that we
> > are caching FIDs in the wrong place.
> 
> I don't think that's true for CIFS unfortunately, which is path based.

Yes, and...?  CIFS also doesn't have hardlinks, so _there_ d_find_alias()
is just fine.

9P is actually trickier; there we need some massage, but for CIFS it's
literally a matter of one function call.

The problem with 9P is that the way we do it right now, you might pick
the wrong dentry.  _Some_ alias already bears the FID you are after,
but if you end up picking the one that doesn't, you might very well
end up being unable to obtain one either.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] ceph: fix posix ACL hooks

2014-02-03 Thread Linus Torvalds
On Mon, Feb 3, 2014 at 1:19 PM, Al Viro  wrote:
>
> Please, don't do that.  Half of that is pointless (e.g. what you are
> doing with vfs_rmdir() - if anything, we could get rid of the first
> argument completely, it's always victim->d_parent->d_inode and we are
> holding enough locks for that to be stable) and ->permission() is
> just plain wrong.
>
> Result *is* a function of inode alone; the problem with 9P is that we
> are caching FIDs in the wrong place.

No, Al. It's *not* just a function of the inode alone. That's the point.

Some network filesystems pass the *path* to the server. Any operation
that needs to check something on the server needs the *dentry*, not
the inode.

This whole "the inode describes the file" mentality comes from old
broken UNIX semantics. It's fundamentally true for local Unix
filesystems, but that's it. It's not true in general.

Sure, many network filesystems then emulate the local Unix filesystem
behavior, so in practice, you get the unix semantics quite often. But
it really is wrong.

If the protocol is path-based (and it happens, and it's actually the
*correct* thing to do for a network filesystem, rather than the
idiotic "file handle" crap that tries to emulate the unix inode
semantics in the protocol), then the inode is simply not sufficient.

And no, d_find_alias() is not correct or sufficient either. It can
work in practice (and probably does perfectly fine 99.9% of the time),
but it can equally well give the *wrong* dentry: yes, the dentry it
returns would have been a valid dentry for somebody at some time, but
it might be a stale dentry *now*, and it might be the wrong dentry for
the current user (because the current user may not have permissions to
that particular path, even if the user has permissions through his
*own* path).

So I really think you're *fundamentally* incorrect when you say
"result *is* a function of inode alone".

  Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] ceph: fix posix ACL hooks

2014-02-03 Thread Christoph Hellwig
On Mon, Feb 03, 2014 at 09:19:55PM +, Al Viro wrote:
> Result *is* a function of inode alone; the problem with 9P is that we
> are caching FIDs in the wrong place.

I don't think that's true for CIFS unfortunately, which is path based.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] ceph: fix posix ACL hooks

2014-02-03 Thread Christoph Hellwig
On Mon, Feb 03, 2014 at 01:03:32PM -0800, Linus Torvalds wrote:
> Now, to be honest, pushing it down one more level (to
> generic_permission()) will actually start causing some trouble. In
> particular, gfs2_permission() fundamentally does not have a dentry for
> several of the callers.

Looking over the gfs2 code the problem seems to be that it duplicates
permissions checks from the may_{lookup,create,linkat,delete}, most
likely because it needs cluster locking in place for them.  The right
fix seems to be to optionally call the filesystem from those.  That
being said I wonder how ocfs2 or network filesystems get away without
that.

> What do you think? I guess this patch could be split up into two: one
> that does the "vfs_xyz()" helper functions, and another that does the
> inode_permission() change. I tied them together mainly because I
> started with the inode_permission() change, and that required the
> vfs_xyz() change.

The changes look good to me, and yes I think they should be split.
I'll see if I can take this further, but doing something non-hacky
in GFS2 would be the first step here.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] ceph: fix posix ACL hooks

2014-02-03 Thread Al Viro
On Mon, Feb 03, 2014 at 01:03:32PM -0800, Linus Torvalds wrote:

> What do you think? I guess this patch could be split up into two: one
> that does the "vfs_xyz()" helper functions, and another that does the
> inode_permission() change. I tied them together mainly because I
> started with the inode_permission() change, and that required the
> vfs_xyz() change.

Please, don't do that.  Half of that is pointless (e.g. what you are
doing with vfs_rmdir() - if anything, we could get rid of the first
argument completely, it's always victim->d_parent->d_inode and we are
holding enough locks for that to be stable) and ->permission() is
just plain wrong.

Result *is* a function of inode alone; the problem with 9P is that we
are caching FIDs in the wrong place.

What really happens is that protocol refers to objects by 32bit tokens,
given by server out to client.  Many of those can correspond to the
same file; think of those as file descriptors.  We are associating them
with dentries, but if you have several links to the same file a FID
acquired for either of them will do.

What we ought to do, AFAICS, is moving these guys from dentry to inode;
sure, to *get* one in the first place we need some dentry.  But by the
time ->setxattr() and friends get to see the inode, the caller has already
done things that make sure that some FID of his exists.

IOW, NAK.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] ceph: fix posix ACL hooks

2014-02-03 Thread Al Viro
On Mon, Feb 03, 2014 at 02:29:43AM -0800, Christoph Hellwig wrote:
> On Thu, Jan 30, 2014 at 02:01:38PM -0800, Linus Torvalds wrote:
> > In the end, all the original call-sites should have a dentry, and none
> > of this is "fundamental". But you're right, it looks like an absolute
> > nightmare to add the dentry pointer through the whole chain. Damn.
> > 
> > So I'm not thrilled about it, but maybe that "d_find_alias(inode)" to
> > find the dentry is good enough in practice. It feels very much
> > incorrect (it could find a dentry with a path that you cannot actually
> > access on the server, and result in user-visible errors), but I
> > definitely see your argument. It may just not be worth the pain for
> > this odd ceph case.
> 
> It's not just ceph.  9p fundamentally needs it and I really want to
> convert 9p to the new code so that we can get rid of the lower level
> interfaces entirely and eventually move ACL dispatching entirely
> into the VFS.  The same d_find_alias hack should work for 9p as well,
> although spreading this even more gets uglier and uglier.  Similarly
> for CIFS which pretends to understand the Posix ACL xattrs, but doesn't
> use any of the infrastructure as it seems to rely on server side
> enforcement.

9P is going to be fun to deal with; that's why I've ended up abandoning
vfs.git#experimental-xattr last year.  We probably want to move FIDs
from dentries to inodes there, and rely in ->getxattr() et.al. upon
having already done ->d_revalidate() on some dentry for that inode.

Another pile of fun is fsnotify_xattr() call in __vfs_setxattr_noperm()
and the whole misbegotten IMA/EVM mess ;-/  

See #experimental-xattr - a lot of stuff in that direction is sitting there;
might turn out to be useful.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] ceph: fix posix ACL hooks

2014-02-03 Thread Christoph Hellwig
On Thu, Jan 30, 2014 at 02:01:38PM -0800, Linus Torvalds wrote:
> In the end, all the original call-sites should have a dentry, and none
> of this is "fundamental". But you're right, it looks like an absolute
> nightmare to add the dentry pointer through the whole chain. Damn.
> 
> So I'm not thrilled about it, but maybe that "d_find_alias(inode)" to
> find the dentry is good enough in practice. It feels very much
> incorrect (it could find a dentry with a path that you cannot actually
> access on the server, and result in user-visible errors), but I
> definitely see your argument. It may just not be worth the pain for
> this odd ceph case.

It's not just ceph.  9p fundamentally needs it and I really want to
convert 9p to the new code so that we can get rid of the lower level
interfaces entirely and eventually move ACL dispatching entirely
into the VFS.  The same d_find_alias hack should work for 9p as well,
although spreading this even more gets uglier and uglier.  Similarly
for CIFS which pretends to understand the Posix ACL xattrs, but doesn't
use any of the infrastructure as it seems to rely on server side
enforcement.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] ceph: fix posix ACL hooks

2014-01-30 Thread Sage Weil
On Thu, 30 Jan 2014, Linus Torvalds wrote:
> On Wed, Jan 29, 2014 at 11:54 PM, Christoph Hellwig  
> wrote:
> >
> > For ->set_acl that's fairly easily doable and I actually had a version
> > doing that to be able to convert 9p.  But for ->get_acl the path walking
> > caller didn't seem easily feasible.  ->get_acl actually is an invention
> > of yours, so if you got a good idea to get the dentry to it I'd love
> > to be able to pass it.
> 
> Yeah, that's pretty annoying, largely because that path is also
> RCU-walk aware, which does *not* need this all (because it will never
> call down into the filesystem - if the acl isn't found in the cached
> acl's, we just abort).
> 
> And we're going through that very common "generic_permission()" thing
> that in turn is also often called from the low-level filesystens, and
> it's all fairly tightly integrated with __inode_permission() etc.
> 
> In the end, all the original call-sites should have a dentry, and none
> of this is "fundamental". But you're right, it looks like an absolute
> nightmare to add the dentry pointer through the whole chain. Damn.
> 
> So I'm not thrilled about it, but maybe that "d_find_alias(inode)" to
> find the dentry is good enough in practice. It feels very much
> incorrect (it could find a dentry with a path that you cannot actually
> access on the server, and result in user-visible errors), but I
> definitely see your argument. It may just not be worth the pain for
> this odd ceph case.
> 
> That said, if the ceph people decide to try to bite the bullet and do
> the required conversions to pass the dentry to the permissions
> functions, I think I'd take it unless it ends up being *too* horribly
> messy.

FWIW the dentry isn't useful in the get case; it's only on put that it is 
currently used.  And now that I look closely, it is only being used by 
ceph_setattr to associate the update with the parent directory for the 
purposes of fsync(dirfd)... which is, I think, incorrect anyway (that 
should only flush out/wait for namespace modifications, not inode attr 
updates).

So I think it's fine as is, and we'll clean this up later.

I do have a couple patches on top of what's in your tree, though, that 
clean up a couple duplicated lines in your fix and apply Christoph's 
cleanup:

 git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client.git for-linus

Thanks!
sage
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] ceph: fix posix ACL hooks

2014-01-30 Thread Linus Torvalds
On Wed, Jan 29, 2014 at 11:54 PM, Christoph Hellwig  wrote:
>
> For ->set_acl that's fairly easily doable and I actually had a version
> doing that to be able to convert 9p.  But for ->get_acl the path walking
> caller didn't seem easily feasible.  ->get_acl actually is an invention
> of yours, so if you got a good idea to get the dentry to it I'd love
> to be able to pass it.

Yeah, that's pretty annoying, largely because that path is also
RCU-walk aware, which does *not* need this all (because it will never
call down into the filesystem - if the acl isn't found in the cached
acl's, we just abort).

And we're going through that very common "generic_permission()" thing
that in turn is also often called from the low-level filesystens, and
it's all fairly tightly integrated with __inode_permission() etc.

In the end, all the original call-sites should have a dentry, and none
of this is "fundamental". But you're right, it looks like an absolute
nightmare to add the dentry pointer through the whole chain. Damn.

So I'm not thrilled about it, but maybe that "d_find_alias(inode)" to
find the dentry is good enough in practice. It feels very much
incorrect (it could find a dentry with a path that you cannot actually
access on the server, and result in user-visible errors), but I
definitely see your argument. It may just not be worth the pain for
this odd ceph case.

That said, if the ceph people decide to try to bite the bullet and do
the required conversions to pass the dentry to the permissions
functions, I think I'd take it unless it ends up being *too* horribly
messy.

   Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] ceph: fix posix ACL hooks

2014-01-29 Thread Christoph Hellwig
On Wed, Jan 29, 2014 at 11:09:18AM -0800, Linus Torvalds wrote:
> So attached is the incremental diff of the patch by Sage and Ilya, and
> I'll apply it (delayed a bit to see if I can get the sign-off from
> Ilya), but I also think we should fix the (non-cached) ACL functions
> that call down to the filesystem layer to also get the dentry.

For ->set_acl that's fairly easily doable and I actually had a version
doing that to be able to convert 9p.  But for ->get_acl the path walking
caller didn't seem easily feasible.  ->get_acl actually is an invention
of yours, so if you got a good idea to get the dentry to it I'd love
to be able to pass it.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] ceph: fix posix ACL hooks

2014-01-29 Thread Ilya Dryomov
On Wed, Jan 29, 2014 at 6:37 PM, Ilya Dryomov  wrote:
> From: Sage Weil 
>
> The merge of 7221fe4c2 raced with upstream changes in the generic POSIX
> ACL code (2aeccbe95).  Update Ceph to use the new helpers as well by
> dropping the now-generic functions and setting the set_acl inode op.
>
> Signed-off-by: Sage Weil 

Signed-off-by: Ilya Dryomov 

Thanks,

Ilya
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] ceph: fix posix ACL hooks

2014-01-29 Thread Linus Torvalds
On Wed, Jan 29, 2014 at 8:37 AM, Ilya Dryomov  wrote:
> From: Sage Weil 
>
> The merge of 7221fe4c2 raced with upstream changes in the generic POSIX
> ACL code (2aeccbe95).  Update Ceph to use the new helpers as well by
> dropping the now-generic functions and setting the set_acl inode op.
>
> Signed-off-by: Sage Weil 

Ok, I already committed my untested (and broken) patch yesterday,
because even a broken tree is better than a non-*compiling* tree for
testing.

I created a diff from my current tree to this, and will happily apply
it, but the sign-off chain is now broken (Ilya didn't sign off on his
changes to Sage's patch. Not that it really matters for what is really
a one-liner change, but I thought I'd mention it, since another issue
came up with this patch..

Ceph now does

struct dentry *dentry = d_find_alias(inode);

in its ceph_set_acl() function, and that's because the VFS layer
doesn't pass down the dentry to the acl code any more.

Christoph - that sounds like a misfeature in the new interfaces. I
realize that for traditional unix filesystems, the path is irrelevant
for an inode operation, but the thing is, from a Linux VFS standpoint
and a conceptual standpoint, the dentry really is the more important
and unique one, and some filesystems want the dentry because they are
*correctly* designed to be about pathnames.

Network filesystems that are pass file descriptors around (ie NFS etc)
are not the "RightWay(tm)" of doing things, so I think that it is
quite reasonable for ceph to want the *path* to the inode and not just
the inode.

So attached is the incremental diff of the patch by Sage and Ilya, and
I'll apply it (delayed a bit to see if I can get the sign-off from
Ilya), but I also think we should fix the (non-cached) ACL functions
that call down to the filesystem layer to also get the dentry.

We already do that for the xattr functions, just not for
get_acl()/set_acl()/acl_chmod().

Christoph, Al? Yes, most filesystems will ignore the dentry pointer, but still..

   Linus
 fs/ceph/acl.c   | 9 -
 fs/ceph/dir.c   | 1 +
 fs/ceph/inode.c | 2 ++
 fs/ceph/super.h | 3 +++
 4 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/fs/ceph/acl.c b/fs/ceph/acl.c
index f6911284c9bd..66d377a12f7c 100644
--- a/fs/ceph/acl.c
+++ b/fs/ceph/acl.c
@@ -107,14 +107,14 @@ struct posix_acl *ceph_get_acl(struct inode *inode, int 
type)
return acl;
 }
 
-static int ceph_set_acl(struct dentry *dentry, struct inode *inode,
-   struct posix_acl *acl, int type)
+int ceph_set_acl(struct inode *inode, struct posix_acl *acl, int type)
 {
int ret = 0, size = 0;
const char *name = NULL;
char *value = NULL;
struct iattr newattrs;
umode_t new_mode = inode->i_mode, old_mode = inode->i_mode;
+   struct dentry *dentry = d_find_alias(inode);
 
if (acl) {
ret = posix_acl_valid(acl);
@@ -208,8 +208,7 @@ int ceph_init_acl(struct dentry *dentry, struct inode 
*inode, struct inode *dir)
 
if (IS_POSIXACL(dir) && acl) {
if (S_ISDIR(inode->i_mode)) {
-   ret = ceph_set_acl(dentry, inode, acl,
-   ACL_TYPE_DEFAULT);
+   ret = ceph_set_acl(inode, acl, ACL_TYPE_DEFAULT);
if (ret)
goto out_release;
}
@@ -217,7 +216,7 @@ int ceph_init_acl(struct dentry *dentry, struct inode 
*inode, struct inode *dir)
if (ret < 0)
goto out;
else if (ret > 0)
-   ret = ceph_set_acl(dentry, inode, acl, ACL_TYPE_ACCESS);
+   ret = ceph_set_acl(inode, acl, ACL_TYPE_ACCESS);
else
cache_no_acl(inode);
} else {
diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c
index 619616d585b0..6da4df84ba30 100644
--- a/fs/ceph/dir.c
+++ b/fs/ceph/dir.c
@@ -1303,6 +1303,7 @@ const struct inode_operations ceph_dir_iops = {
.listxattr = ceph_listxattr,
.removexattr = ceph_removexattr,
.get_acl = ceph_get_acl,
+   .set_acl = ceph_set_acl,
.mknod = ceph_mknod,
.symlink = ceph_symlink,
.mkdir = ceph_mkdir,
diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
index 8b8b506636cc..32d519d8a2e2 100644
--- a/fs/ceph/inode.c
+++ b/fs/ceph/inode.c
@@ -97,6 +97,7 @@ const struct inode_operations ceph_file_iops = {
.listxattr = ceph_listxattr,
.removexattr = ceph_removexattr,
.get_acl = ceph_get_acl,
+   .set_acl = ceph_set_acl,
 };
 
 
@@ -1616,6 +1617,7 @@ static const struct inode_operations ceph_symlink_iops = {
.listxattr = ceph_listxattr,
.removexattr = ceph_removexattr,
.get_acl = ceph_get_acl,
+   .set_acl = ceph_set_acl,
 };
 
 /*
diff --git a/fs/ceph/super.h b/fs/ceph/super.h
index 345933948b6e..aa260590f615 100644
--- a/fs/ceph/super.h

[PATCH v2] ceph: fix posix ACL hooks

2014-01-29 Thread Ilya Dryomov
From: Sage Weil 

The merge of 7221fe4c2 raced with upstream changes in the generic POSIX
ACL code (2aeccbe95).  Update Ceph to use the new helpers as well by
dropping the now-generic functions and setting the set_acl inode op.

Signed-off-by: Sage Weil 
---
v2: fixed the !CONFIG_CEPH_FS_POSIX_ACL case

 fs/ceph/acl.c   |  112 +++
 fs/ceph/dir.c   |1 +
 fs/ceph/inode.c |5 ++-
 fs/ceph/super.h |6 +--
 fs/ceph/xattr.c |5 ++-
 5 files changed, 16 insertions(+), 113 deletions(-)

diff --git a/fs/ceph/acl.c b/fs/ceph/acl.c
index 64fddbc1d17b..66d377a12f7c 100644
--- a/fs/ceph/acl.c
+++ b/fs/ceph/acl.c
@@ -107,14 +107,14 @@ struct posix_acl *ceph_get_acl(struct inode *inode, int 
type)
return acl;
 }
 
-static int ceph_set_acl(struct dentry *dentry, struct inode *inode,
-   struct posix_acl *acl, int type)
+int ceph_set_acl(struct inode *inode, struct posix_acl *acl, int type)
 {
int ret = 0, size = 0;
const char *name = NULL;
char *value = NULL;
struct iattr newattrs;
umode_t new_mode = inode->i_mode, old_mode = inode->i_mode;
+   struct dentry *dentry = d_find_alias(inode);
 
if (acl) {
ret = posix_acl_valid(acl);
@@ -208,16 +208,15 @@ int ceph_init_acl(struct dentry *dentry, struct inode 
*inode, struct inode *dir)
 
if (IS_POSIXACL(dir) && acl) {
if (S_ISDIR(inode->i_mode)) {
-   ret = ceph_set_acl(dentry, inode, acl,
-   ACL_TYPE_DEFAULT);
+   ret = ceph_set_acl(inode, acl, ACL_TYPE_DEFAULT);
if (ret)
goto out_release;
}
-   ret = posix_acl_create(&acl, GFP_NOFS, &inode->i_mode);
+   ret = __posix_acl_create(&acl, GFP_NOFS, &inode->i_mode);
if (ret < 0)
goto out;
else if (ret > 0)
-   ret = ceph_set_acl(dentry, inode, acl, ACL_TYPE_ACCESS);
+   ret = ceph_set_acl(inode, acl, ACL_TYPE_ACCESS);
else
cache_no_acl(inode);
} else {
@@ -229,104 +228,3 @@ out_release:
 out:
return ret;
 }
-
-int ceph_acl_chmod(struct dentry *dentry, struct inode *inode)
-{
-   struct posix_acl *acl;
-   int ret = 0;
-
-   if (S_ISLNK(inode->i_mode)) {
-   ret = -EOPNOTSUPP;
-   goto out;
-   }
-
-   if (!IS_POSIXACL(inode))
-   goto out;
-
-   acl = ceph_get_acl(inode, ACL_TYPE_ACCESS);
-   if (IS_ERR_OR_NULL(acl)) {
-   ret = PTR_ERR(acl);
-   goto out;
-   }
-
-   ret = posix_acl_chmod(&acl, GFP_KERNEL, inode->i_mode);
-   if (ret)
-   goto out;
-   ret = ceph_set_acl(dentry, inode, acl, ACL_TYPE_ACCESS);
-   posix_acl_release(acl);
-out:
-   return ret;
-}
-
-static int ceph_xattr_acl_get(struct dentry *dentry, const char *name,
-   void *value, size_t size, int type)
-{
-   struct posix_acl *acl;
-   int ret = 0;
-
-   if (!IS_POSIXACL(dentry->d_inode))
-   return -EOPNOTSUPP;
-
-   acl = ceph_get_acl(dentry->d_inode, type);
-   if (IS_ERR(acl))
-   return PTR_ERR(acl);
-   if (acl == NULL)
-   return -ENODATA;
-
-   ret = posix_acl_to_xattr(&init_user_ns, acl, value, size);
-   posix_acl_release(acl);
-
-   return ret;
-}
-
-static int ceph_xattr_acl_set(struct dentry *dentry, const char *name,
-   const void *value, size_t size, int flags, int type)
-{
-   int ret = 0;
-   struct posix_acl *acl = NULL;
-
-   if (!inode_owner_or_capable(dentry->d_inode)) {
-   ret = -EPERM;
-   goto out;
-   }
-
-   if (!IS_POSIXACL(dentry->d_inode)) {
-   ret = -EOPNOTSUPP;
-   goto out;
-   }
-
-   if (value) {
-   acl = posix_acl_from_xattr(&init_user_ns, value, size);
-   if (IS_ERR(acl)) {
-   ret = PTR_ERR(acl);
-   goto out;
-   }
-
-   if (acl) {
-   ret = posix_acl_valid(acl);
-   if (ret)
-   goto out_release;
-   }
-   }
-
-   ret = ceph_set_acl(dentry, dentry->d_inode, acl, type);
-
-out_release:
-   posix_acl_release(acl);
-out:
-   return ret;
-}
-
-const struct xattr_handler ceph_xattr_acl_default_handler = {
-   .prefix = POSIX_ACL_XATTR_DEFAULT,
-   .flags  = ACL_TYPE_DEFAULT,
-   .get= ceph_xattr_acl_get,
-   .set= ceph_xattr_acl_set,
-};
-
-const struct xattr_handler ceph_xattr_acl_access_handler = {
-   .prefix = POSIX_ACL_XATTR_ACCESS,
-   .flags  = ACL_TYPE_ACCESS,
-   .get