On Fri, Jul 08, 2016 at 12:16:14PM -0400, Oleg Drokin wrote:
> 
> On Jul 8, 2016, at 12:04 PM, J. Bruce Fields wrote:
> 
> > On Fri, Jul 08, 2016 at 11:53:28AM -0400, Jeff Layton wrote:
> >> On Fri, 2016-07-08 at 11:14 -0400, Oleg Drokin wrote:
> >>> On Jul 8, 2016, at 7:02 AM, Jeff Layton wrote:
> >>> 
> >>>> On Thu, 2016-07-07 at 21:47 -0400, Oleg Drokin wrote:
> >>>>> It looks like we are bit overzealous about failing mkdir/create/mknod
> >>>>> with permission denied if the parent dir is not writeable.
> >>>>> Need to make sure the name does not exist first, because we need to
> >>>>> return EEXIST in that case.
> >>>>> 
> >>>>> Signed-off-by: Oleg Drokin <[email protected]>
> >>>>> ---
> >>>>> A very similar problem exists with symlinks, but the patch is more
> >>>>> involved, so assuming this one is ok, I'll send a symlink one 
> >>>>> separately.
> >>>>>  fs/nfsd/nfs4proc.c |  6 +++++-
> >>>>>  fs/nfsd/vfs.c      | 11 ++++++++++-
> >>>>>  2 files changed, 15 insertions(+), 2 deletions(-)
> >>>>> 
> >>>> 
> >>>> 
> >>>> nit: subject says EPERM, but I think you mean EACCES. The mnemonic I've
> >>>> always used is that EPERM is "permanent". IOW, changing permissions
> >>>> won't ever allow the user to do something. For instance, unprivileged
> >>>> users can never chown a file, so they should get back EPERM there. When
> >>>> a directory isn't writeable on a create they should get EACCES since
> >>>> they could do the create if the directory were writeable.
> >>> 
> >>> Hm, I see, thanks.
> >>> Confusing that you get "Permission denied" from perror ;)
> >>> 
> >> 
> >> Yes indeed. It's a subtle and confusing distinction.
> >> 
> >>>>> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> >>>>> index de1ff1d..0067520 100644
> >>>>> --- a/fs/nfsd/nfs4proc.c
> >>>>> +++ b/fs/nfsd/nfs4proc.c
> >>>>> @@ -605,8 +605,12 @@ nfsd4_create(struct svc_rqst *rqstp, struct 
> >>>>> nfsd4_compound_state *cstate,
> >>>>>  
> >>>>>         fh_init(&resfh, NFS4_FHSIZE);
> >>>>>  
> >>>>> +       /*
> >>>>> +        * We just check thta parent is accessible here, nfsd_* do their
> >>>>> +        * own access permission checks
> >>>>> +        */
> >>>>>         status = fh_verify(rqstp, &cstate->current_fh, S_IFDIR,
> >>>>> -                          NFSD_MAY_CREATE);
> >>>>> +                          NFSD_MAY_EXEC);
> >>>>>         if (status)
> >>>>>                 return status;
> >>>>>  
> >>>>> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
> >>>>> index 6fbd81e..6a45ec6 100644
> >>>>> --- a/fs/nfsd/vfs.c
> >>>>> +++ b/fs/nfsd/vfs.c
> >>>>> @@ -1161,7 +1161,11 @@ nfsd_create(struct svc_rqst *rqstp, struct 
> >>>>> svc_fh *fhp,
> >>>>>         if (isdotent(fname, flen))
> >>>>>                 goto out;
> >>>>>  
> >>>>> -       err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_CREATE);
> >>>>> +       /*
> >>>>> +        * Even though it is a create, first we see if we are even 
> >>>>> allowed
> >>>>> +        * to peek inside the parent
> >>>>> +        */
> >>>>> +       err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_EXEC);
> >>>>>         if (err)
> >>>>>                 goto out;
> >>>>>  
> >>>>> @@ -1211,6 +1215,11 @@ nfsd_create(struct svc_rqst *rqstp, struct 
> >>>>> svc_fh *fhp,
> >>>>>                 goto out; 
> >>>>>         }
> >>>>>  
> >>>>> +       /* Now let's see if we actually have permissions to create */
> >>>>> +       err = nfsd_permission(rqstp, fhp->fh_export, dentry, 
> >>>>> NFSD_MAY_CREATE);
> >>>>> +       if (err)
> >>>>> +               goto out;
> >>>>> +
> >>>>>         if (!(iap->ia_valid & ATTR_MODE))
> >>>>>                 iap->ia_mode = 0;
> >>>>>         iap->ia_mode = (iap->ia_mode & S_IALLUGO) | type;
> >>>> 
> >>>> 
> >>>> Ouch. This means two nfsd_permission calls per create operation. If
> >>>> it's necessary for correctness then so be it, but is it actually
> >>>> documented anywhere (POSIX perhaps?) that we must prefer EEXIST over
> >>>> EACCES in this situation?
> >>> 
> >>> Opengroup manpage: 
> >>> http://pubs.opengroup.org/onlinepubs/009695399/functions/mkdir.html
> >>> newer version is here:
> >>> http://pubs.opengroup.org/onlinepubs/9699919799/
> >>> 
> >>> They tell us that we absolutely must fail with EEXIST if the name is a 
> >>> symlink
> >>> (so we need to lookup it anyway), and also that EEXIST is the failure code
> >>> if the path exists.
> >>> 
> >> 
> >> I'm not sure that that verbiage supersedes the fact that you don't have
> >> write permissions on the directory. Does it?
> >> 
> >> ISTM that it's perfectly valid to shortcut looking up the dentry if the
> >> user doesn't have write permissions on the directory, even when the
> >> target is a symlink.
> >> 
> >> IOW, I'm not sure I see a bug here.
> > 
> > If this is causing real programs to behave incorrectly, then that may
> > matter more than the letter of the spec.  But I'm a little curious why
> > we'd be hearing about that just now--did the client or server's behavior
> > change recently?
> 
> We, on the Lustre side, have been hearing about this since 2010, (this 
> optimization
> was enabled in 2009).
> 
> I suspect some people just complain in places that not everybody monitors.

Sure, but you said "tons of programs" do this, and off hand I can't
recall a single report.  That's weird.

Anyway, I agree that the behavior your want seems more consistent at
least.

--b.

> I tried 3.10 and it has the same problem here.
> I just tried on RHEL6 (2.6.32) and the problem is also apparent there.
> 
> Also it's confusing how you get different errors depending on if the cache is 
> hot or not:
> [green@centos6-16 racer]$ mkdir test
> mkdir: cannot create directory `test': Permission denied
> [green@centos6-16 racer]$ ls -ld test
> drwxr-xr-x 2 root root 4096 Jul  8 12:12 test
> [green@centos6-16 racer]$ mkdir test
> mkdir: cannot create directory `test': File exists
> 
> 
> >>> Are double permission checks really as bad for nfs? it looked like it 
> >>> would
> >>> call mostly into VFS so even if first call would be expensive, second 
> >>> call should
> >>> be really cheap?
> >>> 
> >> 
> >> It depends on the underlying fs. In most cases, you're right, but you
> >> can export things that overload the ->permission op, and those can be
> >> as expensive as they like (within reason of course).
> > 
> > Weird if the expense of a second permission call is significant compared
> > to following the mkdir and sync.  But, what do I know.
> > 
> > --b.

Reply via email to