[I have quoted kre's mail in full here, not just the parts I'm
replying to, because the original didn't go to the list.]

Robert Elz wrote, on 25 Mar 2022:
>
>   | Do these results (from coreutils 8.30) cover what you need to know:
> 
> Some of it, but you tested a directory with x permission, but
> without r (in addition to one with neither) where the more
> interesting case is one with r (so we can see what files exist)
> but without x (so they cannot be stat'd to determine if they
> are symlinks or not - assuming a filesys where the directory doesn't
> contain file type info).

$ mkdir -m a=rw d_nox 
$ realpath ./d_nox/foo; echo exit $?
realpath: ./d_nox/foo: Permission denied
exit 1
$ realpath -e ./d_nox/foo; echo exit $?
realpath: ./d_nox/foo: Permission denied
exit 1

> There are also more cases of symlinks to consider
> 
>       test -d /tmp/realpath || mkdir /tmp/realpath
>       cd /tmp/realpath
>       mkdir d
>       rm -rf foo
>       ln -s /tmp/realpath/foo d/link
>       realpath d/link

$ realpath d/link; echo exit $?
/tmp/realpath/foo
exit 0
$ realpath -e d/link; echo exit $?
realpath: d/link: No such file or directory
exit 1

> is one, and then
> 
>       ln -s   /tmp/gibberish/file d/link2
>       realpath d/link2

$ realpath d/link2; echo exit $?     
realpath: d/link2: No such file or directory
exit 1
$ realpath -e d/link2; echo exit $?
realpath: d/link2: No such file or directory
exit 1

> is another.   Those are both cases where the final component
> (link and link2) names a file that doesn't exist (in a sense,
> the case "realpath d/nofile" is the simple one, handling that
> case is easy), but in a different way, are they treated the same
> or differently?   And what happens in any case?  I can do no more
> than guess.

They are different (without -e).  Here is what strace shows for
the two cases:

lstat64("/tmp/realpath/d", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
lstat64("/tmp/realpath/d/link", {st_mode=S_IFLNK|0777, st_size=17, ...}) = 0
readlink("/tmp/realpath/d/link", "/tmp/realpath/foo", 18) = 17
lstat64("/tmp", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=4096, ...}) = 0
lstat64("/tmp/realpath", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
lstat64("/tmp/realpath/foo", 0xbfbe9a3c) = -1 ENOENT

lstat64("/tmp/realpath/d", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
lstat64("/tmp/realpath/d/link2", {st_mode=S_IFLNK|0777, st_size=19, ...}) = 0
readlink("/tmp/realpath/d/link2", "/tmp/gibberish/file", 20) = 19
lstat64("/tmp", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=4096, ...}) = 0
lstat64("/tmp/gibberish", 0xbfe9cf8c)   = -1 ENOENT

> The examples with -e are less interesting, those should be
> how our realpath acts now, but I will verify later, I will only
> comment more on that if there are differences.
> 
>   | My bugnote 5700 has what I came up with for readlink -f (without -n),
>   | which I think could be reused for realpath without -e.
> 
> Yes, the 2nd bullet point looks about right, but that makes it more
> important to verify what the coreutils version does with a directory
> that has read permission, but no x permission, as:
> 
>       If the file operand names a file that is not a symbolic link,
>         or if the path prefix of file resolves, with symbolic links followed,
>         to an existing directory
> 
> that much is just realpath(3) on the path prefix (and probably a
> test that the result is a directory), and that's fine
> 
>         and the final component of file does not exist as a directory
>       entry in that directory,
> 
> and that can be determined by reading the directory, and comparing
> entries with the final component of "file" to see if that is an entry
> in the directory or not - which requires r permission, but doesn't
> need x, or by calling lstat() on the results from realpath, with the
> final component appended, which requires x but doesn't require r.
> 
> Which one are we supposed to do to determine this question?  Which
> does the coreutils version do?   The text (probably for both readlink -f
> and realpath) needs to be very clear about this.

Here's the strace for the d_nox case:

lstat64("/tmp/realpath/d_nox", {st_mode=S_IFDIR|0666, st_size=4096, ...}) = 0
lstat64("/tmp/realpath/d_nox/foo", 0xbf97522c) = -1 EACCES

It doesn't read any directories, it just uses lstat() for existence
checks.

However, I don't see the need for the standard to be specific about
this, since there are many other places where it talks about existence
of files but doesn't specify how the existence check is done.  If an
implementation wants to go the extra mile and try to read the directory
after an EACCES from lstat(), it can. It's a quality-of-implementation
issue.

>   | It doesn't state anything explicit about the directory permissions,
>   | but I think that's fine because it's normal for utility descriptions
>   | to leave such things to be covered by the general rules on utility errors.
> 
> As it is written in that note it is fine, provided the necessary extra
> piece is added: which method to use to determine whether the final
> component exists in the directory or not.
> 
> It is also important to determine if this "final component" test
> applies only to the final component of the original path, or also
> to the final component of the path after symbolic links in the
> original path have been resolved, and in that vein, the first
> bullet point in that note:
> 
>       If the file operand names a symbolic link, readlink shall write
>       to standard output the absolute pathname that would be returned
>       by a call to the realpath() function with the pathname as its
>       first argument.
> 
> Looks to be inconsistent with the results that you just posted.  That is:
> 
>       $ realpath ./nofilelink; echo exit $? 
>       /tmp/realpath/nofile
>       exit 0
> 
> In that case, the file operand names a symbolic link, so what we should
> get is realpath() applied to the pathname, which is exactly as you plan
> on realpath -e operating:
> 
>   |     The realpath utility shall write to standard output the absolute
>   |     pathname that would be returned by a call to the realpath()
>   |     function with the file operand as its first argument, followed by
>   |     a <newline> character.
> 
> But you showed what happens in that case:
> 
>       $ realpath -e ./nofilelink; echo exit $?
>       realpath: ./nofilelink: No such file or directory
>       exit 1
> 
> and it isn't the same.

Good catch.  So perhaps the first bullet should be modified something
like this:

        If the file operand names a symbolic link:
        - If the symbolic link resolves (with symbolic links followed)
          to an existing file, realpath shall write to standard output
          the absolute pathname that would be returned by a call to
          the realpath() function with the pathname as its first
          argument.
        - Otherwise, realpath shall write to standard output the
          absolute pathname that would be returned by a call to the
          realpath() function with the path prefix of file as its
          first argument, followed by a <slash> character and the
          final component of file. If file is non-empty and only has
          one pathname component, it shall be treated as if it had a
          path prefix of "./".

[I have no further comments after this]

> Your text in this message (as quoted just above) for what happens with -e
> looks fine, but that case isn't the issue, and never has been.   The problems
> are all when -e isn't given, and correctly defining what is really intended
> to happen in all the odd cases that can then occur.
> 
> To make all this more explicit, I can imagine two reasonable scenarios for
> how to handle the implementation of realpath without -e (in the coreutils
> sense of it) - the first is something like (pseudo code, don't try to
> compile this)
> 
>       for (;;) {
>               last = strrchr(file, '/');      /* ignore last == NULL for now */
>               *last++ = '\0';
>               path = realpath(file, NULL);
>               file = strcat(strcat(path, "/"), last);
>               if (lstat(file, &sb) == 0) {
>                       if (S_ISLNK(sb.st_mode))
>                               continue;
>               } else {
>                       /* do we ignore all errors here, or just ENOENT ? */
>               }
>               printf("%s\n", file);
>               break;
>       }
> 
> (with error checking added, naturally).
> 
> The second (similar caveats) is
> 
>       if ((path = realpath(file, NULL)) == NULL) {
>               last = strrchr(file, '/');      /* ignore last == NULL for now */
>               *last++ = '\0';
>               path = realpath(file, NULL);
>               strcat(strcat(path, "/"), last);
>       }
>       printf("%s\n", path);
> 
> Which one would give the correct results, or would I
> need something quite different?
> 
> Perhaps it is even necessary to open code the realpath()
> function, rather than simply calling it, to get the
> correct results?
> 
> kre
> 
> ps: I know that we cannot just strcat() onto the malloc'd
> result from realpath(), this is just to show the contrasting
> possible implementations, not get the details right.  There
> would need to be calls to free() added as well...

-- 
Geoff Clare <g.cl...@opengroup.org>
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England
