Re: readdir and d_ino of mount points (Was: rm -rf ./ ../)

Geoff Clare Mon, 12 Jun 2017 01:59:16 -0700

Stephane Chazelas <stephane.chaze...@gmail.com> wrote, on 09 Jun 2017:
>
> 2017-06-09 15:46:56 +0100, Stephane Chazelas:
> [...]
> > In addition to leaving it unspecified whether a "." or ".."
> > entry is returned, we may also want to clarify (or leave
> > unspecified) what d_ino the ".." entry may have for mount-points
> > and that in any case, there's no guarantee that it would be the
> > same as the inode returned by a stat("..").
> > 
> > Do we guarantee that all the d_ino returned by readdir() are
> > from the same file system?
> > 
> > The rationale part seems to say the opposite which is against
> > current practice AFAICT.
> > 
> > SUSv4TC2> When returning a directory entry for the root of a mounted
> > SUSv4TC2> file system, some historical implementations of readdir()
> > SUSv4TC2> returned the file serial number of the underlying mount
> > SUSv4TC2> point, rather than of the root of the mounted file system.
> > SUSv4TC2> This behavior is considered to be a bug, since the
> > SUSv4TC2> underlying file serial number has no significance to
> > SUSv4TC2> applications.
> > 
> > That's not at all what I observe. On Solaris 11:
> [...]
> > Same on GNU/Linux (though GNU ls would use lstat() to get the inode
> > numbers).
> [...]
> 
> Same on FreeBSD, whose ls -i behaves like Linux's: it displays the
> inode number from lstat(), not from readdir().
> 
> In any case, either value (the inode of the file or the inode of
> the file in the original filesystem the file is mounted on) is
> not useful.
> 
> - In the first case (the one required by POSIX)
> 
>   OK it's the "correct" inode number (the one we would get from
>   lstat() on that path), but we don't know what file system it's
>   for as it's a different one from the one of the directory
>   we're listing. That also means that we can get different
>   entries with the same d_ino even though they are not hardlinks
>   (it's common for the root directory of a file system to have a
>   fixed low inode number, typically "2" on ext4).
> 
> - In the second case (the one in FreeBSD, Linux and Solaris at
>   least), that's the inode number of a file we
>   cannot access by that path (and again, applications using
>   d_inos to detect hard links could be fooled).


Of these two behaviours, it seems to me that the second is the one
that makes the most sense.

A file is identified by a dev/ino pair.  If you are using d_ino, which
dev number do you pair it with?  Obviously the dev number of the
directory you are reading.  Therefore, for the dev/ino pair to
indicate file uniqueness correctly, the ino number in d_ino has to be
the one for the filesystem in which the directory being read resides.
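
To make the pairing concrete, here is a minimal sketch in C.  The
names file_id, file_id_equal, dir_device and entry_id are made up for
illustration and are not part of any standard interface.  It assumes
an XSI system where struct dirent has d_ino, and it takes the dev
number from fstat() on the directory stream, as described above.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#endif
#include <dirent.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Illustrative type (not a standard one): a file identity is a dev/ino pair. */
struct file_id {
    dev_t dev;
    ino_t ino;
};

/* Two identities name the same file only if both members match. */
static bool file_id_equal(struct file_id a, struct file_id b)
{
    return a.dev == b.dev && a.ino == b.ino;
}

/* Fetch the device number of an open directory stream.
   Returns 0 on success, -1 on failure (errno set by fstat or dirfd). */
static int dir_device(DIR *dir, dev_t *dev_out)
{
    struct stat st;
    int fd = dirfd(dir);

    if (fd < 0 || fstat(fd, &st) != 0)
        return -1;
    *dev_out = st.st_dev;
    return 0;
}

/* Build an identity for a directory entry WITHOUT calling lstat():
   the only dev number available is that of the directory itself, so
   the result is meaningful only if d_ino belongs to the same file
   system as the directory. */
static struct file_id entry_id(dev_t dir_dev, const struct dirent *ent)
{
    struct file_id id = { dir_dev, ent->d_ino };
    return id;
}
```

A caller would open the directory with opendir(), call dir_device()
once, then call entry_id() on each entry returned by readdir().  If
d_ino instead came from the root of a mounted file system, two
unrelated entries could compare equal under file_id_equal() merely
because both roots use the same low inode number.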

You say that "applications using d_inos to detect hard links could be
fooled", but this would only happen if the root of the mounted file
system is hard linked to another directory somewhere below it.  That
seems like a very unlikely situation.  (We could consider forbidding
it, i.e. on a file system that supports hard linking directories,
require that an attempt to hard link the root of the file system must
fail.)

> One could imagine a third approach where we return an
> INO_DIFFERENT_FILESYSTEM special reserved number to warn the
> application they can't reliably use that d_ino.

That's an interesting idea.  Since the point of d_ino is to save
on lstat() calls, having a way of telling the application it needs to
use lstat() when a directory entry is a mount point would preserve this
advantage.
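
As a rough sketch of how an application might consume such a value:
INO_DIFFERENT_FILESYSTEM does not exist in POSIX or in any
implementation I know of, and the value given to it below is purely a
placeholder.  The helper name resolve_entry and the file_id struct
are likewise made up.  The only real interface used is fstatat() with
AT_SYMLINK_NOFOLLOW, which behaves like lstat() relative to a
directory file descriptor.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#endif
#include <dirent.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>

/* HYPOTHETICAL: no such constant exists today; the value is an
   arbitrary placeholder chosen for this illustration only. */
#define INO_DIFFERENT_FILESYSTEM ((ino_t)-1)

/* Illustrative type (not a standard one): a dev/ino pair. */
struct file_id {
    dev_t dev;
    ino_t ino;
};

/* Resolve the identity of entry `name`, whose readdir() d_ino was
   `d_ino`, in the directory open as `dir_fd` on device `dir_dev`.
   Fast path: trust d_ino and pair it with the directory's device.
   Slow path: only for entries flagged by the hypothetical sentinel,
   ask the system via fstatat() without following symlinks.
   Returns 0 on success, -1 on failure (errno set by fstatat). */
static int resolve_entry(int dir_fd, dev_t dir_dev, const char *name,
                         ino_t d_ino, struct file_id *out)
{
    struct stat st;

    if (d_ino != INO_DIFFERENT_FILESYSTEM) {
        out->dev = dir_dev;
        out->ino = d_ino;
        return 0;
    }
    if (fstatat(dir_fd, name, &st, AT_SYMLINK_NOFOLLOW) != 0)
        return -1;
    out->dev = st.st_dev;
    out->ino = st.st_ino;
    return 0;
}
```

With this shape, a directory of thousands of ordinary entries costs no
extra system calls, and only the (rare) mount points pay for one.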

> So we might as well leave the behaviour unspecified.
> 
> And maybe note in "application usage" that the d_ino cannot be
> relied upon and people should use lstat().

I would prefer that we find a way to ensure d_ino can be used.  On a
directory with thousands of entries, it can be a big win not to have
to lstat() all the entries.

-- 
Geoff Clare <g.cl...@opengroup.org>
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England
