Jim Meyering writes ("Re: making GNU ls -i (--inode) work around the linux readdir bug"):
> Ian Jackson <[EMAIL PROTECTED]> wrote:
> > That is all systems. All UN*X systems since the dawn of time have
> > behaved this way.
>
> Just because everyone does it doesn't make it right.

In fact, since you yourself are referring to standards documents
(which are supposed to document existing practice) to prove your
point, yes, it does!

Furthermore it _is_ right even in absolute terms. You have failed
_again_ to respond to my point about the device number. Let me repeat
it:

When files are only in a single filesystem the inum is sufficient to
uniquely identify a file. But when we consider more than one
filesystem, the inum and device are needed together because inums on
different filesystems are unrelated and may be (often are) the same.

Thus any program which uses _only_ inums to tell files apart is broken
if it works near a mountpoint. Either the documentation for or
filesystem layout used by that program must ensure that mountpoints
are not relevant, or the program must take extra special care somehow
itself.

ls -i prints only inums. So by using ls -i a program promises that
mountpoints are not relevant.

On many conventional filesystems, readdir is O(n) in the size of the
directory but so is stat. So ls -i which does stat is O(n^2). Even on
more recent filesystems with tree-structured directories, stat is
O(log n) so a statting ls -i is O(n.log n) whereas a traditional ls -i
is O(n).

ls -i is the _only_ way to get coreutils to give you this listing in
O(n). Even if a new interface was introduced to get the old behaviour
it would not be backward compatible with existing software.

> Besides, according to this,
> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=369822#60
> at least Cygwin 1.5.20 provides a readdir function that works
> the way I expect.

I can't believe it! You're holding up Cygwin as an example of what
the UN*X API should be! Am I in some kind of bad dream mirror
universe ?!

I'm sorry if my tone is rather strident but I'm just boggling.

> >> I think correctness is important enough to sacrifice
> >> the optimization in this unusual corner-case usage of ls.
> >
> > This is the whole _point_ of ls -i !
>
> Being fast and inaccurate was never the point of ls -i.

It's not `inaccurate'. It's perfectly accurate. Any existing correct
program will behave correctly with the traditional ls -i.

> Besides, ls had the "-i" option long before d_ino was invented.

I think you will find that this is false. d_ino has nearly always
been there, although some libcs suppressed it.

> If there are tools for which the optimization is important enough,
> and ls gets a new option, then people will eventually update them
> to use the new option to enable the fast-and-loose
> behavior that is currently the default.

I wrote such a tool myself, magicmirror. I'm sure there are others
but I don't have any references right now.

There are *no existing programs* and *no plausible correct programs*
which depend on your new behaviour. Why break existing software for
no benefit ?

> Or better still, maybe someone will fix Linux's getdents (the syscall
> behind readdir) to do the right thing even in the presence
> of mount points.

This would be slow for many of the same reasons (although maybe not
_as_ slow as doing stat for each entry). I hope and trust that kernel
developers are more aware of the proper behaviour of the API than
implied by your suggestion.

Can you name _one_ UN*X system (Cygwin does _not_ count) which behaves
the way you think is correct ?

> I insist: it is a bug.
> If I weren't convinced I wouldn't be spending time on it, now.

If there is nothing I can say to change your mind then why are we
having this conversation ?

> > It's the permitted by the specs
>
> The old POSIX spec permitted anything.
> The soon-to-be-current version of POSIX has new wording:
>
>   The value of the structure's d_ino member shall be set to the file
>   serial number of the file named by the d_name member.

If there is no caveat (I don't have the text here) then this is wrong.

> But adding new options to ls is a big deal, requiring more justification
> than I've seen so far. If you provide some actual details, like names
> of applications, along with performance comparisons, that may be enough.

!!!! My own application magicmirror runs perfectly well without this
alleged `fix'. The ls -i takes a negligible time compared to the rest
of the program. With a stat on each call, the ls -i did not complete
within the time I was willing to let it have (several hours IIRC).

> I don't presume to know all usage scenarios, so want
> the default behavior to favor correctness.

What correct programs are broken by the traditional behaviour ?

> > So behaviour you consider `undesirable' is in fact the standard.
>
> Ha! No. It just means they're all wrong.

*boggle*

> Even if POSIX is adjusted or interpreted to allow their legacy
> misbehavior, I prefer to make coreutils work around such vagaries.

It's not legacy misbehaviour. It's CORRECT and FAST.

> > It's only incorrect in situations where using the inode number is
> > incorrect anyway. You've failed to respond to my comments about the
> > lack of the device number.
>
> Sure, but how can ls (in general, and efficiently) know whether
> the device number is relevant?

The _caller_ knows that the device number is _irrelevant_. Because
otherwise the results from ls -i are useless. So if the caller knows
that the device number is irrelevant there is no point going to any
effort to supply it. The caller communicates this fact (that the
device number is irrelevant) to ls by passing the `-i' option.

This is a safe assumption by ls because no correct program could use
the output from ls -i unless the device number were irrelevant.

> If ls has somehow already done the work to assure that a directory
> it is listing contains no mount points, then there's no point
> in calling readdir again via ls -i. If an application "knows"
> that the optimization is safe, then it can use the new option.

That option is `-i' (without other options which _do_ depend on the
results of stat).

Ian.
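
The two technical points argued in this message (an inode number only
identifies a file together with its device number, and a listing built
from readdir alone avoids one stat call per entry) can be sketched in
Python. This is an illustration under stated assumptions, not code from
coreutils or magicmirror: the helper names inodes_from_readdir,
file_ids_from_stat and same_file are hypothetical, and it relies on
os.scandir, whose DirEntry.inode() is documented as needing no extra
system call on Unix.

```python
import os
import tempfile


def inodes_from_readdir(path):
    """Map name -> inode number taken from the directory entries alone.

    On POSIX systems DirEntry.inode() reports the d_ino value returned
    by readdir, so no per-file stat call is made here.
    """
    with os.scandir(path) as entries:
        return {entry.name: entry.inode() for entry in entries}


def file_ids_from_stat(path):
    """Map name -> (st_dev, st_ino), paying one lstat call per entry."""
    ids = {}
    for name in os.listdir(path):
        st = os.lstat(os.path.join(path, name))
        ids[name] = (st.st_dev, st.st_ino)
    return ids


def same_file(path_a, path_b):
    """Compare device and inode together.

    The inode alone is only meaningful when both paths are known to be
    on one filesystem (hypothetical helper, for illustration).
    """
    a, b = os.lstat(path_a), os.lstat(path_b)
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        first = os.path.join(d, "first")
        open(first, "w").close()
        os.link(first, os.path.join(d, "hardlink"))  # second name, same file
        open(os.path.join(d, "other"), "w").close()

        print(inodes_from_readdir(d))   # directory entries only
        print(file_ids_from_stat(d))    # one lstat per entry
        print(same_file(first, os.path.join(d, "hardlink")))
        print(same_file(first, os.path.join(d, "other")))
```

The first helper corresponds to the traditional readdir-only listing;
the second does the per-entry stat work and is the only one of the two
that yields a device number. A caller that might cross a mount point
would need something like same_file rather than a bare inode
comparison.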

_______________________________________________
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils