Once every few years I find myself on a system with vanilla GNU diff and have to fix it so that it behaves correctly in the presence of special files. Just did this again, so let me mutter a bit - last time was long ago.
My hope is to tar a directory tree, untar it again, and see no differences from diff -r, assuming nothing changed and no I/O errors occurred. The two main problem areas are special files and symlinks.

Today one sees many lines "path1 is a foo while path2 is a foo", e.g. for fifos: useless clutter that obscures real differences.

For symlinks the problem is much worse, since symlinks are followed. Following symlinks is bad for a long list of reasons.

First, maybe the link cannot be followed because the target does not exist. If path1 and path2 are both symlinks with the same contents then I do not want to see any diff output.

Second, a file and a symlink to a file behave differently. Diff should consider the two different and give output. (If in a source tree a config.h in a subdirectory is a symlink to ../config.h, then it will change automatically when ../config.h is updated. If it is a copy, it will not, and the make will fail.)

Third, maybe the symlink points to a directory elsewhere in the tree, and following symlinks may give loops or exponential amounts of output.

What one wants to see is the smallest list of changes that turns one tree into the other: the input to a hypothetical version of patch. Certainly the diff that produces this patch input does not follow symlinks.

Another small change I made to my copy of diff is that two regular files with a length of 0 bytes are not compared. Some programs make empty files of mode 0, and opening them fails, but the files do not differ.

Andries

In short: use lstat instead of stat.
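
To make the rules above concrete, here is a minimal sketch of the per-entry check, written in Python rather than as a patch to diff's C source. It is not the actual change made to diff. The function name same_without_following is hypothetical, a POSIX system is assumed, and treating two device nodes as equal when their device numbers match is my own assumption. Directories are only compared by type here; walking their contents is left to the recursive caller.

```python
import os
import stat


def same_without_following(path1, path2):
    """Return True if the two entries need no diff output, judged via lstat."""
    # lstat describes the entry itself; a symlink is never followed.
    st1 = os.lstat(path1)
    st2 = os.lstat(path2)

    # Different file types (e.g. regular file vs. symlink) always differ.
    if stat.S_IFMT(st1.st_mode) != stat.S_IFMT(st2.st_mode):
        return False

    if stat.S_ISLNK(st1.st_mode):
        # Compare the link contents only; the target need not exist.
        return os.readlink(path1) == os.readlink(path2)

    if stat.S_ISREG(st1.st_mode):
        if st1.st_size != st2.st_size:
            return False
        if st1.st_size == 0:
            # Two empty files are equal. Do not open them:
            # opening a mode 0 file would fail.
            return True
        with open(path1, "rb") as f1, open(path2, "rb") as f2:
            while True:
                block1 = f1.read(65536)
                block2 = f2.read(65536)
                if block1 != block2:
                    return False
                if not block1:
                    return True

    if stat.S_ISCHR(st1.st_mode) or stat.S_ISBLK(st1.st_mode):
        # Assumption: device nodes match when their device numbers match.
        return st1.st_rdev == st2.st_rdev

    # Same type and nothing further to compare at this level
    # (fifo, socket, or a directory whose contents the caller walks).
    return True
```

With this check, two dangling symlinks with identical contents produce no output, a file and a symlink to that file are reported as different, and two fifos are silently accepted as equal.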
