First of all, apologies for posting here.  Use of BTS would lead to
too many bug reassignments in that case ;-/

        Summary: there is a bug in 2.2 kernels that leads to breakage of
glibc 2.3.2 on 2.2 boxen, to breakage of coreutils and to FTBFS of m4 on
alpha.  Additionally, there's a nasty issue with alpha buildd.  That is to
say, 5 sides involved and a plenty of space for finger-pointing ;-/

        1) Source of the bug: in 2.2 (and 2.4 prior to 2.4.19-rc2) sys_utimes()
lacks a test on times==NULL branch.  It should have the same behaviour as
sys_utime() - if we ask to set timestamps to present (second argument of
syscall is NULL), caller must either have write permissions to file or
be its owner.  In 2.2 the second part is missing.

        That is to say, if /tmp/foo is your file and its r--r--r--, you
can't do utimes("/tmp/foo", NULL).  It will fail with -EACCES on 2.2.
Note that you *CAN* do utimes("/tmp/foo", &tvp) - IOW, you can set the
timestamps to whatever values you want, which is more than the failing
call would do.

        It is a bug, it makes no sense, it violates POSIX, it's contrary
to other kernels' behaviour _and_ it had been fixed in 2.4 for about a year.
Fix is a one-liner and would apply to any 2.2.x:
--- linux/fs/open.c     Sun Jul  1 13:32:19 2001
+++ linux.fix/fs/open.c Fri Sep 26 02:32:22 2003
@@ -262,7 +262,8 @@
                newattrs.ia_mtime = times[1].tv_sec;
                newattrs.ia_valid |= ATTR_ATIME_SET | ATTR_MTIME_SET;
        } else {
-               if ((error = permission(inode,MAY_WRITE)) != 0)
+               if (current->fsuid != inode->i_uid &&
+                   (error = permission(inode,MAY_WRITE)) != 0)
                        goto dput_and_out;
        }
        error = notify_change(dentry, &newattrs);
That's precisely what had been done in 2.4 and that's what we should do in
2.2.

        That was kernel side of things.  Now, for visible userland bugs:


2) m4 FTBFS on alpha.

It fails when built on 2.2 kernel with glibc 2.3.2.  What happens is
        touch ./stamp-checks
        touch: cannot touch `./stamp-checks': Permission denied
which breaks the build.

Note that stamp-checks exists and has 444 as mode.  touch(1) should *not*
break on that.  It doesn't break on x86.  It doesn't break with 2.4 kernel.
It doesn't break with old glibc.  With the combination above (2.2 on alpha +
glibc 2.3.2, which is what lully ran at the time of 1.4-17 build attempt)
it *does* break.

At that point m4 is exhonerated - FTBFS is a consequence of touch(1) breakage,
so it's a side-effect of coreutils being broken.


3) coreutils and touch(1).

On alpha running 2.2 kernel with glibc 2.3.2, try the following:
% touch foo
% chmod 444 foo
% touch foo
It will give you "Permission denied" on the second touch(1).

What happens is that touch(1) correctly calls utime("foo", NULL).  Which
breaks on aforementioned combination.  Note that it's libc utime(), not
the kernel one.   Which exhonerates coreutils - we are in libc land.


4) libc and utime()/utimes().

Both utime() and utimes() end up using the same syscall.  The choice depends
on platform; generally sys_utimes() is prefered.  2.2 (and 2.4) x86 do not
have sys_utimes in syscall table (2.6 does, but it doesn't have the bug in
question).  On alpha and sparc sys_utimes() is a syscall, so both utime() and
utimes() end up calling it.

        Note that both old and new libc have the bug on 2.2/alpha.
utimes(pathname, NULL) will fail if you do it to your read-only file.
The difference between their behaviour is in utime(name, NULL).  Old
libc has basically
        struct timeval timevals[2];
        if (times) {
                fill timevals from *times;
        } else {
                call gettimeofday() set timevals[0] and timevals[1] to current;
        }
        utimes(name, timevals);

New one doesn't bother with gettimeofday() and (correctly) calls utimes(name,
NULL) directly.

Old behaviour had masked the kernel bug in question, since we didn't pass NULL
to sys_utimes().   *However*, that had lead not only to extra syscall, but
to very real breakage.  Namely, if file was writable to you, but you were
not the owner,  you would get "permission denied" from uname(name, NULL).
Which is a bug - non-owners are not allowed to *set* timestamps, but they
can reset them to present on files they can write to.  New behaviour fixes
that problem and is clearly the Right Thing(tm).

Unfortunately, it exposes the sys_utimes() breakage not only in utimes(3)
(where it was exposed for years) but in utime(3), where it triggers the
userland breakage we were not used to (see #2 and #3 for some examples).


Now, it's obvious that
        a) there is a bug.
        b) there are only two places where it might've been fixed - 2.2 kernel
and glibc.
        c) "fix" in glibc would reintroduce old bugs and would be useless on
newer kernels.

Seeing that kernel-side fix (== what had been done in 2.4/2.5) is trivial
and obviously correct (compare fs/open.c::sys_utimes() with sys_utime()
several lines prior to it), I would suggest to fix that crap in 2.2 the
same way.


5)  Why does lully (or any of buildd boxen) run 2.2? (as it apparently does)
Seeing that sarge is hopefully going to be 2.4-based...


Reply via email to