On Tue, 24 Mar 2015, Mateusz Guzik wrote:
Log: filedesc: microoptimize fget_unlocked by getting rid of fd < 0 branch
This has no effect. Compilers optimize to the equivalent of the the unsigned cast hack if this is good. On x86, it is good since no instructions are needed for the conversion, and the only difference between the code generated by if (fd >= fdt->fdt_nfiles) and if (fd < 0 || fd >= fdt->fdt_nfiles) is to change from jl (jump if less than) to jb (jump if below) (both jumps over the if clause). Negative values satisfy jl but not jb.
Casting fd to an unsigned type simplifies fd range coparison to mere checking if the result is bigger than the table.
No, it obfuscates the range comparison. On some arches, conversion to unsigned is slow. Then compilers should optimize in the opposite direction by first undoing the bogus cast to get back to the range check and then optimizing the range check using the best strategy. Compilers should probably first undo the bogus cast even on x86, so as to reduce to the range check case. Range checks are more important and more uniform than bogus casts, so they are more likely to be optimized. Similarly if fd has type u_int to begin with. (Grosser forms of the obfuscation do this. The gross forms are still common :-(. The first one in syscalls.master is "u_int fd" for dup(). But this doesn't even survive as far as do_dup(), since sys_dup() explicitly converts to int. The conversions are: - the arg "int fd" is type-punned to "u_int fd" in the args struct - uap->fd is cast to before passing it to do_dup(). dup2() is similar. The first one in syscalls.master that uses the obfuscation is getgroups() or perhaps getlogin(). The conversions and obfuscations for getgroups are: - the arg "int gidsetlen" is type-punned to "u_int gidsetsize" in the args struct. This also converts the name to a worse one. The arg gives the number of elements, not a length or a size - uap->gidsetsize is compared with a maximum value. This depends on it being u_int to reject negative values. - uap->gidsetsize passed is without a cast to copyout(). It becomes a size_t. This conversion cannot overflow for the checked value even if uap->gidsetsize has the correct type (int), but an explicit cast might be needed to avoid compiler warnings then (sys_dup() does such a cast). copyout() has related naming problems. Its count arg is 'named' len. bcopy() and memcpy()'s count arg is also named 'len' in FreeBSD, but in C99 and POSIX memcpy()'s count arg is just named 'n'.
Modified: head/sys/kern/kern_descrip.c ============================================================================== --- head/sys/kern/kern_descrip.c Tue Mar 24 00:01:30 2015 (r280406) +++ head/sys/kern/kern_descrip.c Tue Mar 24 00:10:11 2015 (r280407) @@ -2342,7 +2342,7 @@ fget_unlocked(struct filedesc *fdp, int #endif fdt = fdp->fd_files; - if (fd < 0 || fd >= fdt->fdt_nfiles) + if ((u_int)fd >= fdt->fdt_nfiles) return (EBADF); /* * Fetch the descriptor locklessly. We avoid fdrop() races by
This undoes the changes that removed the obfuscation. Many commits were involved: - 4.4BSD-Lite1 uses the obfuscation in a modification of the above form, with "u_int" spelled in non-BSD style as "unsigned" for at least most of fcntl(), close(), compat_43_fstat(), fstat(), fpathconf(), flock() and dupfdopen(). - 4.4BSD-Lite2 fixed the style bug, so that these functions used the obfuscation in exactly the above form - FreeBSD never merged these changes from Lite2 - dupfd_open() apparently had the BSD spelling in Lite1, so FreeBSD-2+ always had that. jhb committed my change to unobfuscate dupfd_open() in 2002. - I apparently missed half of the obfuscations with the "unsigned" spelling. But half of them were moved into _fget(), fget() or fget_locked(). - early in 2002, the obfuscation was in _fget() in exactly the above form. _fget() far too large for an inline function, but was static inline in kern_descrip.c. - a little later in 2002, the obfuscated part of _fget() and a little more was turned into fget_locked(). fget_locked() was not too large for an inline function, and was static inline in filedesc.h. But this change turned the obfuscation into nonsense. It added the explicit check for the lower bound, but kept the bogus cast. - I asked the committer to fix this, but my instructions were apparently too tl;dr (maybe 1/4 as long as this mail), and the result was less than reversion to the previous obfuscated version -- it was that plus, plus another bogus cast to u_int (the second one has no effect not already given by the first one), and a comment documenting the obfuscation. The comment takes about 4 times as much source code as the unobfuscated version. - I fixed this in 2004. The unobfuscated version is slightly shorter then the version with the doubled bogus cast, even without the comment. - this version of fget_locked() survived until at least FreeBSD-9, but in recent versions its style regressed in other ways. It expanded from 2 lines of statements to 6 to add 1 assert statement and 3 lines with style bugs (2 extra blank lines and one loss of use of a conditional expression). - fget_locked() seems to be unchanged recently, so it doesn't have the obfuscation. fget_unlocked() is newer than the unobfuscated version of fget_locked(). It is in a different file, but may have copied the unobfuscated range check from fget_locked(). This commit restores the obfuscation to it alone. There are now some other range checks in kern_desc.c that are missing the obfuscation: grepping for "fd <" gives: - a magic treatment for negative fd's in closefrom() - the range check in an unobfuscated by confusing form in fdisused(). The check has to be inverted since it is in a KASSERT(), and the inversion is done by reversing the inequalities. - similarly in fdalloc(). I used to use this hack a lot 30 years ago, but stopped when compilers got better 20-25 years ago. Bruce _______________________________________________ svn-src-all@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"