On Wed, Feb 22, 2012 at 01:00:39AM +0000, Stuart Henderson wrote: > On 2012/02/21 19:15, Woodchuck wrote: > > >Fix: > > surround the nitems macro with #if define(_KERNEL) and > > let any non-kernel code that has come to depend on it > > choke. > > The userland program/s relying on this should be defining it > themselves, but breaking the tree is not an acceptable way to > get them fixed.
I agree with both sentiments. I meant my remark about letting non-kernel code choke to suggest a (brutal) way to judge the effects of surrounding nitems with appropriate ifdefs. I'll be doing a "make build" later on today to judge these effects. A couple of casual glances from some grep results suggest they should be zero to minor. Effects within OpenBSD's kernel should probably be reported as a bug, ("breach of defensive programming bug") my impression is that _KERNEL should always be defined for such compilations. I *grok fully* the OpenBSD mindset ("take it or leave it", "this project is for the amusement of its authors", "show me the code", "rtfm", "rtfFAQ", "do not break happy code", etc), and *fully approve it*. I *share* that attitude towards my own stuff. I *encourage* that attitude in others, and cite OpenBSD as an example of how an uncompromising application of that attitude can result in a very interesting, useful and attractive result, without which I would feel lost and alone in a world of cruft and bugs. Hence my pleasure at finding an opportunity to report a subtle bug and its likely fix. In 3rd party code (existing ports and completely alien code) we are confronted with many problems. How I came across the nitems issue is illustrative. I am collaborating on porting an application to OpenBSD that is of significant interest to radio operators, worldwide, amateurs and potentially others. It is fldigi, and implements various digital radio transmission and reception modes through the audio facilities of PeeCees, Macs and certain other architectures to eventually include tablet, small Soekris-like systems, and so on. Besides needing fairly decent access to the audio subsystem, it also needs to communicate with the radio transceiver, chiefly through RS-232 or similar (USB) channels. The program, obviously, runs in real time, and performs some heavy (FFT) calculations to synthesize signals to be transmitted, and to analyze those received, much like the work done in a software-defined radio transceiver, it is typically compute-bound. It consists of about 120,000 lines of code, and is written in C++, what I would term a medium-sized program. It is intended to be simultaenously mathematically sophisticated and user-friendly, the users consisting of advanced amateur radio operators, engineers and suchlike but also computer-semiliterate emergency services operators; it has, therefore, a GUI interface. It is intended to be as bullet-proof as possible, and to function in fairly hostile environments. It is ambitious, and it is succeeding pretty well. It is natural to want to port it to OpenBSD, to make use of the well known benefits of OpenBSD. It is open-source. Its homepage is at http://www.w1hkj.com/Fldigi.html Fldigi also wants to communicate over TCP/IP using XML to exchange certain data with various remote sites. To do that it uses xlmrpc-c, an OpenBSD port. Device access and net access are not unreasonable tasks for 3rd party software. Getting system-specific definitions and declarations is legitimate for such a program. It is legitimate for applications to include sys/*.h, and it is legitimate to expect that code that compiles and runs on Linux, FreeBSD, NetBSD, and yea even unto Cygwin and native Windoze, to compile on OpenBSD. ("principle of least surprise"). After usual tweakage, of course. Fldigi nowhere includes sys/param.h. It does include netdb.h, and other members of sys/*. Fldigi's control panel (for X11), is implemented using FLTK, which is ported to OpenBSD, although the port is out of date. The xmlrpc-c software is, at best, dubious. Such is life. The bug only surfaces in fldigi when xml is configured into the compilation; however the bug appears in a file included from the FLTK port, namely "Fl.H", where nitems() is defined as a public method for some C++ class. Along the lines of int nitems() { return nitems_; } IMO, the usual C++ cruft, but there it is. A curse on C++ for including executable code in included files, but that Petri dish was dropped long ago, alas. Analysis reveals that this is an interaction of included files from xmlrpc-c and FLTK, the xmlrpc-c included files include netdb.h, which brings in sys/param.h. It is legitimate and necessary for an application implementing XML to include netdb.h. I have not yet determined why the failing user routine includes FLTK files, but I suspect that that, is a legitimate but not anecessary thing. It is (probably) due to unfortunate design choices, chiefly due to convenience in C++, which encourages a sort of "include mania" that I am sure many find frightening and the source of much grief. My point is that "we" (programmers, users, porters) are largely stuck with the world as it is, and the terrors of C++ are now part of it. And I have a workaround for this specific case, an inclusion in user source of "#undef nitems" at a point after the implicit inclusion of sys/param.h and before the implicit inclusion of Fl.H (from FLTK). Such a solution is not generally possible, however. The problem with the nitems bug is that the compiler provides no hint of where nitems might be defined, as it is a preprocessor symbol, and the preprocessor doesn't care (much) about C or C++ syntactic niceties. So to track it down, I had to go on a snipe hunt; I know how to do this sort of thing, so I did it. I expected some sort of cruel interaction between various C++ "include files", which nest to a depth of about 5 in the vicinity of the problem. There are 5-6 external pacakages involved. My disappointment is that these results are the exact opposite of what I would have expected -- I would have expected the compilation to fail because OpenBSD had *not* defined some innovative local feature, and I could say, "Behold the purity of the True Source, which defineth no cruft." and supply a patch to the user code that defines nitems. I will continue analyzing the bug and report any results. Addendum: files in /usr/include that *directly* include sys/param.h User code referencing any of these will be vulnerable to the bug. [djv@aemilia ~ 0:65]$ find /usr/include/ -name "*.h" | xargs -n100 grep -l sys/param.h /usr/include/altq/altq_var.h /usr/include/arpa/inet.h /usr/include/arpa/nameser.h /usr/include/dev/ic/aic79xx_openbsd.h /usr/include/dev/ic/aic7xxx_openbsd.h /usr/include/dev/ic/ar5xxx.h /usr/include/dev/ic/isp_openbsd.h /usr/include/dev/ic/pdqvar.h /usr/include/dev/ic/smc93cx6var.h /usr/include/dev/ofw/openfirm.h /usr/include/dev/pci/drm/drmP.h /usr/include/dev/pci/if_bnxreg.h /usr/include/dev/pci/if_em.h /usr/include/dev/pci/if_ixgb.h /usr/include/dev/pci/ixgbe.h /usr/include/dev/raidframe/rf_configure.h /usr/include/dev/raidframe/rf_threadstuff.h /usr/include/dev/raidframe/rf_types.h /usr/include/g++/i386-unknown-openbsd5.0/bits/c++config.h /usr/include/i386/cpu.h /usr/include/i386/db_machdep.h /usr/include/net/pfvar.h /usr/include/netdb.h /usr/include/netmpls/mpls.h /usr/include/nnpfs/nnpfs_config.h /usr/include/nnpfs/nnpfs_locl.h /usr/include/openssl/e_os.h /usr/include/resolv.h /usr/include/rpcsvc/bootparam_prot.h /usr/include/sndio.h /usr/include/sys/localedef.h /usr/include/sys/mbuf.h /usr/include/sys/radioio.h On my very lightly populated system: [djv@aemilia ~ 0:66]$ find /usr/local/include/ -name "*.h" | xargs -n100 grep -l sys/param.h /usr/local/include/tcl8.5/tclInt.h /usr/local/include/tcl8.5/tclUnixPort.h /usr/local/include/python2.7/pyconfig.h /usr/local/include/Xm/Xmos_r.h /usr/local/include/Xm/Xmpoll.h In /usr/ports (including pobj), there are on my system 97 instances of files referencing sys/param.h on that system. in src: [djv@aemilia ~ 0:75]$ find /usr/src -name "*.h" | grep -v /usr/src/sys | \ xargs -n100 grep -l sys/param.h | wc 129 129 5284 These counts do not include indirect inclusion of sys/param.h (such as from netdb.h and certain others). This bug is pernicious because it can also generate perfectly good C and C++ code, with no warning to the user of a macro expansion being performed. nitems shouldn't have leaked out of the kernel, but there it is. I don't think it's documented anywhere, but could be wrong. It's a long fine manual to read. It is very interesting to look at the results of $ find /usr/share -type f | xargs -n100 grep nitems which I leave as an exercise for the reader. 73, Dave AB3NR