> Date: Tue, 24 Jun 2014 15:53:20 -0700
> From: Matthew Dempsky <[email protected]>
>
> On Tue, Jun 24, 2014 at 11:04:10AM -0700, Matthew Dempsky wrote:
> > SIGBUS/BUS_ADRERR: Accessing a mapped page that exceeds the end of
> > the underlying mapped file.
>
> Generating SIGBUS for this case has proven controversial due to
> concern that this is Linux invented behavior and not compatible with
> Solaris, so I decided to collect some more background information on
> the subject.
>
> - SunOS 4.1.3's mmap() manual specifies: "Any reference to addresses
> beyond the end of the object, however, will result in the delivery of
> a SIGBUS signal." This wording was relaxed to "SIGBUS or SIGSEGV" in
> SunOS 5.6 and remains in current manuals. (I'm not sure, but I suspect
> this may be to simply reflect that memory protection violations take
> priority over bounds checking.)

It makes sense that memory protection violations take priority over
bounds checking.

> SunOS 4.1.3:
> http://www.freebsd.org/cgi/man.cgi?query=mmap&sektion=2&manpath=SunOS+4.1.3
> SunOS 5.6:
> http://www.freebsd.org/cgi/man.cgi?query=mmap&sektion=2&manpath=SunOS+5.6
> Solaris 11: http://docs.oracle.com/cd/E23824_01/html/821-1463/mmap-2.html
>
> - Many other SVR-derived OSes similarly document SIGBUS in their
> mmap() manuals too:
>
> AIX:
> http://www-01.ibm.com/support/knowledgecenter/ssw_aix_53/com.ibm.aix.basetechref/doc/basetrf1/mmap.htm?lang=en
> HPUX:
> http://h20566.www2.hp.com/portal/site/hpsc/template.BINARYPORTLET/public/kb/docDisplay/resource.process/?spf_p.tpst=kbDocDisplay_ws_BI&spf_p.rid_kbDocDisplay=docDisplayResURL&javax.portlet.begCacheTok=com.vignette.cachetoken&spf_p.rst_kbDocDisplay=wsrp-resourceState%3DdocId%253Demr_na-c02261243-2%257CdocLocale%253D&javax.portlet.endCacheTok=com.vignette.cachetoken
> UnixWare: http://uw714doc.sco.com/en/man/html.2/mmap.2.html
>
> - This behavior has been (awkwardly) specified for mmap() since SUSv2:
> "References within the address range starting at pa and continuing for
> len bytes to whole pages following the end of an object shall result
> in delivery of a SIGBUS signal." Later versions of POSIX have the same
> wording.
>
> SUSv2: http://pubs.opengroup.org/onlinepubs/007908799/xsh/mmap.html
> POSIX.2001:
> http://pubs.opengroup.org/onlinepubs/009695399/functions/mmap.html
> POSIX.2008:
> http://pubs.opengroup.org/onlinepubs/9699919799/functions/mmap.html
>
> - More generally, POSIX explains the SIGBUS/SIGSEGV distinction
> thusly: "When an object is mapped, various application accesses to the
> mapped region may result in signals. In this context, SIGBUS is used
> to indicate an error using the mapped object, and SIGSEGV is used to
> indicate a protection violation or misuse of an address."
> Specific examples are provided too:
>
> Memory Protection:
> http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_08_03_03

Generating SIGBUS for access beyond the end of an object makes some
sense. In this case there is a valid mapping; it's just that the
underlying physical memory pages aren't there. It is not dissimilar to
having mapped a physical address that maps to, say, the PCI bus. On
real hardware accessing such a mapping will lead to a failed bus
transaction for which the logical representation is a SIGBUS. (On
PeeCee hardware you'll probably get back an all-ones bit-pattern).

From a hardware-oriented perspective, SIGSEGV is generated by the MMU
and SIGBUS is generated by the underlying hardware. So I don't think
the Sun engineers made a totally unreasonable decision here.
Unfortunately the CSRG made a different decision when they
reimplemented mmap support in 4.3BSD-Reno. Or perhaps things got
broken after that...

In my view, generating SIGBUS under these circumstances is a bit
unfortunate. Currently, SIGBUS on OpenBSD is a very clear indication
of an alignment issue. If we generated SIGBUS for access beyond the
end of an mmap'ed object this would no longer be the case. We'd
actually have to look at the siginfo, which isn't printed by the
shell.

On the other hand, passing memory objects by fd is getting more
common. Xorg recently modernized its shared memory interface (MIT-SHM,
aka XShm) to support mmap'ing file descriptors passed over sockets.
And DRM is moving in the same direction to solve security issues with
access to graphics objects.

But this approach has a downside. A malicious client could pass an fd
to the X server and subsequently truncate it after the X server mapped
it. If the X server accesses this mapping, it will crash.
To prevent this from happening, the X server will install a signal
handler for SIGBUS, check if a shared memory object is being accessed
and patch things up (by mmap'ing anonymous memory on top of the
mapping). This code can of course be extended to handle SIGSEGV as
well. But this means more work in xenocara and ports, and we might
miss some places where this needs to be done.

Theo has some worries that changing SIGSEGV to SIGBUS in this case
will lead to problems in ports. I'm not so worried. For one thing,
i386 and amd64 actually generate SIGBUS in cases where we really
should generate SIGSEGV. And Linux does implement the SIGBUS behaviour
specified in POSIX here.
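
For illustration, the patch-up approach described above could look
roughly like the sketch below. This is not the actual X server code;
the names shm_fault, shm_guard and shm_demo, the single-mapping
bookkeeping and the temporary file path are all made up for the
example. It installs the handler for both SIGBUS and SIGSEGV so that
it does not depend on which signal the kernel picks, and it records
si_code so the caller can tell an end-of-object fault from an
alignment fault. Note that mmap() is not on the POSIX list of
async-signal-safe functions, so calling it from a handler is an
assumption that it behaves as a plain system call.

```c
#define _DEFAULT_SOURCE		/* MAP_ANON and mkstemp() on glibc */
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical bookkeeping for a single client-supplied mapping. */
static char *shm_base;
static size_t shm_len;
static size_t shm_pgsz;
static volatile sig_atomic_t shm_patched;	/* pages replaced so far */
static volatile sig_atomic_t shm_code;		/* si_code of the last fault */

static void
shm_fault(int sig, siginfo_t *si, void *ctx)
{
	uintptr_t addr = (uintptr_t)si->si_addr;
	uintptr_t base = (uintptr_t)shm_base;
	void *page;

	(void)ctx;
	if (shm_base == NULL || addr < base || addr >= base + shm_len) {
		/* Not our mapping: restore the default action and return;
		 * the retried access faults again and terminates us. */
		signal(sig, SIG_DFL);
		return;
	}
	shm_code = si->si_code;		/* e.g. BUS_ADRERR vs. BUS_ADRALN */

	/* Replace the faulting page with zero-filled anonymous memory. */
	page = (void *)(addr & ~((uintptr_t)shm_pgsz - 1));
	if (mmap(page, shm_pgsz, PROT_READ | PROT_WRITE,
	    MAP_PRIVATE | MAP_ANON | MAP_FIXED, -1, 0) == MAP_FAILED)
		_exit(1);
	shm_patched++;
}

/* Remember the mapping and install the handler for both signals. */
static int
shm_guard(void *base, size_t len)
{
	struct sigaction sa;

	shm_base = base;
	shm_len = len;
	shm_pgsz = (size_t)sysconf(_SC_PAGESIZE);

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = shm_fault;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGBUS, &sa, NULL) == -1 ||
	    sigaction(SIGSEGV, &sa, NULL) == -1)
		return -1;
	return 0;
}

/*
 * Map a temporary file, truncate it the way a hostile client would,
 * then read from the mapping.  Returns the byte read, or -1 if the
 * setup failed.
 */
static int
shm_demo(void)
{
	char path[] = "/tmp/shmdemo.XXXXXX";
	size_t len = 2 * (size_t)sysconf(_SC_PAGESIZE);
	char *p;
	int fd, c = -1;

	if ((fd = mkstemp(path)) == -1)
		return -1;
	unlink(path);
	if (ftruncate(fd, (off_t)len) == -1)
		goto out;
	p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED)
		goto out;
	if (shm_guard(p, len) == 0 && ftruncate(fd, 0) == 0)
		c = *(volatile char *)p;	/* faults; handler patches it */
	munmap(p, len);
out:
	close(fd);
	return c;
}
```

Calling shm_demo() should return 0 instead of killing the process,
because the faulting page has been replaced by zero-filled anonymous
memory. After the call, shm_code tells which kind of fault was
reported; that is the information the shell does not print.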
