Module Name:    src
Committed By:   dholland
Date:           Sat Aug 27 23:06:01 UTC 2016

Modified Files:
        src/lib/libc/sys: brk.2

Log Message:
Rework pursuant to PR 7934: be more clear about the page granularity
behavior and when new memory is zeroed. Also, strengthen the warning
about mixing with calls to malloc (which is not a bug) and mention that
the portable way to fetch the initial break is to call sbrk(0). There
are implementations in the wild where using _end as the initial break
doesn't work.

To generate a diff of this commit:
cvs rdiff -u -r1.32 -r1.33 src/lib/libc/sys/brk.2

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
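
As a rough illustration of the log message's point about sbrk(0), the
following C sketch reads the break without moving it and shows the kind
of page rounding the revised manual page describes. The helper names
current_break() and round_up_to_page() are made up for this example and
are not part of any libc. The _DEFAULT_SOURCE define is an assumption
that matters only on glibc, where it exposes the sbrk() prototype.

```c
#define _DEFAULT_SOURCE /* assumption: exposes sbrk() on glibc; not needed on NetBSD */
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Hypothetical helper: return the current break without changing it.
 * sbrk(0) adjusts the break by zero bytes, so nothing is allocated.
 */
void *current_break(void)
{
	return sbrk(0);
}

/*
 * Hypothetical helper: round addr up to the next multiple of page_size,
 * mirroring how the break is rounded for allocation purposes.
 */
uintptr_t round_up_to_page(uintptr_t addr, uintptr_t page_size)
{
	assert(page_size != 0);
	return (addr + page_size - 1) / page_size * page_size;
}
```

Calling current_break() first thing in main() and saving the result is
one way to record the initial break. Passing that address and the value
of sysconf(_SC_PAGESIZE) to round_up_to_page() gives the page boundary
that the break would have to cross before a change has any material
effect.
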
Modified files:

Index: src/lib/libc/sys/brk.2
diff -u src/lib/libc/sys/brk.2:1.32 src/lib/libc/sys/brk.2:1.33
--- src/lib/libc/sys/brk.2:1.32 Thu May 13 10:20:57 2004
+++ src/lib/libc/sys/brk.2 Sat Aug 27 23:06:01 2016
@@ -1,4 +1,4 @@
-.\" $NetBSD: brk.2,v 1.32 2004/05/13 10:20:57 wiz Exp $
+.\" $NetBSD: brk.2,v 1.33 2016/08/27 23:06:01 dholland Exp $
 .\"
 .\" Copyright (c) 1980, 1991, 1993
 .\" The Regents of the University of California. All rights reserved.
@@ -29,7 +29,7 @@
 .\"
 .\" @(#)brk.2 8.4 (Berkeley) 5/1/95
 .\"
-.Dd July 12, 1999
+.Dd August 27, 2016
 .Dt BRK 2
 .Os
 .Sh NAME
@@ -56,16 +56,9 @@ and
 .Fn sbrk
 functions are used to change the amount of memory allocated in a
 process's data segment.
-They do this by moving the location of the
+They do this by moving the address at which the process's heap ends.
+This address is known as the
 .Dq break .
-The break is the first address after the end of the process's
-uninitialized data segment (also known as the
-.Dq BSS ) .
-.Pp
-While the actual process data segment size maintained by the kernel will only
-grow or shrink in page sizes, these functions allow setting the break
-to unaligned values (i.e. it may point to any address inside the last
-page of the data segment).
 .Pp
 The
 .Fn brk
@@ -74,24 +67,27 @@ function sets the break to
 .Pp
 The
 .Fn sbrk
-function raises the break by at least
+function changes the break by
+.Fa incr
+bytes.
+If
 .Fa incr
-bytes, thus allocating at least
+is positive, this allocates
 .Fa incr
 bytes of new memory in the data segment.
 If
 .Fa incr
 is negative,
-the break is lowered by
-.Fa incr
-bytes.
+this releases the corresponding number of bytes.
 .Pp
-.Fn sbrk
-returns the prior address of the break.
-The current value of the program break may be determined by calling
-.Fn sbrk 0 .
-(See also
-.Xr end 3 ) .
+While the break may be set to any address, actual allocation takes
+place in page-sized quantities.
+For allocation and access control purposes the address of the break is
+always rounded up to the next page boundary.
+Thus, changes to the break that do not cross a page boundary have no
+material effect.
+Any new pages that are allocated, however, always appear freshly
+zeroed.
 .Pp
 The
 .Xr getrlimit 2
@@ -99,18 +95,71 @@ system call may be used to determine
 the maximum permissible size of the
 .Em data
 segment;
-it will not be possible to set the break
-beyond the
+it will not be possible to set the break so that the sum of the heap
+size and the data segment is greater than the
 .Dv RLIMIT_DATA
 .Em rlim_max
 value returned from a call to
-.Xr getrlimit 2 ,
-e.g.
-.Dq etext + rlim.rlim_max .
-(see
+.Xr getrlimit 2 .
+One can use the
+.Dq _etext
+symbol to find the end of the program text and thus the beginning of
+the data segment.
+.\" XXX is that always true? there are platforms where there's a fairly
+.\" large unmapped gap between text and data, plus using etext doesn't
+.\" take into account read-only data, which is probably (or should be)
+.\" billed against text size and not data size.
+See
 .Xr end 3
-for the definition of
-.Em etext ) .
+regarding
+.Dq _etext .
+.Pp
+Historically and in
+.Nx
+the heap immediately follows the data segment, and in fact is
+considered part of it.
+Thus the initial break is the first address after the end of the
+process's uninitialized data (also known as the
+.Dq BSS ) .
+This address is provided by the linker as
+.Dq _end ;
+see
+.Xr end 3 .
+.Pp
+There exist implementations in the wild where this is not the case,
+however, or where the initial break is rounded up to a page boundary,
+or other minor variations, so the recommended more-portable way to
+retrieve the initial break is by calling
+.Fn sbrk 0
+at program startup.
+(This returns the current break without changing it.)
+.Pp
+In any event, the break may not be set to an address below its initial
+position.
+.Pp
+Note that ordinary application code should use
+.Xr malloc 3
+and related functions to allocate memory, or
+.Xr mmap 2
+for lower-level page-granularity control.
+While the
+.Fn brk
+and/or
+.Fn sbrk
+functions exist in most Unix-like environments, their semantics
+sometimes vary subtly and their use is not particularly portable.
+Also, one must take care not to mix calls to
+.Xr malloc 3
+or related functions with calls to
+.Fn brk
+or
+.Fn sbrk
+as this will ordinarily confuse
+.Xr malloc 3 ;
+this can be difficult to accomplish given that many things in the
+C library call
+.Xr malloc 3
+themselves.
 .Sh RETURN VALUES
 .Fn brk
 returns 0 if successful;
@@ -156,18 +205,6 @@ A
 function call appeared in
 .At v7 .
 .Sh BUGS
-Note that
-mixing
-.Fn brk
-and
-.Fn sbrk
-with
-.Xr malloc 3 ,
-.Xr free 3 ,
-and similar functions may result in non-portable program
-behavior.
-Caution is advised.
-.Pp
 Setting the break may fail due to a temporary lack of swap space.
 It is not possible to distinguish this from a failure caused by
 exceeding the maximum size of the data segment without consulting