Module Name: src
Committed By: dholland
Date: Sat Aug 27 23:06:01 UTC 2016
Modified Files:
src/lib/libc/sys: brk.2
Log Message:
Rework pursuant to PR 7934: be clearer about the page-granularity
behavior and about when new memory is zeroed.
Also, strengthen the warning about mixing with calls to malloc and move
it out of BUGS (it is not a bug), and mention that the portable way to
fetch the initial break is to call sbrk(0). There are implementations
in the wild where using _end as the initial break doesn't work.
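For illustration only, here is a minimal sketch of the two points above:
fetching the break with sbrk(0) instead of relying on _end, and the
page rounding that makes small changes to the break immaterial. The
helper names current_break(), page_round_up() and slack_in_last_page()
are hypothetical and are not part of libc or of this commit; the
_DEFAULT_SOURCE define is an assumption for building on glibc and
should be harmless elsewhere.

```c
#define _DEFAULT_SOURCE	/* assumption: lets glibc declare sbrk() */
#include <stdint.h>
#include <unistd.h>

/* Return the current break without moving it, or NULL on failure. */
static void *
current_break(void)
{
	void *p = sbrk(0);

	return (p == (void *)-1) ? NULL : p;
}

/*
 * Round an address up to the next page boundary, mirroring what the
 * man page says happens for allocation and access control purposes.
 * pagesize must be a power of two.
 */
static uintptr_t
page_round_up(uintptr_t addr, uintptr_t pagesize)
{
	return (addr + pagesize - 1) & ~(pagesize - 1);
}

/*
 * Number of bytes by which the break could be raised without crossing
 * a page boundary, i.e. without any new page being allocated.
 */
static uintptr_t
slack_in_last_page(void)
{
	uintptr_t brk_addr = (uintptr_t)current_break();
	uintptr_t pagesize = (uintptr_t)sysconf(_SC_PAGESIZE);

	return page_round_up(brk_addr, pagesize) - brk_addr;
}
```

To capture the initial break, current_break() would have to be called
at program startup, before anything that may call malloc (including
stdio output), since malloc may move the break itself.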
To generate a diff of this commit:
cvs rdiff -u -r1.32 -r1.33 src/lib/libc/sys/brk.2
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
Modified files:
Index: src/lib/libc/sys/brk.2
diff -u src/lib/libc/sys/brk.2:1.32 src/lib/libc/sys/brk.2:1.33
--- src/lib/libc/sys/brk.2:1.32 Thu May 13 10:20:57 2004
+++ src/lib/libc/sys/brk.2 Sat Aug 27 23:06:01 2016
@@ -1,4 +1,4 @@
-.\" $NetBSD: brk.2,v 1.32 2004/05/13 10:20:57 wiz Exp $
+.\" $NetBSD: brk.2,v 1.33 2016/08/27 23:06:01 dholland Exp $
.\"
.\" Copyright (c) 1980, 1991, 1993
.\" The Regents of the University of California. All rights reserved.
@@ -29,7 +29,7 @@
.\"
.\" @(#)brk.2 8.4 (Berkeley) 5/1/95
.\"
-.Dd July 12, 1999
+.Dd August 27, 2016
.Dt BRK 2
.Os
.Sh NAME
@@ -56,16 +56,9 @@ and
.Fn sbrk
functions are used to change the amount of memory allocated in a
process's data segment.
-They do this by moving the location of the
+They do this by moving the address at which the process's heap ends.
+This address is known as the
.Dq break .
-The break is the first address after the end of the process's
-uninitialized data segment (also known as the
-.Dq BSS ) .
-.Pp
-While the actual process data segment size maintained by the kernel will only
-grow or shrink in page sizes, these functions allow setting the break
-to unaligned values (i.e. it may point to any address inside the last
-page of the data segment).
.Pp
The
.Fn brk
@@ -74,24 +67,27 @@ function sets the break to
.Pp
The
.Fn sbrk
-function raises the break by at least
+function changes the break by
+.Fa incr
+bytes.
+If
.Fa incr
-bytes, thus allocating at least
+is positive, this allocates
.Fa incr
bytes of new memory in the data segment.
If
.Fa incr
is negative,
-the break is lowered by
-.Fa incr
-bytes.
+this releases the corresponding number of bytes.
.Pp
-.Fn sbrk
-returns the prior address of the break.
-The current value of the program break may be determined by calling
-.Fn sbrk 0 .
-(See also
-.Xr end 3 ) .
+While the break may be set to any address, actual allocation takes
+place in page-sized quantities.
+For allocation and access control purposes the address of the break is
+always rounded up to the next page boundary.
+Thus, changes to the break that do not cross a page boundary have no
+material effect.
+Any new pages that are allocated, however, always appear freshly
+zeroed.
.Pp
The
.Xr getrlimit 2
@@ -99,18 +95,71 @@ system call may be used to determine
the maximum permissible size of the
.Em data
segment;
-it will not be possible to set the break
-beyond the
+it will not be possible to set the break so that the sum of the heap
+size and the data segment is greater than the
.Dv RLIMIT_DATA
.Em rlim_max
value returned from a call to
-.Xr getrlimit 2 ,
-e.g.
-.Dq etext + rlim.rlim_max .
-(see
+.Xr getrlimit 2 .
+One can use the
+.Dq _etext
+symbol to find the end of the program text and thus the beginning of
+the data segment.
+.\" XXX is that always true? there are platforms where there's a fairly
+.\" large unmapped gap between text and data, plus using etext doesn't
+.\" take into account read-only data, which is probably (or should be)
+.\" billed against text size and not data size.
+See
.Xr end 3
-for the definition of
-.Em etext ) .
+regarding
+.Dq _etext .
+.Pp
+Historically and in
+.Nx
+the heap immediately follows the data segment, and in fact is
+considered part of it.
+Thus the initial break is the first address after the end of the
+process's uninitialized data (also known as the
+.Dq BSS ) .
+This address is provided by the linker as
+.Dq _end ;
+see
+.Xr end 3 .
+.Pp
+There exist implementations in the wild where this is not the case,
+however, or where the initial break is rounded up to a page boundary,
+or other minor variations, so the recommended more-portable way to
+retrieve the initial break is by calling
+.Fn sbrk 0
+at program startup.
+(This returns the current break without changing it.)
+.Pp
+In any event, the break may not be set to an address below its initial
+position.
+.Pp
+Note that ordinary application code should use
+.Xr malloc 3
+and related functions to allocate memory, or
+.Xr mmap 2
+for lower-level page-granularity control.
+While the
+.Fn brk
+and/or
+.Fn sbrk
+functions exist in most Unix-like environments, their semantics
+sometimes vary subtly and their use is not particularly portable.
+Also, one must take care not to mix calls to
+.Xr malloc 3
+or related functions with calls to
+.Fn brk
+or
+.Fn sbrk
+as this will ordinarily confuse
+.Xr malloc 3 ;
+this can be difficult to accomplish given that many things in the
+C library call
+.Xr malloc 3
+themselves.
.Sh RETURN VALUES
.Fn brk
returns 0 if successful;
@@ -156,18 +205,6 @@ A
function call appeared in
.At v7 .
.Sh BUGS
-Note that
-mixing
-.Fn brk
-and
-.Fn sbrk
-with
-.Xr malloc 3 ,
-.Xr free 3 ,
-and similar functions may result in non-portable program
-behavior.
-Caution is advised.
-.Pp
Setting the break may fail due to a temporary lack of swap space.
It is not possible to distinguish this from a failure caused by
exceeding the maximum size of the data segment without consulting