Uri Guttman writes:
> 
> mmap won't help with returning ram to the system. there is still a
> malloc/free involved and that will come from the same place as the rest
> of ram. the only way to return virtual ram is by lowering the top virtual
> address with brk()/sbrk() calls. and that needs contiguous ram at the top
> of the address space. and that means freed ram must be collected into
> contiguous blocks. that is nothing to do with mark/sweep vs ref counting
> or any other type of gc. mark sweep will use a large contiguous area to
> copy active data but it needs the other buffer to do it again. it won't
> ever want to free that ram. the only way this ever happens is the custom
> code that manages the ram blocks by calling brk() directly. it is too
> tricky to do in most cases. and with virtual ram it doesn't really matter
> as i said, it will be swapped out until the process dies if you don't use
> it again.

At the risk of again poking this prickly mailing list... this is simply
wrong. Certainly I cannot speak for every variation of every operating
system ever written, and your understanding of this may be based on some
system that actually worked this way, unlikely though that seems. But mmap
does not perform any malloc & free -- nor does it use brk/sbrk to extend the
heap boundary.

At any given point in time, every process has a _virtual_ address space
which is essentially "From 0 to MAXINT/MAXLONG/MAXLONGLONG/Etc", depending
on the specific OS -- that is, the virtual address space is every possible
location that a pointer can hold, even if nothing is "at" that address.

To keep things simple, processes have a region of memory called "heap",
which is basically one big .. heap.  It's a block of addresses, and when a
process calls malloc, it attempts to find some unused memory in that block,
and if it cannot, it uses brk/sbrk to extend the heap -- make it bigger.

At any given point in time, if a process attempts to access a bit of memory,
it passes through one or more layers of virtual-address-translation.  If the
address is in a location the kernel says is "ok", then the translation will
map to a page of physical RAM -- if no page is mapped there yet, the access
triggers a page fault and the kernel supplies one, establishing the VAT
mapping. When a process brk's the heap, the kernel merely becomes prepared
to fault in RAM to back the additional address space.

However, mmap does not use the heap -- it picks other locations in the
virtual address space to use (often near the stack, far away from the heap)
-- the caller can even specify where in that address space they want the
mapping to occur, which can facilitate using non-offset internal pointers.
When the kernel services a mmap call, it arranges to handle page faults
within the range provided by mmap (base address + size) by using the VAT to
map the virtual address to a page of physical RAM which contains (by reading
it from the file the first time it's accessed) the content of the mapped
file.  (When mmap is used without a file, i.e. as an anonymous mapping, the
pages read as zeros.)  If the mmap call is private/exclusive, then this page of
physical RAM will only be mapped to this process's virtual addressing; if it
is shared, then multiple processes, from distinct virtual addresses, will
VAT to the same page of physical RAM.  This can facilitate many
efficiencies.
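
To make the shared/private distinction concrete, here is a minimal sketch
using Python's standard mmap module on a Unix-like system. The function name
is my own invention for illustration, and it assumes os.fork and
MAP_ANONYMOUS are available. A child process writes into both mappings; the
parent then looks at what it can see.

```python
import mmap
import os


def shared_vs_private(size=mmap.PAGESIZE):
    """Hypothetical helper: return the first byte the parent sees in a
    shared and in a private anonymous mapping after a child wrote to both."""
    # Anonymous mappings (fileno -1) are not backed by a file.
    shared = mmap.mmap(-1, size, flags=mmap.MAP_SHARED | mmap.MAP_ANONYMOUS)
    private = mmap.mmap(-1, size, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)

    pid = os.fork()
    if pid == 0:
        # Child: write one byte into each mapping, then exit immediately.
        shared[0:1] = b"S"
        private[0:1] = b"P"
        os._exit(0)

    os.waitpid(pid, 0)
    result = (shared[0:1], private[0:1])
    shared.close()
    private.close()
    return result


if __name__ == "__main__":
    print(shared_vs_private())
```

The parent should see the child's write through the shared mapping, while
the private mapping still reads as a zero byte, because the child's write
went to its own copy of that page.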

Again, to keep this simple... in order for the VAT to become unmapped, and
return the physical RAM to the pool where it may be reused (multiple such
pools exist, but I'll not dwell on that), the kernel has to be told that the
process no longer wants that VAT to work -- that is, it arranges for
accesses to those virtual addresses to result in segmentation violation
(segv). At that point, the RAM itself is "free" and available for any other
process.

In order for the heap's range of virtual addresses to return RAM, the "end"
of the heap must be lowered, and thus -- as people have said -- the process
needs to be sure that all the addressable stuff it is using lies below some
point, such that the end can be lowered to that point.

In order for the range of addresses provided by any given mmap to return
RAM, the kernel only needs to be told to unmap the memory.  Now, it seems
obvious, but I will point out that this does require that the process be
sure that it is not still using virtual addresses that are in that range --
one cannot have one's memory and free it too.  But it does not require that
anything be shuffled around to fit into a smaller contiguous block such as
the heap.
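
As a rough illustration of unmapping, the sketch below again uses Python's
mmap module, where close() releases the mapping (on Unix-like systems it
calls munmap underneath). The function name is hypothetical. Note that
Python guards against the segv one would get in C: touching a closed mapping
raises ValueError instead.

```python
import mmap


def use_after_unmap():
    """Hypothetical helper: map one page, use it, unmap it, then show that
    the mapping can no longer be accessed."""
    m = mmap.mmap(-1, mmap.PAGESIZE)  # anonymous mapping, one page
    m[0:5] = b"hello"
    data = m[0:5]                     # slicing copies the bytes out
    m.close()                         # unmap; nothing else has to move

    try:
        m[0]                          # the range is no longer valid
    except ValueError:
        return data, True
    return data, False


if __name__ == "__main__":
    print(use_after_unmap())
```

Nothing in the process's heap had to be compacted or relocated for this
release to happen; the only obligation was to stop using the mapped range.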

It is worth pointing out that (in general) if one simply mmap's a 100MB file
and accesses a single byte in the middle of that file, then one will only
fault in _one_ page of RAM, and when one unmap's the file, one will only
release _one_ page of RAM.  It is possible to instruct the kernel to
pre-fault pages into RAM to enhance performance, but as a general rule, you
will only use as much RAM as is required based on the actual reads & writes
to virtual addresses mapped into a file.
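
The scenario above can be sketched as follows, again in Python and assuming
a Unix-like system (os.pwrite and os.ftruncate). The function name and the
use of a sparse temporary file are my own choices for illustration. The
sketch only performs the single-byte access; it does not measure how many
pages the kernel actually faulted in, which can be more than one if the
kernel reads ahead.

```python
import mmap
import os
import tempfile

FILE_SIZE = 100 * 1024 * 1024  # 100MB, as in the example above


def read_middle_byte(size=FILE_SIZE):
    """Hypothetical helper: map a large file and touch one byte in the
    middle, returning that byte's value."""
    fd, path = tempfile.mkstemp()
    try:
        middle = size // 2
        os.pwrite(fd, b"X", middle)  # one real byte in the middle
        os.ftruncate(fd, size)       # extend to full size (sparse file)

        # Mapping the whole file reserves address space, not RAM.
        with mmap.mmap(fd, size, access=mmap.ACCESS_READ) as m:
            return m[middle]         # only this access touches the file
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(read_middle_byte())
```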

Anyway, to continue would belabour the point. This information is widely
available.

