On 11/09/12 08:56, Gerhard Roth wrote:
On Thu, 08 Nov 2012 16:22:41 -0500
Ted Unangst <t...@tedunangst.com> wrote:
On Thu, Nov 08, 2012 at 13:34, Ilya Bakulin wrote:
The problem seems to be in the uvm_map_pageable_all() function
(sys/uvm/uvm_map.c). This function is a "special case of uvm_map_pageable",
which tries to mlockall() all mapped memory regions.
Prior to calling uvm_map_pageable_wire(), which actually does the locking, it
tries to count how many bytes of memory will be locked, and compares this
number with uvmexp.wiredmax, which is set by RLIMIT_MEMLOCK.
The problem is that the counting algorithm doesn't take into account that some
pages have the VM_PROT_NONE flag set and hence won't be locked anyway.
Later, in uvm_map_pageable_wire(), these pages are skipped when doing the
actual job.
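
To make the mismatch described above concrete, here is a minimal user-space
sketch. It is not the code from sys/uvm/uvm_map.c: the struct, the constants
and all the *_sketch names are hypothetical stand-ins, and the limit is just a
parameter. It only shows how a counting pass that includes every entry can
reject a request that a wiring pass skipping no-access entries would satisfy.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the real UVM definitions. */
#define PROT_NONE_SKETCH 0x0
#define PROT_RW_SKETCH   0x3

struct entry_sketch {
	size_t start;	/* first byte of the mapping */
	size_t end;	/* one past the last byte */
	int    prot;	/* protection of the mapping */
};

/* Counting pass as described: every entry counts, whatever its protection. */
static size_t
count_all_sketch(const struct entry_sketch *e, size_t n)
{
	size_t i, total = 0;

	for (i = 0; i < n; i++)
		total += e[i].end - e[i].start;
	return total;
}

/* Wiring pass as described: entries without any access rights are skipped. */
static size_t
count_wired_sketch(const struct entry_sketch *e, size_t n)
{
	size_t i, total = 0;

	for (i = 0; i < n; i++) {
		if (e[i].prot == PROT_NONE_SKETCH)
			continue;
		total += e[i].end - e[i].start;
	}
	return total;
}

/* Returns 1 if the pre-check would refuse the request, 0 otherwise. */
static int
precheck_rejects_sketch(const struct entry_sketch *e, size_t n, size_t limit)
{
	return count_all_sketch(e, n) > limit;
}
```

With three entries of 0x2000 bytes (read/write), 0x5000 bytes (no access) and
0x1000 bytes (read/write) and a limit of 0x4000, the counting pass sees 0x8000
bytes and refuses, while the wiring pass would only have touched 0x3000 bytes.
Whichever behaviour is considered correct, the two passes have to agree.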
I don't know if this is right. Should prot_none pages not be wired?
I think the opposite should happen. prot_none pages should be locked
as well. The app may be using prot_none as a way to protect its super
secret secrets from itself. It certainly wouldn't want them being
swapped out.
As long as they have VM_PROT_NONE, they can't be accessed and wiring them
is just a waste of resources.
If your scenario applies then uvm_map_protect() kicks in. It takes care of
wiring pages if the protection changes from VM_PROT_NONE to some different
value, though I have to admit that this happens only in case the
VM_MAP_WIREFUTURE flag was specified. But that looks acceptable to me.
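
The behaviour described above can be sketched roughly as follows. This is a
simplified user-space model, not the actual uvm_map_protect() code: the
sk_entry struct, the SK_* constants and sk_protect() are hypothetical names
made up for illustration. It assumes an entry is wired on a protection change
only when the map has the wire-future flag set and the old protection was
"none".

```c
#include <stdbool.h>

/* Hypothetical stand-ins; not the real VM_PROT_* or VM_MAP_* values. */
#define SK_PROT_NONE      0x0
#define SK_PROT_READ      0x1
#define SK_MAP_WIREFUTURE 0x1

struct sk_entry {
	int prot;		/* current protection */
	int wired_count;	/* 0 means not wired */
};

/*
 * Change the protection of an entry. Returns true if the entry was
 * wired as a side effect of the change.
 */
static bool
sk_protect(struct sk_entry *e, int map_flags, int new_prot)
{
	int old_prot = e->prot;

	e->prot = new_prot;

	/* Wire only if future mappings are to be wired and access was just granted. */
	if ((map_flags & SK_MAP_WIREFUTURE) != 0 &&
	    old_prot == SK_PROT_NONE && new_prot != SK_PROT_NONE &&
	    e->wired_count == 0) {
		e->wired_count = 1;
		return true;
	}
	return false;
}
```

In this model, an entry that was skipped at lock time stays unwired after a
later protection change unless the wire-future flag is set on the map, which
is the gap admitted above.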
Tedu is right and you're wrong. PROT_NONE protected pages must be wired
when calling mlock* functions.
The main argument: malloc protects its bookkeeping data using
mprotect(PROT_NONE), which you definitely want to wire if you call
mlockall (either because you want to prevent information leaking to disk
or you have a time-sensitive program like ntpd and swap hurts). As for
wasting resources: the kernel has insufficient information to fix
wasteful programs, nor does it have sufficient information to consider
PROT_NONE pages on a case-by-case basis.
Also consider that there is a limit on wired memory, if you are
concerned about wasting resources.
Ilya Bakulin does point out a serious bug in the vmmap code, however: the
resource counting algorithm and the locking algorithm count differently.
The code ought to be in sync; if no developer is going to fix the
commit-part of the code, I would seriously recommend putting Ilya's diff in.
--
Ariane