On 11/09/12 08:56, Gerhard Roth wrote:
On Thu, 08 Nov 2012 16:22:41 -0500
Ted Unangst <t...@tedunangst.com> wrote:
On Thu, Nov 08, 2012 at 13:34, Ilya Bakulin wrote:

The problem seems to be in uvm_map_pageable_all() function
(sys/uvm/uvm_map.c). This function is a "special case of uvm_map_pageable",
which tries to mlockall() all mapped memory regions.
Prior to calling uvm_map_pageable_wire(), which actually does the locking, it
tries to count how many bytes of memory will be locked, and compares this
number with uvmexp.wiredmax, which is set by RLIMIT_MEMLOCK.
The problem is that the counting algorithm doesn't take into account that
some pages have VM_PROT_NONE protection and hence won't be locked anyway.
Later, in uvm_map_pageable_wire(), these pages are skipped when the actual
wiring is done.
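
To make the mismatch concrete, here is a toy user-space model. It is not the actual uvm_map.c code: struct toy_entry, the TOY_PROT_* constants and both function names are made up for illustration, and only the behaviour described above is taken from the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the real UVM definitions. */
#define TOY_PROT_NONE  0x0
#define TOY_PROT_READ  0x1
#define TOY_PROT_WRITE 0x2

struct toy_entry {
	size_t start;	/* first byte of the mapping */
	size_t end;	/* one past the last byte */
	int    prot;	/* TOY_PROT_* bits */
};

/* Pre-check as described: sums every entry, whatever its protection. */
size_t
toy_count_all(const struct toy_entry *e, size_t n)
{
	size_t i, total = 0;

	assert(e != NULL || n == 0);
	for (i = 0; i < n; i++)
		total += e[i].end - e[i].start;
	return total;
}

/* Wiring pass as described: entries without any access are skipped. */
size_t
toy_count_wired(const struct toy_entry *e, size_t n)
{
	size_t i, total = 0;

	assert(e != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (e[i].prot == TOY_PROT_NONE)
			continue;
		total += e[i].end - e[i].start;
	}
	return total;
}
```

For a map with a 4096-byte readable entry, an 8192-byte TOY_PROT_NONE entry and a 4096-byte read/write entry, toy_count_all() yields 16384 while toy_count_wired() yields 8192. The limit check can therefore reject a request that the wiring pass itself would have stayed under.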
I don't know if this is right.  Should prot_none pages not be wired?

I think the opposite should happen.  prot_none pages should be locked
as well.  The app may be using prot_none as a way to protect its super
secret secrets from itself.  It certainly wouldn't want them being
swapped out.

As long as they have VM_PROT_NONE, they can't be accessed and wiring them
is just a waste of resources.

If your scenario applies then uvm_map_protect() kicks in. It takes care of
wiring pages if the protection changes from VM_PROT_NONE to some different
value, though I have to admit that this happens only in case the
VM_MAP_WIREFUTURE flag was specified. But that looks acceptable to me.
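
A rough sketch of the behaviour described here, again as a toy model rather than the real uvm_map_protect(): struct toy_region, toy_protect() and the wirefuture parameter are hypothetical names standing in for the map entry, the protection change and the VM_MAP_WIREFUTURE flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the real UVM definitions. */
#define TOY_PROT_NONE 0x0
#define TOY_PROT_READ 0x1

struct toy_region {
	int  prot;	/* current protection */
	bool wired;	/* whether the region is wired */
};

/*
 * Toy model: a region that was left unwired because it was PROT_NONE
 * gets wired when it first becomes accessible, but only if the map
 * was marked "wire future mappings".
 */
void
toy_protect(struct toy_region *r, int new_prot, bool wirefuture)
{
	assert(r != NULL);
	if (r->prot == TOY_PROT_NONE && new_prot != TOY_PROT_NONE &&
	    wirefuture)
		r->wired = true;
	r->prot = new_prot;
}
```

In this model a region that goes from TOY_PROT_NONE to TOY_PROT_READ ends up wired only when wirefuture is true; without it the region becomes accessible but stays unwired, which is the gap admitted above.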

Tedu is right and you're wrong. PROT_NONE protected pages must be wired when calling mlock* functions.

The main argument: malloc protects its bookkeeping data using mprotect(PROT_NONE), which you definitely want to wire if you call mlockall (either because you want to prevent information leaking to disk or you have a time-sensitive program like ntpd and swap hurts). As for wasting resources: the kernel has insufficient information to fix wasteful programs, nor does it have sufficient information to consider PROT_NONE pages on a case-by-case basis.
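
From user space, the case in question looks roughly like the sketch below: a page is mapped PROT_NONE (standing in for malloc's protected bookkeeping data) and the process then calls mlockall(MCL_CURRENT). The function name lock_with_guard_page() is made up for this example, and whether the call succeeds depends on privileges and RLIMIT_MEMLOCK on the machine it runs on.

```c
#define _DEFAULT_SOURCE	/* for MAP_ANON on glibc in strict -std modes */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Maps one inaccessible page, then asks the kernel to wire every
 * current mapping. Returns 0 on success, or the errno value of the
 * call that failed.
 */
int
lock_with_guard_page(void)
{
	long pagesz = sysconf(_SC_PAGESIZE);
	void *p;
	int rc = 0;

	assert(pagesz > 0);
	p = mmap(NULL, (size_t)pagesz, PROT_NONE,
	    MAP_PRIVATE | MAP_ANON, -1, 0);
	if (p == MAP_FAILED)
		return errno;

	if (mlockall(MCL_CURRENT) == -1)
		rc = errno;	/* e.g. the wired-memory limit was hit */
	else
		munlockall();	/* undo the lock before returning */

	munmap(p, (size_t)pagesz);
	return rc;
}
```

A program that calls this wants the PROT_NONE page counted and wired along with everything else; the return value only tells you whether the kernel accepted the request, not which pages it actually wired.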

Also consider that there is a limit on wired memory, if you are concerned about wasting resources.


Ilya Bakulin does point out a serious bug in the vmmap code, however: the resource-counting algorithm and the locking algorithm count differently. The two ought to be in sync; if no developer is going to fix the commit-part of the code, I would seriously recommend putting Ilya's diff in.
--
Ariane
