Bas Westerbaan wrote:
Hello,
Quite a lot of userspace applications cache. Firefox caches pages;
MySQL caches queries; libc's free() (like practically all other
userspace allocators) won't immediately return freed memory to the
kernel, etc.
These applications are unaware of memory pressure. While the kernel
is trying its best to free memory, for instance by dropping possibly
more valuable caches, userspace is left blissfully unaware.
Obviously this isn't a really big problem, given that we've still got
swap to swap out those rarely used caches, except when the caches
aren't _that_ rarely used and regenerating them from their backing
store (e.g. precomputed values) might be faster than swapping the
pages back in from disk.
A solution would be to either
a) let the application make the kernel aware of pages that may be
dropped under memory pressure. This would be tricky to implement
in userspace: it's hard to keep an application from racing into a
dropped page. However, the kernel can directly free a page from
userspace, which makes it useful under real pressure. This in
contrast to
b) letting the application register itself with a cache share
priority. The application (and other aware applications) would then
be able to query how fair their current cache usage is relative to
their cache share priority. Freeing would still be completely in
their own hands.
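To make option b a bit more concrete, here is a rough sketch of what
the "how fair am I" query could compute. Everything in it is an
assumption for illustration only: struct cache_client,
cache_excess_bytes() and the idea that the fair share is simply
proportional to the priority are all hypothetical, not an existing
kernel or libc interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record for one cache-aware application. */
struct cache_client {
    unsigned priority;   /* registered cache share priority */
    size_t cache_bytes;  /* cache memory currently held */
};

/*
 * Return how many bytes client `self` holds above its proportional
 * share of all registered cache memory, or 0 if it is at or below
 * that share. Assumption: share = total * priority / sum(priorities).
 */
static size_t cache_excess_bytes(const struct cache_client *c, size_t n,
                                 size_t self)
{
    unsigned long long total_prio = 0, total_bytes = 0;
    size_t i;

    assert(c != NULL && self < n);

    for (i = 0; i < n; i++) {
        total_prio += c[i].priority;
        total_bytes += c[i].cache_bytes;
    }
    if (total_prio == 0)
        return 0; /* nobody registered a priority; nothing to compare */

    unsigned long long share = total_bytes * c[self].priority / total_prio;
    return c[self].cache_bytes > share
        ? (size_t)(c[self].cache_bytes - share)
        : 0;
}
```

An application would call this (or rather, ask the kernel for the
equivalent number) now and then and trim its own cache by the amount
returned, so freeing stays entirely in its own hands.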
The only related existing interfaces I could find are madvise and
mincore. With madvise pages can be marked as unneeded and these should
be swapped out earlier. With mincore one can determine whether pages
are resident (i.e. not swapped out). This would make for an existing
alternative to solution a. However, this doesn't eliminate the writes
to swap, and polling every time before accessing a cache isn't really
pretty.
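For reference, a minimal sketch of that madvise/mincore approach,
assuming Linux and an anonymous private mapping as the cache buffer.
The helper names (cache_alloc, cache_resident_pages, cache_drop) are
made up for this example; only mmap, mincore, madvise and sysconf are
real calls. Note that on Linux MADV_DONTNEED on such a mapping
discards the pages outright (later reads see zero-filled pages)
instead of swapping them out, so the contents must be regenerable.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: map an anonymous, page-aligned cache buffer. */
static void *cache_alloc(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/*
 * Poll residency with mincore(): returns the number of pages of
 * [addr, addr + len) currently in core, or -1 on error.
 * addr must be page-aligned (mmap guarantees that).
 */
static long cache_resident_pages(void *addr, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0 && len > 0);

    size_t npages = (len + (size_t)page - 1) / (size_t)page;
    unsigned char *vec = malloc(npages);
    if (vec == NULL)
        return -1;
    if (mincore(addr, len, vec) != 0) {
        free(vec);
        return -1;
    }

    long resident = 0;
    for (size_t i = 0; i < npages; i++)
        resident += vec[i] & 1; /* low bit set => page is resident */
    free(vec);
    return resident;
}

/* Tell the kernel the cache contents are no longer needed. */
static int cache_drop(void *addr, size_t len)
{
    return madvise(addr, len, MADV_DONTNEED);
}
```

The ugly part is visible right away: every cache lookup would need a
cache_resident_pages() call first, and the answer can already be
stale by the time the page is actually touched.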
I did consider guessing the memory pressure by looking at
/proc/meminfo, but I think it isn't that accurate.
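To show what I mean by guessing, a small sketch that derives a crude
"percent used" figure from the MemTotal and MemFree lines of
/proc/meminfo. The function names are invented for this example. The
inaccuracy is easy to see: MemFree does not count page cache that
could be reclaimed cheaply, so this number overstates the pressure.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Read /proc/meminfo into buf (NUL-terminated); 0 on success, -1 on error. */
static int meminfo_read(char *buf, size_t cap)
{
    FILE *f = fopen("/proc/meminfo", "r");
    if (f == NULL || cap == 0) {
        if (f != NULL)
            fclose(f);
        return -1;
    }
    size_t n = fread(buf, 1, cap - 1, f);
    fclose(f);
    buf[n] = '\0';
    return 0;
}

/* Return the value in kB of a "Key:   N kB" line, or -1 if not found. */
static long meminfo_field_kb(const char *text, const char *key)
{
    assert(text != NULL && key != NULL);
    size_t klen = strlen(key);

    for (const char *line = text; line != NULL && *line != '\0'; ) {
        if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
            long kb;
            return sscanf(line + klen + 1, "%ld", &kb) == 1 ? kb : -1;
        }
        const char *nl = strchr(line, '\n');
        line = nl != NULL ? nl + 1 : NULL;
    }
    return -1;
}

/* Crude pressure guess: percentage of MemTotal that is not MemFree. */
static int meminfo_used_percent(const char *text)
{
    long total = meminfo_field_kb(text, "MemTotal");
    long free_kb = meminfo_field_kb(text, "MemFree");

    if (total <= 0 || free_kb < 0)
        return -1;
    return (int)(100 - (free_kb * 100) / total);
}
```

Usage would be to call meminfo_read() into a buffer of a few
kilobytes and pass that to meminfo_used_percent(), polling from a
timer; that polling is another reason I don't like this route.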
The prev_priority field in the zoneinfo stuff is more useful for
memory pressure. I'm playing with making a blocking callback that
can wake someone up when this gets down to a certain priority level
(prio=12 => everything's rosy, prio=0 => we're in deep shit).
Before hacking something together (and being uncertain about the
thoroughness with which I searched for existing work, sorry), I would
like your thoughts on this.
Please CC me, I'm not on the list.
Bas