On Monday, October 24, 2005 05:05:08 PM +0200 Peter Somogyi <[EMAIL PROTECTED]> wrote:

We've run into the same bug (vmalloc leak in the AFS client when many
users klog, openafs-1.4.0-rc4). Is the need to call "sysctl -w
afs.GCPAGs=1" to avoid the memleak a bug, or is it by design?

It's not actually a leak, and turning on GCPAGs is a performance tradeoff.
Let me try to describe what's going on here...


A PAG itself does not occupy any storage -- it's just a number, used to label processes which are members of that PAG, and also tokens, connections, and cached access rights belonging to that PAG. It is these things which take up space. These objects are cleaned up by a background daemon which performs several checks at different intervals.

Every three minutes, the background thread sweeps the token cache, looking for tokens which are expired or have been discarded. Any it finds in this state are deleted (freeing the storage they occupy), as are cached access rights for the PAG containing them.
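
A rough sketch of that token sweep, in Python rather than the client's C. Every name here (Token, Cache, sweep_tokens) is hypothetical and chosen for illustration; none of it is taken from the OpenAFS source.

```python
from dataclasses import dataclass, field

@dataclass
class Token:
    pag: int             # PAG number this token is labelled with
    expires_at: float    # expiry time, in seconds
    discarded: bool = False

@dataclass
class Cache:
    tokens: list = field(default_factory=list)
    access_rights: dict = field(default_factory=dict)  # pag -> cached rights

def sweep_tokens(cache, now):
    """Delete expired or discarded tokens, plus cached rights for their PAGs."""
    def is_dead(t):
        return t.discarded or t.expires_at <= now

    dead = [t for t in cache.tokens if is_dead(t)]
    cache.tokens = [t for t in cache.tokens if not is_dead(t)]
    for t in dead:
        # Drop the access rights cached for the PAG that held the token.
        cache.access_rights.pop(t.pag, None)
    return len(dead)
```

In this model the sweep only ever walks the token list, so its cost depends on how many tokens exist, not on how many processes are running.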


Every ten minutes, the background thread does a sweep of all PAGs for which we have tokens or active Rx connections. For any which either have no tokens or whose tokens have expired (within a short grace period), all connections are destroyed, and the structure used to track them is freed.
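
The ten-minute pass might look roughly like this. It is a sketch under assumptions: PagState and sweep_pags are made-up names, the 60-second grace value is invented (only "a short grace period" is given), and reading the grace period as extra time after expiry is an interpretation.

```python
from dataclasses import dataclass, field

GRACE_SECONDS = 60.0  # assumed value, not from the OpenAFS source

@dataclass
class PagState:
    token_expiries: list = field(default_factory=list)  # expiry times of held tokens
    connections: list = field(default_factory=list)     # stand-ins for Rx connections

def sweep_pags(active, now, grace=GRACE_SECONDS):
    """Free tracking state for PAGs with no tokens, or only expired ones."""
    for pag in list(active):           # copy the keys, since we delete while iterating
        state = active[pag]
        live = [e for e in state.token_expiries if e + grace > now]
        if not live:
            state.connections.clear()  # destroy all connections for this PAG
            del active[pag]            # free the structure used to track them
    return active
```

Only PAGs that already have an entry in the dictionary are visited, which mirrors the point that tracking structures exist only for active PAGs.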

So, after a short delay, no resources are used by a PAG whose tokens have expired or been deleted. This is done reasonably efficiently, by traversing a list of data structures which exist only for active PAGs.



The problem that comes up is in a situation where you frequently create a PAG, put some tokens in it, use it briefly, and then forget about it without bothering to delete its tokens. The ideal thing to do here is to fix either the "frequently" or the "without bothering to delete its tokens" parts. Lacking that, you can turn on the PAG garbage collector.

When enabled, the garbage collector runs every hour. It scans the process table, setting an in-use flag on each PAG which has at least one process in it (*). Then it marks as expired the tokens of any PAG which has no members, which causes the sweeps described above to throw away those tokens after a short time. This is not quite as racy as it sounds, but it does have the potential to miss a process and thus nuke tokens which are actually in use. It's also a performance hit, and is completely unnecessary in the majority of cases. Thus, it is not enabled by default.


(*) Note that as mentioned above, PAGs don't actually occupy any storage. The flag described is actually set on the structure used to manage tokens and connections for a PAG, which exists only if there is anything to manage.
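
To make the mark-and-expire idea concrete, here is a small sketch of one garbage collector pass. The function name and the dictionary layout are hypothetical. It models "marking as expired" by pulling each token's expiry time back to the current time, which is an assumption about the mechanism.

```python
def gc_pags(process_pags, pag_states, now):
    """One GC pass: expire the tokens of PAGs that no process belongs to.

    process_pags: iterable of PAG numbers, one per process (None = no PAG)
    pag_states:   dict pag -> {"in_use": bool, "token_expiries": [float, ...]}
    """
    for state in pag_states.values():
        state["in_use"] = False

    # Scan the process table and flag every PAG that still has a member.
    for pag in process_pags:
        if pag in pag_states:  # state exists only if there is something to manage
            pag_states[pag]["in_use"] = True

    expired = []
    for pag, state in pag_states.items():
        if not state["in_use"]:
            # Mark the tokens expired; the regular sweeps delete them later.
            state["token_expiries"] = [min(e, now) for e in state["token_expiries"]]
            expired.append(pag)
    return expired
```

The race mentioned above shows up in this model too: a process that joins a PAG after the scan but before the expiry step would lose tokens it is actively using.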


-- Jeff
