Greetings all... We are currently using jobacct_gather/linux and enforcing memory limits on our jobs. The problem is that some of our users would like their jobs to use memory-mapped files. Currently jobacct_gather/linux only examines the RSS of the process, which includes resident pages of mapped files, and if the RSS exceeds the memory limit the job gets terminated.
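To make the distinction concrete, here is a rough sketch of the two ways of charging memory. It is an illustration only, not the attached patch and not the plugin's actual code: it assumes the numbers come from a line in /proc/<pid>/statm format (size, resident, shared, ... in pages), and the names mem_usage_t, parse_statm, charged_kb and the ignore_shared flag are all made up for this example.

```c
#include <stdio.h>

/* Hypothetical holder for the two figures we care about, in KB. */
typedef struct {
    unsigned long resident_kb; /* total RSS */
    unsigned long shared_kb;   /* resident pages backed by a file or shmem */
} mem_usage_t;

/*
 * Parse one line in /proc/<pid>/statm format. Only the first three
 * fields (size, resident, shared) are read; all are page counts.
 * Returns 0 on success, -1 if the line cannot be parsed.
 */
int parse_statm(const char *line, unsigned long page_kb, mem_usage_t *out)
{
    unsigned long size, resident, shared;

    if (sscanf(line, "%lu %lu %lu", &size, &resident, &shared) != 3)
        return -1;
    out->resident_kb = resident * page_kb;
    out->shared_kb = shared * page_kb;
    return 0;
}

/*
 * Memory to compare against the job limit. With ignore_shared == 0
 * this is plain RSS; with ignore_shared != 0 the shared pages are
 * subtracted, so large mmapped files no longer count.
 */
unsigned long charged_kb(const mem_usage_t *u, int ignore_shared)
{
    if (!ignore_shared)
        return u->resident_kb;
    if (u->shared_kb > u->resident_kb)
        return 0; /* guard against inconsistent readings */
    return u->resident_kb - u->shared_kb;
}
```

With 4 KB pages, a process showing 500 resident pages of which 200 are shared would be charged 2000 KB under plain RSS accounting but 1200 KB once shared pages are ignored.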
I've attached a functional patch that implements my fix: it ignores the shared pages of a process. This means that if a job allocates more RAM than the limit it still gets terminated, but if it's just mmapping very large files it is left alone.

I don't think this patch is the right solution though. In an ideal world we would either maintain our own plugin, or even better there would be a configuration option for jobacct_gather/linux to not count shared pages. I don't see an easy way to do either of these. Currently, to build a jobacct_gather plugin you need to be wired into the source tree - that's how I managed to get this built and tested. Reading configuration values (like the sched plugins do) appears to be quite tightly coupled to parts of src/common and would require changing more files than I'd like.

Questions are:

- Are there enough people out there interested in the functionality described here to warrant making this a config option for jobacct_gather/linux?
- If not, is there an easy way to build this outside of the source tree so we can maintain it ourselves on the side?

Thanks,
Chris
jag_l_shared.patch