Ian Downes created MESOS-2978:
---------------------------------

             Summary: Provide more debug information when OOMing a container
                 Key: MESOS-2978
                 URL: https://issues.apache.org/jira/browse/MESOS-2978
             Project: Mesos
          Issue Type: Improvement
          Components: isolation
    Affects Versions: 0.22.1
            Reporter: Ian Downes
            Priority: Minor

Currently, the cgroup memory isolator logs the contents of {{memory.stat}} when it detects that a container has OOMed. This is of some use for seeing how the different types of memory contributed to the OOM, but it says nothing about the memory usage of specific processes.

We should also log per-process (per-thread) information, e.g., something to the effect of:

{noformat}
[idownes@foobar]$ pwd
/sys/fs/cgroup/memory/mesos/XXXX
[idownes@foobar]$ cat tasks | xargs ps -o pid,tid,stat,time,rss,command -L -p
{noformat}

Unlike {{memory.stat}}, which is bounded, this output is of variable size, so measures should be taken to limit the amount logged.

Note: the OOM notification from the kernel is asynchronous with respect to the kernel's OOM handler killing processes, and Mesos in turn observes the notification asynchronously. Logging is thus best effort: it may lack information about processes that the kernel has already killed, or it may not happen at all if Mesos reacts first to the executor terminating.
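
To make the proposal concrete, here is a rough Python sketch of the bounded, best-effort collection described above. It is only an illustration, not the Mesos implementation (the isolator is C++). The helper names ({{read_tids}}, {{bound_lines}}, {{process_info}}), the 50-line cap, and the 5-second timeout are all assumptions made for the example; only the {{ps}} column list and the {{tasks}} file come from the description.

```python
import subprocess
from pathlib import Path

# Hypothetical cap; the issue only says the amount logged should be limited.
MAX_LINES = 50


def read_tids(cgroup_dir):
    """Return the thread IDs listed in the cgroup's `tasks` file."""
    text = Path(cgroup_dir, "tasks").read_text()
    return [int(tok) for tok in text.split()]


def bound_lines(text, max_lines=MAX_LINES):
    """Keep at most max_lines lines and note how many were dropped."""
    lines = text.splitlines()
    if len(lines) <= max_lines:
        return "\n".join(lines)
    dropped = len(lines) - max_lines
    return "\n".join(lines[:max_lines] + [f"... ({dropped} more lines truncated)"])


def process_info(cgroup_dir, max_lines=MAX_LINES):
    """Best-effort `ps` snapshot for the cgroup's tasks, or None if unavailable."""
    try:
        tids = read_tids(cgroup_dir)
    except OSError:
        return None  # the cgroup may already have been destroyed
    if not tids:
        return None  # every task may already have been killed by the kernel
    cmd = ["ps", "-o", "pid,tid,stat,time,rss,command", "-L",
           "-p", ",".join(map(str, tids))]
    try:
        # Assumed 5 second timeout so a stuck `ps` cannot block OOM handling.
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=5)
    except (OSError, subprocess.TimeoutExpired):
        return None
    # Races with the kernel OOM killer are expected, so the exit status is
    # ignored and whatever `ps` managed to print is returned, truncated.
    return bound_lines(result.stdout, max_lines)
```

A caller would log the result of {{process_info("/sys/fs/cgroup/memory/mesos/XXXX")}} next to the existing {{memory.stat}} output and skip it when the result is None. Truncating by line count rather than by bytes keeps each logged {{ps}} row intact, at the cost of a looser bound on total size.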