> On July 27, 2016, 10:05 a.m., Qian Zhang wrote: > > src/slave/containerizer/mesos/isolators/cgroups/subsystem.cpp, line 494 > > <https://reviews.apache.org/r/49851/diff/5/?file=1453262#file1453262line494> > > > > How do we recover this field `updatedLimit`? I mean during agent > > recovery, `updatedLimit` of each Info will be reset to `false`, right? But > > I think that is not correct. > > haosdent huang wrote: > I think this problem exists at `mem.cpp` as well, and update the limit > twice here does not bring problems here. So I drop this. Feel free to reopen > this if you think it would bring problems here.
Yes, it is also a bug in `mem.cpp` which uses `info->pid` to check if it is the first time to update the memory limit, but the value of `info->pid` will also be lost during agent recovery. And I think updating the memory hard limit twice will bring some problems, as mentioned in this comments (https://github.com/apache/mesos/blob/1.0.0/src/slave/containerizer/mesos/isolators/cgroups/mem.cpp#L369:#L371), decreasing memory hard limit may induce an OOM. I had a discussion with Jie on this issue, and his suggestion is that we should not use such kind of field (e.g., `pid` or `updatedLimit`) because they can not be recovered, instead, before updating the hard limit, we should check if its current value is still the initial value (should be -1? And it may be different in different Linux distributions, so you may need to double check it), if it is, then that means it is the first time to update the hard limit, then we can just update it, if it is not, then that means it is not the first time, then we can only increase it but not decrease it. - Qian ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/49851/#review143517 ----------------------------------------------------------- On Aug. 4, 2016, 1:58 a.m., haosdent huang wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/49851/ > ----------------------------------------------------------- > > (Updated Aug. 4, 2016, 1:58 a.m.) > > > Review request for mesos, Gilbert Song, Jie Yu, Qian Zhang, and Zhengju Sha. > > > Bugs: MESOS-5045 > https://issues.apache.org/jira/browse/MESOS-5045 > > > Repository: mesos > > > Description > ------- > > Implemented `MemorySubsystem`. > > > Diffs > ----- > > src/slave/containerizer/mesos/isolators/cgroups/cgroups.hpp > b191b2a52a9645fc902a35ed52909b2142f0b4c0 > src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp > 2659252d8cffcefc233bc85fb4707c8147272737 > src/slave/containerizer/mesos/isolators/cgroups/constants.hpp > c45d88092f3fe497373dfeaa8346aef9126c7b8b > src/slave/containerizer/mesos/isolators/cgroups/subsystem.hpp > 5f52a076a1fa3a21d886cb961ddeed5046a38d7c > src/slave/containerizer/mesos/isolators/cgroups/subsystem.cpp > a30ecafcbecc9d3b6eeea2b04dcb4d278750af41 > > Diff: https://reviews.apache.org/r/49851/diff/ > > > Testing > ------- > > > Thanks, > > haosdent huang > >