Thanks!
On Fri, May 1, 2020 at 00:11, Vinod Kone <vinodk...@apache.org> wrote:
>
> I commented on the JIRA.
>
> On Thu, Apr 30, 2020 at 3:02 PM Charles-François Natali <cf.nat...@gmail.com>
> wrote:
>
> > Thanks Vinod.
> >
> > Yes, I understand that Mesos assumes it's the only process managing
> > resources, makes sense.
> > Looking at the code and testing shows the agent reports as available
> > memory the total memory of the host, minus 1GB (or half the total
> > memory if the total memory is below 2GB)
> > (
> > https://github.com/apache/mesos/blob/master/src/slave/containerizer/containerizer.cpp#L152
> > ).
> > So basically it means that it assumes that the OS doesn't use more
> > than 1GB. I guess if it's not the case one can just specify the memory
> > manually to the agent, so that's fine.
> >
> > Actually the reason I was wondering about this is because we recently
> > had a problem where containers couldn't be destroyed because of tasks
> > stuck in uninterruptible (D) state, which caused the memory to be
> > basically leaked, i.e. the agent was advertising the memory as free
> > while it was still being used by the stuck processes. We ran into a
> > similar issue with GPUs - it's a known issue
> > https://issues.apache.org/jira/browse/MESOS-8038 - I posted an
> > analysis and potential fix, it'd be great if someone could have a look
> > :).
> >
> > Cheers,
> >
> > Charles
> >
> > On Thu, Apr 30, 2020 at 15:36, Vinod Kone <vinodk...@apache.org> wrote:
> > >
> > > Mesos assumes that it is the only process managing resources of a box
> > > (cpu, mem, disk). So if you have out of band processes using up
> > > resources it won't be reflected in the resource offers and the box can
> > > be overcommitted.
> > > There is no runtime periodic check of available resources, it's only
> > > calculated once at startup.
> > >
> > > Resource detection logic is here:
> > > https://github.com/apache/mesos/blob/master/src/slave/containerizer/containerizer.cpp#L65
> > >
> > > On Thu, Apr 30, 2020 at 8:17 AM Charles-François Natali
> > > <cf.nat...@gmail.com> wrote:
> > >
> > > > Hi,
> > > >
> > > > Could someone point me to some code/documentation explaining how the
> > > > agent available memory is computed, and when it is refreshed?
> > > >
> > > > For example, if I have an agent started, with some outstanding offers,
> > > > and I then start a process - not as a task managed by Mesos, but as an
> > > > external process which just allocates a lot of memory - and touches
> > > > it, not just committed - I can see the machine available memory go
> > > > down (as reported by free, and MemAvailable in /proc/meminfo), but the
> > > > agent doesn't rescind any offer, and never seems to actually refresh
> > > > it - even after starting/stopping tasks.
> > > >
> > > > Cheers,
> > > >
> > > > Charles
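
For anyone who wants to sanity-check the numbers discussed in this thread, here is a minimal Python sketch of the default memory rule as described above (total minus 1GB, or half the total when the host has less than 2GB), together with a small parser for /proc/meminfo-style text. It is an illustration only and is not taken from the Mesos source: the function names (default_agent_mem, parse_meminfo, advertised_minus_available) and the sample values are hypothetical, and "GB" is assumed to mean 2^30 bytes.

```python
GIB = 1024 ** 3  # assumption: the "1GB" in the thread means 2**30 bytes


def default_agent_mem(total_bytes: int) -> int:
    """Memory the agent would advertise by default, per the rule in the thread.

    Hypothetical helper: total minus 1GB, or half the total below 2GB.
    """
    if total_bytes >= 2 * GIB:
        return total_bytes - GIB  # leave 1GB for the OS
    return total_bytes // 2  # small host: advertise half


def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into a {field: bytes} mapping."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if not sep or not parts:
            continue  # skip lines that are not "Name: value [unit]"
        value = int(parts[0])
        if len(parts) > 1 and parts[1] == "kB":
            value *= 1024  # the kernel's "kB" is 1024 bytes
        fields[name.strip()] = value
    return fields


def advertised_minus_available(meminfo_text: str) -> int:
    """How far the startup-time figure exceeds what the kernel reports now.

    Only meaningful on an agent running no Mesos tasks, since task memory
    also lowers MemAvailable.
    """
    info = parse_meminfo(meminfo_text)
    advertised = default_agent_mem(info["MemTotal"])
    return max(0, advertised - info["MemAvailable"])


# Made-up sample: an 8GB host where an external process holds most memory.
SAMPLE = (
    "MemTotal:        8388608 kB\n"
    "MemFree:         1048576 kB\n"
    "MemAvailable:    2097152 kB\n"
)

if __name__ == "__main__":
    print(advertised_minus_available(SAMPLE) // GIB, "GiB advertised but not available")
```

On a real host, passing the contents of /proc/meminfo to advertised_minus_available gives a rough idea of how far the startup-time figure has drifted from what the kernel currently reports. Since the agent does not refresh this value at runtime, a large gap on an otherwise idle agent suggests out-of-band memory use.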