Crystal clear, thanks Ben!
On 1 August 2014 01:36, Benjamin Mahler <benjamin.mah...@gmail.com> wrote:
> Everything is scheduled for the garbage collection delay (1 week by
> default) from when it was last modified, but as the disk fills up we'll
> start pruning the older directories ahead of schedule.
>
> This means that things should be removed in the same order that they were
> scheduled.
>
> You can think of this as follows: everything gets scheduled for 1 week in
> the future, but we'll "speed up" the existing schedule when we need to
> make room. Make sense?
>
> On Thu, Jul 31, 2014 at 4:18 PM, Tom Arnfeld <t...@duedil.com> wrote:
>
>> Yeah, specifically the docker issue was related to volumes not being
>> removed with `docker rm`, but that's a separate issue.
>>
>> So right now mesos won't remove older work directories to make room for
>> new ones (old ones that have already been scheduled for removal in a few
>> days' time)? This means when the disk gets quite full, newer work
>> directories will be removed much faster than older ones. Is that correct?
>>
>> On 31 July 2014 23:56, Benjamin Mahler <benjamin.mah...@gmail.com> wrote:
>>
>>> Apologies for the lack of documentation. In the default setup, the slave
>>> will schedule the work directories for garbage collection when:
>>>
>>> (1) Executors terminate.
>>> (2) The slave recovers and discovers work directories for terminal
>>> executors.
>>>
>>> Sounds like the docker integration code you're using has a bug in this
>>> respect, by not scheduling docker directories for garbage collection
>>> during (1) and/or (2).
>>>
>>> On Thu, Jul 31, 2014 at 3:40 PM, Tom Arnfeld <t...@duedil.com> wrote:
>>>
>>>> I don't have them to hand now, but I recall it saying something in the
>>>> high 90's and 0ns for the max allowed age. I actually found the root
>>>> cause of the problem; it's docker related and out of mesos's control...
>>>> though I'm still curious about the expected behaviour of the GC
>>>> process. It doesn't seem to be well documented anywhere.
>>>>
>>>> Tom.
>>>>
>>>> On 31 July 2014 23:33, Benjamin Mahler <benjamin.mah...@gmail.com>
>>>> wrote:
>>>>
>>>>> What do the slave logs say?
>>>>>
>>>>> E.g.
>>>>>
>>>>> I0731 22:22:17.851347 23525 slave.cpp:2879] Current usage 7.84%. Max
>>>>> allowed age: 5.751197441470081days
>>>>>
>>>>> On Wed, Jul 30, 2014 at 8:55 AM, Tom Arnfeld <t...@duedil.com> wrote:
>>>>>
>>>>>> I'm not sure if this is something already supported by mesos, and if
>>>>>> so it'd be great if someone could point me in the right direction.
>>>>>>
>>>>>> Is there a way of asking a slave to garbage collect old executors
>>>>>> manually?
>>>>>>
>>>>>> Maybe I'm misunderstanding things, but as each executor does (insert
>>>>>> knowledge gap) mesos works out how long it is able to keep the
>>>>>> sandbox for and schedules it for garbage collection appropriately,
>>>>>> also taking into account the command line.
>>>>>>
>>>>>> The disk on one of my slaves is getting quite full (98%) and I'm
>>>>>> curious how mesos is going to behave in this situation. Should it
>>>>>> start clearing things up, given a task could launch that needs to use
>>>>>> an amount of disk space, but that disk is being eaten up by old
>>>>>> executor sandboxes?
>>>>>>
>>>>>> It may be worth noting I'm not specifying --gc_delay on any slave
>>>>>> right now; perhaps I should be?
>>>>>>
>>>>>> Any input would be much appreciated.
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> Tom.