> On Aug. 23, 2012, 2:28 a.m., Ben Mahler wrote:
> > src/slave/constants.hpp, line 30
> > <https://reviews.apache.org/r/6704/diff/2/?file=143654#file143654line30>
> >
> >     so does 1 week seem correct?
> >     
> >     using the math you have below:
> >     disk 50%  
> >     max_age 3.5 days
> >     
> >     disk 75%
> >     max_age 1.75 days
> >     
> >     do we actually see the disks fill up this rapidly? perhaps initially we 
> > should be less agressive on this age, like set 2x or 3x on the current 
> > gc_timeout_hours
> >     
> >     of course, ben and you know much more about the typical disk usage 
> > patterns than I do, so let me know your opinion please :)

I agree. The current algorithm for calculating age is completely arbitrary. Not 
based on historical data. The (TODO) idea is for the age function to be 
expressed on command line. That makes it easy for us to tune it on the go (with 
a puppet run + master roll)


> On Aug. 23, 2012, 2:28 a.m., Ben Mahler wrote:
> > src/slave/constants.hpp, line 31
> > <https://reviews.apache.org/r/6704/diff/2/?file=143654#file143654line31>
> >
> >     seems a bit frequent no? what does statvfs do under the hood?

changed it to 60 seconds. btw, statvfs is pretty fast (this is what 'df' uses). 
unlike 'du' which calculates disk space usage "right now", df uses cached 
information from something called a primary superblock of the filesystem. for 
more info: 
http://linuxshellaccount.blogspot.com/2008/12/why-du-and-df-display-different-values.html


> On Aug. 23, 2012, 2:28 a.m., Ben Mahler wrote:
> > src/slave/gc.hpp, line 23
> > <https://reviews.apache.org/r/6704/diff/2/?file=143656#file143656line23>
> >
> >     kill these new imports, and put them in the cpp file instead?

done


> On Aug. 23, 2012, 2:28 a.m., Ben Mahler wrote:
> > src/slave/gc.cpp, line 127
> > <https://reviews.apache.org/r/6704/diff/2/?file=143657#file143657line127>
> >
> >     this will just print true or false, would be nice to have the actual 
> > error here, so how about we LOG in os::rmdir() when anything goes wrong?
> >     
> >

the exit status shouldn't be there. i used it for debugging. removed it.

i think most (all?) our stout functions should be wrapped with 
Try/Result/Option. this would make it easier to grab their exit status at the 
call site. on the flip side, this makes the functions a bit cumbersome (call, 
check, get) to use. i will punt on this for now, may be another review 
(Technical debt :))


> On Aug. 23, 2012, 2:28 a.m., Ben Mahler wrote:
> > src/slave/gc.cpp, line 146
> > <https://reviews.apache.org/r/6704/diff/2/?file=143657#file143657line146>
> >
> >     move the log line within the if? looks really noisy

done


> On Aug. 23, 2012, 2:28 a.m., Ben Mahler wrote:
> > src/slave/slave.cpp, line 1591
> > <https://reviews.apache.org/r/6704/diff/2/?file=143659#file143659line1591>
> >
> >     I'm not getting the math here:
> >     
> >     current gc timeout hours: 1 week
> >     suppose max age: 1 day
> >     
> >     then
> >     
> >     1 week - 1 day = 6 days
> >     
> >     so we would prune everything scheduled for deletion within the next 6 
> > days...?

yes that's correct.  this is because, if a file's deletion time is within next 
6 days then its atleast 1 day old! this stems from the fact that a directory is 
always scheduled for deletion 1 week into future from the time it becomes a 
candidate for deletion.

i know this might be hard to wrap one's head around. i will add a comment 
explaining this. but i'm to open to suggestions for making the math easier to 
understand.


> On Aug. 23, 2012, 2:28 a.m., Ben Mahler wrote:
> > src/slave/slave.cpp, line 1588
> > <https://reviews.apache.org/r/6704/diff/2/?file=143659#file143659line1588>
> >
> >     can you nicely format this number, like %.2f

its a bit involved to do this in C++ (set width, set precision, unset 
precision, unset width). 
since we haven't done that anywhere else in the code base, i'm gonna punt on 
this.


- Vinod


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6704/#review10603
-----------------------------------------------------------


On Aug. 21, 2012, 11:20 p.m., Vinod Kone wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/6704/
> -----------------------------------------------------------
> 
> (Updated Aug. 21, 2012, 11:20 p.m.)
> 
> 
> Review request for mesos, Benjamin Hindman, John Sirois, and Ben Mahler.
> 
> 
> Description
> -------
> 
> Added the ability in the slave to GC executor directories based on the 
> current disk usage.
> 
> Currently, it uses a very simple algorithm to decide which directories to 
> delete.
> 
> 
> This addresses bug MESOS-254.
>     https://issues.apache.org/jira/browse/MESOS-254
> 
> 
> Diffs
> -----
> 
>   src/slave/constants.hpp ab83972 
>   src/slave/flags.hpp 0c7917f 
>   src/slave/gc.hpp 6704742 
>   src/slave/gc.cpp 9c01024 
>   src/slave/slave.hpp 10c537b 
>   src/slave/slave.cpp 4efd41e 
>   src/tests/gc_tests.cpp 2f0bdde 
>   third_party/libprocess/include/stout/os.hpp b1eceb3 
> 
> Diff: https://reviews.apache.org/r/6704/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Vinod Kone
> 
>

Reply via email to