----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29688/#review68314 -----------------------------------------------------------
Ship it! Main comment is to document the caveats of this approach with respect to not seeing usage data for a long time when there are many containers, see below. src/slave/containerizer/isolators/posix/disk.hpp <https://reviews.apache.org/r/29688/#comment112487> Do you want future, option, and resources included? src/slave/containerizer/isolators/posix/disk.hpp <https://reviews.apache.org/r/29688/#comment112530> How about 'DiskUsageCollector', given that it doesn't do the monitoring? Right now the monitoring is inside the isolator process, in that it keeps calling back into this: ``` // Responsible for collecting disk usage for paths, while ensuring // that an interval elapses between each collection. class DiskUsageCollector { ... }; ``` src/slave/containerizer/isolators/posix/disk.hpp <https://reviews.apache.org/r/29688/#comment112532> How about an additional note related to the fact that all containers are in a single "queue": ``` // This isolator monitors the disk usage for containers, and reports // Limitation when a container exceeds its disk quota. This leverages // the DiskUsageCollector to ensure that we don't induce too much // CPU usage and disk caching effects from running 'du' too often. // // NOTE: Currently all containers are processed in the same queue, // which means that when a container starts, it could take many disk // collection intervals until any data is available in the resource // usage statistics! // // TODO(jieyu): Consider handling each container independently, or // triggering an initial collection when the container starts, to // ensure that we have usage statistics without a large delay. ``` src/slave/containerizer/isolators/posix/disk.cpp <https://reviews.apache.org/r/29688/#comment112536> How about: ``` // The path might have just been removed from this container's resources. ``` Similar to the comment above about container destruction. src/slave/containerizer/isolators/posix/disk.cpp <https://reviews.apache.org/r/29688/#comment112542> Should this use os::killtree? Given that there might be a shell in the way? ``` slave | shell -c "..." | /usr/bin/du ... ``` src/slave/containerizer/isolators/posix/disk.cpp <https://reviews.apache.org/r/29688/#comment112551> Mind adding a newline? ``` // platforms (e.g. OS X uses 512 byte blocks). // // NOTE: ... ``` src/slave/containerizer/isolators/posix/disk.cpp <https://reviews.apache.org/r/29688/#comment112552> Could you add: ``` CHECK(!entries.empty()); ``` src/slave/containerizer/isolators/posix/disk.cpp <https://reviews.apache.org/r/29688/#comment112553> Maybe say "Failed to reap the status of 'du'?" src/slave/containerizer/isolators/posix/disk.cpp <https://reviews.apache.org/r/29688/#comment112534> Why not just use a 'queue' here? It's more obvious as to what it's for (the comment becomes unnecessary), and I believe erasing from a deque is often faster anyway since it avoids the pointer indirection of the list. ``` queue<Owned<Entry>> entries; ``` src/slave/flags.hpp <https://reviews.apache.org/r/29688/#comment112481> Can we clarify in here that this flag is for the "posix/disk" isolator? We probably should have made this explicit for the "network/port_mapping" flags as well, but let's start now with this one :) src/slave/flags.hpp <https://reviews.apache.org/r/29688/#comment112486> Might also want to clarify that this is for containers, not the disk (one might get confused between this and `--disk_watch_interval`). "Minimal" seems a bit confusing, should we say: "the interval between disk quota checks" (notice it's unlike the "Interval between the __start__ of perf stat samples" above). - Ben Mahler On Jan. 14, 2015, 8:28 p.m., Jie Yu wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/29688/ > ----------------------------------------------------------- > > (Updated Jan. 14, 2015, 8:28 p.m.) > > > Review request for mesos, Benjamin Hindman, Ben Mahler, and Ian Downes. > > > Bugs: MESOS-1588 > https://issues.apache.org/jira/browse/MESOS-1588 > > > Repository: mesos-git > > > Description > ------- > > Added DiskQuotaIsolator to enforce disk quota. I created a DiskUsageChecker > to check disk usage by calling 'du'. The DiskUsageChecker is throttled (see > comments). The isolator uses DiskUsageChecker to enforce disk quota. > > > Diffs > ----- > > src/Makefile.am fc0c3227466ccf364353a739fec8d9532ea3c683 > src/slave/containerizer/isolators/posix/disk.hpp PRE-CREATION > src/slave/containerizer/isolators/posix/disk.cpp PRE-CREATION > src/slave/containerizer/mesos/containerizer.cpp > 0bcf5ce7cfab470cabd3af3535344d19cb33b1c8 > src/slave/flags.hpp f1b8dfbb7391167b67a9498561742aa9ab9089a6 > > Diff: https://reviews.apache.org/r/29688/diff/ > > > Testing > ------- > > make check > > > Thanks, > > Jie Yu > >