-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29688/#review68314
-----------------------------------------------------------

Ship it!


Main comment is to document the caveats of this approach with respect to not 
seeing usage data for a long time when there are many containers, see below.


src/slave/containerizer/isolators/posix/disk.hpp
<https://reviews.apache.org/r/29688/#comment112487>

    Do you want future, option, and resources included?



src/slave/containerizer/isolators/posix/disk.hpp
<https://reviews.apache.org/r/29688/#comment112530>

    How about 'DiskUsageCollector', given that it doesn't do the monitoring? 
Right now the monitoring is inside the isolator process, in that it keeps 
calling back into this:
    
    ```
    // Responsible for collecting disk usage for paths, while ensuring
    // that an interval elapses between each collection.
    class DiskUsageCollector
    {
      ...
    };
    ```



src/slave/containerizer/isolators/posix/disk.hpp
<https://reviews.apache.org/r/29688/#comment112532>

    How about an additional note related to the fact that all containers are in 
a single "queue":
    
    ```
    // This isolator monitors the disk usage for containers, and reports
    // Limitation when a container exceeds its disk quota. This leverages
    // the DiskUsageCollector to ensure that we don't induce too much
    // CPU usage and disk caching effects from running 'du' too often.
    //
    // NOTE: Currently all containers are processed in the same queue,
    // which means that when a container starts, it could take many disk
    // collection intervals until any data is available in the resource
    // usage statistics!
    //
    // TODO(jieyu): Consider handling each container independently, or
    // triggering an initial collection when the container starts, to
    // ensure that we have usage statistics without a large delay.
    ```



src/slave/containerizer/isolators/posix/disk.cpp
<https://reviews.apache.org/r/29688/#comment112536>

    How about:
    
    ```
    // The path might have just been removed from this container's resources.
    ```
    
    Similar to the comment above about container destruction.



src/slave/containerizer/isolators/posix/disk.cpp
<https://reviews.apache.org/r/29688/#comment112542>

    Should this use os::killtree? Given that there might be a shell in the way?
    
    ```
    slave 
      |
    shell -c "..."
      |
    /usr/bin/du ...
    ```



src/slave/containerizer/isolators/posix/disk.cpp
<https://reviews.apache.org/r/29688/#comment112551>

    Mind adding a newline?
    
    ```
    // platforms (e.g. OS X uses 512 byte blocks).
    //
    // NOTE: ...
    ```



src/slave/containerizer/isolators/posix/disk.cpp
<https://reviews.apache.org/r/29688/#comment112552>

    Could you add:
    
    ```
    CHECK(!entries.empty());
    ```



src/slave/containerizer/isolators/posix/disk.cpp
<https://reviews.apache.org/r/29688/#comment112553>

    Maybe say "Failed to reap the status of 'du'?"



src/slave/containerizer/isolators/posix/disk.cpp
<https://reviews.apache.org/r/29688/#comment112534>

    Why not just use a 'queue' here? It's more obvious as to what it's for (the 
comment becomes unnecessary), and I believe erasing from a deque is often 
faster anyway since it avoids the pointer indirection of the list.
    
    ```
    queue<Owned<Entry>> entries;
    ```



src/slave/flags.hpp
<https://reviews.apache.org/r/29688/#comment112481>

    Can we clarify in here that this flag is for the "posix/disk" isolator?
    
    We probably should have made this explicit for the "network/port_mapping" 
flags as well, but let's start now with this one :)



src/slave/flags.hpp
<https://reviews.apache.org/r/29688/#comment112486>

    Might also want to clarify that this is for containers, not the disk (one 
might get confused between this and `--disk_watch_interval`).
    
    "Minimal" seems a bit confusing, should we say:
    
    "the interval between disk quota checks" (notice it's unlike the "Interval 
between the __start__ of perf stat samples" above).


- Ben Mahler


On Jan. 14, 2015, 8:28 p.m., Jie Yu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29688/
> -----------------------------------------------------------
> 
> (Updated Jan. 14, 2015, 8:28 p.m.)
> 
> 
> Review request for mesos, Benjamin Hindman, Ben Mahler, and Ian Downes.
> 
> 
> Bugs: MESOS-1588
>     https://issues.apache.org/jira/browse/MESOS-1588
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added DiskQuotaIsolator to enforce disk quota. I created a DiskUsageChecker 
> to check disk usage by calling 'du'. The DiskUsageChecker is throttled (see 
> comments). The isolator uses DiskUsageChecker to enforce disk quota.
> 
> 
> Diffs
> -----
> 
>   src/Makefile.am fc0c3227466ccf364353a739fec8d9532ea3c683 
>   src/slave/containerizer/isolators/posix/disk.hpp PRE-CREATION 
>   src/slave/containerizer/isolators/posix/disk.cpp PRE-CREATION 
>   src/slave/containerizer/mesos/containerizer.cpp 
> 0bcf5ce7cfab470cabd3af3535344d19cb33b1c8 
>   src/slave/flags.hpp f1b8dfbb7391167b67a9498561742aa9ab9089a6 
> 
> Diff: https://reviews.apache.org/r/29688/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Jie Yu
> 
>

Reply via email to