> On July 18, 2018, 1:51 p.m., James Peach wrote:
> > src/slave/containerizer/mesos/isolators/xfs/disk.cpp
> > Line 495 (original), 504 (patched)
> > <https://reviews.apache.org/r/67914/diff/2/?file=2059407#file2059407line504>
> >
> >     I'm a little concerned that we could exhaust the project IDs (we 
> > default to 5000 IDs). What do you think about adding a metric for the 
> > number of available project IDs?

Good idea. Added `containerizer/mesos/disk/project_ids_free` metric.


> On July 18, 2018, 1:51 p.m., James Peach wrote:
> > src/slave/containerizer/mesos/isolators/xfs/disk.cpp
> > Lines 547 (patched)
> > <https://reviews.apache.org/r/67914/diff/2/?file=2059407#file2059407line553>
> >
> >     Let's make this message symmetric with the one we emit when we assign:
> >     
> >     ```
> >     LOG(INFO) << "Reclaimed project " << stringify(projectId.get()) << " 
> > from '"
> >               << containerConfig.directory() << "'";
> >     ```

Fixed.


> On July 18, 2018, 1:51 p.m., James Peach wrote:
> > src/slave/containerizer/mesos/isolators/xfs/disk.cpp
> > Lines 549 (patched)
> > <https://reviews.apache.org/r/67914/diff/2/?file=2059407#file2059407line555>
> >
> >     If possible, I think this is simple enough to inline into the check loop

Definitely possible, but I would prefer to keep that logical separation unless 
you feel strongly about that :) `loop()` schedules the periodic check and 
`checkProjectIdUsage()` actually performs that check.


> On July 18, 2018, 1:51 p.m., James Peach wrote:
> > src/slave/flags.cpp
> > Lines 1338 (patched)
> > <https://reviews.apache.org/r/67914/diff/2/?file=2059409#file2059409line1338>
> >
> >     Rather than introducing a new flag, lets use 
> > `container_disk_watch_interval` or `disk_watch_interval` ... probably the 
> > former?

I think if we do this then we should rather use the latter because disk GC is 
driven by `disk_watch_interval` and `gc_delay` (maybe use which one of those 2 
is the smallest?). `container_disk_watch_interval` will most likely be set to a 
smaller value to promptly terminate containers that have breached their quota 
(that's true in our cluster). Running sandboxes check at that frequently would 
be wasteful.


> On July 18, 2018, 1:51 p.m., James Peach wrote:
> > src/tests/containerizer/xfs_quota_tests.cpp
> > Lines 1197 (patched)
> > <https://reviews.apache.org/r/67914/diff/2/?file=2059410#file2059410line1197>
> >
> >     For consistency with other tests, could we call this 
> > `ProjectIdReclaiming`?
> >     
> >     Can you please add a short comment explaining the purpose of the test?

Done.


> On July 18, 2018, 1:51 p.m., James Peach wrote:
> > src/tests/containerizer/xfs_quota_tests.cpp
> > Lines 1319 (patched)
> > <https://reviews.apache.org/r/67914/diff/2/?file=2059410#file2059410line1319>
> >
> >     Is this re-using the futures an OK thing to do? If they are already 
> > ready from the previous task launch, won't they still be ready when we 
> > await them here?

Yes, it seems to be safe. The `Future` is reset in `FutureArg()` during 
expectations setup: 
https://github.com/apache/mesos/blob/9147283171d761a4d38710f24ba654f8a96e325c/3rdparty/libprocess/include/process/gmock.hpp#L82


> On July 18, 2018, 1:51 p.m., James Peach wrote:
> > src/tests/containerizer/xfs_quota_tests.cpp
> > Lines 1330 (patched)
> > <https://reviews.apache.org/r/67914/diff/2/?file=2059410#file2059410line1330>
> >
> >     This can be `EXPECT_EQ`.

Fixed.


> On July 18, 2018, 1:51 p.m., James Peach wrote:
> > src/tests/containerizer/xfs_quota_tests.cpp
> > Lines 1332 (patched)
> > <https://reviews.apache.org/r/67914/diff/2/?file=2059410#file2059410line1332>
> >
> >     Is the link the `latest` link? Can you add an explanatory comment?

Done.


> On July 18, 2018, 1:51 p.m., James Peach wrote:
> > src/tests/containerizer/xfs_quota_tests.cpp
> > Lines 1333 (patched)
> > <https://reviews.apache.org/r/67914/diff/2/?file=2059410#file2059410line1333>
> >
> >     This can me `EXPECT_SOME_EQ`.

Fixed.


- Ilya


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67914/#review206203
-----------------------------------------------------------


On July 19, 2018, 4:12 p.m., Ilya Pronin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/67914/
> -----------------------------------------------------------
> 
> (Updated July 19, 2018, 4:12 p.m.)
> 
> 
> Review request for mesos and James Peach.
> 
> 
> Bugs: MESOS-9007
>     https://issues.apache.org/jira/browse/MESOS-9007
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Currently upon container destruction its project ID is unallocated by
> the isolator and removed from the container work directory. However due
> to API limitations we can't unset project IDs on symlinks that may exist
> inside the directory. Because of that the project may still exist until
> the container directory is garbage collected. If the project ID is
> reused for a new container, any lingering symlinks that still have that
> project ID will contribute to disk usage of the new container. Typically
> symlinks don't take much space, but still this leads to inaccuracy in
> disk space usage accounting.
> 
> This patch postpones project ID reclaiming until sandbox GC time. The
> isolator periodically checks if sandboxes of terminated containers still
> exist and deallocates project IDs of the ones that were removed. This
> mechanism can be improved in the future if we introduce a way for the
> isolators to learn about disk GCs.
> 
> Current number of available project IDs can be tracked with the new
> "containerizer/mesos/disk/project_ids_free" metric.
> 
> 
> Diffs
> -----
> 
>   src/slave/containerizer/mesos/isolators/xfs/disk.hpp 
> 0891f7709aa4f98758a727856d58e6177d46adca 
>   src/slave/containerizer/mesos/isolators/xfs/disk.cpp 
> 25f52a43b34b141bdaf7c448817423cf4264e22a 
>   src/slave/flags.hpp eeb9708f9ec76d83b6719541f4a012544c7c0cbe 
>   src/slave/flags.cpp 58cdc0f1100fe244e5bf1036e1ccf39478d5d478 
>   src/tests/containerizer/xfs_quota_tests.cpp 
> dc18a8a59d1eb7fae3592ef6ba8c046e4f46ee4a 
> 
> 
> Diff: https://reviews.apache.org/r/67914/diff/3/
> 
> 
> Testing
> -------
> 
> Added `ROOT_XFS_QuotaTest.ProjectIDReclaiming` test that verifies that 
> project ID is reclaimed and reused after sandbox GC. Ran `sudo make check`.
> 
> 
> Thanks,
> 
> Ilya Pronin
> 
>

Reply via email to