Re: Review Request 70757: Added a NNP isolator.

2019-07-18 Thread James Peach

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70757/#review216755
---


Fix it, then Ship it!




Just some minor formatting style nits.


src/slave/containerizer/mesos/isolators/linux/nnp.hpp
Lines 21 (patched)


These should be the other way round and newline separated:

```
#include "slave/flags.hpp"

#include "slave/containerizer/mesos/isolator.hpp"

```



src/slave/containerizer/mesos/isolators/linux/nnp.cpp
Lines 18 (patched)


You don't need `flags.hpp` again because it is in the isolator header.



src/slave/containerizer/mesos/isolators/linux/nnp.cpp
Lines 22 (patched)


This should come before "common/kernel_version.hpp"


https://github.com/apache/mesos/blob/master/docs/c%2B%2B-style-guide.md#order-of-includes



src/slave/containerizer/mesos/isolators/linux/nnp.cpp
Lines 48 (patched)


Nit: Newline here (I can add it when I commit).



src/tests/containerizer/linux_nnp_isolator_tests.cpp
Lines 19 (patched)


Need to fix order of includes (see c++ style guide).



src/tests/containerizer/linux_nnp_isolator_tests.cpp
Lines 57 (patched)


We can clean up the includes and using statements:
```
$ git diff
diff --git a/src/tests/containerizer/linux_nnp_isolator_tests.cpp 
b/src/tests/containerizer/linux_nnp_isolator_tests.cpp
index 3afd6c08b..8045acfec 100644
--- a/src/tests/containerizer/linux_nnp_isolator_tests.cpp
+++ b/src/tests/containerizer/linux_nnp_isolator_tests.cpp
@@ -14,31 +14,26 @@
 // See the License for the specific language governing permissions and
 // limitations under the License.

-#include 
+#include 
+#include 

-#include "common/kernel_version.hpp"
+#include 

-#include 
 #include 
+#include 

-#include 
-#include 
-#include 
-#include 
 #include 

+#include "common/kernel_version.hpp"
+
 #include "slave/containerizer/mesos/containerizer.hpp"
-#include "slave/containerizer/mesos/isolators/linux/nnp.hpp"

-#include "tests/cluster.hpp"
-#include "tests/environment.hpp"
 #include "tests/mesos.hpp"

 #include "tests/containerizer/docker_archive.hpp"

 using process::Future;
 using process::Owned;
-using process::PID;

 using std::map;
 using std::string;
@@ -47,14 +42,9 @@ using mesos::internal::master::Master;

 using mesos::internal::slave::Containerizer;
 using mesos::internal::slave::Fetcher;
-using mesos::internal::slave::LinuxNNPIsolatorProcess;
 using mesos::internal::slave::MesosContainerizer;
-using mesos::internal::slave::Slave;
-
-using mesos::master::detector::MasterDetector;

 using mesos::slave::ContainerTermination;
-using mesos::slave::Isolator;

 namespace mesos {
 namespace internal {
```


- James Peach


On July 18, 2019, 8:36 p.m., Jacob Janco wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70757/
> ---
> 
> (Updated July 18, 2019, 8:36 p.m.)
> 
> 
> Review request for mesos, Andrei Budnik, Gilbert Song, Jie Yu, and James 
> Peach.
> 
> 
> Bugs: MESOS-9770
> https://issues.apache.org/jira/browse/MESOS-9770
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch adds the isolation capability of flipping the
> NO_NEW_PRIVILEGES bit for process control.
> 
> 
> Diffs
> -
> 
>   include/mesos/slave/containerizer.proto 
> 2d04f3c182a4274dda527a3da56c894c3c892a12 
>   src/CMakeLists.txt eb4e2ac699121ac16e2ac58e4606b26aae906ad8 
>   src/Makefile.am 761dde1d63e0f4f1ac4ab86f129f84f3746d3153 
>   src/slave/containerizer/mesos/containerizer.cpp 
> c9a369bcdbfc1676912430c3e84fa567bbd7f766 
>   src/slave/containerizer/mesos/isolators/linux/nnp.hpp PRE-CREATION 
>   src/slave/containerizer/mesos/isolators/linux/nnp.cpp PRE-CREATION 
>   src/slave/containerizer/mesos/launch.cpp 
> 0419e8e0fbf154396ab5fb5026d77b94cc9fca5b 
>   src/tests/CMakeLists.txt 5eb9c65f30d9d6ade289f9d47b18d73908b1f1db 
>   src/tests/containerizer/linux_nnp_isolator_tests.cpp PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/70757/diff/15/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Jacob Janco
> 
>



Re: Review Request 71106: Moved kernelVersion check to common code.

2019-07-18 Thread James Peach

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71106/#review216754
---




src/Makefile.am
Lines 1073 (patched)


And add to `src/CMakeLists.txt` as well (the later review reminded me).


- James Peach


On July 18, 2019, 8:36 p.m., Jacob Janco wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71106/
> ---
> 
> (Updated July 18, 2019, 8:36 p.m.)
> 
> 
> Review request for mesos and James Peach.
> 
> 
> Bugs: MESOS-9770
> https://issues.apache.org/jira/browse/MESOS-9770
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Moved kernelVersion check to common code.
> 
> 
> Diffs
> -
> 
>   src/Makefile.am 761dde1d63e0f4f1ac4ab86f129f84f3746d3153 
>   src/common/kernel_version.hpp PRE-CREATION 
>   src/common/kernel_version.cpp PRE-CREATION 
>   src/linux/ns.cpp 2440bb2232dfd9cf57a6cc36aeff874c96297c4b 
> 
> 
> Diff: https://reviews.apache.org/r/71106/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Jacob Janco
> 
>



Re: Review Request 71080: Master should store the list of completed framework ids for lifecycle.

2019-07-18 Thread Xudong Ni via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71080/
---

(Updated July 19, 2019, 3:10 a.m.)


Review request for mesos, James Peach and Jiang Yan Xu.


Bugs: MESOS-8636
https://issues.apache.org/jira/browse/MESOS-8636


Repository: mesos


Description
---

It should be separately from that for webUI and endpoints.
Currently the master stores the history of completed frameworks in
a map with the full historical data of the framework, it is
prohibitively expensive to keep a long history; In order to reject
frameworks from reregistering if they have previously marked as
completed, we only need to persist the framework ids and are able
to keep long history.


Diffs (updated)
-

  docs/configuration/master.md c56ac8510ea968f9587e23e81ed310caa968ee9e 
  docs/operator-http-api.md 2d4a9b66e20cf19eceec718b7de3d812ab285772 
  include/mesos/allocator/allocator.hpp 
2bab53ab5fb25931a724c20a039e1301983ba574 
  src/master/constants.hpp 26afa356b7844b4ec6c2caeef33bd39c51148d5f 
  src/master/flags.hpp 78623d68bf428cd3f52684303d98a525d42eb878 
  src/master/flags.cpp 74f4daadd48e8e691be43759b88dc8b3c2df489a 
  src/master/http.cpp 765d5052ab6a8d731113f8922d20fb280b843003 
  src/master/master.hpp ffa7423ba533725f7c1123d9aa507b1348e7f281 
  src/master/master.cpp f1ca637b4cb0382caec53b5a81f6a4eb46f4dd2d 


Diff: https://reviews.apache.org/r/71080/diff/5/

Changes: https://reviews.apache.org/r/71080/diff/4-5/


Testing
---

make check

sudo GLOG_v=1 ./bin/mesos-tests.sh --verbose 
--gtest_filter=MasterTest.MaxCompletedFrameworksFlag --gtest_break_on_failure 
--gtest_repeat=1000
[   OK ] MasterTest.MaxCompletedFrameworksFlag (230 ms)
[--] 1 test from MasterTest (235 ms total)

[--] Global test environment tear-down
[==] 1 test from 1 test case ran. (281 ms total)
[  PASSED  ] 1 test.


Thanks,

Xudong Ni



Re: Review Request 71107: Added docs for the NNP isolator.

2019-07-18 Thread James Peach

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71107/#review216753
---



Please update `docs/upgrades.md` as well.


CHANGELOG
Lines 22 (patched)


Suggest:
```
to support enabling
```



docs/isolators/linux-nnp.md
Lines 2 (patched)


Space before the '('?



docs/isolators/linux-nnp.md
Lines 14 (patched)


I didn't find this paragraph very clear. Suggest:

```
The `no_new_privs` bit disables the ability of container tasks to acquire 
any additional privileges by means of executing a child process (e.g. by 
executing a setuid program).
```



docs/isolators/linux-nnp.md
Lines 18 (patched)


This little paragraph seems lonely by itself. We can just add a sentence to 
the preceeding paragraph:

```
To enable the `linux/nnp` isolator, append `linux/nnp` to the 
[`--isolation`](../configuration/agent.md#isolation) flag when starting the 
Mesos agent.
```


- James Peach


On July 18, 2019, 8:13 p.m., Jacob Janco wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71107/
> ---
> 
> (Updated July 18, 2019, 8:13 p.m.)
> 
> 
> Review request for mesos and James Peach.
> 
> 
> Bugs: MESOS-9770
> https://issues.apache.org/jira/browse/MESOS-9770
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added docs for the NNP isolator.
> 
> 
> Diffs
> -
> 
>   CHANGELOG 164465a71c660ab9f01fb18d43076afc4b892ad5 
>   docs/isolators/linux-nnp.md PRE-CREATION 
>   docs/mesos-containerizer.md e79976111ec8e9cc8e8d44b5f1b8d6e2c7e072d6 
> 
> 
> Diff: https://reviews.apache.org/r/71107/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Jacob Janco
> 
>



Re: Review Request 71106: Moved kernelVersion check to common code.

2019-07-18 Thread James Peach

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71106/#review216752
---


Ship it!




Ship It!

- James Peach


On July 18, 2019, 8:36 p.m., Jacob Janco wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71106/
> ---
> 
> (Updated July 18, 2019, 8:36 p.m.)
> 
> 
> Review request for mesos and James Peach.
> 
> 
> Bugs: MESOS-9770
> https://issues.apache.org/jira/browse/MESOS-9770
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Moved kernelVersion check to common code.
> 
> 
> Diffs
> -
> 
>   src/Makefile.am 761dde1d63e0f4f1ac4ab86f129f84f3746d3153 
>   src/common/kernel_version.hpp PRE-CREATION 
>   src/common/kernel_version.cpp PRE-CREATION 
>   src/linux/ns.cpp 2440bb2232dfd9cf57a6cc36aeff874c96297c4b 
> 
> 
> Diff: https://reviews.apache.org/r/71106/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Jacob Janco
> 
>



Re: Review Request 71111: Added offer rescind logic for guarantees enforcement.

2019-07-18 Thread Meng Zhu


> On July 18, 2019, 10:32 a.m., Benjamin Mahler wrote:
> > src/master/quota_handler.cpp
> > Lines 651 (patched)
> > 
> >
> > Ditto last review, but in this case the presence of reservations in 
> > both leads to an under-rescind, which seems less acceptable vs an 
> > over-rescind?

Given that we are being very pessimistic here (assuming no available 
resources), we are more likely to over rescind than under. Hopefully, it is OK. 
See my comment in previous patch as well.


> On July 18, 2019, 10:32 a.m., Benjamin Mahler wrote:
> > src/tests/master_quota_tests.cpp
> > Lines 1691-1693 (patched)
> > 
> >
> > Ditto, probably want to pull out the test to ease reviewing?

This is a small patch. Would prefer to keep the test with it.


- Meng


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7/#review216737
---


On July 17, 2019, 9:30 p.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/7/
> ---
> 
> (Updated July 17, 2019, 9:30 p.m.)
> 
> 
> Review request for mesos, Andrei Sekretenko and Benjamin Mahler.
> 
> 
> Bugs: MESOS-9812
> https://issues.apache.org/jira/browse/MESOS-9812
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Outstanding offers need to be rescinded as needed to ensure
> roles' requested guarantees can be satisfied.
> 
> For simplicity, and also due to the race between the master and
> the allocator, we pessimistically assume that what seems like
> "available" resources in the allocator are all gone. We greedily
> rescind offers until we can satisfy the guarantees.
> 
> Also added a test.
> 
> 
> Diffs
> -
> 
>   src/master/quota_handler.cpp 2a3ca5640c850ec8a8aca7abed45386b5b5e1125 
>   src/tests/master_quota_tests.cpp 64a9b31a38af23fb16a4afd04f6d3413c7d1ffe9 
> 
> 
> Diff: https://reviews.apache.org/r/7/diff/2/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 71111: Added offer rescind logic for guarantees enforcement.

2019-07-18 Thread Meng Zhu


> On July 18, 2019, 3:56 a.m., Andrei Sekretenko wrote:
> > src/master/quota_handler.cpp
> > Lines 664 (patched)
> > 
> >
> > Does this mean that we might rescind an offer even if `consumption + 
> > offered` for the offer's role is within guarantees of that role? 
> > 
> > If yes - do we really have to be THAT greedy?...
> 
> Andrei Sekretenko wrote:
> Edit: s/is within guarantees/is going to be within guarantees/

Yes, it is inaccurate. But calculating a more correct amount would be 
expensive. And I do not feel it is necessary. See more details in my message in 
the public slack: https://mesos.slack.com/archives/C1NEKP60K/p1563498805199200


- Meng


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7/#review216717
---


On July 17, 2019, 9:30 p.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/7/
> ---
> 
> (Updated July 17, 2019, 9:30 p.m.)
> 
> 
> Review request for mesos, Andrei Sekretenko and Benjamin Mahler.
> 
> 
> Bugs: MESOS-9812
> https://issues.apache.org/jira/browse/MESOS-9812
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Outstanding offers need to be rescinded as needed to ensure
> roles' requested guarantees can be satisfied.
> 
> For simplicity, and also due to the race between the master and
> the allocator, we pessimistically assume that what seems like
> "available" resources in the allocator are all gone. We greedily
> rescind offers until we can satisfy the guarantees.
> 
> Also added a test.
> 
> 
> Diffs
> -
> 
>   src/master/quota_handler.cpp 2a3ca5640c850ec8a8aca7abed45386b5b5e1125 
>   src/tests/master_quota_tests.cpp 64a9b31a38af23fb16a4afd04f6d3413c7d1ffe9 
> 
> 
> Diff: https://reviews.apache.org/r/7/diff/2/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 71068: Added offer rescind logic for limits enforcement.

2019-07-18 Thread Meng Zhu


> On July 18, 2019, 10:28 a.m., Benjamin Mahler wrote:
> > Test looks good if you want to pull it out and get a ship it on it 
> > separately, that would be easier.

It's a small patch. I prefer keeping the code with the test. I found it would 
make reading the commit easier in retrospect.


> On July 18, 2019, 10:28 a.m., Benjamin Mahler wrote:
> > src/master/quota_handler.cpp
> > Lines 602-609 (patched)
> > 
> >
> > This actually seems somewhat unacceptable, I wonder if we can do 
> > something with what we have today, e.g.:
> > 
> > ```
> > Rescind logic
> > 
> > allocatorUpdatedFuture
> >   .then(defer(master, []() { Rescind logic })
> > ```
> > 
> > That is, we first rescind enough offers from those that are known to 
> > the master, then, after the allocator is done updating the quota (which 
> > means it won't enqueue more offers violating the limits), we do another 
> > round of rescinding. With this logic, it's still possible though for a 
> > framework to have accepted an offer that arrived after the initial rescind 
> > but before the allocator finished updating the quota.. so maybe it's not 
> > worth it.
> > 
> > Do we have a ticket that captures this issue? Would be good to link all 
> > the problems that are caused by the current offer ownership model.

Yeah, I prefer to keep the rescind logic simple. See my public slack comments 
for more details: https://mesos.slack.com/archives/C1NEKP60K/p1563498805199200

Re ticket, yes, https://issues.apache.org/jira/browse/MESOS-8639


- Meng


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71068/#review216736
---


On July 18, 2019, 10:23 a.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71068/
> ---
> 
> (Updated July 18, 2019, 10:23 a.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Bugs: MESOS-9812
> https://issues.apache.org/jira/browse/MESOS-9812
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Outstanding offers of a role (including offers allocated to its
> subroles) are rescinded until the sum of its consumed quota and
> outstanding offered resources are below the requested limits.
> 
> The rescind logic here, however, overlooks the race between master
> and the allocator. Namely, there might be pending offers in the master
> mailbox that have yet to be processed. Here, for simplicity, we ignore
> this race. To properly handle this, the plan is to move offer
> management logic into the allocator and rescind offers there.
> 
> Also added a test.
> 
> 
> Diffs
> -
> 
>   src/master/quota_handler.cpp 2a3ca5640c850ec8a8aca7abed45386b5b5e1125 
>   src/tests/master_quota_tests.cpp 64a9b31a38af23fb16a4afd04f6d3413c7d1ffe9 
> 
> 
> Diff: https://reviews.apache.org/r/71068/diff/3/
> 
> 
> Testing
> ---
> 
> Ran continously without failure.
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 71080: Master should store the list of completed framework ids for lifecycle.

2019-07-18 Thread James Peach

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71080/#review216747
---




docs/configuration/master.md
Lines 449 (patched)


Let's rephrase this to improve clarity. I suggest:

```
Maximum number of IDs stored for completed framework detection. This
value can be set much higher than `--max_completed_frameworks`, and should 
be tuned higher for clusters that have a high rate of framework churn. 
(default: 100)
```

If it can be larger, then we can default it to a lerger value, let's do 100.


- James Peach


On July 19, 2019, 12:09 a.m., Xudong Ni wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71080/
> ---
> 
> (Updated July 19, 2019, 12:09 a.m.)
> 
> 
> Review request for mesos, James Peach and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-8636
> https://issues.apache.org/jira/browse/MESOS-8636
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> It should be separately from that for webUI and endpoints.
> Currently the master stores the history of completed frameworks in
> a map with the full historical data of the framework, it is
> prohibitively expensive to keep a long history; In order to reject
> frameworks from reregistering if they have previously marked as
> completed, we only need to persist the framework ids and are able
> to keep long history.
> 
> 
> Diffs
> -
> 
>   docs/configuration/master.md c56ac8510ea968f9587e23e81ed310caa968ee9e 
>   docs/operator-http-api.md 2d4a9b66e20cf19eceec718b7de3d812ab285772 
>   include/mesos/allocator/allocator.hpp 
> 2bab53ab5fb25931a724c20a039e1301983ba574 
>   src/master/flags.hpp 78623d68bf428cd3f52684303d98a525d42eb878 
>   src/master/flags.cpp 74f4daadd48e8e691be43759b88dc8b3c2df489a 
>   src/master/http.cpp 765d5052ab6a8d731113f8922d20fb280b843003 
>   src/master/master.hpp ffa7423ba533725f7c1123d9aa507b1348e7f281 
>   src/master/master.cpp f1ca637b4cb0382caec53b5a81f6a4eb46f4dd2d 
> 
> 
> Diff: https://reviews.apache.org/r/71080/diff/4/
> 
> 
> Testing
> ---
> 
> make check
> 
> sudo GLOG_v=1 ./bin/mesos-tests.sh --verbose 
> --gtest_filter=MasterTest.MaxCompletedFrameworksFlag --gtest_break_on_failure 
> --gtest_repeat=1000
> [   OK ] MasterTest.MaxCompletedFrameworksFlag (230 ms)
> [--] 1 test from MasterTest (235 ms total)
> 
> [--] Global test environment tear-down
> [==] 1 test from 1 test case ran. (281 ms total)
> [  PASSED  ] 1 test.
> 
> 
> Thanks,
> 
> Xudong Ni
> 
>



Re: Review Request 71080: Master should store the list of completed framework ids for lifecycle.

2019-07-18 Thread Xudong Ni via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71080/
---

(Updated July 19, 2019, 12:09 a.m.)


Review request for mesos, James Peach and Jiang Yan Xu.


Bugs: MESOS-8636
https://issues.apache.org/jira/browse/MESOS-8636


Repository: mesos


Description
---

It should be separately from that for webUI and endpoints.
Currently the master stores the history of completed frameworks in
a map with the full historical data of the framework, it is
prohibitively expensive to keep a long history; In order to reject
frameworks from reregistering if they have previously marked as
completed, we only need to persist the framework ids and are able
to keep long history.


Diffs (updated)
-

  docs/configuration/master.md c56ac8510ea968f9587e23e81ed310caa968ee9e 
  docs/operator-http-api.md 2d4a9b66e20cf19eceec718b7de3d812ab285772 
  include/mesos/allocator/allocator.hpp 
2bab53ab5fb25931a724c20a039e1301983ba574 
  src/master/flags.hpp 78623d68bf428cd3f52684303d98a525d42eb878 
  src/master/flags.cpp 74f4daadd48e8e691be43759b88dc8b3c2df489a 
  src/master/http.cpp 765d5052ab6a8d731113f8922d20fb280b843003 
  src/master/master.hpp ffa7423ba533725f7c1123d9aa507b1348e7f281 
  src/master/master.cpp f1ca637b4cb0382caec53b5a81f6a4eb46f4dd2d 


Diff: https://reviews.apache.org/r/71080/diff/4/

Changes: https://reviews.apache.org/r/71080/diff/3-4/


Testing
---

make check

sudo GLOG_v=1 ./bin/mesos-tests.sh --verbose 
--gtest_filter=MasterTest.MaxCompletedFrameworksFlag --gtest_break_on_failure 
--gtest_repeat=1000
[   OK ] MasterTest.MaxCompletedFrameworksFlag (230 ms)
[--] 1 test from MasterTest (235 ms total)

[--] Global test environment tear-down
[==] 1 test from 1 test case ran. (281 ms total)
[  PASSED  ] 1 test.


Thanks,

Xudong Ni



Re: Review Request 71080: Master should store the list of completed framework ids for lifecycle.

2019-07-18 Thread Gastón Kleiman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71080/#review216746
---


Fix it, then Ship it!




Gave it a ship it. HoweverI'm not touching the master that much these days, so 
it would be nice to get a review from Greg/Benno/Joseph =).


docs/configuration/master.md
Lines 447 (patched)


Nit: s/ids/IDs/



src/master/flags.cpp
Lines 599-601 (patched)


Nit: s/id/ID/g



src/master/master.hpp
Lines 2215 (patched)


We only care about the key (framework ID), so using a bounded hashmap feels 
a bit wasteful. However stout doesn't provide other bounded data structures and 
I don't think that we should add a new one just for this.



src/master/master.cpp
Lines 11402 (patched)


Nit: s/ids/IDs/


- Gastón Kleiman


On July 17, 2019, 11:58 a.m., Xudong Ni wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71080/
> ---
> 
> (Updated July 17, 2019, 11:58 a.m.)
> 
> 
> Review request for mesos, James Peach and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-8636
> https://issues.apache.org/jira/browse/MESOS-8636
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> It should be separately from that for webUI and endpoints.
> Currently the master stores the history of completed frameworks in
> a map with the full historical data of the framework, it is
> prohibitively expensive to keep a long history; In order to reject
> frameworks from reregistering if they have previously marked as
> completed, we only need to persist the framework ids and are able
> to keep long history.
> 
> 
> Diffs
> -
> 
>   docs/configuration/master.md c56ac8510ea968f9587e23e81ed310caa968ee9e 
>   docs/operator-http-api.md 2d4a9b66e20cf19eceec718b7de3d812ab285772 
>   include/mesos/allocator/allocator.hpp 
> 2bab53ab5fb25931a724c20a039e1301983ba574 
>   src/master/flags.hpp 78623d68bf428cd3f52684303d98a525d42eb878 
>   src/master/flags.cpp 74f4daadd48e8e691be43759b88dc8b3c2df489a 
>   src/master/http.cpp 765d5052ab6a8d731113f8922d20fb280b843003 
>   src/master/master.hpp ffa7423ba533725f7c1123d9aa507b1348e7f281 
>   src/master/master.cpp f1ca637b4cb0382caec53b5a81f6a4eb46f4dd2d 
> 
> 
> Diff: https://reviews.apache.org/r/71080/diff/3/
> 
> 
> Testing
> ---
> 
> make check
> 
> sudo GLOG_v=1 ./bin/mesos-tests.sh --verbose 
> --gtest_filter=MasterTest.MaxCompletedFrameworksFlag --gtest_break_on_failure 
> --gtest_repeat=1000
> [   OK ] MasterTest.MaxCompletedFrameworksFlag (230 ms)
> [--] 1 test from MasterTest (235 ms total)
> 
> [--] Global test environment tear-down
> [==] 1 test from 1 test case ran. (281 ms total)
> [  PASSED  ] 1 test.
> 
> 
> Thanks,
> 
> Xudong Ni
> 
>



Re: Review Request 71111: Added offer rescind logic for guarantees enforcement.

2019-07-18 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7/#review216745
---



Patch looks great!

Reviews applied: [71109, 71110, 71061, 71068, 7]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose --disable-libtool-wrappers 
--disable-parallel-test-execution' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker-build.sh

- Mesos Reviewbot


On July 18, 2019, 4:30 a.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/7/
> ---
> 
> (Updated July 18, 2019, 4:30 a.m.)
> 
> 
> Review request for mesos, Andrei Sekretenko and Benjamin Mahler.
> 
> 
> Bugs: MESOS-9812
> https://issues.apache.org/jira/browse/MESOS-9812
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Outstanding offers need to be rescinded as needed to ensure
> roles' requested guarantees can be satisfied.
> 
> For simplicity, and also due to the race between the master and
> the allocator, we pessimistically assume that what seems like
> "available" resources in the allocator are all gone. We greedily
> rescind offers until we can satisfy the guarantees.
> 
> Also added a test.
> 
> 
> Diffs
> -
> 
>   src/master/quota_handler.cpp 2a3ca5640c850ec8a8aca7abed45386b5b5e1125 
>   src/tests/master_quota_tests.cpp 64a9b31a38af23fb16a4afd04f6d3413c7d1ffe9 
> 
> 
> Diff: https://reviews.apache.org/r/7/diff/2/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 70757: Added a NNP isolator.

2019-07-18 Thread Jacob Janco

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70757/
---

(Updated July 18, 2019, 8:36 p.m.)


Review request for mesos, Andrei Budnik, Gilbert Song, Jie Yu, and James Peach.


Bugs: MESOS-9770
https://issues.apache.org/jira/browse/MESOS-9770


Repository: mesos


Description
---

This patch adds the isolation capability of flipping the
NO_NEW_PRIVILEGES bit for process control.


Diffs (updated)
-

  include/mesos/slave/containerizer.proto 
2d04f3c182a4274dda527a3da56c894c3c892a12 
  src/CMakeLists.txt eb4e2ac699121ac16e2ac58e4606b26aae906ad8 
  src/Makefile.am 761dde1d63e0f4f1ac4ab86f129f84f3746d3153 
  src/slave/containerizer/mesos/containerizer.cpp 
c9a369bcdbfc1676912430c3e84fa567bbd7f766 
  src/slave/containerizer/mesos/isolators/linux/nnp.hpp PRE-CREATION 
  src/slave/containerizer/mesos/isolators/linux/nnp.cpp PRE-CREATION 
  src/slave/containerizer/mesos/launch.cpp 
0419e8e0fbf154396ab5fb5026d77b94cc9fca5b 
  src/tests/CMakeLists.txt 5eb9c65f30d9d6ade289f9d47b18d73908b1f1db 
  src/tests/containerizer/linux_nnp_isolator_tests.cpp PRE-CREATION 


Diff: https://reviews.apache.org/r/70757/diff/15/

Changes: https://reviews.apache.org/r/70757/diff/14-15/


Testing
---


Thanks,

Jacob Janco



Re: Review Request 71106: Moved kernelVersion check to common code.

2019-07-18 Thread Jacob Janco

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71106/
---

(Updated July 18, 2019, 8:15 p.m.)


Review request for mesos.


Repository: mesos


Description
---

Moved kernelVersion check to common code.


Diffs (updated)
-

  src/Makefile.am 761dde1d63e0f4f1ac4ab86f129f84f3746d3153 
  src/common/kernel_version.hpp PRE-CREATION 
  src/common/kernel_version.cpp PRE-CREATION 
  src/linux/ns.cpp 2440bb2232dfd9cf57a6cc36aeff874c96297c4b 


Diff: https://reviews.apache.org/r/71106/diff/2/

Changes: https://reviews.apache.org/r/71106/diff/1-2/


Testing
---


Thanks,

Jacob Janco



Re: Review Request 71106: Moved kernelVersion check to common code.

2019-07-18 Thread Jacob Janco


> On July 18, 2019, 7:37 a.m., James Peach wrote:
> > src/common/kernel_version.hpp
> > Lines 17 (patched)
> > 
> >
> > This should be `__KERNEL_VERSION_HPP__` to match the file name.

Ah missed that fixed.


> On July 18, 2019, 7:37 a.m., James Peach wrote:
> > src/common/kernel_version.hpp
> > Lines 24 (patched)
> > 
> >
> > Once you move the function body, you can remove the `using` statements 
> > and the only remaingin includes should be `try.hpp` and `stout/version.hpp`.

Fixed.


> On July 18, 2019, 7:37 a.m., James Peach wrote:
> > src/common/kernel_version.hpp
> > Lines 31 (patched)
> > 
> >
> > Can you please move this into `kernel_version.cpp`?

Sure


- Jacob


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71106/#review216712
---


On July 17, 2019, 8:38 p.m., Jacob Janco wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71106/
> ---
> 
> (Updated July 17, 2019, 8:38 p.m.)
> 
> 
> Review request for mesos.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Moved kernelVersion check to common code.
> 
> 
> Diffs
> -
> 
>   src/common/kernel_version.hpp PRE-CREATION 
>   src/linux/ns.cpp 2440bb2232dfd9cf57a6cc36aeff874c96297c4b 
> 
> 
> Diff: https://reviews.apache.org/r/71106/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Jacob Janco
> 
>



Re: Review Request 70728: Backed `MockResourceProvider` by a process.

2019-07-18 Thread Chun-Hung Hsiao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70728/#review216741
---




src/tests/master_tests.cpp
Line 9017 (original), 9017 (patched)


There are two data races here:
1. `updateSlaveMessage` being ready doesn't mean that the `SUBSCRIBED` has 
been received by the resource provider process.
2. This line and the `subscribed` function accesses `info` in two differen 
threads without a proper synchronization.

I'd suggest that we have a `Future` or 
`Future` or `Future` in `MockResourceProvider` 
that is set up in the `subscribed` callback, and we wait for the future here, 
then avoid accessing `process->info` directly in the remaining of this test. If 
you don't want a copy, a reference that is set up after the wait is also okay 
since it would be clear that we have validated the reference. Maybe something 
like
```
AWAIT_READY(resourceProvider->info());
const ResourceProviderID& resourceProviderId = 
resourceProvider->info().id();
```



src/tests/master_tests.cpp
Line 9246 (original), 9247 (patched)


Let's not run the actor's member functions in a different context:
```
dispatch(resourceProvider.process, 
::operationDefault, operation.get());
```



src/tests/master_tests.cpp
Line 9376 (original), 9377 (patched)


We need to have a proper synchronization here to avoid data races. See my 
comments in `UpdateSlaveMessageWithPendingOffers`.

Also, let's avoid accessing `process->info` directly in the remaining of 
this test.



src/tests/master_tests.cpp
Line 9472 (original), 9473 (patched)


Let's not run the actor's member functions in a different context:
```
dispatch(resourceProvider.process, 
::operationDefault, applyOperation.get());
```



src/tests/mesos.hpp
Line 3030 (original), 3041 (patched)


No need to have a type alias here if you use `Self`.



src/tests/mesos.hpp
Line 3037 (original), 3048 (patched)


Not yours, but this template parameter is not used in the class template.



src/tests/mesos.hpp
Lines 3062 (patched)


Let's use `Self::connectDefault` instead. Ditto below.



src/tests/mesos.hpp
Line 3152 (original), 3168 (patched)


s/`MockResourceProviderProcessT`/`Self`/. Ditto below.



src/tests/mesos.hpp
Line 3270 (original), 3269 (patched)


Not yours, but since `Source` can be deduced from `Resoruce`, let's use 
`Resource::DiskInfo::Source` here and remove the `Source` template parameter 
above.



src/tests/mesos.hpp
Lines 3322 (patched)


Just a naming suggestion. Since `MockResourceProvider` is no longer a 
"mock" class (i.e., has no mock function), how about renaming it to 
`TestResourceProvider`, and the backing process becomes 
`TestResourceProviderProcess`? This is the pattern I used for 
`TestDiskProfileServer`.

Feel free to drop this though.



src/tests/mesos.hpp
Lines 3335 (patched)


Let's do `process::terminate(*process)` instead and remove the `terminate` 
function. See below.



src/tests/mesos.hpp
Lines 3350 (patched)


The use of this `terminate` function in other tests are error prone to me. 
Instead, let's have a `stop` function in `MockResourceProviderProcess`:
```
void stop()
{
  driver.reset();
}
```
And get rid of this function. See my comments in 
`ResubscribeResourceProvider`.

Also, this function provides a second way to stop the resource provider 
other than `resourceProvider.reset()`, and we use both in tests. I'd suggesting 
unify them by removing this function.

As of the name, I'd prefer `stop` over `terminate` to match the `start` 
call.



src/tests/mesos.hpp
Lines 3361-3372 (patched)


No need to introduce this type alias in this class template since the class 
template `MockResourceProviderProcess` is already public.



src/tests/operation_reconciliation_tests.cpp
Line 1760 (original), 1760 (patched)


We need to have a proper synchronization here to avoid data races. See my 
comments in `UpdateSlaveMessageWithPendingOffers`.



src/tests/resource_provider_manager_tests.cpp
Line 1167 (original), 1167 (patched)

Re: Review Request 71119: Added a test to ensure `UPDATE_QUOTA` is applied all-or-nothing.

2019-07-18 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71119/#review216744
---


Fix it, then Ship it!





src/tests/master_quota_tests.cpp
Lines 299 (patched)


this is strange.. why are there stringify calls on these string literals?



src/tests/master_quota_tests.cpp
Lines 306-307 (patched)


Maybe mention that the purpose is to have a portion of an UPDATE_QUOTA be 
unauthorized?



src/tests/master_quota_tests.cpp
Lines 355-401 (patched)


I suppose using the /quota or GET_QUOTA API will make this test less 
brittle and verbose if we add things to the /roles endpoint.

Or, you could also use a sub-json contains check if you want to avoid 
having to specify 'resources', 'allocated', etc fields? That would make it less 
brittle.

E.g.


https://github.com/apache/mesos/blob/master/src/tests/role_tests.cpp#L452-L475


- Benjamin Mahler


On July 18, 2019, 7 p.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71119/
> ---
> 
> (Updated July 18, 2019, 7 p.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Bugs: MESOS-8968
> https://issues.apache.org/jira/browse/MESOS-8968
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added a test to ensure `UPDATE_QUOTA` is applied all-or-nothing.
> 
> 
> Diffs
> -
> 
>   src/tests/master_quota_tests.cpp 64a9b31a38af23fb16a4afd04f6d3413c7d1ffe9 
> 
> 
> Diff: https://reviews.apache.org/r/71119/diff/1/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 71118: Update clang and cmake for arm docker build.

2019-07-18 Thread Tomasz Janiszewski

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71118/
---

(Updated Lip 18, 2019, 7:17 po południu)


Review request for mesos and Benjamin Bannier.


Repository: mesos


Description
---

Update clang and cmake for arm docker build.


Diffs (updated)
-

  support/mesos-build/ubuntu-16.04-arm.dockerfile 
d223202e38d1d78554af9c95de1ee78a70dc3818 


Diff: https://reviews.apache.org/r/71118/diff/4/

Changes: https://reviews.apache.org/r/71118/diff/3-4/


Testing
---


Thanks,

Tomasz Janiszewski



Re: Review Request 71118: Update clang and cmake for arm docker build.

2019-07-18 Thread Tomasz Janiszewski

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71118/
---

(Updated Lip 18, 2019, 7:12 po południu)


Review request for mesos and Benjamin Bannier.


Repository: mesos


Description
---

Update clang and cmake for arm docker build.


Diffs (updated)
-

  support/mesos-build/ubuntu-16.04-arm.dockerfile 
d223202e38d1d78554af9c95de1ee78a70dc3818 


Diff: https://reviews.apache.org/r/71118/diff/3/

Changes: https://reviews.apache.org/r/71118/diff/2-3/


Testing
---


Thanks,

Tomasz Janiszewski



Re: Review Request 71118: Update clang and cmake for arm docker build.

2019-07-18 Thread Tomasz Janiszewski

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71118/
---

(Updated Lip 18, 2019, 7:09 po południu)


Review request for mesos and Benjamin Bannier.


Repository: mesos


Description
---

Update clang and cmake for arm docker build.


Diffs (updated)
-

  support/mesos-build/ubuntu-16.04-arm.dockerfile 
d223202e38d1d78554af9c95de1ee78a70dc3818 


Diff: https://reviews.apache.org/r/71118/diff/2/

Changes: https://reviews.apache.org/r/71118/diff/1-2/


Testing
---


Thanks,

Tomasz Janiszewski



Re: Review Request 71118: Update clang and cmake for arm docker build.

2019-07-18 Thread Benjamin Bannier

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71118/#review216743
---


Fix it, then Ship it!





support/mesos-build/ubuntu-16.04-arm.dockerfile
Lines 61-62 (patched)


Can we instead "properly" expand the tarball into the system prefix? 
Something like

$ cp -R clang+llvm-8.0.0-aarch64-linux-gnu/* /usr/

should work.


- Benjamin Bannier


On July 18, 2019, 7:57 p.m., Tomasz Janiszewski wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71118/
> ---
> 
> (Updated July 18, 2019, 7:57 p.m.)
> 
> 
> Review request for mesos and Benjamin Bannier.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Update clang and cmake for arm docker build.
> 
> 
> Diffs
> -
> 
>   support/mesos-build/ubuntu-16.04-arm.dockerfile 
> d223202e38d1d78554af9c95de1ee78a70dc3818 
> 
> 
> Diff: https://reviews.apache.org/r/71118/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Tomasz Janiszewski
> 
>



Review Request 71119: Added a test to ensure `UPDATE_QUOTA` is applied all-or-nothing.

2019-07-18 Thread Meng Zhu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71119/
---

Review request for mesos and Benjamin Mahler.


Bugs: MESOS-8968
https://issues.apache.org/jira/browse/MESOS-8968


Repository: mesos


Description
---

Added a test to ensure `UPDATE_QUOTA` is applied all-or-nothing.


Diffs
-

  src/tests/master_quota_tests.cpp 64a9b31a38af23fb16a4afd04f6d3413c7d1ffe9 


Diff: https://reviews.apache.org/r/71119/diff/1/


Testing
---

make check


Thanks,

Meng Zhu



Re: Review Request 71083: Added enchanced multi-role capability support to the python bindings.

2019-07-18 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71083/#review216740
---



Looks good, can you update the 'Testing Done'? Were you at least able to 
manually test these calls?

- Benjamin Mahler


On July 16, 2019, 3:12 p.m., Andrei Sekretenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71083/
> ---
> 
> (Updated July 16, 2019, 3:12 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler and Meng Zhu.
> 
> 
> Bugs: MESOS-9864
> https://issues.apache.org/jira/browse/MESOS-9864
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch adds suppressed roles list to the arguments of the scheduler
> driver constructor and `revivieOffers()`/`suppressOffers` methods, and
> also adds support of the `updateFramework()` method.
> 
> 
> Diffs
> -
> 
>   src/python/interface/src/mesos/interface/__init__.py 
> 1200ef652f6458b9bbcd29b4d93d34c28230c7d0 
>   src/python/scheduler/src/mesos/scheduler/mesos_scheduler_driver_impl.hpp 
> 8c98d46fb842595956d58a3f91f805e44be2bcf2 
>   src/python/scheduler/src/mesos/scheduler/mesos_scheduler_driver_impl.cpp 
> 6dec9daa295cfc67176fb043322b8e98efdd7d4b 
> 
> 
> Diff: https://reviews.apache.org/r/71083/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Andrei Sekretenko
> 
>



Re: Review Request 71082: Added a helper to create std::vector from iterable into python bindings.

2019-07-18 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71082/#review216739
---



Looks like I will wait until the egg issue is fixed


src/python/native_common/common.hpp
Lines 138 (patched)


stout Try would be more appropriate?



src/python/native_common/common.cpp
Lines 41 (patched)


Maybe some more context to the stderr message?

Failed to construct string from object: item is not a string


- Benjamin Mahler


On July 16, 2019, 3:11 p.m., Andrei Sekretenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71082/
> ---
> 
> (Updated July 16, 2019, 3:11 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler and Meng Zhu.
> 
> 
> Bugs: MESOS-9864
> https://issues.apache.org/jira/browse/MESOS-9864
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added a helper to create std::vector from iterable into python bindings.
> 
> 
> Diffs
> -
> 
>   src/python/native_common/common.hpp 
> ed743aab014301981fe2526a7e81f4a35c02e19b 
>   src/python/native_common/common.cpp PRE-CREATION 
>   src/python/native_common/ext_modules.py.in 
> 38e6717ec3b7bb09cfbb9b2fa25f173a17d78f51 
> 
> 
> Diff: https://reviews.apache.org/r/71082/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Andrei Sekretenko
> 
>



Review Request 71118: Update clang and cmake for arm docker build.

2019-07-18 Thread Tomasz Janiszewski

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71118/
---

Review request for mesos and Benjamin Bannier.


Repository: mesos


Description
---

Update clang and cmake for arm docker build.


Diffs
-

  support/mesos-build/ubuntu-16.04-arm.dockerfile 
d223202e38d1d78554af9c95de1ee78a70dc3818 


Diff: https://reviews.apache.org/r/71118/diff/1/


Testing
---


Thanks,

Tomasz Janiszewski



Re: Review Request 71084: Added unsuppressing via `updateFramework()` to python example framework.

2019-07-18 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71084/#review216738
---


Ship it!




Can you clarify the 'Testing Done'? Did you actually run this?


src/examples/python/test_framework.py
Lines 46 (patched)


How about a comment saying that once registered we unsuppress the roles? 
It's a little hard for the reader to see this here I think


- Benjamin Mahler


On July 16, 2019, 3:14 p.m., Andrei Sekretenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71084/
> ---
> 
> (Updated July 16, 2019, 3:14 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler and Meng Zhu.
> 
> 
> Bugs: MESOS-9864
> https://issues.apache.org/jira/browse/MESOS-9864
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch ensures that the list of suppressed roles is passed through
> the `updateFramework()` of python V0 bindings. Now the framework
> subscribes with all roles suppressed and after that issues
> UPDATE_FRAMEWORK call to unsuppress offers.
> 
> 
> Diffs
> -
> 
>   src/examples/python/test_framework.py 
> 27bc052dc4e57e0290ef58a39b2f1ac82fdb42fd 
> 
> 
> Diff: https://reviews.apache.org/r/71084/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Andrei Sekretenko
> 
>



Re: Review Request 71111: Added offer rescind logic for guarantees enforcement.

2019-07-18 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7/#review216737
---




src/master/quota_handler.cpp
Lines 651 (patched)


Ditto last review, but in this case the presence of reservations in both 
leads to an under-rescind, which seems less acceptable vs an over-rescind?



src/tests/master_quota_tests.cpp
Lines 1691-1693 (patched)


Ditto, probably want to pull out the test to ease reviewing?


- Benjamin Mahler


On July 18, 2019, 4:30 a.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/7/
> ---
> 
> (Updated July 18, 2019, 4:30 a.m.)
> 
> 
> Review request for mesos, Andrei Sekretenko and Benjamin Mahler.
> 
> 
> Bugs: MESOS-9812
> https://issues.apache.org/jira/browse/MESOS-9812
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Outstanding offers need to be rescinded as needed to ensure
> roles' requested guarantees can be satisfied.
> 
> For simplicity, and also due to the race between the master and
> the allocator, we pessimistically assume that what seems like
> "available" resources in the allocator are all gone. We greedily
> rescind offers until we can satisfy the guarantees.
> 
> Also added a test.
> 
> 
> Diffs
> -
> 
>   src/master/quota_handler.cpp 2a3ca5640c850ec8a8aca7abed45386b5b5e1125 
>   src/tests/master_quota_tests.cpp 64a9b31a38af23fb16a4afd04f6d3413c7d1ffe9 
> 
> 
> Diff: https://reviews.apache.org/r/7/diff/2/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 71068: Added offer rescind logic for limits enforcement.

2019-07-18 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71068/#review216736
---



Test looks good if you want to pull it out and get a ship it on it separately, 
that would be easier.


src/master/quota_handler.cpp
Lines 589-590 (patched)


Hm.. this is an over estimate of what we're looking for since reservations 
might be included in both consumed and offered?

I suppose over-rescinding is acceptable, but certainly warrants a comment?



src/master/quota_handler.cpp
Lines 595-596 (patched)


This CHECK will fail in the future with optimistic offers where we don't 
set an allocation info, but I suppose we'll catch that and any others, since we 
won't need to rescind for frameworks with that future capability



src/master/quota_handler.cpp
Lines 602-609 (patched)


This actually seems somewhat unacceptable, I wonder if we can do something 
with what we have today, e.g.:

```
Rescind logic

allocatorUpdatedFuture
  .then(defer(master, []() { Rescind logic })
```

That is, we first rescind enough offers from those that are known to the 
master, then, after the allocator is done updating the quota (which means it 
won't enqueue more offers violating the limits), we do another round of 
rescinding. With this logic, it's still possible though for a framework to have 
accepted an offer that arrived after the initial rescind but before the 
allocator finished updating the quota.. so maybe it's not worth it.

Do we have a ticket that captures this issue? Would be good to link all the 
problems that are caused by the current offer ownership model.



src/master/quota_handler.cpp
Lines 623 (patched)


extra newline?



src/tests/master_quota_tests.cpp
Lines 1775 (patched)


are rescinded


- Benjamin Mahler


On July 18, 2019, 5:23 p.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71068/
> ---
> 
> (Updated July 18, 2019, 5:23 p.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Bugs: MESOS-9812
> https://issues.apache.org/jira/browse/MESOS-9812
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Outstanding offers of a role (including offers allocated to its
> subroles) are rescinded until the sum of its consumed quota and
> outstanding offered resources are below the requested limits.
> 
> The rescind logic here, however, overlooks the race between master
> and the allocator. Namely, there might be pending offers in the master
> mailbox that have yet to be processed. Here, for simplicity, we ignore
> this race. To properly handle this, the plan is to move offer
> management logic into the allocator and rescind offers there.
> 
> Also added a test.
> 
> 
> Diffs
> -
> 
>   src/master/quota_handler.cpp 2a3ca5640c850ec8a8aca7abed45386b5b5e1125 
>   src/tests/master_quota_tests.cpp 64a9b31a38af23fb16a4afd04f6d3413c7d1ffe9 
> 
> 
> Diff: https://reviews.apache.org/r/71068/diff/3/
> 
> 
> Testing
> ---
> 
> Ran continously without failure.
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 71068: Added offer rescind logic for limits enforcement.

2019-07-18 Thread Meng Zhu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71068/
---

(Updated July 18, 2019, 10:23 a.m.)


Review request for mesos and Benjamin Mahler.


Changes
---

Fixed a flaky.


Bugs: MESOS-9812
https://issues.apache.org/jira/browse/MESOS-9812


Repository: mesos


Description
---

Outstanding offers of a role (including offers allocated to its
subroles) are rescinded until the sum of its consumed quota and
outstanding offered resources are below the requested limits.

The rescind logic here, however, overlooks the race between master
and the allocator. Namely, there might be pending offers in the master
mailbox that have yet to be processed. Here, for simplicity, we ignore
this race. To properly handle this, the plan is to move offer
management logic into the allocator and rescind offers there.

Also added a test.


Diffs (updated)
-

  src/master/quota_handler.cpp 2a3ca5640c850ec8a8aca7abed45386b5b5e1125 
  src/tests/master_quota_tests.cpp 64a9b31a38af23fb16a4afd04f6d3413c7d1ffe9 


Diff: https://reviews.apache.org/r/71068/diff/3/

Changes: https://reviews.apache.org/r/71068/diff/2-3/


Testing
---

Ran continously without failure.


Thanks,

Meng Zhu



Re: Review Request 71110: Refactored `struct ResourceBreakdown`.

2019-07-18 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71110/#review216734
---


Fix it, then Ship it!




How about 'Refactored master's role ResourceBreakdown to compute results 
lazily.'?


src/master/readonly_handler.cpp
Line 742 (original), 740 (patched)


Why use the implicit initializer list constructor rather than the one 
explicitly defined?

i.e. s/{/( s/}/)



src/master/readonly_handler.cpp
Lines 771-774 (original), 766-769 (patched)


Since these are pretty expensive calls, can you avoid recomputing 
allocated() and offered() here?


- Benjamin Mahler


On July 18, 2019, 1:54 a.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71110/
> ---
> 
> (Updated July 18, 2019, 1:54 a.m.)
> 
> 
> Review request for mesos, Andrei Sekretenko and Benjamin Mahler.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This breaks the computation of offered, allocated, reserved
> and consumedQuota into separate getter functions which are
> lazily calculated upon invoked. The struct is renamed to
> `RoleResourceBreakdown`.
> 
> 
> Diffs
> -
> 
>   src/master/master.hpp ffa7423ba533725f7c1123d9aa507b1348e7f281 
>   src/master/master.cpp f1ca637b4cb0382caec53b5a81f6a4eb46f4dd2d 
>   src/master/readonly_handler.cpp 60dac9ae48e1e84380c7c499a367797069ca64bd 
> 
> 
> Diff: https://reviews.apache.org/r/71110/diff/1/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 71117: WIP: debug egg build at ASF CI.

2019-07-18 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71117/#review216735
---



Bad review!

Reviews applied: []

Error:
No reviewers specified. Please find a reviewer by asking on JIRA or the mailing 
list.

- Mesos Reviewbot


On July 18, 2019, 4:05 p.m., Andrei Sekretenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71117/
> ---
> 
> (Updated July 18, 2019, 4:05 p.m.)
> 
> 
> Review request for mesos.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> WIP: debug egg build at ASF CI.
> 
> 
> Diffs
> -
> 
>   src/python/native_common/ext_modules.py.in 
> 38e6717ec3b7bb09cfbb9b2fa25f173a17d78f51 
> 
> 
> Diff: https://reviews.apache.org/r/71117/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Andrei Sekretenko
> 
>



Re: Review Request 71109: Added helpers to return resources allocated/reserved to a role subtree.

2019-07-18 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71109/#review216733
---


Ship it!




Any reason not to update the resources_tests.cpp to test this?

- Benjamin Mahler


On July 18, 2019, 1:52 a.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71109/
> ---
> 
> (Updated July 18, 2019, 1:52 a.m.)
> 
> 
> Review request for mesos, Andrei Sekretenko and Benjamin Mahler.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added helpers to return resources allocated/reserved to a role subtree.
> 
> 
> Diffs
> -
> 
>   include/mesos/resources.hpp 3fc2c34b383cf1cabfb5367a72e0e71a1aed4238 
>   include/mesos/v1/resources.hpp 39c04035263cb9f96e957d1fb80ab0cc8313f97d 
>   src/common/resources.cpp 6e02d5c7cd479d4a8161260df2eba7ae0a4e9b6d 
>   src/v1/resources.cpp 25fb2136c0ecae4102bc7dfc72cdb5d8ceeff836 
> 
> 
> Diff: https://reviews.apache.org/r/71109/diff/1/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 71065: Implemented `FutureTrackTest` tests.

2019-07-18 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71065/#review216731
---



Patch looks great!

Reviews applied: [70887, 70888, 70889, 70890, 70891, 70892, 71065]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose --disable-libtool-wrappers 
--disable-parallel-test-execution' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker-build.sh

- Mesos Reviewbot


On July 12, 2019, 9:03 a.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71065/
> ---
> 
> (Updated July 12, 2019, 9:03 a.m.)
> 
> 
> Review request for mesos, Gilbert Song and Qian Zhang.
> 
> 
> Bugs: MESOS-9842
> https://issues.apache.org/jira/browse/MESOS-9842
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> These tests verify functionality provided by the
> `PendingFutureTracker` class.
> 
> 
> Diffs
> -
> 
>   src/Makefile.am 761dde1d63e0f4f1ac4ab86f129f84f3746d3153 
>   src/tests/CMakeLists.txt 5eb9c65f30d9d6ade289f9d47b18d73908b1f1db 
>   src/tests/common/future_tracker_tests.cpp PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/71065/diff/4/
> 
> 
> Testing
> ---
> 
> `src/mesos-tests --gtest_repeat=100 --gtest_filter=FutureTrackTest.*`
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 71065: Implemented `FutureTrackTest` tests.

2019-07-18 Thread Gilbert Song

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71065/#review216730
---


Ship it!




Ship It!

- Gilbert Song


On July 12, 2019, 9:03 a.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71065/
> ---
> 
> (Updated July 12, 2019, 9:03 a.m.)
> 
> 
> Review request for mesos, Gilbert Song and Qian Zhang.
> 
> 
> Bugs: MESOS-9842
> https://issues.apache.org/jira/browse/MESOS-9842
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> These tests verify functionality provided by the
> `PendingFutureTracker` class.
> 
> 
> Diffs
> -
> 
>   src/Makefile.am 761dde1d63e0f4f1ac4ab86f129f84f3746d3153 
>   src/tests/CMakeLists.txt 5eb9c65f30d9d6ade289f9d47b18d73908b1f1db 
>   src/tests/common/future_tracker_tests.cpp PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/71065/diff/4/
> 
> 
> Testing
> ---
> 
> `src/mesos-tests --gtest_repeat=100 --gtest_filter=FutureTrackTest.*`
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 70892: Added `/containerizer/debug` endpoint.

2019-07-18 Thread Gilbert Song

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70892/#review216729
---


Ship it!




Ship It!

- Gilbert Song


On June 19, 2019, 7:49 a.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70892/
> ---
> 
> (Updated June 19, 2019, 7:49 a.m.)
> 
> 
> Review request for mesos, Gilbert Song, James Peach, and Qian Zhang.
> 
> 
> Bugs: MESOS-9829
> https://issues.apache.org/jira/browse/MESOS-9829
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch introduces an agent's `/containerizer/debug` endpoint,
> which exposes the debug info related to Mesos containerizer.
> Currently, this endpoint returns a list of pending futures related to
> isolators or containerizer launcher. This endpoint is experimental,
> and the format of its output may change over time.
> 
> 
> Diffs
> -
> 
>   docs/authorization.md 91dc03be66bdcf135610bff50e5abcfe8d95c5db 
>   src/common/authorization.cpp 89487219ef87619084663e1a0a17e75924394d38 
>   src/local/local.cpp 6ac6b027f66ef555440ca86803198bfe3227f8d7 
>   src/slave/http.hpp b8c83f1db95c2175bb94f6cd76cc2c58aa118c4e 
>   src/slave/http.cpp 321dca786f106169f88df086422f5ef817d2f098 
>   src/slave/main.cpp ef5ea02f169427d9913b807664bf3073d72cd9b6 
>   src/slave/slave.hpp 58a5608ceb53d7f75a085f1d3a4f757e8c3ddd3b 
>   src/slave/slave.cpp 0e7e4d450b77f48a42debf47cb6fc12fdde5858b 
>   src/tests/cluster.cpp b43f806eab096b7d4e86e2b5d4b81d6baecb0ee5 
>   src/tests/mock_slave.hpp c057b40d6db433f52d1dfe10ae02a3b0bc936564 
>   src/tests/mock_slave.cpp dd458e3d09e212d6cde514d86989878954f79fc1 
> 
> 
> Diff: https://reviews.apache.org/r/70892/diff/5/
> 
> 
> Testing
> ---
> 
> Manual testing.
> 1. Add `::sleep(5)` to the beginning of the 
> `LinuxSeccompIsolatorProcess::prepare` method.
> 2. Rebuild Mesos and then spin up a local Mesos cluster.
> 3. Run simple task, e.g.:
> ```
> ./src/mesos-execute --master="`hostname`:5050" --name="test" 
> --containerizer=mesos --command="sleep 1"
> ```
> 4. In the parallel terminal session:
> ```
> curl `hostname`:5051/containerizer/debug
> # will return: 
> {"pending":[["linux/seccomp::prepare",{"containerId":""}]]}
> ```
> 5. After 5 seconds this endpoint returns an empty list of pending futures: 
> `{"pending":[]}`
> 
> Unit tests: TBD in the following patches.
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 70890: Added `LauncherTracker` for tracking calls of launcher methods.

2019-07-18 Thread Gilbert Song

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70890/#review216728
---


Ship it!




Ship It!

- Gilbert Song


On June 19, 2019, 7:49 a.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70890/
> ---
> 
> (Updated June 19, 2019, 7:49 a.m.)
> 
> 
> Review request for mesos, Gilbert Song and Qian Zhang.
> 
> 
> Bugs: MESOS-9840
> https://issues.apache.org/jira/browse/MESOS-9840
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch adds a new `LauncherTracker` class that proxies
> calls to the real `Launcher` and keeps track of returned futures
> by employing `PendingFutureTracker` class.
> 
> 
> Diffs
> -
> 
>   src/CMakeLists.txt 6d1409003c4727f6a627f0c0f20d65c2db623012 
>   src/Makefile.am 761dde1d63e0f4f1ac4ab86f129f84f3746d3153 
>   src/slave/containerizer/mesos/launcher_tracker.hpp PRE-CREATION 
>   src/slave/containerizer/mesos/launcher_tracker.cpp PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/70890/diff/4/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 70889: Wrapped isolators in `IsolatorTracker`.

2019-07-18 Thread Gilbert Song

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70889/#review216727
---


Ship it!




Ship It!

- Gilbert Song


On June 19, 2019, 7:49 a.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70889/
> ---
> 
> (Updated June 19, 2019, 7:49 a.m.)
> 
> 
> Review request for mesos, Gilbert Song and Qian Zhang.
> 
> 
> Bugs: MESOS-9841
> https://issues.apache.org/jira/browse/MESOS-9841
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch wraps every isolator in instance of `IsolatorTracker` class.
> If an isolator gets stuck in some operation, `pendingFutures` will
> return the isolator name, method name along with its arguments such as
> `containerId`, `pid`, etc.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/containerizer.hpp 
> d33c65cd3fae9bb4d347681cb00dbdb25ecf57b8 
>   src/slave/containerizer/containerizer.cpp 
> 5ce0d9c4bfa707495ea644c90fa6e35eb0fe25cd 
>   src/slave/containerizer/mesos/containerizer.hpp 
> 558e412b28a84ae0ca1bf05b596540072bb216d6 
>   src/slave/containerizer/mesos/containerizer.cpp 
> c9a369bcdbfc1676912430c3e84fa567bbd7f766 
>   src/slave/main.cpp ef5ea02f169427d9913b807664bf3073d72cd9b6 
>   src/tests/cluster.cpp b43f806eab096b7d4e86e2b5d4b81d6baecb0ee5 
> 
> 
> Diff: https://reviews.apache.org/r/70889/diff/6/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 70888: Added `IsolatorTracker` for tracking calls of isolator methods.

2019-07-18 Thread Gilbert Song

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70888/#review216726
---


Ship it!




Ship It!

- Gilbert Song


On June 19, 2019, 7:49 a.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70888/
> ---
> 
> (Updated June 19, 2019, 7:49 a.m.)
> 
> 
> Review request for mesos, Gilbert Song and Qian Zhang.
> 
> 
> Bugs: MESOS-9839
> https://issues.apache.org/jira/browse/MESOS-9839
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch adds a new `IsolatorTracker` class that proxies
> calls to the real isolator and keeps track of returned futures
> by employing `PendingFutureTracker` class.
> 
> 
> Diffs
> -
> 
>   src/CMakeLists.txt eb4e2ac699121ac16e2ac58e4606b26aae906ad8 
>   src/Makefile.am 761dde1d63e0f4f1ac4ab86f129f84f3746d3153 
>   src/slave/constants.hpp cc81e5111e8c3082c940b918230f660a1121a1ac 
>   src/slave/containerizer/mesos/isolator_tracker.hpp PRE-CREATION 
>   src/slave/containerizer/mesos/isolator_tracker.cpp PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/70888/diff/3/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 70887: Added `PendingFutureTracker` class for tracking pending futures.

2019-07-18 Thread Gilbert Song

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70887/#review216725
---


Ship it!




Ship It!

- Gilbert Song


On June 19, 2019, 7:49 a.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70887/
> ---
> 
> (Updated June 19, 2019, 7:49 a.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, Gilbert Song, James Peach, Meng 
> Zhu, and Qian Zhang.
> 
> 
> Bugs: MESOS-9837
> https://issues.apache.org/jira/browse/MESOS-9837
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch introduces a mechanism for tracking pending futures.
> This feature allows detection of hanging operations, which get
> stuck on a blocking operation or asynchronously. However, this
> feature does not provide any mechanism for tracking pending
> promises, because `Promise` objects might not be accessible in
> various cases. Thereby, we introduce a new class that can be
> used to track pending futures, so it might facilitate debugging
> of stuck issues.
> 
> 
> Diffs
> -
> 
>   src/Makefile.am 761dde1d63e0f4f1ac4ab86f129f84f3746d3153 
>   src/common/future_tracker.hpp PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/70887/diff/7/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Review Request 71117: WIP: debug egg build at ASF CI.

2019-07-18 Thread Andrei Sekretenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71117/
---

Review request for mesos.


Repository: mesos


Description
---

WIP: debug egg build at ASF CI.


Diffs
-

  src/python/native_common/ext_modules.py.in 
38e6717ec3b7bb09cfbb9b2fa25f173a17d78f51 


Diff: https://reviews.apache.org/r/71117/diff/1/


Testing
---


Thanks,

Andrei Sekretenko



Re: Review Request 71111: Added offer rescind logic for guarantees enforcement.

2019-07-18 Thread Andrei Sekretenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7/#review216717
---




src/master/quota_handler.cpp
Lines 664 (patched)


Does this mean that we might rescind an offer even if `consumption + 
offered` for the offer's role is within guarantees of that role? 

If yes - do we really have to be THAT greedy?...


- Andrei Sekretenko


On July 18, 2019, 4:30 a.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/7/
> ---
> 
> (Updated July 18, 2019, 4:30 a.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Bugs: MESOS-9812
> https://issues.apache.org/jira/browse/MESOS-9812
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Outstanding offers need to be rescinded as needed to ensure
> roles' requested guarantees can be satisfied.
> 
> For simplicity, and also due to the race between the master and
> the allocator, we pessimistically assume that what seems like
> "available" resources in the allocator are all gone. We greedily
> rescind offers until we can satisfy the guarantees.
> 
> Also added a test.
> 
> 
> Diffs
> -
> 
>   src/master/quota_handler.cpp 2a3ca5640c850ec8a8aca7abed45386b5b5e1125 
>   src/tests/master_quota_tests.cpp 64a9b31a38af23fb16a4afd04f6d3413c7d1ffe9 
> 
> 
> Diff: https://reviews.apache.org/r/7/diff/1/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 71068: Added offer rescind logic for limits enforcement.

2019-07-18 Thread Andrei Sekretenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71068/#review216716
---




src/master/quota_handler.cpp
Lines 619 (patched)


Follow-up question: do you see any way to do something better (in terms of 
cluster fragmentation) than rescinding offers in an arbitrary order? 

"Larger" first/"smaller" first/same-slave first/...?

(even if there is such a way, it might be not worth implementing, 
especially right now)


- Andrei Sekretenko


On July 18, 2019, 4:28 a.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71068/
> ---
> 
> (Updated July 18, 2019, 4:28 a.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Bugs: MESOS-9812
> https://issues.apache.org/jira/browse/MESOS-9812
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Outstanding offers of a role (including offers allocated to its
> subroles) are rescinded until the sum of its consumed quota and
> outstanding offered resources are below the requested limits.
> 
> The rescind logic here, however, overlooks the race between master
> and the allocator. Namely, there might be pending offers in the master
> mailbox that have yet to be processed. Here, for simplicity, we ignore
> this race. To properly handle this, the plan is to move offer
> management logic into the allocator and rescind offers there.
> 
> Also added a test.
> 
> 
> Diffs
> -
> 
>   src/master/quota_handler.cpp 2a3ca5640c850ec8a8aca7abed45386b5b5e1125 
>   src/tests/master_quota_tests.cpp 64a9b31a38af23fb16a4afd04f6d3413c7d1ffe9 
> 
> 
> Diff: https://reviews.apache.org/r/71068/diff/2/
> 
> 
> Testing
> ---
> 
> Ran continously without failure.
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 71106: Moved kernelVersion check to common code.

2019-07-18 Thread James Peach

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71106/#review216712
---



Please add both `kernel_version.hpp` and `kernel_version.cpp` to 
`src/Makefile.am` (the headers are always needed for dist tarballs).


src/common/kernel_version.hpp
Lines 17 (patched)


This should be `__KERNEL_VERSION_HPP__` to match the file name.



src/common/kernel_version.hpp
Lines 24 (patched)


Once you move the function body, you can remove the `using` statements and 
the only remaingin includes should be `try.hpp` and `stout/version.hpp`.



src/common/kernel_version.hpp
Lines 31 (patched)


Can you please move this into `kernel_version.cpp`?


- James Peach


On July 17, 2019, 8:38 p.m., Jacob Janco wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71106/
> ---
> 
> (Updated July 17, 2019, 8:38 p.m.)
> 
> 
> Review request for mesos.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Moved kernelVersion check to common code.
> 
> 
> Diffs
> -
> 
>   src/common/kernel_version.hpp PRE-CREATION 
>   src/linux/ns.cpp 2440bb2232dfd9cf57a6cc36aeff874c96297c4b 
> 
> 
> Diff: https://reviews.apache.org/r/71106/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Jacob Janco
> 
>



Re: Review Request 71111: Added offer rescind logic for guarantees enforcement.

2019-07-18 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7/#review216711
---



Bad patch!

Reviews applied: [71109, 71110, 71061, 71068, 7]

Failed command: ['bash', '-c', "set -o pipefail; export OS='ubuntu:14.04' 
BUILDTOOL='autotools' COMPILER='gcc' CONFIGURATION='--verbose 
--disable-libtool-wrappers --disable-parallel-test-execution' 
ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; ./support/docker-build.sh 2>&1 | tee 
build_7"]

Error:
..
b-ade3-31adcc3bff52 on agent 58665dec-7eb8-49bb-ad94-85d02ec4f434-S0
I0718 06:35:48.936278 18267 http_connection.hpp:131] Sending 2 call to 
http://172.17.0.2:42893/slave(1216)/api/v1/resource_provider
I0718 06:35:48.937249 18275 process.cpp:3671] Handling HTTP event for process 
'slave(1216)' with path: '/slave(1216)/api/v1/resource_provider'
I0718 06:35:48.940191 18257 process.cpp:3671] Handling HTTP event for process 
'master' with path: '/master/api/v1'
I0718 06:35:48.941807 18280 http.cpp:1115] HTTP POST for /master/api/v1 from 
172.17.0.2:52610
I0718 06:35:48.942044 18280 http.cpp:263] Processing call UNRESERVE_RESOURCES
I0718 06:35:48.942693 18280 master.cpp:3888] Authorizing principal 
'test-principal' to unreserve resources 
'[{"disk":{"source":{"id":"/tmp/CSIVersion_StorageLocalResourceProviderTest_OperatorOperationsWithResourceProviderResources_v0_z6gmBZ/2GB-fe75e496-5c41-4dee-b8cd-8a7e5df5283a","mount":{"root":"./csi/org.apache.mesos.csi.test/local/mounts"},"profile":"test","type":"MOUNT","vendor":"org.apache.mesos.csi.test.local"}},"name":"disk","provider_id":{"value":"8e9395d9-3d92-4641-876e-2a95f843d191"},"reservations":[{"role":"storage","type":"DYNAMIC"},{"principal":"test-principal","role":"storage/default-role","type":"DYNAMIC"}],"scalar":{"value":2048.0},"type":"SCALAR"}]'
I0718 06:35:48.944628 18263 master.cpp:12685] Removing offer 
58665dec-7eb8-49bb-ad94-85d02ec4f434-O5
I0718 06:35:48.944752 18276 sched.cpp:960] Rescinded offer 
58665dec-7eb8-49bb-ad94-85d02ec4f434-O5
I0718 06:35:48.944823 18276 sched.cpp:971] Scheduler::offerRescinded took 
21015ns
I0718 06:35:48.945453 18259 hierarchical.cpp:1218] Recovered disk(allocated: 
storage/default-role)(reservations: 
[(DYNAMIC,storage),(DYNAMIC,storage/default-role,test-principal)])[MOUNT(org.apache.mesos.csi.test.local,/tmp/CSIVersion_StorageLocalResourceProviderTest_OperatorOperationsWithResourceProviderResources_v0_z6gmBZ/2GB-fe75e496-5c41-4dee-b8cd-8a7e5df5283a,test)]:2048;
 cpus(allocated: storage/default-role):2; mem(allocated: 
storage/default-role):1024; disk(allocated: storage/default-role):1024; 
ports(allocated: storage/default-role):[31000-32000] (total: cpus:2; mem:1024; 
disk:1024; ports:[31000-32000]; disk(reservations: 
[(DYNAMIC,storage),(DYNAMIC,storage/default-role,test-principal)])[MOUNT(org.apache.mesos.csi.test.local,/tmp/CSIVersion_StorageLocalResourceProviderTest_OperatorOperationsWithResourceProviderResources_v0_z6gmBZ/2GB-fe75e496-5c41-4dee-b8cd-8a7e5df5283a,test)]:2048,
 allocated: {}) on agent 58665dec-7eb8-49bb-ad94-85d02ec4f434-S0 from framework 
58665dec-7eb8-49
 bb-ad94-85d02ec4f434-
I0718 06:35:48.945569 18259 hierarchical.cpp:1264] Framework 
58665dec-7eb8-49bb-ad94-85d02ec4f434- filtered agent 
58665dec-7eb8-49bb-ad94-85d02ec4f434-S0 for 5secs
I0718 06:35:48.948091 18273 master.cpp:12576] Sending operation '' (uuid: 
32631748-d534-4830-af5b-2860ca4a04ce) to agent 
58665dec-7eb8-49bb-ad94-85d02ec4f434-S0 at slave(1216)@172.17.0.2:42893 
(2dfcf722f334)
I0718 06:35:48.948653 18267 slave.cpp:4335] Ignoring new checkpointed resources 
and operations identical to the current version
I0718 06:35:48.951097 18279 hierarchical.cpp:1508] Performed allocation for 1 
agents in 1.161105ms
I0718 06:35:48.951505 18262 provider.cpp:481] Received APPLY_OPERATION event
I0718 06:35:48.951555 18262 provider.cpp:1295] Received UNRESERVE operation '' 
(uuid: 32631748-d534-4830-af5b-2860ca4a04ce)
I0718 06:35:48.951690 18260 master.cpp:10393] Sending offers [ 
58665dec-7eb8-49bb-ad94-85d02ec4f434-O6 ] to framework 
58665dec-7eb8-49bb-ad94-85d02ec4f434- (default) at 
scheduler-4cf5ae9f-d24d-4eae-88a0-dbf9626d4e8c@172.17.0.2:42893
I0718 06:35:48.952296 18261 sched.cpp:934] Scheduler::resourceOffers took 
77288ns
I0718 06:35:48.976239 18268 http.cpp:1115] HTTP POST for 
/slave(1216)/api/v1/resource_provider from 172.17.0.2:52600
I0718 06:35:48.977198 18271 slave.cpp:8406] Handling resource provider message 
'UPDATE_OPERATION_STATUS: (uuid: d9520b5c-c932-4e3b-ade3-31adcc3bff52) for 
framework  (latest state: OPERATION_FINISHED, status update state: 
OPERATION_FINISHED)'
I0718 06:35:48.977396 18271 slave.cpp:8859] Updating the state of operation 
with no ID (uuid: d9520b5c-c932-4e3b-ade3-31adcc3bff52) for an operation API 
call (latest state: OPERATION_FINISHED, status update state: OPERATION_FINISHED)
I0718 06:35:48.977449 

Re: Review Request 70890: Added `LauncherTracker` for tracking calls of launcher methods.

2019-07-18 Thread Gilbert Song

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70890/#review216710
---




src/slave/containerizer/mesos/launcher_tracker.cpp
Lines 62-80 (patched)


Could we avoid returning future.get(). It is not a risk here but not very 
readable.

We could rely on a promise:
```
  Promise promise;
  tracker->track(
  promise.future(),
  "launcher::fork",
  COMPONENT_NAME_CONTAINERIZER,
  {{"containerId", stringify(containerId)},
   {"path", path}});

  Try forked = launcher->fork(
  containerId,
  path,
  argv,
  containerIO,
  flags,
  environment,
  enterNamespaces,
  cloneNamespaces,
  whitelistFds);

  promise.associate(forked);
  return forked;
```


- Gilbert Song


On June 19, 2019, 7:49 a.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70890/
> ---
> 
> (Updated June 19, 2019, 7:49 a.m.)
> 
> 
> Review request for mesos, Gilbert Song and Qian Zhang.
> 
> 
> Bugs: MESOS-9840
> https://issues.apache.org/jira/browse/MESOS-9840
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch adds a new `LauncherTracker` class that proxies
> calls to the real `Launcher` and keeps track of returned futures
> by employing `PendingFutureTracker` class.
> 
> 
> Diffs
> -
> 
>   src/CMakeLists.txt eb4e2ac699121ac16e2ac58e4606b26aae906ad8 
>   src/Makefile.am 761dde1d63e0f4f1ac4ab86f129f84f3746d3153 
>   src/slave/containerizer/mesos/launcher_tracker.hpp PRE-CREATION 
>   src/slave/containerizer/mesos/launcher_tracker.cpp PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/70890/diff/3/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>