Re: Review Request 70570: Logged headroom related info in the allocator.

2019-05-06 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70570/#review215081
---



Patch looks great!

Reviews applied: [70570]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose --disable-libtool-wrappers 
--disable-parallel-test-execution' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker-build.sh

- Mesos Reviewbot


On April 30, 2019, 1 a.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70570/
> ---
> 
> (Updated April 30, 2019, 1 a.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Bugs: MESOS-9759
> https://issues.apache.org/jira/browse/MESOS-9759
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch logs `requiredHeadroom` and `availableHeadroom`
> before each allocation cycle and logs `requiredHeadroom`,
> resources and number of agents held back for quota `headroom`
> as well as resources filtered at the end of each allocation
> cycle.
> 
> While we also held resources back for quota headroom in the
> first stage, we do not log it there. This is because in the
> second stage, we try to allocate all resources (including the
> ones held back in the first stage). Thus only resources held
> back in the second stage are truly held back for the whole
> allocation cycle.
> 
> 
> Diffs
> -
> 
>   src/master/allocator/mesos/hierarchical.cpp 
> 64a076ddd29711437d539a06bb0470755828cc87 
> 
> 
> Diff: https://reviews.apache.org/r/70570/diff/4/
> 
> 
> Testing
> ---
> 
> N/A
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 70550: Added `UpdateQuotaConfig` authorization in the local authorizor.

2019-05-06 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70550/#review215079
---



Patch looks great!

Reviews applied: [70549, 70550]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose --disable-libtool-wrappers 
--disable-parallel-test-execution' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker-build.sh

- Mesos Reviewbot


On April 25, 2019, 11:14 p.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70550/
> ---
> 
> (Updated April 25, 2019, 11:14 p.m.)
> 
> 
> Review request for mesos, Andrei Sekretenko, Benjamin Mahler, and Till 
> Toenshoff.
> 
> 
> Bugs: MESOS-9640
> https://issues.apache.org/jira/browse/MESOS-9640
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Also added a test.
> 
> 
> Diffs
> -
> 
>   include/mesos/authorizer/acls.proto 
> 85a3979bc02ab0145a0934f29eeef7bb65f3c42c 
>   include/mesos/authorizer/authorizer.hpp 
> a86a6eeb592adfc267dcf3faef40e8da3471feaf 
>   src/authorizer/local/authorizer.cpp 
> ec86f053357e4bc85532ea4073b57b272ffc74fe 
>   src/tests/authorization_tests.cpp cad76f664c02dc778884d9850184b672e28f5fee 
> 
> 
> Diff: https://reviews.apache.org/r/70550/diff/1/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 70549: Added authorization for `UpdateQuota` call in the master.

2019-05-06 Thread Meng Zhu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70549/
---

(Updated May 6, 2019, 4 p.m.)


Review request for mesos, Andrei Sekretenko and Benjamin Mahler.


Changes
---

Clarified why a new action is introduced.


Bugs: MESOS-9640
https://issues.apache.org/jira/browse/MESOS-9640


Repository: mesos


Description (updated)
---

A new authorizable action `update_quota_configs` is added.
This disambiguates with the old action `update_quota`
which are used for the old `SetQuota` and
`RemoveQuota` calls. `update_quota` action requires
`QuotaInfo` as the object while the new `UpdatedQuota` call
uses `QuotaConfig`. To keep it compatible with any external
authorization modules, a new action `update_quota_config`
is introduced.


Diffs (updated)
-

  include/mesos/authorizer/authorizer.proto 
e2740c402732bb37db991ec92b9301e58b33215b 
  src/master/master.hpp 7d9732f1e432f6f0290d234242864cbdbf381fa8 
  src/master/quota_handler.cpp 5d449e6f027a69ccaa0ac3473ea4cf57441601f3 


Diff: https://reviews.apache.org/r/70549/diff/2/

Changes: https://reviews.apache.org/r/70549/diff/1-2/


Testing
---

make check


Thanks,

Meng Zhu



Re: Review Request 70595: Adds a regression test for MESOS-9766.

2019-05-06 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70595/
---

(Updated May 6, 2019, 10:04 p.m.)


Review request for mesos, Alexander Rukletsov and Chun-Hung Hsiao.


Changes
---

Renamed the test to NoHang


Bugs: MESOS-9766
https://issues.apache.org/jira/browse/MESOS-9766


Repository: mesos


Description
---

This test fails on master prior to applying the fix for MESOS-9766.
It attempts to ensure that processes are terminated after the
/__processes__ handler dispatches to them.


Diffs (updated)
-

  3rdparty/libprocess/src/tests/process_tests.cpp 
60f3dd653153c2b2ccf9c3a7eae8c75fd6ff025c 


Diff: https://reviews.apache.org/r/70595/diff/4/

Changes: https://reviews.apache.org/r/70595/diff/3-4/


Testing
---

Ran in repetition, although it appears to consistently fail on master without 
repetition needed.


Thanks,

Benjamin Mahler



Re: Review Request 70595: Adds a regression test for MESOS-9766.

2019-05-06 Thread Chun-Hung Hsiao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70595/#review215075
---


Ship it!




Ship It!

- Chun-Hung Hsiao


On May 6, 2019, 10:04 p.m., Benjamin Mahler wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70595/
> ---
> 
> (Updated May 6, 2019, 10:04 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov and Chun-Hung Hsiao.
> 
> 
> Bugs: MESOS-9766
> https://issues.apache.org/jira/browse/MESOS-9766
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This test fails on master prior to applying the fix for MESOS-9766.
> It attempts to ensure that processes are terminated after the
> /__processes__ handler dispatches to them.
> 
> 
> Diffs
> -
> 
>   3rdparty/libprocess/src/tests/process_tests.cpp 
> 60f3dd653153c2b2ccf9c3a7eae8c75fd6ff025c 
> 
> 
> Diff: https://reviews.apache.org/r/70595/diff/4/
> 
> 
> Testing
> ---
> 
> Ran in repetition, although it appears to consistently fail on master without 
> repetition needed.
> 
> 
> Thanks,
> 
> Benjamin Mahler
> 
>



Re: Review Request 70595: Adds a regression test for MESOS-9766.

2019-05-06 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70595/
---

(Updated May 6, 2019, 10:02 p.m.)


Review request for mesos, Alexander Rukletsov and Chun-Hung Hsiao.


Changes
---

Use a managed process and avoid an additional race.


Bugs: MESOS-9766
https://issues.apache.org/jira/browse/MESOS-9766


Repository: mesos


Description
---

This test fails on master prior to applying the fix for MESOS-9766.
It attempts to ensure that processes are terminated after the
/__processes__ handler dispatches to them.


Diffs (updated)
-

  3rdparty/libprocess/src/tests/process_tests.cpp 
60f3dd653153c2b2ccf9c3a7eae8c75fd6ff025c 


Diff: https://reviews.apache.org/r/70595/diff/3/

Changes: https://reviews.apache.org/r/70595/diff/2-3/


Testing
---

Ran in repetition, although it appears to consistently fail on master without 
repetition needed.


Thanks,

Benjamin Mahler



Re: Review Request 70595: Adds a regression test for MESOS-9766.

2019-05-06 Thread Chun-Hung Hsiao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70595/#review215074
---


Fix it, then Ship it!




Thanks for making this test more deterministic! Got some comments, but IMO this 
test is robust enough in practice so please feel free to drop my issues.


3rdparty/libprocess/src/tests/process_tests.cpp
Lines 2122 (patched)


If you want libprocess to manager this actor, you can do the following:
```
PID pid = spawn(new TestProcess(), true);

Future waited = dispatch(pid, &TestProcess::wait_for_terminate);
```
Then you won't need to do `process::wait` and `delete` later.

If you do so, you don't even need `waited`.



3rdparty/libprocess/src/tests/process_tests.cpp
Lines 2138 (patched)


Although highly unlikely, it is theoratically possible that the terminate 
event is enqueued before any event of `process` got dequeued. If this happen, 
this await would fail even w/ the fix.

A possible way to fix this is to set up some promise in 
`wait_for_terminate`, and then wait for the corresponding future before calling 
`terminate` in this test.

That said, fixing this flake that's unlikely to happen might not worth it, 
so please feel free to drop this issue.


- Chun-Hung Hsiao


On May 6, 2019, 6:34 p.m., Benjamin Mahler wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70595/
> ---
> 
> (Updated May 6, 2019, 6:34 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov and Chun-Hung Hsiao.
> 
> 
> Bugs: MESOS-9766
> https://issues.apache.org/jira/browse/MESOS-9766
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This test fails on master prior to applying the fix for MESOS-9766.
> It attempts to ensure that processes are terminated after the
> /__processes__ handler dispatches to them.
> 
> 
> Diffs
> -
> 
>   3rdparty/libprocess/src/tests/process_tests.cpp 
> 60f3dd653153c2b2ccf9c3a7eae8c75fd6ff025c 
> 
> 
> Diff: https://reviews.apache.org/r/70595/diff/2/
> 
> 
> Testing
> ---
> 
> Ran in repetition, although it appears to consistently fail on master without 
> repetition needed.
> 
> 
> Thanks,
> 
> Benjamin Mahler
> 
>



Re: Review Request 70549: Added authorization for `UpdateQuota` call in the master.

2019-05-06 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70549/#review215073
---




include/mesos/authorizer/authorizer.proto
Lines 142-143 (patched)


An alternative would have been to stick with UPDATE_QUOTA and look for 
whichever of quota_info and quota_config is set.

After thinking about it, I'm guessing we don't want to do that since third 
party code is looking at these protos and might behave badly with the missing 
field..? Whereas it would fail more clearly with an unknown Action enum value?

Perhaps we can spell this out so it's clear in the code and description?


- Benjamin Mahler


On April 25, 2019, 11:23 p.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70549/
> ---
> 
> (Updated April 25, 2019, 11:23 p.m.)
> 
> 
> Review request for mesos, Andrei Sekretenko and Benjamin Mahler.
> 
> 
> Bugs: MESOS-9640
> https://issues.apache.org/jira/browse/MESOS-9640
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> A new authorizable action `update_quota_configs` is added.
> This disambiguates with the old action `update_quota`
> which are used for the old `SetQuota` and
> `RemoveQuota` calls.
> 
> 
> Diffs
> -
> 
>   include/mesos/authorizer/authorizer.proto 
> e2740c402732bb37db991ec92b9301e58b33215b 
>   src/master/master.hpp 7d9732f1e432f6f0290d234242864cbdbf381fa8 
>   src/master/quota_handler.cpp 5d449e6f027a69ccaa0ac3473ea4cf57441601f3 
> 
> 
> Diff: https://reviews.apache.org/r/70549/diff/1/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 70550: Added `UpdateQuotaConfig` authorization in the local authorizor.

2019-05-06 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70550/#review215072
---




include/mesos/authorizer/acls.proto
Lines 135-143 (original), 135-152 (patched)


Why do we need the two idential messages?

Seems like this needs to be spelled out in the comments and in the commit 
description, as it's not obvious


- Benjamin Mahler


On April 25, 2019, 11:14 p.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70550/
> ---
> 
> (Updated April 25, 2019, 11:14 p.m.)
> 
> 
> Review request for mesos, Andrei Sekretenko, Benjamin Mahler, and Till 
> Toenshoff.
> 
> 
> Bugs: MESOS-9640
> https://issues.apache.org/jira/browse/MESOS-9640
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Also added a test.
> 
> 
> Diffs
> -
> 
>   include/mesos/authorizer/acls.proto 
> 85a3979bc02ab0145a0934f29eeef7bb65f3c42c 
>   include/mesos/authorizer/authorizer.hpp 
> a86a6eeb592adfc267dcf3faef40e8da3471feaf 
>   src/authorizer/local/authorizer.cpp 
> ec86f053357e4bc85532ea4073b57b272ffc74fe 
>   src/tests/authorization_tests.cpp cad76f664c02dc778884d9850184b672e28f5fee 
> 
> 
> Diff: https://reviews.apache.org/r/70550/diff/1/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 70570: Logged headroom related info in the allocator.

2019-05-06 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70570/#review215071
---



Hm.. don't we need to track the amount held back during the quota allocation? 
What are your thoughts on that?


src/master/allocator/mesos/hierarchical.cpp
Lines 1800 (patched)


"and available"
or
" ; available"



src/master/allocator/mesos/hierarchical.cpp
Lines 1930-1942 (original), 1934-1946 (patched)


Isn't this also holding back for headroom?



src/master/allocator/mesos/hierarchical.cpp
Lines 2012-2019 (patched)


Probably clearer with a single comment above the two variables about what 
we track for logging purposes?



src/master/allocator/mesos/hierarchical.cpp
Lines 2018 (patched)


s/these/this/ ?



src/master/allocator/mesos/hierarchical.cpp
Lines 2019 (patched)


size_t?



src/master/allocator/mesos/hierarchical.cpp
Lines 2090 (patched)


pre-increment?



src/master/allocator/mesos/hierarchical.cpp
Lines 2122 (patched)


were held back



src/master/allocator/mesos/hierarchical.cpp
Lines 2123 (patched)


"for that" -> "to ensure sufficient headroom"



src/master/allocator/mesos/hierarchical.cpp
Lines 2124 (patched)


"were filtered"


- Benjamin Mahler


On April 30, 2019, 1 a.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70570/
> ---
> 
> (Updated April 30, 2019, 1 a.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Bugs: MESOS-9759
> https://issues.apache.org/jira/browse/MESOS-9759
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Logged headroom related info in the allocator.
> 
> 
> Diffs
> -
> 
>   src/master/allocator/mesos/hierarchical.cpp 
> 64a076ddd29711437d539a06bb0470755828cc87 
> 
> 
> Diff: https://reviews.apache.org/r/70570/diff/3/
> 
> 
> Testing
> ---
> 
> N/A
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 70595: Adds a regression test for MESOS-9766.

2019-05-06 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70595/#review215070
---



Patch looks great!

Reviews applied: [70594, 70595]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose --disable-libtool-wrappers 
--disable-parallel-test-execution' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker-build.sh

- Mesos Reviewbot


On May 6, 2019, 6:34 p.m., Benjamin Mahler wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70595/
> ---
> 
> (Updated May 6, 2019, 6:34 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov and Chun-Hung Hsiao.
> 
> 
> Bugs: MESOS-9766
> https://issues.apache.org/jira/browse/MESOS-9766
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This test fails on master prior to applying the fix for MESOS-9766.
> It attempts to ensure that processes are terminated after the
> /__processes__ handler dispatches to them.
> 
> 
> Diffs
> -
> 
>   3rdparty/libprocess/src/tests/process_tests.cpp 
> 60f3dd653153c2b2ccf9c3a7eae8c75fd6ff025c 
> 
> 
> Diff: https://reviews.apache.org/r/70595/diff/2/
> 
> 
> Testing
> ---
> 
> Ran in repetition, although it appears to consistently fail on master without 
> repetition needed.
> 
> 
> Thanks,
> 
> Benjamin Mahler
> 
>



Re: Review Request 70591: Introduced struct RoleInfo to track reservations and framework IDs in one map.

2019-05-06 Thread Meng Zhu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70591/#review215069
---



Regarding the commit message. We usually write the commit message more formally 
and leave out the discussion part. The discussion should happen while working 
on the patch and the commit description should only include the discussion 
result/decision. My take of an ideal commit message is (1) describe the change 
(2) (optional) document the current (old) way (3) why the change (simplify, 
perf improvement...) (4) (optional) benchmark result if any.

Also, we try to align the length with the summary (i.e. 80char length). How 
about:

This patch introduces `struct RoleInfo` in the allocator which contains the
framework IDs and reservations tied to a role. A single hashmap using the struct
replaces the two existing separate hashmaps. This simplifies the role tracking
logic in the allocator. The patch introduces minimal to no performance impact.

- Meng Zhu


On May 3, 2019, 8:50 a.m., Andrei Sekretenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70591/
> ---
> 
> (Updated May 3, 2019, 8:50 a.m.)
> 
> 
> Review request for mesos, Benjamin Mahler and Meng Zhu.
> 
> 
> Bugs: MESOS-9701
> https://issues.apache.org/jira/browse/MESOS-9701
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch introduces `struct RoleInfo` which contains the framework IDs and 
> reservations tied to the role and replaces `roles` and 
> `reservationScalarQuantities` hashmaps with `hashmap`.
> 
> I personally do not like the name `RoleInfo` and would appreciate any better 
> naming ideas. 
> However, naming this entity `Role` would have been even worse: it would imply 
> using `const Role& role` throughout the code, which would require changing 
> all the places which use `std::string role` as a key...
> 
> 
> Diffs
> -
> 
>   src/master/allocator/mesos/hierarchical.hpp 
> c2058baca5159da4cdcab77afd5de3c0d5ae6c48 
>   src/master/allocator/mesos/hierarchical.cpp 
> 64a076ddd29711437d539a06bb0470755828cc87 
> 
> 
> Diff: https://reviews.apache.org/r/70591/diff/1/
> 
> 
> Testing
> ---
> 
> make check
> 
> Benchmarking: 5 runs of 
> `BENCHMARK_HierarchicalAllocator_WithQuotaParam.LargeAndSmallQuota/5` with 
> the optimized build.
> 
> Performance impact in this benchmark seems to be negligible.
> 
> BEFORE:
> Added 3000 agents in 51.553929ms
> Added 3000 frameworks in 15.174748344secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 12.400805171secs
> Made 0 allocation in 12.5850238secs
> 
> Added 3000 agents in 55.739336ms
> Added 3000 frameworks in 14.730404769secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 12.563439682secs
> Made 0 allocation in 13.063555055secs
> 
> Added 3000 agents in 54.414733ms
> Added 3000 frameworks in 15.10136842secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 13.501664915secs
> Made 0 allocation in 12.89034382secs
> 
> Added 3000 agents in 52.58252ms
> Added 3000 frameworks in 14.048350298secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 12.299952145secs
> Made 0 allocation in 11.888248811secs
> 
> Added 3000 agents in 52.821439ms
> Added 3000 frameworks in 15.344450583secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 12.63425secs
> Made 0 allocation in 12.427171541secs
> 
> 
> AFTER:
> 
> Added 3000 agents in 69.716648ms
> Added 3000 frameworks in 15.249001979secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 12.860494226secs
> Made 0 allocation in 12.228866329secs
> 
> Added 3000 agents in 52.639388ms
> Added 3000 frameworks in 15.207895482secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 12.504777266secs
> Made 0 allocation in 12.70388062secs
> 
> Added 3000 agents in 56.865794ms
> Added 3000 frameworks in 15.284003915secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 13.86859815secs
> Made 0 allocation in 12.538958231secs
> 
> Added 3000 agents in 56.028013ms
> Added 3000 frameworks in 13.892577869secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 12.341724418secs
> Made 0 allocation in 12.23022189secs
> 
> Added 3000 agents in 52.368219ms
> Added 3000 frameworks in 13.978581104secs
> Benchmark setup: 3000 agents, 3000 roles, 3

Re: Review Request 70591: Introduced struct RoleInfo to track reservations and framework IDs in one map.

2019-05-06 Thread Meng Zhu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70591/#review215068
---




src/master/allocator/mesos/hierarchical.hpp
Line 542 (original), 555 (patched)


s/.../and when there are reservations tied to this role/



src/master/allocator/mesos/hierarchical.hpp
Line 543 (original), 556 (patched)


Calling it `roles`?



src/master/allocator/mesos/hierarchical.cpp
Lines 1716 (patched)


newline below



src/master/allocator/mesos/hierarchical.cpp
Line 2119 (original), 2122 (patched)


Shouldn't this be:

```
if (!roles.contains(role) || roles.at(role).frameworks.empty()) {
  
}
```



src/master/allocator/mesos/hierarchical.cpp
Line 2610 (original), 2614 (patched)


not yours, but I found this shadowing hurting readibility. Maybe we can 
sneak this fix in?


- Meng Zhu


On May 3, 2019, 8:50 a.m., Andrei Sekretenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70591/
> ---
> 
> (Updated May 3, 2019, 8:50 a.m.)
> 
> 
> Review request for mesos, Benjamin Mahler and Meng Zhu.
> 
> 
> Bugs: MESOS-9701
> https://issues.apache.org/jira/browse/MESOS-9701
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch introduces `struct RoleInfo` which contains the framework IDs and 
> reservations tied to the role and replaces `roles` and 
> `reservationScalarQuantities` hashmaps with `hashmap`.
> 
> I personally do not like the name `RoleInfo` and would appreciate any better 
> naming ideas. 
> However, naming this entity `Role` would have been even worse: it would imply 
> using `const Role& role` throughout the code, which would require changing 
> all the places which use `std::string role` as a key...
> 
> 
> Diffs
> -
> 
>   src/master/allocator/mesos/hierarchical.hpp 
> c2058baca5159da4cdcab77afd5de3c0d5ae6c48 
>   src/master/allocator/mesos/hierarchical.cpp 
> 64a076ddd29711437d539a06bb0470755828cc87 
> 
> 
> Diff: https://reviews.apache.org/r/70591/diff/1/
> 
> 
> Testing
> ---
> 
> make check
> 
> Benchmarking: 5 runs of 
> `BENCHMARK_HierarchicalAllocator_WithQuotaParam.LargeAndSmallQuota/5` with 
> the optimized build.
> 
> Performance impact in this benchmark seems to be negligible.
> 
> BEFORE:
> Added 3000 agents in 51.553929ms
> Added 3000 frameworks in 15.174748344secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 12.400805171secs
> Made 0 allocation in 12.5850238secs
> 
> Added 3000 agents in 55.739336ms
> Added 3000 frameworks in 14.730404769secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 12.563439682secs
> Made 0 allocation in 13.063555055secs
> 
> Added 3000 agents in 54.414733ms
> Added 3000 frameworks in 15.10136842secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 13.501664915secs
> Made 0 allocation in 12.89034382secs
> 
> Added 3000 agents in 52.58252ms
> Added 3000 frameworks in 14.048350298secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 12.299952145secs
> Made 0 allocation in 11.888248811secs
> 
> Added 3000 agents in 52.821439ms
> Added 3000 frameworks in 15.344450583secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 12.63425secs
> Made 0 allocation in 12.427171541secs
> 
> 
> AFTER:
> 
> Added 3000 agents in 69.716648ms
> Added 3000 frameworks in 15.249001979secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 12.860494226secs
> Made 0 allocation in 12.228866329secs
> 
> Added 3000 agents in 52.639388ms
> Added 3000 frameworks in 15.207895482secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 12.504777266secs
> Made 0 allocation in 12.70388062secs
> 
> Added 3000 agents in 56.865794ms
> Added 3000 frameworks in 15.284003915secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 13.86859815secs
> Made 0 allocation in 12.538958231secs
> 
> Added 3000 agents in 56.028013ms
> Added 3000 frameworks in 13.892577869secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 12.341724418secs
> Made 0 allocation in 12.2

Re: Review Request 70595: Adds a regression test for MESOS-9766.

2019-05-06 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70595/
---

(Updated May 6, 2019, 6:34 p.m.)


Review request for mesos, Alexander Rukletsov and Chun-Hung Hsiao.


Changes
---

Made the test not rely on racing behavior.


Bugs: MESOS-9766
https://issues.apache.org/jira/browse/MESOS-9766


Repository: mesos


Description
---

This test fails on master prior to applying the fix for MESOS-9766.
It attempts to ensure that processes are terminated after the
/__processes__ handler dispatches to them.


Diffs (updated)
-

  3rdparty/libprocess/src/tests/process_tests.cpp 
60f3dd653153c2b2ccf9c3a7eae8c75fd6ff025c 


Diff: https://reviews.apache.org/r/70595/diff/2/

Changes: https://reviews.apache.org/r/70595/diff/1-2/


Testing
---

Ran in repetition, although it appears to consistently fail on master without 
repetition needed.


Thanks,

Benjamin Mahler



Re: Review Request 70595: Adds a regression test for MESOS-9766.

2019-05-06 Thread Benjamin Mahler


> On May 3, 2019, 11:31 p.m., Chun-Hung Hsiao wrote:
> > 3rdparty/libprocess/src/tests/process_tests.cpp
> > Lines 2092-2108 (patched)
> > 
> >
> > I tried this unit test w/o the fix in the following settings:
> > 1. `--gtest_repeat=100`, which resulted in 3 passes and 97 failures.
> > 2. Invoked the test w/o repetition in a for-loop, which resulted in 60 
> > passes and 40 failures.
> > 
> > So it seems to me that this test passes fairly easily if it only runs 
> > once.
> > 
> > Then I replace this snippet with the following code, which just used 1 
> > actor but failed every time in the above two settings:
> > ```
> >   UPID pid = spawn(new ProcessBase(), true);
> > 
> >   dispatch(pid, [] { os::sleep(Milliseconds(50)); });
> > 
> >   http::URL url = http::URL(
> >   "http",
> >   process::address().ip,
> >   process::address().port,
> >   "/__processes__");
> > 
> >   Future response = http::get(url);
> > 
> >   terminate(pid, true);
> > ```
> 
> Alexander Rukletsov wrote:
> Agree with Chun, we should avoid races as much as we can. Several 
> suggestions to further improve Chun's version:
> - Pause the clock in the beginnig. We can then sleep for 42 minutes to 
> make sure the first message is still being processes when `http::get()` is 
> invoked (will of course require `settle()` before).
> - `settle()` after `http::get` to ensure `/__processes__` handler is put 
> into the `pid`'s mailbaox.
> - Check that the resulted JSON does not include the entry for `pid`, 
> i.e., is empty.
> 
> Chun-Hung Hsiao wrote:
> I'm not sure if adding settles works:
> 1. Adding a `settle` before `http::get` makes it wait for the completion 
> of the sleep.
> 2. Adding a `settle` after `http::get` makes `terminate` wait for the 
> completion of the sleep.
> IIUC both will increase the likelihood of the test passing w/o che fix.

Thanks guys, I will upload a test that doesn't rely on racing, we could also go 
with Chun's simpler test which relies on racing but will only rarely not 
exercise what we want.


> On May 3, 2019, 11:31 p.m., Chun-Hung Hsiao wrote:
> > 3rdparty/libprocess/src/tests/process_tests.cpp
> > Lines 2110-2112 (patched)
> > 
> >
> > `AWAIT_EXPECT_RESPONSE_STATUS_EQ(http::OK().status, response);`

Now that I look at it, it looks less clear/readable with the long macro.


- Benjamin


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70595/#review215035
---


On May 3, 2019, 7:58 p.m., Benjamin Mahler wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70595/
> ---
> 
> (Updated May 3, 2019, 7:58 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov and Chun-Hung Hsiao.
> 
> 
> Bugs: MESOS-9766
> https://issues.apache.org/jira/browse/MESOS-9766
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This test fails on master prior to applying the fix for MESOS-9766.
> It attempts to ensure that processes are terminated after the
> /__processes__ handler dispatches to them.
> 
> 
> Diffs
> -
> 
>   3rdparty/libprocess/src/tests/process_tests.cpp 
> 60f3dd653153c2b2ccf9c3a7eae8c75fd6ff025c 
> 
> 
> Diff: https://reviews.apache.org/r/70595/diff/1/
> 
> 
> Testing
> ---
> 
> Ran in repetition, although it appears to consistently fail on master without 
> repetition needed.
> 
> 
> Thanks,
> 
> Benjamin Mahler
> 
>



Re: Review Request 70598: Added blogpost for 1.8.0 release.

2019-05-06 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70598/#review215065
---


Fix it, then Ship it!




Thanks Benno!

Showing stats seems interesting, maybe just things like JIRA ticket count, # of 
commits, # lines added / deleted, as opposed to which files and folders?


site/source/blog/2019-05-02-mesos-1-8-0-released.md
Lines 13-14 (patched)


number of commits as well?



site/source/blog/2019-05-02-mesos-1-8-0-released.md
Lines 14 (patched)


the period here will lead people to interpret it as 60 rather than 6 
(in american english at least..), you could use "k" or nothing?



site/source/blog/2019-05-02-mesos-1-8-0-released.md
Lines 16-49 (patched)


These snippets should probably be in triple backtick sections to ensure 
they are monospaced and lined up when rendered?



site/source/blog/2019-05-02-mesos-1-8-0-released.md
Lines 38-49 (patched)


is this lines of code? commits touching the files? something else?

the subfolder stuff seems a bit bizarre here, for example, is the number 
for src/ `src - src subfolders shown above`?

Perhaps a hierarchically displayed version will be less prone to confusion?

```
30.2% src/
  11.0% src/master
   7.8% src/cli
4.0% support/
   3.7% support/python3/

```

But then 11% here isn't clear if it's 11% of src's 30% or 11% of the 
overall changes.

I don't have any good suggestions, but I think the summary stats are 
interesting. Curious how other projects display them..



site/source/blog/2019-05-02-mesos-1-8-0-released.md
Lines 54 (patched)


newline?



site/source/blog/2019-05-02-mesos-1-8-0-released.md
Lines 59 (patched)


s/F/f/ or should probably just say scheduler to stay consistent with the 
rest of this section and to be clear that it's the scheduler part of the 
framework



site/source/blog/2019-05-02-mesos-1-8-0-released.md
Lines 141 (patched)


extra newline here?


- Benjamin Mahler


On May 6, 2019, 2:47 p.m., Benno Evers wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70598/
> ---
> 
> (Updated May 6, 2019, 2:47 p.m.)
> 
> 
> Review request for mesos.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added blogpost for 1.8.0 release.
> 
> 
> Diffs
> -
> 
>   site/source/blog/2019-05-02-mesos-1-8-0-released.md PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/70598/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Benno Evers
> 
>



Re: Review Request 70595: Adds a regression test for MESOS-9766.

2019-05-06 Thread Chun-Hung Hsiao


> On May 3, 2019, 11:31 p.m., Chun-Hung Hsiao wrote:
> > 3rdparty/libprocess/src/tests/process_tests.cpp
> > Lines 2092-2108 (patched)
> > 
> >
> > I tried this unit test w/o the fix in the following settings:
> > 1. `--gtest_repeat=100`, which resulted in 3 passes and 97 failures.
> > 2. Invoked the test w/o repetition in a for-loop, which resulted in 60 
> > passes and 40 failures.
> > 
> > So it seems to me that this test passes fairly easily if it only runs 
> > once.
> > 
> > Then I replace this snippet with the following code, which just used 1 
> > actor but failed every time in the above two settings:
> > ```
> >   UPID pid = spawn(new ProcessBase(), true);
> > 
> >   dispatch(pid, [] { os::sleep(Milliseconds(50)); });
> > 
> >   http::URL url = http::URL(
> >   "http",
> >   process::address().ip,
> >   process::address().port,
> >   "/__processes__");
> > 
> >   Future response = http::get(url);
> > 
> >   terminate(pid, true);
> > ```
> 
> Alexander Rukletsov wrote:
> Agree with Chun, we should avoid races as much as we can. Several 
> suggestions to further improve Chun's version:
> - Pause the clock in the beginnig. We can then sleep for 42 minutes to 
> make sure the first message is still being processes when `http::get()` is 
> invoked (will of course require `settle()` before).
> - `settle()` after `http::get` to ensure `/__processes__` handler is put 
> into the `pid`'s mailbaox.
> - Check that the resulted JSON does not include the entry for `pid`, 
> i.e., is empty.

I'm not sure if adding settles works:
1. Adding a `settle` before `http::get` makes it wait for the completion of the 
sleep.
2. Adding a `settle` after `http::get` makes `terminate` wait for the 
completion of the sleep.
IIUC both will increase the likelihood of the test passing w/o che fix.


- Chun-Hung


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70595/#review215035
---


On May 3, 2019, 7:58 p.m., Benjamin Mahler wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70595/
> ---
> 
> (Updated May 3, 2019, 7:58 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov and Chun-Hung Hsiao.
> 
> 
> Bugs: MESOS-9766
> https://issues.apache.org/jira/browse/MESOS-9766
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This test fails on master prior to applying the fix for MESOS-9766.
> It attempts to ensure that processes are terminated after the
> /__processes__ handler dispatches to them.
> 
> 
> Diffs
> -
> 
>   3rdparty/libprocess/src/tests/process_tests.cpp 
> 60f3dd653153c2b2ccf9c3a7eae8c75fd6ff025c 
> 
> 
> Diff: https://reviews.apache.org/r/70595/diff/1/
> 
> 
> Testing
> ---
> 
> Ran in repetition, although it appears to consistently fail on master without 
> repetition needed.
> 
> 
> Thanks,
> 
> Benjamin Mahler
> 
>



Re: Review Request 70598: Added blogpost for 1.8.0 release.

2019-05-06 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70598/#review215062
---



Bad review!

Reviews applied: []

Error:
No reviewers specified. Please find a reviewer by asking on JIRA or the mailing 
list.

- Mesos Reviewbot


On May 6, 2019, 2:47 p.m., Benno Evers wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70598/
> ---
> 
> (Updated May 6, 2019, 2:47 p.m.)
> 
> 
> Review request for mesos.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added blogpost for 1.8.0 release.
> 
> 
> Diffs
> -
> 
>   site/source/blog/2019-05-02-mesos-1-8-0-released.md PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/70598/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Benno Evers
> 
>



Review Request 70598: Added blogpost for 1.8.0 release.

2019-05-06 Thread Benno Evers

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70598/
---

Review request for mesos.


Repository: mesos


Description
---

Added blogpost for 1.8.0 release.


Diffs
-

  site/source/blog/2019-05-02-mesos-1-8-0-released.md PRE-CREATION 


Diff: https://reviews.apache.org/r/70598/diff/1/


Testing
---


Thanks,

Benno Evers



Re: Review Request 70596: Launched tasks with more memory in SLRP unit tests.

2019-05-06 Thread Benjamin Bannier

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70596/#review215058
---


Fix it, then Ship it!




Ship It!


src/tests/storage_local_resource_provider_tests.cpp
Line 1 (original), 1 (patched)


Please make sure your commit message passes the linter (e.g., needs to be 
wrapped at 72 chars).



src/tests/storage_local_resource_provider_tests.cpp
Line 1795 (original), 1795 (patched)


Could we move this into a test class-specific constant?

Here and below.


- Benjamin Bannier


On May 3, 2019, 11:09 p.m., Chun-Hung Hsiao wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70596/
> ---
> 
> (Updated May 3, 2019, 11:09 p.m.)
> 
> 
> Review request for mesos, Benjamin Bannier and Jan Schlicht.
> 
> 
> Bugs: MESOS-9765
> https://issues.apache.org/jira/browse/MESOS-9765
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Raised the task memory to 128MB (which are the value used in most persistent
> volume tests) in all SLRP tests that launch tasks to avoid OOM.
> 
> 
> Diffs
> -
> 
>   src/tests/storage_local_resource_provider_tests.cpp 
> ecd7aeef1cb3d1a5b4b3419dfd912d41a8c6 
> 
> 
> Diff: https://reviews.apache.org/r/70596/diff/1/
> 
> 
> Testing
> ---
> 
> sudo make check
> 
> 
> Thanks,
> 
> Chun-Hung Hsiao
> 
>



Re: Review Request 70595: Adds a regression test for MESOS-9766.

2019-05-06 Thread Alexander Rukletsov


> On May 3, 2019, 11:31 p.m., Chun-Hung Hsiao wrote:
> > 3rdparty/libprocess/src/tests/process_tests.cpp
> > Lines 2092-2108 (patched)
> > 
> >
> > I tried this unit test w/o the fix in the following settings:
> > 1. `--gtest_repeat=100`, which resulted in 3 passes and 97 failures.
> > 2. Invoked the test w/o repetition in a for-loop, which resulted in 60 
> > passes and 40 failures.
> > 
> > So it seems to me that this test passes fairly easily if it only runs 
> > once.
> > 
> > Then I replace this snippet with the following code, which just used 1 
> > actor but failed every time in the above two settings:
> > ```
> >   UPID pid = spawn(new ProcessBase(), true);
> > 
> >   dispatch(pid, [] { os::sleep(Milliseconds(50)); });
> > 
> >   http::URL url = http::URL(
> >   "http",
> >   process::address().ip,
> >   process::address().port,
> >   "/__processes__");
> > 
> >   Future response = http::get(url);
> > 
> >   terminate(pid, true);
> > ```

Agree with Chun, we should avoid races as much as we can. Several suggestions 
to further improve Chun's version:
- Pause the clock in the beginnig. We can then sleep for 42 minutes to make 
sure the first message is still being processes when `http::get()` is invoked 
(will of course require `settle()` before).
- `settle()` after `http::get` to ensure `/__processes__` handler is put into 
the `pid`'s mailbaox.
- Check that the resulted JSON does not include the entry for `pid`, i.e., is 
empty.


- Alexander


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70595/#review215035
---


On May 3, 2019, 7:58 p.m., Benjamin Mahler wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70595/
> ---
> 
> (Updated May 3, 2019, 7:58 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov and Chun-Hung Hsiao.
> 
> 
> Bugs: MESOS-9766
> https://issues.apache.org/jira/browse/MESOS-9766
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This test fails on master prior to applying the fix for MESOS-9766.
> It attempts to ensure that processes are terminated after the
> /__processes__ handler dispatches to them.
> 
> 
> Diffs
> -
> 
>   3rdparty/libprocess/src/tests/process_tests.cpp 
> 60f3dd653153c2b2ccf9c3a7eae8c75fd6ff025c 
> 
> 
> Diff: https://reviews.apache.org/r/70595/diff/1/
> 
> 
> Testing
> ---
> 
> Ran in repetition, although it appears to consistently fail on master without 
> repetition needed.
> 
> 
> Thanks,
> 
> Benjamin Mahler
> 
>



Re: Review Request 70184: Fixed flakiness in 'RetryRpcWithExponentialBackoff'.

2019-05-06 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70184/#review215060
---



Patch looks great!

Reviews applied: [70184]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose --disable-libtool-wrappers 
--disable-parallel-test-execution' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker-build.sh

- Mesos Reviewbot


On May 6, 2019, 9:48 a.m., Jan Schlicht wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70184/
> ---
> 
> (Updated May 6, 2019, 9:48 a.m.)
> 
> 
> Review request for mesos and Chun-Hung Hsiao.
> 
> 
> Bugs: MESOS-9594
> https://issues.apache.org/jira/browse/MESOS-9594
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Under some circumstances, offers would be filtered, resulting in the
> test being stuck while waiting for offers. This has been resolved by
> settling the clock before accepting new offers.
> 
> 
> Diffs
> -
> 
>   src/tests/storage_local_resource_provider_tests.cpp 
> ecd7aeef1cb3d1a5b4b3419dfd912d41a8c6 
> 
> 
> Diff: https://reviews.apache.org/r/70184/diff/2/
> 
> 
> Testing
> ---
> 
> make check
> 
> I could not reproduce the flaky behavior of this test case, hence only assume 
> that this patch resolves the flakiness.
> 
> 
> Thanks,
> 
> Jan Schlicht
> 
>



Re: Review Request 70594: Fixed an issue where /__processes__ never returns a response.

2019-05-06 Thread Alexander Rukletsov

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70594/#review215059
---


Ship it!




Slick solution, thanks, Ben!

- Alexander Rukletsov


On May 3, 2019, 7:58 p.m., Benjamin Mahler wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70594/
> ---
> 
> (Updated May 3, 2019, 7:58 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov and Chun-Hung Hsiao.
> 
> 
> Bugs: MESOS-9766
> https://issues.apache.org/jira/browse/MESOS-9766
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> It's possible for /__processes__ to never return a response if
> a process is terminated after the /__processes__ handler dispatches
> to it, thus leading the Future to be abandoned.
> 
> This patch ensures we ignore those cases, rather than wait forever.
> 
> See MESOS-9766.
> 
> 
> Diffs
> -
> 
>   3rdparty/libprocess/src/process.cpp 
> 124836472313721a5dbfe4b1ca55f0da3cecd66b 
> 
> 
> Diff: https://reviews.apache.org/r/70594/diff/1/
> 
> 
> Testing
> ---
> 
> Added a test in the subsequent patch.
> 
> 
> Thanks,
> 
> Benjamin Mahler
> 
>



Re: Review Request 70184: Fixed flakiness in 'RetryRpcWithExponentialBackoff'.

2019-05-06 Thread Jan Schlicht

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70184/
---

(Updated May 6, 2019, 11:48 a.m.)


Review request for mesos and Chun-Hung Hsiao.


Changes
---

Addressed issues.


Bugs: MESOS-9594
https://issues.apache.org/jira/browse/MESOS-9594


Repository: mesos


Description (updated)
---

Under some circumstances, offers would be filtered, resulting in the
test being stuck while waiting for offers. This has been resolved by
settling the clock before accepting new offers.


Diffs (updated)
-

  src/tests/storage_local_resource_provider_tests.cpp 
ecd7aeef1cb3d1a5b4b3419dfd912d41a8c6 


Diff: https://reviews.apache.org/r/70184/diff/2/

Changes: https://reviews.apache.org/r/70184/diff/1-2/


Testing
---

make check

I could not reproduce the flaky behavior of this test case, hence only assume 
that this patch resolves the flakiness.


Thanks,

Jan Schlicht