Re: Review Request 71111: Added offer rescind logic for guarantees enforcement.

2019-07-23 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7/#review216817
---


Fix it, then Ship it!




Ditto here, could you add the caveat about the under-rescind to the commit 
description?


src/master/quota_handler.cpp
Lines 662-667 (patched)


Thanks!



src/tests/master_quota_tests.cpp
Lines 1733 (patched)


Do you mean Offered here?



src/tests/master_quota_tests.cpp
Lines 1735-1736 (patched)


This seems a bit weak as a test? But I suppose we don't want to write a 
test that's revealing the quirks of the current under/over rescind 
implementation?



src/tests/master_quota_tests.cpp
Lines 1736 (patched)


Outstanding


- Benjamin Mahler


On July 23, 2019, 8:39 p.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/7/
> ---
> 
> (Updated July 23, 2019, 8:39 p.m.)
> 
> 
> Review request for mesos, Andrei Sekretenko and Benjamin Mahler.
> 
> 
> Bugs: MESOS-9812
> https://issues.apache.org/jira/browse/MESOS-9812
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Outstanding offers need to be rescinded as needed to ensure
> roles' requested guarantees can be satisfied.
> 
> For simplicity, and also due to the race between the master and
> the allocator, we pessimistically assume that what seems like
> "available" resources in the allocator are all gone. We greedily
> rescind offers until we can satisfy the guarantees.
> 
> Also added a test.
> 
> 
> Diffs
> -
> 
>   src/master/quota_handler.cpp 2a3ca5640c850ec8a8aca7abed45386b5b5e1125 
>   src/tests/master_quota_tests.cpp 8105765172f17f3ea50aa09bc66fede8d21365ab 
> 
> 
> Diff: https://reviews.apache.org/r/7/diff/4/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 71111: Added offer rescind logic for guarantees enforcement.

2019-07-23 Thread Meng Zhu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7/
---

(Updated July 23, 2019, 1:39 p.m.)


Review request for mesos, Andrei Sekretenko and Benjamin Mahler.


Changes
---

Added more comments about the best-effort rescinding approach.


Bugs: MESOS-9812
https://issues.apache.org/jira/browse/MESOS-9812


Repository: mesos


Description
---

Outstanding offers need to be rescinded as needed to ensure
roles' requested guarantees can be satisfied.

For simplicity, and also due to the race between the master and
the allocator, we pessimistically assume that what seems like
"available" resources in the allocator are all gone. We greedily
rescind offers until we can satisfy the guarantees.

Also added a test.


Diffs (updated)
-

  src/master/quota_handler.cpp 2a3ca5640c850ec8a8aca7abed45386b5b5e1125 
  src/tests/master_quota_tests.cpp 8105765172f17f3ea50aa09bc66fede8d21365ab 


Diff: https://reviews.apache.org/r/7/diff/4/

Changes: https://reviews.apache.org/r/7/diff/3-4/


Testing
---

make check


Thanks,

Meng Zhu



Re: Review Request 71111: Added offer rescind logic for guarantees enforcement.

2019-07-18 Thread Meng Zhu


> On July 18, 2019, 10:32 a.m., Benjamin Mahler wrote:
> > src/master/quota_handler.cpp
> > Lines 651 (patched)
> > 
> >
> > Ditto last review, but in this case the presence of reservations in 
> > both leads to an under-rescind, which seems less acceptable vs an 
> > over-rescind?

Given that we are being very pessimistic here (assuming no available 
resources), we are more likely to over rescind than under. Hopefully, it is OK. 
See my comment in previous patch as well.


> On July 18, 2019, 10:32 a.m., Benjamin Mahler wrote:
> > src/tests/master_quota_tests.cpp
> > Lines 1691-1693 (patched)
> > 
> >
> > Ditto, probably want to pull out the test to ease reviewing?

This is a small patch. Would prefer to keep the test with it.


- Meng


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7/#review216737
---


On July 17, 2019, 9:30 p.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/7/
> ---
> 
> (Updated July 17, 2019, 9:30 p.m.)
> 
> 
> Review request for mesos, Andrei Sekretenko and Benjamin Mahler.
> 
> 
> Bugs: MESOS-9812
> https://issues.apache.org/jira/browse/MESOS-9812
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Outstanding offers need to be rescinded as needed to ensure
> roles' requested guarantees can be satisfied.
> 
> For simplicity, and also due to the race between the master and
> the allocator, we pessimistically assume that what seems like
> "available" resources in the allocator are all gone. We greedily
> rescind offers until we can satisfy the guarantees.
> 
> Also added a test.
> 
> 
> Diffs
> -
> 
>   src/master/quota_handler.cpp 2a3ca5640c850ec8a8aca7abed45386b5b5e1125 
>   src/tests/master_quota_tests.cpp 64a9b31a38af23fb16a4afd04f6d3413c7d1ffe9 
> 
> 
> Diff: https://reviews.apache.org/r/7/diff/2/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 71111: Added offer rescind logic for guarantees enforcement.

2019-07-18 Thread Meng Zhu


> On July 18, 2019, 3:56 a.m., Andrei Sekretenko wrote:
> > src/master/quota_handler.cpp
> > Lines 664 (patched)
> > 
> >
> > Does this mean that we might rescind an offer even if `consumption + 
> > offered` for the offer's role is within guarantees of that role? 
> > 
> > If yes - do we really have to be THAT greedy?...
> 
> Andrei Sekretenko wrote:
> Edit: s/is within guarantees/is going to be within guarantees/

Yes, it is inaccurate. But calculating a more correct amount would be 
expensive. And I do not feel it is necessary. See more details in my message in 
the public slack: https://mesos.slack.com/archives/C1NEKP60K/p1563498805199200


- Meng


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7/#review216717
---


On July 17, 2019, 9:30 p.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/7/
> ---
> 
> (Updated July 17, 2019, 9:30 p.m.)
> 
> 
> Review request for mesos, Andrei Sekretenko and Benjamin Mahler.
> 
> 
> Bugs: MESOS-9812
> https://issues.apache.org/jira/browse/MESOS-9812
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Outstanding offers need to be rescinded as needed to ensure
> roles' requested guarantees can be satisfied.
> 
> For simplicity, and also due to the race between the master and
> the allocator, we pessimistically assume that what seems like
> "available" resources in the allocator are all gone. We greedily
> rescind offers until we can satisfy the guarantees.
> 
> Also added a test.
> 
> 
> Diffs
> -
> 
>   src/master/quota_handler.cpp 2a3ca5640c850ec8a8aca7abed45386b5b5e1125 
>   src/tests/master_quota_tests.cpp 64a9b31a38af23fb16a4afd04f6d3413c7d1ffe9 
> 
> 
> Diff: https://reviews.apache.org/r/7/diff/2/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 71111: Added offer rescind logic for guarantees enforcement.

2019-07-18 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7/#review216745
---



Patch looks great!

Reviews applied: [71109, 71110, 71061, 71068, 7]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose --disable-libtool-wrappers 
--disable-parallel-test-execution' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker-build.sh

- Mesos Reviewbot


On July 18, 2019, 4:30 a.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/7/
> ---
> 
> (Updated July 18, 2019, 4:30 a.m.)
> 
> 
> Review request for mesos, Andrei Sekretenko and Benjamin Mahler.
> 
> 
> Bugs: MESOS-9812
> https://issues.apache.org/jira/browse/MESOS-9812
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Outstanding offers need to be rescinded as needed to ensure
> roles' requested guarantees can be satisfied.
> 
> For simplicity, and also due to the race between the master and
> the allocator, we pessimistically assume that what seems like
> "available" resources in the allocator are all gone. We greedily
> rescind offers until we can satisfy the guarantees.
> 
> Also added a test.
> 
> 
> Diffs
> -
> 
>   src/master/quota_handler.cpp 2a3ca5640c850ec8a8aca7abed45386b5b5e1125 
>   src/tests/master_quota_tests.cpp 64a9b31a38af23fb16a4afd04f6d3413c7d1ffe9 
> 
> 
> Diff: https://reviews.apache.org/r/7/diff/2/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 71111: Added offer rescind logic for guarantees enforcement.

2019-07-18 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7/#review216737
---




src/master/quota_handler.cpp
Lines 651 (patched)


Ditto last review, but in this case the presence of reservations in both 
leads to an under-rescind, which seems less acceptable vs an over-rescind?



src/tests/master_quota_tests.cpp
Lines 1691-1693 (patched)


Ditto, probably want to pull out the test to ease reviewing?


- Benjamin Mahler


On July 18, 2019, 4:30 a.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/7/
> ---
> 
> (Updated July 18, 2019, 4:30 a.m.)
> 
> 
> Review request for mesos, Andrei Sekretenko and Benjamin Mahler.
> 
> 
> Bugs: MESOS-9812
> https://issues.apache.org/jira/browse/MESOS-9812
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Outstanding offers need to be rescinded as needed to ensure
> roles' requested guarantees can be satisfied.
> 
> For simplicity, and also due to the race between the master and
> the allocator, we pessimistically assume that what seems like
> "available" resources in the allocator are all gone. We greedily
> rescind offers until we can satisfy the guarantees.
> 
> Also added a test.
> 
> 
> Diffs
> -
> 
>   src/master/quota_handler.cpp 2a3ca5640c850ec8a8aca7abed45386b5b5e1125 
>   src/tests/master_quota_tests.cpp 64a9b31a38af23fb16a4afd04f6d3413c7d1ffe9 
> 
> 
> Diff: https://reviews.apache.org/r/7/diff/2/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 71111: Added offer rescind logic for guarantees enforcement.

2019-07-18 Thread Andrei Sekretenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7/#review216717
---




src/master/quota_handler.cpp
Lines 664 (patched)


Does this mean that we might rescind an offer even if `consumption + 
offered` for the offer's role is within guarantees of that role? 

If yes - do we really have to be THAT greedy?...


- Andrei Sekretenko


On July 18, 2019, 4:30 a.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/7/
> ---
> 
> (Updated July 18, 2019, 4:30 a.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Bugs: MESOS-9812
> https://issues.apache.org/jira/browse/MESOS-9812
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Outstanding offers need to be rescinded as needed to ensure
> roles' requested guarantees can be satisfied.
> 
> For simplicity, and also due to the race between the master and
> the allocator, we pessimistically assume that what seems like
> "available" resources in the allocator are all gone. We greedily
> rescind offers until we can satisfy the guarantees.
> 
> Also added a test.
> 
> 
> Diffs
> -
> 
>   src/master/quota_handler.cpp 2a3ca5640c850ec8a8aca7abed45386b5b5e1125 
>   src/tests/master_quota_tests.cpp 64a9b31a38af23fb16a4afd04f6d3413c7d1ffe9 
> 
> 
> Diff: https://reviews.apache.org/r/7/diff/1/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 71111: Added offer rescind logic for guarantees enforcement.

2019-07-18 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7/#review216711
---



Bad patch!

Reviews applied: [71109, 71110, 71061, 71068, 7]

Failed command: ['bash', '-c', "set -o pipefail; export OS='ubuntu:14.04' 
BUILDTOOL='autotools' COMPILER='gcc' CONFIGURATION='--verbose 
--disable-libtool-wrappers --disable-parallel-test-execution' 
ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; ./support/docker-build.sh 2>&1 | tee 
build_7"]

Error:
..
b-ade3-31adcc3bff52 on agent 58665dec-7eb8-49bb-ad94-85d02ec4f434-S0
I0718 06:35:48.936278 18267 http_connection.hpp:131] Sending 2 call to 
http://172.17.0.2:42893/slave(1216)/api/v1/resource_provider
I0718 06:35:48.937249 18275 process.cpp:3671] Handling HTTP event for process 
'slave(1216)' with path: '/slave(1216)/api/v1/resource_provider'
I0718 06:35:48.940191 18257 process.cpp:3671] Handling HTTP event for process 
'master' with path: '/master/api/v1'
I0718 06:35:48.941807 18280 http.cpp:1115] HTTP POST for /master/api/v1 from 
172.17.0.2:52610
I0718 06:35:48.942044 18280 http.cpp:263] Processing call UNRESERVE_RESOURCES
I0718 06:35:48.942693 18280 master.cpp:3888] Authorizing principal 
'test-principal' to unreserve resources 
'[{"disk":{"source":{"id":"/tmp/CSIVersion_StorageLocalResourceProviderTest_OperatorOperationsWithResourceProviderResources_v0_z6gmBZ/2GB-fe75e496-5c41-4dee-b8cd-8a7e5df5283a","mount":{"root":"./csi/org.apache.mesos.csi.test/local/mounts"},"profile":"test","type":"MOUNT","vendor":"org.apache.mesos.csi.test.local"}},"name":"disk","provider_id":{"value":"8e9395d9-3d92-4641-876e-2a95f843d191"},"reservations":[{"role":"storage","type":"DYNAMIC"},{"principal":"test-principal","role":"storage/default-role","type":"DYNAMIC"}],"scalar":{"value":2048.0},"type":"SCALAR"}]'
I0718 06:35:48.944628 18263 master.cpp:12685] Removing offer 
58665dec-7eb8-49bb-ad94-85d02ec4f434-O5
I0718 06:35:48.944752 18276 sched.cpp:960] Rescinded offer 
58665dec-7eb8-49bb-ad94-85d02ec4f434-O5
I0718 06:35:48.944823 18276 sched.cpp:971] Scheduler::offerRescinded took 
21015ns
I0718 06:35:48.945453 18259 hierarchical.cpp:1218] Recovered disk(allocated: 
storage/default-role)(reservations: 
[(DYNAMIC,storage),(DYNAMIC,storage/default-role,test-principal)])[MOUNT(org.apache.mesos.csi.test.local,/tmp/CSIVersion_StorageLocalResourceProviderTest_OperatorOperationsWithResourceProviderResources_v0_z6gmBZ/2GB-fe75e496-5c41-4dee-b8cd-8a7e5df5283a,test)]:2048;
 cpus(allocated: storage/default-role):2; mem(allocated: 
storage/default-role):1024; disk(allocated: storage/default-role):1024; 
ports(allocated: storage/default-role):[31000-32000] (total: cpus:2; mem:1024; 
disk:1024; ports:[31000-32000]; disk(reservations: 
[(DYNAMIC,storage),(DYNAMIC,storage/default-role,test-principal)])[MOUNT(org.apache.mesos.csi.test.local,/tmp/CSIVersion_StorageLocalResourceProviderTest_OperatorOperationsWithResourceProviderResources_v0_z6gmBZ/2GB-fe75e496-5c41-4dee-b8cd-8a7e5df5283a,test)]:2048,
 allocated: {}) on agent 58665dec-7eb8-49bb-ad94-85d02ec4f434-S0 from framework 
58665dec-7eb8-49
 bb-ad94-85d02ec4f434-
I0718 06:35:48.945569 18259 hierarchical.cpp:1264] Framework 
58665dec-7eb8-49bb-ad94-85d02ec4f434- filtered agent 
58665dec-7eb8-49bb-ad94-85d02ec4f434-S0 for 5secs
I0718 06:35:48.948091 18273 master.cpp:12576] Sending operation '' (uuid: 
32631748-d534-4830-af5b-2860ca4a04ce) to agent 
58665dec-7eb8-49bb-ad94-85d02ec4f434-S0 at slave(1216)@172.17.0.2:42893 
(2dfcf722f334)
I0718 06:35:48.948653 18267 slave.cpp:4335] Ignoring new checkpointed resources 
and operations identical to the current version
I0718 06:35:48.951097 18279 hierarchical.cpp:1508] Performed allocation for 1 
agents in 1.161105ms
I0718 06:35:48.951505 18262 provider.cpp:481] Received APPLY_OPERATION event
I0718 06:35:48.951555 18262 provider.cpp:1295] Received UNRESERVE operation '' 
(uuid: 32631748-d534-4830-af5b-2860ca4a04ce)
I0718 06:35:48.951690 18260 master.cpp:10393] Sending offers [ 
58665dec-7eb8-49bb-ad94-85d02ec4f434-O6 ] to framework 
58665dec-7eb8-49bb-ad94-85d02ec4f434- (default) at 
scheduler-4cf5ae9f-d24d-4eae-88a0-dbf9626d4e8c@172.17.0.2:42893
I0718 06:35:48.952296 18261 sched.cpp:934] Scheduler::resourceOffers took 
77288ns
I0718 06:35:48.976239 18268 http.cpp:1115] HTTP POST for 
/slave(1216)/api/v1/resource_provider from 172.17.0.2:52600
I0718 06:35:48.977198 18271 slave.cpp:8406] Handling resource provider message 
'UPDATE_OPERATION_STATUS: (uuid: d9520b5c-c932-4e3b-ade3-31adcc3bff52) for 
framework  (latest state: OPERATION_FINISHED, status update state: 
OPERATION_FINISHED)'
I0718 06:35:48.977396 18271 slave.cpp:8859] Updating the state of operation 
with no ID (uuid: d9520b5c-c932-4e3b-ade3-31adcc3bff52) for an operation API 
call (latest state: OPERATION_FINISHED, status update state: OPERATION_FINISHED)
I0718 06:35:48.977449 

Review Request 71111: Added offer rescind logic for guarantees enforcement.

2019-07-17 Thread Meng Zhu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7/
---

Review request for mesos and Benjamin Mahler.


Bugs: MESOS-9812
https://issues.apache.org/jira/browse/MESOS-9812


Repository: mesos


Description
---

Outstanding offers need to be rescinded as needed to ensure
roles' requested guarantees can be satisfied.

For simplicity, and also due to the race between the master and
the allocator, we pessimistically assume that what seems like
"available" resources in the allocator are all gone. We greedily
rescind offers until we can satisfy the guarantees.

Also added a test.


Diffs
-

  src/master/quota_handler.cpp 2a3ca5640c850ec8a8aca7abed45386b5b5e1125 
  src/tests/master_quota_tests.cpp 64a9b31a38af23fb16a4afd04f6d3413c7d1ffe9 


Diff: https://reviews.apache.org/r/7/diff/1/


Testing
---

make check


Thanks,

Meng Zhu