Re: Review Request 70620: Made SLRP allow changes in volume context.

2019-05-20 Thread Chun-Hung Hsiao


> On May 14, 2019, 10:04 a.m., Benjamin Bannier wrote:
> > src/resource_provider/storage/provider.cpp
> > Line 923 (original), 930 (patched)
> > 
> >
> > Let's create a dedicated error message.

Do you mean printing out an error or making it a hard failure? In theory this 
should never fail and that's why I make it a hard assertion here.


> On May 14, 2019, 10:04 a.m., Benjamin Bannier wrote:
> > src/resource_provider/storage/provider.cpp
> > Lines 1067 (patched)
> > 
> >
> > I don't think we need optional semantics here, I'd suggest to just work 
> > with an empty `Labels` here.
> > 
> > If you want you can convert an empty `Labels` to a `None` when invoking 
> > `createRawDiskResource` below. I think even that function might not need 
> > optional semantics for labels though.

This is for maintaining the following proto2 <-> proto3 semantic, since proto3 
doesn't have an OPTIONAL semantic, instead any field w/ the default value will 
not go onto the wire:

Resource.DiskInfo.Source.has_metadata() <=> !VolumeInfo.context.empty()

Since in practice there's no difference, we can change this to set up 
`metadata`to an empty `Labels` even if the `context` map is empty, as long as 
we handle backward compatibility carefully. But if we don't do this, it seems 
more reasonable to me to pass in a `None` here than passing a `Labels` here and 
doing a special check in another function.


- Chun-Hung


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70620/#review215220
---


On May 10, 2019, 1:14 a.m., Chun-Hung Hsiao wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70620/
> ---
> 
> (Updated May 10, 2019, 1:14 a.m.)
> 
> 
> Review request for mesos, Benjamin Bannier and James DeFelice.
> 
> 
> Bugs: MESOS-9395
> https://issues.apache.org/jira/browse/MESOS-9395
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> To make SLRP more robust against non-conforming CSI plugins that change
> volume contexts, the `getExistVolumes` method returns a list of resource
> conversions consisting of one for converting old volume contexts to new
> volume contexts, and one to remove missing volumes and add new volumes.
> 
> To make the interfaces consistent, `getStoragePools` now also returns a
> list of resource conversions consisting of one conversion.
> 
> 
> Diffs
> -
> 
>   include/mesos/mesos.proto dc6a87f14864cb4d46b9856f251c2946a475162d 
>   include/mesos/v1/mesos.proto e8086e0f70cac73876a8ae31db3365b5059b5c44 
>   src/resource_provider/storage/provider.cpp 
> 999fe95bb6f38f5a25068accd854b37788b24028 
> 
> 
> Diff: https://reviews.apache.org/r/70620/diff/1/
> 
> 
> Testing
> ---
> 
> sudo make check
> more testing done later in chain
> 
> 
> Thanks,
> 
> Chun-Hung Hsiao
> 
>



Re: Review Request 70620: Made SLRP allow changes in volume context.

2019-05-20 Thread Chun-Hung Hsiao


> On May 12, 2019, 12:44 a.m., James DeFelice wrote:
> > src/resource_provider/storage/provider.cpp
> > Lines 1078 (patched)
> > 
> >
> > how often are these conversions applied? i suppose it's every time that 
> > reconciliation happens - how often is that?

Right now it only happens once when the SLRP is restarted. The plan is to make 
this happen periodically or on demand through 
https://issues.apache.org/jira/browse/MESOS-9254.


- Chun-Hung


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70620/#review215202
---


On May 10, 2019, 1:14 a.m., Chun-Hung Hsiao wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70620/
> ---
> 
> (Updated May 10, 2019, 1:14 a.m.)
> 
> 
> Review request for mesos, Benjamin Bannier and James DeFelice.
> 
> 
> Bugs: MESOS-9395
> https://issues.apache.org/jira/browse/MESOS-9395
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> To make SLRP more robust against non-conforming CSI plugins that change
> volume contexts, the `getExistVolumes` method returns a list of resource
> conversions consisting of one for converting old volume contexts to new
> volume contexts, and one to remove missing volumes and add new volumes.
> 
> To make the interfaces consistent, `getStoragePools` now also returns a
> list of resource conversions consisting of one conversion.
> 
> 
> Diffs
> -
> 
>   include/mesos/mesos.proto dc6a87f14864cb4d46b9856f251c2946a475162d 
>   include/mesos/v1/mesos.proto e8086e0f70cac73876a8ae31db3365b5059b5c44 
>   src/resource_provider/storage/provider.cpp 
> 999fe95bb6f38f5a25068accd854b37788b24028 
> 
> 
> Diff: https://reviews.apache.org/r/70620/diff/1/
> 
> 
> Testing
> ---
> 
> sudo make check
> more testing done later in chain
> 
> 
> Thanks,
> 
> Chun-Hung Hsiao
> 
>



Review Request 70686: Added a unit test for master operation authorization.

2019-05-20 Thread Chun-Hung Hsiao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70686/
---

Review request for mesos, Benjamin Bannier and Greg Mann.


Bugs: MESOS-9485
https://issues.apache.org/jira/browse/MESOS-9485


Repository: mesos


Description
---

This test verifies that allowing or denying an action will only result
in a success or failure on specific operations but not other operations
in an accept call. This is a regression test for MESOS-9474 and
MESOS-9480.


Diffs
-

  src/slave/slave.hpp c740bf7c4af1699dd71a31f9169d29c1b4adb40d 
  src/tests/master_authorization_tests.cpp 
f65b6210a8189ec5e9089bce845a46d325253058 
  src/tests/mesos.hpp e70e0e1bbebef41557f9bb1f26c379357fc74e2a 
  src/tests/mock_slave.hpp 326a450eb2ae9912a19c6e0220e80f74d2953add 
  src/tests/mock_slave.cpp 1122c2a8b56b9d4e5deea4e892d81061f5e39a8d 


Diff: https://reviews.apache.org/r/70686/diff/1/


Testing
---

make check

Ran the unit test under stress for 500 iterations.


Thanks,

Chun-Hung Hsiao



Review Request 70685: Removed the `TaskStatusUpdateIsTerminalState` matcher.

2019-05-20 Thread Chun-Hung Hsiao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70685/
---

Review request for mesos, Benjamin Bannier and Gilbert Song.


Repository: mesos


Description
---

The `TaskStatusUpdateIsTerminalState` test matcher does not handle both
v0 and v1 status updates. Since introducing such a matcher in the
`v1::scheduler` namespace is inconsistent and it is not hard to use the
built-in `Truly` matcher to implement the same functionality, this
matcher is removed for now.


Diffs
-

  src/tests/containerizer/nvidia_gpu_isolator_tests.cpp 
7f5bd8cd4c07b724fe2c49068bf327e5afe9ec41 
  src/tests/mesos.hpp e70e0e1bbebef41557f9bb1f26c379357fc74e2a 


Diff: https://reviews.apache.org/r/70685/diff/1/


Testing
---

make check


Thanks,

Chun-Hung Hsiao



Review Request 70684: Changed the `*TaskIdEq` test matchers to take a `TaskID`.

2019-05-20 Thread Chun-Hung Hsiao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70684/
---

Review request for mesos, Benjamin Bannier and Greg Mann.


Bugs: MESOS-9485
https://issues.apache.org/jira/browse/MESOS-9485


Repository: mesos


Description
---

The `TaskStatusTaskIdEq` and `TaskStatusUpdateTaskIdEq` matchers
previously take a `TaskInfo`, which is not very intuitive and
inconsistent with other matchers such as `OptionTaskHasTaskID`. By
making these matchers take a `TaskID` we also reduce coupling of this
matcher and the structure of `TaskInfo`.


Diffs
-

  src/tests/api_tests.cpp bc19d7e9661f091aebf3072967548c7c3196239d 
  src/tests/containerizer/nvidia_gpu_isolator_tests.cpp 
7f5bd8cd4c07b724fe2c49068bf327e5afe9ec41 
  src/tests/default_executor_tests.cpp 93d7a1cfa1a9df838426c8257b6158e3553b5037 
  src/tests/gc_tests.cpp a583f1d79a9dd9def24ccfeb9e49ea0392173420 
  src/tests/master_tests.cpp 964d935771a99efaee63187affe46b551146f310 
  src/tests/mesos.hpp e70e0e1bbebef41557f9bb1f26c379357fc74e2a 
  src/tests/partition_tests.cpp 8148aff6bf1d9637d486b549d7f8c7124566d86c 
  src/tests/slave_tests.cpp 50882a5aa1b20c234f3cec8366f346fa9bfd4f83 


Diff: https://reviews.apache.org/r/70684/diff/1/


Testing
---

make check


Thanks,

Chun-Hung Hsiao



Review Request 70683: Added helpers for evolving operation-related proto messages.

2019-05-20 Thread Chun-Hung Hsiao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70683/
---

Review request for mesos, Benjamin Bannier and Greg Mann.


Bugs: MESOS-9485
https://issues.apache.org/jira/browse/MESOS-9485


Repository: mesos


Description
---

Added helpers for evolving operation-related proto messages.


Diffs
-

  src/internal/evolve.hpp 1044d9df75b6fc1f60d3704be9cb5751e6d4321d 
  src/internal/evolve.cpp 19c155967bf090fb2ec39211805ff1385787ab59 


Diff: https://reviews.apache.org/r/70683/diff/1/


Testing
---

make check


Thanks,

Chun-Hung Hsiao



Re: Review Request 70651: Notifies master `/api/v1` subscribers about recovered frameworks.

2019-05-20 Thread Chun-Hung Hsiao


> On May 17, 2019, 10:16 a.m., Greg Mann wrote:
> > src/tests/api_tests.cpp
> > Lines 2774 (patched)
> > 
> >
> > Nit: make this a const ref?
> 
> Chun-Hung Hsiao wrote:
> This is copied because `event` is overwritten later. Do you think that 
> it's okay to use a ref, since we don't access to it afterwards?
> 
> Greg Mann wrote:
> As long as `agentAdded` isn't modified, we can make it a const ref? I was 
> suggesting we do the same thing with `agentAdded` that we do previously in 
> this test with `getState`:
> ```
> const v1::master::Response::GetState& getState =
>   event->get().subscribed().get_state();
> ```

The problem is that `event->get()` will be destructed once `event` is 
overwritten, so the ref, const or not, will no longer be valid. So making a 
copy is a bit safer (if anyone touches this test in the future).


- Chun-Hung


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70651/#review215319
---


On May 16, 2019, 5:11 a.m., Chun-Hung Hsiao wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70651/
> ---
> 
> (Updated May 16, 2019, 5:11 a.m.)
> 
> 
> Review request for mesos and Greg Mann.
> 
> 
> Bugs: MESOS-9785
> https://issues.apache.org/jira/browse/MESOS-9785
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> If one subscribes to master's `/api/v1` endpoint after a master failover
> but before an agent reregistration, frameworks recovered through the
> agent registration should be notified to the subscriber, otherwise
> recovered tasks will have framework IDs referring to frameworks unknown
> to the subscriber.
> 
> 
> Diffs
> -
> 
>   src/common/protobuf_utils.cpp 8b252cb11e17356836988dfc44a63953579a1def 
>   src/master/master.cpp 6c0e30bff3fbfe0ad1f3af44eb3df6f82048f0e1 
>   src/tests/api_tests.cpp bc19d7e9661f091aebf3072967548c7c3196239d 
> 
> 
> Diff: https://reviews.apache.org/r/70651/diff/1/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Chun-Hung Hsiao
> 
>



Re: Review Request 70668: Fixed the race between validating and applying FrameworkInfo updates.

2019-05-20 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70668/#review215378
---


Ship it!





src/master/master.cpp
Lines 2777 (patched)


```
if (updateValidationError.isSome()) {
```



src/master/master.cpp
Lines 3050 (patched)


```
if (updateValidationError.isSome()) {
```


- Benjamin Mahler


On May 17, 2019, 3:07 p.m., Andrei Sekretenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70668/
> ---
> 
> (Updated May 17, 2019, 3:07 p.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Bugs: MESOS-7258 and MESOS-9763
> https://issues.apache.org/jira/browse/MESOS-7258
> https://issues.apache.org/jira/browse/MESOS-9763
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch moves the FrameworkInfo update validation into the
> continuation of the SUBSCRIBE calls (after authorization). This fixes
> the race between two concurrent SUBSCRIBE calls attempting to change
> principal (see MESOS-9763) and prevents the future changes of the update
> validation from exacerbating this race.
> 
> 
> Diffs
> -
> 
>   src/master/master.cpp c72b92656dddca63ee89abf676da66bd76f58a6d 
> 
> 
> Diff: https://reviews.apache.org/r/70668/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Andrei Sekretenko
> 
>



Re: Review Request 70667: Added unit tests for 'framework::validateUpdate()'.

2019-05-20 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70667/#review215377
---


Fix it, then Ship it!





src/tests/master_validation_tests.cpp
Lines 4773-4774 (patched)


Usually when the wrapping is this far, we'll wrap on the next line:

```
EXPECT_NONE(framework::validateUpdate(
DEFAULT_FRAMEWORK_INFO, frameworkInfo));

// or:

EXPECT_NONE(framework::validateUpdate(
DEFAULT_FRAMEWORK_INFO,
frameworkInfo));
```



src/tests/master_validation_tests.cpp
Lines 4785 (patched)


Ditto on the previous patch, this should be a TODO that 
https://reviews.apache.org/r/70669/ later removes, to make it clear to the 
reader what the plan is.


- Benjamin Mahler


On May 17, 2019, 3:06 p.m., Andrei Sekretenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70667/
> ---
> 
> (Updated May 17, 2019, 3:06 p.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Bugs: MESOS-7258
> https://issues.apache.org/jira/browse/MESOS-7258
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added unit tests for 'framework::validateUpdate()'.
> 
> 
> Diffs
> -
> 
>   src/tests/master_validation_tests.cpp 
> 1b7a8273f2704e0a4dedf33c55ede33dc1b1a4af 
> 
> 
> Diff: https://reviews.apache.org/r/70667/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Andrei Sekretenko
> 
>



Re: Review Request 70666: Introduced a function for validating a `FrameworkInfo` update.

2019-05-20 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70666/#review215375
---


Fix it, then Ship it!




Looks good, but some clarity in the description and some TODOs indicating that 
this does not validate all `user` and `checkpoint` would be great.


src/master/master.cpp
Line 2574 (original), 2574 (patched)


This is not accurate, if the framework attempts to change 'user' or 
'checkpoint', which are immutable, this validation doesn't catch it.

Can you comment on the intention here more clearly? I see 
https://reviews.apache.org/r/70669/ updates it later to be more coherent: so, 
some appropriate TODOs here are needed to indicate where this is going.



src/master/master.cpp
Line 2585 (original), 2585 (patched)


newline and space before the {

Not yours, I realize it's copied from the condition above, but probably the 
following is more readable?

```
if (validationError.isSome()) {
```

i.e. "if there is an error" instead of "if there isn't no error"



src/master/validation.hpp
Lines 115 (patched)


Ditto, a TODO here is needed to indicate that this doesn't actually check 
all immutable fields.



src/master/validation.cpp
Lines 574-582 (patched)


no need for the std:: prefix anymore



src/master/validation.cpp
Lines 576 (patched)


not yours, but 2 space indent here



src/master/validation.cpp
Lines 579 (patched)


Perhaps the following for a bit more clarity:

```
  Option newPrincipal = None();
```



src/master/validation.cpp
Lines 590-594 (patched)


not yours, but just a comment, we usually try to put the opening and 
closing single quotes on the same line if possible, to reduce accidental 
omission of the closing quote (which has happened quite often)



src/master/validation.cpp
Lines 591 (patched)


Not yours, but we can do the following to clean it up a bit:

```
LOG(WARNING)
  << "Framework " << oldInfo.id() << " which had a principal "
  << " '" << oldPrincipal.getOrElse("") << "'"
  << " tried to (re)subscribe with a new principal "
  << " '" << newPrincipal.getOrElse("") << "'";
```



src/master/validation.cpp
Lines 595 (patched)


newline

Also, not yours, but we generally don't add the period at the end.


- Benjamin Mahler


On May 17, 2019, 3:05 p.m., Andrei Sekretenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70666/
> ---
> 
> (Updated May 17, 2019, 3:05 p.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Bugs: MESOS-7258
> https://issues.apache.org/jira/browse/MESOS-7258
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch moves the code validating a new `FrameworkInfo` against
> the current one into a separate function.
> 
> 
> Diffs
> -
> 
>   src/master/master.cpp c72b92656dddca63ee89abf676da66bd76f58a6d 
>   src/master/validation.hpp 95638a17052ece6c957aa76e4cead8d7bfe82024 
>   src/master/validation.cpp 9fb0850987ce385d345302cac9721adead7181b8 
> 
> 
> Diff: https://reviews.apache.org/r/70666/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Andrei Sekretenko
> 
>



Re: Review Request 70618: Store framework sorters inside RoleInfos.

2019-05-20 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70618/#review215376
---



Patch looks great!

Reviews applied: [70626, 70591, 70679, 70618]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose --disable-libtool-wrappers 
--disable-parallel-test-execution' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker-build.sh

- Mesos Reviewbot


On May 20, 2019, 9:25 a.m., Andrei Sekretenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70618/
> ---
> 
> (Updated May 20, 2019, 9:25 a.m.)
> 
> 
> Review request for mesos, Benjamin Mahler and Meng Zhu.
> 
> 
> Bugs: MESOS-9701
> https://issues.apache.org/jira/browse/MESOS-9701
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch replaces a hashmap> used for tracking
> framework sorters with a unique_ptr inside of the RoleInfo.
> This simplifies the framework tracking logic and slightly improves
> performance.
> 
> 
> Diffs
> -
> 
>   src/master/allocator/mesos/hierarchical.hpp 
> c2058baca5159da4cdcab77afd5de3c0d5ae6c48 
>   src/master/allocator/mesos/hierarchical.cpp 
> 6df9a3b4ef470e57862815a862494cdb5e26d3e3 
> 
> 
> Diff: https://reviews.apache.org/r/70618/diff/4/
> 
> 
> Testing
> ---
> 
> make check
> 
> Benchmarking: 5 runs of 
> BENCHMARK_HierarchicalAllocator_WithQuotaParam.LargeAndSmallQuota/5 with the 
> optimized build.
> 
> BEFORE (master):
> 
> Added 3000 agents in 57.444301ms
> Added 3000 frameworks in 15.100499419secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 13.359926851secs
> Made 0 allocation in 12.147765014secs
> 
> Added 3000 agents in 58.651887ms
> Added 3000 frameworks in 14.485925735secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 13.6526783secs
> Made 0 allocation in 12.138439924secs
> 
> Added 3000 agents in 59.028581ms
> Added 3000 frameworks in 14.72945866secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 12.739135078secs
> Made 0 allocation in 12.673302496secs
> 
> Added 3000 agents in 59.050577ms
> Added 3000 frameworks in 14.78273576secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 13.20400492secs
> Made 0 allocation in 13.289808943secs
> 
> Added 3000 agents in 58.629888ms
> Added 3000 frameworks in 14.714786337secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 13.721094744secs
> Made 0 allocation in 12.688377237secs
> 
> ---
> AFTER (master + previous patch https://reviews.apache.org/r/70591/ + this 
> patch):
> 
> Added 3000 agents in 58.04155ms
> Added 3000 frameworks in 13.694651977secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 12.417541213secs
> Made 0 allocation in 12.130755905secs
> 
> Added 3000 agents in 55.246813ms
> Added 3000 frameworks in 13.460479936secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 12.393765514secs
> Made 0 allocation in 12.196981426secs
> 
> Added 3000 agents in 58.013477ms
> Added 3000 frameworks in 13.69361015secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 12.245916159secs
> Made 0 allocation in 11.699553888secs
> 
> Added 3000 agents in 57.163681ms
> Added 3000 frameworks in 13.738218369secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 12.809206431secs
> Made 0 allocation in 12.400140164secs
> 
> Added 3000 agents in 57.942087ms
> Added 3000 frameworks in 13.836390727secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 12.75595625secs
> Made 0 allocation in 11.939263998secs
> 
> 
> Thanks,
> 
> Andrei Sekretenko
> 
>



Re: Review Request 70665: Moved the logic of sending 'Framework' updates into a separate method.

2019-05-20 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70665/#review215374
---




src/master/master.hpp
Lines 667 (patched)


Why did this need to be passed separately? Is there some reason 
`framework->pid` can't be used?



src/master/master.cpp
Lines 2819 (patched)


newline



src/master/master.cpp
Lines 2832-2835 (original), 2839-2842 (patched)


This looks like a bug, the pid needs to be set here in the non http api 
case.



src/master/master.cpp
Lines 3151 (patched)


newline


- Benjamin Mahler


On May 17, 2019, 3:03 p.m., Andrei Sekretenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70665/
> ---
> 
> (Updated May 17, 2019, 3:03 p.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Bugs: MESOS-7258
> https://issues.apache.org/jira/browse/MESOS-7258
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch consolidates the scattered logic of sending Framework updates
> (namely, the FRAMEWORK_ADDED event to master V1 API subscribers and
> UpdateFrameworkMessage to agents).
> 
> NOTE: The UpdateFrameworkMessage is now also being sent in the case of
> non-`forced` subscription request from a framework which has already
> been registered. Previously it was not being sent in this case -
> this likely is a bug.
> 
> 
> Diffs
> -
> 
>   src/master/master.hpp c523c937e294eaffe0e58306c267770c119c9f42 
>   src/master/master.cpp c72b92656dddca63ee89abf676da66bd76f58a6d 
> 
> 
> Diff: https://reviews.apache.org/r/70665/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Andrei Sekretenko
> 
>



Re: Review Request 70664: Made `activateRecoveredFramework()` return void instead of Nothing().

2019-05-20 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70664/#review215372
---


Ship it!




Thanks! I was a bit confused at first from the summary, I think the key part is 
that we're going from a Try to a void, rather than a Nothing to a void.

- Benjamin Mahler


On May 17, 2019, 3:02 p.m., Andrei Sekretenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70664/
> ---
> 
> (Updated May 17, 2019, 3:02 p.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Bugs: MESOS-7258
> https://issues.apache.org/jira/browse/MESOS-7258
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The `Master::activateRecoveredFramework()` method always returns
> `Nothing()`. This patch changes its return type to `void` and removes
> the unused error handling code.
> 
> 
> Diffs
> -
> 
>   src/master/master.hpp c523c937e294eaffe0e58306c267770c119c9f42 
>   src/master/master.cpp c72b92656dddca63ee89abf676da66bd76f58a6d 
> 
> 
> Diff: https://reviews.apache.org/r/70664/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Andrei Sekretenko
> 
>



Re: Review Request 70663: Removed non-implemented declaration of 'Master::validate()'.

2019-05-20 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70663/#review215371
---


Ship it!




Ship It!

- Benjamin Mahler


On May 17, 2019, 3:01 p.m., Andrei Sekretenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70663/
> ---
> 
> (Updated May 17, 2019, 3:01 p.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Bugs: MESOS-7258
> https://issues.apache.org/jira/browse/MESOS-7258
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Removed non-implemented declaration of 'Master::validate()'.
> 
> 
> Diffs
> -
> 
>   src/master/master.hpp c523c937e294eaffe0e58306c267770c119c9f42 
> 
> 
> Diff: https://reviews.apache.org/r/70663/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Andrei Sekretenko
> 
>



Re: Review Request 70618: Store framework sorters inside RoleInfos.

2019-05-20 Thread Andrei Sekretenko


> On May 20, 2019, 12:18 p.m., Meng Zhu wrote:
> > src/master/allocator/mesos/hierarchical.cpp
> > Line 367 (original), 367 (patched)
> > 
> >
> > Not yours, but this could be just a reference?
> > 
> > const hashmap& allocation = ...
> > 
> > Want to add a quick patch for that?

It turns out that such change breaks a lot of tests.
Indeed, the code below is modifying the hashmap reference to which is returned 
by `frameworkSorter->allocation()`, but I could not immediately understand what 
_exactly_ breaks. I'll have a deeper look some time later.


> On May 20, 2019, 12:18 p.m., Meng Zhu wrote:
> > src/master/allocator/mesos/hierarchical.cpp
> > Line 1853 (original), 1853 (patched)
> > 
> >
> > Not yours, but this comment is tied to DRF, we should remove/modify it.

Fixed in preceding https://reviews.apache.org/r/70679/


- Andrei


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70618/#review215354
---


On May 20, 2019, 4:25 p.m., Andrei Sekretenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70618/
> ---
> 
> (Updated May 20, 2019, 4:25 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler and Meng Zhu.
> 
> 
> Bugs: MESOS-9701
> https://issues.apache.org/jira/browse/MESOS-9701
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch replaces a hashmap> used for tracking
> framework sorters with a unique_ptr inside of the RoleInfo.
> This simplifies the framework tracking logic and slightly improves
> performance.
> 
> 
> Diffs
> -
> 
>   src/master/allocator/mesos/hierarchical.hpp 
> c2058baca5159da4cdcab77afd5de3c0d5ae6c48 
>   src/master/allocator/mesos/hierarchical.cpp 
> 6df9a3b4ef470e57862815a862494cdb5e26d3e3 
> 
> 
> Diff: https://reviews.apache.org/r/70618/diff/4/
> 
> 
> Testing
> ---
> 
> make check
> 
> Benchmarking: 5 runs of 
> BENCHMARK_HierarchicalAllocator_WithQuotaParam.LargeAndSmallQuota/5 with the 
> optimized build.
> 
> BEFORE (master):
> 
> Added 3000 agents in 57.444301ms
> Added 3000 frameworks in 15.100499419secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 13.359926851secs
> Made 0 allocation in 12.147765014secs
> 
> Added 3000 agents in 58.651887ms
> Added 3000 frameworks in 14.485925735secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 13.6526783secs
> Made 0 allocation in 12.138439924secs
> 
> Added 3000 agents in 59.028581ms
> Added 3000 frameworks in 14.72945866secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 12.739135078secs
> Made 0 allocation in 12.673302496secs
> 
> Added 3000 agents in 59.050577ms
> Added 3000 frameworks in 14.78273576secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 13.20400492secs
> Made 0 allocation in 13.289808943secs
> 
> Added 3000 agents in 58.629888ms
> Added 3000 frameworks in 14.714786337secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 13.721094744secs
> Made 0 allocation in 12.688377237secs
> 
> ---
> AFTER (master + previous patch https://reviews.apache.org/r/70591/ + this 
> patch):
> 
> Added 3000 agents in 58.04155ms
> Added 3000 frameworks in 13.694651977secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 12.417541213secs
> Made 0 allocation in 12.130755905secs
> 
> Added 3000 agents in 55.246813ms
> Added 3000 frameworks in 13.460479936secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 12.393765514secs
> Made 0 allocation in 12.196981426secs
> 
> Added 3000 agents in 58.013477ms
> Added 3000 frameworks in 13.69361015secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 12.245916159secs
> Made 0 allocation in 11.699553888secs
> 
> Added 3000 agents in 57.163681ms
> Added 3000 frameworks in 13.738218369secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 12.809206431secs
> Made 0 allocation in 12.400140164secs
> 
> Added 3000 agents in 57.942087ms
> Added 3000 frameworks in 13.836390727secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 12.75595625secs
> Made 0 allocation in 

Re: Review Request 70626: Refactored `untrackReservations()` method.

2019-05-20 Thread Andrei Sekretenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70626/
---

(Updated May 20, 2019, 4:27 p.m.)


Review request for mesos and Meng Zhu.


Bugs: MESOS-9701
https://issues.apache.org/jira/browse/MESOS-9701


Repository: mesos


Description
---

This patch gets rid of insertion into the beginning of the vector
and also of shadowing HierarchicalAllocatorProcess::roles by a local.


Diffs (updated)
-

  src/master/allocator/mesos/hierarchical.cpp 
6df9a3b4ef470e57862815a862494cdb5e26d3e3 


Diff: https://reviews.apache.org/r/70626/diff/3/

Changes: https://reviews.apache.org/r/70626/diff/2-3/


Testing
---

make check


Thanks,

Andrei Sekretenko



Re: Review Request 70591: Added `struct RoleInfo` to track role reservations and framework IDs.

2019-05-20 Thread Andrei Sekretenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70591/
---

(Updated May 20, 2019, 4:26 p.m.)


Review request for mesos, Benjamin Mahler and Meng Zhu.


Bugs: MESOS-9701
https://issues.apache.org/jira/browse/MESOS-9701


Repository: mesos


Description
---

This patch introduces struct RoleInfo which contains the framework IDs
and reservatons tracked under a role. Two separate hashmaps used for
this purpose previously are replaced by a single hashmap, thus the need
to keep the two consistent with each other is removed. This simplifies
the role tracking logic in the allocator. The patch introduces minimal
to no performance impact.


Diffs (updated)
-

  src/master/allocator/mesos/hierarchical.hpp 
c2058baca5159da4cdcab77afd5de3c0d5ae6c48 
  src/master/allocator/mesos/hierarchical.cpp 
6df9a3b4ef470e57862815a862494cdb5e26d3e3 


Diff: https://reviews.apache.org/r/70591/diff/5/

Changes: https://reviews.apache.org/r/70591/diff/4-5/


Testing
---

make check

Benchmarking: 5 runs of 
`BENCHMARK_HierarchicalAllocator_WithQuotaParam.LargeAndSmallQuota/5` with the 
optimized build.

Performance impact in this benchmark seems to be negligible.

BEFORE:
Added 3000 agents in 51.553929ms
Added 3000 frameworks in 15.174748344secs
Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
Made 3856 allocations in 12.400805171secs
Made 0 allocation in 12.5850238secs

Added 3000 agents in 55.739336ms
Added 3000 frameworks in 14.730404769secs
Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
Made 3856 allocations in 12.563439682secs
Made 0 allocation in 13.063555055secs

Added 3000 agents in 54.414733ms
Added 3000 frameworks in 15.10136842secs
Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
Made 3856 allocations in 13.501664915secs
Made 0 allocation in 12.89034382secs

Added 3000 agents in 52.58252ms
Added 3000 frameworks in 14.048350298secs
Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
Made 3856 allocations in 12.299952145secs
Made 0 allocation in 11.888248811secs

Added 3000 agents in 52.821439ms
Added 3000 frameworks in 15.344450583secs
Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
Made 3856 allocations in 12.63425secs
Made 0 allocation in 12.427171541secs


AFTER:

Added 3000 agents in 69.716648ms
Added 3000 frameworks in 15.249001979secs
Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
Made 3856 allocations in 12.860494226secs
Made 0 allocation in 12.228866329secs

Added 3000 agents in 52.639388ms
Added 3000 frameworks in 15.207895482secs
Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
Made 3856 allocations in 12.504777266secs
Made 0 allocation in 12.70388062secs

Added 3000 agents in 56.865794ms
Added 3000 frameworks in 15.284003915secs
Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
Made 3856 allocations in 13.86859815secs
Made 0 allocation in 12.538958231secs

Added 3000 agents in 56.028013ms
Added 3000 frameworks in 13.892577869secs
Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
Made 3856 allocations in 12.341724418secs
Made 0 allocation in 12.23022189secs

Added 3000 agents in 52.368219ms
Added 3000 frameworks in 13.978581104secs
Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
Made 3856 allocations in 12.701682501secs
Made 0 allocation in 12.141360313secs


Thanks,

Andrei Sekretenko



Review Request 70679: Fixed comment to reflect the existence of sorters other than DRF.

2019-05-20 Thread Andrei Sekretenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70679/
---

Review request for mesos and Meng Zhu.


Repository: mesos


Description
---

Fixed comment to reflect the existence of sorters other than DRF.


Diffs
-

  src/master/allocator/mesos/hierarchical.cpp 
6df9a3b4ef470e57862815a862494cdb5e26d3e3 


Diff: https://reviews.apache.org/r/70679/diff/1/


Testing
---


Thanks,

Andrei Sekretenko



Re: Review Request 70618: Store framework sorters inside RoleInfos.

2019-05-20 Thread Andrei Sekretenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70618/
---

(Updated May 20, 2019, 4:25 p.m.)


Review request for mesos, Benjamin Mahler and Meng Zhu.


Bugs: MESOS-9701
https://issues.apache.org/jira/browse/MESOS-9701


Repository: mesos


Description
---

This patch replaces a hashmap> used for tracking
framework sorters with a unique_ptr inside of the RoleInfo.
This simplifies the framework tracking logic and slightly improves
performance.


Diffs (updated)
-

  src/master/allocator/mesos/hierarchical.hpp 
c2058baca5159da4cdcab77afd5de3c0d5ae6c48 
  src/master/allocator/mesos/hierarchical.cpp 
6df9a3b4ef470e57862815a862494cdb5e26d3e3 


Diff: https://reviews.apache.org/r/70618/diff/4/

Changes: https://reviews.apache.org/r/70618/diff/3-4/


Testing
---

make check

Benchmarking: 5 runs of 
BENCHMARK_HierarchicalAllocator_WithQuotaParam.LargeAndSmallQuota/5 with the 
optimized build.

BEFORE (master):

Added 3000 agents in 57.444301ms
Added 3000 frameworks in 15.100499419secs
Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
Made 3856 allocations in 13.359926851secs
Made 0 allocation in 12.147765014secs

Added 3000 agents in 58.651887ms
Added 3000 frameworks in 14.485925735secs
Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
Made 3856 allocations in 13.6526783secs
Made 0 allocation in 12.138439924secs

Added 3000 agents in 59.028581ms
Added 3000 frameworks in 14.72945866secs
Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
Made 3856 allocations in 12.739135078secs
Made 0 allocation in 12.673302496secs

Added 3000 agents in 59.050577ms
Added 3000 frameworks in 14.78273576secs
Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
Made 3856 allocations in 13.20400492secs
Made 0 allocation in 13.289808943secs

Added 3000 agents in 58.629888ms
Added 3000 frameworks in 14.714786337secs
Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
Made 3856 allocations in 13.721094744secs
Made 0 allocation in 12.688377237secs

---
AFTER (master + previous patch https://reviews.apache.org/r/70591/ + this 
patch):

Added 3000 agents in 58.04155ms
Added 3000 frameworks in 13.694651977secs
Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
Made 3856 allocations in 12.417541213secs
Made 0 allocation in 12.130755905secs

Added 3000 agents in 55.246813ms
Added 3000 frameworks in 13.460479936secs
Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
Made 3856 allocations in 12.393765514secs
Made 0 allocation in 12.196981426secs

Added 3000 agents in 58.013477ms
Added 3000 frameworks in 13.69361015secs
Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
Made 3856 allocations in 12.245916159secs
Made 0 allocation in 11.699553888secs

Added 3000 agents in 57.163681ms
Added 3000 frameworks in 13.738218369secs
Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
Made 3856 allocations in 12.809206431secs
Made 0 allocation in 12.400140164secs

Added 3000 agents in 57.942087ms
Added 3000 frameworks in 13.836390727secs
Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
Made 3856 allocations in 12.75595625secs
Made 0 allocation in 11.939263998secs


Thanks,

Andrei Sekretenko



Re: Review Request 70618: Store framework sorters inside RoleInfos.

2019-05-20 Thread Andrei Sekretenko


> On May 20, 2019, 12:18 p.m., Meng Zhu wrote:
> > src/master/allocator/mesos/hierarchical.cpp
> > Lines 1708-1709 (original), 1706-1709 (patched)
> > 
> >
> > The logic here is not related to this patch.
> > I am OK with either doing this in the previous patch or keep it as is.

I cannot keep it as it is - RoleInfo has no default constructor now (for the 
purpose of maintaining RoleInfo's only - as for now - class invariant: that it 
always has a framework sorter).

On the other side, doing this change in the previous patch will improve 
readability, I'll do that.


- Andrei


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70618/#review215354
---


On May 16, 2019, 2:17 p.m., Andrei Sekretenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70618/
> ---
> 
> (Updated May 16, 2019, 2:17 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler and Meng Zhu.
> 
> 
> Bugs: MESOS-9701
> https://issues.apache.org/jira/browse/MESOS-9701
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch replaces a hashmap> used for tracking
> framework sorters with a unique_ptr inside of the RoleInfo.
> This simplifies the framework tracking logic and slightly improves
> performance.
> 
> 
> Diffs
> -
> 
>   src/master/allocator/mesos/hierarchical.hpp 
> c2058baca5159da4cdcab77afd5de3c0d5ae6c48 
>   src/master/allocator/mesos/hierarchical.cpp 
> 875cfcd091f5f58afb89e752da5100a75828dd67 
> 
> 
> Diff: https://reviews.apache.org/r/70618/diff/3/
> 
> 
> Testing
> ---
> 
> make check
> 
> Benchmarking: 5 runs of 
> BENCHMARK_HierarchicalAllocator_WithQuotaParam.LargeAndSmallQuota/5 with the 
> optimized build.
> 
> BEFORE (master):
> 
> Added 3000 agents in 57.444301ms
> Added 3000 frameworks in 15.100499419secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 13.359926851secs
> Made 0 allocation in 12.147765014secs
> 
> Added 3000 agents in 58.651887ms
> Added 3000 frameworks in 14.485925735secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 13.6526783secs
> Made 0 allocation in 12.138439924secs
> 
> Added 3000 agents in 59.028581ms
> Added 3000 frameworks in 14.72945866secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 12.739135078secs
> Made 0 allocation in 12.673302496secs
> 
> Added 3000 agents in 59.050577ms
> Added 3000 frameworks in 14.78273576secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 13.20400492secs
> Made 0 allocation in 13.289808943secs
> 
> Added 3000 agents in 58.629888ms
> Added 3000 frameworks in 14.714786337secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 13.721094744secs
> Made 0 allocation in 12.688377237secs
> 
> ---
> AFTER (master + previous patch https://reviews.apache.org/r/70591/ + this 
> patch):
> 
> Added 3000 agents in 58.04155ms
> Added 3000 frameworks in 13.694651977secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 12.417541213secs
> Made 0 allocation in 12.130755905secs
> 
> Added 3000 agents in 55.246813ms
> Added 3000 frameworks in 13.460479936secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 12.393765514secs
> Made 0 allocation in 12.196981426secs
> 
> Added 3000 agents in 58.013477ms
> Added 3000 frameworks in 13.69361015secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 12.245916159secs
> Made 0 allocation in 11.699553888secs
> 
> Added 3000 agents in 57.163681ms
> Added 3000 frameworks in 13.738218369secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 12.809206431secs
> Made 0 allocation in 12.400140164secs
> 
> Added 3000 agents in 57.942087ms
> Added 3000 frameworks in 13.836390727secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 12.75595625secs
> Made 0 allocation in 11.939263998secs
> 
> 
> Thanks,
> 
> Andrei Sekretenko
> 
>



Re: Review Request 70618: Store framework sorters inside RoleInfos.

2019-05-20 Thread Meng Zhu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70618/#review215354
---



Just a partial review, will post the rest later


src/master/allocator/mesos/hierarchical.hpp
Line 110 (original), 111 (patched)


space before brace

Did I mention this in the last patch? Did you rebase?



src/master/allocator/mesos/hierarchical.hpp
Lines 112 (patched)


You have a user-defined constructor declared below. So this is not strictly 
needed. And (correct me if I am wrong) I do not think our style guide requires 
this, I would suggest removing this for brevity.



src/master/allocator/mesos/hierarchical.hpp
Lines 115 (patched)


Put this in the line above.



src/master/allocator/mesos/hierarchical.hpp
Lines 118-120 (original), 128-133 (patched)


The original comment is verbose and outdated. Let's take this opportunity 
to improve it. Also, let's make an effort to align the comment length :)

- Let's remove the ties to DRF (as it can be any sorter algorithm)
- Let's remove the details of level and stage, readers who are familiar with
the matter do not need to comment, those who are not will just be confused. 
Also,
the structure of the code could change in the future.
- The comment regarding resource pool is no longer true due to: 

https://github.com/apache/mesos/commit/a3626219b402ff837192208b6d35946b3a069ce6

How about:

// The sorter determines the resource allocation order of the
// frameworks subscribed under this role.



src/master/allocator/mesos/hierarchical.cpp
Lines 358-359 (original), 358-359 (patched)


// FrameworkID may not be present in RoleInfo because the
// framework was previously deactivated and never re-added.



src/master/allocator/mesos/hierarchical.cpp
Line 367 (original), 367 (patched)


Not yours, but this could be just a reference?

const hashmap& allocation = ...

Want to add a quick patch for that?



src/master/allocator/mesos/hierarchical.cpp
Line 888 (original), 888 (patched)


hmm, it is unfortunate that we need to make a copy here due to the sorter 
interface



src/master/allocator/mesos/hierarchical.cpp
Lines 982-1000 (original), 981-999 (patched)


We should just use `ResourceQuantities` for these.
Feel free to do this in another patch.



src/master/allocator/mesos/hierarchical.cpp
Line 1369 (original), 1367 (patched)


suppressingRoles



src/master/allocator/mesos/hierarchical.cpp
Line 1373 (original), 1372 (patched)


newline not necessary?



src/master/allocator/mesos/hierarchical.cpp
Line 1395 (original), 1393 (patched)


revivingRoles



src/master/allocator/mesos/hierarchical.cpp
Line 1402 (original), 1401 (patched)


newline not necessary?



src/master/allocator/mesos/hierarchical.cpp
Lines 1708-1709 (original), 1706-1709 (patched)


The logic here is not related to this patch.
I am OK with either doing this in the previous patch or keep it as is.



src/master/allocator/mesos/hierarchical.cpp
Line 1714 (original), 1714 (patched)


This should be in the previous patch (I just raised an issue in the patch)



src/master/allocator/mesos/hierarchical.cpp
Line 1853 (original), 1853 (patched)


Not yours, but this comment is tied to DRF, we should remove/modify it.



src/master/allocator/mesos/hierarchical.cpp
Lines 2169 (patched)


space before braces



src/master/allocator/mesos/hierarchical.cpp
Lines 2171 (patched)


newline here


- Meng Zhu


On May 16, 2019, 7:17 a.m., Andrei Sekretenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70618/
> ---
> 
> (Updated May 16, 2019, 7:17 a.m.)
> 
> 
> Review request for mesos, Benjamin Mahler and Meng Zhu.
> 
> 
> Bugs: MESOS-9701
> https://issues.apache.org/jira/browse/MESOS-9701
> 
> 
> Rep

Re: Review Request 70591: Added `struct RoleInfo` to track role reservations and framework IDs.

2019-05-20 Thread Meng Zhu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70591/#review215353
---



Much nicer, thanks!


src/master/allocator/mesos/hierarchical.hpp
Lines 114-116 (patched)


// Aggregated reserved scalar resource quantities on all agents
// tied to this role, if any. Note, non-scalar resources such as
// ports are excluded.



src/master/allocator/mesos/hierarchical.cpp
Lines 2578 (patched)


New line not necessary?



src/master/allocator/mesos/hierarchical.cpp
Lines 2643-2652 (original), 2646-2653 (patched)


Let's use a reference to make the code more consise:

```
CHECK(roles.contains(r));
RoleInfo& roleInfo = roles.at(r);

CHECK(roleInfo.reservationScalarQuantities.contains(quantities));
roleInfo.reservationScalarQuantities -= quantities;

if (roleInfo.isEmpty()) {
  roles.erase(r);
}

```


- Meng Zhu


On May 16, 2019, 7:16 a.m., Andrei Sekretenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70591/
> ---
> 
> (Updated May 16, 2019, 7:16 a.m.)
> 
> 
> Review request for mesos, Benjamin Mahler and Meng Zhu.
> 
> 
> Bugs: MESOS-9701
> https://issues.apache.org/jira/browse/MESOS-9701
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch introduces struct RoleInfo which contains the framework IDs
> and reservatons tracked under a role. Two separate hashmaps used for
> this purpose previously are replaced by a single hashmap, thus the need
> to keep the two consistent with each other is removed. This simplifies
> the role tracking logic in the allocator. The patch introduces minimal
> to no performance impact.
> 
> 
> Diffs
> -
> 
>   src/master/allocator/mesos/hierarchical.hpp 
> c2058baca5159da4cdcab77afd5de3c0d5ae6c48 
>   src/master/allocator/mesos/hierarchical.cpp 
> 875cfcd091f5f58afb89e752da5100a75828dd67 
> 
> 
> Diff: https://reviews.apache.org/r/70591/diff/4/
> 
> 
> Testing
> ---
> 
> make check
> 
> Benchmarking: 5 runs of 
> `BENCHMARK_HierarchicalAllocator_WithQuotaParam.LargeAndSmallQuota/5` with 
> the optimized build.
> 
> Performance impact in this benchmark seems to be negligible.
> 
> BEFORE:
> Added 3000 agents in 51.553929ms
> Added 3000 frameworks in 15.174748344secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 12.400805171secs
> Made 0 allocation in 12.5850238secs
> 
> Added 3000 agents in 55.739336ms
> Added 3000 frameworks in 14.730404769secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 12.563439682secs
> Made 0 allocation in 13.063555055secs
> 
> Added 3000 agents in 54.414733ms
> Added 3000 frameworks in 15.10136842secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 13.501664915secs
> Made 0 allocation in 12.89034382secs
> 
> Added 3000 agents in 52.58252ms
> Added 3000 frameworks in 14.048350298secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 12.299952145secs
> Made 0 allocation in 11.888248811secs
> 
> Added 3000 agents in 52.821439ms
> Added 3000 frameworks in 15.344450583secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 12.63425secs
> Made 0 allocation in 12.427171541secs
> 
> 
> AFTER:
> 
> Added 3000 agents in 69.716648ms
> Added 3000 frameworks in 15.249001979secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 12.860494226secs
> Made 0 allocation in 12.228866329secs
> 
> Added 3000 agents in 52.639388ms
> Added 3000 frameworks in 15.207895482secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 12.504777266secs
> Made 0 allocation in 12.70388062secs
> 
> Added 3000 agents in 56.865794ms
> Added 3000 frameworks in 15.284003915secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 13.86859815secs
> Made 0 allocation in 12.538958231secs
> 
> Added 3000 agents in 56.028013ms
> Added 3000 frameworks in 13.892577869secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with random sorter
> Made 3856 allocations in 12.341724418secs
> Made 0 allocation in 12.23022189secs
> 
> Added 3000 agents in 52.368219ms
> Added 3000 frameworks in 13.978581104secs
> Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks, with ra

Re: Review Request 70626: Refactored `untrackReservations()` method.

2019-05-20 Thread Meng Zhu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70626/#review215352
---


Fix it, then Ship it!




Thanks for the refactoring!


src/master/allocator/mesos/hierarchical.cpp
Line 2642 (original), 2642 (patched)


`untrackRole` or just `untrack`?


- Meng Zhu


On May 16, 2019, 7:16 a.m., Andrei Sekretenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70626/
> ---
> 
> (Updated May 16, 2019, 7:16 a.m.)
> 
> 
> Review request for mesos and Meng Zhu.
> 
> 
> Bugs: MESOS-9701
> https://issues.apache.org/jira/browse/MESOS-9701
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch gets rid of insertion into the beginning of the vector
> and also of shadowing HierarchicalAllocatorProcess::roles by a local.
> 
> 
> Diffs
> -
> 
>   src/master/allocator/mesos/hierarchical.cpp 
> 875cfcd091f5f58afb89e752da5100a75828dd67 
> 
> 
> Diff: https://reviews.apache.org/r/70626/diff/2/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Andrei Sekretenko
> 
>