Re: Review Request 65368: Updated maintenance schedule logic.

2018-01-26 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65368/#review196386
---



PASS: Mesos patch 65368 was successfully built and tested.

Reviews applied: `['65366', '65367', '65368']`

All the build artifacts available at: 
http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/65368

- Mesos Reviewbot Windows


On Jan. 27, 2018, 1:40 a.m., Joseph Wu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65368/
> ---
> 
> (Updated Jan. 27, 2018, 1:40 a.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Bugs: MESOS-7882
> https://issues.apache.org/jira/browse/MESOS-7882
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This moves most of the conditional logic of maintenance schedules
> into the master's `updateUnavailability` method, while the HTTP
> handler will only loop through all MachineIDs once.
> 
> The `updateUnavailability` method will now check if the given
> Unavailability has changed before it updates the allocator.  This will
> prevent a maintenance schedule from rescinding offers unrelated to the
> maintenance schedule.
> 
> 
> Diffs
> -
> 
>   src/master/http.cpp c489b6f525f157811549b2cc84addd9d85e87990 
>   src/master/master.cpp b97ebae6ebfd8ae0f73e617d0c55e140b9c3fce7 
> 
> 
> Diff: https://reviews.apache.org/r/65368/diff/1/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Joseph Wu
> 
>



Re: Review Request 65367: Added equality operator for Unavailability protobuf.

2018-01-26 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65367/#review196385
---


Fix it, then Ship it!





include/mesos/type_utils.hpp
Lines 234-249 (patched)


Should we try to start using 
https://developers.google.com/protocol-buffers/docs/reference/cpp/google.protobuf.util.message_differencer?

Commented on: https://issues.apache.org/jira/browse/MESOS-2487


- Benjamin Mahler


On Jan. 27, 2018, 1:40 a.m., Joseph Wu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65367/
> ---
> 
> (Updated Jan. 27, 2018, 1:40 a.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Bugs: MESOS-7882
> https://issues.apache.org/jira/browse/MESOS-7882
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This will be used to compare parts of maintenance schedules for changes.
> 
> 
> Diffs
> -
> 
>   include/mesos/type_utils.hpp af2b187b9b59552e4ba515ad640fd4419eaf5075 
>   include/mesos/v1/mesos.hpp d4c354ab596a6ea361f2fe45afa46089f8c1a543 
> 
> 
> Diff: https://reviews.apache.org/r/65367/diff/1/
> 
> 
> Testing
> ---
> 
> See next review.
> 
> 
> Thanks,
> 
> Joseph Wu
> 
>



Re: Review Request 65368: Updated maintenance schedule logic.

2018-01-26 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65368/#review196384
---



PASS: Mesos patch 65368 was successfully built and tested.

Reviews applied: `['65366', '65367', '65368']`

All the build artifacts available at: 
http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/65368

- Mesos Reviewbot Windows


On Jan. 27, 2018, 1:40 a.m., Joseph Wu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65368/
> ---
> 
> (Updated Jan. 27, 2018, 1:40 a.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Bugs: MESOS-7882
> https://issues.apache.org/jira/browse/MESOS-7882
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This moves most of the conditional logic of maintenance schedules
> into the master's `updateUnavailability` method, while the HTTP
> handler will only loop through all MachineIDs once.
> 
> The `updateUnavailability` method will now check if the given
> Unavailability has changed before it updates the allocator.  This will
> prevent a maintenance schedule from rescinding offers unrelated to the
> maintenance schedule.
> 
> 
> Diffs
> -
> 
>   src/master/http.cpp c489b6f525f157811549b2cc84addd9d85e87990 
>   src/master/master.cpp b97ebae6ebfd8ae0f73e617d0c55e140b9c3fce7 
> 
> 
> Diff: https://reviews.apache.org/r/65368/diff/1/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Joseph Wu
> 
>



Re: Review Request 65346: Updated Web UI to show quota guarantee and limit for roles.

2018-01-26 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65346/#review196383
---



I wonder if we should split out the allocation / guarantee / limit into their 
own columns? E.g.

https://docs.google.com/spreadsheets/d/1ttUmo39T30TZpz7_EVfe5jVnOkDK-s_ZjxaEyimlqHc/edit?usp=sharing
 (there are 2 sheets)

- Benjamin Mahler


On Jan. 27, 2018, 2:23 a.m., Armand Grillet wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65346/
> ---
> 
> (Updated Jan. 27, 2018, 2:23 a.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, Kapil Arya, Michael Park, and Meng 
> Zhu.
> 
> 
> Bugs: MESOS-8486
> https://issues.apache.org/jira/browse/MESOS-8486
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> See summary.
> 
> 
> Diffs
> -
> 
>   src/webui/master/static/roles.html 7578852e4e2ad29a0f07e16dcff85af6a2255b02 
> 
> 
> Diff: https://reviews.apache.org/r/65346/diff/1/
> 
> 
> Testing
> ---
> 
> New UI:
> ![New roles row](https://i.imgur.com/5mq31Ls.png)
> 
> 
> Thanks,
> 
> Armand Grillet
> 
>



Re: Review Request 65351: Reordered roles in Web UI to display them alphabetically by default.

2018-01-26 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65351/#review196382
---



Isn't this just doing reverse alphabetical?

z
z/z
z/b
z/a
b
a
a/b
a/a

Ideally we would have:

a
a/z
a/b
b
z
z/a
z/b
z/z

- Benjamin Mahler


On Jan. 26, 2018, 3:16 p.m., Armand Grillet wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65351/
> ---
> 
> (Updated Jan. 26, 2018, 3:16 p.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Bugs: MESOS-8495
> https://issues.apache.org/jira/browse/MESOS-8495
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Use the alphabetical order of the name of the roles by
> default to display their hierarchy straightforwardly.
> 
> 
> Diffs
> -
> 
>   src/webui/master/static/js/app.js 463c563cd121e881f7d137123534a047c5e420d2 
>   src/webui/master/static/roles.html 7578852e4e2ad29a0f07e16dcff85af6a2255b02 
> 
> 
> Diff: https://reviews.apache.org/r/65351/diff/1/
> 
> 
> Testing
> ---
> 
> ```
> $ bash mesos-master.sh --port='5061' --work_dir='/tmp/master1' 
> --webui_dir='/Users/Armand/Code/apache-mesos/src/webui' 
> --log_dir='/tmp/master1-log' --weights="example=42,example/a=2" 
> --registry=in_memory
> ```
> UI before patch when viewing the roles tab:
> ![Old UI](https://i.imgur.com/dflxj11.png)
> UI by default after patch:
> ![New UI](https://i.imgur.com/PRyAwNs.png)
> 
> 
> Thanks,
> 
> Armand Grillet
> 
>



Re: Review Request 65109: Fixed a bug relating to lingering executors.

2018-01-26 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65109/#review196381
---




src/slave/slave.cpp
Line 4652 (original), 4659 (patched)


Need this also in the subscribe case?


- Benjamin Mahler


On Jan. 26, 2018, 11:13 p.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65109/
> ---
> 
> (Updated Jan. 26, 2018, 11:13 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler and Vinod Kone.
> 
> 
> Bugs: MESOS-8411
> https://issues.apache.org/jira/browse/MESOS-8411
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> An executor should be shutdown if it has never got any tasks
> i.e. all of its initial tasks are killed before launching.
> See MESOS-8411.
> 
> This patch ensures this by checking an executor's various
> tasks queues during task kill and executor registration,
> and shutting down executors that had never received any tasks.
> 
> 
> Diffs
> -
> 
>   src/slave/constants.hpp e6cb7cc0ccdaaf981eb66defa21b38720f4e1de9 
>   src/slave/slave.hpp 09c01ebd1b5e8008ba9e7d412042f1db76a2c5a5 
>   src/slave/slave.cpp 43c7955237655accdf869db1b6494a35f77acdb3 
> 
> 
> Diff: https://reviews.apache.org/r/65109/diff/5/
> 
> 
> Testing
> ---
> 
> make check
> new tests in #65111
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 65109: Fixed a bug relating to lingering executors.

2018-01-26 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65109/#review195957
---



It would for posterity to highlight the race in question, and to also mention 
the failover case. Here's a suggestion:

```
An executor should be shutdown if all of its tasks are
killed while the executor is launching.

This patch fixes and issue where the executor is left
running when the task(s) get killed between the executor
registration/subscription and `Slave::___run()`. See
MESOS-8411 for more details. There is an additional race
in the agent failover case that is addressed in this patch.

The fix here is to fix the race by checking an executor's various
tasks queues during task kill and executor (re-)registration,
and shutting down executors that had never received any tasks.
```

One issue is that we probably need to be checking for SOURCE_SLAVE for the 
dropped case in status update?


src/slave/constants.hpp
Lines 90-94 (patched)


"correctly infer" here seems a little misleading, maybe we should mention 
the false positives?

How about:

// NOTE: This should be greater than zero because the agent looks
// for completed tasks to determine (with false positives) whether
// an executor ever received tasks. See MESOS-8411.
//
// TODO(mzhu): Remove this note once we determine whether an
// executor ever received tasks without looking through the
// completed tasks.



src/slave/slave.hpp
Lines 865 (patched)


whoops, extra newline here



src/slave/slave.hpp
Lines 866-885 (patched)


How about the following?

```
// Returns true if the agent ever sent any tasks to this executor.
// More precisely, this function returns whether either:
//
//  (1) There are terminated/completed tasks with a
//  SOURCE_EXECUTOR status.
//
//  (2) `launchedTasks` is not empty.
//
// If this function returns false and there are no queued tasks,
// we probably (see TODO below) have killed or dropped all of its
// initial tasks, in which case the agent will shut down the executor.
//
// TODO(mzhu): Note however, that since we look through the cache
// of completed tasks, we can have false positives when a task
// that was delivered to the executor has been evicted from the
// completed task cache by tasks that have been killed by the
// agent before delivery. This should be fixed.
//
// NOTE: This function checks whether any tasks has ever been sent,
// this does not necessarily mean the executor has ever received
// any tasks. Specifically, tasks in `launchedTasks` may be dropped
// before received by the executor if the agent restarts.
```



src/slave/slave.hpp
Lines 874-875 (patched)


Lost is a little unclear here, perhaps we should highlight the kill / drop 
cases?

```
  // If this function returns false and there are no queued tasks,
  // we must have killed or dropped all of its initial tasks, in
  // which case the agent will shut down the executor.
```



src/slave/slave.hpp
Lines 879 (patched)


Let's say "dropped" now that we've clarified "lost". "Lost" can mean many 
things whereas dropped now identifies a particular case of "lost" where we've 
dropped a message.



src/slave/slave.hpp
Lines 880 (patched)


agent restart, since agent failure is just a particular case



src/slave/slave.cpp
Lines 2820-2822 (patched)


Thanks for clarifying this!

How about:

```
// At this point, we must have either sent some tasks to the running
// executor or there are queued tasks that need to be delivered.
// Otherwise, the executor state would have been synchronously
// transitioned to TERMINATING when the queued tasks were killed.
```



src/slave/slave.cpp
Lines 3417-3421 (patched)


I was thinking we should also be doing this in the REGISTERING case above? 
After talking to vinod, I guess it's a bit complicated since we can't talk to 
the executor yet so we'd have to do the forceful shutdown. Probably just need a 
TODO in the registering case to consider shutting it down early.



src/slave/slave.cpp
Lines 3423-3424 (patched)


How about: "because it has never been sent a task and all of its queued 
tasks have been killed before delivery"?



src/slave/slave.cpp
Lines 4225-4232 (original), 4242-4243 

Review Request 65369: Added test to ensure v1 executor is shutdown upon initial task all-kill.

2018-01-26 Thread Meng Zhu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65369/
---

Review request for mesos, Benjamin Mahler and Vinod Kone.


Repository: mesos


Description
---

This test verifies that the v1 executor is shutdown if all of its
initial tasks could not be delivered, even after the executor has been
subscribed. See MESOS-8411.


Diffs
-

  src/tests/slave_tests.cpp 20b874481d3818574731fc30ba9df1fc2bcbe900 


Diff: https://reviews.apache.org/r/65369/diff/1/


Testing
---

make check


Thanks,

Meng Zhu



Review Request 65368: Updated maintenance schedule logic.

2018-01-26 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65368/
---

Review request for mesos and Benjamin Mahler.


Bugs: MESOS-7882
https://issues.apache.org/jira/browse/MESOS-7882


Repository: mesos


Description
---

This moves most of the conditional logic of maintenance schedules
into the master's `updateUnavailability` method, while the HTTP
handler will only loop through all MachineIDs once.

The `updateUnavailability` method will now check if the given
Unavailability has changed before it updates the allocator.  This will
prevent a maintenance schedule from rescinding offers unrelated to the
maintenance schedule.


Diffs
-

  src/master/http.cpp c489b6f525f157811549b2cc84addd9d85e87990 
  src/master/master.cpp b97ebae6ebfd8ae0f73e617d0c55e140b9c3fce7 


Diff: https://reviews.apache.org/r/65368/diff/1/


Testing
---

make check


Thanks,

Joseph Wu



Review Request 65367: Added equality operator for Unavailability protobuf.

2018-01-26 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65367/
---

Review request for mesos and Benjamin Mahler.


Bugs: MESOS-7882
https://issues.apache.org/jira/browse/MESOS-7882


Repository: mesos


Description
---

This will be used to compare parts of maintenance schedules for changes.


Diffs
-

  include/mesos/type_utils.hpp af2b187b9b59552e4ba515ad640fd4419eaf5075 
  include/mesos/v1/mesos.hpp d4c354ab596a6ea361f2fe45afa46089f8c1a543 


Diff: https://reviews.apache.org/r/65367/diff/1/


Testing
---

See next review.


Thanks,

Joseph Wu



Review Request 65366: Added test for maintenance-related offer rescinding behavior.

2018-01-26 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65366/
---

Review request for mesos and Benjamin Mahler.


Bugs: MESOS-7882
https://issues.apache.org/jira/browse/MESOS-7882


Repository: mesos


Description
---

This test checks that offers related to a maintenance schedule will
be rescinded when a relevant maintanence schedule is posted or cleared.
The test also checks that un-related offers are not affected.


Diffs
-

  src/tests/master_maintenance_tests.cpp 
2a0625873b4f9b6350c14e8304a8d6fbd6a45e6b 


Diff: https://reviews.apache.org/r/65366/diff/1/


Testing
---

make check

This test should fail until the chain is fully applied.


Thanks,

Joseph Wu



Re: Review Request 65315: Moved 'ObjectApproverAll' into a header for use in tests.

2018-01-26 Thread Gaston Kleiman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65315/#review196380
---




src/tests/mesos.hpp
Lines 3177 (patched)


Would `PermissiveObjectApprover` be a better name?


- Gaston Kleiman


On Jan. 24, 2018, 10:05 a.m., Greg Mann wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65315/
> ---
> 
> (Updated Jan. 24, 2018, 10:05 a.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, Gaston Kleiman, and Vinod Kone.
> 
> 
> Bugs: MESOS-8469
> https://issues.apache.org/jira/browse/MESOS-8469
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Moved 'ObjectApproverAll' into a header for use in tests.
> 
> 
> Diffs
> -
> 
>   src/tests/mesos.hpp a35c68e8645384f6244d17e37cad71373aba6893 
>   src/tests/mesos.cpp d751b2e9c635eb6a5039678de426467176cda908 
> 
> 
> Diff: https://reviews.apache.org/r/65315/diff/1/
> 
> 
> Testing
> ---
> 
> Testing details in the child review.
> 
> 
> Thanks,
> 
> Greg Mann
> 
>



Re: Review Request 65316: Added test for delayed authorization during operator events.

2018-01-26 Thread Gaston Kleiman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65316/#review196379
---


Fix it, then Ship it!





src/tests/api_tests.cpp
Lines 2630 (patched)


Remove this blank line.


- Gaston Kleiman


On Jan. 24, 2018, 12:04 p.m., Greg Mann wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65316/
> ---
> 
> (Updated Jan. 24, 2018, 12:04 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, Gaston Kleiman, and Vinod Kone.
> 
> 
> Bugs: MESOS-8469
> https://issues.apache.org/jira/browse/MESOS-8469
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Until the fix for MESOS-8469, it was possible for the master
> operator event stream to drop events, if event-related state in
> the master changed in between asynchronous calls.
> 
> This patch adds `MasterAPITest.EventAuthorizationDelayed` to
> verify the fix for that issue.
> 
> 
> Diffs
> -
> 
>   src/tests/api_tests.cpp bdacc30be4dc8656a41a0c47d0e350d48e59ad94 
> 
> 
> Diff: https://reviews.apache.org/r/65316/diff/3/
> 
> 
> Testing
> ---
> 
> Confirmed that the test fails without the fix for MESOS-8469.
> 
> When the fix is included, the test passes. Successfully ran 
> `bin/mesos-tests.sh 
> --gtest_filter="ContentType/MasterAPITest.EventAuthorizationDelayed*" 
> --gtest_repeat=1000 --gtest_break_on_failure` to rule out any flakiness.
> 
> 
> Thanks,
> 
> Greg Mann
> 
>



Re: Review Request 64618: Implemented the master's `ACKNOWLEDGE_OPERATION_STATUS` handler.

2018-01-26 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64618/#review196377
---



FAIL: Some of the unit tests failed. Please check the relevant logs.

Reviews applied: `['65357', '65358', '65359', '65360', '65361', '65300', 
'65362', '65363', '64618']`

Failed command: `Start-MesosCITesting`

All the build artifacts available at: 
http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/64618

Relevant logs:

- 
[mesos-tests-stdout.log](http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/64618/logs/mesos-tests-stdout.log):

```

[--] 1 test from IsolationFlag/CpuIsolatorTest
[ RUN  ] IsolationFlag/CpuIsolatorTest.ROOT_UserCpuUsage/0
[   OK ] IsolationFlag/CpuIsolatorTest.ROOT_UserCpuUsage/0 (3439 ms)
[--] 1 test from IsolationFlag/CpuIsolatorTest (3465 ms total)

[--] 1 test from IsolationFlag/MemoryIsolatorTest
[ RUN  ] IsolationFlag/MemoryIsolatorTest.ROOT_MemUsage/0
[   OK ] IsolationFlag/MemoryIsolatorTest.ROOT_MemUsage/0 (3568 ms)
[--] 1 test from IsolationFlag/MemoryIsolatorTest (3594 ms total)

[--] Global test environment tear-down
D:\DCOS\mesos\mesos\src\tests\environment.cpp(832): error: Failed
Tests completed with child processes remaining:
-+- 828 008E7AB9E47C
 |-+- 10360 008E7AB9E47C
 | --- 5824 008E7AB9E47C
 |-+- 11000 008E7AB9E47C
 | --- 9868 008E7AB9E47C
 -+- 11328 008E7AB9E47C
   --- 13696 008E7AB9E47C
[==] 849 tests from 85 test cases ran. (361732 ms total)
[  PASSED  ] 849 tests.
[  FAILED  ] 0 tests, listed below:

 0 FAILED TESTS

  YOU HAVE 213 DISABLED TESTS

```

- 
[mesos-tests-stderr.log](http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/64618/logs/mesos-tests-stderr.log):

```
I0126 23:48:48.659042  4904 master.cpp:10292] Updating the state of task 
58247121-ef5f-4488-b94a-e20ee1ab773b of framework 
08d9d427-72f2-4b58-8d63-ee25da06a3dc- (latest state: TASK_KILLED, status 
update state: TASK_KILLED)
I0126 23:48:48.659042 10084 slI0126 23:48:47.767930 10840 exec.cpp:162] 
Version: 1.6.0
I0126 23:48:47.793915  6644 exec.cpp:236] Executor registered on agent 
08d9d427-72f2-4b58-8d63-ee25da06a3dc-S0
I0126 23:48:47.796936  7844 executor.cpp:171] Received SUBSCRIBED event
I0126 23:48:47.801934  7844 executor.cpp:175] Subscribed executor on 
build-srv-03.zq4gs31qjdiunm1ryi1452nvnh.dx.internal.cloudapp.net
I0126 23:48:47.801934  7844 executor.cpp:171] Received LAUNCH event
I0126 23:48:47.806936  7844 executor.cpp:638] Starting task 
58247121-ef5f-4488-b94a-e20ee1ab773b
I0126 23:48:47.882922  7844 executor.cpp:478] Running 
'D:\DCOS\mesos\src\mesos-containerizer.exe launch '
I0126 23:48:48.633067  7844 executor.cpp:651] Forked command at 9200
I0126 23:48:48.661043 13576 exec.cpp:445] Executor asked to shutdown
I0126 23:48:48.661043  7844 executor.cpp:171] Received SHUTDOWN event
I0126 23:48:48.661043  7844 executor.cpp:748] Shutting down
I0126 23:48:48.662041  7844 executor.cpp:863] Sending SIGTERM to process tree 
at pid 9ave.cpp:3479] Shutting down framework 
08d9d427-72f2-4b58-8d63-ee25da06a3dc-
I0126 23:48:48.659042 10084 slave.cpp:6178] Shutting down executor 
'58247121-ef5f-4488-b94a-e20ee1ab773b' of framework 
08d9d427-72f2-4b58-8d63-ee25da06a3dc- at executor(1)@10.3.1.11:65166
I0126 23:48:48.660043 10084 slave.cpp:931] Agent terminating
W0126 23:48:48.660043 10084 slave.cpp:3475] Ignoring shutdown framework 
08d9d427-72f2-4b58-8d63-ee25da06a3dc- because it is terminating
I0126 23:48:48.662041  4904 master.cpp:10391] Removing task 
58247121-ef5f-4488-b94a-e20ee1ab773b with resources cpus(allocated: *):4; 
mem(allocated: *):2048; disk(allocated: *):1024; ports(allocated: 
*):[31000-32000] of framework 08d9d427-72f2-4b58-8d63-ee25da06a3dc- on 
agent 08d9d427-72f2-4b58-8d63-ee25da06a3dc-S0 at slave(329)@10.3.1.11:65144 
(build-srv-03.zq4gs31qjdiunm1ryi1452nvnh.dx.internal.cloudapp.net)
I0126 23:48:48.664042 10084 containerizer.cpp:2338] Destroying container 
ad7fdb26-819f-43f5-9b00-ee9c5daf6159 in RUNNING state
I0126 23:48:48.664042 10084 containerizer.cpp:2952] Transitioning the state of 
container ad7fdb26-819f-43f5-9b00-ee9c5daf6159 from RUNNING to DESTROYING
I0126 23:48:48.664042  4904 master.cpp:1307] Agent 
08d9d427-72f2-4b58-8d63-ee25da06a3dc-S0 at slave(329)@10.3.1.11:65144 
(build-srv-03.zq4gs31qjdiunm1ryi1452nvnh.dx.internal.cloudapp.net) disconnected
I0126 23:48:48.664042  4904 master.cpp:3277] Disconnecting agent 
08d9d427-72f2-4b58-8d63-ee25da06a3dc-S0 at slave(329)@10.3.1.11:65144 
(build-srv-03.zq4gs31qjdiunm1ryi1452nvnh.dx.internal.cloudapp.net)
I0126 23:48:48.664042  4904 master.cpp:3296] Deactivating agent 
08d9d427-72f2-4b58-8d63-ee25da06a3dc-S0 at slave(329)@10.3.1.11:65144 
(build-srv-03.zq4gs31qjdiunm1ryi1452nvnh.dx.internal.cloudapp.net)
I0126 23:48:48.665071 10280 

Re: Review Request 65111: Added tests to verify the executor is shutdown if no task is delivered.

2018-01-26 Thread Meng Zhu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65111/
---

(Updated Jan. 26, 2018, 3:21 p.m.)


Review request for mesos, Benjamin Mahler and Vinod Kone.


Changes
---

Added failover test. Will add http exec executor in another patch.


Bugs: MESOS-8411
https://issues.apache.org/jira/browse/MESOS-8411


Repository: mesos


Description
---

This test verifies that the executor is shutdown if all of its
initial tasks could not be delivered, even after the executor has been
registered. See MESOS-8411.


Diffs (updated)
-

  src/tests/mesos.hpp a35c68e8645384f6244d17e37cad71373aba6893 
  src/tests/mesos.cpp d751b2e9c635eb6a5039678de426467176cda908 
  src/tests/slave_tests.cpp 20b874481d3818574731fc30ba9df1fc2bcbe900 


Diff: https://reviews.apache.org/r/65111/diff/3/

Changes: https://reviews.apache.org/r/65111/diff/2-3/


Testing
---

make check


Thanks,

Meng Zhu



Re: Review Request 65109: Fixed a bug relating to lingering executors.

2018-01-26 Thread Meng Zhu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65109/
---

(Updated Jan. 26, 2018, 3:13 p.m.)


Review request for mesos, Benjamin Mahler and Vinod Kone.


Changes
---

Fixed a few issues exposed in the test. PTAL


Bugs: MESOS-8411
https://issues.apache.org/jira/browse/MESOS-8411


Repository: mesos


Description
---

An executor should be shutdown if it has never got any tasks
i.e. all of its initial tasks are killed before launching.
See MESOS-8411.

This patch ensures this by checking an executor's various
tasks queues during task kill and executor registration,
and shutting down executors that had never received any tasks.


Diffs (updated)
-

  src/slave/constants.hpp e6cb7cc0ccdaaf981eb66defa21b38720f4e1de9 
  src/slave/slave.hpp 09c01ebd1b5e8008ba9e7d412042f1db76a2c5a5 
  src/slave/slave.cpp 43c7955237655accdf869db1b6494a35f77acdb3 


Diff: https://reviews.apache.org/r/65109/diff/5/

Changes: https://reviews.apache.org/r/65109/diff/4-5/


Testing
---

make check
new tests in #65111


Thanks,

Meng Zhu



Re: Review Request 64618: Implemented the master's `ACKNOWLEDGE_OPERATION_STATUS` handler.

2018-01-26 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64618/#review196375
---



PASS: Mesos patch 64618 was successfully built and tested.

Reviews applied: `['65357', '65358', '65359', '65360', '65361', '65300', 
'65362', '65363', '64618']`

All the build artifacts available at: 
http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/64618

- Mesos Reviewbot Windows


On Jan. 26, 2018, 10:09 p.m., Gaston Kleiman wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64618/
> ---
> 
> (Updated Jan. 26, 2018, 10:09 p.m.)
> 
> 
> Review request for mesos and Greg Mann.
> 
> 
> Bugs: MESOS-8184
> https://issues.apache.org/jira/browse/MESOS-8184
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Implemented the master's `ACKNOWLEDGE_OPERATION_STATUS` handler.
> 
> 
> Diffs
> -
> 
>   src/master/http.cpp c489b6f525f157811549b2cc84addd9d85e87990 
>   src/master/master.cpp b97ebae6ebfd8ae0f73e617d0c55e140b9c3fce7 
> 
> 
> Diff: https://reviews.apache.org/r/64618/diff/3/
> 
> 
> Testing
> ---
> 
> `make check` on GNU/Linux
> 
> 
> Thanks,
> 
> Gaston Kleiman
> 
>



Re: Review Request 64618: Implemented the master's `ACKNOWLEDGE_OPERATION_STATUS` handler.

2018-01-26 Thread Gaston Kleiman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64618/
---

(Updated Jan. 26, 2018, 2:09 p.m.)


Review request for mesos and Greg Mann.


Bugs: MESOS-8184
https://issues.apache.org/jira/browse/MESOS-8184


Repository: mesos


Description
---

Implemented the master's `ACKNOWLEDGE_OPERATION_STATUS` handler.


Diffs (updated)
-

  src/master/http.cpp c489b6f525f157811549b2cc84addd9d85e87990 
  src/master/master.cpp b97ebae6ebfd8ae0f73e617d0c55e140b9c3fce7 


Diff: https://reviews.apache.org/r/64618/diff/3/

Changes: https://reviews.apache.org/r/64618/diff/2-3/


Testing
---

`make check` on GNU/Linux


Thanks,

Gaston Kleiman



Review Request 65363: Improved the validation of `ACKNOWLEDGE_OPERATION_STATUS` calls.

2018-01-26 Thread Gaston Kleiman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65363/
---

Review request for mesos and Greg Mann.


Bugs: MESOS-8184
https://issues.apache.org/jira/browse/MESOS-8184


Repository: mesos


Description
---

Improved the validation of `ACKNOWLEDGE_OPERATION_STATUS` calls.


Diffs
-

  src/master/validation.cpp f0b86775b7919ba6aa4a73038edb78a0adca68f4 


Diff: https://reviews.apache.org/r/65363/diff/1/


Testing
---

`make check` on GNU/Linux


Thanks,

Gaston Kleiman



Review Request 65357: Made devolving `v1::scheduler::Call::SUBSCRIBE` more robust.

2018-01-26 Thread Gaston Kleiman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65357/
---

Review request for mesos and Greg Mann.


Bugs: MESOS-8184
https://issues.apache.org/jira/browse/MESOS-8184


Repository: mesos


Description
---

`devolve(const v1::scheduler::Call& call)` is called by
`master::receive()` and by `Master::Http::scheduler()` before the input
is validated, so it should check if optional fields are set before
accessing them.


Diffs
-

  src/internal/devolve.cpp 1cfbdf2ab58cf64ca0947c2dbc8e7f244c912fbe 


Diff: https://reviews.apache.org/r/65357/diff/1/


Testing
---

`make check` on GNU/Linux


Thanks,

Gaston Kleiman



Review Request 65360: Made `master::receive()` drop `ACKNOWLEDGE_OPERATION_STATUS` calls.

2018-01-26 Thread Gaston Kleiman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65360/
---

Review request for mesos and Greg Mann.


Bugs: MESOS-8184
https://issues.apache.org/jira/browse/MESOS-8184


Repository: mesos


Description
---

`Master::receive()` handles scheduler API calls sent by v0 API clients.
`ACKNOWLEDGE_OPERATION_STATUS` is a v1-only API call, so
`Master::receive()` should drop it.


Diffs
-

  src/master/master.cpp b97ebae6ebfd8ae0f73e617d0c55e140b9c3fce7 


Diff: https://reviews.apache.org/r/65360/diff/1/


Testing
---

`make check` on GNU/Linux


Thanks,

Gaston Kleiman



Review Request 65361: Added special handling for devolving ACKNOWLEDGE_OPERATION_STATUS calls.

2018-01-26 Thread Gaston Kleiman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65361/
---

Review request for mesos and Greg Mann.


Bugs: MESOS-8184
https://issues.apache.org/jira/browse/MESOS-8184


Repository: mesos


Description
---

Added special handling for devolving ACKNOWLEDGE_OPERATION_STATUS calls.


Diffs
-

  src/internal/devolve.cpp 1cfbdf2ab58cf64ca0947c2dbc8e7f244c912fbe 


Diff: https://reviews.apache.org/r/65361/diff/1/


Testing
---

`make check` on GNU/Linux


Thanks,

Gaston Kleiman



Review Request 65362: Added a method to increment invalid scheduler API call counters.

2018-01-26 Thread Gaston Kleiman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65362/
---

Review request for mesos and Greg Mann.


Bugs: MESOS-8184
https://issues.apache.org/jira/browse/MESOS-8184


Repository: mesos


Description
---

Added a method to increment invalid scheduler API call counters.


Diffs
-

  src/master/http.cpp c489b6f525f157811549b2cc84addd9d85e87990 
  src/master/master.cpp b97ebae6ebfd8ae0f73e617d0c55e140b9c3fce7 
  src/master/metrics.hpp f701efec0a82d6ba72b2414f739fcd1cd7ee2491 
  src/master/metrics.cpp 64fc829ac3b58d95fc0bd074571a46518a80bbba 


Diff: https://reviews.apache.org/r/65362/diff/1/


Testing
---

`make check` on GNU/Linux


Thanks,

Gaston Kleiman



Review Request 65358: Removed unnecesssary validations from `Master::receive()`.

2018-01-26 Thread Gaston Kleiman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65358/
---

Review request for mesos and Greg Mann.


Bugs: MESOS-8184
https://issues.apache.org/jira/browse/MESOS-8184


Repository: mesos


Description
---

`validation::scheduler::call::validate(call)` is called at the top of
`master::receive()`, so it doesn't need to validate the calls again.


Diffs
-

  src/master/master.cpp b97ebae6ebfd8ae0f73e617d0c55e140b9c3fce7 


Diff: https://reviews.apache.org/r/65358/diff/1/


Testing
---

`make check` on GNU/Linux


Thanks,

Gaston Kleiman



Re: Review Request 65300: Added master metrics for operation status acknowledgments handling.

2018-01-26 Thread Gaston Kleiman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65300/
---

(Updated Jan. 26, 2018, 2:08 p.m.)


Review request for mesos and Greg Mann.


Bugs: MESOS-8184
https://issues.apache.org/jira/browse/MESOS-8184


Repository: mesos


Description
---

Added master metrics for operation status acknowledgments handling.


Diffs (updated)
-

  docs/monitoring.md 7ad15b4f821108083b5f9f4381c33d5f465d361b 
  src/master/metrics.hpp f701efec0a82d6ba72b2414f739fcd1cd7ee2491 
  src/master/metrics.cpp 64fc829ac3b58d95fc0bd074571a46518a80bbba 


Diff: https://reviews.apache.org/r/65300/diff/3/

Changes: https://reviews.apache.org/r/65300/diff/2-3/


Testing
---

`make check` on GNU/Linux


Thanks,

Gaston Kleiman



Review Request 65359: Added an extra CHECK to `Master::acknowledge()`.

2018-01-26 Thread Gaston Kleiman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65359/
---

Review request for mesos and Greg Mann.


Bugs: MESOS-8184
https://issues.apache.org/jira/browse/MESOS-8184


Repository: mesos


Description
---

The acknowledge call is validated before being passed to
`Master::acknowledge`, so it is safe to assume that `acknowledge.uuid()`
will be valid.

Added a CHECK to detect problems in the validation code or changes in
the code paths.


Diffs
-

  src/master/master.cpp b97ebae6ebfd8ae0f73e617d0c55e140b9c3fce7 


Diff: https://reviews.apache.org/r/65359/diff/1/


Testing
---

`make check` on GNU/Linux


Thanks,

Gaston Kleiman



Re: Review Request 65300: Added master metrics for operation status acknowledgments handling.

2018-01-26 Thread Gaston Kleiman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65300/
---

(Updated Jan. 26, 2018, 1:45 p.m.)


Review request for mesos and Greg Mann.


Changes
---

Addressed feedback.


Bugs: MESOS-8184
https://issues.apache.org/jira/browse/MESOS-8184


Repository: mesos


Description
---

Added master metrics for operation status acknowledgments handling.


Diffs (updated)
-

  docs/monitoring.md 7ad15b4f821108083b5f9f4381c33d5f465d361b 
  src/master/metrics.hpp f701efec0a82d6ba72b2414f739fcd1cd7ee2491 
  src/master/metrics.cpp 64fc829ac3b58d95fc0bd074571a46518a80bbba 


Diff: https://reviews.apache.org/r/65300/diff/2/

Changes: https://reviews.apache.org/r/65300/diff/1-2/


Testing
---

`make check` on GNU/Linux


Thanks,

Gaston Kleiman



Re: Review Request 65294: Add revocable resources support to mesos-execute.

2018-01-26 Thread James Peach


> On Jan. 25, 2018, 9:10 p.m., Anish Gupta wrote:
> > src/cli/execute.cpp
> > Lines 1077-1082 (patched)
> > 
> >
> > Why not unconditionally set it since resources are not revocable unless 
> > have RevocableInfo
> 
> Zhitao Li wrote:
> I feel that is unnecessary, and receiving resources which this utility 
> may not need is unnecessary behavior change.
> 
> Inferring this capability from input resources seems clean enough to me.

I agree that it is safe to unconditionally enable the `REVOCABLE_RESOURCES` 
capability. I'd prefer to do that than requiring the user to specify it in both 
the flags and the resources.


- James


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65294/#review196286
---


On Jan. 24, 2018, 9:51 p.m., Zhitao Li wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65294/
> ---
> 
> (Updated Jan. 24, 2018, 9:51 p.m.)
> 
> 
> Review request for mesos, Anish Gupta, Jason Lai, and James Peach.
> 
> 
> Bugs: MESOS-8471
> https://issues.apache.org/jira/browse/MESOS-8471
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This allows user to use `mesos-execute` to quickly test over subscription.
> 
> 
> Diffs
> -
> 
>   src/cli/execute.cpp 6e626893a64dc07ac9a5f9a5c31e680cb80938fe 
> 
> 
> Diff: https://reviews.apache.org/r/65294/diff/1/
> 
> 
> Testing
> ---
> 
> Submitted a test task with following configuration:
> ```
>  {
>"name": "Name",
>"task_id": {"value" : "1"},
>"agent_id": {"value" : ""},
>"resources": [
>  {
>"name": "cpus",
>"type": "SCALAR",
>"revocable": {},
>"scalar": {
>  "value": 0.1
>}
>  }
>],
>"command": {
>  "value": "sleep 1000"
>}
>  }
>  ```
>  
> With command `mesos-execute --master=localhost:5050 --revocable_resources 
> --task=file:///home/user/test_rev_task.json`
> 
> If the master/agent has revocable cpu, this allows the task to execute.
> 
> 
> Thanks,
> 
> Zhitao Li
> 
>



Re: Review Request 65044: Added the v1 API 'GET_OPERATIONS' call for master and agent.

2018-01-26 Thread Vinod Kone

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65044/#review196370
---



API changes look good to me. I haven't reviewed the tests though.


src/slave/http.cpp
Lines 1685 (patched)


Not yours, but can we make the logging consistent here and in master for 
operator api handlers? Looks like master doesn't even log here?


- Vinod Kone


On Jan. 25, 2018, 10:36 a.m., Jan Schlicht wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65044/
> ---
> 
> (Updated Jan. 25, 2018, 10:36 a.m.)
> 
> 
> Review request for mesos, Benjamin Bannier, Gaston Kleiman, Greg Mann, Jie 
> Yu, and Vinod Kone.
> 
> 
> Bugs: MESOS-8424
> https://issues.apache.org/jira/browse/MESOS-8424
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The 'GET_OPERATIONS' call lists all operations known to master or agent.
> 
> 
> Diffs
> -
> 
>   include/mesos/agent/agent.proto 315820041990d28e5cdae71c8385d4551d56558c 
>   include/mesos/master/master.proto 3e34634f3b864222d07374936c0d9a18269c2fbd 
>   include/mesos/v1/agent/agent.proto 9e8b49defb83ffd4dd240ebb3a09b2dd3610ab2a 
>   include/mesos/v1/master/master.proto 
> 6759c3004e7e04c0044360b0c54f4438c2f6ec8a 
>   src/master/http.cpp c489b6f525f157811549b2cc84addd9d85e87990 
>   src/master/master.hpp b434d2398b8815811345b6586ca586d2025cb2a2 
>   src/master/validation.cpp f0b86775b7919ba6aa4a73038edb78a0adca68f4 
>   src/slave/http.hpp 1619bb77e3e7d5a6d387e4d4bb071d83ce914967 
>   src/slave/http.cpp 77e711ceeb0e2613d629b5e21fd686f85dfad11a 
>   src/slave/validation.cpp 0c2ccda177734cf3c47c0346ed34d20d58e7d932 
>   src/tests/api_tests.cpp bdacc30be4dc8656a41a0c47d0e350d48e59ad94 
> 
> 
> Diff: https://reviews.apache.org/r/65044/diff/8/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Jan Schlicht
> 
>



Re: Review Request 65347: Added missing protobuf include.

2018-01-26 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65347/#review196368
---


Ship it!




Note that the reason why this doesn't compile in protobuf 3.0.x is due to how 
the c++ files are generated.  In protobuf 3.0.x (and 3.1.x and 3.2.x) generated 
code only includes the protobuf map headers if there is a map present in the 
.proto file:
https://github.com/google/protobuf/blob/3.0.x/src/google/protobuf/compiler/cpp/cpp_file.cc#L817-L827

>From 3.3.x onwards, all generated files include 
>`google/protobuf/generated_message_table_driven.h`, which in turn includes the 
>map headers:
https://github.com/google/protobuf/blob/3.3.x/src/google/protobuf/compiler/cpp/cpp_file.cc#L1006

In our case, the `common/protobuf_utils.hpp` includes `mesos/mesos.hpp`, which 
includes a generated protobuf.

- Joseph Wu


On Jan. 26, 2018, 3:41 a.m., Benno Evers wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65347/
> ---
> 
> (Updated Jan. 26, 2018, 3:41 a.m.)
> 
> 
> Review request for mesos.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> When using our bundled version of protobuf, this file is transitively
> included from other headers.
> 
> However, when building against olders versions of protobuf this fails,
> for example when building against the system-default protobuf 3.0
> on Ubuntu 17.04.
> 
> 
> Diffs
> -
> 
>   src/common/protobuf_utils.hpp 0ff7239dd6b8f74119fdd97531d94af21fd0500c 
> 
> 
> Diff: https://reviews.apache.org/r/65347/diff/1/
> 
> 
> Testing
> ---
> 
> Built without bundled protobuf on an Ubuntu system.
> 
> Started internal CI run: 
> https://jenkins.mesosphere.com/service/jenkins/job/mesos/job/Mesos_CI/2675/
> 
> 
> Thanks,
> 
> Benno Evers
> 
>



Re: Review Request 65354: Fix flakyness in MasterTest.RegistryGcByCount.

2018-01-26 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65354/#review196366
---



PASS: Mesos patch 65354 was successfully built and tested.

Reviews applied: `['65354']`

All the build artifacts available at: 
http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/65354

- Mesos Reviewbot Windows


On Jan. 26, 2018, 6:12 p.m., Benno Evers wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65354/
> ---
> 
> (Updated Jan. 26, 2018, 6:12 p.m.)
> 
> 
> Review request for mesos.
> 
> 
> Bugs: MESOS-8485
> https://issues.apache.org/jira/browse/MESOS-8485
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The test could fail on very slow hosts due to a timeout
> in the agent waiting for the registration and a subsequent
> re-sent `SlaveRegisteredMessage. See MESOS-8485.
> 
> 
> Diffs
> -
> 
>   src/tests/master_tests.cpp 7112bb4efcc312dc4f68bdc44fb685c7624002b1 
> 
> 
> Diff: https://reviews.apache.org/r/65354/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Benno Evers
> 
>



Re: Review Request 65354: Fix flakyness in MasterTest.RegistryGcByCount.

2018-01-26 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65354/#review196364
---



PASS: Mesos patch 65354 was successfully built and tested.

Reviews applied: `['65354']`

All the build artifacts available at: 
http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/65354

- Mesos Reviewbot Windows


On Jan. 26, 2018, 6:12 p.m., Benno Evers wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65354/
> ---
> 
> (Updated Jan. 26, 2018, 6:12 p.m.)
> 
> 
> Review request for mesos.
> 
> 
> Bugs: MESOS-8485
> https://issues.apache.org/jira/browse/MESOS-8485
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The test could fail on very slow hosts due to a timeout
> in the agent waiting for the registration and a subsequent
> re-sent `SlaveRegisteredMessage. See MESOS-8485.
> 
> 
> Diffs
> -
> 
>   src/tests/master_tests.cpp 7112bb4efcc312dc4f68bdc44fb685c7624002b1 
> 
> 
> Diff: https://reviews.apache.org/r/65354/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Benno Evers
> 
>



Re: Review Request 65354: Fix flakyness in MasterTest.RegistryGcByCount.

2018-01-26 Thread Benno Evers

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65354/
---

(Updated Jan. 26, 2018, 6:12 p.m.)


Review request for mesos.


Bugs: MESOS-8485
https://issues.apache.org/jira/browse/MESOS-8485


Repository: mesos


Description
---

The test could fail on very slow hosts due to a timeout
in the agent waiting for the registration and a subsequent
re-sent `SlaveRegisteredMessage. See MESOS-8485.


Diffs
-

  src/tests/master_tests.cpp 7112bb4efcc312dc4f68bdc44fb685c7624002b1 


Diff: https://reviews.apache.org/r/65354/diff/1/


Testing
---


Thanks,

Benno Evers



Review Request 65354: Fix flakyness in MasterTest.RegistryGcByCount.

2018-01-26 Thread Benno Evers

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65354/
---

Review request for mesos.


Repository: mesos


Description
---

The test could fail on very slow hosts due to a timeout
in the agent waiting for the registration and a subsequent
re-sent `SlaveRegisteredMessage. See MESOS-8485.


Diffs
-

  src/tests/master_tests.cpp 7112bb4efcc312dc4f68bdc44fb685c7624002b1 


Diff: https://reviews.apache.org/r/65354/diff/1/


Testing
---


Thanks,

Benno Evers



Re: Review Request 65351: Reordered roles in Web UI to display them alphabetically by default.

2018-01-26 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65351/#review196362
---



PASS: Mesos patch 65351 was successfully built and tested.

Reviews applied: `['65351']`

All the build artifacts available at: 
http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/65351

- Mesos Reviewbot Windows


On Jan. 26, 2018, 3:16 p.m., Armand Grillet wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65351/
> ---
> 
> (Updated Jan. 26, 2018, 3:16 p.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Bugs: MESOS-8495
> https://issues.apache.org/jira/browse/MESOS-8495
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Use the alphabetical order of the name of the roles by
> default to display their hierarchy straightforwardly.
> 
> 
> Diffs
> -
> 
>   src/webui/master/static/js/app.js 463c563cd121e881f7d137123534a047c5e420d2 
>   src/webui/master/static/roles.html 7578852e4e2ad29a0f07e16dcff85af6a2255b02 
> 
> 
> Diff: https://reviews.apache.org/r/65351/diff/1/
> 
> 
> Testing
> ---
> 
> ```
> $ bash mesos-master.sh --port='5061' --work_dir='/tmp/master1' 
> --webui_dir='/Users/Armand/Code/apache-mesos/src/webui' 
> --log_dir='/tmp/master1-log' --weights="example=42,example/a=2" 
> --registry=in_memory
> ```
> UI before patch when viewing the roles tab:
> ![Old UI](https://i.imgur.com/dflxj11.png)
> UI by default after patch:
> ![New UI](https://i.imgur.com/PRyAwNs.png)
> 
> 
> Thanks,
> 
> Armand Grillet
> 
>



Re: Review Request 63368: Added MemoryProfiler class to Libprocess.

2018-01-26 Thread Alexander Rukletsov

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63368/#review196340
---




3rdparty/libprocess/Makefile.am
Lines 213 (patched)


Please tabs



3rdparty/libprocess/Makefile.am
Lines 303 (patched)


Please tabs



3rdparty/libprocess/include/process/memory_profiler.hpp
Lines 36 (patched)


No need to use the FQN here.



3rdparty/libprocess/include/process/memory_profiler.hpp
Lines 51-52 (patched)


"How long"... what? Can you please be more verbose here?
Also you mean request parameters, can you say this explicitly?



3rdparty/libprocess/include/process/memory_profiler.hpp
Lines 57-58 (patched)


Blank comment line in-between, please



3rdparty/libprocess/include/process/memory_profiler.hpp
Lines 64-66 (patched)


Please write a leading comment explaining what these functions return, 
i.e., last _finished_ state.



3rdparty/libprocess/include/process/memory_profiler.hpp
Lines 67 (patched)


No need for process prefix, here and below.



3rdparty/libprocess/include/process/memory_profiler.hpp
Lines 81 (patched)


What kind of statistics is this? Is this related to the last / active 
state? Or is it accumulated across all states since process launch? Can you 
please elaborate?



3rdparty/libprocess/include/process/memory_profiler.hpp
Lines 104 (patched)


Let's call this guy `stopAndGenerateRawProfile()` for clarity.



3rdparty/libprocess/include/process/memory_profiler.hpp
Lines 124 (patched)


~~One shot one kill~~one line one declaration, please.

Any reason to keep these fellas as `Try`s?

Please also expand comments on these fields. For example, 
```
// A timestamp of a last successful raw dump.
Option rawId;
```



3rdparty/libprocess/include/process/memory_profiler.hpp
Lines 130 (patched)


Let's make it `const`. You will likely need a static or free function that 
read the env var.



3rdparty/libprocess/src/memory_profiler.cpp
Lines 52 (patched)


Please backtick type, variable, class, and function names in comments.



3rdparty/libprocess/src/memory_profiler.cpp
Lines 198-240 (patched)


Can we combine these two with in a `Result updateJemallocSetting(const 
char* name, const T& value)`?



3rdparty/libprocess/src/memory_profiler.cpp
Lines 245 (patched)


Here you convert `string` to `Path` in order to convert it back at a call 
site. I suggest to make `getRawProfilePath()` return `string.

Also, there is no need in leading slashes.

And since you already have a section for literals, maybe put file names 
there?



3rdparty/libprocess/src/memory_profiler.cpp
Lines 261-271 (patched)


It's not obvious that updating a setting triggers a dump. Can you please 
leave a note here?



3rdparty/libprocess/src/memory_profiler.cpp
Lines 275 (patched)


One shot one...



3rdparty/libprocess/src/memory_profiler.cpp
Lines 313 (patched)


Why newline?



3rdparty/libprocess/src/memory_profiler.cpp
Lines 326-330 (patched)


Formatting.



3rdparty/libprocess/src/memory_profiler.cpp
Lines 334 (patched)


Let's make it a lambda in `start()`



3rdparty/libprocess/src/memory_profiler.cpp
Lines 347 (patched)


Let's make it a lambda in `stop()`.



3rdparty/libprocess/src/memory_profiler.cpp
Lines 355-357 (patched)


For clarity, let's instead make it 
```
Option extractIdFromRequest(const process::http::Request& request)
```
Then at the call site you can say:
`extractIdFromRequest(r).getOrElse(fallback.getOrElse(0))`



3rdparty/libprocess/src/memory_profiler.cpp
Lines 361 (patched)


`strtol`?



3rdparty/libprocess/src/memory_profiler.cpp
Lines 372 (patched)

Re: Review Request 65294: Add revocable resources support to mesos-execute.

2018-01-26 Thread Zhitao Li


> On Jan. 25, 2018, 9:10 p.m., Anish Gupta wrote:
> > src/cli/execute.cpp
> > Lines 1077-1082 (patched)
> > 
> >
> > Why not unconditionally set it since resources are not revocable unless 
> > have RevocableInfo

I feel that is unnecessary, and receiving resources which this utility may not 
need is unnecessary behavior change.

Inferring this capability from input resources seems clean enough to me.


- Zhitao


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65294/#review196286
---


On Jan. 24, 2018, 9:51 p.m., Zhitao Li wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65294/
> ---
> 
> (Updated Jan. 24, 2018, 9:51 p.m.)
> 
> 
> Review request for mesos, Anish Gupta, Jason Lai, and James Peach.
> 
> 
> Bugs: MESOS-8471
> https://issues.apache.org/jira/browse/MESOS-8471
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This allows user to use `mesos-execute` to quickly test over subscription.
> 
> 
> Diffs
> -
> 
>   src/cli/execute.cpp 6e626893a64dc07ac9a5f9a5c31e680cb80938fe 
> 
> 
> Diff: https://reviews.apache.org/r/65294/diff/1/
> 
> 
> Testing
> ---
> 
> Submitted a test task with following configuration:
> ```
>  {
>"name": "Name",
>"task_id": {"value" : "1"},
>"agent_id": {"value" : ""},
>"resources": [
>  {
>"name": "cpus",
>"type": "SCALAR",
>"revocable": {},
>"scalar": {
>  "value": 0.1
>}
>  }
>],
>"command": {
>  "value": "sleep 1000"
>}
>  }
>  ```
>  
> With command `mesos-execute --master=localhost:5050 --revocable_resources 
> --task=file:///home/user/test_rev_task.json`
> 
> If the master/agent has revocable cpu, this allows the task to execute.
> 
> 
> Thanks,
> 
> Zhitao Li
> 
>



Re: Review Request 65351: Reordered roles in Web UI to display them alphabetically by default.

2018-01-26 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65351/#review196357
---



PASS: Mesos patch 65351 was successfully built and tested.

Reviews applied: `['65351']`

All the build artifacts available at: 
http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/65351

- Mesos Reviewbot Windows


On Jan. 26, 2018, 3:16 p.m., Armand Grillet wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65351/
> ---
> 
> (Updated Jan. 26, 2018, 3:16 p.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Bugs: MESOS-8495
> https://issues.apache.org/jira/browse/MESOS-8495
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Use the alphabetical order of the name of the roles by
> default to display their hierarchy straightforwardly.
> 
> 
> Diffs
> -
> 
>   src/webui/master/static/js/app.js 463c563cd121e881f7d137123534a047c5e420d2 
>   src/webui/master/static/roles.html 7578852e4e2ad29a0f07e16dcff85af6a2255b02 
> 
> 
> Diff: https://reviews.apache.org/r/65351/diff/1/
> 
> 
> Testing
> ---
> 
> ```
> $ bash mesos-master.sh --port='5061' --work_dir='/tmp/master1' 
> --webui_dir='/Users/Armand/Code/apache-mesos/src/webui' 
> --log_dir='/tmp/master1-log' --weights="example=42,example/a=2" 
> --registry=in_memory
> ```
> UI before patch when viewing the roles tab:
> ![Old UI](https://i.imgur.com/dflxj11.png)
> UI by default after patch:
> ![New UI](https://i.imgur.com/PRyAwNs.png)
> 
> 
> Thanks,
> 
> Armand Grillet
> 
>



Re: Review Request 65350: Modified `cgroups::prepare` to check nested cgroups support only once.

2018-01-26 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65350/#review196354
---



PASS: Mesos patch 65350 was successfully built and tested.

Reviews applied: `['65350']`

All the build artifacts available at: 
http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/65350

- Mesos Reviewbot Windows


On Jan. 26, 2018, 2:17 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65350/
> ---
> 
> (Updated Jan. 26, 2018, 2:17 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Gilbert Song, and Jie Yu.
> 
> 
> Bugs: MESOS-8489
> https://issues.apache.org/jira/browse/MESOS-8489
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch modifies `cgroup::prepare` function to save the results of
> a test for a nested cgroups support in a hashmap to reuse it in the
> future.
> 
> Besides optimization, this patch fixes flaky
> `LinuxCapabilitiesIsolatorFlagsTest.ROOT_IsolatorFlags` test. There
> was a race condition between two starting agents, where the first
> agent could detect `test` orphaned container while iterating over
> `/sys/fs/cgroup/freezer/mesos/` cgroup, because the second agent
> is calling `cgroup::prepare` during its initialization. This patch
> fixes the test, because it won't create a `test` cgroup in
> `cgroups::prepare` as this test has already been passed during the
> initialization of the first agent and the result is stored in hashmap.
> 
> 
> Diffs
> -
> 
>   src/linux/cgroups.cpp cdd0f40ec5748b909e529f21b15523122bf90d41 
> 
> 
> Diff: https://reviews.apache.org/r/65350/diff/1/
> 
> 
> Testing
> ---
> 
> sudo make check (Fedora 25)
> internal CI
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Review Request 65351: Reordered roles in Web UI to display them alphabetically by default.

2018-01-26 Thread Armand Grillet

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65351/
---

Review request for mesos and Benjamin Mahler.


Bugs: MESOS-8495
https://issues.apache.org/jira/browse/MESOS-8495


Repository: mesos


Description
---

Use the alphabetical order of the name of the roles by
default to display their hierarchy straightforwardly.


Diffs
-

  src/webui/master/static/js/app.js 463c563cd121e881f7d137123534a047c5e420d2 
  src/webui/master/static/roles.html 7578852e4e2ad29a0f07e16dcff85af6a2255b02 


Diff: https://reviews.apache.org/r/65351/diff/1/


Testing
---

```
$ bash mesos-master.sh --port='5061' --work_dir='/tmp/master1' 
--webui_dir='/Users/Armand/Code/apache-mesos/src/webui' 
--log_dir='/tmp/master1-log' --weights="example=42,example/a=2" 
--registry=in_memory
```
UI before patch when viewing the roles tab:
![Old UI](https://i.imgur.com/dflxj11.png)
UI by default after patch:
![New UI](https://i.imgur.com/PRyAwNs.png)


Thanks,

Armand Grillet



Re: Review Request 65350: Modified `cgroups::prepare` to check nested cgroups support only once.

2018-01-26 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65350/#review196349
---



FAIL: Failed to get dependent review IDs for the current patch.

Failed command: `python.exe 
C:\Users\mesos\AppData\Local\Temp\tmp2C66.tmp\Mesos\utils\get-review-ids.py -r 
65350 -o C:\Users\mesos\AppData\Local\Temp\mesos_dependent_review_ids`

All the build artifacts available at: 
http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/65350

Relevant logs:

- 
[get-review-ids-stderr.log](http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/65350/logs/get-review-ids-stderr.log):

```
C:\Python27\python.exe: can't open file 
'C:\Users\mesos\AppData\Local\Temp\tmp2C66.tmp\Mesos\utils\get-review-ids.py': 
[Errno 2] No such file or directory
```

- Mesos Reviewbot Windows


On Jan. 26, 2018, 2:17 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65350/
> ---
> 
> (Updated Jan. 26, 2018, 2:17 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Gilbert Song, and Jie Yu.
> 
> 
> Bugs: MESOS-8489
> https://issues.apache.org/jira/browse/MESOS-8489
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch modifies `cgroup::prepare` function to save the results of
> a test for a nested cgroups support in a hashmap to reuse it in the
> future.
> 
> Besides optimization, this patch fixes flaky
> `LinuxCapabilitiesIsolatorFlagsTest.ROOT_IsolatorFlags` test. There
> was a race condition between two starting agents, where the first
> agent could detect `test` orphaned container while iterating over
> `/sys/fs/cgroup/freezer/mesos/` cgroup, because the second agent
> is calling `cgroup::prepare` during its initialization. This patch
> fixes the test, because it won't create a `test` cgroup in
> `cgroups::prepare` as this test has already been passed during the
> initialization of the first agent and the result is stored in hashmap.
> 
> 
> Diffs
> -
> 
>   src/linux/cgroups.cpp cdd0f40ec5748b909e529f21b15523122bf90d41 
> 
> 
> Diff: https://reviews.apache.org/r/65350/diff/1/
> 
> 
> Testing
> ---
> 
> sudo make check (Fedora 25)
> internal CI
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 65348: Fixed flakyness of MasterTest.UpdateSlaveMessageWithPendingOffers.

2018-01-26 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65348/#review196343
---



PASS: Mesos patch 65348 was successfully built and tested.

Reviews applied: `['65348']`

All the build artifacts available at: 
http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/65348

- Mesos Reviewbot Windows


On Jan. 26, 2018, 1:31 p.m., Jan Schlicht wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65348/
> ---
> 
> (Updated Jan. 26, 2018, 1:31 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov and Benjamin Bannier.
> 
> 
> Bugs: MESOS-8490
> https://issues.apache.org/jira/browse/MESOS-8490
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Under rare circumstances, the clock would have to be advanced to trigger
> a new allocation cycle after 'UPDATE_SLAVE' has been handled in the
> master. We also wait for old offers to be rescinded before waiting for
> updated offers as sometimes the old offer could be received erroneously.
> 
> 
> Diffs
> -
> 
>   src/tests/master_tests.cpp 7112bb4efcc312dc4f68bdc44fb685c7624002b1 
> 
> 
> Diff: https://reviews.apache.org/r/65348/diff/2/
> 
> 
> Testing
> ---
> 
> make check
> 
> `src/mesos-tests 
> --gtest_filter=MasterTest.UpdateSlaveMessageWithPendingOffers 
> --gtest_repeat=1 --gtest_break_on_failure` while running `stress -c 4 -i 
> 8 -m 2 -d 6` on a 4 core machine. It usually takes a few thousand iterations 
> to trigger the flakyness in the old code. And only when the host is under 
> heavy load.
> 
> 
> Thanks,
> 
> Jan Schlicht
> 
>



Re: Review Request 63369: Added new '/state' and '/statistics' endpoints to the MemoryProfiler.

2018-01-26 Thread Benno Evers

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63369/
---

(Updated Jan. 26, 2018, 2:18 p.m.)


Review request for mesos, Alexander Rukletsov and Benjamin Mahler.


Repository: mesos


Description
---

These endpoints provide useful information for both developers
and end users.


Diffs
-

  3rdparty/libprocess/include/process/memory_profiler.hpp PRE-CREATION 
  3rdparty/libprocess/src/memory_profiler.cpp PRE-CREATION 


Diff: https://reviews.apache.org/r/63369/diff/1/


Testing
---


Thanks,

Benno Evers



Review Request 65350: Modified `cgroups::prepare` to check nested cgroups support only once.

2018-01-26 Thread Andrei Budnik

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65350/
---

Review request for mesos, Alexander Rukletsov, Gilbert Song, and Jie Yu.


Bugs: MESOS-8489
https://issues.apache.org/jira/browse/MESOS-8489


Repository: mesos


Description
---

This patch modifies `cgroup::prepare` function to save the results of
a test for a nested cgroups support in a hashmap to reuse it in the
future.

Besides optimization, this patch fixes flaky
`LinuxCapabilitiesIsolatorFlagsTest.ROOT_IsolatorFlags` test. There
was a race condition between two starting agents, where the first
agent could detect `test` orphaned container while iterating over
`/sys/fs/cgroup/freezer/mesos/` cgroup, because the second agent
is calling `cgroup::prepare` during its initialization. This patch
fixes the test, because it won't create a `test` cgroup in
`cgroups::prepare` as this test has already been passed during the
initialization of the first agent and the result is stored in hashmap.


Diffs
-

  src/linux/cgroups.cpp cdd0f40ec5748b909e529f21b15523122bf90d41 


Diff: https://reviews.apache.org/r/65350/diff/1/


Testing
---

sudo make check (Fedora 25)
internal CI


Thanks,

Andrei Budnik



Re: Review Request 65348: Fixed flakyness of MasterTest.UpdateSlaveMessageWithPendingOffers.

2018-01-26 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65348/#review196339
---



PASS: Mesos patch 65348 was successfully built and tested.

Reviews applied: `['65348']`

All the build artifacts available at: 
http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/65348

- Mesos Reviewbot Windows


On Jan. 26, 2018, 1:31 p.m., Jan Schlicht wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65348/
> ---
> 
> (Updated Jan. 26, 2018, 1:31 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov and Benjamin Bannier.
> 
> 
> Bugs: MESOS-8490
> https://issues.apache.org/jira/browse/MESOS-8490
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Under rare circumstances, the clock would have to be advanced to trigger
> a new allocation cycle after 'UPDATE_SLAVE' has been handled in the
> master. We also wait for old offers to be rescinded before waiting for
> updated offers as sometimes the old offer could be received erroneously.
> 
> 
> Diffs
> -
> 
>   src/tests/master_tests.cpp 7112bb4efcc312dc4f68bdc44fb685c7624002b1 
> 
> 
> Diff: https://reviews.apache.org/r/65348/diff/2/
> 
> 
> Testing
> ---
> 
> make check
> 
> `src/mesos-tests 
> --gtest_filter=MasterTest.UpdateSlaveMessageWithPendingOffers 
> --gtest_repeat=1 --gtest_break_on_failure` while running `stress -c 4 -i 
> 8 -m 2 -d 6` on a 4 core machine. It usually takes a few thousand iterations 
> to trigger the flakyness in the old code. And only when the host is under 
> heavy load.
> 
> 
> Thanks,
> 
> Jan Schlicht
> 
>



Re: Review Request 65348: Fixed flakyness of MasterTest.UpdateSlaveMessageWithPendingOffers.

2018-01-26 Thread Jan Schlicht

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65348/
---

(Updated Jan. 26, 2018, 2:31 p.m.)


Review request for mesos, Alexander Rukletsov and Benjamin Bannier.


Changes
---

Removed TODO.


Bugs: MESOS-8490
https://issues.apache.org/jira/browse/MESOS-8490


Repository: mesos


Description (updated)
---

Under rare circumstances, the clock would have to be advances to trigger
a new allocation cycle after 'UPDATE_SLAVE' has been handled in the
master. We also wait for old offers to be rescinded before waiting for
updated offers as sometimes the old offer could be received erroneously.


Diffs (updated)
-

  src/tests/master_tests.cpp 7112bb4efcc312dc4f68bdc44fb685c7624002b1 


Diff: https://reviews.apache.org/r/65348/diff/2/

Changes: https://reviews.apache.org/r/65348/diff/1-2/


Testing
---

make check

`src/mesos-tests --gtest_filter=MasterTest.UpdateSlaveMessageWithPendingOffers 
--gtest_repeat=1 --gtest_break_on_failure` while running `stress -c 4 -i 8 
-m 2 -d 6` on a 4 core machine. It usually takes a few thousand iterations to 
trigger the flakyness in the old code. And only when the host is under heavy 
load.


Thanks,

Jan Schlicht



***UNCHECKED*** Re: Review Request 65348: Fixed flakyness of MasterTest.UpdateSlaveMessageWithPendingOffers.

2018-01-26 Thread Alexander Rukletsov

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65348/#review196332
---




src/tests/master_tests.cpp
Lines 8615-8616 (original), 8615-8616 (patched)


Remove the comment as well!


- Alexander Rukletsov


On Jan. 26, 2018, 12:55 p.m., Jan Schlicht wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65348/
> ---
> 
> (Updated Jan. 26, 2018, 12:55 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov and Benjamin Bannier.
> 
> 
> Bugs: MESOS-8490
> https://issues.apache.org/jira/browse/MESOS-8490
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Under rare circumstances, the clock would have to be advanced to trigger
> a new allocation cycle after 'UPDATE_SLAVE' has been handled in the
> master. We also wait for old offers to be rescinded before waiting for
> updated offers as sometimes the old offer could be received erroneously.
> 
> 
> Diffs
> -
> 
>   src/tests/master_tests.cpp 4a284c4e9c26b39d1e88b405f8b904f43b902b02 
> 
> 
> Diff: https://reviews.apache.org/r/65348/diff/1/
> 
> 
> Testing
> ---
> 
> make check
> 
> `src/mesos-tests 
> --gtest_filter=MasterTest.UpdateSlaveMessageWithPendingOffers 
> --gtest_repeat=1 --gtest_break_on_failure` while running `stress -c 4 -i 
> 8 -m 2 -d 6` on a 4 core machine. It usually takes a few thousand iterations 
> to trigger the flakyness in the old code. And only when the host is under 
> heavy load.
> 
> 
> Thanks,
> 
> Jan Schlicht
> 
>



Re: Review Request 65347: Added missing protobuf include.

2018-01-26 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65347/#review196330
---



PASS: Mesos patch 65347 was successfully built and tested.

Reviews applied: `['65347']`

All the build artifacts available at: 
http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/65347

- Mesos Reviewbot Windows


On Jan. 26, 2018, 11:41 a.m., Benno Evers wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65347/
> ---
> 
> (Updated Jan. 26, 2018, 11:41 a.m.)
> 
> 
> Review request for mesos.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> When using our bundled version of protobuf, this file is transitively
> included from other headers.
> 
> However, when building against olders versions of protobuf this fails,
> for example when building against the system-default protobuf 3.0
> on Ubuntu 17.04.
> 
> 
> Diffs
> -
> 
>   src/common/protobuf_utils.hpp 0ff7239dd6b8f74119fdd97531d94af21fd0500c 
> 
> 
> Diff: https://reviews.apache.org/r/65347/diff/1/
> 
> 
> Testing
> ---
> 
> Built without bundled protobuf on an Ubuntu system.
> 
> Started internal CI run: 
> https://jenkins.mesosphere.com/service/jenkins/job/mesos/job/Mesos_CI/2675/
> 
> 
> Thanks,
> 
> Benno Evers
> 
>



Re: Review Request 65346: Updated Web UI to show quota guarantee and limit for roles.

2018-01-26 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65346/#review196329
---



PASS: Mesos patch 65346 was successfully built and tested.

Reviews applied: `['65334', '65346']`

All the build artifacts available at: 
http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/65346

- Mesos Reviewbot Windows


On Jan. 26, 2018, 11:58 a.m., Armand Grillet wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65346/
> ---
> 
> (Updated Jan. 26, 2018, 11:58 a.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Bugs: MESOS-8486
> https://issues.apache.org/jira/browse/MESOS-8486
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> See summary.
> 
> 
> Diffs
> -
> 
>   src/webui/master/static/roles.html 7578852e4e2ad29a0f07e16dcff85af6a2255b02 
> 
> 
> Diff: https://reviews.apache.org/r/65346/diff/1/
> 
> 
> Testing
> ---
> 
> New UI:
> ![New roles row](https://i.imgur.com/5mq31Ls.png)
> 
> 
> Thanks,
> 
> Armand Grillet
> 
>



Review Request 65348: Fixed flakyness of MasterTest.UpdateSlaveMessageWithPendingOffers.

2018-01-26 Thread Jan Schlicht

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65348/
---

Review request for mesos, Alexander Rukletsov and Benjamin Bannier.


Bugs: MESOS-8490
https://issues.apache.org/jira/browse/MESOS-8490


Repository: mesos


Description
---

Under rare circumstances, the clock would have to be advanced to trigger
a new allocation cycle after 'UPDATE_SLAVE' has been handled in the
master. We also wait for old offers to be rescinded before waiting for
updated offers as sometimes the old offer could be received erroneously.


Diffs
-

  src/tests/master_tests.cpp 4a284c4e9c26b39d1e88b405f8b904f43b902b02 


Diff: https://reviews.apache.org/r/65348/diff/1/


Testing
---

make check

`src/mesos-tests --gtest_filter=MasterTest.UpdateSlaveMessageWithPendingOffers 
--gtest_repeat=1 --gtest_break_on_failure` while running `stress -c 4 -i 8 
-m 2 -d 6` on a 4 core machine. It usually takes a few thousand iterations to 
trigger the flakyness in the old code. And only when the host is under heavy 
load.


Thanks,

Jan Schlicht



Re: Review Request 65347: Added missing protobuf include.

2018-01-26 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65347/#review196326
---



PASS: Mesos patch 65347 was successfully built and tested.

Reviews applied: `['65347']`

All the build artifacts available at: 
http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/65347

- Mesos Reviewbot Windows


On Jan. 26, 2018, 11:41 a.m., Benno Evers wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65347/
> ---
> 
> (Updated Jan. 26, 2018, 11:41 a.m.)
> 
> 
> Review request for mesos.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> When using our bundled version of protobuf, this file is transitively
> included from other headers.
> 
> However, when building against olders versions of protobuf this fails,
> for example when building against the system-default protobuf 3.0
> on Ubuntu 17.04.
> 
> 
> Diffs
> -
> 
>   src/common/protobuf_utils.hpp 0ff7239dd6b8f74119fdd97531d94af21fd0500c 
> 
> 
> Diff: https://reviews.apache.org/r/65347/diff/1/
> 
> 
> Testing
> ---
> 
> Built without bundled protobuf on an Ubuntu system.
> 
> Started internal CI run: 
> https://jenkins.mesosphere.com/service/jenkins/job/mesos/job/Mesos_CI/2675/
> 
> 
> Thanks,
> 
> Benno Evers
> 
>



Review Request 65346: Updated Web UI to show quota guarantee and limit for roles.

2018-01-26 Thread Armand Grillet

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65346/
---

Review request for mesos and Benjamin Mahler.


Bugs: MESOS-8486
https://issues.apache.org/jira/browse/MESOS-8486


Repository: mesos


Description
---

See summary.


Diffs
-

  src/webui/master/static/roles.html 7578852e4e2ad29a0f07e16dcff85af6a2255b02 


Diff: https://reviews.apache.org/r/65346/diff/1/


Testing
---

New UI:
![New roles row](https://i.imgur.com/5mq31Ls.png)


Thanks,

Armand Grillet



Review Request 65347: Added missing protobuf include.

2018-01-26 Thread Benno Evers

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65347/
---

Review request for mesos.


Repository: mesos


Description
---

When using our bundled version of protobuf, this file is transitively
included from other headers.

However, when building against olders versions of protobuf this fails,
for example when building against the system-default protobuf 3.0
on Ubuntu 17.04.


Diffs
-

  src/common/protobuf_utils.hpp 0ff7239dd6b8f74119fdd97531d94af21fd0500c 


Diff: https://reviews.apache.org/r/65347/diff/1/


Testing
---

Built without bundled protobuf on an Ubuntu system.

Started internal CI run: 
https://jenkins.mesosphere.com/service/jenkins/job/mesos/job/Mesos_CI/2675/


Thanks,

Benno Evers



Re: Review Request 65344: Updated the docs for agent ping timeout flags.

2018-01-26 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65344/#review196313
---



FAIL: Some Mesos libprocess-tests failed.

Reviews applied: `['65344']`

Failed command: 
`D:\DCOS\mesos\3rdparty\libprocess\src\tests\Debug\libprocess-tests.exe`

All the build artifacts available at: 
http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/65344

Relevant logs:

- 
[libprocess-tests-stdout.log](http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/65344/logs/libprocess-tests-stdout.log):

```
[   OK ] Scheme/HTTPTest.WWWAuthenticateHeader/0 (407 ms)
[ RUN  ] Scheme/HTTPTest.WWWAuthenticateHeader/1
[   OK ] Scheme/HTTPTest.WWWAuthenticateHeader/1 (22 ms)
[ RUN  ] Scheme/HTTPTest.Accepts/0
[   OK ] Scheme/HTTPTest.Accepts/0 (181 ms)
[ RUN  ] Scheme/HTTPTest.Accepts/1
[   OK ] Scheme/HTTPTest.Accepts/1 (29 ms)
[--] 36 tests from Scheme/HTTPTest (6099 ms total)

[--] 4 tests from SSLVerifyIPAdd/SSLTest
[ RUN  ] SSLVerifyIPAdd/SSLTest.BasicSameProcess/0
[   OK ] SSLVerifyIPAdd/SSLTest.BasicSameProcess/0 (331 ms)
[ RUN  ] SSLVerifyIPAdd/SSLTest.BasicSameProcess/1
[   OK ] SSLVerifyIPAdd/SSLTest.BasicSameProcess/1 (361 ms)
[ RUN  ] SSLVerifyIPAdd/SSLTest.RequireCertificate/0
[   OK ] SSLVerifyIPAdd/SSLTest.RequireCertificate/0 (922 ms)
[ RUN  ] SSLVerifyIPAdd/SSLTest.RequireCertificate/1
[   OK ] SSLVerifyIPAdd/SSLTest.RequireCertificate/1 (1310 ms)
[--] 4 tests from SSLVerifyIPAdd/SSLTest (2928 ms total)

[--] Global test environment tear-down
[==] 225 tests from 35 test cases ran. (30383 ms total)
[  PASSED  ] 224 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] SSLTest.ProtocolMismatch

 1 FAILED TEST
  YOU HAVE 19 DISABLED TESTS

```

- 
[libprocess-tests-stderr.log](http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/65344/logs/libprocess-tests-stderr.log):

```
I0126 10:31:18.117390  5936 process.cpp:928] Stopped the socket accept loop
I0126 10:31:18.147389  6156 process.cpp:928] Stopped the socket accept loop
I0126 10:31:18.477361  5936 openssl.cpp:514] Will not verify peer certificate!
NOTE: Set LIBPROCESS_SSL_VERIFY_CERT=1 to enable peer certificate verification
I0126 10:31:18.477361  5936 openssl.cpp:534] LIBPROCESS_SSL_REQUIRE_CERT 
implies peer certificate verification.
LIBPROCESS_SSL_VERIFY_CERT set to true
I0126 10:31:18.477361  5936 openssl.cpp:563] Using CA file: 
C:\Users\mesos\AppData\Local\Temp\NGThno\cert.pem
I0126 10:31:18.477361  5936 openssl.cpp:566] Using CA dir: 
C:\Users\mesos\AppData\Local\Temp\NGThno
I0126 10:31:18.839330  5936 openssl.cpp:514] Will not verify peer certificate!
NOTE: Set LIBPROCESS_SSL_VERIFY_CERT=1 to enable peer certificate verification
I0126 10:31:18.840332  5936 openssl.cpp:526] Will use IP address verification 
in subject alternative name certificate extension.
I0126 10:31:18.840332  5936 openssl.cpp:534] LIBPROCESS_SSL_REQUIRE_CERT 
implies peer certificate verification.
LIBPROCESS_SSL_VERIFY_CERT set to true
I0126 10:31:18.840332  5936 openssl.cpp:563] Using CA file: 
C:\Users\mesos\AppData\Local\Temp\gW2qgS\cert.pem
I0126 10:31:18.840332  5936 openssl.cpp:566] Using CA dir: 
C:\Users\mesos\AppData\Local\Temp\gW2qgS
I0126 10:31:19.094310  5936 openssl.cpp:509] CA directory path unspecified! 
NOTE: Set CA directory path with LIBPROCESS_SSL_CA_DIR=
I0126 10:31:19.094310  5936 openssl.cpp:514] Will not verify peer certificate!
NOTE: Set LIBPROCESS_SSL_VERIFY_CERT=1 to enable peer certificate verification
I0126 10:31:19.094310  5936 openssl.cpp:534] LIBPROCESS_SSL_REQUIRE_CERT 
implies peer certificate verification.
LIBPROCESS_SSL_VERIFY_CERT set to true
I0126 10:31:19.094310  5936 openssl.cpp:563] Using CA file: 
C:\Users\mesos\AppData\Local\Temp\MZodAZ\cert.pem
I0126 10:31:20.382262  5936 openssl.cpp:509] CA directory path unspecified! 
NOTE: Set CA directory path with LIBPROCESS_SSL_CA_DIR=
I0126 10:31:20.382262  5936 openssl.cpp:514] Will not verify peer certificate!
NOTE: Set LIBPROCESS_SSL_VERIFY_CERT=1 to enable peer certificate verification
I0126 10:31:20.382262  5936 openssl.cpp:526] Will use IP address verification 
in subject alternative name certificate extension.
I0126 10:31:20.382262  5936 openssl.cpp:534] LIBPROCESS_SSL_REQUIRE_CERT 
implies peer certificate verification.
LIBPROCESS_SSL_VERIFY_CERT set to true
I0126 10:31:20.382262  5936 openssl.cpp:563] Using CA file: 
C:\Users\mesos\AppData\Local\Temp\wqNibD\cert.pem
I0126 10:31:21.107635  5572 process.cpp:928] Stopped the socket accept loop
```

- Mesos Reviewbot Windows


On Jan. 25, 2018, 9:18 p.m., Greg Mann wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65344/
>