Re: Review Request 61666: Added test to verify filtering of resource reservations & allocations.

2017-08-23 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61666/#review183705
---



Patch looks great!

Reviews applied: [60915, 61666]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker-build.sh

- Mesos Reviewbot


On Aug. 16, 2017, 1:59 a.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/61666/
> ---
> 
> (Updated Aug. 16, 2017, 1:59 a.m.)
> 
> 
> Review request for mesos, Alexander Rojas and Greg Mann.
> 
> 
> Bugs: MESOS-7892
> https://issues.apache.org/jira/browse/MESOS-7892
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch extends `SlaveAuthorizerTest.FilterStateEndpoint` test to
> check that agent's `/state` endpoint shows resource reservations and
> allocations based on the 'VIEW_ROLE' permissions of the principal
> doing the request.
> 
> 
> Diffs
> -
> 
>   src/tests/slave_authorization_tests.cpp 
> 536a8efda0cc1ba08285c379b01af6ec64a82117 
> 
> 
> Diff: https://reviews.apache.org/r/61666/diff/2/
> 
> 
> Testing
> ---
> 
> make check (mac os x, fedora 25)
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 61808: Add task env overwriting executor env glog print

2017-08-23 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61808/#review183701
---


Ship it!




Ship It!

- Jie Yu


On Aug. 22, 2017, 1:56 a.m., Andy Pang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/61808/
> ---
> 
> (Updated Aug. 22, 2017, 1:56 a.m.)
> 
> 
> Review request for mesos, Benjamin Mahler and Jie Yu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> When user assign environment by `CommandInfo` environment, it will overwrite 
> the same name environment get by before.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/docker.cpp b84c392 
> 
> 
> Diff: https://reviews.apache.org/r/61808/diff/1/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Andy Pang
> 
>



Re: Review Request 60766: Ignored containers that join CNI networks.

2017-08-23 Thread Qian Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60766/#review183693
---




src/slave/containerizer/mesos/isolators/network/ports.cpp
Line 406 (original), 451 (patched)


Can we put all nested containers in a list in the above `foreach` loop so 
that we do not have to go through all containers here?



src/slave/containerizer/mesos/isolators/network/ports.cpp
Lines 462 (patched)


I think it should be `if (infos.contains(rootContainerId)) {`.



src/slave/containerizer/mesos/isolators/network/ports.cpp
Lines 490 (patched)


Kill this empty line.


- Qian Zhang


On Aug. 22, 2017, 6:01 a.m., James Peach wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60766/
> ---
> 
> (Updated Aug. 22, 2017, 6:01 a.m.)
> 
> 
> Review request for mesos, Qian Zhang and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-7675
> https://issues.apache.org/jira/browse/MESOS-7675
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Working on the assumption that containers with CNI networks will
> get their own IP addresses and don't need port isolation, ignore
> any containers that are joining CNI networks.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/containerizer.cpp 
> 5805dfb4fb6e755e4c23851b0a6b504f2a8b3396 
>   src/slave/containerizer/mesos/isolators/network/ports.hpp PRE-CREATION 
>   src/slave/containerizer/mesos/isolators/network/ports.cpp PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/60766/diff/13/
> 
> 
> Testing
> ---
> 
> make check (Fedora 26).
> 
> 
> Thanks,
> 
> James Peach
> 
>



Re: Review Request 60496: Added socket checking to the network ports isolator.

2017-08-23 Thread Qian Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60496/#review183696
---




src/slave/containerizer/mesos/isolators/network/ports.cpp
Lines 200-202 (original), 373-375 (patched)


When framework launches a task group, this `update()` method will be called 
twice for the top-level container (executor):
1. When the top-level container is launched. At this time, the `resources` 
is the top-level container's own resources.
2. When the executor subscribes the agent 
(https://github.com/apache/mesos/blob/1.3.1/src/slave/slave.cpp#L3719). At this 
time, the `resources` is the top-level container's own resources + all nested 
containers resources, so in this `update()` method, the `info->ports` for the 
top-level container will be updated to include the ports of all nested 
containers. This seems not correct, since executor process will be allowed to 
listen on ports not assigned to it.


- Qian Zhang


On Aug. 24, 2017, 4:29 a.m., James Peach wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60496/
> ---
> 
> (Updated Aug. 24, 2017, 4:29 a.m.)
> 
> 
> Review request for mesos, Qian Zhang and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-7675
> https://issues.apache.org/jira/browse/MESOS-7675
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Implemented ports resource restrictions in the network ports isolator.
> Periodically, scan for listening sockets and match them up to all
> the open sockets in the containers we are tracking in the network.
> Check any sockets we find against the ports resource and trigger a
> resource limitation if the port has not been allocated.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/isolators/network/ports.hpp PRE-CREATION 
>   src/slave/containerizer/mesos/isolators/network/ports.cpp PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/60496/diff/15/
> 
> 
> Testing
> ---
> 
> make check (Fedora 26)
> 
> 
> Thanks,
> 
> James Peach
> 
>



Review Request 61873: Added test filter for docker user network tests.

2017-08-23 Thread Avinash sridharan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61873/
---

Review request for mesos, Jie Yu and Qian Zhang.


Bugs: MESOS-7807
https://issues.apache.org/jira/browse/MESOS-7807


Repository: mesos


Description
---

Added test filter for docker user network tests.


Diffs
-

  src/tests/environment.cpp a7262cd97361b55ff31238341657e764df6c9ea5 


Diff: https://reviews.apache.org/r/61873/diff/1/


Testing
---

make, make check


Thanks,

Avinash sridharan



Review Request 61874: Added a test for IPv6 containers on docker user networks.

2017-08-23 Thread Avinash sridharan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61874/
---

Review request for mesos, Jie Yu and Qian Zhang.


Bugs: MESOS-7807
https://issues.apache.org/jira/browse/MESOS-7807


Repository: mesos


Description
---

The test creates a IPv6 docker user network. It than launches an alpine
image on the docker user network. Finally it checks if the IP address
allocated to the container are reflected correctly in state.


Diffs
-

  src/tests/containerizer/docker_containerizer_tests.cpp 
6856ca22570183b022d401d1fa8f59d819783eed 


Diff: https://reviews.apache.org/r/61874/diff/1/


Testing
---

sudo bin/mesos-tests.sh 
--gtest_filter="*.ROOT_DOCKER_USERNETWORK_LaunchIPv6Container"
[==] Running 1 test from 1 test case.
[--] Global test environment set-up.
[--] 1 test from DockerContainerizerIPv6UserNetworkTest
[ RUN  ] 
DockerContainerizerIPv6UserNetworkTest.ROOT_DOCKER_USERNETWORK_LaunchIPv6Container
[   OK ] 
DockerContainerizerIPv6UserNetworkTest.ROOT_DOCKER_USERNETWORK_LaunchIPv6Container
 (15660 ms)
[--] 1 test from DockerContainerizerIPv6UserNetworkTest (15664 ms total)

[--] Global test environment tear-down
[==] 1 test from 1 test case ran. (15713 ms total)
[  PASSED  ] 1 test.


Thanks,

Avinash sridharan



Re: Review Request 61237: Updated docker executor to return IPv6 address of a container.

2017-08-23 Thread Avinash sridharan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61237/
---

(Updated Aug. 24, 2017, 12:48 a.m.)


Review request for mesos, Jie Yu and Qian Zhang.


Changes
---

Rebased.


Bugs: MESOS-7807
https://issues.apache.org/jira/browse/MESOS-7807


Repository: mesos


Description
---

A docker container can have a v4 and a v6 address. Currently the docker
executor was returning only the IPv4 address. This changes allows the
executor to return the IPv4 and IPv6 address of the container.


Diffs (updated)
-

  src/docker/docker.hpp 95e60a7dbbd6ccc659f70ca3dca8d13433f3ea07 
  src/docker/docker.cpp 192e170acce895f80df6c83e437da020489de468 
  src/docker/executor.cpp 99a62244b7805d25b926d0293cb62574981d2878 


Diff: https://reviews.apache.org/r/61237/diff/3/

Changes: https://reviews.apache.org/r/61237/diff/2-3/


Testing
---

Start agent:
sudo ./bin/mesos-agent.sh --ip=10.0.2.15 --master=10.0.2.15:5050 
--work_dir=/tmp/mesos --containerizers=mesos,docker

Start master:
sudo ./bin/mesos-master.sh --ip=10.0.2.15 --work_dir=/tmp/mesos-master

Use mesos-execute to launch a docker container on an IPv6 network:
sudo src/mesos-execute --master=10.0.2.15:5050 
--task=file:///home/vagrant/dev/mesos_tasks/docker.json 2> log.txt

cat /home/vagrant/dev/mesos_tasks/docker.json
{
  "name": "test",
  "task_id": {"value" : "test"},
  "agent_id": {"value" : ""},
  "resources": [
{
  "name": "cpus",
  "type": "SCALAR",
  "scalar": {
"value": 0.1
  }
},
{
  "name": "mem",
  "type": "SCALAR",
  "scalar": {
"value": 32
  }
}
  ],
  "command": {
"shell": false
  },
  "container": {
"type": "DOCKER",
"docker": {
  "image": "nginx",
  "network": "USER"
},
"network_infos": [
{
  "name": "ipv6"
}
]
  }
}


Curl the Mesos state to make sure that the right IPv6 addresses are reflected 
for the container:
 curl http://10.0.2.15:5051/state | jq

"role": "*",
  "statuses": [
{
  "state": "TASK_RUNNING",
  "timestamp": 1501288232.79384,
  "container_status": {
"container_id": {
  "value": "59319a5b-a28b-46bf-a20f-8472e6598abd"
},
"network_infos": [
  {
"ip_addresses": [
  {
"protocol": "IPv4",
"ip_address": "172.19.0.2"
  },
  {
"protocol": "IPv6",
"ip_address": "fd00:0:0:1::2"
  }
],
"name": "ipv6"
  }
]
  }
}
  ],
  "container": {
"type": "DOCKER",
"docker": {
  "image": "nginx",
  "network": "USER",
  "privileged": false
},
"network_infos": [
  {
"name": "ipv6"
  }
]
  }
}
  ],


Thanks,

Avinash sridharan



Re: Review Request 60591: Optionally isolate only the agent network ports.

2017-08-23 Thread James Peach

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60591/
---

(Updated Aug. 24, 2017, 12:32 a.m.)


Review request for mesos, Qian Zhang and Jiang Yan Xu.


Bugs: MESOS-7675
https://issues.apache.org/jira/browse/MESOS-7675


Repository: mesos


Description
---

Normally, the `network/ports` isolator will kill any task that
listens on a port that it does not have resources for. However,
executors that are based on the libprocess API will always
listen on a port in the ephemeral range, and we want to make
it possible to use libprocess-based executors.

Added the `--check_agent_port_range_only` option to only kill
tasks when they listen on un-allocated ports within the port
range published by the agent resources. This still prevents
port collisions between tasks, but doesn't kill them just
because the executor is listening on a port.


Diffs (updated)
-

  src/slave/containerizer/mesos/isolators/network/ports.hpp PRE-CREATION 
  src/slave/containerizer/mesos/isolators/network/ports.cpp PRE-CREATION 
  src/slave/flags.hpp 2970fea0cfac6af275a758d4bfedfe9a943c2b60 
  src/slave/flags.cpp 3b02f3e909a554f15104739832ae3f252926b45f 


Diff: https://reviews.apache.org/r/60591/diff/16/

Changes: https://reviews.apache.org/r/60591/diff/15-16/


Testing
---

make check (Fedora 26)


Thanks,

James Peach



Re: Review Request 61869: Adjusted the ContianerLaunchInfo.command merge logic.

2017-08-23 Thread James Peach

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61869/#review183692
---


Fix it, then Ship it!





src/slave/containerizer/mesos/containerizer.cpp
Lines 1422 (patched)


It's a bit verbose but I'd rather split these into separate checks so we 
know which one failed.


- James Peach


On Aug. 23, 2017, 11:25 p.m., Jie Yu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/61869/
> ---
> 
> (Updated Aug. 23, 2017, 11:25 p.m.)
> 
> 
> Review request for mesos, Gilbert Song and James Peach.
> 
> 
> Bugs: MESOS-7909
> https://issues.apache.org/jira/browse/MESOS-7909
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patches addressed MESOS-7909 by eliminating the ordering
> dependency between the `linux/capabilities` isolator and the
> `docker/runtime` or `appc/runtime` isolator.
> 
> Now, for command tasks, we always merge with the command executor
> command in MesosContainerizer. Previously, this is done in the
> `docker/runtime` or `appc/runtime` isolator, which introduced this
> unintentional dependency on the isolator ordering.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/containerizer.cpp 
> 5805dfb4fb6e755e4c23851b0a6b504f2a8b3396 
>   src/slave/containerizer/mesos/isolators/appc/runtime.cpp 
> f76b22b26e371ec5ed5d81e86a78007224cc260e 
>   src/slave/containerizer/mesos/isolators/docker/runtime.cpp 
> 2a6e0b179394e0485d2495ceb4bbbcb184af08fe 
> 
> 
> Diff: https://reviews.apache.org/r/61869/diff/1/
> 
> 
> Testing
> ---
> 
> sudo make check
> 
> Also, verified with the previously failed ping tests.
> 
> 
> Thanks,
> 
> Jie Yu
> 
>



Re: Review Request 61645: Fixed a bug where TASK_KILLED updates can be dropped by the agent.

2017-08-23 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61645/#review183690
---




src/slave/slave.cpp
Lines 2985-2999 (original), 2948-2996 (patched)


Forgot to add a return statement at the end of this block, will follow up.


- Benjamin Mahler


On Aug. 23, 2017, 9:29 p.m., Benjamin Mahler wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/61645/
> ---
> 
> (Updated Aug. 23, 2017, 9:29 p.m.)
> 
> 
> Review request for mesos, Anand Mazumdar and Vinod Kone.
> 
> 
> Bugs: MESOS-7783 and MESOS-7863
> https://issues.apache.org/jira/browse/MESOS-7783
> https://issues.apache.org/jira/browse/MESOS-7863
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Per the description of MESOS-7863, there is currently an assumption
> that when a pending task is killed, the framework will be stored in
> the agent when the launch proceeds for the killed task. When this
> assumption does not hold, the TASK_KILLED update will be dropped
> due to the frameowrk being unknown when the launch proceeds. This
> assumption doesn't hold in two cases:
> 
>   (1) Another pending task was killed and we removed the framework
>   in 'Slave::run' thinking it was idle, because pending tasks
>   was empty (we remove from pending tasks when processing the
>   kill). (MESOS-7783 is an example instance of this).
> 
>   (2) The last executor terminated without tasks to send terminal
>   updates for, or the last terminated executor received its
>   last acknowledgement. At this point, we remove the framework
>   thinking there were no pending tasks if the task was killed
>   (removed from pending).
> 
> The fix applied here is to send the status updates from the kill
> path rather than the launch path, to be consistent with how we kill
> tasks queued within the Executor struct. We ensure that the task
> is removed synchronously within the kill path to prevent its launch.
> 
> 
> Diffs
> -
> 
>   src/slave/slave.hpp 0e07a1af733003bb897cbebb7c1f64437063353d 
>   src/slave/slave.cpp 50d2a10cd68f6611efd4e691e5325e6e0c06f33a 
>   src/tests/slave_tests.cpp a8b46a450c4f3986139390aeed5544e7f9091781 
> 
> 
> Diff: https://reviews.apache.org/r/61645/diff/3/
> 
> 
> Testing
> ---
> 
> Added a test in a subsequent patch.
> 
> 
> Thanks,
> 
> Benjamin Mahler
> 
>



Review Request 61869: Adjusted the ContianerLaunchInfo.command merge logic.

2017-08-23 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61869/
---

Review request for mesos, Gilbert Song and James Peach.


Bugs: MESOS-7909
https://issues.apache.org/jira/browse/MESOS-7909


Repository: mesos


Description
---

This patches addressed MESOS-7909 by eliminating the ordering
dependency between the `linux/capabilities` isolator and the
`docker/runtime` or `appc/runtime` isolator.

Now, for command tasks, we always merge with the command executor
command in MesosContainerizer. Previously, this is done in the
`docker/runtime` or `appc/runtime` isolator, which introduced this
unintentional dependency on the isolator ordering.


Diffs
-

  src/slave/containerizer/mesos/containerizer.cpp 
5805dfb4fb6e755e4c23851b0a6b504f2a8b3396 
  src/slave/containerizer/mesos/isolators/appc/runtime.cpp 
f76b22b26e371ec5ed5d81e86a78007224cc260e 
  src/slave/containerizer/mesos/isolators/docker/runtime.cpp 
2a6e0b179394e0485d2495ceb4bbbcb184af08fe 


Diff: https://reviews.apache.org/r/61869/diff/1/


Testing
---

sudo make check

Also, verified with the previously failed ping tests.


Thanks,

Jie Yu



Review Request 61866: Adds port mapping information in NetworkInfo as part of `state` output.

2017-08-23 Thread Deepak Goel

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61866/
---

Review request for mesos.


Repository: mesos


Description
---

Adds port mapping information in NetworkInfo as part of `state` output.


Diffs
-

  src/common/http.cpp 43d674e9286170aae92014a952c6a8ba66437ca5 


Diff: https://reviews.apache.org/r/61866/diff/1/


Testing
---


Thanks,

Deepak Goel



Re: Review Request 61650: Controlled queued task handling with helpers in the agent.

2017-08-23 Thread Vinod Kone

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61650/#review183679
---


Fix it, then Ship it!





src/slave/slave.hpp
Line 714 (original), 715 (patched)


s/will/will also/


- Vinod Kone


On Aug. 23, 2017, 9:29 p.m., Benjamin Mahler wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/61650/
> ---
> 
> (Updated Aug. 23, 2017, 9:29 p.m.)
> 
> 
> Review request for mesos, Anand Mazumdar and Vinod Kone.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> With this patch, the enqueueing and dequeueing of tasks in the
> agent are controlled solely by Executor class member functions,
> which simplifies the agent code.
> 
> 
> Diffs
> -
> 
>   src/slave/slave.hpp 0e07a1af733003bb897cbebb7c1f64437063353d 
>   src/slave/slave.cpp 50d2a10cd68f6611efd4e691e5325e6e0c06f33a 
> 
> 
> Diff: https://reviews.apache.org/r/61650/diff/2/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Benjamin Mahler
> 
>



Re: Review Request 61645: Fixed a bug where TASK_KILLED updates can be dropped by the agent.

2017-08-23 Thread Vinod Kone

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61645/#review183678
---


Fix it, then Ship it!





src/slave/slave.hpp
Lines 882 (patched)


s/will be/will also be/ ?



src/slave/slave.cpp
Lines 1820 (patched)


Can you add a TODO here to track pending tasks on the executor struct 
instead of framework? We might even be able to leverage queued tasks.


- Vinod Kone


On Aug. 23, 2017, 9:29 p.m., Benjamin Mahler wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/61645/
> ---
> 
> (Updated Aug. 23, 2017, 9:29 p.m.)
> 
> 
> Review request for mesos, Anand Mazumdar and Vinod Kone.
> 
> 
> Bugs: MESOS-7783 and MESOS-7863
> https://issues.apache.org/jira/browse/MESOS-7783
> https://issues.apache.org/jira/browse/MESOS-7863
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Per the description of MESOS-7863, there is currently an assumption
> that when a pending task is killed, the framework will be stored in
> the agent when the launch proceeds for the killed task. When this
> assumption does not hold, the TASK_KILLED update will be dropped
> due to the frameowrk being unknown when the launch proceeds. This
> assumption doesn't hold in two cases:
> 
>   (1) Another pending task was killed and we removed the framework
>   in 'Slave::run' thinking it was idle, because pending tasks
>   was empty (we remove from pending tasks when processing the
>   kill). (MESOS-7783 is an example instance of this).
> 
>   (2) The last executor terminated without tasks to send terminal
>   updates for, or the last terminated executor received its
>   last acknowledgement. At this point, we remove the framework
>   thinking there were no pending tasks if the task was killed
>   (removed from pending).
> 
> The fix applied here is to send the status updates from the kill
> path rather than the launch path, to be consistent with how we kill
> tasks queued within the Executor struct. We ensure that the task
> is removed synchronously within the kill path to prevent its launch.
> 
> 
> Diffs
> -
> 
>   src/slave/slave.hpp 0e07a1af733003bb897cbebb7c1f64437063353d 
>   src/slave/slave.cpp 50d2a10cd68f6611efd4e691e5325e6e0c06f33a 
>   src/tests/slave_tests.cpp a8b46a450c4f3986139390aeed5544e7f9091781 
> 
> 
> Diff: https://reviews.apache.org/r/61645/diff/3/
> 
> 
> Testing
> ---
> 
> Added a test in a subsequent patch.
> 
> 
> Thanks,
> 
> Benjamin Mahler
> 
>



Re: Review Request 60591: Optionally isolate only the agent network ports.

2017-08-23 Thread James Peach


> On Aug. 21, 2017, 7:15 a.m., Qian Zhang wrote:
> > src/slave/flags.cpp
> > Lines 1012-1018 (patched)
> > 
> >
> > So by default this flag is not enabled, that means any libprocess-based 
> > exectuors (e.g., command executor and default executor) will be killed 
> > since they will listen on ephemeral port?
> 
> James Peach wrote:
> Yes, that's correct.
> 
> Qian Zhang wrote:
> That is a breaking change. I mean once this code is merged, any 
> frameworks which uses command executor or default executor (e.g., Marathon) 
> will be broken since the tasks that they launch will be killed.
> 
> James Peach wrote:
> No, you'd have to enable the isolator to break your containers.
> 
> Qian Zhang wrote:
> Agree.
> 
> And do you think `--check_agent_port_range_only` would be a better name? 
> If set to `true`, the `network/ports` isolator will do the check only within 
> agent's port range to restrict containers to only listen on the ports that 
> they have been assigned resources for, so containers are allowed to listen on 
> additional ports outside agent's port range. If set to `false`, the 
> `network/ports` isolator will restrict containers to only listen on ports 
> that they have been assigned resources for regardless of agent's port range.
> 
> And the default value of `--check_agent_port_range_only` should be 
> `true`, which means `network/ports` isolator will run in soft mode, so when 
> operator enable `network/ports` isolator but does not set this flag, the 
> frameworks using libprocess-based executor will not be broken.
> 
> James Peach wrote:
> I can live with `--check_agent_port_range_only`. However, this should be 
> secure by default so we have to keep the default as `false`.

I just realized that the reason I chose the name was that is it symmetric with 
`--container_ports_watch_interval` :(


- James


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60591/#review183303
---


On Aug. 23, 2017, 9:57 p.m., James Peach wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60591/
> ---
> 
> (Updated Aug. 23, 2017, 9:57 p.m.)
> 
> 
> Review request for mesos, Qian Zhang and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-7675
> https://issues.apache.org/jira/browse/MESOS-7675
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Normally, the `network/ports` isolator will kill any task that
> listens on a port that it does not have resources for. However,
> executors that are based on the libprocess API will always
> listen on a port in the ephemeral range, and we want to make
> it possible to use libprocess-based executors.
> 
> Added the `--check_agent_port_range_only` option to only kill
> tasks when they listen on un-allocated ports within the port
> range published by the agent resources. This still prevents
> port collisions between tasks, but doesn't kill them just
> because the executor is listening on a port.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/isolators/network/ports.hpp PRE-CREATION 
>   src/slave/containerizer/mesos/isolators/network/ports.cpp PRE-CREATION 
>   src/slave/flags.hpp 2970fea0cfac6af275a758d4bfedfe9a943c2b60 
>   src/slave/flags.cpp 3b02f3e909a554f15104739832ae3f252926b45f 
> 
> 
> Diff: https://reviews.apache.org/r/60591/diff/15/
> 
> 
> Testing
> ---
> 
> make check (Fedora 26)
> 
> 
> Thanks,
> 
> James Peach
> 
>



Re: Review Request 60591: Optionally isolate only the agent network ports.

2017-08-23 Thread James Peach


> On Aug. 22, 2017, 9:09 a.m., Qian Zhang wrote:
> > src/slave/containerizer/mesos/isolators/network/ports.cpp
> > Lines 183 (patched)
> > 
> >
> > Usually we put the declaration of static function at the beginning of 
> > the file, and put its implemention in the end. You can take `slave.cpp` and 
> > `master.cpp` as references.

This isn't in the style guide and it's rare that we use forward declarations 
unless they are necessary, eg. `src/linux/cgroups.cpp`, 
`src/launcher/fetcher.cpp`, `src/common/http.cpp`.


- James


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60591/#review183420
---


On Aug. 23, 2017, 9:57 p.m., James Peach wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60591/
> ---
> 
> (Updated Aug. 23, 2017, 9:57 p.m.)
> 
> 
> Review request for mesos, Qian Zhang and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-7675
> https://issues.apache.org/jira/browse/MESOS-7675
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Normally, the `network/ports` isolator will kill any task that
> listens on a port that it does not have resources for. However,
> executors that are based on the libprocess API will always
> listen on a port in the ephemeral range, and we want to make
> it possible to use libprocess-based executors.
> 
> Added the `--check_agent_port_range_only` option to only kill
> tasks when they listen on un-allocated ports within the port
> range published by the agent resources. This still prevents
> port collisions between tasks, but doesn't kill them just
> because the executor is listening on a port.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/isolators/network/ports.hpp PRE-CREATION 
>   src/slave/containerizer/mesos/isolators/network/ports.cpp PRE-CREATION 
>   src/slave/flags.hpp 2970fea0cfac6af275a758d4bfedfe9a943c2b60 
>   src/slave/flags.cpp 3b02f3e909a554f15104739832ae3f252926b45f 
> 
> 
> Diff: https://reviews.apache.org/r/60591/diff/15/
> 
> 
> Testing
> ---
> 
> make check (Fedora 26)
> 
> 
> Thanks,
> 
> James Peach
> 
>



Re: Review Request 60591: Optionally isolate only the agent network ports.

2017-08-23 Thread James Peach

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60591/
---

(Updated Aug. 23, 2017, 9:57 p.m.)


Review request for mesos, Qian Zhang and Jiang Yan Xu.


Bugs: MESOS-7675
https://issues.apache.org/jira/browse/MESOS-7675


Repository: mesos


Description (updated)
---

Normally, the `network/ports` isolator will kill any task that
listens on a port that it does not have resources for. However,
executors that are based on the libprocess API will always
listen on a port in the ephemeral range, and we want to make
it possible to use libprocess-based executors.

Added the `--check_agent_port_range_only` option to only kill
tasks when they listen on un-allocated ports within the port
range published by the agent resources. This still prevents
port collisions between tasks, but doesn't kill them just
because the executor is listening on a port.


Diffs (updated)
-

  src/slave/containerizer/mesos/isolators/network/ports.hpp PRE-CREATION 
  src/slave/containerizer/mesos/isolators/network/ports.cpp PRE-CREATION 
  src/slave/flags.hpp 2970fea0cfac6af275a758d4bfedfe9a943c2b60 
  src/slave/flags.cpp 3b02f3e909a554f15104739832ae3f252926b45f 


Diff: https://reviews.apache.org/r/60591/diff/15/

Changes: https://reviews.apache.org/r/60591/diff/14-15/


Testing
---

make check (Fedora 26)


Thanks,

James Peach



Re: Review Request 61640: Improved the reason and message for tasks killed before delivery.

2017-08-23 Thread Vinod Kone

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61640/#review183677
---


Fix it, then Ship it!





include/mesos/mesos.proto
Lines 2179 (patched)


s/REASON_TASK_KILLED_BEFORE_DELIVERY/REASON_TASK_KILLED_DURING_LAUNCH/ ?

since "launch" is a known concept/term in mesos.


- Vinod Kone


On Aug. 22, 2017, 7:45 p.m., Benjamin Mahler wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/61640/
> ---
> 
> (Updated Aug. 22, 2017, 7:45 p.m.)
> 
> 
> Review request for mesos, Anand Mazumdar and Vinod Kone.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Previously, if a task was killed before delivery to a registering
> executor, we sent REASON_EXECUTOR_UNREGISTERED with a message of
> "Unregistered executor" in the agent, and the master sent no reason.
> This introduces a REASON_TASK_KILLED_BEFORE_DELIVERY to clarify
> this case to frameworks.
> 
> 
> Diffs
> -
> 
>   include/mesos/mesos.proto d91c96814dfcd03a4390d80f90acd61b5e4aebdd 
>   include/mesos/v1/mesos.proto 6a1d011a45dcead644300d6becedcbd4bc1f5a96 
>   src/master/master.cpp 7f38a5e21884546d4b4c866ca5918db779af8f99 
>   src/slave/slave.cpp 50d2a10cd68f6611efd4e691e5325e6e0c06f33a 
>   src/tests/master_authorization_tests.cpp 
> b41b4a111a51f1b7f774334d5867528f007adaff 
>   src/tests/slave_recovery_tests.cpp b3825d13b12ee0da110b30baba5f5e68051b8e97 
>   src/tests/slave_tests.cpp a8b46a450c4f3986139390aeed5544e7f9091781 
> 
> 
> Diff: https://reviews.apache.org/r/61640/diff/2/
> 
> 
> Testing
> ---
> 
> Updated the tests.
> 
> 
> Thanks,
> 
> Benjamin Mahler
> 
>



Re: Review Request 61638: Implemented LinkedHashMap copy construction and assignment.

2017-08-23 Thread Vinod Kone

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61638/#review183673
---


Fix it, then Ship it!





3rdparty/stout/tests/linkedhashmap_tests.cpp
Lines 214 (patched)


copy paste the TODO here as well?


- Vinod Kone


On Aug. 23, 2017, 7:16 p.m., Benjamin Mahler wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/61638/
> ---
> 
> (Updated Aug. 23, 2017, 7:16 p.m.)
> 
> 
> Review request for mesos, Anand Mazumdar and Vinod Kone.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The defaults being generated were incorrect. No code is relying
> on them today, this was discovered when writing new code.
> 
> 
> Diffs
> -
> 
>   3rdparty/stout/include/stout/linkedhashmap.hpp 
> 35684e46480761c9b8325525377ca3ca9707990c 
>   3rdparty/stout/tests/linkedhashmap_tests.cpp 
> 267022e3ea601300083a41cc559dfa0adbc9074b 
> 
> 
> Diff: https://reviews.apache.org/r/61638/diff/2/
> 
> 
> Testing
> ---
> 
> Added a test.
> 
> 
> Thanks,
> 
> Benjamin Mahler
> 
>



Re: Review Request 61650: Controlled queued task handling with helpers in the agent.

2017-08-23 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61650/
---

(Updated Aug. 23, 2017, 9:29 p.m.)


Review request for mesos, Anand Mazumdar and Vinod Kone.


Changes
---

Addressed review comments.


Repository: mesos


Description
---

With this patch, the enqueueing and dequeueing of tasks in the
agent are controlled solely by Executor class member functions,
which simplifies the agent code.


Diffs (updated)
-

  src/slave/slave.hpp 0e07a1af733003bb897cbebb7c1f64437063353d 
  src/slave/slave.cpp 50d2a10cd68f6611efd4e691e5325e6e0c06f33a 


Diff: https://reviews.apache.org/r/61650/diff/2/

Changes: https://reviews.apache.org/r/61650/diff/1-2/


Testing
---

make check


Thanks,

Benjamin Mahler



Re: Review Request 61645: Fixed a bug where TASK_KILLED updates can be dropped by the agent.

2017-08-23 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61645/
---

(Updated Aug. 23, 2017, 9:29 p.m.)


Review request for mesos, Anand Mazumdar and Vinod Kone.


Changes
---

Addressed review comments.


Bugs: MESOS-7783 and MESOS-7863
https://issues.apache.org/jira/browse/MESOS-7783
https://issues.apache.org/jira/browse/MESOS-7863


Repository: mesos


Description
---

Per the description of MESOS-7863, there is currently an assumption
that when a pending task is killed, the framework will be stored in
the agent when the launch proceeds for the killed task. When this
assumption does not hold, the TASK_KILLED update will be dropped
due to the frameowrk being unknown when the launch proceeds. This
assumption doesn't hold in two cases:

  (1) Another pending task was killed and we removed the framework
  in 'Slave::run' thinking it was idle, because pending tasks
  was empty (we remove from pending tasks when processing the
  kill). (MESOS-7783 is an example instance of this).

  (2) The last executor terminated without tasks to send terminal
  updates for, or the last terminated executor received its
  last acknowledgement. At this point, we remove the framework
  thinking there were no pending tasks if the task was killed
  (removed from pending).

The fix applied here is to send the status updates from the kill
path rather than the launch path, to be consistent with how we kill
tasks queued within the Executor struct. We ensure that the task
is removed synchronously within the kill path to prevent its launch.


Diffs (updated)
-

  src/slave/slave.hpp 0e07a1af733003bb897cbebb7c1f64437063353d 
  src/slave/slave.cpp 50d2a10cd68f6611efd4e691e5325e6e0c06f33a 
  src/tests/slave_tests.cpp a8b46a450c4f3986139390aeed5544e7f9091781 


Diff: https://reviews.apache.org/r/61645/diff/3/

Changes: https://reviews.apache.org/r/61645/diff/2-3/


Testing
---

Added a test in a subsequent patch.


Thanks,

Benjamin Mahler



Re: Review Request 61645: Fixed a bug where TASK_KILLED updates can be dropped by the agent.

2017-08-23 Thread Benjamin Mahler


> On Aug. 22, 2017, 11:42 p.m., Vinod Kone wrote:
> > src/slave/slave.cpp
> > Lines 4451 (patched)
> > 
> >
> > Can you add a note/TODO here that if the terminal update for the 
> > pending task is retried by the status update manager, it might potentially 
> > get dropped if the framework gets removed? Ideally, we should transition 
> > the pending task to terminated tasks instead of removing it from the map 
> > and forgetting about it.

Added a TODO that we should track it like we do within the executor struct, but 
didn't mention that issue since `Slave::forward` seems to be ok with a removed 
framework.


- Benjamin


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61645/#review183502
---


On Aug. 23, 2017, 9:29 p.m., Benjamin Mahler wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/61645/
> ---
> 
> (Updated Aug. 23, 2017, 9:29 p.m.)
> 
> 
> Review request for mesos, Anand Mazumdar and Vinod Kone.
> 
> 
> Bugs: MESOS-7783 and MESOS-7863
> https://issues.apache.org/jira/browse/MESOS-7783
> https://issues.apache.org/jira/browse/MESOS-7863
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Per the description of MESOS-7863, there is currently an assumption
> that when a pending task is killed, the framework will be stored in
> the agent when the launch proceeds for the killed task. When this
> assumption does not hold, the TASK_KILLED update will be dropped
> due to the frameowrk being unknown when the launch proceeds. This
> assumption doesn't hold in two cases:
> 
>   (1) Another pending task was killed and we removed the framework
>   in 'Slave::run' thinking it was idle, because pending tasks
>   was empty (we remove from pending tasks when processing the
>   kill). (MESOS-7783 is an example instance of this).
> 
>   (2) The last executor terminated without tasks to send terminal
>   updates for, or the last terminated executor received its
>   last acknowledgement. At this point, we remove the framework
>   thinking there were no pending tasks if the task was killed
>   (removed from pending).
> 
> The fix applied here is to send the status updates from the kill
> path rather than the launch path, to be consistent with how we kill
> tasks queued within the Executor struct. We ensure that the task
> is removed synchronously within the kill path to prevent its launch.
> 
> 
> Diffs
> -
> 
>   src/slave/slave.hpp 0e07a1af733003bb897cbebb7c1f64437063353d 
>   src/slave/slave.cpp 50d2a10cd68f6611efd4e691e5325e6e0c06f33a 
>   src/tests/slave_tests.cpp a8b46a450c4f3986139390aeed5544e7f9091781 
> 
> 
> Diff: https://reviews.apache.org/r/61645/diff/3/
> 
> 
> Testing
> ---
> 
> Added a test in a subsequent patch.
> 
> 
> Thanks,
> 
> Benjamin Mahler
> 
>



Re: Review Request 60591: Optionally isolate only the agent network ports.

2017-08-23 Thread James Peach


> On Aug. 22, 2017, 9:09 a.m., Qian Zhang wrote:
> > src/slave/containerizer/mesos/isolators/network/ports.cpp
> > Lines 242-246 (patched)
> > 
> >
> > Sorry for the back and forth, is it possile for `ports.isSome()` to be 
> > false? It seems not, right?
> 
> James Peach wrote:
> Yes, it can be false. In [r/60765](https://reviews.apache.org/r/60765/) I 
> added a test to hit this case. However if I remove the code that tries to 
> protect the agent port, this can be simplified.
> 
> Qian Zhang wrote:
> Not sure why it can be false. If we hit here, that means 
> `havePortsResource(flags)` is true, so we must have port resources, right?

This would happen if we had an empty ports resource.


- James


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60591/#review183420
---


On Aug. 21, 2017, 8:36 p.m., James Peach wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60591/
> ---
> 
> (Updated Aug. 21, 2017, 8:36 p.m.)
> 
> 
> Review request for mesos, Qian Zhang and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-7675
> https://issues.apache.org/jira/browse/MESOS-7675
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Normally, the `network/ports` isolator will kill any task that
> listens on a port that it does not have resources for. However,
> executors that are based on the libprocess API will always listen
> on a port in the ephemeral range, and we want to make it possible
> to use libprocess-based executors.
> 
> Added the `--container_ports_watch_resources_only` option to only
> kill tasks when they listen on un-allocated ports within the port
> range published by the agent resources. This still prevents port
> collisions between tasks, but doesn't kill them just because the
> executor is listening on a port.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/isolators/network/ports.hpp PRE-CREATION 
>   src/slave/containerizer/mesos/isolators/network/ports.cpp PRE-CREATION 
>   src/slave/flags.hpp 2970fea0cfac6af275a758d4bfedfe9a943c2b60 
>   src/slave/flags.cpp 3b02f3e909a554f15104739832ae3f252926b45f 
> 
> 
> Diff: https://reviews.apache.org/r/60591/diff/14/
> 
> 
> Testing
> ---
> 
> make check (Fedora 26)
> 
> 
> Thanks,
> 
> James Peach
> 
>



Re: Review Request 60496: Added socket checking to the network ports isolator.

2017-08-23 Thread James Peach

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60496/
---

(Updated Aug. 23, 2017, 8:29 p.m.)


Review request for mesos, Qian Zhang and Jiang Yan Xu.


Changes
---

Addressed review feedback.


Bugs: MESOS-7675
https://issues.apache.org/jira/browse/MESOS-7675


Repository: mesos


Description
---

Implemented ports resource restrictions in the network ports isolator.
Periodically, scan for listening sockets and match them up to all
the open sockets in the containers we are tracking in the network.
Check any sockets we find against the ports resource and trigger a
resource limitation if the port has not been allocated.


Diffs (updated)
-

  src/slave/containerizer/mesos/isolators/network/ports.hpp PRE-CREATION 
  src/slave/containerizer/mesos/isolators/network/ports.cpp PRE-CREATION 


Diff: https://reviews.apache.org/r/60496/diff/15/

Changes: https://reviews.apache.org/r/60496/diff/14-15/


Testing
---

make check (Fedora 26)


Thanks,

James Peach



Re: Review Request 61638: Implemented LinkedHashMap copy construction and assignment.

2017-08-23 Thread Benjamin Mahler


> On Aug. 22, 2017, 11:54 p.m., Vinod Kone wrote:
> > 3rdparty/stout/tests/linkedhashmap_tests.cpp
> > Lines 207 (patched)
> > 
> >
> > Would this test fail without your fix above? Maybe you want to update 
> > the original after the copy is done to ensure the copy is not affected?

Updated the test to ensure the old code looks at the invalid iterators and 
crashes.


- Benjamin


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61638/#review183555
---


On Aug. 23, 2017, 7:16 p.m., Benjamin Mahler wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/61638/
> ---
> 
> (Updated Aug. 23, 2017, 7:16 p.m.)
> 
> 
> Review request for mesos, Anand Mazumdar and Vinod Kone.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The defaults being generated were incorrect. No code is relying
> on them today, this was discovered when writing new code.
> 
> 
> Diffs
> -
> 
>   3rdparty/stout/include/stout/linkedhashmap.hpp 
> 35684e46480761c9b8325525377ca3ca9707990c 
>   3rdparty/stout/tests/linkedhashmap_tests.cpp 
> 267022e3ea601300083a41cc559dfa0adbc9074b 
> 
> 
> Diff: https://reviews.apache.org/r/61638/diff/2/
> 
> 
> Testing
> ---
> 
> Added a test.
> 
> 
> Thanks,
> 
> Benjamin Mahler
> 
>



Re: Review Request 61638: Implemented LinkedHashMap copy construction and assignment.

2017-08-23 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61638/
---

(Updated Aug. 23, 2017, 7:16 p.m.)


Review request for mesos, Anand Mazumdar and Vinod Kone.


Changes
---

Updated the test to ensure the old code crashed.


Repository: mesos


Description
---

The defaults being generated were incorrect. No code is relying
on them today, this was discovered when writing new code.


Diffs (updated)
-

  3rdparty/stout/include/stout/linkedhashmap.hpp 
35684e46480761c9b8325525377ca3ca9707990c 
  3rdparty/stout/tests/linkedhashmap_tests.cpp 
267022e3ea601300083a41cc559dfa0adbc9074b 


Diff: https://reviews.apache.org/r/61638/diff/2/

Changes: https://reviews.apache.org/r/61638/diff/1-2/


Testing
---

Added a test.


Thanks,

Benjamin Mahler



Re: Review Request 61640: Improved the reason and message for tasks killed before delivery.

2017-08-23 Thread Benjamin Mahler


> On Aug. 18, 2017, 11:37 p.m., Vinod Kone wrote:
> > src/slave/slave.cpp
> > Lines 3033 (patched)
> > 
> >
> > Can you add a TODO to the protobuf that contains this enum saying that 
> > it is no longer used.
> > 
> > Also, should we use a better named enum for this case? I would like to 
> > move towards all mesos generated updates to have reasons.

Introduced a `REASON_TASK_KILLED_BEFORE_DELIVERY` for these cases and updated 
the master side as well.


- Benjamin


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61640/#review183281
---


On Aug. 22, 2017, 7:45 p.m., Benjamin Mahler wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/61640/
> ---
> 
> (Updated Aug. 22, 2017, 7:45 p.m.)
> 
> 
> Review request for mesos, Anand Mazumdar and Vinod Kone.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Previously, if a task was killed before delivery to a registering
> executor, we sent REASON_EXECUTOR_UNREGISTERED with a message of
> "Unregistered executor" in the agent, and the master sent no reason.
> This introduces a REASON_TASK_KILLED_BEFORE_DELIVERY to clarify
> this case to frameworks.
> 
> 
> Diffs
> -
> 
>   include/mesos/mesos.proto d91c96814dfcd03a4390d80f90acd61b5e4aebdd 
>   include/mesos/v1/mesos.proto 6a1d011a45dcead644300d6becedcbd4bc1f5a96 
>   src/master/master.cpp 7f38a5e21884546d4b4c866ca5918db779af8f99 
>   src/slave/slave.cpp 50d2a10cd68f6611efd4e691e5325e6e0c06f33a 
>   src/tests/master_authorization_tests.cpp 
> b41b4a111a51f1b7f774334d5867528f007adaff 
>   src/tests/slave_recovery_tests.cpp b3825d13b12ee0da110b30baba5f5e68051b8e97 
>   src/tests/slave_tests.cpp a8b46a450c4f3986139390aeed5544e7f9091781 
> 
> 
> Diff: https://reviews.apache.org/r/61640/diff/2/
> 
> 
> Testing
> ---
> 
> Updated the tests.
> 
> 
> Thanks,
> 
> Benjamin Mahler
> 
>



Review Request 61855: Removed a stray trailing parenthesis from a validation error.

2017-08-23 Thread Gastón Kleiman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61855/
---

Review request for mesos and Greg Mann.


Repository: mesos


Description
---

Removed a stray trailing parenthesis from a validation error.


Diffs
-

  src/master/validation.cpp 7c3247d407c9e6aa8cce457d6c6be0c39f4b532f 


Diff: https://reviews.apache.org/r/61855/diff/1/


Testing
---

None, but it passed all the commit hooks. YOLO =).


Thanks,

Gastón Kleiman



Re: Review Request 61588: Added a `[-s|--skip-hooks]` option when applying reviews.

2017-08-23 Thread Chun-Hung Hsiao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61588/
---

(Updated Aug. 23, 2017, 6:35 p.m.)


Review request for mesos, Benjamin Bannier, Jie Yu, and Kevin Klues.


Changes
---

Addressed Andrew's comment.


Bugs: MESOS-7880
https://issues.apache.org/jira/browse/MESOS-7880


Repository: mesos


Description
---

Added a `[-s|--skip-hooks]` option when applying reviews.


Diffs (updated)
-

  support/apply-reviews.py aa2eb8d23bed52a4f33510c53ace2c63bd66ec49 


Diff: https://reviews.apache.org/r/61588/diff/4/

Changes: https://reviews.apache.org/r/61588/diff/3-4/


Testing
---

Applied a review chain.


Thanks,

Chun-Hung Hsiao



Re: Review Request 61818: Adjusted a comment and a log message around container termination.

2017-08-23 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61818/#review183635
---



Patch looks great!

Reviews applied: [61811, 59746, 61818]

Logs available here: http://104.210.40.105/logs/master/61818

- Mesos Reviewbot Windows


On Aug. 22, 2017, 5:18 p.m., Alexander Rukletsov wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/61818/
> ---
> 
> (Updated Aug. 22, 2017, 5:18 p.m.)
> 
> 
> Review request for mesos, Gilbert Song, Ian Downes, and Jie Yu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> When the container launch is aborted and the container is
> cleaned up in the containerizer before `executorLaunched()` is
> invoked, `containerizer->wait()` called in `executorLaunched()`
> does not hold a termination object. This case should not be
> logged as an error.
> 
> 
> Diffs
> -
> 
>   src/slave/slave.cpp 50d2a10cd68f6611efd4e691e5325e6e0c06f33a 
> 
> 
> Diff: https://reviews.apache.org/r/61818/diff/1/
> 
> 
> Testing
> ---
> 
> None: not a functional change.
> 
> 
> Thanks,
> 
> Alexander Rukletsov
> 
>



Re: Review Request 60621: Add new stout capability: os::copyfile.

2017-08-23 Thread Andrew Schwartzmeyer

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60621/#review183631
---




3rdparty/stout/include/stout/os/posix/copyfile.hpp
Lines 26 (patched)


Not sure how I missed that. Good catch.



3rdparty/stout/include/stout/os/posix/copyfile.hpp
Lines 66 (patched)


This is odd. I though most of this code was copied from the original 
implementation (deleted in https://reviews.apache.org/r/60628) but apparently 
it was changed in the copy. I should have taken a closer look. That does 
explain why the logging is still here.


- Andrew Schwartzmeyer


On July 3, 2017, 12:29 p.m., Jeff Coffler wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60621/
> ---
> 
> (Updated July 3, 2017, 12:29 p.m.)
> 
> 
> Review request for mesos, Andrew Schwartzmeyer, John Kordich, Joseph Wu, and 
> Li Li.
> 
> 
> Bugs: MESOS-6705
> https://issues.apache.org/jira/browse/MESOS-6705
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Add new stout capability: os::copyfile.
> 
> 
> Diffs
> -
> 
>   3rdparty/stout/Makefile.am 7956d1ae392cd3fa560474bc3ac38877cddce857 
>   3rdparty/stout/include/Makefile.am dec7293949e16b6580509c2fd41a851e349fbef4 
>   3rdparty/stout/include/stout/os/copyfile.hpp PRE-CREATION 
>   3rdparty/stout/include/stout/os/posix/copyfile.hpp PRE-CREATION 
>   3rdparty/stout/include/stout/os/windows/copyfile.hpp PRE-CREATION 
>   3rdparty/stout/tests/CMakeLists.txt 
> 8d881ab7ac571dea7aace269332a856feb7a6c43 
>   3rdparty/stout/tests/os/copyfile_tests.cpp PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/60621/diff/1/
> 
> 
> Testing
> ---
> 
> See upstream
> 
> 
> Thanks,
> 
> Jeff Coffler
> 
>



Re: Review Request 61798: Added _EXIT as alternative to ABORT.

2017-08-23 Thread Andrei Budnik


> On Aug. 22, 2017, 10:46 a.m., Alexander Rukletsov wrote:
> > 3rdparty/stout/include/stout/exit.hpp
> > Lines 42 (patched)
> > 
> >
> > Shouldn't we use `CRLF` on Windows?

I can't find a new line macro. BTW, `ABORT` writes "\n" to stderr.


> On Aug. 22, 2017, 10:46 a.m., Alexander Rukletsov wrote:
> > 3rdparty/stout/include/stout/exit.hpp
> > Lines 76 (patched)
> > 
> >
> > Why don't you choose `stdout` vs. `stderr` based on `status` value 
> > similar to `__Exit`?

Fixed.


- Andrei


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61798/#review183462
---


On Aug. 21, 2017, 8:54 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/61798/
> ---
> 
> (Updated Aug. 21, 2017, 8:54 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Alexander Rojas, Benjamin 
> Bannier, Benjamin Mahler, and James Peach.
> 
> 
> Bugs: MESOS-7791
> https://issues.apache.org/jira/browse/MESOS-7791
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> EXIT macro can't be used when async-signal safety is required.
> 
> Currently, ABORT is used for terminating a program after both
> unexpected and expected errors, when async-signal safety is required.
> In case of expected errors, `abort` causes coredumps, which might be
> undesirable.
> 
> In addition, _EXIT takes variable number of arguments, thereby _EXIT
> interface doesn't force users to concatenate strings, thus making its
> usage more async-signal safe.
> 
> 
> Diffs
> -
> 
>   3rdparty/stout/include/stout/exit.hpp 
> e5c2d3440a85ab2d2cae551ee4094cd965e38dfc 
> 
> 
> Diff: https://reviews.apache.org/r/61798/diff/2/
> 
> 
> Testing
> ---
> 
> sudo make check (mac os x, fedora 25)
> internal CI
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 61801: Used _EXIT macro in `CgroupsAnyHierarchyTest.ROOT_CGROUPS_Write` test.

2017-08-23 Thread Andrei Budnik


> On Aug. 22, 2017, 10:46 a.m., Alexander Rukletsov wrote:
> > src/tests/containerizer/cgroups_tests.cpp
> > Line 489 (original), 490 (patched)
> > 
> >
> > Why did you drop this comment?

Because the following message asserts the same?


- Andrei


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61801/#review183464
---


On Aug. 21, 2017, 8:54 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/61801/
> ---
> 
> (Updated Aug. 21, 2017, 8:54 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Alexander Rojas, Benjamin 
> Bannier, Benjamin Mahler, and James Peach.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Used _EXIT macro in `CgroupsAnyHierarchyTest.ROOT_CGROUPS_Write` test.
> 
> 
> Diffs
> -
> 
>   src/tests/containerizer/cgroups_tests.cpp 
> c5d83e667da0b2c8c03937c66e2a973d547cd3d2 
> 
> 
> Diff: https://reviews.apache.org/r/61801/diff/1/
> 
> 
> Testing
> ---
> 
> sudo make check (mac os x, fedora 25)
> internal CI
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 61799: Replaced ABORT with _EXIT in `childMain` to handle `os::execvpe` error.

2017-08-23 Thread Andrei Budnik


> On Aug. 22, 2017, 10:46 a.m., Alexander Rukletsov wrote:
> > 3rdparty/libprocess/src/subprocess_posix.hpp
> > Line 194 (original), 195-197 (patched)
> > 
> >
> > Is this what `clang-format` suggests?

`clang-format` suggests
```cpp
  _EXIT(
errno, "Failed to os::execvpe on path '", path, "': ", os::strerror(errno));
```
which is wrong, because code style obliges us to put 4 spaces after `(`. Also, 
I can't add 2 more spaces, because of 80-charactes limit.


- Andrei


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61799/#review183463
---


On Aug. 21, 2017, 8:54 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/61799/
> ---
> 
> (Updated Aug. 21, 2017, 8:54 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Alexander Rojas, Benjamin 
> Bannier, Benjamin Mahler, and James Peach.
> 
> 
> Bugs: MESOS-7791
> https://issues.apache.org/jira/browse/MESOS-7791
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Using ABORT causes coredump when path to an executable is invalid.
> In addition, using of _EXIT allows us to get rid of string
> concatenation, thereby making `childMain` more async-signal safe.
> 
> 
> Diffs
> -
> 
>   3rdparty/libprocess/src/subprocess_posix.hpp 
> b478beccff22e2ffa08b9b91cfd5b9c6ada9b697 
> 
> 
> Diff: https://reviews.apache.org/r/61799/diff/1/
> 
> 
> Testing
> ---
> 
> sudo make check (mac os x, fedora 25)
> internal CI
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 60591: Optionally isolate only the agent network ports.

2017-08-23 Thread James Peach


> On Aug. 21, 2017, 7:15 a.m., Qian Zhang wrote:
> > src/slave/flags.cpp
> > Lines 1012-1018 (patched)
> > 
> >
> > So by default this flag is not enabled, that means any libprocess-based 
> > exectuors (e.g., command executor and default executor) will be killed 
> > since they will listen on ephemeral port?
> 
> James Peach wrote:
> Yes, that's correct.
> 
> Qian Zhang wrote:
> That is a breaking change. I mean once this code is merged, any 
> frameworks which uses command executor or default executor (e.g., Marathon) 
> will be broken since the tasks that they launch will be killed.
> 
> James Peach wrote:
> No, you'd have to enable the isolator to break your containers.
> 
> Qian Zhang wrote:
> Agree.
> 
> And do you think `--check_agent_port_range_only` would be a better name? 
> If set to `true`, the `network/ports` isolator will do the check only within 
> agent's port range to restrict containers to only listen on the ports that 
> they have been assigned resources for, so containers are allowed to listen on 
> additional ports outside agent's port range. If set to `false`, the 
> `network/ports` isolator will restrict containers to only listen on ports 
> that they have been assigned resources for regardless of agent's port range.
> 
> And the default value of `--check_agent_port_range_only` should be 
> `true`, which means `network/ports` isolator will run in soft mode, so when 
> operator enable `network/ports` isolator but does not set this flag, the 
> frameworks using libprocess-based executor will not be broken.

I can live with `--check_agent_port_range_only`. However, this should be secure 
by default so we have to keep the default as `false`.


- James


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60591/#review183303
---


On Aug. 21, 2017, 8:36 p.m., James Peach wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60591/
> ---
> 
> (Updated Aug. 21, 2017, 8:36 p.m.)
> 
> 
> Review request for mesos, Qian Zhang and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-7675
> https://issues.apache.org/jira/browse/MESOS-7675
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Normally, the `network/ports` isolator will kill any task that
> listens on a port that it does not have resources for. However,
> executors that are based on the libprocess API will always listen
> on a port in the ephemeral range, and we want to make it possible
> to use libprocess-based executors.
> 
> Added the `--container_ports_watch_resources_only` option to only
> kill tasks when they listen on un-allocated ports within the port
> range published by the agent resources. This still prevents port
> collisions between tasks, but doesn't kill them just because the
> executor is listening on a port.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/isolators/network/ports.hpp PRE-CREATION 
>   src/slave/containerizer/mesos/isolators/network/ports.cpp PRE-CREATION 
>   src/slave/flags.hpp 2970fea0cfac6af275a758d4bfedfe9a943c2b60 
>   src/slave/flags.cpp 3b02f3e909a554f15104739832ae3f252926b45f 
> 
> 
> Diff: https://reviews.apache.org/r/60591/diff/13/
> 
> 
> Testing
> ---
> 
> make check (Fedora 26)
> 
> 
> Thanks,
> 
> James Peach
> 
>



Re: Review Request 61797: Added async-signal safe implementation of `write` function in stout.

2017-08-23 Thread Andrei Budnik


> On Aug. 22, 2017, 10:46 a.m., Alexander Rukletsov wrote:
> > 3rdparty/stout/include/stout/os/write.hpp
> > Lines 67 (patched)
> > 
> >
> > `std::strlen`?

I'm not sure if `std::strlen` is async-signal safe.


> On Aug. 22, 2017, 10:46 a.m., Alexander Rukletsov wrote:
> > 3rdparty/stout/include/stout/os/write.hpp
> > Line 60 (original), 95-99 (patched)
> > 
> >
> > Maybe `return (result == 0) ? Nothing() : ErrnoError(result);`?

It looks more complex to me. I don't like that.


- Andrei


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61797/#review183460
---


On Aug. 21, 2017, 8:54 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/61797/
> ---
> 
> (Updated Aug. 21, 2017, 8:54 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Alexander Rojas, Benjamin 
> Bannier, Benjamin Mahler, and James Peach.
> 
> 
> Bugs: MESOS-7791
> https://issues.apache.org/jira/browse/MESOS-7791
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Async-signal safe `write` function, which can take variable number of
> arguments, should be used to improve signal safety.
> 
> 
> Diffs
> -
> 
>   3rdparty/stout/include/stout/os/write.hpp 
> beb5bd83b52565a75e34d32b5bb17951bc799578 
> 
> 
> Diff: https://reviews.apache.org/r/61797/diff/1/
> 
> 
> Testing
> ---
> 
> sudo make check (mac os x, fedora 25)
> internal CI
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 60594: Added a`network/ports` isolator nested container test.

2017-08-23 Thread James Peach


> On Aug. 23, 2017, 2:53 p.m., Qian Zhang wrote:
> > This patch seems out of date?

Yeh, this got folded into an earlier patch during one of the patch series 
refactor. Dropping.


- James


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60594/#review183605
---


On July 3, 2017, 10:32 a.m., James Peach wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60594/
> ---
> 
> (Updated July 3, 2017, 10:32 a.m.)
> 
> 
> Review request for mesos, Qian Zhang and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-7675
> https://issues.apache.org/jira/browse/MESOS-7675
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added a test to verify that the `network/ports` isolator works
> correctly with nested containers.
> 
> 
> Diffs
> -
> 
>   src/tests/containerizer/network_ports_isolator_tests.cpp PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/60594/diff/6/
> 
> 
> Testing
> ---
> 
> make check (Fedora 26)
> 
> 
> Thanks,
> 
> James Peach
> 
>



Re: Review Request 60765: Added basic `network/ports` isolator tests.

2017-08-23 Thread James Peach


> On Aug. 23, 2017, 7:58 a.m., Qian Zhang wrote:
> > src/tests/containerizer/ports_isolator_tests.cpp
> > Lines 80 (patched)
> > 
> >
> > Why do we need a template function?

We need this to be ablt to use the same code for the `v1` API.


> On Aug. 23, 2017, 7:58 a.m., Qian Zhang wrote:
> > src/tests/containerizer/ports_isolator_tests.cpp
> > Lines 85 (patched)
> > 
> >
> > Can we just do `set_type(HealthCheck::TCP)`?

We have to deal with the old and new APIs, so this can be 
`mesos::HealthCheck::TCP` or `mesos::v1::HealthCheck::TCP`.


- James


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60765/#review183314
---


On July 29, 2017, 12:02 a.m., James Peach wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60765/
> ---
> 
> (Updated July 29, 2017, 12:02 a.m.)
> 
> 
> Review request for mesos, Qian Zhang and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-7675
> https://issues.apache.org/jira/browse/MESOS-7675
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added tests to verify that the `network/ports` is able to correctly
> terminate only tasks that use rogue TCP ports.
> 
> 
> Diffs
> -
> 
>   src/tests/containerizer/ports_isolator_tests.cpp PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/60765/diff/10/
> 
> 
> Testing
> ---
> 
> make check (Fedora 26)
> 
> 
> Thanks,
> 
> James Peach
> 
>



Re: Review Request 60766: Ignored containers that join CNI networks.

2017-08-23 Thread James Peach


> On Aug. 23, 2017, 9:52 a.m., Qian Zhang wrote:
> > src/slave/containerizer/mesos/isolators/network/ports.cpp
> > Line 394 (original), 436 (patched)
> > 
> >
> > So here an empty `Info` object is put into `infos`? I think we need to 
> > populate its `ports` field before that.

The ports are populated in `update()`:

```
0823 08:57:15.686208 21125 linux_launcher.cpp:300] Recovered container 
d9e180c6-c527-431c-b147-f820c788aa7d
I0823 08:57:15.686942 21133 ports.cpp:437] recovering container 
d9e180c6-c527-431c-b147-f820c788aa7d
I0823 08:57:15.688449 21123 provisioner.cpp:416] Provisioner recovery complete
I0823 08:57:15.689740 21132 slave.cpp:6110] Sending reconnect request to 
executor '084372d8-e569-48d9-9bb3-6ce89750207e' of framework 
ac96cb5d-fbab-402c-8fe8-1042ea1b1986- at executor(1)@17.228.224.108:39141
I0823 08:57:15.691100 21459 exec.cpp:283] Received reconnect request from agent 
ac96cb5d-fbab-402c-8fe8-1042ea1b1986-S0
I0823 08:57:15.693174 21131 slave.cpp:4071] Received re-registration message 
from executor '084372d8-e569-48d9-9bb3-6ce89750207e' of framework 
ac96cb5d-fbab-402c-8fe8-1042ea1b1986-
I0823 08:57:15.693802 21132 ports.cpp:372] updating container 
d9e180c6-c527-431c-b147-f820c788aa7d
I0823 08:57:15.694080 21457 exec.cpp:260] Executor re-registered on agent 
ac96cb5d-fbab-402c-8fe8-1042ea1b1986-S0
```


- James


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60766/#review183586
---


On Aug. 21, 2017, 10:01 p.m., James Peach wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60766/
> ---
> 
> (Updated Aug. 21, 2017, 10:01 p.m.)
> 
> 
> Review request for mesos, Qian Zhang and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-7675
> https://issues.apache.org/jira/browse/MESOS-7675
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Working on the assumption that containers with CNI networks will
> get their own IP addresses and don't need port isolation, ignore
> any containers that are joining CNI networks.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/containerizer.cpp 
> 5772421c3078d36225b946a5286b8c1bf2f007e8 
>   src/slave/containerizer/mesos/isolators/network/ports.hpp PRE-CREATION 
>   src/slave/containerizer/mesos/isolators/network/ports.cpp PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/60766/diff/11/
> 
> 
> Testing
> ---
> 
> make check (Fedora 26).
> 
> 
> Thanks,
> 
> James Peach
> 
>



Re: Review Request 60591: Optionally isolate only the agent network ports.

2017-08-23 Thread James Peach


> On Aug. 23, 2017, 2:10 a.m., Qian Zhang wrote:
> > src/slave/containerizer/mesos/isolators/network/ports.cpp
> > Lines 243-251 (patched)
> > 
> >
> > Not yours, but it seems that we do not have this check for agent's 
> > `--resources` flag, so that means operator can specify an incorrect port 
> > range (e.g., `[88-99]` in that flag?

Filed [MESOS-7910](https://issues.apache.org/jira/browse/MESOS-7910).


- James


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60591/#review183566
---


On Aug. 21, 2017, 8:36 p.m., James Peach wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60591/
> ---
> 
> (Updated Aug. 21, 2017, 8:36 p.m.)
> 
> 
> Review request for mesos, Qian Zhang and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-7675
> https://issues.apache.org/jira/browse/MESOS-7675
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Normally, the `network/ports` isolator will kill any task that
> listens on a port that it does not have resources for. However,
> executors that are based on the libprocess API will always listen
> on a port in the ephemeral range, and we want to make it possible
> to use libprocess-based executors.
> 
> Added the `--container_ports_watch_resources_only` option to only
> kill tasks when they listen on un-allocated ports within the port
> range published by the agent resources. This still prevents port
> collisions between tasks, but doesn't kill them just because the
> executor is listening on a port.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/isolators/network/ports.hpp PRE-CREATION 
>   src/slave/containerizer/mesos/isolators/network/ports.cpp PRE-CREATION 
>   src/slave/flags.hpp 2970fea0cfac6af275a758d4bfedfe9a943c2b60 
>   src/slave/flags.cpp 3b02f3e909a554f15104739832ae3f252926b45f 
> 
> 
> Diff: https://reviews.apache.org/r/60591/diff/13/
> 
> 
> Testing
> ---
> 
> make check (Fedora 26)
> 
> 
> Thanks,
> 
> James Peach
> 
>



Re: Review Request 61666: Added test to verify filtering of resource reservations & allocations.

2017-08-23 Thread Andrei Budnik


> On Aug. 21, 2017, 10:12 a.m., Alexander Rojas wrote:
> > src/tests/slave_authorization_tests.cpp
> > Lines 106 (patched)
> > 
> >
> > _Only one default user…_

This comment was rewritten.


> On Aug. 21, 2017, 10:12 a.m., Alexander Rojas wrote:
> > src/tests/slave_authorization_tests.cpp
> > Lines 164 (patched)
> > 
> >
> > I would rather have `acl->mutable_roles()->add_value(role_name)`

Fixed.


> On Aug. 21, 2017, 10:12 a.m., Alexander Rojas wrote:
> > src/tests/slave_authorization_tests.cpp
> > Lines 195 (patched)
> > 
> >
> > can you give it a better name to this role, even _test-role_ would do, 
> > but if you look at api tests we use other things like _superhero_ or 
> > _muggle_.

I've introduced two roles *superhero* and *muggle*.
The test has been changed significantly.


- Andrei


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61666/#review183321
---


On Aug. 15, 2017, 5:59 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/61666/
> ---
> 
> (Updated Aug. 15, 2017, 5:59 p.m.)
> 
> 
> Review request for mesos, Alexander Rojas and Greg Mann.
> 
> 
> Bugs: MESOS-7892
> https://issues.apache.org/jira/browse/MESOS-7892
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch extends `SlaveAuthorizerTest.FilterStateEndpoint` test to
> check that agent's `/state` endpoint shows resource reservations and
> allocations based on the 'VIEW_ROLE' permissions of the principal
> doing the request.
> 
> 
> Diffs
> -
> 
>   src/tests/slave_authorization_tests.cpp 
> 536a8efda0cc1ba08285c379b01af6ec64a82117 
> 
> 
> Diff: https://reviews.apache.org/r/61666/diff/2/
> 
> 
> Testing
> ---
> 
> make check (mac os x, fedora 25)
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 60593: Test the `network/ports` isolator recovery.

2017-08-23 Thread Qian Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60593/#review183580
---




src/tests/containerizer/ports_isolator_tests.cpp
Lines 701-703 (patched)


Why do we need these?



src/tests/containerizer/ports_isolator_tests.cpp
Lines 853 (patched)


It seems there is no enough time left for this code because we stop the 
`driver` right after the check is triggered.



src/tests/containerizer/ports_isolator_tests.cpp
Lines 871-873 (patched)


I think this test should belong to r/60594 since it is not related to slave 
recovery.


- Qian Zhang


On July 13, 2017, 12:58 p.m., James Peach wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60593/
> ---
> 
> (Updated July 13, 2017, 12:58 p.m.)
> 
> 
> Review request for mesos, Qian Zhang and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-7675
> https://issues.apache.org/jira/browse/MESOS-7675
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added a test for the `network/ports` isolator recovery by starting
> a task that is listening on a rogue port. We only configure the
> isolator when we restart the agent to simulate the case where a
> task only starts to misbehave after an agent recovery.
> 
> 
> Diffs
> -
> 
>   src/tests/containerizer/ports_isolator_tests.cpp PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/60593/diff/11/
> 
> 
> Testing
> ---
> 
> make check (Fedora 26)
> 
> 
> Thanks,
> 
> James Peach
> 
>



Re: Review Request 60594: Added a`network/ports` isolator nested container test.

2017-08-23 Thread Qian Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60594/#review183605
---



This patch seems out of date?

- Qian Zhang


On July 3, 2017, 6:32 p.m., James Peach wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60594/
> ---
> 
> (Updated July 3, 2017, 6:32 p.m.)
> 
> 
> Review request for mesos, Qian Zhang and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-7675
> https://issues.apache.org/jira/browse/MESOS-7675
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added a test to verify that the `network/ports` isolator works
> correctly with nested containers.
> 
> 
> Diffs
> -
> 
>   src/tests/containerizer/network_ports_isolator_tests.cpp PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/60594/diff/6/
> 
> 
> Testing
> ---
> 
> make check (Fedora 26)
> 
> 
> Thanks,
> 
> James Peach
> 
>



Re: Review Request 61818: Adjusted a comment and a log message around container termination.

2017-08-23 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61818/#review183599
---



Patch looks great!

Reviews applied: [61811, 59746, 61818]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker-build.sh

- Mesos Reviewbot


On Aug. 22, 2017, 5:18 p.m., Alexander Rukletsov wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/61818/
> ---
> 
> (Updated Aug. 22, 2017, 5:18 p.m.)
> 
> 
> Review request for mesos, Gilbert Song, Ian Downes, and Jie Yu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> When the container launch is aborted and the container is
> cleaned up in the containerizer before `executorLaunched()` is
> invoked, `containerizer->wait()` called in `executorLaunched()`
> does not hold a termination object. This case should not be
> logged as an error.
> 
> 
> Diffs
> -
> 
>   src/slave/slave.cpp 50d2a10cd68f6611efd4e691e5325e6e0c06f33a 
> 
> 
> Diff: https://reviews.apache.org/r/61818/diff/1/
> 
> 
> Testing
> ---
> 
> None: not a functional change.
> 
> 
> Thanks,
> 
> Alexander Rukletsov
> 
>



Review Request 61849: Improved consistency of cout/cerr and glog usage in main functions.

2017-08-23 Thread Armand Grillet

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61849/
---

Review request for mesos, Andrei Budnik and Alexander Rukletsov.


Bugs: MESOS-7586
https://issues.apache.org/jira/browse/MESOS-7586


Repository: mesos


Description
---

The main functions now have a common pattern, in 6 steps:
  1. Load flags.
  2. Check if the --help flag has been passed.
  3. Check if the flags are correct.
  4. Catch signals.
  5. Parse and setup the environment variables.
  6. `process::initialize();`.

This change reduces the number of messages "WARNING: Logging
before InitGoogleLogging() is written to STDERR".


Diffs
-

  src/docker/executor.cpp 99a62244b7805d25b926d0293cb62574981d2878 
  src/launcher/default_executor.cpp 30bae5c931c81b81ff9590eea8051b2232388813 
  src/launcher/executor.cpp 65b9a2c52ba5a5a514fff992018e356f388468fc 
  src/local/main.cpp f0d7e8ca1a64b5409632ea0c0894a1dc84f9796e 
  src/master/main.cpp 1a78a55fd52c5eebad689c2a7f16c95979260d79 
  src/slave/main.cpp a4a8ced6825ece1b25003824d987ff83fb799ad7 


Diff: https://reviews.apache.org/r/61849/diff/1/


Testing
---

```
$ make check
```


Thanks,

Armand Grillet



Re: Review Request 60766: Ignored containers that join CNI networks.

2017-08-23 Thread Qian Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60766/#review183586
---




src/slave/containerizer/mesos/isolators/network/ports.cpp
Line 394 (original), 436 (patched)


So here an empty `Info` object is put into `infos`? I think we need to 
populate its `ports` field before that.


- Qian Zhang


On Aug. 22, 2017, 6:01 a.m., James Peach wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60766/
> ---
> 
> (Updated Aug. 22, 2017, 6:01 a.m.)
> 
> 
> Review request for mesos, Qian Zhang and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-7675
> https://issues.apache.org/jira/browse/MESOS-7675
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Working on the assumption that containers with CNI networks will
> get their own IP addresses and don't need port isolation, ignore
> any containers that are joining CNI networks.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/containerizer.cpp 
> 5772421c3078d36225b946a5286b8c1bf2f007e8 
>   src/slave/containerizer/mesos/isolators/network/ports.hpp PRE-CREATION 
>   src/slave/containerizer/mesos/isolators/network/ports.cpp PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/60766/diff/11/
> 
> 
> Testing
> ---
> 
> make check (Fedora 26).
> 
> 
> Thanks,
> 
> James Peach
> 
>



Re: Review Request 60765: Added basic `network/ports` isolator tests.

2017-08-23 Thread Qian Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60765/#review183314
---




src/tests/containerizer/ports_isolator_tests.cpp
Lines 39-40 (patched)


These two lines should be put under the two lines below.



src/tests/containerizer/ports_isolator_tests.cpp
Lines 58-59 (patched)


Mind to explain why advancing the clock will cause updates to be resent?



src/tests/containerizer/ports_isolator_tests.cpp
Lines 60 (patched)


Suggest to rename to `awaitStatusUpdateAcked`.



src/tests/containerizer/ports_isolator_tests.cpp
Lines 80 (patched)


Why do we need a template function?



src/tests/containerizer/ports_isolator_tests.cpp
Lines 85 (patched)


Can we just do `set_type(HealthCheck::TCP)`?



src/tests/containerizer/ports_isolator_tests.cpp
Lines 145 (patched)


s/Verify/This test verifies/

Ditto for all other places.



src/tests/containerizer/ports_isolator_tests.cpp
Lines 260 (patched)


In `CreateSlaveFlags()`, I see `flags.resources` will be explicitly set to 
`defaultAgentResourcesString` which is 
`cpus:2;gpus:0;mem:1024;disk:1024;ports:[31000-32000]`, so this test actually 
covers the case that operator explicitly sets ports resources in agent's 
`--resources` flag.

I think we may need another test that we set `flags.resources` to `None()` 
after `CreateSlaveFlags()` is called so that the code in 
`NetworkPortsIsolatorProcess::create()` to default the ports if it is missing 
can be covered.



src/tests/containerizer/ports_isolator_tests.cpp
Lines 299 (patched)


s/1/1u/

Ditto for the other place.



src/tests/containerizer/ports_isolator_tests.cpp
Lines 328-329 (patched)


Since this is a health check status update, I think we also need:
```
  EXPECT_EQ(
  TaskStatus::REASON_TASK_HEALTH_CHECK_STATUS_UPDATED,
  statusHealthy->reason());
  EXPECT_TRUE(statusHealthy->has_healthy());
  EXPECT_TRUE(statusHealthy->healthy());
```

Ditto for all other places.



src/tests/containerizer/ports_isolator_tests.cpp
Lines 567 (patched)


What if `taskPort + 1` making it out of agent port range?


- Qian Zhang


On July 29, 2017, 8:02 a.m., James Peach wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60765/
> ---
> 
> (Updated July 29, 2017, 8:02 a.m.)
> 
> 
> Review request for mesos, Qian Zhang and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-7675
> https://issues.apache.org/jira/browse/MESOS-7675
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added tests to verify that the `network/ports` is able to correctly
> terminate only tasks that use rogue TCP ports.
> 
> 
> Diffs
> -
> 
>   src/tests/containerizer/ports_isolator_tests.cpp PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/60765/diff/10/
> 
> 
> Testing
> ---
> 
> make check (Fedora 26)
> 
> 
> Thanks,
> 
> James Peach
> 
>