Re: Review Request 68018: Added `SeccompFilter` class.

2019-01-15 Thread Qian Zhang


> On Jan. 14, 2019, 4:31 p.m., Qian Zhang wrote:
> > src/linux/seccomp/seccomp.cpp
> > Lines 137-139 (patched)
> > 
> >
> > Will this affect the task run by Mesos? E.g., a task may want to run a 
> > program which has `set-user-ID` bit.
> 
> Andrei Budnik wrote:
> Yes, `no_new_privs` flag affects the task that wants to run a program 
> which has `set-user-ID` bit.
> E.g., launching a `ping -c 3 8.8.8.8` fails with seccomp. You'll see a 
> message in executor logs:
> ```
> I0114 07:19:21.887670 13264 executor.cpp:706] Forked command at 13276
> ping: socket: Operation not permitted
> I0114 07:19:22.055352 13263 executor.cpp:1007] Command exited with status 
> 2 (pid: 13276)
> ```
> 
> Also, see my previous comment 
> https://reviews.apache.org/r/68018/#comment297000
> 
> Qian Zhang wrote:
> In your previous comment, you mentioned that Docker daemon launches its 
> containers with `SCMP_FLTATR_CTL_NNP` flag set by default, does that mean any 
> containers launched by Docker daemon cannot run program which has set-user-ID 
> bit?
> 
> This seems unfortunate since it might break some use cases or 
> applications that we already supported. And can you please elaborate a bit 
> about `"Disabling SCMP_FLTATR_CTL_NNP flag for a root means that Seccomp 
> filter can be reverted anytime"`? How will the Seccomp filter be reverted? Do 
> you mean the task launched by Mesos can call libseccomp API to revert the 
> filter itself?
> 
> If we have to live with this limitation (i.e., cannot run program which 
> has set-user-ID bit), then we need to highlight it in the document.
> 
> Gilbert Song wrote:
> Seems like we asked the same question.
> 
> Andrei, let align on this thread? :/thanks:)
> 
> Andrei Budnik wrote:
> >does that mean any containers launched by Docker daemon cannot run 
> program which has set-user-ID bit?
> 
> Docker daemon can not be used to run arbitrary programs (in opposity to 
> Mesos c'zer). So, when one launches a Docker container, Docker daemon 
> launches a container process with `NNP` bit set, which means that a container 
> process (and it descendants) can't gain more previleges **outside** its 
> container. Mesos containerizer has exactly the same behaviour:
> 
> 1) Run system-provided `/bin/ping` (*outside* its container) as a 
> non-privileged user:
> ```
> $ ./src/mesos-execute --master="`hostname`:5050" --name="a" 
> --containerizer=mesos --command="ping -c 3 8.8.8.8"
> ...
> Received status update TASK_FAILED for task 'a'
>   message: 'Command exited with status 2'
>   source: SOURCE_EXECUTOR
> ```
> 
> 2) Run system-provided `/bin/ping` (*outside* its container) as a 
> privileged user:
> ```
> sudo ./src/mesos-execute --master="`hostname`:5050" --name="a" 
> --containerizer=mesos --command="ping -c 3 8.8.8.8"
> ...
> Received status update TASK_FINISHED for task 'a'
>   message: 'Command exited with status 0'
>   source: SOURCE_EXECUTOR
> ```
> 
> 3) Run container image provided `ping` (*inside* its image/container) as 
> a non-privileged user:
> ```
> $ ./src/mesos-execute --master="`hostname`:5050" --name="a" 
> --containerizer=mesos --docker_image="fedora:latest" --command="yum -y 
> install iputils;ping -c 3 8.8.8.8"
> ...
> Received status update TASK_FINISHED for task 'a'
>   message: 'Command exited with status 0'
>   source: SOURCE_EXECUTOR
> 
> $ cat /path/to/container/stdout
> ...
> PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
> 64 bytes from 8.8.8.8: icmp_seq=1 ttl=122 time=13.9 ms
> ```
> 
> > This seems unfortunate since it might break some use cases or 
> applications that we already supported.
> 
> It's very unlikely that the agent launches tasks, whose binary has 
> `setuid`/`setgid` bit specified. Because... what the point?
> I doubt if any of the following programs a launched as a Mesos container:
> ```
> $ sudo find /bin/ -perm -u=s -type f 2>/dev/null
> /bin/newgrp
> /bin/pkexec
> /bin/mount
> /bin/umount
> /bin/newuidmap
> /bin/newgidmap
> /bin/sudo
> /bin/crontab
> /bin/su
> /bin/gpasswd
> /bin/chage
> /bin/passwd
> /bin/staprun
> /bin/fusermount
> /bin/fusermount-glusterfs
> /bin/chfn
> /bin/chsh
> /bin/at
> ```
> 
> > And can you please elaborate a bit about "Disabling SCMP_FLTATR_CTL_NNP 
> flag for a root means that Seccomp filter can be reverted anytime"? How will 
> the Seccomp filter be reverted? Do you mean the task launched by Mesos can 
> call libseccomp API to revert the filter itself?
> 
> Yes, without `NNP` (`no_new_privs`) bit set, a privileged task might call 
> `seccomp` Linux syscall to install an empty Seccomp filter.

> Run system-provided 

Re: Review Request 69726: Fixed flakiness in MasterAPITest.OperationUpdatesUponOfferGone.

2019-01-15 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69726/#review212044
---



Patch looks great!

Reviews applied: [69726]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose --disable-libtool-wrappers' ENVIRONMENT='GLOG_v=1 
MESOS_VERBOSE=1'; ./support/docker-build.sh

- Mesos Reviewbot


On Jan. 14, 2019, 4:35 p.m., Benno Evers wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69726/
> ---
> 
> (Updated Jan. 14, 2019, 4:35 p.m.)
> 
> 
> Review request for mesos, Gastón Kleiman and Greg Mann.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> It could happen that the first offer was sent before the allocator
> had been updated with the resources from the resource provider.
> 
> To further increase reliability, stopped the clock during this test.
> 
> 
> Diffs
> -
> 
>   src/tests/api_tests.cpp c597243e2e210e83a4ab7441fbcfa3198b43d849 
> 
> 
> Diff: https://reviews.apache.org/r/69726/diff/2/
> 
> 
> Testing
> ---
> 
> `./src/mesos-tests --gtest_filter="*OperationUpdates*" --gtest_repeat=5 
> --gtest_throw_on_failure`
> 
> 
> Thanks,
> 
> Benno Evers
> 
>



Re: Review Request 69158: Added an integration test for resource provider removal.

2019-01-15 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69158/#review212042
---



PASS: Mesos patch 69158 was successfully built and tested.

Reviews applied: `['68147', '69158']`

All the build artifacts available at: 
http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/2786/mesos-review-69158

- Mesos Reviewbot Windows


On Jan. 15, 2019, 7:03 a.m., Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69158/
> ---
> 
> (Updated Jan. 15, 2019, 7:03 a.m.)
> 
> 
> Review request for mesos, Chun-Hung Hsiao and Jan Schlicht.
> 
> 
> Bugs: MESOS-8403
> https://issues.apache.org/jira/browse/MESOS-8403
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added an integration test for resource provider removal.
> 
> 
> Diffs
> -
> 
>   src/tests/slave_tests.cpp dc711de96053ebecb6b71b98f14f22cb9b050c52 
> 
> 
> Diff: https://reviews.apache.org/r/69158/diff/5/
> 
> 
> Testing
> ---
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>



Re: Review Request 68021: Added `linux/seccomp` isolator.

2019-01-15 Thread Qian Zhang


> On Jan. 15, 2019, 11:02 a.m., Qian Zhang wrote:
> > src/slave/containerizer/mesos/isolators/linux/seccomp.cpp
> > Lines 17-18 (patched)
> > 
> >
> > A newline between.
> 
> Andrei Budnik wrote:
> 
> https://github.com/apache/mesos/blob/2aaf96ecbab316708afb401e43cad2f2f692f687/src/slave/containerizer/mesos/isolators/xfs/utils.cpp#L35-L38

Hmm, I think that is not correct. Our guideline should be headers are 
partitioned by their subfolders 
(https://github.com/apache/mesos/blob/master/docs/c++-style-guide.md#order-of-includes
 ), here are a few examples:

https://github.com/apache/mesos/blob/1.7.0/src/linux/routing/link/link.cpp#L21:L29
https://github.com/apache/mesos/blob/1.7.0/src/linux/capabilities.cpp#L19:L21


- Qian


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68021/#review211985
---


On Nov. 8, 2018, 11:24 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68021/
> ---
> 
> (Updated Nov. 8, 2018, 11:24 p.m.)
> 
> 
> Review request for mesos, Gilbert Song, Jie Yu, James Peach, and Qian Zhang.
> 
> 
> Bugs: MESOS-9035
> https://issues.apache.org/jira/browse/MESOS-9035
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch introduces `linux/seccomp` isolator which is used for
> preparing `ContainerSeccompProfile` for the Mesos containerizer
> launcher. If the `ContainerConfig` message has an info about Seccomp
> profile name, then this info will be used to locate a Seccomp profile.
> The given Seccomp profile is parsed and the resulting
> `ContainerSeccompProfile` is stored in the `ContainerLaunchInfo`
> message.
> 
> 
> Diffs
> -
> 
>   src/CMakeLists.txt a574d449dc26b820cbef7ff0b5e94b42b6fe86cf 
>   src/Makefile.am cd785255fcdf1302a8f9fa358039e5d1f200e132 
>   src/slave/containerizer/mesos/containerizer.cpp 
> 5016f2e9f0651abcb0a5f364e8eace458f2edeae 
>   src/slave/containerizer/mesos/isolators/linux/seccomp.hpp PRE-CREATION 
>   src/slave/containerizer/mesos/isolators/linux/seccomp.cpp PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/68021/diff/13/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 69588: Removed outdated authorization logic for offer operations.

2019-01-15 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69588/#review212040
---



FAIL: Some of the unit tests failed. Please check the relevant logs.

Reviews applied: `['69588']`

Failed command: `Start-MesosCITesting`

All the build artifacts available at: 
http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/2783/mesos-review-69588

Relevant logs:

- 
[mesos-tests.log](http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/2783/mesos-review-69588/logs/mesos-tests.log):

```
I0116 00:55:43.745271 22032 process.cpp:927] Stopped the socket accept loop
W0116 00:55:43.746259 22032 process.cpp:1890] Failed to send 
'mesos.internal.StatusUpdateMessage' to '192.10.1.6:50492', connect: IO failed 
with error code: The remote computer refused the network connection.

:Type::SOCKET=412 to peer '192.10.1.6:51107': IO failed with error code: The 
specified network name is no longer available.

I0116 00:47:29.050809 24828 exec.cpp:518] Agent exited, but framework has 
checkpointing enabled. Waiting 15mins to reconnect with agent 
bd8fb96d-c411-43ff-a5f0-d321fbfc573c-S0
I0116 00:56:17.430984 25988 executor.cpp:1007] Command exited with status 0 
(pid: 21536)
I0116 00:56:18.435178 26500 process.cpp:927] Stopped the socket accept loop
W0116 00:56:18.437160 26500 process.cpp:1890] Failed to send 
'mesos.internal.StatusUpdateMessage' to '192.10.1.6:50492', connect: IO failed 
with error code: The remote computer refused the network connection.

ointing enabled. Waiting 15mins to reconnect with agent 
bd8fb96d-c411-43ff-a5f0-d321fbfc573c-S0
W0116 00:47:29.049806 23704 process.cpp:838] Failed to recv on socket 
WindowsFD::Type::SOCKET=392 to peer '192.10.1.6:51110': IO failed with error 
code: The specified network name is no longer available.

I0116 00:56:17.827467 16024 executor.cpp:1007] Command exited with status 0 
(pid: 23380)
I0116 00:56:18.831517 23704 process.cpp:927] Stopped the socket accept loop
W0116 00:56:18.833523 23704 process.cpp:1890] Failed to send 
'mesos.internal.StatusUpdateMessage' to '192.10.1.6:50492', connect: IO failed 
with error code: The remote computer refused the network connection.

:Type::SOCKET=392 to peer '192.10.1.6:51113': IO failed with error code: The 
specified network name is no longer available.

I0116 00:47:29.052810 19028 exec.cpp:518] Agent exited, but framework has 
checkpointing enabled. Waiting 15mins to reconnect with agent 
7407f0ea-0cf3-4e97-a555-c9640c17efea-S0
I0116 00:56:48.621898 24048 executor.cpp:1007] Command exited with status 0 
(pid: 23352)
I0116 00:56:49.624585 23904 process.cpp:927] Stopped the socket accept loop
W0116 00:56:49.626592 23904 process.cpp:1890] Failed to send 
'mesos.internal.StatusUpdateMessage' to '192.10.1.6:50492', connect: IO failed 
with error code: The remote computer refused the network connection.

I0116 01:02:29.057098 23860 executor.cpp:568] Recovery timeout of 15mins 
exceeded; Shutting down
I0116 01:02:29.058001 26044 default_executor.cpp:204] Received SHUTDOWN event
I0116 01:02:29.058001 26044 default_executor.cpp:1025] Shutting down
I0116 01:02:29.058001 26044 default_executor.cpp:1081] Terminating after 1secs
I0116 01:02:30.05 22872 process.cpp:927] Stopped the socket accept loop
```

- Mesos Reviewbot Windows


On Dec. 19, 2018, 3:20 p.m., Chun-Hung Hsiao wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69588/
> ---
> 
> (Updated Dec. 19, 2018, 3:20 p.m.)
> 
> 
> Review request for mesos, Benjamin Bannier, Benjamin Mahler, Michael Park, 
> and Till Toenshoff.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Since the master now upgrades resources to the post-reservation-refinement
> format before authorization (see MESOS-7735), the authorization logic in the
> master becomes outdated. This patch cleans it up.
> 
> 
> Diffs
> -
> 
>   src/master/master.cpp b4faf2b077a0288ba36195b7a21402932489d316 
> 
> 
> Diff: https://reviews.apache.org/r/69588/diff/2/
> 
> 
> Testing
> ---
> 
> sudo make check
> 
> 
> Thanks,
> 
> Chun-Hung Hsiao
> 
>



Re: Review Request 69755: Moved flags and constants in MesosContainerLaunch into header.

2019-01-15 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69755/#review212039
---



FAIL: Some of the unit tests failed. Please check the relevant logs.

Reviews applied: `['69750', '69751', '69752', '69753', '69754', '69755']`

Failed command: `Start-MesosCITesting`

All the build artifacts available at: 
http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/2785/mesos-review-69755

Relevant logs:

- 
[mesos-tests-cmake.log](http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/2785/mesos-review-69755/logs/mesos-tests-cmake.log):

```
 
d:\dcos\mesos\3rdparty\zookeeper-3.4.8\src\zookeeper-3.4.8\src\c\src\winport.c(205):
 warning C4716: 'pthread_cond_wait': must return a value 
[D:\DCOS\mesos\3rdparty\zookeeper-3.4.8\src\zookeeper-3.4.8-build\zookeeper.vcxproj]
 [D:\DCOS\mesos\3rdparty\zookeeper-3.4.8.vcxproj]
 
d:\dcos\mesos\3rdparty\zookeeper-3.4.8\src\zookeeper-3.4.8\src\c\src\cli.c(124):
 warning C4996: 'fopen': This function or variable may be unsafe. Consider 
using fopen_s instead. To disable deprecation, use _CRT_SECURE_NO_WARNINGS. See 
online help for details. 
[D:\DCOS\mesos\3rdparty\zookeeper-3.4.8\src\zookeeper-3.4.8-build\cli.vcxproj] 
[D:\DCOS\mesos\3rdparty\zookeeper-3.4.8.vcxproj]
 
d:\dcos\mesos\3rdparty\zookeeper-3.4.8\src\zookeeper-3.4.8\src\c\src\cli.c(128):
 warning C4267: 'initializing': conversion from 'size_t' to 'int', possible 
loss of data 
[D:\DCOS\mesos\3rdparty\zookeeper-3.4.8\src\zookeeper-3.4.8-build\cli.vcxproj] 
[D:\DCOS\mesos\3rdparty\zookeeper-3.4.8.vcxproj]
 
d:\dcos\mesos\3rdparty\zookeeper-3.4.8\src\zookeeper-3.4.8\src\c\src\cli.c(279):
 warning C4267: 'function': conversion from 'size_t' to 'int', possible loss of 
data 
[D:\DCOS\mesos\3rdparty\zookeeper-3.4.8\src\zookeeper-3.4.8-build\cli.vcxproj] 
[D:\DCOS\mesos\3rdparty\zookeeper-3.4.8.vcxproj]
 
d:\dcos\mesos\3rdparty\zookeeper-3.4.8\src\zookeeper-3.4.8\src\c\src\cli.c(301):
 warning C4267: 'initializing': conversion from 'size_t' to 'int', possible 
loss of data 
[D:\DCOS\mesos\3rdparty\zookeeper-3.4.8\src\zookeeper-3.4.8-build\cli.vcxproj] 
[D:\DCOS\mesos\3rdparty\zookeeper-3.4.8.vcxproj]
 
d:\dcos\mesos\3rdparty\zookeeper-3.4.8\src\zookeeper-3.4.8\src\c\src\cli.c(368):
 warning C4267: 'function': conversion from 'size_t' to 'int', possible loss of 
data 
[D:\DCOS\mesos\3rdparty\zookeeper-3.4.8\src\zookeeper-3.4.8-build\cli.vcxproj] 
[D:\DCOS\mesos\3rdparty\zookeeper-3.4.8.vcxproj]
 
d:\dcos\mesos\3rdparty\zookeeper-3.4.8\src\zookeeper-3.4.8\src\c\src\cli.c(372):
 warning C4267: 'function': conversion from 'size_t' to 'int', possible loss of 
data 
[D:\DCOS\mesos\3rdparty\zookeeper-3.4.8\src\zookeeper-3.4.8-build\cli.vcxproj] 
[D:\DCOS\mesos\3rdparty\zookeeper-3.4.8.vcxproj]
 
d:\dcos\mesos\3rdparty\zookeeper-3.4.8\src\zookeeper-3.4.8\src\c\src\cli.c(512):
 warning C4267: 'function': conversion from 'size_t' to 'int', possible loss of 
data 
[D:\DCOS\mesos\3rdparty\zookeeper-3.4.8\src\zookeeper-3.4.8-build\cli.vcxproj] 
[D:\DCOS\mesos\3rdparty\zookeeper-3.4.8.vcxproj]
 
d:\dcos\mesos\3rdparty\zookeeper-3.4.8\src\zookeeper-3.4.8\src\c\src\cli.c(543):
 warning C4996: 'strcpy': This function or variable may be unsafe. Consider 
using strcpy_s instead. To disable deprecation, use _CRT_SECURE_NO_WARNINGS. 
See online help for details. 
[D:\DCOS\mesos\3rdparty\zookeeper-3.4.8\src\zookeeper-3.4.8-build\cli.vcxproj] 
[D:\DCOS\mesos\3rdparty\zookeeper-3.4.8.vcxproj]
 
d:\dcos\mesos\3rdparty\zookeeper-3.4.8\src\zookeeper-3.4.8\src\c\src\cli.c(548):
 warning C4996: 'fopen': This function or variable may be unsafe. Consider 
using fopen_s instead. To disable deprecation, use _CRT_SECURE_NO_WARNINGS. See 
online help for details. 
[D:\DCOS\mesos\3rdparty\zookeeper-3.4.8\src\zookeeper-3.4.8-build\cli.vcxproj] 
[D:\DCOS\mesos\3rdparty\zookeeper-3.4.8.vcxproj]
 
d:\dcos\mesos\3rdparty\zookeeper-3.4.8\src\zookeeper-3.4.8\src\c\src\cli.c(569):
 warning C4996: 'strcpy': This function or variable may be unsafe. Consider 
using strcpy_s instead. To disable deprecation, use _CRT_SECURE_NO_WARNINGS. 
See online help for details. 
[D:\DCOS\mesos\3rdparty\zookeeper-3.4.8\src\zookeeper-3.4.8-build\cli.vcxproj] 
[D:\DCOS\mesos\3rdparty\zookeeper-3.4.8.vcxproj]


   "D:\DCOS\mesos\src\tests\mesos-tests.vcxproj" (default target) (1) ->
   "D:\DCOS\mesos\src\mesos-hdfs.vcxproj" (default target) (19) ->
   (ClCompile target) -> 
 d:\dcos\mesos\mesos\src\uri\utils.cpp : fatal error C1083: Cannot open 
precompiled header file: 'D:\DCOS\mesos\src\Debug\cotire\mesos_CXX_prefix.pch': 
No such file or directory [D:\DCOS\mesos\src\mesos-hdfs.vcxproj]


   "D:\DCOS\mesos\src\tests\mesos-tests.vcxproj" (default target) (1) ->
   

Re: Review Request 69708: Fixed gRPC CMake build issue on Ubuntu 14.04.

2019-01-15 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69708/#review212038
---


Ship it!




Ship It!

- Joseph Wu


On Jan. 15, 2019, 3:13 p.m., Chun-Hung Hsiao wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69708/
> ---
> 
> (Updated Jan. 15, 2019, 3:13 p.m.)
> 
> 
> Review request for mesos, Benjamin Bannier, BenjaminVW BenjaminVW, Gastón 
> Kleiman, Joseph Wu, Till Toenshoff, and Vinod Kone.
> 
> 
> Bugs: MESOS-9519
> https://issues.apache.org/jira/browse/MESOS-9519
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch cherry-picks an unofficial commit to disable ALPN compilation
> if only OpenSSL 1.0.1 is on the system.
> 
> 
> Diffs
> -
> 
>   3rdparty/CMakeLists.txt 703808d063e4bba58f647b5d48b78724003bcc4e 
>   3rdparty/grpc-1.10.0.patch 655f00387a6b308b653b23053419ec05c8b22144 
>   3rdparty/grpc.md e06843c8b6038eb9fb809241686fd611d1daedc8 
> 
> 
> Diff: https://reviews.apache.org/r/69708/diff/6/
> 
> 
> Testing
> ---
> 
> OS=ubuntu:14.04 BUILDTOOL=cmake COMPILER=gcc CONFIGURATION='--verbose 
> --enable-libevent --enable-ssl' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1' 
> support/docker-build.sh
> OS=ubuntu:16.04 BUILDTOOL=cmake COMPILER=gcc CONFIGURATION='--verbose 
> --enable-libevent --enable-ssl' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1' 
> support/docker-build.sh (to make sure that it works with OpenSSL 1.0.2. Also 
> manually inspected that ALPN symbols are linked).
> Manually tested on Debian 9 to ensure that the CMake rule can find its 
> OpenSSL 1.1.0.
> 
> NOTE: The patch itself does not depend on r/69725, but the ubuntu:16.04 test 
> above does.
> 
> 
> Thanks,
> 
> Chun-Hung Hsiao
> 
>



Re: Review Request 69708: Fixed gRPC CMake build issue on Ubuntu 14.04.

2019-01-15 Thread Chun-Hung Hsiao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69708/
---

(Updated Jan. 15, 2019, 11:13 p.m.)


Review request for mesos, Benjamin Bannier, BenjaminVW BenjaminVW, Gastón 
Kleiman, Joseph Wu, Till Toenshoff, and Vinod Kone.


Changes
---

Addressed Benjamin's comment.


Bugs: MESOS-9519
https://issues.apache.org/jira/browse/MESOS-9519


Repository: mesos


Description
---

This patch cherry-picks an unofficial commit to disable ALPN compilation
if only OpenSSL 1.0.1 is on the system.


Diffs (updated)
-

  3rdparty/CMakeLists.txt 703808d063e4bba58f647b5d48b78724003bcc4e 
  3rdparty/grpc-1.10.0.patch 655f00387a6b308b653b23053419ec05c8b22144 
  3rdparty/grpc.md e06843c8b6038eb9fb809241686fd611d1daedc8 


Diff: https://reviews.apache.org/r/69708/diff/6/

Changes: https://reviews.apache.org/r/69708/diff/5-6/


Testing
---

OS=ubuntu:14.04 BUILDTOOL=cmake COMPILER=gcc CONFIGURATION='--verbose 
--enable-libevent --enable-ssl' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1' 
support/docker-build.sh
OS=ubuntu:16.04 BUILDTOOL=cmake COMPILER=gcc CONFIGURATION='--verbose 
--enable-libevent --enable-ssl' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1' 
support/docker-build.sh (to make sure that it works with OpenSSL 1.0.2. Also 
manually inspected that ALPN symbols are linked).
Manually tested on Debian 9 to ensure that the CMake rule can find its OpenSSL 
1.1.0.

NOTE: The patch itself does not depend on r/69725, but the ubuntu:16.04 test 
above does.


Thanks,

Chun-Hung Hsiao



Review Request 69752: Split scheduler::Call validation into a separate file.

2019-01-15 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69752/
---

Review request for mesos, Benjamin Bannier and Till Toenshoff.


Repository: mesos


Description
---

This validation function is used by both the Scheduler and Master
while sending or receiving Calls, respectively.  By moving the function
into a separate file, we can build the scheduler library without
including all the other master validation logic.


Diffs
-

  src/CMakeLists.txt a574d449dc26b820cbef7ff0b5e94b42b6fe86cf 
  src/Makefile.am cd785255fcdf1302a8f9fa358039e5d1f200e132 
  src/java/jni/org_apache_mesos_v1_scheduler_V0Mesos.cpp 
4e4dd54007d9a4da88acc9e38bd48d1e599d74d3 
  src/master/http.cpp 012ee4f205ab3f2678b058fb0f3ebdac4336eca4 
  src/master/master.cpp 2339207149a85578ea47cf66f28392182f9075f2 
  src/master/validation.hpp a5bdff6a9301631dac9970568a1346460939c861 
  src/master/validation.cpp a71edeb7827b534e27ad3e3398abe555bf36e741 
  src/scheduler/scheduler.cpp cb24ba9c8e1d04b8c62bdf07b12758a61b3bf036 
  src/scheduler/validation.hpp PRE-CREATION 
  src/scheduler/validation.cpp PRE-CREATION 


Diff: https://reviews.apache.org/r/69752/diff/1/


Testing
---

cmake --build .


Thanks,

Joseph Wu



Review Request 69754: CMake: Changed mesos-execute to link against a smaller library.

2019-01-15 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69754/
---

Review request for mesos, Benjamin Bannier and Till Toenshoff.


Repository: mesos


Description
---

This also separates the HDFS library into a separate target, because
this will be used by the mesos-execute scheduler and a couple other
components in future.


Diffs
-

  src/CMakeLists.txt a574d449dc26b820cbef7ff0b5e94b42b6fe86cf 
  src/cli/CMakeLists.txt e99b055ebb0fc0a97955bcb71c74bd9a8c6923ed 


Diff: https://reviews.apache.org/r/69754/diff/1/


Testing
---

cmake --build . --target mesos-hdfs
cmake --build . --target mesos-execute


Thanks,

Joseph Wu



Review Request 69755: Moved flags and constants in MesosContainerLaunch into header.

2019-01-15 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69755/
---

Review request for mesos, Benjamin Bannier and Till Toenshoff.


Repository: mesos


Description
---

This removes the need to link 'src/containerizer/mesos/launch.cpp'
with the components (MesosContainerizer actor and Command Executor
binary) that use the 'mesos-containerizer' helper binary.


Diffs
-

  src/launcher/executor.cpp f962e800f23d5582b1bc04a263253893492a5054 
  src/slave/containerizer/mesos/containerizer.cpp 
5016f2e9f0651abcb0a5f364e8eace458f2edeae 
  src/slave/containerizer/mesos/launch.hpp 
0a6394d56321948ad760ac69c05456319a254842 
  src/slave/containerizer/mesos/launch.cpp 
2f1c9e7a8748c9d7eab25bc8567ca68308e680f9 
  src/tests/containerizer/port_mapping_tests.cpp 
b511016fa7cd80d2ffee1747e5e463cc1b2d56bb 


Diff: https://reviews.apache.org/r/69755/diff/1/


Testing
---

cmake --build .


Thanks,

Joseph Wu



Review Request 69753: CMake: Split part of libmesos into several smaller libraries.

2019-01-15 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69753/
---

Review request for mesos, Benjamin Bannier and Till Toenshoff.


Repository: mesos


Description
---

This incrementally splits the gigantic libmesos into some logical
chunks that can be linked to independently, depending on the scope
of the component.

The goal is that all components eventually link to "libmesos-common";
which only contains the 3rdparty libraries, libprocess, the messages
passed between components, and some utilities/helpers; instead of
linking to everything.

This commit also introduces the following two libraries:
  * libmesos-detector: Code used to detect the Mesos master.  Used by
the Master, Agent, and Scheduler.
  * libmesos-scheduler: Code used by native Schedulers.


Diffs
-

  src/CMakeLists.txt a574d449dc26b820cbef7ff0b5e94b42b6fe86cf 


Diff: https://reviews.apache.org/r/69753/diff/1/


Testing
---

cmake --built . --target mesos-common
cmake --built . --target mesos-detector
cmake --built . --target mesos-scheduler


Thanks,

Joseph Wu



Review Request 69751: CMake: Disabled "local" clusters in the scheduler libraries.

2019-01-15 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69751/
---

Review request for mesos, Benjamin Bannier and Till Toenshoff.


Repository: mesos


Description
---

This removes the ability to specify a "local" master in the scheduler
libraries, which subsequently spawns a master and agent.
The consequence of the local cluster is that the scheduler library
is currently required to include the entire master and agent source
tree to instantiate a master/agent in the process.  This dramatically
increases the build cost of this library, but only provides an
oft-unused bit of functionality.

The example framework tests are disabled temporarily because these
tests require the "local" cluster feature.


Diffs
-

  src/sched/sched.cpp e77a02951831a7e0c5d9a9068f8d014cb1478382 
  src/scheduler/scheduler.cpp cb24ba9c8e1d04b8c62bdf07b12758a61b3bf036 
  src/tests/CMakeLists.txt 7a1265e38e5e3b3a102862d231fc78adae14c4fe 


Diff: https://reviews.apache.org/r/69751/diff/1/


Testing
---

cmake --build .

The longer term fix for the examples tests is to rewrite them.  The example 
tests currently use the `TEST_SCRIPT` macro, which launches a script, which 
launches the framework.  Instead, we can write a normal unit test (MesosTest) 
which launches the framework directly.  The test itself can launch any required 
masters/agents.  A rewrite like this would have the benefit of greater control 
over the example tests, which currently only check if the script succeeds.

Once done, we can consider deleting the "local" cluster code entirely.


Thanks,

Joseph Wu



Re: Review Request 69721: Broke up `SSLTest.ProtocolMismatch` into smaller tests.

2019-01-15 Thread Till Toenshoff via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69721/#review212036
---


Ship it!




Ship It!

- Till Toenshoff


On Jan. 14, 2019, 7:54 p.m., Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69721/
> ---
> 
> (Updated Jan. 14, 2019, 7:54 p.m.)
> 
> 
> Review request for mesos and Till Toenshoff.
> 
> 
> Bugs: MESOS-5189
> https://issues.apache.org/jira/browse/MESOS-5189
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch breaks up the different iterations in
> `SSLTest.ProtocolMismatch` into smaller tests
> `SSLProtocol/SSLProtocolTest.Mismatch/N`. While this slightly
> increases the time to run all combinations sequentially, it leads to
> much shorter aggregated run time in parallel test execution.
> 
> 
> Diffs
> -
> 
>   3rdparty/libprocess/src/tests/ssl_tests.cpp 
> bfb5eab255230b7fbfdefaa62be27a8f8c4e9517 
> 
> 
> Diff: https://reviews.apache.org/r/69721/diff/2/
> 
> 
> Testing
> ---
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>



Re: Review Request 69720: Made `SSLTest` an unparameterized test suite.

2019-01-15 Thread Till Toenshoff via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69720/#review212037
---


Ship it!




Ship It!

- Till Toenshoff


On Jan. 14, 2019, 7:54 p.m., Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69720/
> ---
> 
> (Updated Jan. 14, 2019, 7:54 p.m.)
> 
> 
> Review request for mesos and Till Toenshoff.
> 
> 
> Bugs: MESOS-5189
> https://issues.apache.org/jira/browse/MESOS-5189
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This general and reusable testing class is made unparameterized since
> deriving from a parameterized test is cumbersome (e.g., requires manual
> disambiguation of possibly private types and methods inherited from the
> respective `::testing::WithParamInterface` base class).
> 
> This patch makes `SSLTest` an unparameterized test suite which eases
> reuse. We are able to remove a number of redundant test
> parameterizations along the way.
> 
> 
> Diffs
> -
> 
>   3rdparty/libprocess/include/process/ssl/gtest.hpp 
> 23d7aee963b6fb489403a94500d39e3413c7fcdd 
>   3rdparty/libprocess/src/tests/ssl_tests.cpp 
> bfb5eab255230b7fbfdefaa62be27a8f8c4e9517 
> 
> 
> Diff: https://reviews.apache.org/r/69720/diff/2/
> 
> 
> Testing
> ---
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>



Re: Review Request 69750: CMake: Changed linkage of mesos-tcp-connect.

2019-01-15 Thread Joseph Wu


> On Jan. 15, 2019, 12:53 p.m., Benjamin Bannier wrote:
> > Is this something we should create a JIRA for? I am asking since it could 
> > have a noticable impact on check performance due to faster loading and a 
> > candidate for (future) backporting.

Assuming we use the CMake build when deploying Mesos agents, then yes.  At the 
moment, most deployments use the autotools build, so this change wouldn't have 
any impact on performance.


> On Jan. 15, 2019, 12:53 p.m., Benjamin Bannier wrote:
> > src/checks/CMakeLists.txt
> > Line 20 (original), 20 (patched)
> > 
> >
> > While `tcp_connect.cpp` depends on `stout`, we seem to usually rely on 
> > the deps we declare to pull in other deps for us.

Yes.  I'll be making a few more changes which push various targets up the build 
chain (i.e. fewer dependencies).  In general, I will be relying on CMake to 
transitively pull in dependencies of dependencies where it makes sense to.


- Joseph


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69750/#review212033
---


On Jan. 15, 2019, 12:01 p.m., Joseph Wu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69750/
> ---
> 
> (Updated Jan. 15, 2019, 12:01 p.m.)
> 
> 
> Review request for mesos, Benjamin Bannier and Till Toenshoff.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This binary does not need to be linked to libmesos, as it
> only uses some basic stout/libprocess headers.
> 
> 
> Diffs
> -
> 
>   src/checks/CMakeLists.txt a3a6aed18429979cd43b9d8a7be9753f4bd3feba 
> 
> 
> Diff: https://reviews.apache.org/r/69750/diff/1/
> 
> 
> Testing
> ---
> 
> cmake --build . --target mesos-tcp-connect
> 
> 
> Thanks,
> 
> Joseph Wu
> 
>



Re: Review Request 69708: Fixed gRPC CMake build issue on Ubuntu 14.04.

2019-01-15 Thread Chun-Hung Hsiao


> On Jan. 15, 2019, 8:29 p.m., Benjamin Bannier wrote:
> > 3rdparty/grpc-1.10.0.patch
> > Lines 33 (patched)
> > 
> >
> > Not related to this patch, but I wonder how this was needed given that 
> > we didn't patch outside of `WIN32` before.

The patch originally had only `WIN32` changes, so only applied in 
`CMakeList.txt`. Then this commit was introduced for Autotools compilations, so 
it was applied in `Makefile.am` as well. But I forgot to do the corresponding 
adjustment in `CMakeList.txt` because it was not necessary until now.


> On Jan. 15, 2019, 8:29 p.m., Benjamin Bannier wrote:
> > 3rdparty/grpc-1.10.0.patch
> > Lines 64 (patched)
> > 
> >
> > Does this actually describe what this patch intends to do? You could 
> > also expand in the commit message body to make it easier to forward port 
> > this should we ever bump to a more recent bundled grpc version.

Yes. "fallbacking" meaning fallbacking to NPN. The most recent gRPC still 
doesn't have such logic in CMake, and the corresponding logic in its Makefile 
was replaced with a user-specified make variable.


- Chun-Hung


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69708/#review212032
---


On Jan. 14, 2019, 9:47 p.m., Chun-Hung Hsiao wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69708/
> ---
> 
> (Updated Jan. 14, 2019, 9:47 p.m.)
> 
> 
> Review request for mesos, Benjamin Bannier, BenjaminVW BenjaminVW, Gastón 
> Kleiman, Joseph Wu, Till Toenshoff, and Vinod Kone.
> 
> 
> Bugs: MESOS-9519
> https://issues.apache.org/jira/browse/MESOS-9519
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch cherry-picks an unofficial commit to disable ALPN compilation
> if only OpenSSL 1.0.1 is on the system.
> 
> 
> Diffs
> -
> 
>   3rdparty/CMakeLists.txt 703808d063e4bba58f647b5d48b78724003bcc4e 
>   3rdparty/grpc-1.10.0.patch 655f00387a6b308b653b23053419ec05c8b22144 
>   3rdparty/grpc.md e06843c8b6038eb9fb809241686fd611d1daedc8 
> 
> 
> Diff: https://reviews.apache.org/r/69708/diff/5/
> 
> 
> Testing
> ---
> 
> OS=ubuntu:14.04 BUILDTOOL=cmake COMPILER=gcc CONFIGURATION='--verbose 
> --enable-libevent --enable-ssl' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1' 
> support/docker-build.sh
> OS=ubuntu:16.04 BUILDTOOL=cmake COMPILER=gcc CONFIGURATION='--verbose 
> --enable-libevent --enable-ssl' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1' 
> support/docker-build.sh (to make sure that it works with OpenSSL 1.0.2. Also 
> manually inspected that ALPN symbols are linked).
> Manually tested on Debian 9 to ensure that the CMake rule can find its 
> OpenSSL 1.1.0.
> 
> NOTE: The patch itself does not depend on r/69725, but the ubuntu:16.04 test 
> above does.
> 
> 
> Thanks,
> 
> Chun-Hung Hsiao
> 
>



Re: Review Request 69750: CMake: Changed linkage of mesos-tcp-connect.

2019-01-15 Thread Benjamin Bannier

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69750/#review212033
---


Ship it!




Is this something we should create a JIRA for? I am asking since it could have 
a noticable impact on check performance due to faster loading and a candidate 
for (future) backporting.


src/checks/CMakeLists.txt
Line 20 (original), 20 (patched)


While `tcp_connect.cpp` depends on `stout`, we seem to usually rely on the 
deps we declare to pull in other deps for us.


- Benjamin Bannier


On Jan. 15, 2019, 9:01 p.m., Joseph Wu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69750/
> ---
> 
> (Updated Jan. 15, 2019, 9:01 p.m.)
> 
> 
> Review request for mesos, Benjamin Bannier and Till Toenshoff.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This binary does not need to be linked to libmesos, as it
> only uses some basic stout/libprocess headers.
> 
> 
> Diffs
> -
> 
>   src/checks/CMakeLists.txt a3a6aed18429979cd43b9d8a7be9753f4bd3feba 
> 
> 
> Diff: https://reviews.apache.org/r/69750/diff/1/
> 
> 
> Testing
> ---
> 
> cmake --build . --target mesos-tcp-connect
> 
> 
> Thanks,
> 
> Joseph Wu
> 
>



Re: Review Request 69708: Fixed gRPC CMake build issue on Ubuntu 14.04.

2019-01-15 Thread Benjamin Bannier

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69708/#review212032
---


Fix it, then Ship it!




LGTM.


3rdparty/grpc-1.10.0.patch
Lines 33 (patched)


Not related to this patch, but I wonder how this was needed given that we 
didn't patch outside of `WIN32` before.



3rdparty/grpc-1.10.0.patch
Lines 64 (patched)


Does this actually describe what this patch intends to do? You could also 
expand in the commit message body to make it easier to forward port this should 
we ever bump to a more recent bundled grpc version.


- Benjamin Bannier


On Jan. 14, 2019, 10:47 p.m., Chun-Hung Hsiao wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69708/
> ---
> 
> (Updated Jan. 14, 2019, 10:47 p.m.)
> 
> 
> Review request for mesos, Benjamin Bannier, BenjaminVW BenjaminVW, Gastón 
> Kleiman, Joseph Wu, Till Toenshoff, and Vinod Kone.
> 
> 
> Bugs: MESOS-9519
> https://issues.apache.org/jira/browse/MESOS-9519
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch cherry-picks an unofficial commit to disable ALPN compilation
> if only OpenSSL 1.0.1 is on the system.
> 
> 
> Diffs
> -
> 
>   3rdparty/CMakeLists.txt 703808d063e4bba58f647b5d48b78724003bcc4e 
>   3rdparty/grpc-1.10.0.patch 655f00387a6b308b653b23053419ec05c8b22144 
>   3rdparty/grpc.md e06843c8b6038eb9fb809241686fd611d1daedc8 
> 
> 
> Diff: https://reviews.apache.org/r/69708/diff/5/
> 
> 
> Testing
> ---
> 
> OS=ubuntu:14.04 BUILDTOOL=cmake COMPILER=gcc CONFIGURATION='--verbose 
> --enable-libevent --enable-ssl' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1' 
> support/docker-build.sh
> OS=ubuntu:16.04 BUILDTOOL=cmake COMPILER=gcc CONFIGURATION='--verbose 
> --enable-libevent --enable-ssl' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1' 
> support/docker-build.sh (to make sure that it works with OpenSSL 1.0.2. Also 
> manually inspected that ALPN symbols are linked).
> Manually tested on Debian 9 to ensure that the CMake rule can find its 
> OpenSSL 1.1.0.
> 
> NOTE: The patch itself does not depend on r/69725, but the ubuntu:16.04 test 
> above does.
> 
> 
> Thanks,
> 
> Chun-Hung Hsiao
> 
>



Re: Review Request 69719: Exposed subscriptions and disconnections RP manager metrics.

2019-01-15 Thread Chun-Hung Hsiao


> On Jan. 14, 2019, 11:55 p.m., Chun-Hung Hsiao wrote:
> > src/resource_provider/manager.cpp
> > Lines 823-832 (original), 827-836 (patched)
> > 
> >
> > How about moving this snippt to the end of this function, so the agent 
> > would receive `SUBSCRIBE` and then `DISCONNECT` no matter if the RP has a 
> > connection problem after it subscribes? This would also make the metric 
> > reveal the connection problem.
> 
> Benjamin Bannier wrote:
> This is supposed to be a counter for successful subscriptions, see below.

The problem I describe still remains. Moving the `return` statement around 
could help us to capture the disconnection in the metrics. WDYT?


- Chun-Hung


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69719/#review211971
---


On Jan. 15, 2019, 9:27 a.m., Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69719/
> ---
> 
> (Updated Jan. 15, 2019, 9:27 a.m.)
> 
> 
> Review request for mesos, Chun-Hung Hsiao and James DeFelice.
> 
> 
> Bugs: MESOS-9223
> https://issues.apache.org/jira/browse/MESOS-9223
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch adds monotonically increasing counters for subscriptions and
> disconnections of resource providers with the resource provider manager
> (`resource_provider_manager/subscriptions` and
> `resource_provider_manager/disconnections`, respectively). While the
> existing gauge `resource_provider_manager/subscribed` exposed the
> current state, these added counters can be used to monitor resource
> provider state for unexpected events.
> 
> 
> Diffs
> -
> 
>   src/resource_provider/manager.cpp 65852c629393f32fd582bfcff86d7ce14e5386ac 
>   src/tests/resource_provider_manager_tests.cpp 
> 455ce7d2c71f2815430b69a5475b2ccc343cd9af 
> 
> 
> Diff: https://reviews.apache.org/r/69719/diff/2/
> 
> 
> Testing
> ---
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>



Re: Review Request 69719: Exposed subscriptions and disconnections RP manager metrics.

2019-01-15 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69719/#review212031
---



PASS: Mesos patch 69719 was successfully built and tested.

Reviews applied: `['69606', '69719']`

All the build artifacts available at: 
http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/2781/mesos-review-69719

- Mesos Reviewbot Windows


On Jan. 15, 2019, 1:27 a.m., Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69719/
> ---
> 
> (Updated Jan. 15, 2019, 1:27 a.m.)
> 
> 
> Review request for mesos, Chun-Hung Hsiao and James DeFelice.
> 
> 
> Bugs: MESOS-9223
> https://issues.apache.org/jira/browse/MESOS-9223
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch adds monotonically increasing counters for subscriptions and
> disconnections of resource providers with the resource provider manager
> (`resource_provider_manager/subscriptions` and
> `resource_provider_manager/disconnections`, respectively). While the
> existing gauge `resource_provider_manager/subscribed` exposed the
> current state, these added counters can be used to monitor resource
> provider state for unexpected events.
> 
> 
> Diffs
> -
> 
>   src/resource_provider/manager.cpp 65852c629393f32fd582bfcff86d7ce14e5386ac 
>   src/tests/resource_provider_manager_tests.cpp 
> 455ce7d2c71f2815430b69a5475b2ccc343cd9af 
> 
> 
> Diff: https://reviews.apache.org/r/69719/diff/2/
> 
> 
> Testing
> ---
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>



Re: Review Request 69719: Exposed subscriptions and disconnections RP manager metrics.

2019-01-15 Thread Chun-Hung Hsiao


> On Jan. 14, 2019, 11:55 p.m., Chun-Hung Hsiao wrote:
> > src/resource_provider/manager.cpp
> > Lines 1004-1005 (patched)
> > 
> >
> > How about the following, to make it consistent with, e.g., 
> > `master/messages_register_framework`, and also make it less confusing w/ 
> > `resource_provider_manager/subscribed`?
> > ```
> > messages_subscribe("resource_provider_manager/messages_subscribe"),
> > messages_disconnect("resource_provider_manager/messages_disconnect")
> > ```
> 
> Benjamin Bannier wrote:
> These are not really counters for messages as disconnections have no 
> relation to messages the RP manager receives, and the subscriptions counter 
> is supposed to count successfull connections (not just all subscription 
> attempts (this allows correlating subscription and disconnection counters). 
> What do you think about naming them e.g., 
> `resource_provider_manager/events/[subscribe|disconnect]`? We could introduce 
> a separate counter for subscription messages later.

Hmm I was actually thinking about the number of `ResourceProviderMessage` 
enqueued by the RP manager (and received by the agent actor). But yeah the 
names I suggested here are confusing, as you pointed out that one might think 
they're messages received by the RP manager. (I also confused myself with 
`master/messages_register_framework`.)

Not sure if `events` would make it better. I'm fine with keeping it as-is or go 
with `events` now. Dropping.


- Chun-Hung


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69719/#review211971
---


On Jan. 15, 2019, 9:27 a.m., Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69719/
> ---
> 
> (Updated Jan. 15, 2019, 9:27 a.m.)
> 
> 
> Review request for mesos, Chun-Hung Hsiao and James DeFelice.
> 
> 
> Bugs: MESOS-9223
> https://issues.apache.org/jira/browse/MESOS-9223
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch adds monotonically increasing counters for subscriptions and
> disconnections of resource providers with the resource provider manager
> (`resource_provider_manager/subscriptions` and
> `resource_provider_manager/disconnections`, respectively). While the
> existing gauge `resource_provider_manager/subscribed` exposed the
> current state, these added counters can be used to monitor resource
> provider state for unexpected events.
> 
> 
> Diffs
> -
> 
>   src/resource_provider/manager.cpp 65852c629393f32fd582bfcff86d7ce14e5386ac 
>   src/tests/resource_provider_manager_tests.cpp 
> 455ce7d2c71f2815430b69a5475b2ccc343cd9af 
> 
> 
> Diff: https://reviews.apache.org/r/69719/diff/2/
> 
> 
> Testing
> ---
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>



Review Request 69749: CMake: Cleaned up a block of platform-specific sources.

2019-01-15 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69749/
---

Review request for mesos, Benjamin Bannier and Till Toenshoff.


Repository: mesos


Description
---

This change slightly modifies Posix and Windows builds.

The following isolators were included in all platforms, but the sources
were excluded from Windows builds:
  * environment_secret
  * filesystem/posix
  * posix/disk
  * posix/rlimits
  * volume/sandbox_path
These sources are now included in all platforms.

The `docker/volume` and `network/cni` sources were built on Posix,
but were only used on Linux.  These sources and targets are now excluded
on Posix.


Diffs
-

  src/CMakeLists.txt a574d449dc26b820cbef7ff0b5e94b42b6fe86cf 
  src/slave/CMakeLists.txt f42a4a6bda7ebb0af05df051bd5f07fbef33bf6d 
  src/slave/containerizer/mesos/CMakeLists.txt 
92f4a49fdab43bb6767021e0332e2eaa205b0fec 


Diff: https://reviews.apache.org/r/69749/diff/1/


Testing
---

cmake --build . --target tests

On OSX and Windows.


Thanks,

Joseph Wu



Review Request 69750: CMake: Changed linkage of mesos-tcp-connect.

2019-01-15 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69750/
---

Review request for mesos, Benjamin Bannier and Till Toenshoff.


Repository: mesos


Description
---

This binary does not need to be linked to libmesos, as it
only uses some basic stout/libprocess headers.


Diffs
-

  src/checks/CMakeLists.txt a3a6aed18429979cd43b9d8a7be9753f4bd3feba 


Diff: https://reviews.apache.org/r/69750/diff/1/


Testing
---

cmake --build . --target mesos-tcp-connect


Thanks,

Joseph Wu



Re: Review Request 68016: Added libseccomp to the build.

2019-01-15 Thread Gilbert Song

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68016/#review212028
---


Ship it!




Ship It!

- Gilbert Song


On Nov. 8, 2018, 7:23 a.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68016/
> ---
> 
> (Updated Nov. 8, 2018, 7:23 a.m.)
> 
> 
> Review request for mesos, Andrew Schwartzmeyer, Gilbert Song, Jie Yu, James 
> Peach, and Qian Zhang.
> 
> 
> Bugs: MESOS-9032
> https://issues.apache.org/jira/browse/MESOS-9032
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This library is needed to implement Seccomp syscall filtering in the
> Mesos containerizer. This patch introduces `seccomp-isolator` build
> flag, which is used to include or exclude sources related to Seccomp
> from the build. Since Seccomp is a Linux-specific feature, the flag
> is disabled by default. Enabling `seccomp-isolator` means either:
> 
> 1. Compiling and linking against the bundled version of libseccomp from
>sources (default).
> 
> 2. Linking against the libseccomp installed in the OS,
>if `--with-libseccomp` build flag is provided.
> 
> 
> Diffs
> -
> 
>   3rdparty/CMakeLists.txt 703808d063e4bba58f647b5d48b78724003bcc4e 
>   3rdparty/Makefile.am 99270f048573900cf41d0c62cfe3488b83d71820 
>   3rdparty/cmake/FindLIBSECCOMP.cmake PRE-CREATION 
>   3rdparty/cmake/Versions.cmake 69fc594ec5ba2887b20b88ec0767a5d801411411 
>   3rdparty/versions.am 99ef92087f6958d83ba415e84db5cbbb0c597573 
>   cmake/CompilationConfigure.cmake 2485a8a580dcc2ad9b026e389b6525ef3a19f98e 
>   configure.ac 6778f119570def1838e26cddf7b0192bfe6e37d4 
>   src/CMakeLists.txt a574d449dc26b820cbef7ff0b5e94b42b6fe86cf 
>   src/Makefile.am 188a47017221a931d8b965a4af5e033b77e6ce4e 
>   src/python/native_common/ext_modules.py.in 
> 1f2e6c131d18e3e2fbc2e865c4698c83e73b87ba 
> 
> 
> Diff: https://reviews.apache.org/r/68016/diff/15/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 69588: Removed outdated authorization logic for offer operations.

2019-01-15 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69588/#review212027
---




src/master/master.cpp
Lines 3604-3605 (original), 3604-3605 (patched)


Isn't the empty case valid? Also, use `Resources::reservationRole`?



src/master/master.cpp
Line 3662 (original), 3664 (patched)


The `reservations.empty()` case is actually an invalid call right? There's 
nothing to unreserve.

Should we be CHECKing that here or is that validated after this point today?



src/master/master.cpp
Lines 3723-3724 (original), 3723-3724 (patched)


Is the empty case valid?

Also, use `Resources::reservationRole`?



src/master/master.cpp
Line 3818 (original), 3826-3829 (patched)


Can you use `Resources::reservationRole` for this? I guess `*` is invalid 
for persistent volumes but we don't care here?



src/master/master.cpp
Lines 3877-3880 (original), 3892-3895 (patched)


ditto `Resources::reservationRole` and `*` being invalid, but I guess we 
don't care here?



src/master/master.cpp
Lines 3940-3943 (original), 3959-3962 (patched)


Ditto `Resources::reservationRole` and `*` being invalid (but I guess we 
don't care here?)


- Benjamin Mahler


On Dec. 19, 2018, 11:20 p.m., Chun-Hung Hsiao wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69588/
> ---
> 
> (Updated Dec. 19, 2018, 11:20 p.m.)
> 
> 
> Review request for mesos, Benjamin Bannier, Benjamin Mahler, Michael Park, 
> and Till Toenshoff.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Since the master now upgrades resources to the post-reservation-refinement
> format before authorization (see MESOS-7735), the authorization logic in the
> master becomes outdated. This patch cleans it up.
> 
> 
> Diffs
> -
> 
>   src/master/master.cpp b4faf2b077a0288ba36195b7a21402932489d316 
> 
> 
> Diff: https://reviews.apache.org/r/69588/diff/2/
> 
> 
> Testing
> ---
> 
> sudo make check
> 
> 
> Thanks,
> 
> Chun-Hung Hsiao
> 
>



Re: Review Request 69082: Correctly propagated `close` failures in some instances.

2019-01-15 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69082/#review212026
---



Should we be surfacing a close EINTR as an error or let that be silent?
I think these errors need some message pre-fixing? E.g.

```
Failed to close '3': Bad file number
```

As it stands the error messages will only show "Bad file number" and it won't 
be clear if the write or close produced this? It's also an issue here between 
open/write/flush but would be great to resolve this now.


3rdparty/stout/include/stout/os/write.hpp
Line 129 (original), 129 (patched)


Looks to me like s/result/write would read better?

```
  Try write = os::write(fd.get(), message);
  
  ...
  
  Try close = os::close(fd.get());
  
  // Propagate `close` errors if `write` succeeded.
  if (close.isError() && !write.isError()) {
return close;
  }
  
  return write;
```

Ditto for the others



3rdparty/stout/include/stout/os/write.hpp
Line 140 (original), 140 (patched)


newline above this to be consistent with the rest of the formatting in this 
function? ditto for the others


- Benjamin Mahler


On Jan. 15, 2019, 12:24 a.m., Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69082/
> ---
> 
> (Updated Jan. 15, 2019, 12:24 a.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, Chun-Hung Hsiao, and Alex Clemmer.
> 
> 
> Bugs: MESOS-9331
> https://issues.apache.org/jira/browse/MESOS-9331
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> When e.g., writing to disk, errors from write might only manifest at
> ::close time. We update some instances to correctly propagate these
> errors instead of dropping them silently. We only propagate the
> ::close error if the write operation succeeded, otherwise we just
> propagate the error from the write operation.
> 
> 
> Diffs
> -
> 
>   3rdparty/stout/include/stout/os/posix/mktemp.hpp 
> 63b3d1a7720d07f877fa1d4eb7f32a548916637a 
>   3rdparty/stout/include/stout/os/write.hpp 
> f7538f94f5a953a7a90a05bc1d2f138b6c17f814 
>   3rdparty/stout/include/stout/protobuf.hpp 
> eb4adef56f1701e3c101284e05e4e6c66eef9180 
> 
> 
> Diff: https://reviews.apache.org/r/69082/diff/2/
> 
> 
> Testing
> ---
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>



Re: Review Request 68017: Added Seccomp-related protobuf messages.

2019-01-15 Thread Andrei Budnik


> On Jan. 14, 2019, 11:38 p.m., Gilbert Song wrote:
> > include/mesos/seccomp/seccomp.proto
> > Lines 21-28 (patched)
> > 
> >
> > This is my fault. Sorry, Andrei!
> > 
> > Would you mind moving this .proto to /src/seccomp again? The previous 
> > suggestion was not appropriate after the second thought because:
> > 
> > proto files under /include:
> > They are public protobuf files that may be used by external/3th party 
> > modules, while some of the proto files like v1/v2.proto and fetcher.proto 
> > should be moved out from /include dir. this is a tech debt.
> > 
> > proto files under /src:
> > used for internal conversion.

I can't move it to `src/seccomp`, because `message ContainerLaunchInfo` (from 
`include/mesos/slave/containerizer.proto`) depends on it!


- Andrei


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68017/#review211972
---


On Nov. 8, 2018, 3:24 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68017/
> ---
> 
> (Updated Nov. 8, 2018, 3:24 p.m.)
> 
> 
> Review request for mesos, Gilbert Song, Jie Yu, James Peach, and Qian Zhang.
> 
> 
> Bugs: MESOS-9033
> https://issues.apache.org/jira/browse/MESOS-9033
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> See summary.
> 
> 
> Diffs
> -
> 
>   include/mesos/mesos.proto 2ef6ba3aef67cf34227569948fd3ddc8dfd5879d 
>   include/mesos/seccomp/seccomp.hpp PRE-CREATION 
>   include/mesos/seccomp/seccomp.proto PRE-CREATION 
>   include/mesos/slave/containerizer.proto 
> 5b4dcdda0f55ea3355c78d1447c7be9ca54d9dc9 
>   include/mesos/v1/mesos.proto 1a701da65f653fe4191f92ff1fb1436809b50acb 
>   src/CMakeLists.txt a574d449dc26b820cbef7ff0b5e94b42b6fe86cf 
>   src/Makefile.am cd785255fcdf1302a8f9fa358039e5d1f200e132 
> 
> 
> Diff: https://reviews.apache.org/r/68017/diff/15/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 68147: Added agent support to remove local resource providers.

2019-01-15 Thread Chun-Hung Hsiao


> On Aug. 17, 2018, 10:29 p.m., Chun-Hung Hsiao wrote:
> > src/slave/slave.cpp
> > Lines 8195 (patched)
> > 
> >
> > Let's validate that there is no task using the resources provided by 
> > this RP before doing the removal, or fail the call.
> 
> Benjamin Bannier wrote:
> I believe this should be treated as orthogonal issue. It might e.g., 
> happen that the user triggering the RP removal cannot kill tasks or prevent 
> them from being rescheduled. I'd suggest to leaves as is.
> 
> WDYT?
> 
> Chun-Hung Hsiao wrote:
> The scenario I am worried about is that when the RP is marked as gone, 
> the RP may fail to tear itself down if there is any task using its resources.
> 
> Benjamin Bannier wrote:
> We don't seem to have a good story on incompatible resource changes ATM. 
> If we e.g., update a RP config in an incompatible way, we make not attempt to 
> detect or rectify this in any way, either. At least in the agent we were in 
> general we were operating under the assumption that tasks would die on 
> incompatible changes to their resources.
> 
> I'd suggest to drop this issue for now and potentially file a JIRA to 
> track draining and situations where it might be needed.
> 
> Benjamin Bannier wrote:
> Dropping this, see above comment.
> 
> Chun-Hung Hsiao wrote:
> Reopening this. Here's my suggestion:
> 
> 1. Create a ticket for a proper design to drain the resource provider.
> 2. For now, let's fail the call here if 
> `slave->resourceProviders.at(resourceProviderId)->totalResources` is not 
> empty. In other words, the operator must first reconfigure the resource 
> provider to "drain" itself before making the `MARK_RESOURCE_PROVIDER_GONE` 
> call.
> 
> Benjamin Bannier wrote:
> Okay.
> 
> 1. I created https://issues.apache.org/jira/browse/MESOS-9522.
> 2. We had discussed this some time ago, and I think agreed that 
> preventing operators from marking resource providers as gone whenever a task 
> was running on them was not good (especially if operators cannot drain the 
> RP). We have prior art in `MARK_AGENT_GONE` which asks operators to drain 
> agent before marking as gone. I would suggest for now with MESOS-9522 not 
> implemented we allow operators to mark resource providers as gone even if 
> some of their resources are in use; if needed they can drain them by 
> filtering all tasks using their resources. This also wouldn't expose a race.
> 
> Does this address your concerns?

I agreed that checking if there is any task running on resources of an RP when 
marking it as gone would lead to some complexity, so I'm suggesting to use a 
stronger condition: there must be no resource.

I'm not convinced that asking the operator to drain the RP is a bad idea. My 
main concern is that, since `MARK_RESOURCE_PROVIDER_GONE` cannot be undone, we 
should only perform it when it is safe to do, so the operator can have a chance 
to fix the problem. Also, this restriction would ensure that no framework will 
have an inconsistent view of cluster resources. WDYT?


- Chun-Hung


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68147/#review207559
---


On Jan. 15, 2019, 3:03 p.m., Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68147/
> ---
> 
> (Updated Jan. 15, 2019, 3:03 p.m.)
> 
> 
> Review request for mesos, Chun-Hung Hsiao, Gastón Kleiman, Greg Mann, and Jan 
> Schlicht.
> 
> 
> Bugs: MESOS-8403
> https://issues.apache.org/jira/browse/MESOS-8403
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch adds support for triggering permanent removal of local
> resource providers. We also add authorization and tests as part of this
> patch.
> 
> 
> Diffs
> -
> 
>   include/mesos/agent/agent.proto 74488e873cbf99ca487403b70691912cf3788288 
>   include/mesos/v1/agent/agent.proto 5d1ab6fd7f6f2159367c66ed35ee90aea07f 
>   src/common/protobuf_utils.cpp a0159fed8d325808c5e8519da06173441deb 
>   src/master/master.cpp 2339207149a85578ea47cf66f28392182f9075f2 
>   src/slave/http.hpp 5b113fa2f7e3421d4219e0ece113010937b204c1 
>   src/slave/http.cpp e0dc09103e9186aa3f7328d2052fc6747f5be9bb 
>   src/slave/slave.hpp 2eadf5fce9a314f1ec0ac5d51820c6381f5f1468 
>   src/slave/slave.cpp 10cbc190acc7eea5734efa0066541168545a66b6 
>   src/slave/validation.cpp df5e1373dbe497bc859455dcaf4e064e923bd72e 
>   src/tests/api_tests.cpp 79a2d2966ff6533629aeeefaaaf19e849b76e0e6 
> 
> 
> Diff: https://reviews.apache.org/r/68147/diff/13/
> 
> 
> Testing
> ---
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>


Re: Review Request 68147: Added agent support to remove local resource providers.

2019-01-15 Thread Chun-Hung Hsiao


> On Jan. 15, 2019, 5:53 a.m., Chun-Hung Hsiao wrote:
> > src/tests/api_tests.cpp
> > Lines 7857 (patched)
> > 
> >
> > ```
> > Owned detector = master.get()->createDetector();
> > ```
> > And
> > ```
> > Try> slave = StartSlave(detector.get(), 
> > slavFlags);
> > ```
> > below.
> 
> Benjamin Bannier wrote:
> Could you explain the pattern we follow? I mostly went for a 
> StandaloneMasterDetector instead of an Owned because of the 
> conceptually simpler type (value vs pointer), but we seem to use both here.

I'm in favor of abstract types (e.g., `MasterDetector`) over concrete subtypes 
(e.g., `StandaloneMasterDetector`). Please feel free to drop this if you prefer 
avoiding the pointer type :)


- Chun-Hung


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68147/#review211997
---


On Jan. 15, 2019, 3:03 p.m., Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68147/
> ---
> 
> (Updated Jan. 15, 2019, 3:03 p.m.)
> 
> 
> Review request for mesos, Chun-Hung Hsiao, Gastón Kleiman, Greg Mann, and Jan 
> Schlicht.
> 
> 
> Bugs: MESOS-8403
> https://issues.apache.org/jira/browse/MESOS-8403
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch adds support for triggering permanent removal of local
> resource providers. We also add authorization and tests as part of this
> patch.
> 
> 
> Diffs
> -
> 
>   include/mesos/agent/agent.proto 74488e873cbf99ca487403b70691912cf3788288 
>   include/mesos/v1/agent/agent.proto 5d1ab6fd7f6f2159367c66ed35ee90aea07f 
>   src/common/protobuf_utils.cpp a0159fed8d325808c5e8519da06173441deb 
>   src/master/master.cpp 2339207149a85578ea47cf66f28392182f9075f2 
>   src/slave/http.hpp 5b113fa2f7e3421d4219e0ece113010937b204c1 
>   src/slave/http.cpp e0dc09103e9186aa3f7328d2052fc6747f5be9bb 
>   src/slave/slave.hpp 2eadf5fce9a314f1ec0ac5d51820c6381f5f1468 
>   src/slave/slave.cpp 10cbc190acc7eea5734efa0066541168545a66b6 
>   src/slave/validation.cpp df5e1373dbe497bc859455dcaf4e064e923bd72e 
>   src/tests/api_tests.cpp 79a2d2966ff6533629aeeefaaaf19e849b76e0e6 
> 
> 
> Diff: https://reviews.apache.org/r/68147/diff/13/
> 
> 
> Testing
> ---
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>



Re: Review Request 68017: Added Seccomp-related protobuf messages.

2019-01-15 Thread Andrei Budnik


> On Jan. 15, 2019, 2:39 a.m., Gilbert Song wrote:
> > include/mesos/mesos.proto
> > Lines 3158-3159 (patched)
> > 
> >
> > Seems like this was added recently.
> > 
> > Is this field only used when there is a default agent wide seccomp 
> > profile provided from the agent flag? and we reply on it to give an 
> > opportunity for user/framework to get rid of seccomp?
> > 
> > (probably more comment needed)
> > 
> > If it is the case, do we have other options for naming? e.g., 
> > `no_seccomp` (maybe more explicit)
> 
> Gilbert Song wrote:
> after second thought, I would suggest to remove this field for now. since 
> there is not use case yet (I understand your motivation: you want users could 
> get rid of seccomp if there is a default one). We could add this field later 
> if necessary. For now:
> 
> two options:
> 1. remove it
> 2. do it implicitly by using the optional profile_name: if seccompinfo 
> isSome but profile_name is None. do not set seccomp filtering for container
> 
> Gilbert Song wrote:
> I would prefer option #2.
> 
> The reason we want to avoid introducing `unconfined` now is that 
> framework could set both field at the same time and ideally we may need an 
> `enum type`. given the use case is not clear yet (people may not necessary to 
> use it yet). lets make it implicit for now.
> 
> Qian Zhang wrote:
> > do it implicitly by using the optional profile_name: if seccompinfo 
> isSome but profile_name is None. do not set seccomp filtering for container
> 
> So the semantic will be:
> 1. seccompinfo is None: The default seccomp profile (if any) will be 
> applied for the container.
> 2. seccompinfo is Some but profile_name is None: No seccomp filtering for 
> the container.
> 
> This is kind of unintuitive to me because none-seccompinfo and 
> none-profile_name will have totally different result.
> 
> Qian Zhang wrote:
> Second thought, do we really need to support framework to disable seccomp 
> filtering? I think that may not be what the operator wants. If the operator 
> specifies a default seccomp profile, that means the operator cares about the 
> security of the agent host, and he/she would expect the default seccomp 
> profile to be applied for each container unless framework overrides it with 
> another seccomp profile. Override does not mean disable because any seccomp 
> profiles that framework can use are also set up by operator under 
> `--seccomp_config_dir` (i.e., still under the operator's control), but if 
> framework can disable seccomp filtering for its containers, that means the 
> security of the agent host is out of the operator's control which seems kind 
> of security hole.
> 
> However, disabling seccomp filtering can still be supported (if some 
> operators want it) by not specifying the default seccomp profile, and in this 
> case if framework does not specify seccompinfo or profile_name, the seccomp 
> filtering will be disabled. But if operator specifies a default seccomp 
> profile, I think we should not support disabling seccomp filtering. And 
> another way to support disabling seccomp filtering is, operator specifies the 
> default seccomp profile and also put some other profiles under 
> `--seccomp_config_dir` and names them like `strict`, `loose` and `none`, and 
> there is no syscall filtering in the `none` profile which is equivalent to 
> disabling seccomp filtering.

I think we should adhere semantics for Seccomp configuration similar to Docker 
and Kubernetes:
https://docs.docker.com/engine/security/seccomp/ (especially, `You can pass 
unconfined to run a container without the default seccomp profile.`)
https://kubernetes.io/docs/concepts/policy/pod-security-policy/#seccomp


- Andrei


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68017/#review211986
---


On Nov. 8, 2018, 3:24 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68017/
> ---
> 
> (Updated Nov. 8, 2018, 3:24 p.m.)
> 
> 
> Review request for mesos, Gilbert Song, Jie Yu, James Peach, and Qian Zhang.
> 
> 
> Bugs: MESOS-9033
> https://issues.apache.org/jira/browse/MESOS-9033
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> See summary.
> 
> 
> Diffs
> -
> 
>   include/mesos/mesos.proto 2ef6ba3aef67cf34227569948fd3ddc8dfd5879d 
>   include/mesos/seccomp/seccomp.hpp PRE-CREATION 
>   include/mesos/seccomp/seccomp.proto PRE-CREATION 
>   include/mesos/slave/containerizer.proto 
> 5b4dcdda0f55ea3355c78d1447c7be9ca54d9dc9 
>   

Re: Review Request 68017: Added Seccomp-related protobuf messages.

2019-01-15 Thread Andrei Budnik


> On Jan. 15, 2019, 8:41 a.m., Qian Zhang wrote:
> > include/mesos/seccomp/seccomp.proto
> > Lines 79 (patched)
> > 
> >
> > So the value must be numeric? Do we support other types (like string 
> > value)?

> So the value must be numeric?

Yes.

> Do we support other types (like string value)?

No.

https://github.com/moby/moby/blob/8e610b2b55bfd1bfa9436ab110d311f5e8a74dcb/api/types/seccomp.go#L73


- Andrei


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68017/#review211942
---


On Nov. 8, 2018, 3:24 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68017/
> ---
> 
> (Updated Nov. 8, 2018, 3:24 p.m.)
> 
> 
> Review request for mesos, Gilbert Song, Jie Yu, James Peach, and Qian Zhang.
> 
> 
> Bugs: MESOS-9033
> https://issues.apache.org/jira/browse/MESOS-9033
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> See summary.
> 
> 
> Diffs
> -
> 
>   include/mesos/mesos.proto 2ef6ba3aef67cf34227569948fd3ddc8dfd5879d 
>   include/mesos/seccomp/seccomp.hpp PRE-CREATION 
>   include/mesos/seccomp/seccomp.proto PRE-CREATION 
>   include/mesos/slave/containerizer.proto 
> 5b4dcdda0f55ea3355c78d1447c7be9ca54d9dc9 
>   include/mesos/v1/mesos.proto 1a701da65f653fe4191f92ff1fb1436809b50acb 
>   src/CMakeLists.txt a574d449dc26b820cbef7ff0b5e94b42b6fe86cf 
>   src/Makefile.am cd785255fcdf1302a8f9fa358039e5d1f200e132 
> 
> 
> Diff: https://reviews.apache.org/r/68017/diff/15/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 68020: Added Seccomp-related flags to the agent.

2019-01-15 Thread Andrei Budnik


> On Jan. 15, 2019, 2:31 a.m., Qian Zhang wrote:
> > src/slave/flags.cpp
> > Lines 1399 (patched)
> > 
> >
> > Path or name?

Well.. technically it is a path. For example, if 
`--seccomp_config_dir=/home/nobody/seccomp`, then I can specify 
`--seccomp_profile_name=myseccomp/default.json`, resulting in a file path 
`/home/nobody/seccomp/myseccomp/default.json`.
Renaming `seccomp_profile_name` to `seccomp_profile_path` adds more confusion 
as we already have `seccomp_profile_dir`.


- Andrei


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68020/#review211983
---


On Nov. 8, 2018, 3:24 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68020/
> ---
> 
> (Updated Nov. 8, 2018, 3:24 p.m.)
> 
> 
> Review request for mesos, Gilbert Song, Jie Yu, James Peach, and Qian Zhang.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> `--seccomp_config_dir` and `--seccomp_profile_name` flags have been
> added to the agent. These flags are used by the `linux/seccomp`
> isolator to specify the path of the directory containing Seccomp
> profiles and the name of the default Seccomp profile.
> 
> 
> Diffs
> -
> 
>   docs/configuration/agent.md 330283f4e3957075dd4310de4a841feac23de36c 
>   src/slave/flags.hpp 494ae02ab5eb365e2cda5017be573691107c3f28 
>   src/slave/flags.cpp 6bac8e1409f04d639204c45eda8a90c098e3dbd0 
> 
> 
> Diff: https://reviews.apache.org/r/68020/diff/10/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 68021: Added `linux/seccomp` isolator.

2019-01-15 Thread Andrei Budnik


> On Jan. 15, 2019, 3:02 a.m., Qian Zhang wrote:
> > src/slave/containerizer/mesos/isolators/linux/seccomp.cpp
> > Lines 17-18 (patched)
> > 
> >
> > A newline between.

https://github.com/apache/mesos/blob/2aaf96ecbab316708afb401e43cad2f2f692f687/src/slave/containerizer/mesos/isolators/xfs/utils.cpp#L35-L38


> On Jan. 15, 2019, 3:02 a.m., Qian Zhang wrote:
> > src/slave/containerizer/mesos/isolators/linux/seccomp.cpp
> > Lines 61 (patched)
> > 
> >
> > Can we have a more explicit error message here? Like `The default 
> > seccomp profile is invalid: ...`.

Currenly, the error message is very long and pretty descriptive:
```
E0115 10:23:51.804989  2491 main.cpp:484] EXIT with status 1: Failed to create 
a containerizer: Could not create MesosContainerizer: Failed to create isolator 
'linux/seccomp': Failed to read Seccomp profile file 
'/home/abudnik/default.json': No such file or directory
```


> On Jan. 15, 2019, 3:02 a.m., Qian Zhang wrote:
> > src/slave/containerizer/mesos/isolators/linux/seccomp.cpp
> > Lines 97 (patched)
> > 
> >
> > Missing Seccomp profile name for container xxx.

I don't think we need to mention `containerId`. Other isolators don't specify 
`containerId` in failure messages.

We print `containerId` in `Http::_launchContainer` (and in 
`Slave::executorLaunched`) on failures.


- Andrei


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68021/#review211985
---


On Nov. 8, 2018, 3:24 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68021/
> ---
> 
> (Updated Nov. 8, 2018, 3:24 p.m.)
> 
> 
> Review request for mesos, Gilbert Song, Jie Yu, James Peach, and Qian Zhang.
> 
> 
> Bugs: MESOS-9035
> https://issues.apache.org/jira/browse/MESOS-9035
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch introduces `linux/seccomp` isolator which is used for
> preparing `ContainerSeccompProfile` for the Mesos containerizer
> launcher. If the `ContainerConfig` message has an info about Seccomp
> profile name, then this info will be used to locate a Seccomp profile.
> The given Seccomp profile is parsed and the resulting
> `ContainerSeccompProfile` is stored in the `ContainerLaunchInfo`
> message.
> 
> 
> Diffs
> -
> 
>   src/CMakeLists.txt a574d449dc26b820cbef7ff0b5e94b42b6fe86cf 
>   src/Makefile.am cd785255fcdf1302a8f9fa358039e5d1f200e132 
>   src/slave/containerizer/mesos/containerizer.cpp 
> 5016f2e9f0651abcb0a5f364e8eace458f2edeae 
>   src/slave/containerizer/mesos/isolators/linux/seccomp.hpp PRE-CREATION 
>   src/slave/containerizer/mesos/isolators/linux/seccomp.cpp PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/68021/diff/13/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 68147: Added agent support to remove local resource providers.

2019-01-15 Thread Benjamin Bannier

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68147/
---

(Updated Jan. 15, 2019, 4:03 p.m.)


Review request for mesos, Chun-Hung Hsiao, Gastón Kleiman, Greg Mann, and Jan 
Schlicht.


Bugs: MESOS-8403
https://issues.apache.org/jira/browse/MESOS-8403


Repository: mesos


Description
---

This patch adds support for triggering permanent removal of local
resource providers. We also add authorization and tests as part of this
patch.


Diffs (updated)
-

  include/mesos/agent/agent.proto 74488e873cbf99ca487403b70691912cf3788288 
  include/mesos/v1/agent/agent.proto 5d1ab6fd7f6f2159367c66ed35ee90aea07f 
  src/common/protobuf_utils.cpp a0159fed8d325808c5e8519da06173441deb 
  src/master/master.cpp 2339207149a85578ea47cf66f28392182f9075f2 
  src/slave/http.hpp 5b113fa2f7e3421d4219e0ece113010937b204c1 
  src/slave/http.cpp e0dc09103e9186aa3f7328d2052fc6747f5be9bb 
  src/slave/slave.hpp 2eadf5fce9a314f1ec0ac5d51820c6381f5f1468 
  src/slave/slave.cpp 10cbc190acc7eea5734efa0066541168545a66b6 
  src/slave/validation.cpp df5e1373dbe497bc859455dcaf4e064e923bd72e 
  src/tests/api_tests.cpp 79a2d2966ff6533629aeeefaaaf19e849b76e0e6 


Diff: https://reviews.apache.org/r/68147/diff/13/

Changes: https://reviews.apache.org/r/68147/diff/12-13/


Testing
---

`make check`


Thanks,

Benjamin Bannier



Re: Review Request 69687: Fixed flakiness of resource provider ContainerTerminationMetric test.

2019-01-15 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69687/#review212018
---



PASS: Mesos patch 69687 was successfully built and tested.

Reviews applied: `['69687']`

All the build artifacts available at: 
http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/2779/mesos-review-69687

- Mesos Reviewbot Windows


On Jan. 15, 2019, 8:31 a.m., Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69687/
> ---
> 
> (Updated Jan. 15, 2019, 8:31 a.m.)
> 
> 
> Review request for mesos and Chun-Hung Hsiao.
> 
> 
> Bugs: MESOS-9130
> https://issues.apache.org/jira/browse/MESOS-9130
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Plugin restarts can currently fail if the resource provider is killed
> before it has performed minimal reconciliation steps. This patch adds
> additional synchronization to ensure the plugin container can safely be
> killed to safely perform the desired test.
> 
> 
> Diffs
> -
> 
>   src/tests/storage_local_resource_provider_tests.cpp 
> e8ed20f818ed7f1a3ce15758ea3c366520443377 
> 
> 
> Diff: https://reviews.apache.org/r/69687/diff/2/
> 
> 
> Testing
> ---
> 
> * `make check`
> * ran test under stress for ~1000 iterations w/o issues
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>



Re: Review Request 69158: Added an integration test for resource provider removal.

2019-01-15 Thread Benjamin Bannier


> On Jan. 15, 2019, 7:08 a.m., Chun-Hung Hsiao wrote:
> > src/tests/slave_tests.cpp
> > Lines 10609 (patched)
> > 
> >
> > Maybe move this to 
> > `AgentAPITest.OperationStatusUpdateUponResourceProviderGone`? We have 
> > `MasterAPITest.TaskUpdatesUponAgentGone` there.

Unfortunately this tests multiple aspects (disconnection of gone RPs, 
transition of operations -- updated the test docstring to account for that). 
Given the former, I believe it makes more sense to keep it here. WDYT?


- Benjamin


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69158/#review211998
---


On Jan. 15, 2019, 4:03 p.m., Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69158/
> ---
> 
> (Updated Jan. 15, 2019, 4:03 p.m.)
> 
> 
> Review request for mesos, Chun-Hung Hsiao and Jan Schlicht.
> 
> 
> Bugs: MESOS-8403
> https://issues.apache.org/jira/browse/MESOS-8403
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added an integration test for resource provider removal.
> 
> 
> Diffs
> -
> 
>   src/tests/slave_tests.cpp dc711de96053ebecb6b71b98f14f22cb9b050c52 
> 
> 
> Diff: https://reviews.apache.org/r/69158/diff/5/
> 
> 
> Testing
> ---
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>



Re: Review Request 69158: Added an integration test for resource provider removal.

2019-01-15 Thread Benjamin Bannier


> On Jan. 15, 2019, 6:58 a.m., Chun-Hung Hsiao wrote:
> > src/tests/slave_tests.cpp
> > Lines 10625-10626 (patched)
> > 
> >
> > ```
> > Owned detector = master.get()->createDetector();
> > Try> slave = StartSlave(detector.get(), 
> > slavFlags);
> > ```

Could you explain the pattern we follow? I mostly went for a 
`StandaloneMasterDetector` instead of an `Owned` because of the 
conceptually simpler type (value vs pointer), but we seem to use both here.


- Benjamin


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69158/#review211996
---


On Jan. 15, 2019, 4:03 p.m., Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69158/
> ---
> 
> (Updated Jan. 15, 2019, 4:03 p.m.)
> 
> 
> Review request for mesos, Chun-Hung Hsiao and Jan Schlicht.
> 
> 
> Bugs: MESOS-8403
> https://issues.apache.org/jira/browse/MESOS-8403
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added an integration test for resource provider removal.
> 
> 
> Diffs
> -
> 
>   src/tests/slave_tests.cpp dc711de96053ebecb6b71b98f14f22cb9b050c52 
> 
> 
> Diff: https://reviews.apache.org/r/69158/diff/5/
> 
> 
> Testing
> ---
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>



Re: Review Request 69158: Added an integration test for resource provider removal.

2019-01-15 Thread Benjamin Bannier

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69158/
---

(Updated Jan. 15, 2019, 4:03 p.m.)


Review request for mesos, Chun-Hung Hsiao and Jan Schlicht.


Changes
---

Addressed comments from Chun.


Bugs: MESOS-8403
https://issues.apache.org/jira/browse/MESOS-8403


Repository: mesos


Description
---

Added an integration test for resource provider removal.


Diffs (updated)
-

  src/tests/slave_tests.cpp dc711de96053ebecb6b71b98f14f22cb9b050c52 


Diff: https://reviews.apache.org/r/69158/diff/5/

Changes: https://reviews.apache.org/r/69158/diff/4-5/


Testing
---

`make check`


Thanks,

Benjamin Bannier



Re: Review Request 68018: Added `SeccompFilter` class.

2019-01-15 Thread Andrei Budnik


> On Jan. 3, 2019, 1:58 a.m., Gilbert Song wrote:
> > src/linux/seccomp/seccomp.cpp
> > Lines 141-144 (patched)
> > 
> >
> > Instead of always set `SCMP_FLTATR_CTL_NNP`. Should we consider to 
> > check root privileges first (e.g., `geteuid() != 0`)?
> 
> Andrei Budnik wrote:
> By default, libseccomp sets `true` to the `SCMP_FLTATR_CTL_NNP` flag
> 
> https://github.com/seccomp/libseccomp/blob/1e64feb5f1a9ea02687228e3073e8b784a04ce46/src/db.c#L960
> 
> Hence, all Seccomp test pass even after removing `seccomp_attr_set(ctx, 
> SCMP_FLTATR_CTL_NNP, 1)`. Also, this means that Docker daemon launches its 
> containers with this flag set by default (as they also use libseccomp).
> 
> Disabling `SCMP_FLTATR_CTL_NNP` flag for a `root` means that Seccomp 
> filter can be reverted anytime. So, disabling this flag is meaningless.
> 
> Gilbert Song wrote:
> Gotcha. Does it imply that task launched by docker daemon with seccomp 
> profile enabled cannot setuid (assuming docker relies on execve/execvpe)?

No, it doesn't. See my comment 
https://reviews.apache.org/r/68018/#comment297515 starting with `Docker daemon 
can not be used to run arbitrary programs...`.


- Andrei


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68018/#review211623
---


On Nov. 8, 2018, 3:24 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68018/
> ---
> 
> (Updated Nov. 8, 2018, 3:24 p.m.)
> 
> 
> Review request for mesos, Gilbert Song, Jie Yu, James Peach, and Qian Zhang.
> 
> 
> Bugs: MESOS-9034
> https://issues.apache.org/jira/browse/MESOS-9034
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> `SeccompFilter` class is a wrapper for `libseccomp` API. Its main
> purpose is to provide a translation of the `ContainerSeccompProfile`
> message into calls of `libseccomp` API.
> 
> 
> Diffs
> -
> 
>   src/CMakeLists.txt a574d449dc26b820cbef7ff0b5e94b42b6fe86cf 
>   src/Makefile.am cd785255fcdf1302a8f9fa358039e5d1f200e132 
>   src/linux/seccomp/seccomp.hpp PRE-CREATION 
>   src/linux/seccomp/seccomp.cpp PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/68018/diff/15/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



[GitHub] dlazarus commented on a change in pull request #324: MESOS-9499 extended URI syntax to support any Zookeeper authentication schemes

2019-01-15 Thread GitBox
dlazarus commented on a change in pull request #324: MESOS-9499 extended URI 
syntax to support any Zookeeper authentication schemes
URL: https://github.com/apache/mesos/pull/324#discussion_r247926266
 
 

 ##
 File path: include/mesos/zookeeper/url.hpp
 ##
 @@ -101,16 +102,35 @@ inline Try URL::parse(const std::string& url)
   // Look for the trailing '@' (if any), that's where servers starts.
   size_t index = s.find_last_of('@');
 
-  if (index != std::string::npos) {
-return URL(s.substr(0, index), s.substr(index + 1), path);
-  } else {
+  if (index == std::string::npos)
 return URL(s, path);
+
+  std::string servers = s.substr(index + 1);
+  std::string auth = s.substr(0, index);
+
+  size_t schemeDelimiter = auth.find_first_of('!');
+
+  // If there is not '!' in URL scheme is "digest" and everything before '@' 
is credentials
+  std::string scheme = "digest";
+  std::string credentials = auth;
 
 Review comment:
   It is not in the hostname. Look:
   example: zk://simple!login:password@host:port/path
   hostname starts after '@'.
   Everything after zk:// and before '@' has nothing to do with DNS and 
hostnames.
   
   Check https://tools.ietf.org/html/rfc3986
   Page 49 describes allowed delimiters in this part, explicitly including '!'.
   Section 3.2.1 describes this part as "User Information".
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


Re: Review Request 68018: Added `SeccompFilter` class.

2019-01-15 Thread Andrei Budnik


> On Jan. 14, 2019, 8:31 a.m., Qian Zhang wrote:
> > src/linux/seccomp/seccomp.cpp
> > Lines 137-139 (patched)
> > 
> >
> > Will this affect the task run by Mesos? E.g., a task may want to run a 
> > program which has `set-user-ID` bit.
> 
> Andrei Budnik wrote:
> Yes, `no_new_privs` flag affects the task that wants to run a program 
> which has `set-user-ID` bit.
> E.g., launching a `ping -c 3 8.8.8.8` fails with seccomp. You'll see a 
> message in executor logs:
> ```
> I0114 07:19:21.887670 13264 executor.cpp:706] Forked command at 13276
> ping: socket: Operation not permitted
> I0114 07:19:22.055352 13263 executor.cpp:1007] Command exited with status 
> 2 (pid: 13276)
> ```
> 
> Also, see my previous comment 
> https://reviews.apache.org/r/68018/#comment297000
> 
> Qian Zhang wrote:
> In your previous comment, you mentioned that Docker daemon launches its 
> containers with `SCMP_FLTATR_CTL_NNP` flag set by default, does that mean any 
> containers launched by Docker daemon cannot run program which has set-user-ID 
> bit?
> 
> This seems unfortunate since it might break some use cases or 
> applications that we already supported. And can you please elaborate a bit 
> about `"Disabling SCMP_FLTATR_CTL_NNP flag for a root means that Seccomp 
> filter can be reverted anytime"`? How will the Seccomp filter be reverted? Do 
> you mean the task launched by Mesos can call libseccomp API to revert the 
> filter itself?
> 
> If we have to live with this limitation (i.e., cannot run program which 
> has set-user-ID bit), then we need to highlight it in the document.
> 
> Gilbert Song wrote:
> Seems like we asked the same question.
> 
> Andrei, let align on this thread? :/thanks:)

>does that mean any containers launched by Docker daemon cannot run program 
>which has set-user-ID bit?

Docker daemon can not be used to run arbitrary programs (in opposity to Mesos 
c'zer). So, when one launches a Docker container, Docker daemon launches a 
container process with `NNP` bit set, which means that a container process (and 
it descendants) can't gain more previleges **outside** its container. Mesos 
containerizer has exactly the same behaviour:

1) Run system-provided `/bin/ping` (*outside* its container) as a 
non-privileged user:
```
$ ./src/mesos-execute --master="`hostname`:5050" --name="a" 
--containerizer=mesos --command="ping -c 3 8.8.8.8"
...
Received status update TASK_FAILED for task 'a'
  message: 'Command exited with status 2'
  source: SOURCE_EXECUTOR
```

2) Run system-provided `/bin/ping` (*outside* its container) as a privileged 
user:
```
sudo ./src/mesos-execute --master="`hostname`:5050" --name="a" 
--containerizer=mesos --command="ping -c 3 8.8.8.8"
...
Received status update TASK_FINISHED for task 'a'
  message: 'Command exited with status 0'
  source: SOURCE_EXECUTOR
```

3) Run container image provided `ping` (*inside* its image/container) as a 
non-privileged user:
```
$ ./src/mesos-execute --master="`hostname`:5050" --name="a" 
--containerizer=mesos --docker_image="fedora:latest" --command="yum -y install 
iputils;ping -c 3 8.8.8.8"
...
Received status update TASK_FINISHED for task 'a'
  message: 'Command exited with status 0'
  source: SOURCE_EXECUTOR

$ cat /path/to/container/stdout
...
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=122 time=13.9 ms
```

> This seems unfortunate since it might break some use cases or applications 
> that we already supported.

It's very unlikely that the agent launches tasks, whose binary has 
`setuid`/`setgid` bit specified. Because... what the point?
I doubt if any of the following programs a launched as a Mesos container:
```
$ sudo find /bin/ -perm -u=s -type f 2>/dev/null
/bin/newgrp
/bin/pkexec
/bin/mount
/bin/umount
/bin/newuidmap
/bin/newgidmap
/bin/sudo
/bin/crontab
/bin/su
/bin/gpasswd
/bin/chage
/bin/passwd
/bin/staprun
/bin/fusermount
/bin/fusermount-glusterfs
/bin/chfn
/bin/chsh
/bin/at
```

> And can you please elaborate a bit about "Disabling SCMP_FLTATR_CTL_NNP flag 
> for a root means that Seccomp filter can be reverted anytime"? How will the 
> Seccomp filter be reverted? Do you mean the task launched by Mesos can call 
> libseccomp API to revert the filter itself?

Yes, without `NNP` (`no_new_privs`) bit set, a privileged task might call 
`seccomp` Linux syscall to install an empty Seccomp filter.


- Andrei


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68018/#review211946
---


On Nov. 8, 2018, 3:24 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68018/
> 

Re: Review Request 68147: Added agent support to remove local resource providers.

2019-01-15 Thread Benjamin Bannier


> On Jan. 15, 2019, 6:42 a.m., Chun-Hung Hsiao wrote:
> > src/master/master.cpp
> > Lines 11248 (patched)
> > 
> >
> > Can you add a comment explaining why it is not necessary to call 
> > `allocator->recoverResources(...)` here, and that the 
> > `slave->recoverResources(...)` and `framework->recoverResources(...)` below 
> > are unrelated and thus the bookkeeping would still be consistent even if 
> > `allocator->recoverResources(...)` is not called?

I moved into a block together with the other failed, terminal status since this 
makes immediate sense. I also updated the test to assert that the master sees 
the status update before the `UpdateSlaveMessage` as the agent only forwards 
status updates for operations it knows and the resource accounting in the 
master is only safe for that case.


- Benjamin


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68147/#review211993
---


On Jan. 15, 2019, 3:53 p.m., Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68147/
> ---
> 
> (Updated Jan. 15, 2019, 3:53 p.m.)
> 
> 
> Review request for mesos, Chun-Hung Hsiao, Gastón Kleiman, Greg Mann, and Jan 
> Schlicht.
> 
> 
> Bugs: MESOS-8403
> https://issues.apache.org/jira/browse/MESOS-8403
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch adds support for triggering permanent removal of local
> resource providers. We also add authorization and tests as part of this
> patch.
> 
> 
> Diffs
> -
> 
>   include/mesos/agent/agent.proto 74488e873cbf99ca487403b70691912cf3788288 
>   include/mesos/v1/agent/agent.proto 5d1ab6fd7f6f2159367c66ed35ee90aea07f 
>   src/common/protobuf_utils.cpp a45607eed4c4bae5010bcc3f3ffeabd6d911062a 
>   src/master/master.cpp 3b3824a67f46866cd64e32d7f9f92484b5891aa2 
>   src/slave/http.hpp 5b113fa2f7e3421d4219e0ece113010937b204c1 
>   src/slave/http.cpp ba389263ef7ab311682d542424eb56360841d24c 
>   src/slave/slave.hpp edf7269d4057ec8c95bb54c855210ad00d002a50 
>   src/slave/slave.cpp 858b78620e1ef33f3587d0bd95a684917aaf5bbb 
>   src/slave/validation.cpp df5e1373dbe497bc859455dcaf4e064e923bd72e 
>   src/tests/api_tests.cpp edde48a4b4b46c9f47bc44de389c97b77322b8e8 
> 
> 
> Diff: https://reviews.apache.org/r/68147/diff/12/
> 
> 
> Testing
> ---
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>



Re: Review Request 68147: Added agent support to remove local resource providers.

2019-01-15 Thread Benjamin Bannier


> On Jan. 15, 2019, 6:53 a.m., Chun-Hung Hsiao wrote:
> > src/tests/api_tests.cpp
> > Lines 7857 (patched)
> > 
> >
> > ```
> > Owned detector = master.get()->createDetector();
> > ```
> > And
> > ```
> > Try> slave = StartSlave(detector.get(), 
> > slavFlags);
> > ```
> > below.

Could you explain the pattern we follow? I mostly went for a 
StandaloneMasterDetector instead of an Owned because of the 
conceptually simpler type (value vs pointer), but we seem to use both here.


- Benjamin


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68147/#review211997
---


On Jan. 15, 2019, 3:53 p.m., Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68147/
> ---
> 
> (Updated Jan. 15, 2019, 3:53 p.m.)
> 
> 
> Review request for mesos, Chun-Hung Hsiao, Gastón Kleiman, Greg Mann, and Jan 
> Schlicht.
> 
> 
> Bugs: MESOS-8403
> https://issues.apache.org/jira/browse/MESOS-8403
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch adds support for triggering permanent removal of local
> resource providers. We also add authorization and tests as part of this
> patch.
> 
> 
> Diffs
> -
> 
>   include/mesos/agent/agent.proto 74488e873cbf99ca487403b70691912cf3788288 
>   include/mesos/v1/agent/agent.proto 5d1ab6fd7f6f2159367c66ed35ee90aea07f 
>   src/common/protobuf_utils.cpp a45607eed4c4bae5010bcc3f3ffeabd6d911062a 
>   src/master/master.cpp 3b3824a67f46866cd64e32d7f9f92484b5891aa2 
>   src/slave/http.hpp 5b113fa2f7e3421d4219e0ece113010937b204c1 
>   src/slave/http.cpp ba389263ef7ab311682d542424eb56360841d24c 
>   src/slave/slave.hpp edf7269d4057ec8c95bb54c855210ad00d002a50 
>   src/slave/slave.cpp 858b78620e1ef33f3587d0bd95a684917aaf5bbb 
>   src/slave/validation.cpp df5e1373dbe497bc859455dcaf4e064e923bd72e 
>   src/tests/api_tests.cpp edde48a4b4b46c9f47bc44de389c97b77322b8e8 
> 
> 
> Diff: https://reviews.apache.org/r/68147/diff/12/
> 
> 
> Testing
> ---
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>



Re: Review Request 68147: Added agent support to remove local resource providers.

2019-01-15 Thread Benjamin Bannier


> On Aug. 18, 2018, 12:29 a.m., Chun-Hung Hsiao wrote:
> > src/slave/slave.cpp
> > Lines 8195 (patched)
> > 
> >
> > Let's validate that there is no task using the resources provided by 
> > this RP before doing the removal, or fail the call.
> 
> Benjamin Bannier wrote:
> I believe this should be treated as orthogonal issue. It might e.g., 
> happen that the user triggering the RP removal cannot kill tasks or prevent 
> them from being rescheduled. I'd suggest to leaves as is.
> 
> WDYT?
> 
> Chun-Hung Hsiao wrote:
> The scenario I am worried about is that when the RP is marked as gone, 
> the RP may fail to tear itself down if there is any task using its resources.
> 
> Benjamin Bannier wrote:
> We don't seem to have a good story on incompatible resource changes ATM. 
> If we e.g., update a RP config in an incompatible way, we make not attempt to 
> detect or rectify this in any way, either. At least in the agent we were in 
> general we were operating under the assumption that tasks would die on 
> incompatible changes to their resources.
> 
> I'd suggest to drop this issue for now and potentially file a JIRA to 
> track draining and situations where it might be needed.
> 
> Benjamin Bannier wrote:
> Dropping this, see above comment.
> 
> Chun-Hung Hsiao wrote:
> Reopening this. Here's my suggestion:
> 
> 1. Create a ticket for a proper design to drain the resource provider.
> 2. For now, let's fail the call here if 
> `slave->resourceProviders.at(resourceProviderId)->totalResources` is not 
> empty. In other words, the operator must first reconfigure the resource 
> provider to "drain" itself before making the `MARK_RESOURCE_PROVIDER_GONE` 
> call.

Okay.

1. I created https://issues.apache.org/jira/browse/MESOS-9522.
2. We had discussed this some time ago, and I think agreed that preventing 
operators from marking resource providers as gone whenever a task was running 
on them was not good (especially if operators cannot drain the RP). We have 
prior art in `MARK_AGENT_GONE` which asks operators to drain agent before 
marking as gone. I would suggest for now with MESOS-9522 not implemented we 
allow operators to mark resource providers as gone even if some of their 
resources are in use; if needed they can drain them by filtering all tasks 
using their resources. This also wouldn't expose a race.

Does this address your concerns?


- Benjamin


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68147/#review207559
---


On Jan. 15, 2019, 3:53 p.m., Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68147/
> ---
> 
> (Updated Jan. 15, 2019, 3:53 p.m.)
> 
> 
> Review request for mesos, Chun-Hung Hsiao, Gastón Kleiman, Greg Mann, and Jan 
> Schlicht.
> 
> 
> Bugs: MESOS-8403
> https://issues.apache.org/jira/browse/MESOS-8403
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch adds support for triggering permanent removal of local
> resource providers. We also add authorization and tests as part of this
> patch.
> 
> 
> Diffs
> -
> 
>   include/mesos/agent/agent.proto 74488e873cbf99ca487403b70691912cf3788288 
>   include/mesos/v1/agent/agent.proto 5d1ab6fd7f6f2159367c66ed35ee90aea07f 
>   src/common/protobuf_utils.cpp a45607eed4c4bae5010bcc3f3ffeabd6d911062a 
>   src/master/master.cpp 3b3824a67f46866cd64e32d7f9f92484b5891aa2 
>   src/slave/http.hpp 5b113fa2f7e3421d4219e0ece113010937b204c1 
>   src/slave/http.cpp ba389263ef7ab311682d542424eb56360841d24c 
>   src/slave/slave.hpp edf7269d4057ec8c95bb54c855210ad00d002a50 
>   src/slave/slave.cpp 858b78620e1ef33f3587d0bd95a684917aaf5bbb 
>   src/slave/validation.cpp df5e1373dbe497bc859455dcaf4e064e923bd72e 
>   src/tests/api_tests.cpp edde48a4b4b46c9f47bc44de389c97b77322b8e8 
> 
> 
> Diff: https://reviews.apache.org/r/68147/diff/12/
> 
> 
> Testing
> ---
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>



Re: Review Request 68147: Added agent support to remove local resource providers.

2019-01-15 Thread Benjamin Bannier

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68147/
---

(Updated Jan. 15, 2019, 3:53 p.m.)


Review request for mesos, Chun-Hung Hsiao, Gastón Kleiman, Greg Mann, and Jan 
Schlicht.


Changes
---

Addressed comments from Chun.


Bugs: MESOS-8403
https://issues.apache.org/jira/browse/MESOS-8403


Repository: mesos


Description
---

This patch adds support for triggering permanent removal of local
resource providers. We also add authorization and tests as part of this
patch.


Diffs
-

  include/mesos/agent/agent.proto 74488e873cbf99ca487403b70691912cf3788288 
  include/mesos/v1/agent/agent.proto 5d1ab6fd7f6f2159367c66ed35ee90aea07f 
  src/common/protobuf_utils.cpp a45607eed4c4bae5010bcc3f3ffeabd6d911062a 
  src/master/master.cpp 3b3824a67f46866cd64e32d7f9f92484b5891aa2 
  src/slave/http.hpp 5b113fa2f7e3421d4219e0ece113010937b204c1 
  src/slave/http.cpp ba389263ef7ab311682d542424eb56360841d24c 
  src/slave/slave.hpp edf7269d4057ec8c95bb54c855210ad00d002a50 
  src/slave/slave.cpp 858b78620e1ef33f3587d0bd95a684917aaf5bbb 
  src/slave/validation.cpp df5e1373dbe497bc859455dcaf4e064e923bd72e 
  src/tests/api_tests.cpp edde48a4b4b46c9f47bc44de389c97b77322b8e8 


Diff: https://reviews.apache.org/r/68147/diff/12/


Testing
---

`make check`


Thanks,

Benjamin Bannier



Re: Review Request 68017: Added Seccomp-related protobuf messages.

2019-01-15 Thread Qian Zhang


> On Jan. 15, 2019, 10:39 a.m., Gilbert Song wrote:
> > include/mesos/mesos.proto
> > Lines 3158-3159 (patched)
> > 
> >
> > Seems like this was added recently.
> > 
> > Is this field only used when there is a default agent wide seccomp 
> > profile provided from the agent flag? and we reply on it to give an 
> > opportunity for user/framework to get rid of seccomp?
> > 
> > (probably more comment needed)
> > 
> > If it is the case, do we have other options for naming? e.g., 
> > `no_seccomp` (maybe more explicit)
> 
> Gilbert Song wrote:
> after second thought, I would suggest to remove this field for now. since 
> there is not use case yet (I understand your motivation: you want users could 
> get rid of seccomp if there is a default one). We could add this field later 
> if necessary. For now:
> 
> two options:
> 1. remove it
> 2. do it implicitly by using the optional profile_name: if seccompinfo 
> isSome but profile_name is None. do not set seccomp filtering for container
> 
> Gilbert Song wrote:
> I would prefer option #2.
> 
> The reason we want to avoid introducing `unconfined` now is that 
> framework could set both field at the same time and ideally we may need an 
> `enum type`. given the use case is not clear yet (people may not necessary to 
> use it yet). lets make it implicit for now.
> 
> Qian Zhang wrote:
> > do it implicitly by using the optional profile_name: if seccompinfo 
> isSome but profile_name is None. do not set seccomp filtering for container
> 
> So the semantic will be:
> 1. seccompinfo is None: The default seccomp profile (if any) will be 
> applied for the container.
> 2. seccompinfo is Some but profile_name is None: No seccomp filtering for 
> the container.
> 
> This is kind of unintuitive to me because none-seccompinfo and 
> none-profile_name will have totally different result.

Second thought, do we really need to support framework to disable seccomp 
filtering? I think that may not be what the operator wants. If the operator 
specifies a default seccomp profile, that means the operator cares about the 
security of the agent host, and he/she would expect the default seccomp profile 
to be applied for each container unless framework overrides it with another 
seccomp profile. Override does not mean disable because any seccomp profiles 
that framework can use are also set up by operator under `--seccomp_config_dir` 
(i.e., still under the operator's control), but if framework can disable 
seccomp filtering for its containers, that means the security of the agent host 
is out of the operator's control which seems kind of security hole.

However, disabling seccomp filtering can still be supported (if some operators 
want it) by not specifying the default seccomp profile, and in this case if 
framework does not specify seccompinfo or profile_name, the seccomp filtering 
will be disabled. But if operator specifies a default seccomp profile, I think 
we should not support disabling seccomp filtering. And another way to support 
disabling seccomp filtering is, operator specifies the default seccomp profile 
and also put some other profiles under `--seccomp_config_dir` and names them 
like `strict`, `loose` and `none`, and there is no syscall filtering in the 
`none` profile which is equivalent to disabling seccomp filtering.


- Qian


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68017/#review211986
---


On Nov. 8, 2018, 11:24 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68017/
> ---
> 
> (Updated Nov. 8, 2018, 11:24 p.m.)
> 
> 
> Review request for mesos, Gilbert Song, Jie Yu, James Peach, and Qian Zhang.
> 
> 
> Bugs: MESOS-9033
> https://issues.apache.org/jira/browse/MESOS-9033
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> See summary.
> 
> 
> Diffs
> -
> 
>   include/mesos/mesos.proto 2ef6ba3aef67cf34227569948fd3ddc8dfd5879d 
>   include/mesos/seccomp/seccomp.hpp PRE-CREATION 
>   include/mesos/seccomp/seccomp.proto PRE-CREATION 
>   include/mesos/slave/containerizer.proto 
> 5b4dcdda0f55ea3355c78d1447c7be9ca54d9dc9 
>   include/mesos/v1/mesos.proto 1a701da65f653fe4191f92ff1fb1436809b50acb 
>   src/CMakeLists.txt a574d449dc26b820cbef7ff0b5e94b42b6fe86cf 
>   src/Makefile.am 188a47017221a931d8b965a4af5e033b77e6ce4e 
> 
> 
> Diff: https://reviews.apache.org/r/68017/diff/14/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 69694: Tester.

2019-01-15 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69694/#review212004
---



Bad review!

Reviews applied: []

Error:
No reviewers specified. Please find a reviewer by asking on JIRA or the mailing 
list.

- Mesos Reviewbot


On Jan. 10, 2019, 3:52 a.m., Till Toenshoff wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69694/
> ---
> 
> (Updated Jan. 10, 2019, 3:52 a.m.)
> 
> 
> Review request for mesos.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Tester.
> 
> 
> Diffs
> -
> 
>   CMakeLists.txt f2885faa25f1d718d0f451aa2199e3f7692317a3 
>   configure.ac 6778f119570def1838e26cddf7b0192bfe6e37d4 
> 
> 
> Diff: https://reviews.apache.org/r/69694/diff/4/
> 
> 
> Testing
> ---
> 
> Dont review - just a test!
> 
> 
> Thanks,
> 
> Till Toenshoff
> 
>



Re: Review Request 69719: Exposed subscriptions and disconnections RP manager metrics.

2019-01-15 Thread Benjamin Bannier


> On Jan. 15, 2019, 12:55 a.m., Chun-Hung Hsiao wrote:
> > src/resource_provider/manager.cpp
> > Lines 823-832 (original), 827-836 (patched)
> > 
> >
> > How about moving this snippt to the end of this function, so the agent 
> > would receive `SUBSCRIBE` and then `DISCONNECT` no matter if the RP has a 
> > connection problem after it subscribes? This would also make the metric 
> > reveal the connection problem.

This is supposed to be a counter for successful subscriptions, see below.


> On Jan. 15, 2019, 12:55 a.m., Chun-Hung Hsiao wrote:
> > src/resource_provider/manager.cpp
> > Lines 1004-1005 (patched)
> > 
> >
> > How about the following, to make it consistent with, e.g., 
> > `master/messages_register_framework`, and also make it less confusing w/ 
> > `resource_provider_manager/subscribed`?
> > ```
> > messages_subscribe("resource_provider_manager/messages_subscribe"),
> > messages_disconnect("resource_provider_manager/messages_disconnect")
> > ```

These are not really counters for messages as disconnections have no relation 
to messages the RP manager receives, and the subscriptions counter is supposed 
to count successfull connections (not just all subscription attempts (this 
allows correlating subscription and disconnection counters). What do you think 
about naming them e.g., 
`resource_provider_manager/events/[subscribe|disconnect]`? We could introduce a 
separate counter for subscription messages later.


> On Jan. 15, 2019, 12:55 a.m., Chun-Hung Hsiao wrote:
> > src/tests/resource_provider_manager_tests.cpp
> > Lines 1463-1468 (patched)
> > 
> >
> > We usually do the following instead:
> > ```
> > ASSERT_NE(0u, snapshot.values.count(...));
> > ```
> > Or
> > ```
> > ASSERT_EQ(1u, snapshot.values.count(...));
> > ```
> > 
> > But I have to admit that `ASSERT_TRUE` is more readable. So it's up to 
> > you to decide if you want to keep the consistency or not ;)

Changed to `ASSERT_EQ`. While I also like above form with `ASSERT_TRUE` Mesos 
style in general seems to be to avoid implicit conversions to `bool`.


- Benjamin


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69719/#review211971
---


On Jan. 15, 2019, 10:27 a.m., Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69719/
> ---
> 
> (Updated Jan. 15, 2019, 10:27 a.m.)
> 
> 
> Review request for mesos, Chun-Hung Hsiao and James DeFelice.
> 
> 
> Bugs: MESOS-9223
> https://issues.apache.org/jira/browse/MESOS-9223
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch adds monotonically increasing counters for subscriptions and
> disconnections of resource providers with the resource provider manager
> (`resource_provider_manager/subscriptions` and
> `resource_provider_manager/disconnections`, respectively). While the
> existing gauge `resource_provider_manager/subscribed` exposed the
> current state, these added counters can be used to monitor resource
> provider state for unexpected events.
> 
> 
> Diffs
> -
> 
>   src/resource_provider/manager.cpp 65852c629393f32fd582bfcff86d7ce14e5386ac 
>   src/tests/resource_provider_manager_tests.cpp 
> 455ce7d2c71f2815430b69a5475b2ccc343cd9af 
> 
> 
> Diff: https://reviews.apache.org/r/69719/diff/2/
> 
> 
> Testing
> ---
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>



Re: Review Request 69719: Exposed subscriptions and disconnections RP manager metrics.

2019-01-15 Thread Benjamin Bannier

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69719/
---

(Updated Jan. 15, 2019, 10:27 a.m.)


Review request for mesos, Chun-Hung Hsiao and James DeFelice.


Changes
---

Addressed some comments from Chun.


Bugs: MESOS-9223
https://issues.apache.org/jira/browse/MESOS-9223


Repository: mesos


Description
---

This patch adds monotonically increasing counters for subscriptions and
disconnections of resource providers with the resource provider manager
(`resource_provider_manager/subscriptions` and
`resource_provider_manager/disconnections`, respectively). While the
existing gauge `resource_provider_manager/subscribed` exposed the
current state, these added counters can be used to monitor resource
provider state for unexpected events.


Diffs (updated)
-

  src/resource_provider/manager.cpp 65852c629393f32fd582bfcff86d7ce14e5386ac 
  src/tests/resource_provider_manager_tests.cpp 
455ce7d2c71f2815430b69a5475b2ccc343cd9af 


Diff: https://reviews.apache.org/r/69719/diff/2/

Changes: https://reviews.apache.org/r/69719/diff/1-2/


Testing
---

`make check`


Thanks,

Benjamin Bannier



Re: Review Request 69606: Made starting and stopping on CSI plugin containers more verbose.

2019-01-15 Thread Benjamin Bannier

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69606/
---

(Updated Jan. 15, 2019, 9:43 a.m.)


Review request for mesos and Chun-Hung Hsiao.


Bugs: MESOS-9223
https://issues.apache.org/jira/browse/MESOS-9223


Repository: mesos


Description
---

This patch adds some additional logging so it becomes easier to follow
the lifecycle of CSI plugins. The container daemon already logged some
related information, but didn't directly call out that it was working
with CSI plugin containers.


Diffs (updated)
-

  src/resource_provider/storage/provider.cpp 
a0641665710e08f6f3b4b3ece7bcc5c6d4ef 


Diff: https://reviews.apache.org/r/69606/diff/3/

Changes: https://reviews.apache.org/r/69606/diff/2-3/


Testing
---

`sudo make check`


Thanks,

Benjamin Bannier



Re: Review Request 68017: Added Seccomp-related protobuf messages.

2019-01-15 Thread Qian Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68017/#review211942
---


Fix it, then Ship it!





include/mesos/seccomp/seccomp.proto
Lines 22-23 (patched)


Here we just need one blank line rather than two.



include/mesos/seccomp/seccomp.proto
Lines 79 (patched)


So the value must be numeric? Do we support other types (like string value)?


- Qian Zhang


On Nov. 8, 2018, 11:24 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68017/
> ---
> 
> (Updated Nov. 8, 2018, 11:24 p.m.)
> 
> 
> Review request for mesos, Gilbert Song, Jie Yu, James Peach, and Qian Zhang.
> 
> 
> Bugs: MESOS-9033
> https://issues.apache.org/jira/browse/MESOS-9033
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> See summary.
> 
> 
> Diffs
> -
> 
>   include/mesos/mesos.proto 2ef6ba3aef67cf34227569948fd3ddc8dfd5879d 
>   include/mesos/seccomp/seccomp.hpp PRE-CREATION 
>   include/mesos/seccomp/seccomp.proto PRE-CREATION 
>   include/mesos/slave/containerizer.proto 
> 5b4dcdda0f55ea3355c78d1447c7be9ca54d9dc9 
>   include/mesos/v1/mesos.proto 1a701da65f653fe4191f92ff1fb1436809b50acb 
>   src/CMakeLists.txt a574d449dc26b820cbef7ff0b5e94b42b6fe86cf 
>   src/Makefile.am 188a47017221a931d8b965a4af5e033b77e6ce4e 
> 
> 
> Diff: https://reviews.apache.org/r/68017/diff/14/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 68017: Added Seccomp-related protobuf messages.

2019-01-15 Thread Qian Zhang


> On Jan. 15, 2019, 10:39 a.m., Gilbert Song wrote:
> > include/mesos/mesos.proto
> > Lines 3158-3159 (patched)
> > 
> >
> > Seems like this was added recently.
> > 
> > Is this field only used when there is a default agent wide seccomp 
> > profile provided from the agent flag? and we reply on it to give an 
> > opportunity for user/framework to get rid of seccomp?
> > 
> > (probably more comment needed)
> > 
> > If it is the case, do we have other options for naming? e.g., 
> > `no_seccomp` (maybe more explicit)
> 
> Gilbert Song wrote:
> after second thought, I would suggest to remove this field for now. since 
> there is not use case yet (I understand your motivation: you want users could 
> get rid of seccomp if there is a default one). We could add this field later 
> if necessary. For now:
> 
> two options:
> 1. remove it
> 2. do it implicitly by using the optional profile_name: if seccompinfo 
> isSome but profile_name is None. do not set seccomp filtering for container
> 
> Gilbert Song wrote:
> I would prefer option #2.
> 
> The reason we want to avoid introducing `unconfined` now is that 
> framework could set both field at the same time and ideally we may need an 
> `enum type`. given the use case is not clear yet (people may not necessary to 
> use it yet). lets make it implicit for now.

> do it implicitly by using the optional profile_name: if seccompinfo isSome 
> but profile_name is None. do not set seccomp filtering for container

So the semantic will be:
1. seccompinfo is None: The default seccomp profile (if any) will be applied 
for the container.
2. seccompinfo is Some but profile_name is None: No seccomp filtering for the 
container.

This is kind of unintuitive to me because none-seccompinfo and 
none-profile_name will have totally different result.


- Qian


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68017/#review211986
---


On Nov. 8, 2018, 11:24 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68017/
> ---
> 
> (Updated Nov. 8, 2018, 11:24 p.m.)
> 
> 
> Review request for mesos, Gilbert Song, Jie Yu, James Peach, and Qian Zhang.
> 
> 
> Bugs: MESOS-9033
> https://issues.apache.org/jira/browse/MESOS-9033
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> See summary.
> 
> 
> Diffs
> -
> 
>   include/mesos/mesos.proto 2ef6ba3aef67cf34227569948fd3ddc8dfd5879d 
>   include/mesos/seccomp/seccomp.hpp PRE-CREATION 
>   include/mesos/seccomp/seccomp.proto PRE-CREATION 
>   include/mesos/slave/containerizer.proto 
> 5b4dcdda0f55ea3355c78d1447c7be9ca54d9dc9 
>   include/mesos/v1/mesos.proto 1a701da65f653fe4191f92ff1fb1436809b50acb 
>   src/CMakeLists.txt a574d449dc26b820cbef7ff0b5e94b42b6fe86cf 
>   src/Makefile.am 188a47017221a931d8b965a4af5e033b77e6ce4e 
> 
> 
> Diff: https://reviews.apache.org/r/68017/diff/14/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 69687: Fixed flakiness of resource provider ContainerTerminationMetric test.

2019-01-15 Thread Benjamin Bannier

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69687/
---

(Updated Jan. 15, 2019, 9:31 a.m.)


Review request for mesos and Chun-Hung Hsiao.


Changes
---

Extended comment as suggested by chhsia0.


Bugs: MESOS-9130
https://issues.apache.org/jira/browse/MESOS-9130


Repository: mesos


Description
---

Plugin restarts can currently fail if the resource provider is killed
before it has performed minimal reconciliation steps. This patch adds
additional synchronization to ensure the plugin container can safely be
killed to safely perform the desired test.


Diffs (updated)
-

  src/tests/storage_local_resource_provider_tests.cpp 
e8ed20f818ed7f1a3ce15758ea3c366520443377 


Diff: https://reviews.apache.org/r/69687/diff/2/

Changes: https://reviews.apache.org/r/69687/diff/1-2/


Testing
---

* `make check`
* ran test under stress for ~1000 iterations w/o issues


Thanks,

Benjamin Bannier