[GitHub] [mesos] brat002 opened a new pull request #348: Add wargaming.net url to Powered By Mesos Page

2020-01-09 Thread GitBox
brat002 opened a new pull request #348: Add wargaming.net url to Powered By 
Mesos Page
URL: https://github.com/apache/mesos/pull/348
 
 
   Wargaming uses Mesos clusters at least 4 years. I think, we should add it to 
the Page.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


Re: Review Request 71867: WIP: Support for docker containerizer.

2020-01-09 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71867/#review219203
---



Bad patch!

Reviews applied: [71947, 71814, 71815, 71816, 71833, 71836, 71853, 71854, 71867]

Failed command: ['bash', '-c', "set -o pipefail; export OS='ubuntu:14.04' 
BUILDTOOL='autotools' COMPILER='gcc' CONFIGURATION='--verbose 
--disable-libtool-wrappers --disable-parallel-test-execution' 
ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; ./support/docker-build.sh 2>&1 | tee 
build_71867"]

Error:
..
ed, _Alloc>&)
 operator!=(const unordered_multimap<_Key, _Tp, _Hash, _Pred, _Alloc>& __x,
 ^
/usr/include/c++/4.8/bits/unordered_map.h:1406:5: note:   template argument 
deduction/substitution failed:
In file included from 
../3rdparty/googletest-release-1.8.0/googlemock/include/gmock/internal/gmock-internal-utils.h:47:0,
 from 
../3rdparty/googletest-release-1.8.0/googlemock/include/gmock/gmock-actions.h:46,
 from 
../3rdparty/googletest-release-1.8.0/googlemock/include/gmock/gmock.h:58,
 from 
../../src/tests/containerizer/docker_containerizer_tests.cpp:17:
../3rdparty/googletest-release-1.8.0/googletest/include/gtest/gtest.h:1522:28: 
note:   'const std::basic_string' is not derived from 'const 
std::unordered_multimap<_Key, _Tp, _Hash, _Pred, _Alloc>'
 GTEST_IMPL_CMP_HELPER_(NE, !=);
^
../3rdparty/googletest-release-1.8.0/googletest/include/gtest/gtest.h:1510:12: 
note: in definition of macro 'GTEST_IMPL_CMP_HELPER_'
   if (val1 op val2) {\
^
In file included from /usr/include/c++/4.8/unordered_map:48:0,
 from 
../3rdparty/protobuf-3.5.0/src/google/protobuf/stubs/hash.h:146,
 from ../3rdparty/protobuf-3.5.0/src/google/protobuf/map.h:49,
 from 
../3rdparty/protobuf-3.5.0/src/google/protobuf/generated_message_table_driven.h:34,
 from ../include/mesos/mesos.pb.h:25,
 from ../../include/mesos/mesos.hpp:20,
 from ../../include/mesos/slave/container_logger.hpp:24,
 from 
../../src/tests/containerizer/docker_containerizer_tests.cpp:21:
/usr/include/c++/4.8/bits/unordered_map.h:1394:5: note: template bool std::operator!=(const 
std::unordered_map<_Key, _Tp, _Hash, _Pred, _Alloc>&, const 
std::unordered_map<_Key, _Tp, _Hash, _Pred, _Alloc>&)
 operator!=(const unordered_map<_Key, _Tp, _Hash, _Pred, _Alloc>& __x,
 ^
/usr/include/c++/4.8/bits/unordered_map.h:1394:5: note:   template argument 
deduction/substitution failed:
In file included from 
../3rdparty/googletest-release-1.8.0/googlemock/include/gmock/internal/gmock-internal-utils.h:47:0,
 from 
../3rdparty/googletest-release-1.8.0/googlemock/include/gmock/gmock-actions.h:46,
 from 
../3rdparty/googletest-release-1.8.0/googlemock/include/gmock/gmock.h:58,
 from 
../../src/tests/containerizer/docker_containerizer_tests.cpp:17:
../3rdparty/googletest-release-1.8.0/googletest/include/gtest/gtest.h:1522:28: 
note:   'const std::basic_string' is not derived from 'const 
std::unordered_map<_Key, _Tp, _Hash, _Pred, _Alloc>'
 GTEST_IMPL_CMP_HELPER_(NE, !=);
^
../3rdparty/googletest-release-1.8.0/googletest/include/gtest/gtest.h:1510:12: 
note: in definition of macro 'GTEST_IMPL_CMP_HELPER_'
   if (val1 op val2) {\
^
In file included from /usr/include/c++/4.8/iterator:66:0,
 from 
../3rdparty/googletest-release-1.8.0/googletest/include/gtest/internal/gtest-param-util.h:39,
 from 
../3rdparty/googletest-release-1.8.0/googletest/include/gtest/gtest-param-test.h:192,
 from 
../3rdparty/googletest-release-1.8.0/googletest/include/gtest/gtest.h:62,
 from 
../3rdparty/googletest-release-1.8.0/googlemock/include/gmock/internal/gmock-internal-utils.h:47,
 from 
../3rdparty/googletest-release-1.8.0/googlemock/include/gmock/gmock-actions.h:46,
 from 
../3rdparty/googletest-release-1.8.0/googlemock/include/gmock/gmock.h:58,
 from 
../../src/tests/containerizer/docker_containerizer_tests.cpp:17:
/usr/include/c++/4.8/bits/stream_iterator.h:137:5: note: template bool std::operator!=(const 
std::istream_iterator<_Tp, _CharT, _Traits, _Dist>&, const 
std::istream_iterator<_Tp, _CharT, _Traits, _Dist>&)
 operator!=(const istream_iterator<_Tp, _CharT, _Traits, _Dist>& __x,
 ^
/usr/include/c++/4.8/bits/stream_iterator.h:137:5: note:   template argument 
deduction/substitution failed:
In file included from 
../3rdparty/googletest-release-1.8.0/googlemock/include/gmock/internal/gmock-internal-utils.h:47:0,
 from 
../3rdparty/googletest-release-1.8.0/googlemock/include/gmock/gmock-actions.h:46,
 from 
../3rdparty/google

Re: Review Request 71944: Set container process's OOM score adjust.

2020-01-09 Thread Qian Zhang


> On Jan. 8, 2020, 7:07 a.m., Greg Mann wrote:
> > src/slave/containerizer/mesos/isolators/cgroups/subsystems/memory.cpp
> > Lines 199 (patched)
> > 
> >
> > Do we really want to do this? My concern is that this will make any 
> > non-Mesos-task processes on the node (networking and security components, 
> > for example) more likely to be OOM-killed than Mesos tasks. Perhaps we 
> > should only set the OOM score adjustment for burstable tasks. What do you 
> > think?
> 
> Qian Zhang wrote:
> I think it depends on which one is in higher priority and more important, 
> guaranteed task or non-Mesos-task processes? In Kubernetes implementation 
> (https://github.com/kubernetes/kubernetes/blob/v1.16.2/pkg/kubelet/qos/policy.go#L51:L53),
>  the OOM score adjust of guaranteed container is set to -998, and kubelet's 
> OOM score adjust is set to -998 too, I think we should do the same to protect 
> guaranteed containers and Mesos agent, what do you think?
> 
> Greg Mann wrote:
> One significant difference in the Kubernetes case is that they have 
> user-space code which kills pod processes to reclaim memory when necessary. 
> Consequently, there will be less impact if the OOM killer shows a strong 
> preference against killing guaranteed tasks.
> 
> My intuition is that we should not set the OOM score adjustment for 
> non-bursting processes. Even if we leave it at zero, guaranteed tasks will 
> still be treated preferentially with respect to bursting tasks, since all 
> bursting tasks will have an adjustment greater than zero.
> 
> Qian Zhang wrote:
> I agree that guaranteed tasks will be treated preferentially with respect 
> to bursting tasks, but I am thinking about the guaranteed tasks v.s. the 
> non-Mesos-tasks, let's say two guaranteed tasks running on a node, and each 
> of them's memory request/limit is half of the node's memory, and both of them 
> has almost used all of its memory request/limit, so their OOM score will be 
> very high (like 490+). Now if a non-mesos-task (e.g., a system component or 
> even Mesos agent itself) tries to use a lot of memory suddenly, the node will 
> be short of memory, and then OOM killer will definitely kill one of the two 
> guaranteed tasks since their OOM score are the top 2 in the node. But I do 
> not think K8s will have this issue since the guaranteed containers OOM score 
> adjust is -998.
> 
> Qian Zhang wrote:
> And even we think about the case guaranteed tasks v.s. burstable tasks, I 
> think it is also a bit risky if we leave guaranteed task's OOM score adjust 
> to 0. For example, one guaranteed task (T1) and one burstable task (T2) 
> running on a node, each of them's memory request is half of the node's 
> memory. T1 has almost used all of its memory request/limit, so its OOM score 
> will be something like 490+. T2 uses very little memory, so its OOM score 
> will be a bit beyond 500 (like 510). I think in this case the OOM scores of 
> T1 and T2 are too close, the actual OOM score is calculated in a more complex 
> way, so I am afraid there will be a moment that the OOM score of T1 is even 
> higher than T2, that's why I think it is a bit risky.

Just want to add one point, it seems a small amount (30) will be subtracted 
from the OOM score of root-owned processes, so in the above example, if T2 is 
owned by root but T1 is owned by a normal user, it might be possible T2 get a 
smaller OOM score that T1.


- Qian


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71944/#review219158
---


On Jan. 8, 2020, 11:28 p.m., Qian Zhang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71944/
> ---
> 
> (Updated Jan. 8, 2020, 11:28 p.m.)
> 
> 
> Review request for mesos, Andrei Budnik and Greg Mann.
> 
> 
> Bugs: MESOS-10048
> https://issues.apache.org/jira/browse/MESOS-10048
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Set container process's OOM score adjust.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/isolators/cgroups/subsystems/memory.hpp 
> 27d88e91fb784179effd54781f84000fe85c13eb 
>   src/slave/containerizer/mesos/isolators/cgroups/subsystems/memory.cpp 
> 0896d37761a11f55ba4b866d235c3bd2b79dcfba 
> 
> 
> Diff: https://reviews.apache.org/r/71944/diff/3/
> 
> 
> Testing
> ---
> 
> sudo make check
> 
> 
> Thanks,
> 
> Qian Zhang
> 
>



Re: Review Request 71944: Set container process's OOM score adjust.

2020-01-09 Thread Qian Zhang


> On Jan. 8, 2020, 7:07 a.m., Greg Mann wrote:
> > src/slave/containerizer/mesos/isolators/cgroups/subsystems/memory.cpp
> > Lines 199 (patched)
> > 
> >
> > Do we really want to do this? My concern is that this will make any 
> > non-Mesos-task processes on the node (networking and security components, 
> > for example) more likely to be OOM-killed than Mesos tasks. Perhaps we 
> > should only set the OOM score adjustment for burstable tasks. What do you 
> > think?
> 
> Qian Zhang wrote:
> I think it depends on which one is in higher priority and more important, 
> guaranteed task or non-Mesos-task processes? In Kubernetes implementation 
> (https://github.com/kubernetes/kubernetes/blob/v1.16.2/pkg/kubelet/qos/policy.go#L51:L53),
>  the OOM score adjust of guaranteed container is set to -998, and kubelet's 
> OOM score adjust is set to -998 too, I think we should do the same to protect 
> guaranteed containers and Mesos agent, what do you think?
> 
> Greg Mann wrote:
> One significant difference in the Kubernetes case is that they have 
> user-space code which kills pod processes to reclaim memory when necessary. 
> Consequently, there will be less impact if the OOM killer shows a strong 
> preference against killing guaranteed tasks.
> 
> My intuition is that we should not set the OOM score adjustment for 
> non-bursting processes. Even if we leave it at zero, guaranteed tasks will 
> still be treated preferentially with respect to bursting tasks, since all 
> bursting tasks will have an adjustment greater than zero.
> 
> Qian Zhang wrote:
> I agree that guaranteed tasks will be treated preferentially with respect 
> to bursting tasks, but I am thinking about the guaranteed tasks v.s. the 
> non-Mesos-tasks, let's say two guaranteed tasks running on a node, and each 
> of them's memory request/limit is half of the node's memory, and both of them 
> has almost used all of its memory request/limit, so their OOM score will be 
> very high (like 490+). Now if a non-mesos-task (e.g., a system component or 
> even Mesos agent itself) tries to use a lot of memory suddenly, the node will 
> be short of memory, and then OOM killer will definitely kill one of the two 
> guaranteed tasks since their OOM score are the top 2 in the node. But I do 
> not think K8s will have this issue since the guaranteed containers OOM score 
> adjust is -998.

And even we think about the case guaranteed tasks v.s. burstable tasks, I think 
it is also a bit risky if we leave guaranteed task's OOM score adjust to 0. For 
example, one guaranteed task (T1) and one burstable task (T2) running on a 
node, each of them's memory request is half of the node's memory. T1 has almost 
used all of its memory request/limit, so its OOM score will be something like 
490+. T2 uses very little memory, so its OOM score will be a bit beyond 500 
(like 510). I think in this case the OOM scores of T1 and T2 are too close, the 
actual OOM score is calculated in a more complex way, so I am afraid there will 
be a moment that the OOM score of T1 is even higher than T2, that's why I think 
it is a bit risky.


- Qian


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71944/#review219158
---


On Jan. 8, 2020, 11:28 p.m., Qian Zhang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71944/
> ---
> 
> (Updated Jan. 8, 2020, 11:28 p.m.)
> 
> 
> Review request for mesos, Andrei Budnik and Greg Mann.
> 
> 
> Bugs: MESOS-10048
> https://issues.apache.org/jira/browse/MESOS-10048
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Set container process's OOM score adjust.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/isolators/cgroups/subsystems/memory.hpp 
> 27d88e91fb784179effd54781f84000fe85c13eb 
>   src/slave/containerizer/mesos/isolators/cgroups/subsystems/memory.cpp 
> 0896d37761a11f55ba4b866d235c3bd2b79dcfba 
> 
> 
> Diff: https://reviews.apache.org/r/71944/diff/3/
> 
> 
> Testing
> ---
> 
> sudo make check
> 
> 
> Thanks,
> 
> Qian Zhang
> 
>



Re: Review Request 71935: Added a new task status reason.

2020-01-09 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71935/#review219199
---



Patch looks great!

Reviews applied: [71935]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose --disable-libtool-wrappers 
--disable-parallel-test-execution' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker-build.sh

- Mesos Reviewbot


On Jan. 10, 2020, 12:23 a.m., Greg Mann wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71935/
> ---
> 
> (Updated Jan. 10, 2020, 12:23 a.m.)
> 
> 
> Review request for mesos, Andrei Budnik and Qian Zhang.
> 
> 
> Bugs: MESOS-10049
> https://issues.apache.org/jira/browse/MESOS-10049
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The new reason will be sent to frameworks when one of their tasks is
> OOM-killed on an agent while the task is exceeding its memory request.
> 
> 
> Diffs
> -
> 
>   include/mesos/mesos.proto b0f590545df720df644d049e9d8f1e81bdbe814c 
>   include/mesos/v1/mesos.proto 53a7b9bc6190d826868a1633c11c9a0ecf9acf0a 
> 
> 
> Diff: https://reviews.apache.org/r/71935/diff/2/
> 
> 
> Testing
> ---
> 
> `make`
> 
> 
> Thanks,
> 
> Greg Mann
> 
>



Re: Review Request 71944: Set container process's OOM score adjust.

2020-01-09 Thread Qian Zhang


> On Jan. 8, 2020, 7:07 a.m., Greg Mann wrote:
> > src/slave/containerizer/mesos/isolators/cgroups/subsystems/memory.cpp
> > Lines 199 (patched)
> > 
> >
> > Do we really want to do this? My concern is that this will make any 
> > non-Mesos-task processes on the node (networking and security components, 
> > for example) more likely to be OOM-killed than Mesos tasks. Perhaps we 
> > should only set the OOM score adjustment for burstable tasks. What do you 
> > think?
> 
> Qian Zhang wrote:
> I think it depends on which one is in higher priority and more important, 
> guaranteed task or non-Mesos-task processes? In Kubernetes implementation 
> (https://github.com/kubernetes/kubernetes/blob/v1.16.2/pkg/kubelet/qos/policy.go#L51:L53),
>  the OOM score adjust of guaranteed container is set to -998, and kubelet's 
> OOM score adjust is set to -998 too, I think we should do the same to protect 
> guaranteed containers and Mesos agent, what do you think?
> 
> Greg Mann wrote:
> One significant difference in the Kubernetes case is that they have 
> user-space code which kills pod processes to reclaim memory when necessary. 
> Consequently, there will be less impact if the OOM killer shows a strong 
> preference against killing guaranteed tasks.
> 
> My intuition is that we should not set the OOM score adjustment for 
> non-bursting processes. Even if we leave it at zero, guaranteed tasks will 
> still be treated preferentially with respect to bursting tasks, since all 
> bursting tasks will have an adjustment greater than zero.

I agree that guaranteed tasks will be treated preferentially with respect to 
bursting tasks, but I am thinking about the guaranteed tasks v.s. the 
non-Mesos-tasks, let's say two guaranteed tasks running on a node, and each of 
them's memory request/limit is half of the node's memory, and both of them has 
almost used all of its memory request/limit, so their OOM score will be very 
high (like 490+). Now if a non-mesos-task (e.g., a system component or even 
Mesos agent itself) tries to use a lot of memory suddenly, the node will be 
short of memory, and then OOM killer will definitely kill one of the two 
guaranteed tasks since their OOM score are the top 2 in the node. But I do not 
think K8s will have this issue since the guaranteed containers OOM score adjust 
is -998.


- Qian


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71944/#review219158
---


On Jan. 8, 2020, 11:28 p.m., Qian Zhang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71944/
> ---
> 
> (Updated Jan. 8, 2020, 11:28 p.m.)
> 
> 
> Review request for mesos, Andrei Budnik and Greg Mann.
> 
> 
> Bugs: MESOS-10048
> https://issues.apache.org/jira/browse/MESOS-10048
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Set container process's OOM score adjust.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/isolators/cgroups/subsystems/memory.hpp 
> 27d88e91fb784179effd54781f84000fe85c13eb 
>   src/slave/containerizer/mesos/isolators/cgroups/subsystems/memory.cpp 
> 0896d37761a11f55ba4b866d235c3bd2b79dcfba 
> 
> 
> Diff: https://reviews.apache.org/r/71944/diff/3/
> 
> 
> Testing
> ---
> 
> sudo make check
> 
> 
> Thanks,
> 
> Qian Zhang
> 
>



Review Request 71977: Added systemd support to domain socket agent flag.

2020-01-09 Thread Benno Evers

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71977/
---

Review request for mesos and Benjamin Bannier.


Repository: mesos


Description
---

Added the ability to specify a unix domain socket
as `systemd:` for the `--domain_socket_location`
agent flag.

This will instruct the agent to expect the domain socket
being passed by systemd with the specified name.


Diffs
-

  src/slave/flags.cpp 50b09cf3b0cf89568a45afd3fe89fa16c6a79222 
  src/slave/main.cpp fd58637cd680291e6794bcdb0655603bb97744c7 


Diff: https://reviews.apache.org/r/71977/diff/1/


Testing
---


Thanks,

Benno Evers



Review Request 71976: Added support for systemd socket activation API.

2020-01-09 Thread Benno Evers

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71976/
---

Review request for mesos, Benjamin Bannier and Benjamin Mahler.


Repository: mesos


Description
---

Added support for the systemd socket activation api,
that allows systemd to pass listening file descriptors
to a given service.


Diffs
-

  src/linux/systemd.hpp f760e2c02683f0169ba5d3b0a3da53f66a7f3d91 
  src/linux/systemd.cpp 8126b31a542b62ce51cc6e0f6372986bffcc948b 


Diff: https://reviews.apache.org/r/71976/diff/1/


Testing
---


Thanks,

Benno Evers



Re: Review Request 71833: Created unix domain socket on agent startup.

2020-01-09 Thread Benno Evers

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71833/
---

(Updated Jan. 10, 2020, 1:46 a.m.)


Review request for mesos and Benjamin Bannier.


Changes
---

Addressed review comments


Bugs: MESOS-10036
https://issues.apache.org/jira/browse/MESOS-10036


Repository: mesos


Description
---

Added agent code to check whether domain socket support is
enabled, and if so to create or open the socket at
MESOS_DOMAIN_SOCKET_LOCATION.


Diffs (updated)
-

  src/Makefile.am 47ad1bd27e155f0193aafa956df0bd43baafd348 
  src/common/domain_sockets.hpp PRE-CREATION 
  src/local/local.cpp 8ff43618f06463b4ba86b64b25e5de692e406448 
  src/slave/main.cpp fd58637cd680291e6794bcdb0655603bb97744c7 
  src/slave/slave.hpp 77b5bc0082c6bb73fbd48a2ebe812629921645cb 
  src/slave/slave.cpp 3839a120446339fea8aa857f431a2dba28ed4002 
  src/tests/cluster.cpp f7bc882a644ec65710ada3d15507e1d4c5ba06f7 
  src/tests/mock_slave.cpp 71be957884ea88258ef37e60649e3947e89b12d0 


Diff: https://reviews.apache.org/r/71833/diff/5/

Changes: https://reviews.apache.org/r/71833/diff/4-5/


Testing
---


Thanks,

Benno Evers



Re: Review Request 71816: Added domain socket-related flags to Mesos agent.

2020-01-09 Thread Benno Evers

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71816/
---

(Updated Jan. 10, 2020, 1:43 a.m.)


Review request for mesos and Benjamin Bannier.


Changes
---

Various bugfixes.


Summary (updated)
-

Added domain socket-related flags to Mesos agent.


Bugs: MESOS-10035
https://issues.apache.org/jira/browse/MESOS-10035


Repository: mesos


Description
---

Added two new flags 'http_executor_domain_sockets' and
'domain_socket_location' to the agent, along with minimal
logic to pass the relevant values on to executors.

They are used to control whether the agent should
provide a domain socket to HTTP executors and the
location of the socket on the host filesystem, respectively.


Diffs (updated)
-

  src/slave/constants.hpp 721afe13ff590122878a8ead62c66f89e8549e91 
  src/slave/flags.hpp 3c5ffca62454c340656cfc89c53a198757815794 
  src/slave/flags.cpp 50b09cf3b0cf89568a45afd3fe89fa16c6a79222 
  src/slave/main.cpp fd58637cd680291e6794bcdb0655603bb97744c7 
  src/slave/slave.cpp 3839a120446339fea8aa857f431a2dba28ed4002 


Diff: https://reviews.apache.org/r/71816/diff/3/

Changes: https://reviews.apache.org/r/71816/diff/2-3/


Testing
---


Thanks,

Benno Evers



Re: Review Request 71815: Made the default executors connect via domain sockets if available.

2020-01-09 Thread Benno Evers

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71815/
---

(Updated Jan. 10, 2020, 1:42 a.m.)


Review request for mesos and Benjamin Bannier.


Changes
---

Used new `relative_path()` function from stout.


Bugs: MESOS-10039
https://issues.apache.org/jira/browse/MESOS-10039


Repository: mesos


Description
---

Added support for the `MESOS_DOMAIN_SOCKET` environment variable
to the default executor. If this variable is present, the executor
will use the specified domain socket to connect to the agent,
as opposed to the IP address encoded in the agent PID.


Diffs (updated)
-

  src/executor/executor.cpp dfa58206b65a6827ad4b958879c19a093a672e80 
  src/launcher/default_executor.cpp 521494a2770192e39b5159624f057e01f2690c8a 
  src/tests/command_executor_tests.cpp c7f7711a7b085bdacd0f16b16a6c85cc47e2045c 
  src/tests/default_executor_tests.cpp 49c4e3b37bde848dc7a4fd02a1458c650a84a16f 


Diff: https://reviews.apache.org/r/71815/diff/4/

Changes: https://reviews.apache.org/r/71815/diff/3-4/


Testing
---

Ran the new test.


Thanks,

Benno Evers



Re: Review Request 71947: Handled embedded null bytes in abstract domain socket names.

2020-01-09 Thread Benno Evers


> On Jan. 6, 2020, 12:59 p.m., Benjamin Bannier wrote:
> > 3rdparty/libprocess/include/process/address.hpp
> > Lines 238 (patched)
> > 
> >
> > As discussed offline, let's add a `CHECK` here that `sun_path` is not 
> > empty.

As it turns out, I was mistaken in our offline discussion: a single null byte 
is a perfectly valid name for an abstract domain socket.


> On Jan. 6, 2020, 12:59 p.m., Benjamin Bannier wrote:
> > 3rdparty/libprocess/include/process/address.hpp
> > Line 232 (original), 247 (patched)
> > 
> >
> > For unnamed sockets we would now return an empty string here which 
> > seems leaky. What do you think about changing the return type of 
> > `Address::path` to e.g., `Option` instead and returning a `None` in 
> > that case?

I've thought about it and an empty string actually uniquely identifies unnamed 
sockets. Pathname sockets always must have at least one character, and the 
difference between an empty string and an abstract domain socket whose name is 
a single null byte would be that the former has size 0, and the latter has size 
1.

So I think as long as we document this correspondence, this should be fine.


- Benno


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71947/#review219127
---


On Jan. 10, 2020, 1:37 a.m., Benno Evers wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71947/
> ---
> 
> (Updated Jan. 10, 2020, 1:37 a.m.)
> 
> 
> Review request for mesos, Benjamin Bannier and Benjamin Mahler.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Address handling code for unix domain sockets assumed that
> strlen() could be used to compute the name of a unix domain
> socket, but that fails for unnamed sockets or in the case
> where an abstract domain socket contains embedded null bytes.
> 
> This patch adds a new `length` parameter to correctly handle
> these special cases.
> 
> 
> Diffs
> -
> 
>   3rdparty/libprocess/include/process/address.hpp 
> 749498056b52b916dfaf6c85f83ecc05e0d5406c 
>   3rdparty/libprocess/include/process/network.hpp 
> 8f48a4a78557309a9b1b00d7defb45eed454b077 
> 
> 
> Diff: https://reviews.apache.org/r/71947/diff/2/
> 
> 
> Testing
> ---
> 
> Ran existing unit tests and verified that the newly added `CHECK()` doesn't 
> trigger.
> 
> 
> Thanks,
> 
> Benno Evers
> 
>



Re: Review Request 71947: Handled embedded null bytes in abstract domain socket names.

2020-01-09 Thread Benno Evers

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71947/
---

(Updated Jan. 10, 2020, 1:37 a.m.)


Review request for mesos, Benjamin Bannier and Benjamin Mahler.


Changes
---

Cleaned up implementation to always return the correct length for unix 
addresses.


Repository: mesos


Description
---

Address handling code for unix domain sockets assumed that
strlen() could be used to compute the name of a unix domain
socket, but that fails for unnamed sockets or in the case
where an abstract domain socket contains embedded null bytes.

This patch adds a new `length` parameter to correctly handle
these special cases.


Diffs (updated)
-

  3rdparty/libprocess/include/process/address.hpp 
749498056b52b916dfaf6c85f83ecc05e0d5406c 
  3rdparty/libprocess/include/process/network.hpp 
8f48a4a78557309a9b1b00d7defb45eed454b077 


Diff: https://reviews.apache.org/r/71947/diff/2/

Changes: https://reviews.apache.org/r/71947/diff/1-2/


Testing
---

Ran existing unit tests and verified that the newly added `CHECK()` doesn't 
trigger.


Thanks,

Benno Evers



Re: Review Request 71935: Added a new task status reason.

2020-01-09 Thread Greg Mann

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71935/
---

(Updated Jan. 10, 2020, 12:23 a.m.)


Review request for mesos, Andrei Budnik and Qian Zhang.


Bugs: MESOS-10049
https://issues.apache.org/jira/browse/MESOS-10049


Repository: mesos


Description
---

The new reason will be sent to frameworks when one of their tasks is
OOM-killed on an agent while the task is exceeding its memory request.


Diffs (updated)
-

  include/mesos/mesos.proto b0f590545df720df644d049e9d8f1e81bdbe814c 
  include/mesos/v1/mesos.proto 53a7b9bc6190d826868a1633c11c9a0ecf9acf0a 


Diff: https://reviews.apache.org/r/71935/diff/2/

Changes: https://reviews.apache.org/r/71935/diff/1-2/


Testing
---

`make`


Thanks,

Greg Mann



Re: Review Request 71966: Cgroups isolator: added support for nested cgroups during recovery.

2020-01-09 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71966/#review219196
---



Patch looks great!

Reviews applied: [71855, 71931, 71856, 71858, 71884, 71885, 71886, 71943, 
71944, 71950, 71951, 71952, 71953, 71955, 71956, 71965, 71966]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose --disable-libtool-wrappers 
--disable-parallel-test-execution' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker-build.sh

- Mesos Reviewbot


On Jan. 7, 2020, 5:06 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71966/
> ---
> 
> (Updated Jan. 7, 2020, 5:06 p.m.)
> 
> 
> Review request for mesos, Greg Mann and Qian Zhang.
> 
> 
> Bugs: MESOS-10079
> https://issues.apache.org/jira/browse/MESOS-10079
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch enables recovery for nested cgroups and implements
> the detection of orphaned nested cgroups.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/isolators/cgroups/cgroups.hpp 
> 4bd3d6dad37dee031660c15e957cc36f63e21fcb 
>   src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp 
> b12b73d8e0161d448075378765e77867521de04e 
> 
> 
> Diff: https://reviews.apache.org/r/71966/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 71965: Cgroups isolator: added support for nested cgroups during launch.

2020-01-09 Thread Andrei Budnik

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71965/
---

(Updated Jan. 9, 2020, 9:03 p.m.)


Review request for mesos, Greg Mann and Qian Zhang.


Bugs: MESOS-10076
https://issues.apache.org/jira/browse/MESOS-10076


Repository: mesos


Description
---

This patch adds support for nested cgroups for nested containers.
Nested cgroups are created only for a nested container with explicitly
enabled `share_cgroups` flag. The cgroups isolator stores info about
nested cgroups in the `infos` class variable, which is used to
determine whether a nested container has its nested cgroup.


Diffs
-

  src/slave/containerizer/mesos/isolators/cgroups/cgroups.hpp 
4bd3d6dad37dee031660c15e957cc36f63e21fcb 
  src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp 
b12b73d8e0161d448075378765e77867521de04e 


Diff: https://reviews.apache.org/r/71965/diff/1/


Testing
---


Thanks,

Andrei Budnik



Re: Review Request 71978: Updated all GET_* v1 master calls to be served in parallel.

2020-01-09 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71978/#review219194
---



Patch looks great!

Reviews applied: [71906, 71907, 71908, 71978]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose --disable-libtool-wrappers 
--disable-parallel-test-execution' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker-build.sh

- Mesos Reviewbot


On Jan. 9, 2020, 5:48 p.m., Benjamin Mahler wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71978/
> ---
> 
> (Updated Jan. 9, 2020, 5:48 p.m.)
> 
> 
> Review request for mesos, Andrei Sekretenko and Greg Mann.
> 
> 
> Bugs: MESOS-9497
> https://issues.apache.org/jira/browse/MESOS-9497
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Using the same approach taken for the v0 read-only endpoints,
> this enables parallel reads for the GET_* v1 master calls.
> 
> 
> Diffs
> -
> 
>   src/master/http.cpp 16dbed3aad7aeac7f45eb44ffc98cb9df6546aff 
>   src/master/master.hpp c9d417c23981a3908b419ce70cdf9a0096e37c17 
>   src/master/readonly_handler.cpp 500ce3f6a0f3afc3193586020fa5c253e4c6801c 
>   src/tests/master_load_tests.cpp ac16fe744ec2a8d4b4af005b91a5560f0b2f3ef3 
> 
> 
> Diff: https://reviews.apache.org/r/71978/diff/1/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Benjamin Mahler
> 
>



Re: Review Request 71944: Set container process's OOM score adjust.

2020-01-09 Thread Greg Mann


> On Jan. 7, 2020, 11:07 p.m., Greg Mann wrote:
> > src/slave/containerizer/mesos/isolators/cgroups/subsystems/memory.cpp
> > Lines 199 (patched)
> > 
> >
> > Do we really want to do this? My concern is that this will make any 
> > non-Mesos-task processes on the node (networking and security components, 
> > for example) more likely to be OOM-killed than Mesos tasks. Perhaps we 
> > should only set the OOM score adjustment for burstable tasks. What do you 
> > think?
> 
> Qian Zhang wrote:
> I think it depends on which one is in higher priority and more important, 
> guaranteed task or non-Mesos-task processes? In Kubernetes implementation 
> (https://github.com/kubernetes/kubernetes/blob/v1.16.2/pkg/kubelet/qos/policy.go#L51:L53),
>  the OOM score adjust of guaranteed container is set to -998, and kubelet's 
> OOM score adjust is set to -998 too, I think we should do the same to protect 
> guaranteed containers and Mesos agent, what do you think?

One significant difference in the Kubernetes case is that they have user-space 
code which kills pod processes to reclaim memory when necessary. Consequently, 
there will be less impact if the OOM killer shows a strong preference against 
killing guaranteed tasks.

My intuition is that we should not set the OOM score adjustment for 
non-bursting processes. Even if we leave it at zero, guaranteed tasks will 
still be treated preferentially with respect to bursting tasks, since all 
bursting tasks will have an adjustment greater than zero.


- Greg


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71944/#review219158
---


On Jan. 8, 2020, 3:28 p.m., Qian Zhang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71944/
> ---
> 
> (Updated Jan. 8, 2020, 3:28 p.m.)
> 
> 
> Review request for mesos, Andrei Budnik and Greg Mann.
> 
> 
> Bugs: MESOS-10048
> https://issues.apache.org/jira/browse/MESOS-10048
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Set container process's OOM score adjust.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/isolators/cgroups/subsystems/memory.hpp 
> 27d88e91fb784179effd54781f84000fe85c13eb 
>   src/slave/containerizer/mesos/isolators/cgroups/subsystems/memory.cpp 
> 0896d37761a11f55ba4b866d235c3bd2b79dcfba 
> 
> 
> Diff: https://reviews.apache.org/r/71944/diff/3/
> 
> 
> Testing
> ---
> 
> sudo make check
> 
> 
> Thanks,
> 
> Qian Zhang
> 
>



Review Request 71978: Updated all GET_* v1 master calls to be served in parallel.

2020-01-09 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71978/
---

Review request for mesos, Andrei Sekretenko and Greg Mann.


Bugs: MESOS-9497
https://issues.apache.org/jira/browse/MESOS-9497


Repository: mesos


Description
---

Using the same approach taken for the v0 read-only endpoints,
this enables parallel reads for the GET_* v1 master calls.


Diffs
-

  src/master/http.cpp 16dbed3aad7aeac7f45eb44ffc98cb9df6546aff 
  src/master/master.hpp c9d417c23981a3908b419ce70cdf9a0096e37c17 
  src/master/readonly_handler.cpp 500ce3f6a0f3afc3193586020fa5c253e4c6801c 
  src/tests/master_load_tests.cpp ac16fe744ec2a8d4b4af005b91a5560f0b2f3ef3 


Diff: https://reviews.apache.org/r/71978/diff/1/


Testing
---

make check


Thanks,

Benjamin Mahler