[jira] [Commented] (MESOS-7130) port_mapping isolator: executor hangs when running on EC2

2017-10-02 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16189252#comment-16189252
 ] 

Jie Yu commented on MESOS-7130:
---

[~bgreen], [~pierrecdn], can you guys test this patch:
https://reviews.apache.org/r/62743/

Let me know whether that fixes the issue.

> port_mapping isolator: executor hangs when running on EC2
> -
>
> Key: MESOS-7130
> URL: https://issues.apache.org/jira/browse/MESOS-7130
> Project: Mesos
>  Issue Type: Bug
>  Components: executor
>Reporter: Pierre Cheynier
>Assignee: Jie Yu
>
> Hi,
> I'm experiencing a weird issue: I'm using a CI to do testing on 
> infrastructure automation.
> I recently activated the {{network/port_mapping}} isolator.
> I'm able to make the changes work and pass the test for bare-metal servers 
> and virtualbox VMs using this configuration.
> But when I try on EC2 (on which my CI pipeline relies) it systematically fails 
> to run any container.
> It appears that the sandbox is created and the port_mapping isolator seems to 
> be OK according to the logs in stdout and stderr and the {{tc}} output:
> {noformat}
> + mount --make-rslave /run/netns
> + test -f /proc/sys/net/ipv6/conf/all/disable_ipv6
> + echo 1
> + ip link set lo address 02:44:20:bb:42:cf mtu 9001 up
> + ethtool -K eth0 rx off
> (...)
> + tc filter show dev eth0 parent :0
> + tc filter show dev lo parent :0
> I0215 16:01:13.941375 1 exec.cpp:161] Version: 1.0.2
> {noformat}
> Then the executor never reaches the REGISTERED state and hangs indefinitely.
> {{GLOG_v=3}} doesn't help here.
> My skills in this area are limited, but after loading the symbols and 
> attaching gdb to the mesos-executor process, I was able to print this stack:
> {noformat}
> #0  0x7feffc1386d5 in pthread_cond_wait@@GLIBC_2.3.2 () from 
> /usr/lib64/libpthread.so.0
> #1  0x7feffbed69ec in 
> std::condition_variable::wait(std::unique_lock<std::mutex>&) () from 
> /usr/lib64/libstdc++.so.6
> #2  0x7ff0003dd8ec in void synchronized_wait<std::condition_variable, 
> std::mutex>(std::condition_variable*, std::mutex*) () from 
> /usr/lib64/libmesos-1.0.2.so
> #3  0x7ff0017d595d in Gate::arrive(long) () from 
> /usr/lib64/libmesos-1.0.2.so
> #4  0x7ff0017c00ed in process::ProcessManager::wait(process::UPID const&) 
> () from /usr/lib64/libmesos-1.0.2.so
> #5  0x7ff0017c5c05 in process::wait(process::UPID const&, Duration 
> const&) () from /usr/lib64/libmesos-1.0.2.so
> #6  0x004ab26f in process::wait(process::ProcessBase const*, Duration 
> const&) ()
> #7  0x004a3903 in main ()
> {noformat}
> I concluded that the underlying shell script launched by the isolator, or the 
> task itself, is simply blocked, but I don't understand why.
> Here is a process tree showing that no task is running, although the executor 
> is:
> {noformat}
> root 28420  0.8  3.0 1061420 124940 ?  Ssl  17:56   0:25 
> /usr/sbin/mesos-slave --advertise_ip=127.0.0.1 
> --attributes=platform:centos;platform_major_version:7;type:base 
> --cgroups_enable_cfs --cgroups_hierarchy=/sys/fs/cgroup 
> --cgroups_net_cls_primary_handle=0xC370 
> --container_logger=org_apache_mesos_LogrotateContainerLogger 
> --containerizers=mesos,docker 
> --credential=file:///etc/mesos-chef/slave-credential 
> --default_container_info={"type":"MESOS","volumes":[{"host_path":"tmp","container_path":"/tmp","mode":"RW"}]}
>  --default_role=default --docker_registry=/usr/share/mesos/users 
> --docker_store_dir=/var/opt/mesos/store/docker 
> --egress_unique_flow_per_container --enforce_container_disk_quota 
> --ephemeral_ports_per_container=128 
> --executor_environment_variables={"PATH":"/bin:/usr/bin:/usr/sbin","CRITEO_DC":"par","CRITEO_ENV":"prod"}
>  --image_providers=docker --image_provisioner_backend=copy 
> --isolation=cgroups/cpu,cgroups/mem,cgroups/net_cls,namespaces/pid,disk/du,filesystem/shared,filesystem/linux,docker/runtime,network/cni,network/port_mapping
>  --logging_level=INFO 
> --master=zk://mesos:test@localhost.localdomain:2181/mesos 
> --modules=file:///etc/mesos-chef/slave-modules.json --port=5051 
> --recover=reconnect 
> --resources=ports:[31000-32000];ephemeral_ports:[32768-57344] --strict 
> --work_dir=/var/opt/mesos
> root 28484  0.0  2.3 433676 95016 ?Ssl  17:56   0:00  \_ 
> mesos-logrotate-logger --help=false 
> --log_filename=/var/opt/mesos/slaves/cdf94219-87b2-4af2-9f61-5697f0442915-S0/frameworks/366e8ed2-730e-4423-9324-086704d182b0-/executors/group_simplehttp.16f7c2ee-f3a8-11e6-be1c-0242b44d071f/runs/1d3e6b1c-cda8-47e5-92c4-a161429a7ac6/stdout
>  --logrotate_options=rotate 5 --logrotate_path=logrotate --max_size=10MB
> root 28485  0.0  2.3 499212 94724 ?Ssl  17:56   0:00  \_ 
> mesos-logrotate-logger --help=false 
> 

[jira] [Commented] (MESOS-7828) Current approach to parse protobuf enum from JSON does not support upgrades

2017-10-02 Thread Qian Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16189162#comment-16189162
 ] 

Qian Zhang commented on MESOS-7828:
---

commit 2f3ceb45106e79586f2c32bfd26db0318d608075
Author: Qian Zhang zhq527...@gmail.com
Date:   Thu Jul 27 16:15:44 2017 +0800

Added a test `ProtobufTest.ParseJSONUnrecognizedEnum`.

Review: https://reviews.apache.org/r/61174

3rdparty/stout/tests/protobuf_tests.cpp   | 33 +
 3rdparty/stout/tests/protobuf_tests.proto |  9 +
 2 files changed, 42 insertions(+)

commit b10a4ea59231d134662d49417add2ccd7779cde7
Author: Qian Zhang 
Date:   Tue Jul 25 23:03:43 2017 +0800

Fixed JSON protobuf deserialization to ignore unrecognized enum values.

Protobuf deserialization will discard any unrecognized enum values.
This patch fixes our custom JSON -> protobuf conversion code to be
consistent with this behavior.

See MESOS-4997 for why this matters when dealing with upgrades.

Fixes MESOS-7828.

Review: https://reviews.apache.org/r/61109

 3rdparty/stout/include/stout/protobuf.hpp | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

> Current approach to parse protobuf enum from JSON does not support upgrades
> ---
>
> Key: MESOS-7828
> URL: https://issues.apache.org/jira/browse/MESOS-7828
> Project: Mesos
>  Issue Type: Bug
>Reporter: Qian Zhang
>Assignee: Qian Zhang
>
> To use protobuf enums in a backwards-compatible way, [the suggestion on the 
> protobuf mailing 
> list|https://groups.google.com/forum/#!msg/protobuf/NhUjBfDyGmY/pf294zMi2bIJ] 
> is to use optional enum fields and include an UNKNOWN value as the first 
> entry in the enum list (and/or explicitly specify it as the default). This 
> handles the case of parsing a protobuf message from a serialized string, but 
> it cannot handle the case of parsing a protobuf message from JSON.
> E.g., when I access the master endpoint with a nonexistent enum value 
> {{xxx}}, I get an error:
> {code}
> $ curl -X POST -H "Content-Type: application/json" -d '{"type": "xxx"}' 
> 127.0.0.1:5050/api/v1
> Failed to convert JSON into Call protobuf: Failed to find enum for 'xxx'% 
> {code}
> In the {{Call}} protobuf message, the enum {{Type}} already has a default 
> value {{UNKNOWN}} (see 
> [here|https://github.com/apache/mesos/blob/1.3.0/include/mesos/v1/master/master.proto#L45]
>  for details) and the field {{Call.type}} is optional, but the above curl 
> command still fails. The root cause is in the code 
> [here|https://github.com/apache/mesos/blob/1.3.0/3rdparty/stout/include/stout/protobuf.hpp#L449:L454]:
>  when we try to get the enum value for the string "xxx", the lookup fails 
> because no enum value corresponds to "xxx".
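
For illustration, the difference between the strict lookup described above and the lenient behavior named in the fix ("ignore unrecognized enum values") can be sketched in Python. This is not the stout code: `CallType`, `parse_type_strict`, and `parse_type_lenient` are hypothetical names, and the enum entries are placeholders chosen for the example.

```python
import enum
import json


class CallType(enum.Enum):
    # Hypothetical stand-in for a protobuf enum whose first entry is UNKNOWN.
    UNKNOWN = 0
    GET_HEALTH = 1
    GET_FLAGS = 2


def parse_type_strict(payload: str) -> CallType:
    """Reject an unrecognized enum name, as in the failure reported above."""
    name = json.loads(payload)["type"]
    try:
        return CallType[name]
    except KeyError:
        raise ValueError(f"Failed to find enum for '{name}'") from None


def parse_type_lenient(payload: str) -> CallType:
    """Treat an unrecognized enum name as if the optional field were unset."""
    name = json.loads(payload).get("type")
    # An unset optional field reads back as the enum's default value.
    return CallType.__members__.get(name, CallType.UNKNOWN)


if __name__ == "__main__":
    print(parse_type_lenient('{"type": "GET_HEALTH"}'))
    print(parse_type_lenient('{"type": "xxx"}'))
```

With the lenient variant, a message carrying an enum name added by a newer version is still accepted by an older reader, which then sees the default value instead of an error.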



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (MESOS-6705) Port `fetcher_tests.cpp`

2017-10-02 Thread Joseph Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16189075#comment-16189075
 ] 

Joseph Wu commented on MESOS-6705:
--

Some additional tests that will likely be impacted by this port:
{code}
commit 772c8f554fe19f8c121c57bee97679cde2646fb8
Author: Gaston Kleiman 
Date:   Mon Oct 2 16:27:32 2017 -0700

Windows: Disable some new tests that fetch local URIs.

Mesos currently conflates paths and URIs.  On platforms where the
path separator '\' is not equal to the URI separator '/', tests
that rely on `path` helpers for URIs will naturally fail.

This disables some new tests that fail on Windows for this reason.

The tests here should eventually be fixed along with MESOS-6705.

Review: https://reviews.apache.org/r/62735/
{code}
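
The mismatch the commit message describes can be reproduced with Python's standard path helpers, used here only as a stand-in for the Mesos `path` helpers. The host name, file names, and function names are made up for the example.

```python
import ntpath
import posixpath
from pathlib import PureWindowsPath


def join_uri_with_path_helper(base_uri: str, name: str, windows: bool) -> str:
    """Join a URI and a file name with an OS path helper (the conflation)."""
    helper = ntpath if windows else posixpath
    return helper.join(base_uri, name)


def local_path_to_file_uri(windows_path: str) -> str:
    """Convert an absolute Windows path to a file URI with '/' separators."""
    return PureWindowsPath(windows_path).as_uri()


if __name__ == "__main__":
    # On POSIX the path separator happens to equal the URI separator.
    print(join_uri_with_path_helper("http://host/dir", "file.txt", windows=False))
    # With Windows rules the same helper inserts '\', which is not a URI separator.
    print(join_uri_with_path_helper("http://host/dir", "file.txt", windows=True))
    # Converting explicitly keeps paths and URIs apart.
    print(local_path_to_file_uri(r"C:\tmp\artifact.txt"))
```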

> Port `fetcher_tests.cpp`
> 
>
> Key: MESOS-6705
> URL: https://issues.apache.org/jira/browse/MESOS-6705
> Project: Mesos
>  Issue Type: Bug
>  Components: agent
>Reporter: Alex Clemmer
>Assignee: Jeff Coffler
>  Labels: microsoft, windows-mvp
>






[jira] [Updated] (MESOS-8050) Mesos HTTP/HTTPS health checks for IPv6 docker containers.

2017-10-02 Thread Avinash Sridharan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Avinash Sridharan updated MESOS-8050:
-
Target Version/s: 1.5.0

> Mesos HTTP/HTTPS health checks for IPv6 docker containers.
> --
>
> Key: MESOS-8050
> URL: https://issues.apache.org/jira/browse/MESOS-8050
> Project: Mesos
>  Issue Type: Task
>Reporter: Avinash Sridharan
>Assignee: Vinod Kone
>
> Currently the Mesos HTTP/HTTPS health checks hardcode the IP address to 
> 127.0.0.1 while performing the pings on the containers. With IPv6 containers 
> on dual-stack kernels, the container will have both the IPv4 and IPv6 
> loopback interfaces (127.0.0.1 and ::1). Further, it is up to the 
> application's discretion to open either an INET or an INET6 socket, which 
> implies that to support IPv6 containers the Mesos HTTP/HTTPS health checks 
> need to be configurable to perform health checks on 127.0.0.1 or ::1. 
> A proposal here would be to introduce the concept of a transport on which 
> Mesos HTTP/HTTPS health checks work. That is, the framework specifies whether 
> Mesos HTTP health checks work over TCP or TCP6. 
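
A minimal sketch of the proposed transport selection, written in Python for brevity. `Transport`, `health_check_url`, the ports, and the path are hypothetical; the issue only proposes that a framework choose between TCP and TCP6, and says nothing about how the choice would be expressed.

```python
import enum
import socket


class Transport(enum.Enum):
    # Hypothetical names mirroring the TCP / TCP6 choice in the proposal.
    TCP = "TCP"
    TCP6 = "TCP6"


# Loopback address and socket family for each transport.
_LOOPBACK = {
    Transport.TCP: (socket.AF_INET, "127.0.0.1"),
    Transport.TCP6: (socket.AF_INET6, "::1"),
}


def health_check_url(scheme: str, transport: Transport, port: int,
                     path: str = "/") -> str:
    """Build the loopback URL a health checker would probe."""
    family, address = _LOOPBACK[transport]
    # IPv6 literals must be bracketed inside a URL authority.
    host = f"[{address}]" if family == socket.AF_INET6 else address
    return f"{scheme}://{host}:{port}{path}"


if __name__ == "__main__":
    print(health_check_url("http", Transport.TCP, 8080, "/health"))
    print(health_check_url("https", Transport.TCP6, 8443, "/health"))
```

Keeping the address family next to the address makes it straightforward to open a matching INET or INET6 socket for the probe.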





[jira] [Created] (MESOS-8050) Mesos HTTP/HTTPS health checks for IPv6 docker containers.

2017-10-02 Thread Avinash Sridharan (JIRA)
Avinash Sridharan created MESOS-8050:


 Summary: Mesos HTTP/HTTPS health checks for IPv6 docker containers.
 Key: MESOS-8050
 URL: https://issues.apache.org/jira/browse/MESOS-8050
 Project: Mesos
  Issue Type: Task
Reporter: Avinash Sridharan
Assignee: Vinod Kone


Currently the Mesos HTTP/HTTPS health checks hardcode the IP address to 
127.0.0.1 while performing the pings on the containers. With IPv6 containers 
on dual-stack kernels, the container will have both the IPv4 and IPv6 loopback 
interfaces (127.0.0.1 and ::1). Further, it is up to the application's 
discretion to open either an INET or an INET6 socket, which implies that to 
support IPv6 containers the Mesos HTTP/HTTPS health checks need to be 
configurable to perform health checks on 127.0.0.1 or ::1. 

A proposal here would be to introduce the concept of a transport on which Mesos 
HTTP/HTTPS health checks work. That is, the framework specifies whether Mesos 
HTTP health checks work over TCP or TCP6. 





[jira] [Commented] (MESOS-7916) Improve the test coverage of the DefaultExecutor.

2017-10-02 Thread JIRA

[ 
https://issues.apache.org/jira/browse/MESOS-7916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16188915#comment-16188915
 ] 

Gastón Kleiman commented on MESOS-7916:
---

{code}
commit 6eefc685ccf304d0fb8ed4ff9bc314197d77f078
Author: Gaston Kleiman 
Date:   Fri Sep 29 12:14:44 2017 -0700

Added a test using Docker, a file URI, and the DefaultExecutor.

This test verifies that URIs set on Docker tasks are fetched and made
available to them when started by the DefaultExecutor.

Review: https://reviews.apache.org/r/62632/
{code}

{code}
commit f429f4070935ad006f57e460e1db53ae65801809
Author: Gaston Kleiman 
Date:   Fri Sep 29 12:14:39 2017 -0700

Added a test using a file URI and the DefaultExecutor.

This test verifies that URIs set on tasks are fetched and made available
to them when started by the DefaultExecutor.

Review: https://reviews.apache.org/r/62168/
{code}

> Improve the test coverage of the DefaultExecutor.
> -
>
> Key: MESOS-7916
> URL: https://issues.apache.org/jira/browse/MESOS-7916
> Project: Mesos
>  Issue Type: Improvement
>  Components: executor
>Reporter: Gastón Kleiman
>Assignee: Gastón Kleiman
>  Labels: mesosphere
>
> We should write tests for the {{DefaultExecutor}} to cover the following 
> common scenarios:
> # -Start a task that uses a GPU, and make sure that it is made available to 
> the task.-
> # -Launch a Docker task with a health check.-
> # -Launch two tasks and verify that they can access a volume owned by the 
> Executor via {{sandbox_path}} volumes.-
> # -Launch two tasks, each one in its own task group, and verify that they can 
> access a volume owned by the Executor via {{sandbox_path}} volumes.-
> # -Launch a task that uses an env secret, make sure that it is accessible.-
> # -Launch a task using a URI and make sure that the artifact is accessible.-
> # -Launch a task using a Docker image + URIs, make sure that the fetched 
> artifact is accessible.-
> # Launch one task and ensure that (health) checks can read from a persistent 
> volume.
> # -Ensure that the executor's env is NOT inherited by the nested tasks.-





[jira] [Comment Edited] (MESOS-7916) Improve the test coverage of the DefaultExecutor.

2017-10-02 Thread JIRA

[ 
https://issues.apache.org/jira/browse/MESOS-7916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16157402#comment-16157402
 ] 

Gastón Kleiman edited comment on MESOS-7916 at 10/2/17 9:53 PM:


I have a patch with the URI tests, but they're flaky because of MESOS-7927.


was (Author: gkleiman):
I have a patch with the URI tests, but they're flaky because of MESPS-7927.

> Improve the test coverage of the DefaultExecutor.
> -
>
> Key: MESOS-7916
> URL: https://issues.apache.org/jira/browse/MESOS-7916
> Project: Mesos
>  Issue Type: Improvement
>  Components: executor
>Reporter: Gastón Kleiman
>Assignee: Gastón Kleiman
>  Labels: mesosphere
>
> We should write tests for the {{DefaultExecutor}} to cover the following 
> common scenarios:
> # -Start a task that uses a GPU, and make sure that it is made available to 
> the task.-
> # -Launch a Docker task with a health check.-
> # -Launch two tasks and verify that they can access a volume owned by the 
> Executor via {{sandbox_path}} volumes.-
> # -Launch two tasks, each one in its own task group, and verify that they can 
> access a volume owned by the Executor via {{sandbox_path}} volumes.-
> # -Launch a task that uses an env secret, make sure that it is accessible.-
> # -Launch a task using a URI and make sure that the artifact is accessible.-
> # -Launch a task using a Docker image + URIs, make sure that the fetched 
> artifact is accessible.-
> # Launch one task and ensure that (health) checks can read from a persistent 
> volume.
> # -Ensure that the executor's env is NOT inherited by the nested tasks.-





[jira] [Updated] (MESOS-7916) Improve the test coverage of the DefaultExecutor.

2017-10-02 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/MESOS-7916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gastón Kleiman updated MESOS-7916:
--
Description: 
We should write tests for the {{DefaultExecutor}} to cover the following common 
scenarios:

# -Start a task that uses a GPU, and make sure that it is made available to the 
task.-
# -Launch a Docker task with a health check.-
# -Launch two tasks and verify that they can access a volume owned by the 
Executor via {{sandbox_path}} volumes.-
# -Launch two tasks, each one in its own task group, and verify that they can 
access a volume owned by the Executor via {{sandbox_path}} volumes.-
# -Launch a task that uses an env secret, make sure that it is accessible.-
# -Launch a task using a URI and make sure that the artifact is accessible.-
# -Launch a task using a Docker image + URIs, make sure that the fetched 
artifact is accessible.-
# Launch one task and ensure that (health) checks can read from a persistent 
volume.
# -Ensure that the executor's env is NOT inherited by the nested tasks.-

  was:
We should write tests for the {{DefaultExecutor}} to cover the following common 
scenarios:

# -Start a task that uses a GPU, and make sure that it is made available to the 
task.-
# -Launch a Docker task with a health check.-
# -Launch two tasks and verify that they can access a volume owned by the 
Executor via {{sandbox_path}} volumes.-
# -Launch two tasks, each one in its own task group, and verify that they can 
access a volume owned by the Executor via {{sandbox_path}} volumes.-
# -Launch a task that uses an env secret, make sure that it is accessible.-
# Launch a task using a URI and make sure that the artifact is accessible.
# Launch a task using a Docker image + URIs, make sure that the fetched 
artifact is accessible.
# Launch one task and ensure that (health) checks can read from a persistent 
volume.
# Ensure that the executor's env is NOT inherited by the nested tasks.


> Improve the test coverage of the DefaultExecutor.
> -
>
> Key: MESOS-7916
> URL: https://issues.apache.org/jira/browse/MESOS-7916
> Project: Mesos
>  Issue Type: Improvement
>  Components: executor
>Reporter: Gastón Kleiman
>Assignee: Gastón Kleiman
>  Labels: mesosphere
>
> We should write tests for the {{DefaultExecutor}} to cover the following 
> common scenarios:
> # -Start a task that uses a GPU, and make sure that it is made available to 
> the task.-
> # -Launch a Docker task with a health check.-
> # -Launch two tasks and verify that they can access a volume owned by the 
> Executor via {{sandbox_path}} volumes.-
> # -Launch two tasks, each one in its own task group, and verify that they can 
> access a volume owned by the Executor via {{sandbox_path}} volumes.-
> # -Launch a task that uses an env secret, make sure that it is accessible.-
> # -Launch a task using a URI and make sure that the artifact is accessible.-
> # -Launch a task using a Docker image + URIs, make sure that the fetched 
> artifact is accessible.-
> # Launch one task and ensure that (health) checks can read from a persistent 
> volume.
> # -Ensure that the executor's env is NOT inherited by the nested tasks.-





[jira] [Commented] (MESOS-3107) Define CMake style guide

2017-10-02 Thread Joseph Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16188891#comment-16188891
 ] 

Joseph Wu commented on MESOS-3107:
--

{code}
commit 3f15dedb221a6882fd01b172f584527c496d4e1e
Author: Andrew Schwartzmeyer 
Date:   Mon Oct 2 14:12:30 2017 -0700

CMake: Used `TRUE|FALSE` instead of `ON|OFF` consistently.

While these are equivalent values to CMake, we should be consistent.
Also made uses of `option` consistent, and moved `ENABLE_JAVA` to the
3rdparty section.

Review: https://reviews.apache.org/r/62730/
{code}

> Define CMake style guide
> 
>
> Key: MESOS-3107
> URL: https://issues.apache.org/jira/browse/MESOS-3107
> Project: Mesos
>  Issue Type: Task
>  Components: cmake
>Reporter: Alex Clemmer
>Assignee: Andrew Schwartzmeyer
>  Labels: build, cmake
>
> The short story is that it is important to be principled about how the CMake 
> build system is maintained, because the CMake language makes it difficult 
> to statically verify that a configuration is correct. It is not unique in 
> this regard (make is arguably even worse), but it is something that's 
> important to make sure we get right.
> The longer story is, CMake's language is dynamically scoped and often has 
> somewhat odd defaults for variable values (_e.g._, IIRC, target names passed 
> to ExternalProject_Add default to "PREFIX" instead of erroring out). This 
> means that it is rare to get a configuration-time error (_i.e._, CMake 
> usually doesn't say something like "hey this variable isn't defined"), and in 
> large projects, this can make it very difficult to know where definitions 
> come from, or whether it's important that one config routine runs before 
> another. Dynamic scoping also makes it particularly easy to write spaghetti 
> code, which is clearly undesirable for something as important as a build 
> system.
> Thus, it is particularly important that we lay down our expectations for how 
> the CMake system is to be structured. This might include:
> * Function naming (_e.g._, making it easy to tell whether a function was 
> defined by us, and where it was defined; so we might say that we want our 
> functions to have an underscore to start, and start with the package they 
> come from, like libprocess, so that we know where to look for the definition.)
> * What assertions we want to check variable values against, so that we can 
> replace subtle errors (_e.g._, a library is accidentally named something 
> silly like "PREFIX.0.0.1") with obvious ones (_e.g._, "You have failed to 
> define your target name, so CMake has defaulted to 'PREFIX'; please check 
> your configuration routines")
> * Decisions of what goes where. (_e.g._, the most complex parts of the CMake 
> MVPs are in the configuration routines, like `MesosConfigure.cmake`; to curb 
> this, we should have strict rules about what goes in that file vs other 
> files, and how we know what is to be run before what. Part of this should 
> probably be prominent comments explaining the structure of the project, so 
> that people aren't confused!)
> * And so on.





[jira] [Commented] (MESOS-8033) Use more idiomatic CMake for compiler features

2017-10-02 Thread Joseph Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-8033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16188890#comment-16188890
 ] 

Joseph Wu commented on MESOS-8033:
--

{code}
commit bd5e874c22f0d16fc5494213d319065cf9107d0f
Author: Andrew Schwartzmeyer 
Date:   Mon Oct 2 14:33:52 2017 -0700

CMake: Removed `MESOS_CPPFLAGS` variable.

This was a magic variable that was used to add compiler definitions
globally. Instead, global definitions are now added explicitly with
`add_definitions`, and others with `target_compile_definitions`.

Review: https://reviews.apache.org/r/62731/
{code}

> Use more idiomatic CMake for compiler features
> --
>
> Key: MESOS-8033
> URL: https://issues.apache.org/jira/browse/MESOS-8033
> Project: Mesos
>  Issue Type: Improvement
>  Components: cmake
>Reporter: Andrew Schwartzmeyer
>Priority: Minor
>  Labels: cmake
>
> Specifically, we should replace
> {noformat}
>   string(APPEND CMAKE_CXX_FLAGS " -std=c++11")
> {noformat}
> With {{CMAKE_CXX_STANDARD}}, and use [compile feature 
> requirements|https://cmake.org/cmake/help/latest/manual/cmake-compile-features.7.html#compile-feature-requirements].
> And replace
> {noformat}
>   string(APPEND CMAKE_CXX_FLAGS " -Wformat-security")
> {noformat}
> With compile options instead of appending to {{CMAKE_CXX_FLAGS}}.





[jira] [Updated] (MESOS-7961) Display task health in the webui.

2017-10-02 Thread Benjamin Mahler (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Mahler updated MESOS-7961:
---
Shepherd: Benjamin Mahler

> Display task health in the webui.
> -
>
> Key: MESOS-7961
> URL: https://issues.apache.org/jira/browse/MESOS-7961
> Project: Mesos
>  Issue Type: Improvement
>  Components: webui
>Reporter: Benjamin Mahler
>Assignee: Tomasz Janiszewski
>
> Currently the webui does not display task health based on the latest status 
> update. Since this information is in the protobuf, it is within the webui's 
> scope to display health information.





[jira] [Commented] (MESOS-7961) Display task health in the webui.

2017-10-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16188784#comment-16188784
 ] 

ASF GitHub Bot commented on MESOS-7961:
---

Github user asfgit closed the pull request at:

https://github.com/apache/mesos/pull/233


> Display task health in the webui.
> -
>
> Key: MESOS-7961
> URL: https://issues.apache.org/jira/browse/MESOS-7961
> Project: Mesos
>  Issue Type: Improvement
>  Components: webui
>Reporter: Benjamin Mahler
>Assignee: Tomasz Janiszewski
>
> Currently the webui does not display task health based on the latest status 
> update. Since this information is in the protobuf, it is within the webui's 
> scope to display health information.





[jira] [Updated] (MESOS-3110) Harden the CMake system-dependency-locating routines

2017-10-02 Thread Joseph Wu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Wu updated MESOS-3110:
-
Fix Version/s: 1.5.0

> Harden the CMake system-dependency-locating routines
> 
>
> Key: MESOS-3110
> URL: https://issues.apache.org/jira/browse/MESOS-3110
> Project: Mesos
>  Issue Type: Task
>  Components: cmake
>Reporter: Alex Clemmer
>Assignee: John Kordich
>  Labels: build, cmake
> Fix For: 1.5.0
>
>
> Currently the Mesos project has two flavors of dependency: (1) the 
> dependencies we expect are already on the system (_e.g._, apr, libsvn), and 
> (2) the dependencies that are historically bundled with Mesos (_e.g._, glog).
> Dependency type (1) requires solid modules that will locate them on any 
> system: Linux, BSD, or Windows. This would come for free if we were using 
> CMake 3.0, but we're using CMake 2.8 so that Ubuntu users can install it out 
> of the box, instead of upgrading CMake first.
> This is additionally useful for dependency type (2), where we will expect to 
> have to use these routines when we support both the rebundled dependencies in 
> the `3rdparty/` folder, and system installations of those dependencies.





[jira] [Updated] (MESOS-3384) Include libsasl in Windows CMake build

2017-10-02 Thread Joseph Wu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Wu updated MESOS-3384:
-
Shepherd: Joseph Wu  (was: Joris Van Remoortere)

> Include libsasl in Windows CMake build
> --
>
> Key: MESOS-3384
> URL: https://issues.apache.org/jira/browse/MESOS-3384
> Project: Mesos
>  Issue Type: Task
>  Components: cmake
>Reporter: Alex Clemmer
>Assignee: John Kordich
>  Labels: build, cmake, mesosphere
> Fix For: 1.5.0
>
>
> Windows will probably require libsasl to work. This means we need to insert 
> the code to retrieve, build, and link against it for the Windows path, since 
> it isn't rebundled and distributed as part of Mesos.





[jira] [Commented] (MESOS-1280) Add replace task primitive

2017-10-02 Thread Yan Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-1280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16188488#comment-16188488
 ] 

Yan Xu commented on MESOS-1280:
---

Probably not all fields in the TaskInfo make equal sense to be updatable or 
justify the complexity. If possible we probably still prefer treating tasks as 
cattle and want to only give them pet treatment for certain important benefits. 

[~zhitao] could you elaborate on the use cases you were thinking about? I see 
that in 

 you mentioned in-place upgrades and launching zero-resource onto running 
executors, among others. 

I am asking because I recently started looking into something related to this. 
I may poll the user/dev thread later but am starting here first.

> Add replace task primitive
> --
>
> Key: MESOS-1280
> URL: https://issues.apache.org/jira/browse/MESOS-1280
> Project: Mesos
>  Issue Type: Bug
>  Components: agent, c++ api, master
>Reporter: Niklas Quarfot Nielsen
>  Labels: mesosphere
>
> Also along the lines of MESOS-938, replaceTask would be one of a couple of 
> primitives needed to support various task replacement and scaling scenarios. 
> This replaceTask() version is significantly simpler than the first proposed 
> one; its only responsibility is to run a new task info on a running task's 
> resources.
> The running task will be killed as usual, but the newly freed resources will 
> never be announced and the new task will run on them instead.





[jira] [Commented] (MESOS-3160) CgroupsAnyHierarchyMemoryPressureTest.ROOT_IncreaseRSS Flaky

2017-10-02 Thread Till Toenshoff (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16188114#comment-16188114
 ] 

Till Toenshoff commented on MESOS-3160:
---

Just saw it crashing on our internal CI (ubuntu 14.04):

{noformat}
00:39:21 [ RUN  ] CgroupsAnyHierarchyMemoryPressureTest.ROOT_IncreaseRSS
00:39:21 *** Aborted at 1506731961 (unix time) try "date -d @1506731961" if you 
are using GNU date ***
00:39:21 PC: @ 0x7fa16bc17b91 process::ProcessManager::resume()
00:39:21 *** SIGSEGV (@0x8) received by PID 31454 (TID 0x7fa15ea32700) from PID 
8; stack trace: ***
00:39:21 @ 0x7fa1367483fd (unknown)
00:39:21 @ 0x7fa13674d419 (unknown)
00:39:21 @ 0x7fa136741918 (unknown)
00:39:21 @ 0x7fa169011330 (unknown)
00:39:21 @ 0x7fa16bc17b91 process::ProcessManager::resume()
00:39:21 @ 0x7fa16bc1d6e6 
_ZNSt6thread5_ImplISt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvEUlvE_vEEE6_M_runEv
00:39:21 @ 0x7fa1697eca60 (unknown)
00:39:21 @ 0x7fa169009184 start_thread
00:39:21 @ 0x7fa168d35ffd (unknown)
{noformat}


> CgroupsAnyHierarchyMemoryPressureTest.ROOT_IncreaseRSS Flaky
> 
>
> Key: MESOS-3160
> URL: https://issues.apache.org/jira/browse/MESOS-3160
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.24.0, 0.26.0
>Reporter: Paul Brett
>  Labels: cgroups, mesosphere
>
> Test will occasionally fail with:
> [ RUN  ] CgroupsAnyHierarchyMemoryPressureTest.ROOT_IncreaseUnlockedRSS
> ../../src/tests/containerizer/cgroups_tests.cpp:1103: Failure
> helper.increaseRSS(getpagesize()): Failed to sync with the subprocess
> ../../src/tests/containerizer/cgroups_tests.cpp:1103: Failure
> helper.increaseRSS(getpagesize()): The subprocess has not been spawned yet
> [  FAILED  ] CgroupsAnyHierarchyMemoryPressureTest.ROOT_IncreaseUnlockedRSS 
> (223 ms)





[jira] [Updated] (MESOS-8049) MasterTest.RecoveredFramework is flaky and crashes.

2017-10-02 Thread Till Toenshoff (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Toenshoff updated MESOS-8049:
--
Description: 
Observed on internal CI:

{noformat}
00:35:26 [ RUN  ] MasterTest.RecoveredFramework
00:35:26 I0930 00:35:26.319862 27033 cluster.cpp:162] Creating default 'local' 
authorizer
00:35:26 I0930 00:35:26.321624 27053 master.cpp:445] Master 
94ab36ee-4c02-457d-ae35-2f130ae826f5 (ip-172-16-10-150) started on 
172.16.10.150:37345
00:35:26 I0930 00:35:26.321647 27053 master.cpp:447] Flags at startup: 
--acls="" --agent_ping_timeout="15secs" --agent_reregister_timeout="10mins" 
--allocation_interval="1secs" --allocator="HierarchicalDRF" 
--authenticate_agents="true" --authenticate_frameworks="true" 
--authenticate_http_frameworks="true" --authenticate_http_readonly="true" 
--authenticate_http_readwrite="true" --authenticators="crammd5" 
--authorizers="local" --credentials="/tmp/Z8B1GQ/credentials" 
--filter_gpu_resources="true" --framework_sorter="drf" --help="false" 
--hostname_lookup="true" --http_authenticators="basic" 
--http_framework_authenticators="basic" --initialize_driver_logging="true" 
--log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" 
--max_agent_ping_timeouts="5" --max_completed_frameworks="50" 
--max_completed_tasks_per_framework="1000" 
--max_unreachable_tasks_per_framework="1000" --port="5050" --quiet="false" 
--recovery_agent_removal_limit="100%" --registry="in_memory" 
--registry_fetch_timeout="1mins" --registry_gc_interval="15mins" 
--registry_max_agent_age="2weeks" --registry_max_agent_count="102400" 
--registry_store_timeout="100secs" --registry_strict="false" 
--root_submissions="true" --user_sorter="drf" --version="false" 
--webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/Z8B1GQ/master" 
--zk_session_timeout="10secs"
00:35:26 I0930 00:35:26.321758 27053 master.cpp:497] Master only allowing 
authenticated frameworks to register
00:35:26 I0930 00:35:26.321768 27053 master.cpp:511] Master only allowing 
authenticated agents to register
00:35:26 I0930 00:35:26.321772 27053 master.cpp:524] Master only allowing 
authenticated HTTP frameworks to register
00:35:26 I0930 00:35:26.321777 27053 credentials.hpp:37] Loading credentials 
for authentication from '/tmp/Z8B1GQ/credentials'
00:35:26 I0930 00:35:26.321853 27053 master.cpp:569] Using default 'crammd5' 
authenticator
00:35:26 I0930 00:35:26.321892 27053 http.cpp:1045] Creating default 'basic' 
HTTP authenticator for realm 'mesos-master-readonly'
00:35:26 I0930 00:35:26.321923 27053 http.cpp:1045] Creating default 'basic' 
HTTP authenticator for realm 'mesos-master-readwrite'
00:35:26 I0930 00:35:26.321946 27053 http.cpp:1045] Creating default 'basic' 
HTTP authenticator for realm 'mesos-master-scheduler'
00:35:26 I0930 00:35:26.321969 27053 master.cpp:649] Authorization enabled
00:35:26 I0930 00:35:26.322120 27048 hierarchical.cpp:171] Initialized 
hierarchical allocator process
00:35:26 I0930 00:35:26.322145 27048 whitelist_watcher.cpp:77] No whitelist 
given
00:35:26 I0930 00:35:26.322657 27053 master.cpp:2216] Elected as the leading 
master!
00:35:26 I0930 00:35:26.322679 27053 master.cpp:1705] Recovering from registrar
00:35:26 I0930 00:35:26.322721 27053 registrar.cpp:347] Recovering registrar
00:35:26 I0930 00:35:26.322829 27048 registrar.cpp:391] Successfully fetched 
the registry (0B) in 90368ns
00:35:26 I0930 00:35:26.322856 27048 registrar.cpp:495] Applied 1 operations in 
4113ns; attempting to update the registry
00:35:26 I0930 00:35:26.322960 27053 registrar.cpp:552] Successfully updated 
the registry in 89088ns
00:35:26 I0930 00:35:26.323011 27053 registrar.cpp:424] Successfully recovered 
registrar
00:35:26 I0930 00:35:26.323148 27054 master.cpp:1809] Recovered 0 agents from 
the registry (146B); allowing 10mins for agents to re-register
00:35:26 I0930 00:35:26.323161 27047 hierarchical.cpp:209] Skipping recovery of 
hierarchical allocator: nothing to recover
00:35:26 W0930 00:35:26.325556 27033 process.cpp:3194] Attempted to spawn 
already running process files@172.16.10.150:37345
00:35:26 I0930 00:35:26.325654 27033 cluster.cpp:448] Creating default 'local' 
authorizer
00:35:26 I0930 00:35:26.326050 27048 slave.cpp:254] Mesos agent started on 
(250)@172.16.10.150:37345
00:35:26 I0930 00:35:26.326066 27048 slave.cpp:255] Flags at startup: --acls="" 
--appc_simple_discovery_uri_prefix="http://" 
--appc_store_dir="/tmp/MasterTest_RecoveredFramework_6nFcY6/store/appc" 
--authenticate_http_readonly="true" --authenticate_http_readwrite="true" 
--authenticatee="crammd5" --authentication_backoff_factor="1secs" 
--authorizer="local" --cgroups_cpu_enable_pids_and_tids_count="false" 
--cgroups_enable_cfs="false" --cgroups_hierarchy="/sys/fs/cgroup" 
--cgroups_limit_swap="false" --cgroups_root="mesos" 
--container_disk_watch_interval="15secs" --containerizers="mesos" 

[jira] [Created] (MESOS-8049) MasterTest.RecoveredFramework is flaky and crashes.

2017-10-02 Thread Till Toenshoff (JIRA)
Till Toenshoff created MESOS-8049:
-

             Summary: MasterTest.RecoveredFramework is flaky and crashes.
                 Key: MESOS-8049
                 URL: https://issues.apache.org/jira/browse/MESOS-8049
             Project: Mesos
          Issue Type: Bug
    Affects Versions: 1.5.0
         Environment: ubuntu-17.04
            Reporter: Till Toenshoff


{noformat}
00:35:26 [ RUN  ] MasterTest.RecoveredFramework
00:35:26 I0930 00:35:26.319862 27033 cluster.cpp:162] Creating default 'local' 
authorizer
00:35:26 I0930 00:35:26.321624 27053 master.cpp:445] Master 
94ab36ee-4c02-457d-ae35-2f130ae826f5 (ip-172-16-10-150) started on 
172.16.10.150:37345
00:35:26 I0930 00:35:26.321647 27053 master.cpp:447] Flags at startup: 
--acls="" --agent_ping_timeout="15secs" --agent_reregister_timeout="10mins" 
--allocation_interval="1secs" --allocator="HierarchicalDRF" 
--authenticate_agents="true" --authenticate_frameworks="true" 
--authenticate_http_frameworks="true" --authenticate_http_readonly="true" 
--authenticate_http_readwrite="true" --authenticators="crammd5" 
--authorizers="local" --credentials="/tmp/Z8B1GQ/credentials" 
--filter_gpu_resources="true" --framework_sorter="drf" --help="false" 
--hostname_lookup="true" --http_authenticators="basic" 
--http_framework_authenticators="basic" --initialize_driver_logging="true" 
--log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" 
--max_agent_ping_timeouts="5" --max_completed_frameworks="50" 
--max_completed_tasks_per_framework="1000" 
--max_unreachable_tasks_per_framework="1000" --port="5050" --quiet="false" 
--recovery_agent_removal_limit="100%" --registry="in_memory" 
--registry_fetch_timeout="1mins" --registry_gc_interval="15mins" 
--registry_max_agent_age="2weeks" --registry_max_agent_count="102400" 
--registry_store_timeout="100secs" --registry_strict="false" 
--root_submissions="true" --user_sorter="drf" --version="false" 
--webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/Z8B1GQ/master" 
--zk_session_timeout="10secs"
00:35:26 I0930 00:35:26.321758 27053 master.cpp:497] Master only allowing 
authenticated frameworks to register
00:35:26 I0930 00:35:26.321768 27053 master.cpp:511] Master only allowing 
authenticated agents to register
00:35:26 I0930 00:35:26.321772 27053 master.cpp:524] Master only allowing 
authenticated HTTP frameworks to register
00:35:26 I0930 00:35:26.321777 27053 credentials.hpp:37] Loading credentials 
for authentication from '/tmp/Z8B1GQ/credentials'
00:35:26 I0930 00:35:26.321853 27053 master.cpp:569] Using default 'crammd5' 
authenticator
00:35:26 I0930 00:35:26.321892 27053 http.cpp:1045] Creating default 'basic' 
HTTP authenticator for realm 'mesos-master-readonly'
00:35:26 I0930 00:35:26.321923 27053 http.cpp:1045] Creating default 'basic' 
HTTP authenticator for realm 'mesos-master-readwrite'
00:35:26 I0930 00:35:26.321946 27053 http.cpp:1045] Creating default 'basic' 
HTTP authenticator for realm 'mesos-master-scheduler'
00:35:26 I0930 00:35:26.321969 27053 master.cpp:649] Authorization enabled
00:35:26 I0930 00:35:26.322120 27048 hierarchical.cpp:171] Initialized 
hierarchical allocator process
00:35:26 I0930 00:35:26.322145 27048 whitelist_watcher.cpp:77] No whitelist 
given
00:35:26 I0930 00:35:26.322657 27053 master.cpp:2216] Elected as the leading 
master!
00:35:26 I0930 00:35:26.322679 27053 master.cpp:1705] Recovering from registrar
00:35:26 I0930 00:35:26.322721 27053 registrar.cpp:347] Recovering registrar
00:35:26 I0930 00:35:26.322829 27048 registrar.cpp:391] Successfully fetched 
the registry (0B) in 90368ns
00:35:26 I0930 00:35:26.322856 27048 registrar.cpp:495] Applied 1 operations in 
4113ns; attempting to update the registry
00:35:26 I0930 00:35:26.322960 27053 registrar.cpp:552] Successfully updated 
the registry in 89088ns
00:35:26 I0930 00:35:26.323011 27053 registrar.cpp:424] Successfully recovered 
registrar
00:35:26 I0930 00:35:26.323148 27054 master.cpp:1809] Recovered 0 agents from 
the registry (146B); allowing 10mins for agents to re-register
00:35:26 I0930 00:35:26.323161 27047 hierarchical.cpp:209] Skipping recovery of 
hierarchical allocator: nothing to recover
00:35:26 W0930 00:35:26.325556 27033 process.cpp:3194] Attempted to spawn 
already running process files@172.16.10.150:37345
00:35:26 I0930 00:35:26.325654 27033 cluster.cpp:448] Creating default 'local' 
authorizer
00:35:26 I0930 00:35:26.326050 27048 slave.cpp:254] Mesos agent started on 
(250)@172.16.10.150:37345
00:35:26 I0930 00:35:26.326066 27048 slave.cpp:255] Flags at startup: --acls="" 
--appc_simple_discovery_uri_prefix="http://" 
--appc_store_dir="/tmp/MasterTest_RecoveredFramework_6nFcY6/store/appc" 
--authenticate_http_readonly="true" --authenticate_http_readwrite="true" 
--authenticatee="crammd5" --authentication_backoff_factor="1secs" 
--authorizer="local" --cgroups_cpu_enable_pids_and_tids_count="false" 
--cgroups_enable_cfs="false" 

[jira] [Commented] (MESOS-8000) DefaultExecutorCniTest.ROOT_VerifyContainerIP is flaky.

2017-10-02 Thread Till Toenshoff (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-8000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16188086#comment-16188086
 ] 

Till Toenshoff commented on MESOS-8000:
---

Observed on internal CI (Centos7):
{noformat}
00:53:39  ../../src/tests/containerizer/cni_isolator_tests.cpp:1419: Failure
00:53:39  Failed to wait 15secs for subscribed
{noformat}

Log:

{noformat}
00:53:24  [ RUN  ] 
NetworkParam/DefaultExecutorCniTest.ROOT_VerifyContainerIP/0
00:53:24  I0930 00:53:24.468544  7413 cluster.cpp:162] Creating default 'local' 
authorizer
00:53:24  I0930 00:53:24.469557 26520 master.cpp:445] Master 
473b9c2c-8d12-417d-98de-d71cd175d503 (ip-172-16-10-96.ec2.internal) started on 
172.16.10.96:38662
00:53:24  I0930 00:53:24.469573 26520 master.cpp:447] Flags at startup: 
--acls="" --agent_ping_timeout="15secs" --agent_reregister_timeout="10mins" 
--allocation_interval="1secs" --allocator="HierarchicalDRF" 
--authenticate_agents="true" --authenticate_frameworks="true" 
--authenticate_http_frameworks="true" --authenticate_http_readonly="true" 
--authenticate_http_readwrite="true" --authenticators="crammd5" 
--authorizers="local" --credentials="/tmp/g7H2KI/credentials" 
--filter_gpu_resources="true" --framework_sorter="drf" --help="false" 
--hostname_lookup="true" --http_authenticators="basic" 
--http_framework_authenticators="basic" --initialize_driver_logging="true" 
--log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" 
--max_agent_ping_timeouts="5" --max_completed_frameworks="50" 
--max_completed_tasks_per_framework="1000" 
--max_unreachable_tasks_per_framework="1000" --port="5050" --quiet="false" 
--recovery_agent_removal_limit="100%" --registry="in_memory" 
--registry_fetch_timeout="1mins" --registry_gc_interval="15mins" 
--registry_max_agent_age="2weeks" --registry_max_agent_count="102400" 
--registry_store_timeout="100secs" --registry_strict="false" 
--root_submissions="true" --user_sorter="drf" --version="false" 
--webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/g7H2KI/master" 
--zk_session_timeout="10secs"
00:53:24  I0930 00:53:24.469699 26520 master.cpp:497] Master only allowing 
authenticated frameworks to register
00:53:24  I0930 00:53:24.469707 26520 master.cpp:511] Master only allowing 
authenticated agents to register
00:53:24  I0930 00:53:24.469712 26520 master.cpp:524] Master only allowing 
authenticated HTTP frameworks to register
00:53:24  I0930 00:53:24.469717 26520 credentials.hpp:37] Loading credentials 
for authentication from '/tmp/g7H2KI/credentials'
00:53:24  I0930 00:53:24.469817 26520 master.cpp:569] Using default 'crammd5' 
authenticator
00:53:24  I0930 00:53:24.469864 26520 http.cpp:1045] Creating default 'basic' 
HTTP authenticator for realm 'mesos-master-readonly'
00:53:24  I0930 00:53:24.469899 26520 http.cpp:1045] Creating default 'basic' 
HTTP authenticator for realm 'mesos-master-readwrite'
00:53:24  I0930 00:53:24.469923 26520 http.cpp:1045] Creating default 'basic' 
HTTP authenticator for realm 'mesos-master-scheduler'
00:53:24  I0930 00:53:24.469943 26520 master.cpp:649] Authorization enabled
00:53:24  I0930 00:53:24.470126 26523 hierarchical.cpp:171] Initialized 
hierarchical allocator process
00:53:24  I0930 00:53:24.470139 26519 whitelist_watcher.cpp:77] No whitelist 
given
00:53:24  I0930 00:53:24.470667 26520 master.cpp:2216] Elected as the leading 
master!
00:53:24  I0930 00:53:24.470679 26520 master.cpp:1705] Recovering from registrar
00:53:24  I0930 00:53:24.470717 26521 registrar.cpp:347] Recovering registrar
00:53:24  I0930 00:53:24.470865 26521 registrar.cpp:391] Successfully fetched 
the registry (0B) in 130048ns
00:53:24  I0930 00:53:24.470899 26521 registrar.cpp:495] Applied 1 operations 
in 6443ns; attempting to update the registry
00:53:24  I0930 00:53:24.471029 26518 registrar.cpp:552] Successfully updated 
the registry in 113920ns
00:53:24  I0930 00:53:24.471076 26518 registrar.cpp:424] Successfully recovered 
registrar
00:53:24  I0930 00:53:24.471153 26519 master.cpp:1809] Recovered 0 agents from 
the registry (168B); allowing 10mins for agents to re-register
00:53:24  I0930 00:53:24.471210 26519 hierarchical.cpp:209] Skipping recovery 
of hierarchical allocator: nothing to recover
00:53:24  W0930 00:53:24.473450  7413 process.cpp:3194] Attempted to spawn 
already running process files@172.16.10.96:38662
00:53:24  I0930 00:53:24.473793  7413 containerizer.cpp:292] Using isolation { 
environment_secret, posix/cpu, posix/mem, filesystem/posix, network/cni }
00:53:24  I0930 00:53:24.479225  7413 linux_launcher.cpp:146] Using 
/sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher
00:53:24  I0930 00:53:24.479602  7413 provisioner.cpp:255] Using default 
backend 'overlay'
00:53:24  I0930 00:53:24.480844  7413 cluster.cpp:448] Creating default 'local' 
authorizer
00:53:24  I0930 00:53:24.481324 26523 slave.cpp:254] Mesos agent started on 

[jira] [Commented] (MESOS-7742) ContentType/AgentAPIStreamingTest.AttachInputToNestedContainerSession is flaky

2017-10-02 Thread Till Toenshoff (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16188081#comment-16188081
 ] 

Till Toenshoff commented on MESOS-7742:
---

Observed this as well on internal CI:
{noformat}
../../src/tests/api_tests.cpp:6951
Value of: (response).get().status
  Actual: "500 Internal Server Error"
Expected: http::OK().status
Which is: "200 OK"
{noformat}

{noformat}
00:50:40  [ RUN  ] 
ContentType/AgentAPIStreamingTest.AttachInputToNestedContainerSession/1
00:50:40  I0930 00:50:40.193588  7413 cluster.cpp:162] Creating default 'local' 
authorizer
00:50:40  I0930 00:50:40.194614 26521 master.cpp:445] Master 
6d4d319b-ce27-402c-91d2-087edb6a4a11 (ip-172-16-10-96.ec2.internal) started on 
172.16.10.96:38662
00:50:40  I0930 00:50:40.194630 26521 master.cpp:447] Flags at startup: 
--acls="" --agent_ping_timeout="15secs" --agent_reregister_timeout="10mins" 
--allocation_interval="1secs" --allocator="HierarchicalDRF" 
--authenticate_agents="true" --authenticate_frameworks="true" 
--authenticate_http_frameworks="true" --authenticate_http_readonly="true" 
--authenticate_http_readwrite="true" --authenticators="crammd5" 
--authorizers="local" --credentials="/tmp/wdBG06/credentials" 
--filter_gpu_resources="true" --framework_sorter="drf" --help="false" 
--hostname_lookup="true" --http_authenticators="basic" 
--http_framework_authenticators="basic" --initialize_driver_logging="true" 
--log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" 
--max_agent_ping_timeouts="5" --max_completed_frameworks="50" 
--max_completed_tasks_per_framework="1000" 
--max_unreachable_tasks_per_framework="1000" --port="5050" --quiet="false" 
--recovery_agent_removal_limit="100%" --registry="in_memory" 
--registry_fetch_timeout="1mins" --registry_gc_interval="15mins" 
--registry_max_agent_age="2weeks" --registry_max_agent_count="102400" 
--registry_store_timeout="100secs" --registry_strict="false" 
--root_submissions="true" --user_sorter="drf" --version="false" 
--webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/wdBG06/master" 
--zk_session_timeout="10secs"
00:50:40  I0930 00:50:40.194720 26521 master.cpp:497] Master only allowing 
authenticated frameworks to register
00:50:40  I0930 00:50:40.194723 26521 master.cpp:511] Master only allowing 
authenticated agents to register
00:50:40  I0930 00:50:40.194725 26521 master.cpp:524] Master only allowing 
authenticated HTTP frameworks to register
00:50:40  I0930 00:50:40.194730 26521 credentials.hpp:37] Loading credentials 
for authentication from '/tmp/wdBG06/credentials'
00:50:40  I0930 00:50:40.194808 26521 master.cpp:569] Using default 'crammd5' 
authenticator
00:50:40  I0930 00:50:40.194844 26521 http.cpp:1045] Creating default 'basic' 
HTTP authenticator for realm 'mesos-master-readonly'
00:50:40  I0930 00:50:40.194876 26521 http.cpp:1045] Creating default 'basic' 
HTTP authenticator for realm 'mesos-master-readwrite'
00:50:40  I0930 00:50:40.194905 26521 http.cpp:1045] Creating default 'basic' 
HTTP authenticator for realm 'mesos-master-scheduler'
00:50:40  I0930 00:50:40.194932 26521 master.cpp:649] Authorization enabled
00:50:40  I0930 00:50:40.194973 26516 hierarchical.cpp:171] Initialized 
hierarchical allocator process
00:50:40  I0930 00:50:40.195008 26516 whitelist_watcher.cpp:77] No whitelist 
given
00:50:40  I0930 00:50:40.195634 26523 master.cpp:2216] Elected as the leading 
master!
00:50:40  I0930 00:50:40.195659 26523 master.cpp:1705] Recovering from registrar
00:50:40  I0930 00:50:40.195701 26523 registrar.cpp:347] Recovering registrar
00:50:40  I0930 00:50:40.195863 26521 registrar.cpp:391] Successfully fetched 
the registry (0B) in 144128ns
00:50:40  I0930 00:50:40.195896 26521 registrar.cpp:495] Applied 1 operations 
in 6568ns; attempting to update the registry
00:50:40  I0930 00:50:40.196048 26519 registrar.cpp:552] Successfully updated 
the registry in 119808ns
00:50:40  I0930 00:50:40.196079 26519 registrar.cpp:424] Successfully recovered 
registrar
00:50:40  I0930 00:50:40.196159 26520 master.cpp:1809] Recovered 0 agents from 
the registry (168B); allowing 10mins for agents to re-register
00:50:40  I0930 00:50:40.196218 26518 hierarchical.cpp:209] Skipping recovery 
of hierarchical allocator: nothing to recover
00:50:40  I0930 00:50:40.197204  7413 containerizer.cpp:292] Using isolation { 
environment_secret, posix/cpu, posix/mem, filesystem/posix, network/cni }
00:50:40  I0930 00:50:40.202510  7413 linux_launcher.cpp:146] Using 
/sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher
00:50:40  I0930 00:50:40.202874  7413 provisioner.cpp:255] Using default 
backend 'overlay'
00:50:40  W0930 00:50:40.204174  7413 process.cpp:3194] Attempted to spawn 
already running process files@172.16.10.96:38662
00:50:40  I0930 00:50:40.204308  7413 cluster.cpp:448] Creating default 'local' 
authorizer
00:50:40  I0930 00:50:40.204797 26523 

[jira] [Created] (MESOS-8048) ReservationEndpointsTest.GoodReserveAndUnreserveACL is flaky.

2017-10-02 Thread Till Toenshoff (JIRA)
Till Toenshoff created MESOS-8048:
-

             Summary: ReservationEndpointsTest.GoodReserveAndUnreserveACL is flaky.
                 Key: MESOS-8048
                 URL: https://issues.apache.org/jira/browse/MESOS-8048
             Project: Mesos
          Issue Type: Bug
          Components: test
    Affects Versions: 1.5.0
         Environment: Centos7
            Reporter: Till Toenshoff


As just observed on our internal CI:

Error Message
{noformat}
../../src/tests/reservation_endpoints_tests.cpp:1026
Value of: (response).get().status
  Actual: "409 Conflict"
Expected: Accepted().status
Which is: "202 Accepted"
{noformat}

Log:
{noformat}
00:42:35  [ RUN  ] ReservationEndpointsTest.GoodReserveAndUnreserveACL
00:42:35  I0930 00:42:35.517658  7413 cluster.cpp:162] Creating default 'local' 
authorizer
00:42:35  I0930 00:42:35.518507  7433 master.cpp:445] Master 
938119f3-8007-4d6f-a45b-d49bf76a0590 (ip-172-16-10-96.ec2.internal) started on 
172.16.10.96:46227
00:42:35  I0930 00:42:35.518523  7433 master.cpp:447] Flags at startup: 
--acls="reserve_resources {
00:42:35principals {
00:42:35  values: "test-principal"
00:42:35}
00:42:35roles {
00:42:35  type: ANY
00:42:35}
00:42:35  }
00:42:35  unreserve_resources {
00:42:35principals {
00:42:35  values: "test-principal"
00:42:35}
00:42:35reserver_principals {
00:42:35  values: "test-principal"
00:42:35}
00:42:35  }
00:42:35  " --agent_ping_timeout="15secs" --agent_reregister_timeout="10mins" 
--allocation_interval="50ms" --allocator="HierarchicalDRF" 
--authenticate_agents="true" --authenticate_frameworks="true" 
--authenticate_http_frameworks="true" --authenticate_http_readonly="true" 
--authenticate_http_readwrite="true" --authenticators="crammd5" 
--authorizers="local" --credentials="/tmp/zFIYus/credentials" 
--filter_gpu_resources="true" --framework_sorter="drf" --help="false" 
--hostname_lookup="true" --http_authenticators="basic" 
--http_framework_authenticators="basic" --initialize_driver_logging="true" 
--log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" 
--max_agent_ping_timeouts="5" --max_completed_frameworks="50" 
--max_completed_tasks_per_framework="1000" 
--max_unreachable_tasks_per_framework="1000" --port="5050" --quiet="false" 
--recovery_agent_removal_limit="100%" --registry="in_memory" 
--registry_fetch_timeout="1mins" --registry_gc_interval="15mins" 
--registry_max_agent_age="2weeks" --registry_max_agent_count="102400" 
--registry_store_timeout="100secs" --registry_strict="false" --roles="role" 
--root_submissions="true" --user_sorter="drf" --version="false" 
--webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/zFIYus/master" 
--zk_session_timeout="10secs"
00:42:35  I0930 00:42:35.518672  7433 master.cpp:497] Master only allowing 
authenticated frameworks to register
00:42:35  I0930 00:42:35.518681  7433 master.cpp:511] Master only allowing 
authenticated agents to register
00:42:35  I0930 00:42:35.518685  7433 master.cpp:524] Master only allowing 
authenticated HTTP frameworks to register
00:42:35  I0930 00:42:35.518689  7433 credentials.hpp:37] Loading credentials 
for authentication from '/tmp/zFIYus/credentials'
00:42:35  I0930 00:42:35.518784  7433 master.cpp:569] Using default 'crammd5' 
authenticator
00:42:35  I0930 00:42:35.518823  7433 http.cpp:1045] Creating default 'basic' 
HTTP authenticator for realm 'mesos-master-readonly'
00:42:35  I0930 00:42:35.518853  7433 http.cpp:1045] Creating default 'basic' 
HTTP authenticator for realm 'mesos-master-readwrite'
00:42:35  I0930 00:42:35.518877  7433 http.cpp:1045] Creating default 'basic' 
HTTP authenticator for realm 'mesos-master-scheduler'
00:42:35  I0930 00:42:35.518898  7433 master.cpp:649] Authorization enabled
00:42:35  W0930 00:42:35.518905  7433 master.cpp:712] The '--roles' flag is 
deprecated. This flag will be removed in the future. See the Mesos 0.27 upgrade 
notes for more information
00:42:35  I0930 00:42:35.519016  7438 whitelist_watcher.cpp:77] No whitelist 
given
00:42:35  I0930 00:42:35.519018  7439 hierarchical.cpp:171] Initialized 
hierarchical allocator process
00:42:35  I0930 00:42:35.519625  7433 master.cpp:2216] Elected as the leading 
master!
00:42:35  I0930 00:42:35.519640  7433 master.cpp:1705] Recovering from registrar
00:42:35  I0930 00:42:35.519677  7433 registrar.cpp:347] Recovering registrar
00:42:35  I0930 00:42:35.519762  7438 registrar.cpp:391] Successfully fetched 
the registry (0B) in 70144ns
00:42:35  I0930 00:42:35.519783  7438 registrar.cpp:495] Applied 1 operations 
in 3246ns; attempting to update the registry
00:42:35  I0930 00:42:35.519870  7439 registrar.cpp:552] Successfully updated 
the registry in 78080ns
00:42:35  I0930 00:42:35.519899  7439 registrar.cpp:424] Successfully recovered 
registrar
00:42:35  I0930 00:42:35.519975  7439 master.cpp:1809] Recovered 0 agents from 
the registry (168B); allowing 10mins for 

[jira] [Created] (MESOS-8047) SubprocessTest.Status does not always receive a signal

2017-10-02 Thread Benno Evers (JIRA)
Benno Evers created MESOS-8047:
--

             Summary: SubprocessTest.Status does not always receive a signal
                 Key: MESOS-8047
                 URL: https://issues.apache.org/jira/browse/MESOS-8047
             Project: Mesos
          Issue Type: Bug
            Reporter: Benno Evers


This one seems to be different from MESOS-1705 and MESOS-1738. It might be that 
previous test runs leave a mesos process running in the background, but I 
didn't investigate very deeply:

{code}
[ RUN  ] SubprocessTest.Status
/home/bevers/src/mesos/worktrees/master/3rdparty/libprocess/src/tests/subprocess_tests.cpp:281:
 Failure
Expecting WIFSIGNALED(s.get().status()()->get()) but  
WIFEXITED(s.get().status()()->get()) is true and 
WEXITSTATUS(s.get().status()()->get()) is 0
{code}
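
For readers less familiar with the macros in the assertion above, here is a minimal sketch of how a POSIX wait status separates a child that was killed by a signal from one that exited normally. It is not the libprocess test. It uses Python's `os` module, which mirrors `WIFSIGNALED`, `WIFEXITED` and `WEXITSTATUS`, and the helper names `describe_status` and `run_child` are hypothetical, chosen only for illustration.

```python
import os
import signal


def describe_status(status):
    """Render a raw wait status using the same checks as the assertion above."""
    if os.WIFSIGNALED(status):
        # The child was terminated by a signal.
        return "signaled:%d" % os.WTERMSIG(status)
    if os.WIFEXITED(status):
        # The child called exit() or returned from main().
        return "exited:%d" % os.WEXITSTATUS(status)
    return "other"


def run_child(kill_it):
    """Fork a child and return its raw wait status.

    With kill_it=True the child blocks until the parent sends SIGKILL;
    otherwise it exits immediately with status 0.
    """
    pid = os.fork()
    if pid == 0:
        # Child process.
        if kill_it:
            while True:
                signal.pause()
        os._exit(0)
    # Parent process.
    if kill_it:
        os.kill(pid, signal.SIGKILL)
    _, status = os.waitpid(pid, 0)
    return status


if __name__ == "__main__":
    print(describe_status(run_child(kill_it=True)))
    print(describe_status(run_child(kill_it=False)))
```

The failure reported above corresponds to the second case: the status says the child exited with code 0, whereas the test expected a signal-terminated child. The sketch only illustrates how the status is decoded. It says nothing about why the signal was not delivered in the failing run.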





[jira] [Created] (MESOS-8046) MasterTestPrePostReservationRefinement.ReserveAndUnreserveResourcesV1 is flaky.

2017-10-02 Thread Till Toenshoff (JIRA)
Till Toenshoff created MESOS-8046:
-

             Summary: MasterTestPrePostReservationRefinement.ReserveAndUnreserveResourcesV1 is flaky.
                 Key: MESOS-8046
                 URL: https://issues.apache.org/jira/browse/MESOS-8046
             Project: Mesos
          Issue Type: Bug
          Components: test
    Affects Versions: 1.5.0
            Reporter: Till Toenshoff


As seen on our internal CI.

Error Message
{noformat}
../../src/tests/master_tests.cpp:8682
Value of: (v1UnreserveResourcesResponse).get().status
  Actual: "409 Conflict"
Expected: Accepted().status
Which is: "202 Accepted"
{noformat}


Log:
{noformat}
00:33:08  [ RUN  ] 
bool/MasterTestPrePostReservationRefinement.ReserveAndUnreserveResourcesV1/0
00:33:08  I0929 17:33:08.670744 2067726336 cluster.cpp:162] Creating default 
'local' authorizer
00:33:08  I0929 17:33:08.672592 3211264 master.cpp:445] Master 
71fce4a3-01f6-43a7-b512-28980b04e51f (10.0.49.4) started on 10.0.49.4:54887
00:33:08  I0929 17:33:08.672621 3211264 master.cpp:447] Flags at startup: 
--acls="" --agent_ping_timeout="15secs" --agent_reregister_timeout="10mins" 
--allocation_interval="1secs" --allocator="HierarchicalDRF" 
--authenticate_agents="true" --authenticate_frameworks="true" 
--authenticate_http_frameworks="true" --authenticate_http_readonly="true" 
--authenticate_http_readwrite="true" --authenticators="crammd5" 
--authorizers="local" 
--credentials="/private/var/folders/6w/rw03zh013y38ys6cyn8qppf8gn/T/YdqFmR/credentials"
 --filter_gpu_resources="true" --framework_sorter="drf" --help="false" 
--hostname_lookup="true" --http_authenticators="basic" 
--http_framework_authenticators="basic" --initialize_driver_logging="true" 
--log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" 
--max_agent_ping_timeouts="5" --max_completed_frameworks="50" 
--max_completed_tasks_per_framework="1000" 
--max_unreachable_tasks_per_framework="1000" --port="5050" --quiet="false" 
--recovery_agent_removal_limit="100%" --registry="in_memory" 
--registry_fetch_timeout="1mins" --registry_gc_interval="15mins" 
--registry_max_agent_age="2weeks" --registry_max_agent_count="102400" 
--registry_store_timeout="100secs" --registry_strict="false" 
--root_submissions="true" --user_sorter="drf" --version="false" 
--webui_dir="/usr/local/share/mesos/webui" 
--work_dir="/private/var/folders/6w/rw03zh013y38ys6cyn8qppf8gn/T/YdqFmR/master"
 --zk_session_timeout="10secs"
00:33:08  I0929 17:33:08.672792 3211264 master.cpp:497] Master only allowing 
authenticated frameworks to register
00:33:08  I0929 17:33:08.672804 3211264 master.cpp:511] Master only allowing 
authenticated agents to register
00:33:08  I0929 17:33:08.672821 3211264 master.cpp:524] Master only allowing 
authenticated HTTP frameworks to register
00:33:08  I0929 17:33:08.672829 3211264 credentials.hpp:37] Loading credentials 
for authentication from 
'/private/var/folders/6w/rw03zh013y38ys6cyn8qppf8gn/T/YdqFmR/credentials'
00:33:08  I0929 17:33:08.672997 3211264 master.cpp:569] Using default 'crammd5' 
authenticator
00:33:08  I0929 17:33:08.673053 3211264 http.cpp:1045] Creating default 'basic' 
HTTP authenticator for realm 'mesos-master-readonly'
00:33:08  I0929 17:33:08.673136 3211264 http.cpp:1045] Creating default 'basic' 
HTTP authenticator for realm 'mesos-master-readwrite'
00:33:08  I0929 17:33:08.673174 3211264 http.cpp:1045] Creating default 'basic' 
HTTP authenticator for realm 'mesos-master-scheduler'
00:33:08  I0929 17:33:08.673226 3211264 master.cpp:649] Authorization enabled
00:33:08  I0929 17:33:08.673306 2674688 hierarchical.cpp:171] Initialized 
hierarchical allocator process
00:33:08  I0929 17:33:08.673326 1601536 whitelist_watcher.cpp:77] No whitelist 
given
00:33:08  I0929 17:33:08.674684 1601536 master.cpp:2216] Elected as the leading 
master!
00:33:08  I0929 17:33:08.674708 1601536 master.cpp:1705] Recovering from 
registrar
00:33:08  I0929 17:33:08.674787 2674688 registrar.cpp:347] Recovering registrar
00:33:08  I0929 17:33:08.674944 2674688 registrar.cpp:391] Successfully fetched 
the registry (0B) in 134912ns
00:33:08  I0929 17:33:08.675014 2674688 registrar.cpp:495] Applied 1 operations 
in 17us; attempting to update the registry
00:33:08  I0929 17:33:08.675209 2674688 registrar.cpp:552] Successfully updated 
the registry in 157184ns
00:33:08  I0929 17:33:08.675252 2674688 registrar.cpp:424] Successfully 
recovered registrar
00:33:08  I0929 17:33:08.675377 2138112 master.cpp:1809] Recovered 0 agents 
from the registry (121B); allowing 10mins for agents to re-register
00:33:08  I0929 17:33:08.675418 528384 hierarchical.cpp:209] Skipping recovery 
of hierarchical allocator: nothing to recover
00:33:08  W0929 17:33:08.678066 2067726336 process.cpp:3194] Attempted to spawn 
already running process files@10.0.49.4:54887
00:33:08  I0929 17:33:08.678484 2067726336 containerizer.cpp:292] Using 
isolation { environment_secret, 

[jira] [Updated] (MESOS-8045) Update Mesos executables output if there is a typo

2017-10-02 Thread Armand Grillet (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Armand Grillet updated MESOS-8045:
--
Description: 
Current output if a user makes a typo while using one of the Mesos executables:

{code}
build (master) $ ./bin/mesos-master.sh --ip=127.0.0.1 --workdir=/tmp
Failed to load unknown flag 'workdir'

Usage: mesos-master [options]

  --acls=VALUE
      The value could be a JSON-formatted string of ACLs
      or a file path containing the JSON-formatted ACLs used
      for authorization. Path could be of the form `file:///path/to/file`
      or `/path/to/file`.

      Note that if the flag `--authorizers` is provided with a value
      different than `local`, the ACLs contents
      will be ignored.

      See the ACLs protobuf in acls.proto for the expected format.

      Example:
      {
        "register_frameworks": [
          {
            "principals": { "type": "ANY" },
            "roles": { "values": ["a"] }
          }
        ],
        "run_tasks": [
          {
            "principals": { "values": ["a", "b"] },
            "users": { "values": ["c"] }
          }
        ],
        "teardown_frameworks": [
          {
            "principals": { "values": ["a", "b"] },
            "framework_principals": { "values": ["c"] }
          }
        ],
        "set_quotas": [
          {
            "principals": { "values": ["a"] },
            "roles": { "values": ["a", "b"] }
          }
        ],
        "remove_quotas": [
          {
            "principals": { "values": ["a"] },
            "quota_principals": { "values": ["a"] }
          }
        ]
      }
  --advertise_ip=VALUE
      IP address advertised to reach this Mesos master.

[jira] [Created] (MESOS-8045) Update Mesos executables output if there is a typo

2017-10-02 Thread Armand Grillet (JIRA)
Armand Grillet created MESOS-8045:
-

 Summary: Update Mesos executables output if there is a typo
 Key: MESOS-8045
 URL: https://issues.apache.org/jira/browse/MESOS-8045
 Project: Mesos
  Issue Type: Improvement
Reporter: Armand Grillet
Priority: Minor


Current output if a user makes a typo while using one of the Mesos executables:

{code}
build (master) $ ./bin/mesos-master.sh --ip=127.0.0.1 --workdir=/tmp
Failed to load unknown flag 'workdir'

Usage: mesos-master [options]

  --acls=VALUE
 The value could be a JSON-formatted string of ACLs
 or a file path containing the JSON-formatted ACLs used
 for authorization. Path could be of the form `file:///path/to/file`
 or `/path/to/file`.

 Note that if the flag `--authorizers` is provided with a value
 different than `local`, the ACLs contents
 will be ignored.

 See the ACLs protobuf in acls.proto for the expected format.

 Example:
 {
   "register_frameworks": [
     {
       "principals": { "type": "ANY" },
       "roles": { "values": ["a"] }
     }
   ],
   "run_tasks": [
     {
       "principals": { "values": ["a", "b"] },
       "users": { "values": ["c"] }
     }
   ],
   "teardown_frameworks": [
     {
       "principals": { "values": ["a", "b"] },
       "framework_principals": { "values": ["c"] }
     }
   ],
   "set_quotas": [
     {
       "principals": { "values": ["a"] },
       "roles": { "values": ["a", "b"] }
     }
   ],
   "remove_quotas": [
     {
       "principals": { "values": ["a"] },
       "quota_principals": { "values": ["a"] }
     }
   ]
 }
  --advertise_ip=VALUE

[jira] [Created] (MESOS-8044) Modules flags are getting copy too often.

2017-10-02 Thread Till Toenshoff (JIRA)
Till Toenshoff created MESOS-8044:
-

 Summary: Modules flags are getting copy too often.
 Key: MESOS-8044
 URL: https://issues.apache.org/jira/browse/MESOS-8044
 Project: Mesos
  Issue Type: Improvement
Reporter: Till Toenshoff


For loading modules, we commonly use the flags {{modules}} and {{modules_dir}}. 
Their descriptions are rather verbose but get copied across the various 
use cases. We might want to fix this by providing a single definition 
and then using flags inheritance instead.

See:
https://github.com/apache/mesos/blob/f3a1ad0a1e15136196b3e093cb89083e2db8a0a7/src/slave/flags.cpp#L1034-L1089
https://github.com/apache/mesos/blob/f3a1ad0a1e15136196b3e093cb89083e2db8a0a7/src/master/flags.cpp#L379-L435
https://github.com/apache/mesos/blob/f3a1ad0a1e15136196b3e093cb89083e2db8a0a7/src/sched/flags.hpp#L58-L113
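
The "single definition plus flags inheritance" idea could look roughly like the sketch below. It is only an illustration under stated assumptions: the {{FlagsBase}} class here is a simplified stand-in written for this example, not the stout flags API, and the help strings, {{ModuleFlags}}, {{MasterFlags}} and {{AgentFlags}} are hypothetical names. The point it shows is that {{modules}} and {{modules_dir}} are described once and every flags class that inherits from the shared group picks up the same definition.

```cpp
#include <map>
#include <string>

// Hypothetical, simplified flags base. This is NOT the stout FlagsBase API;
// it only models "flag name -> (help text, target variable)".
class FlagsBase {
public:
  FlagsBase() = default;
  virtual ~FlagsBase() = default;

  // Registered flags hold pointers into this object, so forbid copying.
  FlagsBase(const FlagsBase&) = delete;
  FlagsBase& operator=(const FlagsBase&) = delete;

  // Sets a registered flag; returns false for an unknown flag name.
  bool load(const std::string& name, const std::string& value) {
    auto it = flags_.find(name);
    if (it == flags_.end()) return false;
    *it->second.target = value;
    return true;
  }

  // Returns the help text of a flag, or an empty string if it is unknown.
  std::string help(const std::string& name) const {
    auto it = flags_.find(name);
    return it == flags_.end() ? std::string() : it->second.help;
  }

protected:
  // Registers a flag together with the member variable it loads into.
  void add(std::string* target, const std::string& name,
           const std::string& help) {
    flags_[name] = Flag{help, target};
  }

private:
  struct Flag {
    std::string help;
    std::string* target;
  };
  std::map<std::string, Flag> flags_;
};

// The single definition of the module-related flags (placeholder help text).
class ModuleFlags : public virtual FlagsBase {
public:
  ModuleFlags() {
    add(&modules, "modules", "Placeholder: modules to load.");
    add(&modules_dir, "modules_dir", "Placeholder: module manifest directory.");
  }
  std::string modules;
  std::string modules_dir;
};

// Each component inherits the shared group instead of copying its text.
class MasterFlags : public virtual ModuleFlags {
public:
  MasterFlags() { add(&work_dir, "work_dir", "Placeholder help text."); }
  std::string work_dir;
};

class AgentFlags : public virtual ModuleFlags {
public:
  AgentFlags() { add(&isolation, "isolation", "Placeholder help text."); }
  std::string isolation;
};
```

With this layout, {{MasterFlags}} and {{AgentFlags}} both accept {{modules}} and {{modules_dir}} and report identical help text for them, and a change to the description only has to be made in {{ModuleFlags}}. Virtual inheritance is used so that a class combining several such groups would still share one {{FlagsBase}}.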



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (MESOS-8011) Enabling Port mapping generate segfault

2017-10-02 Thread Jean-Baptiste (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-8011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16187798#comment-16187798
 ] 

Jean-Baptiste commented on MESOS-8011:
--

Adding *strace* output in case it helps:
{code}
open("/proc/self/maps", O_RDONLY)   = 10
read(10, "561e673f-561e6742e000 r-xp 0"..., 1024)   = 1024
read(10, "000 ---p  00:00 0 \n7f119"..., 1002)  = 1002
read(10, "ogger-1.3.1.so\n7f11ba91f000-7f11"..., 887)   = 887
read(10, "p  00:00 0 \n7f11bd928000"..., 995)   = 995
read(10, "0 0 \n7f11c2932000-7f11c2933000 -"..., 980)   = 980
read(10, "  /lib/x86_64-linux-gnu/"..., 961)= 961
read(10, "4-linux-gnu/libkrb5support.so.0."..., 937)= 937
read(10, "lib/x86_64-linux-gnu/libp11-kit."..., 946)= 946
read(10, "u/libgmp.so.10.2.0\n7f11c431c000-"..., 927)   = 927
read(10, "o.5.0.0\n7f11c476e000-7f11c477600"..., 920)   = 920
read(10, "b/x86_64-linux-gnu/libcom_err.so"..., 948)= 948
read(10, "0941 /usr/li"..., 976)= 976
read(10, "0 /usr/lib/x"..., 973)= 973
read(10, "0 ca:02 272539  "..., 986)= 986
read(10, "11.6.12\n7f11c5ec5000-7f11c5eef00"..., 915)   = 915
read(10, " /lib/x86_64-linux-gnu/l"..., 960)= 960
read(10, "/usr/lib/x86_64-linux-gnu/li"..., 955)= 955
read(10, "a:02 271003 "..., 983)= 983
read(10, "\n7f11c7087000-7f11c7088000 rw-p "..., 904)   = 904
read(10, "5000 ca:02 295137   "..., 993)= 993
read(10, " /lib/x8"..., 976)= 976
read(10, "76/lib/x"..., 977)= 977
read(10, "-gnutls.so.4.3.0\n7f11c8491000-7f"..., 918)   = 918
read(10, "a:02 294964 "..., 983)= 983
read(10, "143 /usr/lib"..., 975)= 975
read(10, "ibsasl2.so.2.0.25\n7f11c8f8b000-7"..., 924)   = 924
read(10, " /usr/lib/x86_64-linux-g"..., 960)= 960
read(10, " ca:02 295023   "..., 993)= 993
read(10, " /lib/x86_64-lin"..., 968)= 968
read(10, "0 ca:02 77  "..., 986)= 986
read(10, " /usr/lib/x86_64"..., 968)= 968
read(10, "cd3d6000-7f11cd3f7000 rw-p 0"..., 1020)   = 680
read(10, "", 340)   = 0
read(10, "", 1024)  = 0
close(10)   = 0
write(2, "@0x0 (unknow"..., 35@0x0 
(unknown)
) = 35
gettimeofday({1506936412, 36558}, NULL) = 0
gettimeofday({1506936412, 36613}, NULL) = 0
gettimeofday({1506936412, 3}, NULL) = 0
gettimeofday({1506936412, 36719}, NULL) = 0
rt_sigaction(SIGABRT, {SIG_DFL, [], SA_RESTORER, 0x7f11ca091890}, NULL, 8) = 0
kill(30960, SIGABRT)= 0
+++ killed by SIGABRT +++
Aborted
{code}

> Enabling Port mapping generate segfault 
> 
>
> Key: MESOS-8011
> URL: https://issues.apache.org/jira/browse/MESOS-8011
> Project: Mesos
>  Issue Type: Bug
>  Components: agent, network
>Affects Versions: 1.3.0, 1.3.1, 1.4.0
>Reporter: Jean-Baptiste
>  Labels: core, isolation, reliability
>
> h2. Overview
> After a successful build of several Mesos versions (1.3.0 / 1.3.1 / 
> 1.4.0 / 1.5.0), I still hit the following segfault when starting 
> the Mesos agent:
> h2. Environment
> * *Debian* Linux 8.7 (Jessie)
> * *Kernel* 4.12 (also tried with 3.16 and 4.9)
> * *Mesos* 1.3.0 (also tried with 1.3.1, 1.4.0 and 1.5.0)
> * *Libnl* 3.2.27-2
> h2. Stack trace
> {code}
> Sep 25 12:41:46 ip-10-43-20-218 systemd[1]: Starting Mesos Slave...
> Sep 25 12:41:46 ip-10-43-20-218 systemd[1]: Started Mesos Slave.
> Sep 25 12:41:46 ip-10-43-20-218 mesos-slave[2754]: WARNING: Logging before 
> InitGoogleLogging() is written to STDERR
> Sep 25 12:41:46 ip-10-43-20-218 mesos-slave[2754]: W0925 12:41:46.510066  
> 2717 parse.hpp:97] Specifying an absolute filename to read a command line 
> option out of without using 'file:// is deprecated and will be removed in a 
> future release. Simply adding 'file://' to the beginning of the path should 
> eliminate this warning.
> Sep 25 12:41:46 ip-10-43-20-218 mesos-slave[2754]: I0925 12:41:46.510259  
> 2717 main.cpp:322] Build: 2017-09-04 19:29:27 by pbuilder
> Sep 25 12:41:46 ip-10-43-20-218 mesos-slave[2754]: I0925 12:41:46.510275  
> 2717 main.cpp:323] Version: 1.3.1
> Sep 25 12:41:46 ip-10-43-20-218 mesos-slave[2754]: I0925 12:41:46.511230  
> 2717 logging.cpp:194] INFO level logging started!
>