[jira] [Updated] (MESOS-3936) Document possible task state transitions for framework authors
[ https://issues.apache.org/jira/browse/MESOS-3936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Hindman updated MESOS-3936: Sprint: Mesosphere Sprint 24, Mesosphere Sprint 25, Mesosphere Sprint 26 (was: Mesosphere Sprint 24, Mesosphere Sprint 25) > Document possible task state transitions for framework authors > -- > > Key: MESOS-3936 > URL: https://issues.apache.org/jira/browse/MESOS-3936 > Project: Mesos > Issue Type: Documentation > Components: documentation >Reporter: Neil Conway >Assignee: Neil Conway > Labels: documentation, mesosphere > > We should document the possible ways in which the state of a task can evolve > over time; what happens when an agent is partitioned from the master; and > more generally, how we recommend that framework authors develop > fault-tolerant schedulers and do task state reconciliation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4236) Create a design document for jsonify
[ https://issues.apache.org/jira/browse/MESOS-4236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Hindman updated MESOS-4236: Sprint: Mesosphere Sprint 25, Mesosphere Sprint 26 (was: Mesosphere Sprint 25) > Create a design document for jsonify > > > Key: MESOS-4236 > URL: https://issues.apache.org/jira/browse/MESOS-4236 > Project: Mesos > Issue Type: Task > Components: stout >Reporter: Michael Park >Assignee: Michael Park > Labels: mesosphere > > This is the design doc for MESOS-4235 in introducing {{jsonify}} to {{stout}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4238) Update `Master::Http::state` to use the `jsonify` facility.
[ https://issues.apache.org/jira/browse/MESOS-4238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Hindman updated MESOS-4238: Sprint: Mesosphere Sprint 25, Mesosphere Sprint 26 (was: Mesosphere Sprint 25) > Update `Master::Http::state` to use the `jsonify` facility. > --- > > Key: MESOS-4238 > URL: https://issues.apache.org/jira/browse/MESOS-4238 > Project: Mesos > Issue Type: Task > Components: master >Reporter: Michael Park >Assignee: Michael Park > Labels: mesosphere > > Update {{Master::Http::state}} to use the {{jsonify}} function introduced > into stout from MESOS-4237. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2210) Disallow special characters in role.
[ https://issues.apache.org/jira/browse/MESOS-2210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Hindman updated MESOS-2210: Sprint: Mesosphere Sprint 22, Mesosphere Sprint 23, Mesosphere Sprint 24, Mesosphere Sprint 25, Mesosphere Sprint 26 (was: Mesosphere Sprint 22, Mesosphere Sprint 23, Mesosphere Sprint 24, Mesosphere Sprint 25) > Disallow special characters in role. > > > Key: MESOS-2210 > URL: https://issues.apache.org/jira/browse/MESOS-2210 > Project: Mesos > Issue Type: Task >Reporter: Jie Yu >Assignee: haosdent > Labels: mesosphere, newbie, persistent-volumes > > As we introduce persistent volumes in MESOS-1524, we will use roles as > directory names on the slave (https://reviews.apache.org/r/28562/). As a > result, the master should disallow special characters (like space and slash) > in role. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
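The constraint described in MESOS-2210 (a role must be usable as a directory name) could be checked along these lines. This is only a sketch under assumptions: `isValidRoleName` is a hypothetical helper, not the Mesos implementation, and the exact set of rejected characters is a guess based on the "space and slash" examples in the ticket.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of Mesos): returns true only if `role`
// could safely be used as a single directory name.
bool isValidRoleName(const std::string& role)
{
  // An empty name, "." and ".." are not usable as directory names.
  if (role.empty() || role == "." || role == "..") {
    return false;
  }

  for (unsigned char c : role) {
    // Reject whitespace and control characters (<= 0x20, 0x7f) as well
    // as path separators. The exact set is an assumption.
    if (c <= 0x20 || c == 0x7f || c == '/' || c == '\\') {
      return false;
    }
  }

  return true;
}
```

With this sketch, `isValidRoleName("analytics")` is accepted, while names such as `"a b"` or `"a/b"` are rejected.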
[jira] [Updated] (MESOS-313) Report executor terminations to framework schedulers.
[ https://issues.apache.org/jira/browse/MESOS-313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Hindman updated MESOS-313: --- Sprint: Mesosphere Sprint 24, Mesosphere Sprint 25, Mesosphere Sprint 26 (was: Mesosphere Sprint 24, Mesosphere Sprint 25) > Report executor terminations to framework schedulers. > - > > Key: MESOS-313 > URL: https://issues.apache.org/jira/browse/MESOS-313 > Project: Mesos > Issue Type: Improvement >Reporter: Charles Reiss >Assignee: Zhitao Li > Labels: mesosphere, newbie > > The Scheduler interface has a callback for executorLost, but currently it is > never called. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2353) Improve performance of the state.json endpoint for large clusters.
[ https://issues.apache.org/jira/browse/MESOS-2353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Hindman updated MESOS-2353: Sprint: Twitter Mesos Q1 Sprint 5, Mesosphere Sprint 24, Mesosphere Sprint 25, Mesosphere Sprint 26 (was: Twitter Mesos Q1 Sprint 5, Mesosphere Sprint 24, Mesosphere Sprint 25) > Improve performance of the state.json endpoint for large clusters. > -- > > Key: MESOS-2353 > URL: https://issues.apache.org/jira/browse/MESOS-2353 > Project: Mesos > Issue Type: Improvement > Components: master >Reporter: Benjamin Mahler >Assignee: Michael Park > Labels: scalability, twitter > > The master's state.json endpoint consistently takes a long time to compute > the JSON result, for large clusters: > {noformat} > $ time curl -s -o /dev/null localhost:5050/master/state.json > Mon Jan 26 22:38:50 UTC 2015 > real 0m13.174s > user 0m0.003s > sys 0m0.022s > {noformat} > This can cause the master to get backlogged if there are many state.json > requests in flight. > Looking at {{perf}} data, it seems most of the time is spent doing memory > allocation / de-allocation. This ticket will try to capture any low hanging > fruit to speed this up. Possibly we can leverage moves if they are not > already being used by the compiler. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4228) Use std::is_bind_expression to reroute the result of std::bind.
[ https://issues.apache.org/jira/browse/MESOS-4228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Hindman updated MESOS-4228: Sprint: Mesosphere Sprint 25, Mesosphere Sprint 26 (was: Mesosphere Sprint 25) > Use std::is_bind_expression to reroute the result of std::bind. > --- > > Key: MESOS-4228 > URL: https://issues.apache.org/jira/browse/MESOS-4228 > Project: Mesos > Issue Type: Task > Components: libprocess >Reporter: Michael Park >Assignee: Michael Park > Labels: mesosphere > > The Standard (C++11 through 17) does not require {{std::bind}}'s function > call operator to SFINAE, and VS 2015's doesn't. {{std::is_bind_expression}} > can be used to manually reroute bind expressions to the 1-arg overload, where > (conveniently) the argument will be ignored if necessary. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
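The rerouting idea in MESOS-4228 can be illustrated with tag dispatch on `std::is_bind_expression`. The sketch below is not libprocess code: `invoke`, `invokeImpl`, `seven` and `plusOne` are hypothetical names, and non-bind callables are simply assumed to be nullary. It relies on the standard behaviour that a bind expression discards extra call arguments it has no placeholder for.

```cpp
#include <cassert>
#include <functional>
#include <type_traits>
#include <utility>

// Selected for bind expressions: always pass the argument. A bind
// expression without a matching placeholder ignores it.
template <typename F>
int invokeImpl(F&& f, int value, std::true_type /* isBindExpression */)
{
  return std::forward<F>(f)(value);
}

// Selected for everything else; assumed nullary in this sketch.
template <typename F>
int invokeImpl(F&& f, int /* value */, std::false_type)
{
  return std::forward<F>(f)();
}

// Hypothetical entry point: picks an overload without relying on
// SFINAE of the bind object's call operator.
template <typename F>
int invoke(F&& f, int value)
{
  using IsBind = std::is_bind_expression<typename std::decay<F>::type>;
  return invokeImpl(
      std::forward<F>(f),
      value,
      std::integral_constant<bool, IsBind::value>());
}

// Illustrative helpers used to build bind expressions.
int seven() { return 7; }
int plusOne(int x) { return x + 1; }
```

For example, `invoke(std::bind(seven), 99)` takes the 1-arg path and the `99` is dropped by the bind object, while `invoke(std::bind(plusOne, std::placeholders::_1), 41)` forwards the argument.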
[jira] [Updated] (MESOS-4177) Create a user doc for Executor HTTP API
[ https://issues.apache.org/jira/browse/MESOS-4177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Hindman updated MESOS-4177: Sprint: Mesosphere Sprint 24, Mesosphere Sprint 25, Mesosphere Sprint 26 (was: Mesosphere Sprint 24, Mesosphere Sprint 25) > Create a user doc for Executor HTTP API > --- > > Key: MESOS-4177 > URL: https://issues.apache.org/jira/browse/MESOS-4177 > Project: Mesos > Issue Type: Bug >Reporter: Anand Mazumdar >Assignee: Anand Mazumdar > Labels: mesosphere > > We need a user doc similar to the corresponding one for the Scheduler HTTP > API. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4241) Consolidate docker store slave flags
[ https://issues.apache.org/jira/browse/MESOS-4241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Hindman updated MESOS-4241: Sprint: Mesosphere Sprint 25, Mesosphere Sprint 26 (was: Mesosphere Sprint 25) > Consolidate docker store slave flags > > > Key: MESOS-4241 > URL: https://issues.apache.org/jira/browse/MESOS-4241 > Project: Mesos > Issue Type: Improvement > Components: containerization >Reporter: Timothy Chen >Assignee: Timothy Chen > > Currently there are too many slave flags for configuring the docker > store/puller. > We can remove the following flags: > docker_auth_server_port > docker_local_archives_dir > docker_registry_port > docker_puller > And consolidate them into the existing flags. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3925) Add HDFS based URI fetcher plugin.
[ https://issues.apache.org/jira/browse/MESOS-3925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Hindman updated MESOS-3925: Sprint: Mesosphere Sprint 24, Mesosphere Sprint 25, Mesosphere Sprint 26 (was: Mesosphere Sprint 24, Mesosphere Sprint 25) > Add HDFS based URI fetcher plugin. > -- > > Key: MESOS-3925 > URL: https://issues.apache.org/jira/browse/MESOS-3925 > Project: Mesos > Issue Type: Task >Reporter: Jie Yu >Assignee: Jie Yu > Labels: mesosphere, twitter > > This plugin uses HDFS client to fetch artifacts. It can support schemes like > hdfs/hftp/s3/s3n > It'll shell out the hadoop command to do the actual fetching. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4136) Add a ContainerLogger module that restrains log sizes
[ https://issues.apache.org/jira/browse/MESOS-4136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Hindman updated MESOS-4136: Sprint: Mesosphere Sprint 24, Mesosphere Sprint 25, Mesosphere Sprint 26 (was: Mesosphere Sprint 24, Mesosphere Sprint 25) > Add a ContainerLogger module that restrains log sizes > - > > Key: MESOS-4136 > URL: https://issues.apache.org/jira/browse/MESOS-4136 > Project: Mesos > Issue Type: Improvement > Components: modules >Reporter: Joseph Wu >Assignee: Joseph Wu > Labels: logging, mesosphere > > One of the major problems this logger module aims to solve is overflowing > executor/task log files. Log files are simply written to disk, and are not > managed other than via occasional garbage collection by the agent process > (and this only deals with terminated executors). > We should add a {{ContainerLogger}} module that truncates logs as it reaches > a configurable maximum size. Additionally, we should determine if the web > UI's {{pailer}} needs to be changed to deal with logs that are not > append-only. > This will be a non-default module which will also serve as an example for how > to implement the module. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4222) Document containerizer from user perspective.
[ https://issues.apache.org/jira/browse/MESOS-4222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Hindman updated MESOS-4222: Sprint: Mesosphere Sprint 25, Mesosphere Sprint 26 (was: Mesosphere Sprint 25) > Document containerizer from user perspective. > - > > Key: MESOS-4222 > URL: https://issues.apache.org/jira/browse/MESOS-4222 > Project: Mesos > Issue Type: Documentation > Components: containerization >Reporter: Jojy Varghese >Assignee: Jojy Varghese > Labels: documentaion, mesosphere > > Add documentation that covers: > * The purpose of containerizers from a use case perspective. > * What purpose each containerizer (mesos, docker, compose) serves. > * What criteria could be used to choose a containerizer.
[jira] [Updated] (MESOS-3550) Create an Executor Library based on the new Executor HTTP API
[ https://issues.apache.org/jira/browse/MESOS-3550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Hindman updated MESOS-3550: Sprint: Mesosphere Sprint 21, Mesosphere Sprint 22, Mesosphere Sprint 23, Mesosphere Sprint 24, Mesosphere Sprint 25, Mesosphere Sprint 26 (was: Mesosphere Sprint 21, Mesosphere Sprint 22, Mesosphere Sprint 23, Mesosphere Sprint 24, Mesosphere Sprint 25) > Create an Executor Library based on the new Executor HTTP API > > > Key: MESOS-3550 > URL: https://issues.apache.org/jira/browse/MESOS-3550 > Project: Mesos > Issue Type: Task >Reporter: Anand Mazumdar >Assignee: Anand Mazumdar > Labels: mesosphere > > Similar to the Scheduler Library {{src/scheduler/scheduler.cpp}}, we need an Executor Library that speaks the new Executor HTTP API.
[jira] [Updated] (MESOS-4237) Introduce `jsonify` to stout.
[ https://issues.apache.org/jira/browse/MESOS-4237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Hindman updated MESOS-4237: Sprint: Mesosphere Sprint 25, Mesosphere Sprint 26 (was: Mesosphere Sprint 25) > Introduce `jsonify` to stout. > - > > Key: MESOS-4237 > URL: https://issues.apache.org/jira/browse/MESOS-4237 > Project: Mesos > Issue Type: Task >Reporter: Michael Park >Assignee: Michael Park > Labels: mesosphere > > This ticket is to track the {{jsonify}} function being added to stout. > A quick example: > {code} > namespace store { > struct Customer > { > std::string first_name; > std::string last_name; > int age; > }; > void json(JSON::ObjectWriter* writer, const Customer& customer) > { > writer->field("first name", customer.first_name); > writer->field("last name", customer.last_name); > writer->field("age", customer.age); > } > } // namespace store { > store::Customer customer{"michael", "park", 25}; > std::cout << jsonify(customer) << std::endl; > // prints: {"first name":"michael","last name":"park","age":25} > {code} > Refer to the design doc at MESOS-4236 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3307) Configurable size of completed task / framework history
[ https://issues.apache.org/jira/browse/MESOS-3307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam B updated MESOS-3307:
--
Assignee: Kevin Klues

> Configurable size of completed task / framework history
> ---
>
> Key: MESOS-3307
> URL: https://issues.apache.org/jira/browse/MESOS-3307
> Project: Mesos
> Issue Type: Bug
> Reporter: Ian Babrou
> Assignee: Kevin Klues
> Labels: mesosphere
>
> We try to make Mesos work with multiple frameworks and mesos-dns at the same
> time. The goal is to have a set of frameworks per team / project on a single
> Mesos cluster.
> At this point our mesos state.json is at 4 MB and it takes a while to
> assemble. 5 mesos-dns instances hit state.json every 5 seconds, effectively
> pushing mesos-master CPU usage through the roof. It's at 100%+ all the time.
> Here's the problem:
> {noformat}
> mesos λ curl -s http://mesos-master:5050/master/state.json | jq .frameworks[].completed_tasks[].framework_id | sort | uniq -c | sort -n
>       1 "20150606-001827-252388362-5050-5982-0003"
>      16 "20150606-001827-252388362-5050-5982-0005"
>      18 "20150606-001827-252388362-5050-5982-0029"
>      73 "20150606-001827-252388362-5050-5982-0007"
>     141 "20150606-001827-252388362-5050-5982-0009"
>     154 "20150820-154817-302720010-5050-15320-"
>     289 "20150606-001827-252388362-5050-5982-0004"
>     510 "20150606-001827-252388362-5050-5982-0012"
>     666 "20150606-001827-252388362-5050-5982-0028"
>     923 "20150116-002612-269165578-5050-32204-0003"
>    1000 "20150606-001827-252388362-5050-5982-0001"
>    1000 "20150606-001827-252388362-5050-5982-0006"
>    1000 "20150606-001827-252388362-5050-5982-0010"
>    1000 "20150606-001827-252388362-5050-5982-0011"
>    1000 "20150606-001827-252388362-5050-5982-0027"
> mesos λ fgrep 1000 -r src/master
> src/master/constants.cpp:const size_t MAX_REMOVED_SLAVES = 10;
> src/master/constants.cpp:const uint32_t MAX_COMPLETED_TASKS_PER_FRAMEWORK = 1000;
> {noformat}
> Active tasks are just 6% of the state.json response:
> {noformat}
> mesos λ cat ~/temp/mesos-state.json | jq -c . | wc
>       1   14796 4138942
> mesos λ cat ~/temp/mesos-state.json | jq .frameworks[].tasks | jq -c . | wc
>      16      37  252774
> {noformat}
> I see four options that can improve the situation:
> 1. Add a query string param to exclude completed tasks from state.json and use
> it in mesos-dns and similar tools. There is no need for mesos-dns to know
> about completed tasks; it's just extra load on the master and mesos-dns.
> 2. Make the history size configurable.
> 3. Make JSON serialization faster. With 1s of tasks even without history
> it would take a lot of time to serialize tasks for mesos-dns. Doing it every
> 60 seconds instead of every 5 seconds isn't really an option.
> 4. Create an event bus for the mesos master. Marathon has one and it'd be nice to
> have it in Mesos. This way mesos-dns could avoid polling master state and
> switch to listening for events.
> All can be done independently.
> Note to mesosphere folks: please start distributing debug symbols with your
> distribution. I have been asking for this for a while and it is really helpful:
> https://github.com/mesosphere/marathon/issues/1497#issuecomment-104182501
> Perf report for leading master:
> !http://i.imgur.com/iz7C3o0.png!
> I'm on 0.23.0.
[jira] [Updated] (MESOS-4262) Enable net_cls subsystem in cgroup infrastructure
[ https://issues.apache.org/jira/browse/MESOS-4262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Artem Harutyunyan updated MESOS-4262: - Sprint: Mesosphere Sprint 26 Labels: mesosphere (was: ) > Enable net_cls subsystem in cgroup infrastructure > > > Key: MESOS-4262 > URL: https://issues.apache.org/jira/browse/MESOS-4262 > Project: Mesos > Issue Type: Improvement > Components: containerization >Reporter: Avinash Sridharan >Assignee: Avinash Sridharan > Labels: mesosphere > > Currently the control group infrastructure within mesos supports only the > memory and CPU subsystems. We need to enhance this infrastructure to support > the net_cls subsystem as well. Details of the net_cls subsystem and its > use-cases can be found here: > https://www.kernel.org/doc/Documentation/cgroups/net_cls.txt > Enabling net_cls would, potentially, allow operators to regulate framework > traffic on a per-container basis.
[jira] [Updated] (MESOS-3232) Implement HTTP Basic Authentication for Mesos endpoints
[ https://issues.apache.org/jira/browse/MESOS-3232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Rojas updated MESOS-3232: --- Shepherd: Benjamin Mahler (was: Bernd Mathiske) > Implement HTTP Basic Authentication for Mesos endpoints > --- > > Key: MESOS-3232 > URL: https://issues.apache.org/jira/browse/MESOS-3232 > Project: Mesos > Issue Type: Improvement > Components: security >Reporter: Alexander Rojas >Assignee: Alexander Rojas > Labels: mesosphere, security > > Using the mechanisms implemented in MESOS-3231, implement HTTP Basic > Authentication as described in the > [RFC-2617|https://www.ietf.org/rfc/rfc2617.txt]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-1763) Add support for multiple roles to be specified in FrameworkInfo
[ https://issues.apache.org/jira/browse/MESOS-1763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bernd Mathiske updated MESOS-1763: -- Issue Type: Epic (was: Task) > Add support for multiple roles to be specified in FrameworkInfo > --- > > Key: MESOS-1763 > URL: https://issues.apache.org/jira/browse/MESOS-1763 > Project: Mesos > Issue Type: Epic > Components: master >Reporter: Vinod Kone >Assignee: Timothy Chen > Labels: mesosphere, roles > > Currently frameworks have the ability to set only one (resource) role in > FrameworkInfo. It would be nice to let frameworks specify multiple roles so > that they can do more fine grained resource accounting per role. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (MESOS-4255) Add mechanism for testing recovery of HTTP based executors
[ https://issues.apache.org/jira/browse/MESOS-4255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anand Mazumdar reassigned MESOS-4255: - Assignee: Anand Mazumdar > Add mechanism for testing recovery of HTTP based executors > -- > > Key: MESOS-4255 > URL: https://issues.apache.org/jira/browse/MESOS-4255 > Project: Mesos > Issue Type: Bug >Reporter: Anand Mazumdar >Assignee: Anand Mazumdar > Labels: mesosphere > > Currently, the slave process generates a process ID every time it is > initialized via {{process::ID::generate}} function call. This is a problem > for testing HTTP executors as it can't retry if there is a disconnection > after an agent restart since the prefix is incremented. > {code} > Agent PID before: > slave(1)@127.0.0.1:43915 > Agent PID after restart: > slave(2)@127.0.0.1:43915 > {code} > There are a couple of ways to fix this: > - Add a constructor to {{Slave}} exclusively for testing that passes on a > fixed {{ID}} instead of relying on {{ID::generate}}. > - Currently we delegate to slave(1)@ i.e. (1) when nothing is specified as > the URL in libprocess i.e. {{127.0.0.1:43915/api/v1/executor}} would delegate > to {{slave(1)@127.0.0.1:43915/api/v1/executor}}. Instead of defaulting to > (1), we can default to the last known active ID. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4209) Document "how to program with dynamic reservations and persistent volumes"
[ https://issues.apache.org/jira/browse/MESOS-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Artem Harutyunyan updated MESOS-4209: - Assignee: Neil Conway > Document "how to program with dynamic reservations and persistent volumes" > -- > > Key: MESOS-4209 > URL: https://issues.apache.org/jira/browse/MESOS-4209 > Project: Mesos > Issue Type: Documentation > Components: documentation >Reporter: Neil Conway >Assignee: Neil Conway > Labels: documentation, mesosphere, persistent-volumes > > Specifically, some of the gotchas around: > * Retrying reservation attempts after a timeout > * Fuzzy-matching resources to determine whether a reservation/PV is successful > * Representing client state as a state machine and repeatedly moving "toward" > successful terminal states > Should also point to the persistent volume example framework. We should also ask > Gabriel and others (Arango?) who have built frameworks with PVs/DRs for > feedback.
[jira] [Created] (MESOS-4284) Draft design doc for multi-role frameworks
Bernd Mathiske created MESOS-4284: - Summary: Draft design doc for multi-role frameworks Key: MESOS-4284 URL: https://issues.apache.org/jira/browse/MESOS-4284 Project: Mesos Issue Type: Story Components: master Reporter: Bernd Mathiske Assignee: Benjamin Bannier Create a document that describes the problems with having only single-role frameworks and proposes an MVP solution and implementation approach. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4286) Expose state(.json) as a structured protobuf
Sargun Dhillon created MESOS-4286: - Summary: Expose state(.json) as a structured protobuf Key: MESOS-4286 URL: https://issues.apache.org/jira/browse/MESOS-4286 Project: Mesos Issue Type: Wish Reporter: Sargun Dhillon Priority: Minor State.json, on both the agent and the master, exposes information about the current state of the Mesos runtime. This information is very valuable to external users such as Mesos-DNS. Unfortunately, working with state.json can at times become cumbersome in languages where JSON handling isn't a first-class construct. Fortunately, protocol buffers exist. If state.json were exposed as a protocol buffer, it would make the lives of software authors in the Mesos ecosystem significantly easier.
[jira] [Commented] (MESOS-3413) Docker containerizer does not symlink persistent volumes into sandbox
[ https://issues.apache.org/jira/browse/MESOS-3413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15081819#comment-15081819 ] Zhitao Li commented on MESOS-3413: -- [~jieyu] and [~haosd...@gmail.com], I've put up https://reviews.apache.org/r/41892 for a first pass at unblocking DockerContainerizer users to use persistent volumes. Please let me know what you think. > Docker containerizer does not symlink persistent volumes into sandbox > - > > Key: MESOS-3413 > URL: https://issues.apache.org/jira/browse/MESOS-3413 > Project: Mesos > Issue Type: Bug > Components: containerization, docker, slave >Affects Versions: 0.23.0 >Reporter: Max Neunhöffer >Assignee: haosdent > Original Estimate: 1h > Remaining Estimate: 1h > > For the ArangoDB framework I am trying to use the persistent primitives. > Nearly everything is working, but I am missing a crucial piece at the end: I have > successfully created a persistent disk resource and have set the persistence > and volume information in the DiskInfo message. However, I do not see any way > to find out which directory on the host the mesos slave has reserved for us. I > know it is ${MESOS_SLAVE_WORKDIR}/volumes/roles//_ but we > have no way to query this information anywhere. The docker containerizer does > not automatically mount this directory into our docker container, or symlink > it into our sandbox. Therefore, I have essentially no access to it. Note that > the mesos containerizer (which I cannot use for other reasons) seems to > create a symlink in the sandbox to the actual path for the persistent volume. > With that, I could mount the volume into our docker container and all would > be well.
[jira] [Created] (MESOS-4287) Extract stout
Axel Etcheverry created MESOS-4287: -- Summary: Extract stout Key: MESOS-4287 URL: https://issues.apache.org/jira/browse/MESOS-4287 Project: Mesos Issue Type: Improvement Components: stout Reporter: Axel Etcheverry Priority: Minor Is it possible to extract the stout library? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4214) Introduce HTTP endpoint /weights for updating weight
[ https://issues.apache.org/jira/browse/MESOS-4214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam B updated MESOS-4214: -- Sprint: Mesosphere Sprint 26 > Introduce HTTP endpoint /weights for updating weight > > > Key: MESOS-4214 > URL: https://issues.apache.org/jira/browse/MESOS-4214 > Project: Mesos > Issue Type: Task >Reporter: Yongqiao Wang >Assignee: Yongqiao Wang > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3831) Document operator HTTP endpoints
[ https://issues.apache.org/jira/browse/MESOS-3831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-3831: --- Shepherd: Neil Conway Assignee: Benjamin Mahler Sprint: Mesosphere Sprint 26 > Document operator HTTP endpoints > > > Key: MESOS-3831 > URL: https://issues.apache.org/jira/browse/MESOS-3831 > Project: Mesos > Issue Type: Documentation > Components: documentation >Reporter: Neil Conway >Assignee: Benjamin Mahler >Priority: Minor > Labels: documentation, mesosphere, newbie > > These are not exhaustively documented; they probably should be. > Some endpoints have docs: e.g., {{/reserve}} and {{/unreserve}} are described > in the reservation doc page. But it would be good to have a single page that > lists all the endpoints and their semantics. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4271) Consider replacing libtool with dolt to speed up build
[ https://issues.apache.org/jira/browse/MESOS-4271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Bannier updated MESOS-4271:
Sprint: Mesosphere Sprint 26

> Consider replacing libtool with dolt to speed up build
> --
>
> Key: MESOS-4271
> URL: https://issues.apache.org/jira/browse/MESOS-4271
> Project: Mesos
> Issue Type: Improvement
> Reporter: Benjamin Bannier
> Assignee: Benjamin Bannier
> Priority: Minor
> Labels: build
>
> Mesos uses a pretty standard autotools setup for the build so that
> {{libtool}} is used extensively to abstract away the aspects of library
> creation (both compiling source files, and creating the libraries). For some
> versions of {{libtool}} its invocation can add considerably to the overall
> build time.
> Dolt provides a much more condensed implementation of {{libtool}}'s
> functionality for modern platforms (<100 locs vs ~10 klocs), so that it can
> run much faster. We should investigate whether activating dolt makes sense.
> I tested dolt under OS X 10.10.5. I first primed ccache and then rebuilt
> mesos-related objects,
> {code}
> ./configure --disable-python --disable-java # benchmark mostly C & C++ file compile and link
> make check GTEST_FILTER=''                  # prime ccache
> make mostlyclean                            # remove most mesos objects and libs
> make -jN check GTEST_FILTER=''              # rebuild
> {code}
> || || user [s] || real [s] || sys [s] ||
> | make -j10 (dolt) | 42.8±0.1 | 54.3±0.2 | 34.1±0.2 |
> | make -j10 (libtool) | 65.6±0.3 | 148.7±1.1 | 108.5±1.0 |
> | make -j1 (dolt) | 76.9±0.3 | 45.5±0.1 | 27.1±0.1 |
> | make -j1 (libtool) | 168.2±2.3 | 97.5±1.5 | 75.8±1.3 |
[jira] [Updated] (MESOS-4277) Provide constexpr Duration::min() and max()
[ https://issues.apache.org/jira/browse/MESOS-4277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Bannier updated MESOS-4277: Sprint: Mesosphere Sprint 26 > Provide constexpr Duration::min() and max() > --- > > Key: MESOS-4277 > URL: https://issues.apache.org/jira/browse/MESOS-4277 > Project: Mesos > Issue Type: Improvement > Components: stout, technical debt >Reporter: Benjamin Bannier >Assignee: Benjamin Bannier >Priority: Minor > > {{Duration}} could be implemented so that it can provide {{constexpr}} > {{min}} and {{max}} functions. > This addresses an existing {{TODO}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
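A rough illustration of what `constexpr` `min` and `max` could look like follows. `SimpleDuration` is a hypothetical, stripped-down stand-in for stout's `Duration` (which has many more members); the only assumption taken from the tickets in this thread is that the value is stored as an `int64_t` nanosecond count.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical, simplified stand-in for stout's Duration.
class SimpleDuration
{
public:
  // A constexpr constructor makes the class a literal type, which is
  // what allows min() and max() below to be constexpr.
  explicit constexpr SimpleDuration(std::int64_t value) : nanos(value) {}

  static constexpr SimpleDuration min()
  {
    return SimpleDuration(std::numeric_limits<std::int64_t>::min());
  }

  static constexpr SimpleDuration max()
  {
    return SimpleDuration(std::numeric_limits<std::int64_t>::max());
  }

  constexpr std::int64_t ns() const { return nanos; }

private:
  std::int64_t nanos;
};

// Both bounds are usable in constant expressions.
static_assert(SimpleDuration::min().ns() < 0, "min is negative");
static_assert(SimpleDuration::max().ns() > 0, "max is positive");
```

Using `std::numeric_limits<std::int64_t>` rather than the `LLONG_MIN` / `LLONG_MAX` macros keeps the bounds tied to the storage type.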
[jira] [Updated] (MESOS-4273) Replace variadic List constructor with one taking a initializer_list
[ https://issues.apache.org/jira/browse/MESOS-4273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Bannier updated MESOS-4273: Sprint: Mesosphere Sprint 26 > Replace variadic List constructor with one taking a initializer_list > > > Key: MESOS-4273 > URL: https://issues.apache.org/jira/browse/MESOS-4273 > Project: Mesos > Issue Type: Improvement > Components: stout >Reporter: Benjamin Bannier >Assignee: Benjamin Bannier >Priority: Minor > > {{List}} provides a variadic constructor currently implemented with some > preprocessor magic. Given that we already require C++11 we can replace that > one with a much simpler one just taking a {{std::initializer_list}}. This > would change the invocations, > {code} > auto l1 = List(1, 2, 3);// now > auto l2 = List({1, 2, 3}); // proposed > {code} > This addresses an existing {{TODO}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
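As a sketch of the proposed constructor shape: `SimpleList` below is a hypothetical stand-in, assuming (as an illustration only) a list type that derives from `std::list<T>`. It is not the stout `List` itself.

```cpp
#include <cassert>
#include <initializer_list>
#include <list>

// Hypothetical stand-in for a stout-style List wrapper.
template <typename T>
class SimpleList : public std::list<T>
{
public:
  SimpleList() {}

  // One constructor replaces a family of preprocessor-generated
  // variadic constructors; the elements are copied into the base list.
  SimpleList(std::initializer_list<T> items) : std::list<T>(items) {}
};
```

A call site would then read `SimpleList<int> l({1, 2, 3});`, matching the "proposed" form in the ticket.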
[jira] [Updated] (MESOS-4275) Duration uses fixed-width types inconsistently
[ https://issues.apache.org/jira/browse/MESOS-4275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Bannier updated MESOS-4275: Sprint: Mesosphere Sprint 26 > Duration uses fixed-width types inconsistently > -- > > Key: MESOS-4275 > URL: https://issues.apache.org/jira/browse/MESOS-4275 > Project: Mesos > Issue Type: Bug > Components: stout >Reporter: Benjamin Bannier >Assignee: Benjamin Bannier > > The implementation of the {{Duration}} class correctly uses fixed-width types > (here {{int64_t}}) for portability internally, but uses {{long}} types in a > few places (in particular {{LLONG_MIN}} and {{LLONG_MAX}}). This is > inconsistent on 64-bit platforms, and probably incorrect on 32-bit as there > {{long}} is 32 bit wide. > Additionally, the longer {{Duration}} types ({{Minutes}}, {{Hours}}, > {{Days}}, and {{Weeks}}) construct from {{int32_t}}, while shorter ones take > {{int64_t}}. Probably as a left-over this is matched with a redundant > {{Duration}} constructor taking an {{int32_t}} value where the other one > taking an {{int64_t}} value would be sufficient. It should be safe to just > construct from {{int64_t}} in all places. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4278) Constrain types used to instantiate Flags objects
[ https://issues.apache.org/jira/browse/MESOS-4278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Bannier updated MESOS-4278: Sprint: Mesosphere Sprint 26 > Constrain types used to instantiate Flags objects > - > > Key: MESOS-4278 > URL: https://issues.apache.org/jira/browse/MESOS-4278 > Project: Mesos > Issue Type: Improvement > Components: stout, technical debt >Reporter: Benjamin Bannier >Assignee: Benjamin Bannier > > stout's {{Flags}} can be instantiated with a number of base flags provided by > the caller as template arguments; these are then inherited from by the > created {{Flags}} instance. > To ensure the expected semantics we could constrain the template arguments to > ones derived from {{FlagsBase}}. > This addresses an existing {{TODO}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
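One way such a constraint might be expressed is with `static_assert` and `std::is_base_of`. The sketch below is an assumption about the approach, not the change that was made in stout: `FlagsBase` and `Flags` are reduced to hypothetical two-base stand-ins, and `LoggingFlags` and `HttpFlags` are invented example types.

```cpp
#include <type_traits>

// Hypothetical minimal stand-in for stout's FlagsBase.
class FlagsBase
{
public:
  virtual ~FlagsBase() {}
};

// Two-base version for brevity. Each template argument is inherited from,
// and each must itself derive from FlagsBase.
template <typename Flags1, typename Flags2>
class Flags : public virtual Flags1, public virtual Flags2
{
  static_assert(std::is_base_of<FlagsBase, Flags1>::value,
                "Flags1 must derive from FlagsBase");
  static_assert(std::is_base_of<FlagsBase, Flags2>::value,
                "Flags2 must derive from FlagsBase");
};

// Invented example flag sets.
class LoggingFlags : public virtual FlagsBase
{
public:
  bool quiet = false;
};

class HttpFlags : public virtual FlagsBase
{
public:
  int port = 5050;
};
```

Instantiating `Flags<LoggingFlags, HttpFlags>` compiles, while something like `Flags<LoggingFlags, int>` would be rejected at compile time (here already because `int` cannot be a base class; the `static_assert` additionally catches class types unrelated to `FlagsBase`).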
[jira] [Updated] (MESOS-4276) Remove duplicate Mesos constructor
[ https://issues.apache.org/jira/browse/MESOS-4276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Bannier updated MESOS-4276: Sprint: Mesosphere Sprint 26 > Remove duplicate Mesos constructor > - > > Key: MESOS-4276 > URL: https://issues.apache.org/jira/browse/MESOS-4276 > Project: Mesos > Issue Type: Improvement > Components: technical debt >Reporter: Benjamin Bannier >Assignee: Benjamin Bannier > > {{Mesos}} offers two almost-identical constructors > {code} > // TODO(vinod): Remove this in favor of the below constructor. > Mesos(const std::string& master, > const std::function& connected, > const std::function & disconnected, > const std::function & received); > Mesos(const std::string& master, > ContentType contentType, > const std::function & connected, > const std::function & disconnected, > const std::function & received); > {code} > Here invocations of the first constructor can be replaced trivially with > invocations of the second one with {{contentType = ContentType::PROTOBUF}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
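The equivalence the ticket relies on can be illustrated with a small sketch. `Client` is a hypothetical stand-in for the `Mesos` class, reduced to a single callback whose `void()` signature is an assumption (the callback types are not legible in the quoted code); the legacy constructor simply forwards to the full one with `ContentType::PROTOBUF`.

```cpp
#include <cassert>
#include <functional>
#include <string>

enum class ContentType { PROTOBUF, JSON };

// Hypothetical stand-in for the Mesos class, reduced to one callback.
class Client
{
public:
  // Legacy form: behaves exactly like passing ContentType::PROTOBUF.
  Client(const std::string& master, const std::function<void()>& connected)
    : Client(master, ContentType::PROTOBUF, connected) {}

  // Full form: the only constructor that would remain after the cleanup.
  Client(const std::string& master,
         ContentType contentType,
         const std::function<void()>& connected)
    : master_(master), contentType_(contentType), connected_(connected)
  {
    assert(static_cast<bool>(connected_)); // A callback must be provided.
  }

  ContentType contentType() const { return contentType_; }
  const std::string& master() const { return master_; }

private:
  std::string master_;
  ContentType contentType_;
  std::function<void()> connected_;
};
```

Since the legacy constructor adds nothing beyond the default content type, each call site can pass `ContentType::PROTOBUF` explicitly and the legacy overload can be dropped.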
[jira] [Updated] (MESOS-4208) PersistentVolumeTest.BadACLDropCreateAndDestroy is flaky
[ https://issues.apache.org/jira/browse/MESOS-4208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Hindman updated MESOS-4208: Shepherd: Jie Yu Sprint: Mesosphere Sprint 26 > PersistentVolumeTest.BadACLDropCreateAndDestroy is flaky > > > Key: MESOS-4208 > URL: https://issues.apache.org/jira/browse/MESOS-4208 > Project: Mesos > Issue Type: Bug >Reporter: Jie Yu >Assignee: Greg Mann > Labels: flaky-test, mesosphere, persistent-volumes > > {noformat} > [ RUN ] PersistentVolumeTest.BadACLDropCreateAndDestroy > I1219 09:51:32.623245 31878 leveldb.cpp:174] Opened db in 4.393596ms > I1219 09:51:32.624084 31878 leveldb.cpp:181] Compacted db in 709447ns > I1219 09:51:32.624186 31878 leveldb.cpp:196] Created db iterator in 21252ns > I1219 09:51:32.624290 31878 leveldb.cpp:202] Seeked to beginning of db in > 11391ns > I1219 09:51:32.624378 31878 leveldb.cpp:271] Iterated through 0 keys in the > db in 611ns > I1219 09:51:32.624505 31878 replica.cpp:779] Replica recovered with log > positions 0 -> 0 with 1 holes and 0 unlearned > I1219 09:51:32.625195 31904 recover.cpp:447] Starting replica recovery > I1219 09:51:32.625641 31904 recover.cpp:473] Replica is in EMPTY status > I1219 09:51:32.627305 31904 replica.cpp:673] Replica in EMPTY status received > a broadcasted recover request from (6740)@172.17.0.3:36408 > I1219 09:51:32.627749 31904 recover.cpp:193] Received a recover response from > a replica in EMPTY status > I1219 09:51:32.628330 31904 recover.cpp:564] Updating replica status to > STARTING > I1219 09:51:32.629068 31906 leveldb.cpp:304] Persisting metadata (8 bytes) to > leveldb took 410494ns > I1219 09:51:32.629169 31906 replica.cpp:320] Persisted replica status to > STARTING > I1219 09:51:32.629598 31906 recover.cpp:473] Replica is in STARTING status > I1219 09:51:32.630782 31912 replica.cpp:673] Replica in STARTING status > received a broadcasted recover request from (6741)@172.17.0.3:36408 > I1219 09:51:32.631166 31901 recover.cpp:193] 
Received a recover response from > a replica in STARTING status > I1219 09:51:32.632467 31902 recover.cpp:564] Updating replica status to VOTING > I1219 09:51:32.633600 31907 leveldb.cpp:304] Persisting metadata (8 bytes) to > leveldb took 311370ns > I1219 09:51:32.633627 31907 replica.cpp:320] Persisted replica status to > VOTING > I1219 09:51:32.633719 31907 recover.cpp:578] Successfully joined the Paxos > group > I1219 09:51:32.633874 31907 recover.cpp:462] Recover process terminated > I1219 09:51:32.636409 31909 master.cpp:365] Master > bded856d-1c7f-4fad-a8bc-3629ba8c59d3 (60ab6e727501) started on > 172.17.0.3:36408 > I1219 09:51:32.636593 31909 master.cpp:367] Flags at startup: > --acls="create_volumes { > principals { > values: "creator-principal" > } > volume_types { > type: ANY > } > } > create_volumes { > principals { > type: ANY > } > volume_types { > type: NONE > } > } > " --allocation_interval="1secs" --allocator="HierarchicalDRF" > --authenticate="false" --authenticate_slaves="true" > --authenticators="crammd5" --authorizers="local" > --credentials="/tmp/SpPF7B/credentials" --framework_sorter="drf" > --help="false" --hostname_lookup="true" --initialize_driver_logging="true" > --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" > --max_slave_ping_timeouts="5" --quiet="false" > --recovery_slave_removal_limit="100%" --registry="replicated_log" > --registry_fetch_timeout="1mins" --registry_store_timeout="25secs" > --registry_strict="true" --roles="role1" --root_submissions="true" > --slave_ping_timeout="15secs" --slave_reregister_timeout="10mins" > --user_sorter="drf" --version="false" > --webui_dir="/mesos/mesos-0.27.0/_inst/share/mesos/webui" > --work_dir="/tmp/SpPF7B/master" --zk_session_timeout="10secs" > I1219 09:51:32.637055 31909 master.cpp:414] Master allowing unauthenticated > frameworks to register > I1219 09:51:32.637068 31909 master.cpp:417] Master only allowing > authenticated slaves to register > I1219 09:51:32.637094 31909 
credentials.hpp:35] Loading credentials for > authentication from '/tmp/SpPF7B/credentials' > I1219 09:51:32.637403 31909 master.cpp:456] Using default 'crammd5' > authenticator > I1219 09:51:32.637555 31909 master.cpp:493] Authorization enabled > W1219 09:51:32.637575 31909 master.cpp:553] The '--roles' flag is deprecated. > This flag will be removed in the future. See the Mesos 0.27 upgrade notes for > more information > I1219 09:51:32.637806 31897 whitelist_watcher.cpp:77] No whitelist given > I1219 09:51:32.637820 31910 hierarchical.cpp:147] Initialized hierarchical > allocator process > I1219 09:51:32.639677 31909 master.cpp:1629] The newly elected leader is > master@172.17.0.3:36408 with id
[jira] [Updated] (MESOS-4255) Add mechanism for testing recovery of HTTP based executors
[ https://issues.apache.org/jira/browse/MESOS-4255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anand Mazumdar updated MESOS-4255: -- Shepherd: Vinod Kone Sprint: Mesosphere Sprint 26 > Add mechanism for testing recovery of HTTP based executors > -- > > Key: MESOS-4255 > URL: https://issues.apache.org/jira/browse/MESOS-4255 > Project: Mesos > Issue Type: Bug >Reporter: Anand Mazumdar >Assignee: Anand Mazumdar > Labels: mesosphere > > Currently, the slave process generates a process ID every time it is > initialized via {{process::ID::generate}} function call. This is a problem > for testing HTTP executors as it can't retry if there is a disconnection > after an agent restart since the prefix is incremented. > {code} > Agent PID before: > slave(1)@127.0.0.1:43915 > Agent PID after restart: > slave(2)@127.0.0.1:43915 > {code} > There are a couple of ways to fix this: > - Add a constructor to {{Slave}} exclusively for testing that passes on a > fixed {{ID}} instead of relying on {{ID::generate}}. > - Currently we delegate to slave(1)@ i.e. (1) when nothing is specified as > the URL in libprocess i.e. {{127.0.0.1:43915/api/v1/executor}} would delegate > to {{slave(1)@127.0.0.1:43915/api/v1/executor}}. Instead of defaulting to > (1), we can default to the last known active ID. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
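The numbering behaviour and the first proposed fix can be sketched as below. This is a hypothetical re-implementation for illustration, not libprocess code: `generate` only mimics the `prefix(N)` scheme shown in the ticket, `Agent` stands in for `Slave`, and the sketch is not thread-safe.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical imitation of the "prefix(N)" numbering described above.
// Each call with the same prefix yields the next number.
std::string generate(const std::string& prefix)
{
  assert(!prefix.empty());
  static std::map<std::string, int> counters;
  const int id = ++counters[prefix];
  return prefix + "(" + std::to_string(id) + ")";
}

// Hypothetical stand-in for the agent ("Slave") process.
class Agent
{
public:
  // Default: a fresh ID on every construction, so a restart changes the PID.
  Agent() : id_(generate("slave")) {}

  // Test-only option from the first proposal: reuse a fixed ID.
  explicit Agent(const std::string& id) : id_(id) {}

  const std::string& id() const { return id_; }

private:
  std::string id_;
};
```

In a test, constructing the restarted agent with the fixed ID keeps the address the executor already knows, so its retry can reach the new instance.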
[jira] [Commented] (MESOS-2222) Add ACLs for the maintenance HTTP endpoints.
[ https://issues.apache.org/jira/browse/MESOS-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15081703#comment-15081703 ] Greg Mann commented on MESOS-: -- [~chenzhiwei], what's the status of this ticket? No activity on the review since September. I've recently done some authorization tickets, so if you're not able to continue work on it, I have good context on this and would be happy to take it over. > Add ACLs for the maintenance HTTP endpoints. > > > Key: MESOS- > URL: https://issues.apache.org/jira/browse/MESOS- > Project: Mesos > Issue Type: Task >Reporter: Benjamin Mahler >Assignee: Chen Zhiwei > > In order to authorize the HTTP endpoints for maintenance (to be added in > MESOS-2067), we will need to add an ACL definition for performing maintenance > operations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3787) As a developer, I'd like to be able to expand environment variables through the Docker executor.
[ https://issues.apache.org/jira/browse/MESOS-3787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Hindman updated MESOS-3787: Shepherd: Benjamin Hindman Assignee: Adam B Sprint: Mesosphere Sprint 26 > As a developer, I'd like to be able to expand environment variables through > the Docker executor. > > > Key: MESOS-3787 > URL: https://issues.apache.org/jira/browse/MESOS-3787 > Project: Mesos > Issue Type: Wish >Reporter: John Garcia >Assignee: Adam B > Labels: mesosphere > Attachments: mesos.patch, test-example.json > > > We'd like to have expanded variables usable in [the json files used to create > a Marathon app, hence] the Task's CommandInfo, so that the executor is able > to detect the correct values at runtime. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4218) Test for Quota Status Endpoint
[ https://issues.apache.org/jira/browse/MESOS-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van Remoortere updated MESOS-4218: Sprint: Mesosphere Sprint 26 > Test for Quota Status Endpoint > -- > > Key: MESOS-4218 > URL: https://issues.apache.org/jira/browse/MESOS-4218 > Project: Mesos > Issue Type: Bug >Reporter: Joerg Schad >Assignee: Joerg Schad > Labels: mesosphere, quota > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4263) Report volume usage through ResourceStatistics.
[ https://issues.apache.org/jira/browse/MESOS-4263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Artem Harutyunyan updated MESOS-4263: - Shepherd: Jie Yu Sprint: Mesosphere Sprint 26 > Report volume usage through ResourceStatistics. > --- > > Key: MESOS-4263 > URL: https://issues.apache.org/jira/browse/MESOS-4263 > Project: Mesos > Issue Type: Bug >Reporter: Artem Harutyunyan >Assignee: Artem Harutyunyan > Labels: mesosphere > > POSIX disk isolator does not currently report volume usage through > ResourceStatistics. {{PosixDiskIsolatorProcess::usage()}} should be amended > to take into account volume usage as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4207) Add an example bug due to a lack of defer() to the defer() documentation
[ https://issues.apache.org/jira/browse/MESOS-4207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neil Conway updated MESOS-4207: --- Sprint: Mesosphere Sprint 26 > Add an example bug due to a lack of defer() to the defer() documentation > > > Key: MESOS-4207 > URL: https://issues.apache.org/jira/browse/MESOS-4207 > Project: Mesos > Issue Type: Documentation >Reporter: Greg Mann >Assignee: Greg Mann >Priority: Minor > Labels: documentation, libprocess, mesosphere > > In the past, some bugs have been introduced into the codebase due to a lack > of {{defer()}} where it should have been used. It would be useful to add an > example of this to the {{defer()}} documentation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (MESOS-4263) Report volume usage through ResourceStatistics.
[ https://issues.apache.org/jira/browse/MESOS-4263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Artem Harutyunyan reassigned MESOS-4263: Assignee: Artem Harutyunyan > Report volume usage through ResourceStatistics. > --- > > Key: MESOS-4263 > URL: https://issues.apache.org/jira/browse/MESOS-4263 > Project: Mesos > Issue Type: Bug >Reporter: Artem Harutyunyan >Assignee: Artem Harutyunyan > Labels: mesosphere > > POSIX disk isolator does not currently report volume usage through > ResourceStatistics. {{PosixDiskIsolatorProcess::usage()}} should be amended > to take into account volume usage as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4029) ContentType/SchedulerTest is flaky.
[ https://issues.apache.org/jira/browse/MESOS-4029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Artem Harutyunyan updated MESOS-4029: - Sprint: Mesosphere Sprint 23, Mesosphere Sprint 26 (was: Mesosphere Sprint 23) > ContentType/SchedulerTest is flaky. > --- > > Key: MESOS-4029 > URL: https://issues.apache.org/jira/browse/MESOS-4029 > Project: Mesos > Issue Type: Bug >Affects Versions: 0.26.0 >Reporter: Till Toenshoff >Assignee: Artem Harutyunyan > Labels: flaky, flaky-test, mesosphere > > SSL build, [Ubuntu > 14.04|https://github.com/tillt/mesos-vagrant-ci/blob/master/ubuntu14/setup.sh], > non-root test run. > {noformat} > [--] 22 tests from ContentType/SchedulerTest > [ RUN ] ContentType/SchedulerTest.Subscribe/0 > [ OK ] ContentType/SchedulerTest.Subscribe/0 (48 ms) > *** Aborted at 1448928007 (unix time) try "date -d @1448928007" if you are > using GNU date *** > [ RUN ] ContentType/SchedulerTest.Subscribe/1 > PC: @ 0x1451b8e > testing::internal::UntypedFunctionMockerBase::UntypedInvokeWith() > *** SIGSEGV (@0x10030) received by PID 21320 (TID 0x2b549e5d4700) from > PID 48; stack trace: *** > @ 0x2b54c95940b7 os::Linux::chained_handler() > @ 0x2b54c9598219 JVM_handle_linux_signal > @ 0x2b5496300340 (unknown) > @ 0x1451b8e > testing::internal::UntypedFunctionMockerBase::UntypedInvokeWith() > @ 0xe2ea6d > _ZN7testing8internal18FunctionMockerBaseIFvRKSt5queueIN5mesos2v19scheduler5EventESt5dequeIS6_SaIS6_E10InvokeWithERKSt5tupleIJSC_EE > @ 0xe2b1bc testing::internal::FunctionMocker<>::Invoke() > @ 0x1118aed > mesos::internal::tests::SchedulerTest::Callbacks::received() > @ 0x111c453 > _ZNKSt7_Mem_fnIMN5mesos8internal5tests13SchedulerTest9CallbacksEFvRKSt5queueINS0_2v19scheduler5EventESt5dequeIS8_SaIS8_EclIJSE_EvEEvRS4_DpOT_ > @ 0x111c001 > 
_ZNSt5_BindIFSt7_Mem_fnIMN5mesos8internal5tests13SchedulerTest9CallbacksEFvRKSt5queueINS1_2v19scheduler5EventESt5dequeIS9_SaIS9_ESt17reference_wrapperIS5_ESt12_PlaceholderILi16__callIvJSF_EJLm0ELm1T_OSt5tupleIJDpT0_EESt12_Index_tupleIJXspT1_EEE > @ 0x111b90d > _ZNSt5_BindIFSt7_Mem_fnIMN5mesos8internal5tests13SchedulerTest9CallbacksEFvRKSt5queueINS1_2v19scheduler5EventESt5dequeIS9_SaIS9_ESt17reference_wrapperIS5_ESt12_PlaceholderILi1clIJSF_EvEET0_DpOT_ > @ 0x111ae09 std::_Function_handler<>::_M_invoke() > @ 0x2b5493c6da09 std::function<>::operator()() > @ 0x2b5493c688ee process::AsyncExecutorProcess::execute<>() > @ 0x2b5493c6db2a > _ZZN7process8dispatchI7NothingNS_20AsyncExecutorProcessERKSt8functionIFvRKSt5queueIN5mesos2v19scheduler5EventESt5dequeIS8_SaIS8_ESC_PvSG_SC_SJ_EENS_6FutureIT_EERKNS_3PIDIT0_EEMSO_FSL_T1_T2_T3_ET4_T5_T6_ENKUlPNS_11ProcessBaseEE_clES11_ > @ 0x2b5493c765a4 > _ZNSt17_Function_handlerIFvPN7process11ProcessBaseEEZNS0_8dispatchI7NothingNS0_20AsyncExecutorProcessERKSt8functionIFvRKSt5queueIN5mesos2v19scheduler5EventESt5dequeISC_SaISC_ESG_PvSK_SG_SN_EENS0_6FutureIT_EERKNS0_3PIDIT0_EEMSS_FSP_T1_T2_T3_ET4_T5_T6_EUlS2_E_E9_M_invokeERKSt9_Any_dataS2_ > @ 0x2b54946b1201 std::function<>::operator()() > @ 0x2b549469960f process::ProcessBase::visit() > @ 0x2b549469d480 process::DispatchEvent::visit() > @ 0x9dc0ba process::ProcessBase::serve() > @ 0x2b54946958cc process::ProcessManager::resume() > @ 0x2b5494692a9c > _ZZN7process14ProcessManager12init_threadsEvENKUlRKSt11atomic_boolE_clES3_ > @ 0x2b549469ccac > _ZNSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS3_EEE6__callIvIEILm0T_OSt5tupleIIDpT0_EESt12_Index_tupleIIXspT1_EEE > @ 0x2b549469cc5c > _ZNSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS3_EEEclIIEvEET0_DpOT_ > @ 0x2b549469cbee > 
_ZNSt12_Bind_simpleIFSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS4_EEEvEE9_M_invokeIIEEEvSt12_Index_tupleIIXspT_EEE > @ 0x2b549469cb45 > _ZNSt12_Bind_simpleIFSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS4_EEEvEEclEv > @ 0x2b549469cade > _ZNSt6thread5_ImplISt12_Bind_simpleIFSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS6_EEEvEEE6_M_runEv > @ 0x2b5495b81a40 (unknown) > @ 0x2b54962f8182 start_thread > @ 0x2b549660847d (unknown) > make[3]: *** [check-local] Segmentation fault > make[3]: Leaving directory `/home/vagrant/mesos/build/src' > make[2]: *** [check-am] Error 2 > make[2]: Leaving directory
[jira] [Updated] (MESOS-4029) ContentType/SchedulerTest is flaky.
[ https://issues.apache.org/jira/browse/MESOS-4029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Artem Harutyunyan updated MESOS-4029: - Shepherd: Joris Van Remoortere (was: Benjamin Mahler) > ContentType/SchedulerTest is flaky. > --- > > Key: MESOS-4029 > URL: https://issues.apache.org/jira/browse/MESOS-4029 > Project: Mesos > Issue Type: Bug >Affects Versions: 0.26.0 >Reporter: Till Toenshoff >Assignee: Anand Mazumdar > Labels: flaky, flaky-test, mesosphere > > SSL build, [Ubuntu > 14.04|https://github.com/tillt/mesos-vagrant-ci/blob/master/ubuntu14/setup.sh], > non-root test run. > {noformat} > [--] 22 tests from ContentType/SchedulerTest > [ RUN ] ContentType/SchedulerTest.Subscribe/0 > [ OK ] ContentType/SchedulerTest.Subscribe/0 (48 ms) > *** Aborted at 1448928007 (unix time) try "date -d @1448928007" if you are > using GNU date *** > [ RUN ] ContentType/SchedulerTest.Subscribe/1 > PC: @ 0x1451b8e > testing::internal::UntypedFunctionMockerBase::UntypedInvokeWith() > *** SIGSEGV (@0x10030) received by PID 21320 (TID 0x2b549e5d4700) from > PID 48; stack trace: *** > @ 0x2b54c95940b7 os::Linux::chained_handler() > @ 0x2b54c9598219 JVM_handle_linux_signal > @ 0x2b5496300340 (unknown) > @ 0x1451b8e > testing::internal::UntypedFunctionMockerBase::UntypedInvokeWith() > @ 0xe2ea6d > _ZN7testing8internal18FunctionMockerBaseIFvRKSt5queueIN5mesos2v19scheduler5EventESt5dequeIS6_SaIS6_E10InvokeWithERKSt5tupleIJSC_EE > @ 0xe2b1bc testing::internal::FunctionMocker<>::Invoke() > @ 0x1118aed > mesos::internal::tests::SchedulerTest::Callbacks::received() > @ 0x111c453 > _ZNKSt7_Mem_fnIMN5mesos8internal5tests13SchedulerTest9CallbacksEFvRKSt5queueINS0_2v19scheduler5EventESt5dequeIS8_SaIS8_EclIJSE_EvEEvRS4_DpOT_ > @ 0x111c001 > 
_ZNSt5_BindIFSt7_Mem_fnIMN5mesos8internal5tests13SchedulerTest9CallbacksEFvRKSt5queueINS1_2v19scheduler5EventESt5dequeIS9_SaIS9_ESt17reference_wrapperIS5_ESt12_PlaceholderILi16__callIvJSF_EJLm0ELm1T_OSt5tupleIJDpT0_EESt12_Index_tupleIJXspT1_EEE > @ 0x111b90d > _ZNSt5_BindIFSt7_Mem_fnIMN5mesos8internal5tests13SchedulerTest9CallbacksEFvRKSt5queueINS1_2v19scheduler5EventESt5dequeIS9_SaIS9_ESt17reference_wrapperIS5_ESt12_PlaceholderILi1clIJSF_EvEET0_DpOT_ > @ 0x111ae09 std::_Function_handler<>::_M_invoke() > @ 0x2b5493c6da09 std::function<>::operator()() > @ 0x2b5493c688ee process::AsyncExecutorProcess::execute<>() > @ 0x2b5493c6db2a > _ZZN7process8dispatchI7NothingNS_20AsyncExecutorProcessERKSt8functionIFvRKSt5queueIN5mesos2v19scheduler5EventESt5dequeIS8_SaIS8_ESC_PvSG_SC_SJ_EENS_6FutureIT_EERKNS_3PIDIT0_EEMSO_FSL_T1_T2_T3_ET4_T5_T6_ENKUlPNS_11ProcessBaseEE_clES11_ > @ 0x2b5493c765a4 > _ZNSt17_Function_handlerIFvPN7process11ProcessBaseEEZNS0_8dispatchI7NothingNS0_20AsyncExecutorProcessERKSt8functionIFvRKSt5queueIN5mesos2v19scheduler5EventESt5dequeISC_SaISC_ESG_PvSK_SG_SN_EENS0_6FutureIT_EERKNS0_3PIDIT0_EEMSS_FSP_T1_T2_T3_ET4_T5_T6_EUlS2_E_E9_M_invokeERKSt9_Any_dataS2_ > @ 0x2b54946b1201 std::function<>::operator()() > @ 0x2b549469960f process::ProcessBase::visit() > @ 0x2b549469d480 process::DispatchEvent::visit() > @ 0x9dc0ba process::ProcessBase::serve() > @ 0x2b54946958cc process::ProcessManager::resume() > @ 0x2b5494692a9c > _ZZN7process14ProcessManager12init_threadsEvENKUlRKSt11atomic_boolE_clES3_ > @ 0x2b549469ccac > _ZNSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS3_EEE6__callIvIEILm0T_OSt5tupleIIDpT0_EESt12_Index_tupleIIXspT1_EEE > @ 0x2b549469cc5c > _ZNSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS3_EEEclIIEvEET0_DpOT_ > @ 0x2b549469cbee > 
_ZNSt12_Bind_simpleIFSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS4_EEEvEE9_M_invokeIIEEEvSt12_Index_tupleIIXspT_EEE > @ 0x2b549469cb45 > _ZNSt12_Bind_simpleIFSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS4_EEEvEEclEv > @ 0x2b549469cade > _ZNSt6thread5_ImplISt12_Bind_simpleIFSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS6_EEEvEEE6_M_runEv > @ 0x2b5495b81a40 (unknown) > @ 0x2b54962f8182 start_thread > @ 0x2b549660847d (unknown) > make[3]: *** [check-local] Segmentation fault > make[3]: Leaving directory `/home/vagrant/mesos/build/src' > make[2]: *** [check-am] Error 2 > make[2]: Leaving directory `/home/vagrant/mesos/build/src' > make[1]:
[jira] [Commented] (MESOS-3568) The State (/state) endpoint should be documented
[ https://issues.apache.org/jira/browse/MESOS-3568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15081664#comment-15081664 ] Alexander Rojas commented on MESOS-3568: A while ago I was working on unraveling this endpoint and did some research that may help you. You can find what I wrote here: https://docs.google.com/document/d/1dU4e8V6CbomWqHae4FnawZdgK131Iu_YN9uRxZ_y-fw/edit > The State (/state) endpoint should be documented > > > Key: MESOS-3568 > URL: https://issues.apache.org/jira/browse/MESOS-3568 > Project: Mesos > Issue Type: Documentation > Components: documentation, master >Reporter: James Fisher > Labels: documentation, mesosphere, newbie, tech-debt > > Our tests are using a resource `/state.json` hosted by the Mesos master. I > have searched for the documentation for this resource but have been unable to > find anything. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (MESOS-4029) ContentType/SchedulerTest is flaky.
[ https://issues.apache.org/jira/browse/MESOS-4029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Artem Harutyunyan reassigned MESOS-4029: Assignee: Artem Harutyunyan (was: Anand Mazumdar) > ContentType/SchedulerTest is flaky. > --- > > Key: MESOS-4029 > URL: https://issues.apache.org/jira/browse/MESOS-4029 > Project: Mesos > Issue Type: Bug >Affects Versions: 0.26.0 >Reporter: Till Toenshoff >Assignee: Artem Harutyunyan > Labels: flaky, flaky-test, mesosphere > > SSL build, [Ubuntu > 14.04|https://github.com/tillt/mesos-vagrant-ci/blob/master/ubuntu14/setup.sh], > non-root test run. > {noformat} > [--] 22 tests from ContentType/SchedulerTest > [ RUN ] ContentType/SchedulerTest.Subscribe/0 > [ OK ] ContentType/SchedulerTest.Subscribe/0 (48 ms) > *** Aborted at 1448928007 (unix time) try "date -d @1448928007" if you are > using GNU date *** > [ RUN ] ContentType/SchedulerTest.Subscribe/1 > PC: @ 0x1451b8e > testing::internal::UntypedFunctionMockerBase::UntypedInvokeWith() > *** SIGSEGV (@0x10030) received by PID 21320 (TID 0x2b549e5d4700) from > PID 48; stack trace: *** > @ 0x2b54c95940b7 os::Linux::chained_handler() > @ 0x2b54c9598219 JVM_handle_linux_signal > @ 0x2b5496300340 (unknown) > @ 0x1451b8e > testing::internal::UntypedFunctionMockerBase::UntypedInvokeWith() > @ 0xe2ea6d > _ZN7testing8internal18FunctionMockerBaseIFvRKSt5queueIN5mesos2v19scheduler5EventESt5dequeIS6_SaIS6_E10InvokeWithERKSt5tupleIJSC_EE > @ 0xe2b1bc testing::internal::FunctionMocker<>::Invoke() > @ 0x1118aed > mesos::internal::tests::SchedulerTest::Callbacks::received() > @ 0x111c453 > _ZNKSt7_Mem_fnIMN5mesos8internal5tests13SchedulerTest9CallbacksEFvRKSt5queueINS0_2v19scheduler5EventESt5dequeIS8_SaIS8_EclIJSE_EvEEvRS4_DpOT_ > @ 0x111c001 > 
_ZNSt5_BindIFSt7_Mem_fnIMN5mesos8internal5tests13SchedulerTest9CallbacksEFvRKSt5queueINS1_2v19scheduler5EventESt5dequeIS9_SaIS9_ESt17reference_wrapperIS5_ESt12_PlaceholderILi16__callIvJSF_EJLm0ELm1T_OSt5tupleIJDpT0_EESt12_Index_tupleIJXspT1_EEE > @ 0x111b90d > _ZNSt5_BindIFSt7_Mem_fnIMN5mesos8internal5tests13SchedulerTest9CallbacksEFvRKSt5queueINS1_2v19scheduler5EventESt5dequeIS9_SaIS9_ESt17reference_wrapperIS5_ESt12_PlaceholderILi1clIJSF_EvEET0_DpOT_ > @ 0x111ae09 std::_Function_handler<>::_M_invoke() > @ 0x2b5493c6da09 std::function<>::operator()() > @ 0x2b5493c688ee process::AsyncExecutorProcess::execute<>() > @ 0x2b5493c6db2a > _ZZN7process8dispatchI7NothingNS_20AsyncExecutorProcessERKSt8functionIFvRKSt5queueIN5mesos2v19scheduler5EventESt5dequeIS8_SaIS8_ESC_PvSG_SC_SJ_EENS_6FutureIT_EERKNS_3PIDIT0_EEMSO_FSL_T1_T2_T3_ET4_T5_T6_ENKUlPNS_11ProcessBaseEE_clES11_ > @ 0x2b5493c765a4 > _ZNSt17_Function_handlerIFvPN7process11ProcessBaseEEZNS0_8dispatchI7NothingNS0_20AsyncExecutorProcessERKSt8functionIFvRKSt5queueIN5mesos2v19scheduler5EventESt5dequeISC_SaISC_ESG_PvSK_SG_SN_EENS0_6FutureIT_EERKNS0_3PIDIT0_EEMSS_FSP_T1_T2_T3_ET4_T5_T6_EUlS2_E_E9_M_invokeERKSt9_Any_dataS2_ > @ 0x2b54946b1201 std::function<>::operator()() > @ 0x2b549469960f process::ProcessBase::visit() > @ 0x2b549469d480 process::DispatchEvent::visit() > @ 0x9dc0ba process::ProcessBase::serve() > @ 0x2b54946958cc process::ProcessManager::resume() > @ 0x2b5494692a9c > _ZZN7process14ProcessManager12init_threadsEvENKUlRKSt11atomic_boolE_clES3_ > @ 0x2b549469ccac > _ZNSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS3_EEE6__callIvIEILm0T_OSt5tupleIIDpT0_EESt12_Index_tupleIIXspT1_EEE > @ 0x2b549469cc5c > _ZNSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS3_EEEclIIEvEET0_DpOT_ > @ 0x2b549469cbee > 
_ZNSt12_Bind_simpleIFSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS4_EEEvEE9_M_invokeIIEEEvSt12_Index_tupleIIXspT_EEE > @ 0x2b549469cb45 > _ZNSt12_Bind_simpleIFSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS4_EEEvEEclEv > @ 0x2b549469cade > _ZNSt6thread5_ImplISt12_Bind_simpleIFSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS6_EEEvEEE6_M_runEv > @ 0x2b5495b81a40 (unknown) > @ 0x2b54962f8182 start_thread > @ 0x2b549660847d (unknown) > make[3]: *** [check-local] Segmentation fault > make[3]: Leaving directory `/home/vagrant/mesos/build/src' > make[2]: *** [check-am] Error 2 > make[2]: Leaving directory `/home/vagrant/mesos/build/src' >
[jira] [Updated] (MESOS-2179) ExamplesTest.NoExecutorFramework terminates with segmentation fault
[ https://issues.apache.org/jira/browse/MESOS-2179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Hindman updated MESOS-2179: Shepherd: Bernd Mathiske Assignee: Joerg Schad Sprint: Mesosphere Sprint 26 > ExamplesTest.NoExecutorFramework terminates with segmentation fault > --- > > Key: MESOS-2179 > URL: https://issues.apache.org/jira/browse/MESOS-2179 > Project: Mesos > Issue Type: Bug > Components: test >Affects Versions: 0.22.0 > Environment: Centos7 inside Docker > Mesos master commit: 49d4553a0645624179f17ed6da8d2443e88998bf >Reporter: Cody Maloney >Assignee: Joerg Schad >Priority: Minor > Labels: flaky, mesosphere > > {code} > [ RUN ] ExamplesTest.NoExecutorFramework > ../../src/tests/script.cpp:83: Failure > Failed > no_executor_framework_test.sh terminated with signal Segmentation fault > [ FAILED ] ExamplesTest.NoExecutorFramework (2543 ms) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3307) Configurable size of completed task / framework history
[ https://issues.apache.org/jira/browse/MESOS-3307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Artem Harutyunyan updated MESOS-3307: - Story Points: 3 > Configurable size of completed task / framework history > --- > > Key: MESOS-3307 > URL: https://issues.apache.org/jira/browse/MESOS-3307 > Project: Mesos > Issue Type: Bug >Reporter: Kevin Klues > Labels: mesosphere > > We try to make Mesos work with multiple frameworks and mesos-dns at the same > time. The goal is to have a set of frameworks per team / project on a single > Mesos cluster. > At this point our Mesos state.json is at 4 MB and it takes a while to > assemble. 5 mesos-dns instances hit state.json every 5 seconds, effectively > pushing mesos-master CPU usage through the roof. It's at 100%+ all the time. > Here's the problem: > {noformat} > mesos λ curl -s http://mesos-master:5050/master/state.json | jq > .frameworks[].completed_tasks[].framework_id | sort | uniq -c | sort -n >1 "20150606-001827-252388362-5050-5982-0003" > 16 "20150606-001827-252388362-5050-5982-0005" > 18 "20150606-001827-252388362-5050-5982-0029" > 73 "20150606-001827-252388362-5050-5982-0007" > 141 "20150606-001827-252388362-5050-5982-0009" > 154 "20150820-154817-302720010-5050-15320-" > 289 "20150606-001827-252388362-5050-5982-0004" > 510 "20150606-001827-252388362-5050-5982-0012" > 666 "20150606-001827-252388362-5050-5982-0028" > 923 "20150116-002612-269165578-5050-32204-0003" > 1000 "20150606-001827-252388362-5050-5982-0001" > 1000 "20150606-001827-252388362-5050-5982-0006" > 1000 "20150606-001827-252388362-5050-5982-0010" > 1000 "20150606-001827-252388362-5050-5982-0011" > 1000 "20150606-001827-252388362-5050-5982-0027" > mesos λ fgrep 1000 -r src/master > src/master/constants.cpp:const size_t MAX_REMOVED_SLAVES = 10; > src/master/constants.cpp:const uint32_t MAX_COMPLETED_TASKS_PER_FRAMEWORK = > 1000; > {noformat} > Active tasks are just 6% of the state.json response: > {noformat} > mesos λ cat 
~/temp/mesos-state.json | jq -c . | wc >1 14796 4138942 > mesos λ cat ~/temp/mesos-state.json | jq .frameworks[].tasks | jq -c . | wc > 16 37 252774 > {noformat} > I see four options that can improve the situation: > 1. Add a query string parameter to exclude completed tasks from state.json and use > it in mesos-dns and similar tools. There is no need for mesos-dns to know > about completed tasks; it's just extra load on the master and mesos-dns. > 2. Make the history size configurable. > 3. Make JSON serialization faster. With 1s of tasks even without history > it would take a lot of time to serialize tasks for mesos-dns. Doing it every > 60 seconds instead of every 5 seconds isn't really an option. > 4. Create an event bus for the Mesos master. Marathon has it and it'd be nice to > have it in Mesos. This way mesos-dns could avoid polling master state and > switch to listening for events. > All can be done independently. > Note to mesosphere folks: please start distributing debug symbols with your > distribution. I have been asking for it for a while and it is really helpful: > https://github.com/mesosphere/marathon/issues/1497#issuecomment-104182501 > Perf report for leading master: > !http://i.imgur.com/iz7C3o0.png! > I'm on 0.23.0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
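Option 2 above (a configurable history size) could look roughly like the sketch below. `CompletedTasks` is a hypothetical class, not the master's actual data structure; the point is only that the capacity is supplied at run time (for example from a flag) instead of a compile-time constant such as `MAX_COMPLETED_TASKS_PER_FRAMEWORK`.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>

// Hypothetical bounded history of completed task IDs. The capacity is a
// run-time value, so an operator could lower it to shrink state.json.
class CompletedTasks
{
public:
  explicit CompletedTasks(std::size_t capacity) : capacity_(capacity) {}

  void add(const std::string& taskId)
  {
    if (capacity_ == 0) {
      return; // History disabled entirely.
    }
    if (tasks_.size() == capacity_) {
      tasks_.pop_front(); // Evict the oldest entry to stay within bounds.
    }
    tasks_.push_back(taskId);
  }

  std::size_t size() const { return tasks_.size(); }

  const std::string& oldest() const
  {
    assert(!tasks_.empty());
    return tasks_.front();
  }

private:
  std::size_t capacity_;
  std::deque<std::string> tasks_;
};
```

A capacity of zero disables the history in this sketch, which would also cover the mesos-dns case where completed tasks are not needed at all.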
[jira] [Updated] (MESOS-4218) Test for Quota Status Endpoint
[ https://issues.apache.org/jira/browse/MESOS-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joerg Schad updated MESOS-4218: --- Story Points: 3 > Test for Quota Status Endpoint > -- > > Key: MESOS-4218 > URL: https://issues.apache.org/jira/browse/MESOS-4218 > Project: Mesos > Issue Type: Bug >Reporter: Joerg Schad >Assignee: Joerg Schad > Labels: mesosphere, quota > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4150) Implement container logger module metadata recovery
[ https://issues.apache.org/jira/browse/MESOS-4150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Wu updated MESOS-4150: - Sprint: Mesosphere Sprint 26 > Implement container logger module metadata recovery > --- > > Key: MESOS-4150 > URL: https://issues.apache.org/jira/browse/MESOS-4150 > Project: Mesos > Issue Type: Task > Components: modules >Reporter: Joseph Wu >Assignee: Joseph Wu > Labels: logging, mesosphere > > The {{ContainerLoggers}} are intended to be isolated from agent failover, in > the same way that executors do not crash when the agent process crashes. > For default {{ContainerLogger}} s, like the {{SandboxContainerLogger}} and > the (tentatively named) {{TruncatingSandboxContainerLogger}}, the log files > are exposed during agent recovery regardless. > For non-default {{ContainerLogger}} s, the recovery of executor metadata may > be necessary to rebuild endpoints that expose the logs. This can be > implemented as part of {{Containerizer::recover}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4206) Write new log-related documentation
[ https://issues.apache.org/jira/browse/MESOS-4206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Wu updated MESOS-4206: - Sprint: Mesosphere Sprint 26 > Write new log-related documentation > --- > > Key: MESOS-4206 > URL: https://issues.apache.org/jira/browse/MESOS-4206 > Project: Mesos > Issue Type: Documentation > Components: documentation >Reporter: Neil Conway >Assignee: Joseph Wu > Labels: documentation, logging, mesosphere > > This should include: > * Default logging behavior for master, agent, framework, executor, task. > * Master/agent: > ** A summary of log-related flags. > ** {{glog}} specific options. > * Separation of master/agent logs from container logs. > * The {{ContainerLogger}} module. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3943) Support dynamic weight in allocator
[ https://issues.apache.org/jira/browse/MESOS-3943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam B updated MESOS-3943: -- Story Points: 5 > Support dynamic weight in allocator > --- > > Key: MESOS-3943 > URL: https://issues.apache.org/jira/browse/MESOS-3943 > Project: Mesos > Issue Type: Task >Reporter: James Wang >Assignee: Yongqiao Wang > > This JIRA will focus on updating the allocator API to support weight updates for > a role. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4198) Disk Resource Reservation is NOT Enforced for Persistent Volumes
[ https://issues.apache.org/jira/browse/MESOS-4198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15081646#comment-15081646 ] Artem Harutyunyan commented on MESOS-4198: -- {noformat} author Artem Harutyunyan Mon, 4 Jan 2016 09:20:30 -0800 (09:20 -0800) committer Jie Yu Mon, 4 Jan 2016 09:20:30 -0800 (09:20 -0800) commit 5682052be45d67dc2bcdf969ee38bb55cb4e2019 author Artem Harutyunyan Mon, 4 Jan 2016 09:20:36 -0800 (09:20 -0800) committer Jie Yu Mon, 4 Jan 2016 09:20:36 -0800 (09:20 -0800) commit 4706504c51446f31253a88a733d7aa30e9fea842 {noformat} > Disk Resource Reservation is NOT Enforced for Persistent Volumes > > > Key: MESOS-4198 > URL: https://issues.apache.org/jira/browse/MESOS-4198 > Project: Mesos > Issue Type: Bug >Reporter: Gabriel Hartmann >Assignee: Artem Harutyunyan > Labels: isolation, mesosphere, persistent-volumes, reservations > > If I create a persistent volume on a reserved disk resource, I am able to > write data in excess of my reserved size. > Disk resource reservation should be enforced just as "cpus" and "mem" > reservations are enforced. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3809) Expose advertise_ip and advertise_port as command line options in mesos slave
[ https://issues.apache.org/jira/browse/MESOS-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam B updated MESOS-3809: -- Fix Version/s: 0.27.0 > Expose advertise_ip and advertise_port as command line options in mesos slave > - > > Key: MESOS-3809 > URL: https://issues.apache.org/jira/browse/MESOS-3809 > Project: Mesos > Issue Type: Bug > Components: slave >Affects Versions: 0.25.0 >Reporter: Anindya Sinha >Assignee: Anindya Sinha >Priority: Minor > Labels: mesosphere > Fix For: 0.27.0 > > > advertise_ip and advertise_port are exposed as mesos master command line args > (MESOS-809). But the following use case makes them candidates for > command line args in the mesos slave as well. > On Tue, Oct 27, 2015 at 7:43 PM, Xiaodong Zhang wrote: > It works! Thanks a lot. > From: haosdent > Reply-To: "u...@mesos.apache.org" > Date: Wednesday, October 28, 2015, 10:23 AM > To: user > Subject: Re: How to tell master which ip to connect. > Did you try `export LIBPROCESS_ADVERTISE_IP=xxx` and > `LIBPROCESS_ADVERTISE_PORT` when starting the slave? > On Wed, Oct 28, 2015 at 10:16 AM, Xiaodong Zhang wrote: > Hi team: > My scenario is like this: > My master nodes are deployed in AWS. My slaves are in Azure, so they > communicate via public IP. > I ran into trouble when the slaves try to register with the master. > The slaves can get the master's public IP address and can send the register > request, but they can only send their private IP to the master (because they don't > know their public IP, they cannot bind a public IP via the --ip flag), > so the masters can't connect to the slaves. How can a slave tell the master which IP the > master should connect to? (I can't find any flag like --advertise_ip in master.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4209) Document "how to program with dynamic reservations and persistent volumes"
[ https://issues.apache.org/jira/browse/MESOS-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neil Conway updated MESOS-4209: --- Shepherd: Joris Van Remoortere > Document "how to program with dynamic reservations and persistent volumes" > -- > > Key: MESOS-4209 > URL: https://issues.apache.org/jira/browse/MESOS-4209 > Project: Mesos > Issue Type: Documentation > Components: documentation >Reporter: Neil Conway >Assignee: Neil Conway > Labels: documentation, mesosphere, persistent-volumes > > Specifically, some of the gotchas around: > * Retrying reservation attempts after a timeout > * Fuzzy-matching resources to determine whether a reservation/PV is successful > * Representing client state as a state machine and repeatedly moving "toward" > successful terminal states > Should also point to persistent volume example framework. We should also ask > Gabriel and others (Arango?) who have built frameworks with PVs/DRs for > feedback. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
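The three gotchas listed in the ticket above (retry after a timeout, fuzzy matching, and a state machine that keeps moving toward a terminal state) can be combined in one small sketch. Everything here is an assumption made for illustration: the class and state names, the single "disk in MB" resource, the numeric tolerance used as the fuzzy match, and the `send_reserve` callback standing in for whatever call a real scheduler would make. It is not the Mesos scheduler API.

```python
import enum


class ReservationState(enum.Enum):
    UNRESERVED = "unreserved"   # nothing requested yet
    RESERVING = "reserving"     # request sent, outcome not yet observed
    RESERVED = "reserved"       # terminal: a matching reservation was seen


class ReservationTracker:
    """Hypothetical client-side state machine for a single reservation."""

    def __init__(self, wanted_mb, timeout_s, tolerance_mb=0.0):
        self.wanted_mb = wanted_mb
        self.timeout_s = timeout_s
        self.tolerance_mb = tolerance_mb
        self.state = ReservationState.UNRESERVED
        self.requested_at = None

    def _matches(self, offered_reserved_mb):
        # Fuzzy match: accept an offered amount close to what was requested,
        # since the reserved amount may not come back bit-for-bit identical.
        return abs(offered_reserved_mb - self.wanted_mb) <= self.tolerance_mb

    def step(self, now, offered_reserved_mb, send_reserve):
        """Advance one step toward RESERVED, given the latest observation."""
        if self.state is ReservationState.RESERVED:
            return self.state
        if self._matches(offered_reserved_mb):
            self.state = ReservationState.RESERVED
            return self.state
        timed_out = (
            self.state is ReservationState.RESERVING
            and now - self.requested_at >= self.timeout_s
        )
        if self.state is ReservationState.UNRESERVED or timed_out:
            # First attempt, or the previous attempt went unanswered: (re)send.
            send_reserve(self.wanted_mb)
            self.state = ReservationState.RESERVING
            self.requested_at = now
        return self.state
```

Calling `step` on every offer (or on a timer) means a lost request is retried without special-case code, because the tracker only ever looks at its current state and the latest observation.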
[jira] [Updated] (MESOS-4233) Logging is too verbose for sysadmins / syslog
[ https://issues.apache.org/jira/browse/MESOS-4233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Artem Harutyunyan updated MESOS-4233: - Assignee: Kapil Arya > Logging is too verbose for sysadmins / syslog > - > > Key: MESOS-4233 > URL: https://issues.apache.org/jira/browse/MESOS-4233 > Project: Mesos > Issue Type: Epic >Reporter: Cody Maloney >Assignee: Kapil Arya > Labels: mesosphere > Attachments: giant_port_range_logging > > > Currently mesos logs a lot. When launching a thousand tasks in the space of > 10 seconds it will print tens of thousands of log lines, overwhelming syslog > (there is a max rate at which a process can send stuff over a unix socket) > and not giving useful information to a sysadmin who cares about just the > high-level activity and when something goes wrong. > Note mesos also blocks writing to its log locations, so when writing a lot of > log messages, it can fill up the write buffer in the kernel, and be suspended > until the syslog agent catches up reading from the socket (GLOG does a > blocking fwrite to stderr). GLOG also has a big mutex around logging so only > one thing logs at a time. > While for "internal debugging" it is useful to see things like "message went > from internal component x to internal component y", from a sysadmin > perspective I only care about the high level actions taken (launched task for > framework x, sent offer to framework y, got task failed from host z). Note > those are what I'd expect at the "INFO" level. At the "WARNING" level I'd > expect very little to be logged / almost nothing in normal operation. Just > things like "WARN: Replicated log write took longer than expected". WARN > would also get things like backtraces on crashes and abnormal exits / abort. > When trying to launch 3k+ tasks inside a second, mesos logging currently > overwhelms syslog with 100k+ messages, many of which are thousands of bytes. 
> Sysadmins expect to be able to use syslog to monitor basic events in their > system. This is too much. > We can keep logging the messages to files, but the logging to stderr needs to > be reduced significantly (stderr gets picked up and forwarded to syslog / > central aggregation). > What I would like is if I can set the stderr logging level to be different / > independent from the file logging level (Syslog giving the "sysadmin" > aggregated overview, files useful for debugging in depth what happened in a > cluster). A lot of what mesos currently logs at info is really debugging info > / should show up as debug log level. > Some samples of mesos logging a lot more than a sysadmin would want / expect > are attached, and some are below: > - Every task gets printed multiple times for a basic launch: > {noformat} > Dec 15 22:58:30 ip-10-0-7-60.us-west-2.compute.internal mesos-master[1311]: > I1215 22:58:29.382644 1315 master.cpp:3248] Launching task > envy.5b19a713-a37f-11e5-8b3e-0251692d6109 of framework > 5178f46d-71d6-422f-922c-5bbe82dff9cc- (marathon) > Dec 15 22:58:30 ip-10-0-7-60.us-west-2.compute.internal mesos-master[1311]: > I1215 22:58:29.382925 1315 master.hpp:176] Adding task > envy.5b1958f2-a37f-11e5-8b3e-0251692d6109 with resources cpus(*):0.0001; > mem(*):16; ports(*):[14047-14047] > {noformat} > - Every task status update prints many log lines, successful ones are part > of normal operation and maybe should be logged at info / debug levels, but > not to a sysadmin (Just show when things fail, and maybe aggregate counters > to tell of the volume of working) > - No log messages should be really big / more than 1k characters (Would > prevent the giant port list attached, make that easily discoverable / bug > filable / fixable) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
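The request above, a stderr level that is independent of the file level, can be illustrated with Python's standard `logging` module. This is an analogy only: Mesos logs through glog, not through this module, and the helper name `build_logger`, its parameters, and the message format are assumptions made for the sketch.

```python
import logging
import sys


def build_logger(name, logfile, stderr_level=logging.WARNING,
                 file_level=logging.INFO, stream=None):
    """Create a logger whose stderr and file outputs filter independently."""
    logger = logging.getLogger(name)
    # The logger itself must pass everything either handler might want.
    logger.setLevel(min(stderr_level, file_level))
    logger.propagate = False

    # Quiet channel: what a sysadmin would see forwarded to syslog.
    stderr_handler = logging.StreamHandler(stream if stream is not None else sys.stderr)
    stderr_handler.setLevel(stderr_level)

    # Verbose channel: detailed history kept on disk for debugging.
    file_handler = logging.FileHandler(logfile)
    file_handler.setLevel(file_level)

    formatter = logging.Formatter("%(levelname)s %(message)s")
    for handler in (stderr_handler, file_handler):
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger
```

With the defaults, an INFO message such as a task launch is written to the file only, while a WARNING reaches both the file and stderr, which is the split between "aggregated overview" and "in-depth debugging" that the ticket describes.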
[jira] [Updated] (MESOS-4233) Logging is too verbose for sysadmins / syslog
[ https://issues.apache.org/jira/browse/MESOS-4233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Artem Harutyunyan updated MESOS-4233: - Sprint: Mesosphere Sprint 26 > Logging is too verbose for sysadmins / syslog > - > > Key: MESOS-4233 > URL: https://issues.apache.org/jira/browse/MESOS-4233 > Project: Mesos > Issue Type: Epic >Reporter: Cody Maloney >Assignee: Kapil Arya > Labels: mesosphere > Attachments: giant_port_range_logging > > > Currently mesos logs a lot. When launching a thousand tasks in the space of > 10 seconds it will print tens of thousands of log lines, overwhelming syslog > (there is a max rate at which a process can send stuff over a unix socket) > and not giving useful information to a sysadmin who cares about just the > high-level activity and when something goes wrong. > Note mesos also blocks writing to its log locations, so when writing a lot of > log messages, it can fill up the write buffer in the kernel, and be suspended > until the syslog agent catches up reading from the socket (GLOG does a > blocking fwrite to stderr). GLOG also has a big mutex around logging so only > one thing logs at a time. > While for "internal debugging" it is useful to see things like "message went > from internal component x to internal component y", from a sysadmin > perspective I only care about the high level actions taken (launched task for > framework x, sent offer to framework y, got task failed from host z). Note > those are what I'd expect at the "INFO" level. At the "WARNING" level I'd > expect very little to be logged / almost nothing in normal operation. Just > things like "WARN: Replicated log write took longer than expected". WARN > would also get things like backtraces on crashes and abnormal exits / abort. > When trying to launch 3k+ tasks inside a second, mesos logging currently > overwhelms syslog with 100k+ messages, many of which are thousands of bytes. 
> Sysadmins expect to be able to use syslog to monitor basic events in their > system. This is too much. > We can keep logging the messages to files, but the logging to stderr needs to > be reduced significantly (stderr gets picked up and forwarded to syslog / > central aggregation). > What I would like is if I can set the stderr logging level to be different / > independent from the file logging level (Syslog giving the "sysadmin" > aggregated overview, files useful for debugging in depth what happened in a > cluster). A lot of what mesos currently logs at info is really debugging info > / should show up as debug log level. > Some samples of mesos logging a lot more than a sysadmin would want / expect > are attached, and some are below: > - Every task gets printed multiple times for a basic launch: > {noformat} > Dec 15 22:58:30 ip-10-0-7-60.us-west-2.compute.internal mesos-master[1311]: > I1215 22:58:29.382644 1315 master.cpp:3248] Launching task > envy.5b19a713-a37f-11e5-8b3e-0251692d6109 of framework > 5178f46d-71d6-422f-922c-5bbe82dff9cc- (marathon) > Dec 15 22:58:30 ip-10-0-7-60.us-west-2.compute.internal mesos-master[1311]: > I1215 22:58:29.382925 1315 master.hpp:176] Adding task > envy.5b1958f2-a37f-11e5-8b3e-0251692d6109 with resources cpus(*):0.0001; > mem(*):16; ports(*):[14047-14047] > {noformat} > - Every task status update prints many log lines, successful ones are part > of normal operation and maybe should be logged at info / debug levels, but > not to a sysadmin (Just show when things fail, and maybe aggregate counters > to tell of the volume of working) > - No log messages should be really big / more than 1k characters (Would > prevent the giant port list attached, make that easily discoverable / bug > filable / fixable) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3903) Add authorization for '/create-volume' and '/destroy-volume' HTTP endpoints
[ https://issues.apache.org/jira/browse/MESOS-3903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Mann updated MESOS-3903: - Sprint: Mesosphere Sprint 26 > Add authorization for '/create-volume' and '/destroy-volume' HTTP endpoints > --- > > Key: MESOS-3903 > URL: https://issues.apache.org/jira/browse/MESOS-3903 > Project: Mesos > Issue Type: Improvement >Reporter: Greg Mann >Assignee: Greg Mann > Labels: mesosphere, persistent-volumes > > This is the fourth in a series of tickets that adds authorization support for > persistent volumes. > We need to add ACL authorization for the '/create-volume' and > '/destroy-volume' HTTP endpoints. In other complementary work, authorization > for frameworks performing {{CREATE}} and {{DESTROY}} operations is being > added by MESOS-3065. > This will consist of adding authorization calls into the HTTP endpoint code > in {{src/master/http.cpp}}, as well as tests for both failed & successful > calls to '/create-volume' and '/destroy-volume' with authorization. We also > must ensure that the {{principal}} field of {{Resource.DiskInfo.Persistence}} > is being populated correctly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4286) Expose state(.json) as a structured protobuf
[ https://issues.apache.org/jira/browse/MESOS-4286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sargun Dhillon updated MESOS-4286: -- Labels: mesosphere (was: ) > Expose state(.json) as a structured protobuf > > > Key: MESOS-4286 > URL: https://issues.apache.org/jira/browse/MESOS-4286 > Project: Mesos > Issue Type: Wish >Reporter: Sargun Dhillon >Priority: Minor > Labels: mesosphere > > State.json, on both the agent and the master, exposes information about the > current state of the Mesos runtime. This information is super valuable to > external users such as Mesos-DNS. Unfortunately, working with state.json can > at times become cumbersome in languages where dealing with JSON isn't > necessarily a first-class construct. > Fortunately, protocol buffers exist. If the state.json was exposed as a > protocol buffer, it would make the lives of software authors in the Mesos > ecosystem significantly easier. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4220) Introduce result_of with C++14 semantics to stout.
[ https://issues.apache.org/jira/browse/MESOS-4220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Hindman updated MESOS-4220: Sprint: Mesosphere Sprint 25, Mesosphere Sprint 26 (was: Mesosphere Sprint 25) > Introduce result_of with C++14 semantics to stout. > -- > > Key: MESOS-4220 > URL: https://issues.apache.org/jira/browse/MESOS-4220 > Project: Mesos > Issue Type: Task > Components: stout >Reporter: Michael Park >Assignee: Michael Park > Labels: mesospheree > > The {{std::result_of}} in VS 2015 Update 1 implements C++11 semantics which > does not allow it to be used in SFINAE contexts. > Introduce a C++14 {{std::result_of}} into stout until we get to VS 2015 > Update 2, at which point we can switch back to simply using > {{std::result_of}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4239) Update relevant libprocess components to support the `jsonify` facility.
[ https://issues.apache.org/jira/browse/MESOS-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Hindman updated MESOS-4239: Sprint: Mesosphere Sprint 25, Mesosphere Sprint 26 (was: Mesosphere Sprint 25) > Update relevant libprocess components to support the `jsonify` facility. > > > Key: MESOS-4239 > URL: https://issues.apache.org/jira/browse/MESOS-4239 > Project: Mesos > Issue Type: Task > Components: libprocess >Reporter: Michael Park >Assignee: Michael Park > Labels: mesosphere > > Update relevant {{libprocess}} components to support the {{jsonify}} > facility. For example, the {{OK}} HTTP response object should be able to > construct off of {{JsonifyProxy}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4143) Reserve/UnReserve Dynamic Reservation Endpoints allow reservations on non-existing roles
[ https://issues.apache.org/jira/browse/MESOS-4143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Hindman updated MESOS-4143: Sprint: Mesosphere Sprint 24, Mesosphere Sprint 25, Mesosphere Sprint 26 (was: Mesosphere Sprint 24, Mesosphere Sprint 25) > Reserve/UnReserve Dynamic Reservation Endpoints allow reservations on > non-existing roles > > > Key: MESOS-4143 > URL: https://issues.apache.org/jira/browse/MESOS-4143 > Project: Mesos > Issue Type: Bug > Components: general >Affects Versions: 0.25.0, 0.26.0 >Reporter: John Omernik >Assignee: Neil Conway > Labels: mesosphere, reservations > > When working with Dynamic reservations via the /reserve and /unreserve > endpoints, it is possible to reserve resources for roles that have not been > specified via the --roles flag on the master. However, these roles are not > usable because the roles have not been defined, nor are they added to the > list of roles available. > Per the mailing list, changing roles after the fact is not possible at this > time (that may be another JIRA). More importantly, the /reserve and > /unreserve endpoints should not allow reservation of roles not specified by > --roles. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4221) Invoke _Deferred's implicit conversion operator explicitly.
[ https://issues.apache.org/jira/browse/MESOS-4221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Hindman updated MESOS-4221: Sprint: Mesosphere Sprint 25, Mesosphere Sprint 26 (was: Mesosphere Sprint 25) > Invoke _Deferred's implicit conversion operator explicitly. > --- > > Key: MESOS-4221 > URL: https://issues.apache.org/jira/browse/MESOS-4221 > Project: Mesos > Issue Type: Task > Components: libprocess >Reporter: Michael Park >Assignee: Michael Park > Labels: mesosphere > > As of VS 2015 Update 1, MSVC implements C++11 semantics for > {{std::function}}'s {{Callable}} constructor which does not SFINAE. In the > short term, we call the implicit conversion operator from {{_Deferred}} to > {{std::function}} explicitly. > Going forward, I propose to make {{_Deferred}} callable which will bring us > to a state where {{process::defer}} is similar to {{std::bind}} in that the > objects returned from them are "implementation-defined" (i.e., {{_Deferred}} > and something like {{_Bind}}), and that they are both callable. {{Deferred}} > and {{std::function}} are similar in that they perform type-erasure. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4204) Document that frameworks that participate in a role should cooperate
[ https://issues.apache.org/jira/browse/MESOS-4204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Artem Harutyunyan updated MESOS-4204: - Shepherd: Joris Van Remoortere > Document that frameworks that participate in a role should cooperate > > > Key: MESOS-4204 > URL: https://issues.apache.org/jira/browse/MESOS-4204 > Project: Mesos > Issue Type: Documentation > Components: documentation >Reporter: Neil Conway >Priority: Minor > Labels: documentation, mesosphere, reservations > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4209) Document "how to program with dynamic reservations and persistent volumes"
[ https://issues.apache.org/jira/browse/MESOS-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neil Conway updated MESOS-4209: --- Story Points: 3 > Document "how to program with dynamic reservations and persistent volumes" > -- > > Key: MESOS-4209 > URL: https://issues.apache.org/jira/browse/MESOS-4209 > Project: Mesos > Issue Type: Documentation > Components: documentation >Reporter: Neil Conway >Assignee: Neil Conway > Labels: documentation, mesosphere, persistent-volumes > > Specifically, some of the gotchas around: > * Retrying reservation attempts after a timeout > * Fuzzy-matching resources to determine whether a reservation/PV is successful > * Representing client state as a state machine and repeatedly moving "toward" > successful terminal states > Should also point to persistent volume example framework. We should also ask > Gabriel and others (Arango?) who have built frameworks with PVs/DRs for > feedback. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3307) Configurable size of completed task / framework history
[ https://issues.apache.org/jira/browse/MESOS-3307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Artem Harutyunyan updated MESOS-3307: - Shepherd: Benjamin Mahler Sprint: Mesosphere Sprint 26 > Configurable size of completed task / framework history > --- > > Key: MESOS-3307 > URL: https://issues.apache.org/jira/browse/MESOS-3307 > Project: Mesos > Issue Type: Bug >Reporter: Kevin Klues > Labels: mesosphere > > We try to make Mesos work with multiple frameworks and mesos-dns at the same > time. The goal is to have a set of frameworks per team / project on a single > Mesos cluster. > At this point our mesos state.json is at 4 MB and it takes a while to > assemble. 5 mesos-dns instances hit state.json every 5 seconds, effectively > pushing mesos-master CPU usage through the roof. It's at 100%+ all the time. > Here's the problem: > {noformat} > mesos λ curl -s http://mesos-master:5050/master/state.json | jq > .frameworks[].completed_tasks[].framework_id | sort | uniq -c | sort -n >1 "20150606-001827-252388362-5050-5982-0003" > 16 "20150606-001827-252388362-5050-5982-0005" > 18 "20150606-001827-252388362-5050-5982-0029" > 73 "20150606-001827-252388362-5050-5982-0007" > 141 "20150606-001827-252388362-5050-5982-0009" > 154 "20150820-154817-302720010-5050-15320-" > 289 "20150606-001827-252388362-5050-5982-0004" > 510 "20150606-001827-252388362-5050-5982-0012" > 666 "20150606-001827-252388362-5050-5982-0028" > 923 "20150116-002612-269165578-5050-32204-0003" > 1000 "20150606-001827-252388362-5050-5982-0001" > 1000 "20150606-001827-252388362-5050-5982-0006" > 1000 "20150606-001827-252388362-5050-5982-0010" > 1000 "20150606-001827-252388362-5050-5982-0011" > 1000 "20150606-001827-252388362-5050-5982-0027" > mesos λ fgrep 1000 -r src/master > src/master/constants.cpp:const size_t MAX_REMOVED_SLAVES = 10; > src/master/constants.cpp:const uint32_t MAX_COMPLETED_TASKS_PER_FRAMEWORK = > 1000; > {noformat} > Active tasks are just 6% of state.json response: > 
{noformat} > mesos λ cat ~/temp/mesos-state.json | jq -c . | wc >1 14796 4138942 > mesos λ cat ~/temp/mesos-state.json | jq .frameworks[].tasks | jq -c . | wc > 16 37 252774 > {noformat} > I see four options that can improve the situation: > 1. Add query string param to exclude completed tasks from state.json and use > it in mesos-dns and similar tools. There is no need for mesos-dns to know > about completed tasks, it's just extra load on master and mesos-dns. > 2. Make history size configurable. > 3. Make JSON serialization faster. With 1s of tasks even without history > it would take a lot of time to serialize tasks for mesos-dns. Doing it every > 60 seconds instead of every 5 seconds isn't really an option. > 4. Create event bus for mesos master. Marathon has it and it'd be nice to > have it in Mesos. This way mesos-dns could avoid polling master state and > switch to listening for events. > All can be done independently. > Note to mesosphere folks: please start distributing debug symbols with your > distribution. I was asking for it for a while and it is really helpful: > https://github.com/mesosphere/marathon/issues/1497#issuecomment-104182501 > Perf report for leading master: > !http://i.imgur.com/iz7C3o0.png! > I'm on 0.23.0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3307) Configurable size of completed task / framework history
[ https://issues.apache.org/jira/browse/MESOS-3307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Artem Harutyunyan updated MESOS-3307: - Reporter: Kevin Klues (was: Ian Babrou) > Configurable size of completed task / framework history > --- > > Key: MESOS-3307 > URL: https://issues.apache.org/jira/browse/MESOS-3307 > Project: Mesos > Issue Type: Bug >Reporter: Kevin Klues > Labels: mesosphere > > We try to make Mesos work with multiple frameworks and mesos-dns at the same > time. The goal is to have a set of frameworks per team / project on a single > Mesos cluster. > At this point our mesos state.json is at 4 MB and it takes a while to > assemble. 5 mesos-dns instances hit state.json every 5 seconds, effectively > pushing mesos-master CPU usage through the roof. It's at 100%+ all the time. > Here's the problem: > {noformat} > mesos λ curl -s http://mesos-master:5050/master/state.json | jq > .frameworks[].completed_tasks[].framework_id | sort | uniq -c | sort -n >1 "20150606-001827-252388362-5050-5982-0003" > 16 "20150606-001827-252388362-5050-5982-0005" > 18 "20150606-001827-252388362-5050-5982-0029" > 73 "20150606-001827-252388362-5050-5982-0007" > 141 "20150606-001827-252388362-5050-5982-0009" > 154 "20150820-154817-302720010-5050-15320-" > 289 "20150606-001827-252388362-5050-5982-0004" > 510 "20150606-001827-252388362-5050-5982-0012" > 666 "20150606-001827-252388362-5050-5982-0028" > 923 "20150116-002612-269165578-5050-32204-0003" > 1000 "20150606-001827-252388362-5050-5982-0001" > 1000 "20150606-001827-252388362-5050-5982-0006" > 1000 "20150606-001827-252388362-5050-5982-0010" > 1000 "20150606-001827-252388362-5050-5982-0011" > 1000 "20150606-001827-252388362-5050-5982-0027" > mesos λ fgrep 1000 -r src/master > src/master/constants.cpp:const size_t MAX_REMOVED_SLAVES = 10; > src/master/constants.cpp:const uint32_t MAX_COMPLETED_TASKS_PER_FRAMEWORK = > 1000; > {noformat} > Active tasks are just 6% of state.json response: > {noformat} > mesos 
λ cat ~/temp/mesos-state.json | jq -c . | wc >1 14796 4138942 > mesos λ cat ~/temp/mesos-state.json | jq .frameworks[].tasks | jq -c . | wc > 16 37 252774 > {noformat} > I see four options that can improve the situation: > 1. Add query string param to exclude completed tasks from state.json and use > it in mesos-dns and similar tools. There is no need for mesos-dns to know > about completed tasks, it's just extra load on master and mesos-dns. > 2. Make history size configurable. > 3. Make JSON serialization faster. With 1s of tasks even without history > it would take a lot of time to serialize tasks for mesos-dns. Doing it every > 60 seconds instead of every 5 seconds isn't really an option. > 4. Create event bus for mesos master. Marathon has it and it'd be nice to > have it in Mesos. This way mesos-dns could avoid polling master state and > switch to listening for events. > All can be done independently. > Note to mesosphere folks: please start distributing debug symbols with your > distribution. I was asking for it for a while and it is really helpful: > https://github.com/mesosphere/marathon/issues/1497#issuecomment-104182501 > Perf report for leading master: > !http://i.imgur.com/iz7C3o0.png! > I'm on 0.23.0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4133) User-oriented docs for containerizers + isolators
[ https://issues.apache.org/jira/browse/MESOS-4133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Hindman updated MESOS-4133: Epic Name: Containerization Documentation > User-oriented docs for containerizers + isolators > - > > Key: MESOS-4133 > URL: https://issues.apache.org/jira/browse/MESOS-4133 > Project: Mesos > Issue Type: Epic > Components: containerization, documentation, isolation >Reporter: Neil Conway >Assignee: Jojy Varghese > Labels: documentation, mesosphere > > This should cover practical user-oriented questions, such as: > * what is a containerizer, and what problems do they solve? > * how should I choose among the available containerizer options to solve a > few typical, practical problems > * what is an isolator, and what problems do they solve? > * how should I choose among the available isolator options to solve a few > typical, practical problems > We could possibly get into the details of cgroups and other system-level > facilities for configuring resource isolation as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3943) Support dynamic weight in allocator
[ https://issues.apache.org/jira/browse/MESOS-3943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam B updated MESOS-3943: -- Sprint: Mesosphere Sprint 26 > Support dynamic weight in allocator > --- > > Key: MESOS-3943 > URL: https://issues.apache.org/jira/browse/MESOS-3943 > Project: Mesos > Issue Type: Task >Reporter: James Wang >Assignee: Yongqiao Wang > > This JIRA will focus on updating the allocator API to support weight updates for > a role. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-1763) Add support for multiple roles to be specified in FrameworkInfo
[ https://issues.apache.org/jira/browse/MESOS-1763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bernd Mathiske updated MESOS-1763: -- Assignee: (was: Timothy Chen) Epic Name: multi-role frameworks > Add support for multiple roles to be specified in FrameworkInfo > --- > > Key: MESOS-1763 > URL: https://issues.apache.org/jira/browse/MESOS-1763 > Project: Mesos > Issue Type: Epic > Components: master >Reporter: Vinod Kone > Labels: mesosphere, roles > > Currently frameworks have the ability to set only one (resource) role in > FrameworkInfo. It would be nice to let frameworks specify multiple roles so > that they can do more fine grained resource accounting per role. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4207) Add an example bug due to a lack of defer() to the defer() documentation
[ https://issues.apache.org/jira/browse/MESOS-4207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neil Conway updated MESOS-4207: --- Component/s: documentation > Add an example bug due to a lack of defer() to the defer() documentation > > > Key: MESOS-4207 > URL: https://issues.apache.org/jira/browse/MESOS-4207 > Project: Mesos > Issue Type: Documentation > Components: documentation >Reporter: Greg Mann >Assignee: Greg Mann >Priority: Minor > Labels: documentation, libprocess, mesosphere > > In the past, some bugs have been introduced into the codebase due to a lack > of {{defer()}} where it should have been used. It would be useful to add an > example of this to the {{defer()}} documentation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4209) Document "how to program with dynamic reservations and persistent volumes"
[ https://issues.apache.org/jira/browse/MESOS-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Artem Harutyunyan updated MESOS-4209: - Sprint: Mesosphere Sprint 26 > Document "how to program with dynamic reservations and persistent volumes" > -- > > Key: MESOS-4209 > URL: https://issues.apache.org/jira/browse/MESOS-4209 > Project: Mesos > Issue Type: Documentation > Components: documentation >Reporter: Neil Conway >Assignee: Neil Conway > Labels: documentation, mesosphere, persistent-volumes > > Specifically, some of the gotchas around: > * Retrying reservation attempts after a timeout > * Fuzzy-matching resources to determine whether a reservation/PV is successful > * Representing client state as a state machine and repeatedly moving "toward" > successful terminal states > Should also point to the persistent volume example framework. We should also ask > Gabriel and others (Arango?) who have built frameworks with PVs/DRs for > feedback. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
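A rough illustration of the "state machine plus retry after a timeout" idea from MESOS-4209, as a minimal Python sketch. Nothing here is Mesos API: the class, the state names, the timeout value and the action strings are all hypothetical, and the caller is assumed to decide (for example by fuzzy-matching offered resources) whether the reservation has shown up.

```python
import enum
from dataclasses import dataclass


class ReservationState(enum.Enum):
    """Hypothetical client-side states for one reservation."""
    UNRESERVED = "unreserved"  # nothing sent yet
    RESERVING = "reserving"    # request sent, outcome not yet observed
    RESERVED = "reserved"      # terminal success state


@dataclass
class ReservationTracker:
    """Tracks one reservation and always moves toward RESERVED."""
    timeout_secs: float = 30.0  # assumed retry timeout, not a Mesos default
    state: ReservationState = ReservationState.UNRESERVED
    sent_at: float = 0.0
    attempts: int = 0

    def step(self, now: float, reservation_seen: bool) -> str:
        """Advance one step and return the action the caller should take."""
        if self.state is ReservationState.RESERVED:
            return "none"
        if reservation_seen:
            # The caller matched the reservation in an offer.
            self.state = ReservationState.RESERVED
            return "none"
        timed_out = (
            self.state is ReservationState.RESERVING
            and now - self.sent_at >= self.timeout_secs
        )
        if self.state is ReservationState.UNRESERVED or timed_out:
            # First attempt, or the previous attempt is presumed lost.
            self.state = ReservationState.RESERVING
            self.sent_at = now
            self.attempts += 1
            return "send_reserve"
        return "wait"
```

Calling `step()` on every offer cycle keeps the retry logic in one place: the tracker either asks for a (re)send, asks the caller to wait, or reports that nothing more is needed.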
[jira] [Updated] (MESOS-4035) UserCgroupIsolatorTest.ROOT_CGROUPS_UserCgroup fails on CentOS 6.6
[ https://issues.apache.org/jira/browse/MESOS-4035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Hindman updated MESOS-4035: Shepherd: Till Toenshoff Assignee: Jan Schlicht Sprint: Mesosphere Sprint 26 > UserCgroupIsolatorTest.ROOT_CGROUPS_UserCgroup fails on CentOS 6.6 > -- > > Key: MESOS-4035 > URL: https://issues.apache.org/jira/browse/MESOS-4035 > Project: Mesos > Issue Type: Bug > Environment: CentOS6.6 >Reporter: Gilbert Song >Assignee: Jan Schlicht > > `ROOT_CGROUPS_UserCgroup` on CentOS6.6 with 0.26rc3. The environment setup on > CentOS6.6 is based on latest update of /docs/getting-started.md. Either using > devtoolset-2 or devtoolset-3 returns the same failure. > If running `sudo ./bin/mesos-tests.sh > --gtest_filter="*ROOT_CGROUPS_UserCgroup*"`, it would return failures as > following log: > {noformat} > [==] Running 3 tests from 3 test cases. > [--] Global test environment set-up. > [--] 1 test from UserCgroupIsolatorTest/0, where TypeParam = > mesos::internal::slave::CgroupsMemIsolatorProcess > userdel: user 'mesos.test.unprivileged.user' does not exist > [ RUN ] UserCgroupIsolatorTest/0.ROOT_CGROUPS_UserCgroup > ../../src/tests/mesos.cpp:722: Failure > cgroups::mount(hierarchy, subsystem): '/tmp/mesos_test_cgroup/perf_event' > already exists in the file system > - > We cannot run any cgroups tests that require > a hierarchy with subsystem 'perf_event' > because we failed to find an existing hierarchy > or create a new one (tried '/tmp/mesos_test_cgroup/perf_event'). > You can either remove all existing > hierarchies, or disable this test case > (i.e., --gtest_filter=-UserCgroupIsolatorTest/0.*). 
> - > ../../src/tests/mesos.cpp:776: Failure > cgroups: '/tmp/mesos_test_cgroup/perf_event' is not a valid hierarchy > [ FAILED ] UserCgroupIsolatorTest/0.ROOT_CGROUPS_UserCgroup, where > TypeParam = mesos::internal::slave::CgroupsMemIsolatorProcess (1 ms) > [--] 1 test from UserCgroupIsolatorTest/0 (1 ms total) > [--] 1 test from UserCgroupIsolatorTest/1, where TypeParam = > mesos::internal::slave::CgroupsCpushareIsolatorProcess > userdel: user 'mesos.test.unprivileged.user' does not exist > [ RUN ] UserCgroupIsolatorTest/1.ROOT_CGROUPS_UserCgroup > ../../src/tests/mesos.cpp:722: Failure > cgroups::mount(hierarchy, subsystem): '/tmp/mesos_test_cgroup/perf_event' > already exists in the file system > - > We cannot run any cgroups tests that require > a hierarchy with subsystem 'perf_event' > because we failed to find an existing hierarchy > or create a new one (tried '/tmp/mesos_test_cgroup/perf_event'). > You can either remove all existing > hierarchies, or disable this test case > (i.e., --gtest_filter=-UserCgroupIsolatorTest/1.*). > - > ../../src/tests/mesos.cpp:776: Failure > cgroups: '/tmp/mesos_test_cgroup/perf_event' is not a valid hierarchy > [ FAILED ] UserCgroupIsolatorTest/1.ROOT_CGROUPS_UserCgroup, where > TypeParam = mesos::internal::slave::CgroupsCpushareIsolatorProcess (4 ms) > [--] 1 test from UserCgroupIsolatorTest/1 (5 ms total) > [--] 1 test from UserCgroupIsolatorTest/2, where TypeParam = > mesos::internal::slave::CgroupsPerfEventIsolatorProcess > userdel: user 'mesos.test.unprivileged.user' does not exist > [ RUN ] UserCgroupIsolatorTest/2.ROOT_CGROUPS_UserCgroup > ../../src/tests/mesos.cpp:722: Failure > cgroups::mount(hierarchy, subsystem): '/tmp/mesos_test_cgroup/perf_event' > already exists in the file system > - > We cannot run any cgroups tests that require > a hierarchy with subsystem 'perf_event' > because we failed to find an existing hierarchy > or create a new one (tried '/tmp/mesos_test_cgroup/perf_event'). 
> You can either remove all existing > hierarchies, or disable this test case > (i.e., --gtest_filter=-UserCgroupIsolatorTest/2.*). > - > ../../src/tests/mesos.cpp:776: Failure > cgroups: '/tmp/mesos_test_cgroup/perf_event' is not a valid hierarchy > [ FAILED ] UserCgroupIsolatorTest/2.ROOT_CGROUPS_UserCgroup, where > TypeParam = mesos::internal::slave::CgroupsPerfEventIsolatorProcess (2 ms) > [--] 1 test from UserCgroupIsolatorTest/2 (2 ms total) > [--] Global test environment tear-down > [==] 3 tests from 3 test cases ran. (349 ms total) > [ PASSED ] 0 tests. > [ FAILED ] 3 tests, listed below: > [ FAILED ] UserCgroupIsolatorTest/0.ROOT_CGROUPS_UserCgroup,
[jira] [Updated] (MESOS-4038) SlaveRecoveryTests, UserCgroupIsolatorTests fail on CentOS 6.6
[ https://issues.apache.org/jira/browse/MESOS-4038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Hindman updated MESOS-4038: Shepherd: Till Toenshoff Assignee: Jan Schlicht Sprint: Mesosphere Sprint 26 > SlaveRecoveryTests, UserCgroupIsolatorTests fail on CentOS 6.6 > -- > > Key: MESOS-4038 > URL: https://issues.apache.org/jira/browse/MESOS-4038 > Project: Mesos > Issue Type: Bug > Environment: CentOS 6.6 >Reporter: Greg Mann >Assignee: Jan Schlicht > Labels: mesosphere, test-failure > > All {{SlaveRecoveryTest.\*}} tests, > {{MesosContainerizerSlaveRecoveryTest.\*}} tests, and > {{UserCgroupIsolatorTest*}} tests fail on CentOS 6.6 with {{TypeParam = > mesos::internal::slave::MesosContainerizer}}. They all fail with the same > error: > {code} > [--] 1 test from SlaveRecoveryTest/0, where TypeParam = > mesos::internal::slave::MesosContainerizer > [ RUN ] SlaveRecoveryTest/0.ReconnectExecutor > ../../src/tests/mesos.cpp:722: Failure > cgroups::mount(hierarchy, subsystem): '/cgroup/perf_event' already exists in > the file system > - > We cannot run any cgroups tests that require > a hierarchy with subsystem 'perf_event' > because we failed to find an existing hierarchy > or create a new one (tried '/cgroup/perf_event'). > You can either remove all existing > hierarchies, or disable this test case > (i.e., --gtest_filter=-SlaveRecoveryTest/0.*). > - > ../../src/tests/mesos.cpp:776: Failure > cgroups: '/cgroup/perf_event' is not a valid hierarchy > [ FAILED ] SlaveRecoveryTest/0.ReconnectExecutor, where TypeParam = > mesos::internal::slave::MesosContainerizer (8 ms) > [--] 1 test from SlaveRecoveryTest/0 (9 ms total) > [--] Global test environment tear-down > [==] 1 test from 1 test case ran. (15 ms total) > [ PASSED ] 0 tests. > [ FAILED ] 1 test, listed below: > [ FAILED ] SlaveRecoveryTest/0.ReconnectExecutor, where TypeParam = > mesos::internal::slave::MesosContainerizer > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (MESOS-4262) Enable net_cls subsytem in cgroup infrastructure
[ https://issues.apache.org/jira/browse/MESOS-4262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Avinash Sridharan reassigned MESOS-4262: Assignee: Avinash Sridharan > Enable net_cls subsytem in cgroup infrastructure > > > Key: MESOS-4262 > URL: https://issues.apache.org/jira/browse/MESOS-4262 > Project: Mesos > Issue Type: Improvement > Components: containerization >Reporter: Avinash Sridharan >Assignee: Avinash Sridharan > > Currently the control group infrastructure within Mesos supports only the > memory and CPU subsystems. We need to enhance this infrastructure to support > the net_cls subsystem as well. Details of the net_cls subsystem and its > use-cases can be found here: > https://www.kernel.org/doc/Documentation/cgroups/net_cls.txt > Enabling the net_cls subsystem would potentially allow operators to > regulate framework traffic on a per-container basis. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4285) Mesos command task doesn't support volumes with image
Timothy Chen created MESOS-4285: --- Summary: Mesos command task doesn't support volumes with image Key: MESOS-4285 URL: https://issues.apache.org/jira/browse/MESOS-4285 Project: Mesos Issue Type: Bug Components: containerization Reporter: Timothy Chen Assignee: Timothy Chen Currently, volumes are stripped when an image is specified for a command task run with the Mesos containerizer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4285) Mesos command task doesn't support volumes with image
[ https://issues.apache.org/jira/browse/MESOS-4285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Chen updated MESOS-4285: Labels: mesosphere (was: ) > Mesos command task doesn't support volumes with image > - > > Key: MESOS-4285 > URL: https://issues.apache.org/jira/browse/MESOS-4285 > Project: Mesos > Issue Type: Bug > Components: containerization >Reporter: Timothy Chen >Assignee: Timothy Chen > Labels: mesosphere > > Currently, volumes are stripped when an image is specified for a command > task run with the Mesos containerizer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4102) Quota doesn't allocate resources on slave joining
[ https://issues.apache.org/jira/browse/MESOS-4102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van Remoortere updated MESOS-4102: Sprint: Mesosphere Sprint 26 > Quota doesn't allocate resources on slave joining > - > > Key: MESOS-4102 > URL: https://issues.apache.org/jira/browse/MESOS-4102 > Project: Mesos > Issue Type: Bug > Components: allocation >Reporter: Neil Conway >Assignee: Alexander Rukletsov > Labels: mesosphere, quota > Attachments: quota_absent_framework_test-1.patch > > > See attached patch. {{framework1}} is not allocated any resources, despite > the fact that the resources on {{agent2}} can safely be allocated to it > without risk of violating {{quota1}}. If I understand the intended quota > behavior correctly, this doesn't seem intended. > Note that if the framework is added _after_ the slaves are added, the > resources on {{agent2}} are allocated to {{framework1}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4058) Do not use `Resource.role` for resources in quota request
[ https://issues.apache.org/jira/browse/MESOS-4058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van Remoortere updated MESOS-4058: Sprint: Mesosphere Sprint 26 > Do not use `Resource.role` for resources in quota request > - > > Key: MESOS-4058 > URL: https://issues.apache.org/jira/browse/MESOS-4058 > Project: Mesos > Issue Type: Improvement > Components: master >Reporter: Alexander Rukletsov >Assignee: Alexander Rukletsov > Labels: mesosphere > > To be consistent with other operator endpoints and to adhere to the principle > of least surprise, move the role from each {{Resource}} in the quota set request to > the request itself. > {{Resource.role}} is used for reserved resources. Since quota is not a direct > reservation request, to avoid confusion we shall not reuse this field for > communicating the role for which quota should be reserved. > Food for thought: Shall we try to keep internal storage protobufs as close as > possible to the operator's JSON to provide some sort of a schema, or decouple > those two for the sake of flexibility? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3979) Replace `QuotaInfo` with `Quota` in allocator interface
[ https://issues.apache.org/jira/browse/MESOS-3979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van Remoortere updated MESOS-3979: Sprint: Mesosphere Sprint 26 > Replace `QuotaInfo` with `Quota` in allocator interface > --- > > Key: MESOS-3979 > URL: https://issues.apache.org/jira/browse/MESOS-3979 > Project: Mesos > Issue Type: Improvement > Components: allocation >Reporter: Alexander Rukletsov >Assignee: Alexander Rukletsov > Labels: mesosphere > > After introduction of C++ wrapper `Quota` for `QuotaInfo`, all allocator > methods using `QuotaInfo` should be updated. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4128) Refactor sorter factories in allocator and improve comments around them
[ https://issues.apache.org/jira/browse/MESOS-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van Remoortere updated MESOS-4128: Sprint: Mesosphere Sprint 26 > Refactor sorter factories in allocator and improve comments around them > --- > > Key: MESOS-4128 > URL: https://issues.apache.org/jira/browse/MESOS-4128 > Project: Mesos > Issue Type: Improvement > Components: allocation >Reporter: Alexander Rukletsov >Assignee: Alexander Rukletsov > Labels: mesosphere > > For clarity we want to refactor the factory section in the allocator and > explain the purpose (and necessity) of all sorters. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4281) Correctly handle disk quota usage when volumes are bind mounted into the container.
[ https://issues.apache.org/jira/browse/MESOS-4281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Artem Harutyunyan updated MESOS-4281: - Labels: mesosphere (was: mesospheree) > Correctly handle disk quota usage when volumes are bind mounted into the > container. > --- > > Key: MESOS-4281 > URL: https://issues.apache.org/jira/browse/MESOS-4281 > Project: Mesos > Issue Type: Bug >Reporter: Artem Harutyunyan >Assignee: Artem Harutyunyan > Labels: mesosphere > > In its current implementation disk quota enforcement on the task sandbox will > not work correctly when disk volumes are bind mounted into the task sandbox > (this happens when Linux filesystem isolator is used). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4214) Introduce HTTP endpoint /weights for updating weight
[ https://issues.apache.org/jira/browse/MESOS-4214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam B updated MESOS-4214: -- Story Points: 5 > Introduce HTTP endpoint /weights for updating weight > > > Key: MESOS-4214 > URL: https://issues.apache.org/jira/browse/MESOS-4214 > Project: Mesos > Issue Type: Task >Reporter: Yongqiao Wang >Assignee: Yongqiao Wang > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4191) Design doc for fixed point resources
[ https://issues.apache.org/jira/browse/MESOS-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neil Conway updated MESOS-4191: --- Shepherd: Joris Van Remoortere Sprint: Mesosphere Sprint 26 > Design doc for fixed point resources > > > Key: MESOS-4191 > URL: https://issues.apache.org/jira/browse/MESOS-4191 > Project: Mesos > Issue Type: Task > Components: master >Reporter: Neil Conway >Assignee: Neil Conway > Labels: mesosphere, resources > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2317) Remove deprecated checkpoint=false code
[ https://issues.apache.org/jira/browse/MESOS-2317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joerg Schad updated MESOS-2317: --- Sprint: Mesosphere Q1 Sprint 6 - 4/3, Mesosphere Q1 Sprint 7 - 4/17, Mesosphere Q2 Sprint 8 - 5/1, Mesosphere Q1 Sprint 9 - 5/15, Mesosphere Sprint 10, Mesosphere Sprint 11, Mesosphere Sprint 26 (was: Mesosphere Q1 Sprint 6 - 4/3, Mesosphere Q1 Sprint 7 - 4/17, Mesosphere Q2 Sprint 8 - 5/1, Mesosphere Q1 Sprint 9 - 5/15, Mesosphere Sprint 10, Mesosphere Sprint 11) > Remove deprecated checkpoint=false code > --- > > Key: MESOS-2317 > URL: https://issues.apache.org/jira/browse/MESOS-2317 > Project: Mesos > Issue Type: Epic >Affects Versions: 0.22.0 >Reporter: Adam B >Assignee: Joerg Schad > Labels: checkpoint, mesosphere > > Cody's plan from MESOS-444 was: > 1) -Make it so the flag can't be changed at the command line- > 2) -Remove the checkpoint variable entirely from slave/flags.hpp. This is a > fairly involved change since a number of unit tests depend on manually > setting the flag, as well as the default being non-checkpointing.- > 3) Remove logic around checkpointing in the slave > 4) Drop the flag from the SlaveInfo struct, remove logic inside the master > (Will require a deprecation cycle). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2317) Remove deprecated checkpoint=false code
[ https://issues.apache.org/jira/browse/MESOS-2317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joerg Schad updated MESOS-2317: --- Story Points: 3 > Remove deprecated checkpoint=false code > --- > > Key: MESOS-2317 > URL: https://issues.apache.org/jira/browse/MESOS-2317 > Project: Mesos > Issue Type: Epic >Affects Versions: 0.22.0 >Reporter: Adam B >Assignee: Joerg Schad > Labels: checkpoint, mesosphere > > Cody's plan from MESOS-444 was: > 1) -Make it so the flag can't be changed at the command line- > 2) -Remove the checkpoint variable entirely from slave/flags.hpp. This is a > fairly involved change since a number of unit tests depend on manually > setting the flag, as well as the default being non-checkpointing.- > 3) Remove logic around checkpointing in the slave > 4) Drop the flag from the SlaveInfo struct, remove logic inside the master > (Will require a deprecation cycle). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4240) Pull provisioner from linux filesystem isolator to Mesos containerizer.
[ https://issues.apache.org/jira/browse/MESOS-4240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gilbert Song updated MESOS-4240: Shepherd: Jie Yu Sprint: Mesosphere Sprint 26 > Pull provisioner from linux filesystem isolator to Mesos containerizer. > --- > > Key: MESOS-4240 > URL: https://issues.apache.org/jira/browse/MESOS-4240 > Project: Mesos > Issue Type: Task >Reporter: Jie Yu >Assignee: Gilbert Song > > The rationale behind this change is that many of the image specifications > (e.g., Docker/Appc) are not just for filesystems. They also specify runtime > configurations (e.g., environment variables, volumes, etc) for the container. > Provisioner should return those runtime configurations to the Mesos > containerizer and Mesos containerizer will delegate the isolation of those > runtime configurations to the relevant isolator. > Here is what it will look like eventually. We could do those changes in > phases: > 1) Provisioner will return a ProvisionInfo which includes a 'rootfs' and > image specific runtime configurations (could be the Docker/Appc manifest). > 2) Then, the Mesos containerizer will generate a ContainerConfig (a protobuf > which includes rootfs, sandbox, docker/appc manifest, similar to OCI's host > independent config.json) and pass that to each isolator in 'prepare'. Imagine > that in the future, a DockerRuntimeIsolator takes the docker manifest from > ContainerConfig and prepares the container. > 3) The isolator's prepare function will return a ContainerLaunchInfo > (contains environment variables, namespaces, etc.) which will be used by > Mesos containerizer to launch containers. Imagine that this information will be > passed to the launcher in the future. > We can do the renaming (ContainerPrepareInfo -> ContainerLaunchInfo) later. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
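To make the three phases in MESOS-4240 a bit more concrete, here is a minimal Python sketch of the data flow only. It is not Mesos code: the real types would be C++/protobuf, and the field layout, the `provision()` stand-in, the `"Env"` manifest key and the merging helper are assumptions made for illustration. Only the names ProvisionInfo, ContainerConfig, ContainerLaunchInfo and DockerRuntimeIsolator come from the ticket.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ProvisionInfo:
    """Phase 1: what the provisioner returns (fields are assumed)."""
    rootfs: str
    manifest: dict


@dataclass
class ContainerConfig:
    """Phase 2: what the containerizer passes to each isolator's prepare."""
    rootfs: str
    sandbox: str
    manifest: dict


@dataclass
class ContainerLaunchInfo:
    """Phase 3: what an isolator's prepare returns."""
    environment: Dict[str, str] = field(default_factory=dict)
    namespaces: List[str] = field(default_factory=list)


def provision(image_name: str) -> ProvisionInfo:
    # Hypothetical stand-in for the provisioner; the manifest is made up.
    return ProvisionInfo(
        rootfs=f"/provisioner/{image_name}/rootfs",
        manifest={"Env": ["PATH=/usr/bin", "LANG=C"]},
    )


class DockerRuntimeIsolator:
    """Hypothetical isolator that turns manifest entries into environment."""

    def prepare(self, config: ContainerConfig) -> ContainerLaunchInfo:
        env = dict(e.split("=", 1) for e in config.manifest.get("Env", []))
        return ContainerLaunchInfo(environment=env)


def launch_info_for(image_name: str, sandbox: str, isolators) -> ContainerLaunchInfo:
    """Run the three phases and merge what every isolator returned."""
    info = provision(image_name)
    config = ContainerConfig(info.rootfs, sandbox, info.manifest)
    merged = ContainerLaunchInfo()
    for isolator in isolators:
        prepared = isolator.prepare(config)
        merged.environment.update(prepared.environment)
        merged.namespaces.extend(prepared.namespaces)
    return merged
```

The point of the sketch is the separation of concerns: the provisioner only reports what the image says, and each isolator decides what part of that it turns into launch-time settings.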
[jira] [Updated] (MESOS-4272) DurationTest.Arithmetic performs inexact float calculation in test
[ https://issues.apache.org/jira/browse/MESOS-4272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Bannier updated MESOS-4272: Sprint: Mesosphere Sprint 26 > DurationTest.Arithmetic performs inexact float calculation in test > -- > > Key: MESOS-4272 > URL: https://issues.apache.org/jira/browse/MESOS-4272 > Project: Mesos > Issue Type: Bug > Components: test >Reporter: Benjamin Bannier >Assignee: Benjamin Bannier >Priority: Minor > > {{DurationTest.Arithmetic}} does a calculation with floating point values that are not exactly > representable and also performs an equality check, > {code} > EXPECT_EQ(Duration::create(3.3).get(), Seconds(10) * 0.33); > {code} > Here neither the value {{3.3}} nor {{0.33}} can be represented exactly as > a floating point number so the check might fail incorrectly (as it does e.g. > when compiling and executing the test under 32-bit on Debian8). > Instead we should just use exactly representable values to make sure the test > will succeed as long as the implementation behaves as expected. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
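The representability problem in MESOS-4272 can be shown outside of Mesos. The Python sketch below uses plain IEEE 754 doubles and does not reproduce the {{Duration}} arithmetic or the 32-bit failure itself; the helper name is hypothetical, and 0.25 and 2.5 are just example values that happen to be exactly representable.

```python
from fractions import Fraction


def exactly_representable(value: float, literal: str) -> bool:
    """True if the double `value` equals the decimal `literal` exactly."""
    return Fraction(value) == Fraction(literal)


# Neither 3.3 nor 0.33 has an exact binary representation, so an
# equality check between the two computations is fragile.
fragile_check = (10 * 0.33 == 3.3)

# 0.25 and 2.5 are sums of powers of two, so the same style of check
# is exact and therefore safe to use in a test.
robust_check = (10 * 0.25 == 2.5)

print(exactly_representable(3.3, "3.3"), exactly_representable(0.25, "0.25"))
print(fragile_check, robust_check)
```

Choosing test constants that pass `exactly_representable` is one way to follow the suggestion in the ticket without changing what the test is meant to verify.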
[jira] [Updated] (MESOS-4025) SlaveRecoveryTest/0.GCExecutor is flaky.
[ https://issues.apache.org/jira/browse/MESOS-4025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Hindman updated MESOS-4025: Sprint: Mesosphere Sprint 23, Mesosphere Sprint 26 (was: Mesosphere Sprint 23) > SlaveRecoveryTest/0.GCExecutor is flaky. > > > Key: MESOS-4025 > URL: https://issues.apache.org/jira/browse/MESOS-4025 > Project: Mesos > Issue Type: Bug > Components: test >Affects Versions: 0.26.0 >Reporter: Till Toenshoff >Assignee: Jan Schlicht > Labels: flaky, flaky-test, test > > Build was SSL enabled (--enable-ssl, --enable-libevent). The build was based > on 0.26.0-rc1. > Testsuite was run as root. > {noformat} > sudo ./bin/mesos-tests.sh --gtest_break_on_failure --gtest_repeat=-1 > {noformat} > {noformat} > [ RUN ] SlaveRecoveryTest/0.GCExecutor > I1130 16:49:16.336833 1032 exec.cpp:136] Version: 0.26.0 > I1130 16:49:16.345212 1049 exec.cpp:210] Executor registered on slave > dde9fd4e-b016-4a99-9081-b047e9df9afa-S0 > Registered executor on ubuntu14 > Starting task 22c63bba-cbf8-46fd-b23a-5409d69e4114 > sh -c 'sleep 1000' > Forked command at 1057 > ../../src/tests/mesos.cpp:779: Failure > (cgroups::destroy(hierarchy, cgroup)).failure(): Failed to remove cgroup > '/sys/fs/cgroup/memory/mesos_test_e5edb2a8-9af3-441f-b991-613082f264e2/slave': > Device or resource busy > *** Aborted at 1448902156 (unix time) try "date -d @1448902156" if you are > using GNU date *** > PC: @ 0x1443e9a testing::UnitTest::AddTestPartResult() > *** SIGSEGV (@0x0) received by PID 27364 (TID 0x7f1bfdd2b800) from PID 0; > stack trace: *** > @ 0x7f1be92b80b7 os::Linux::chained_handler() > @ 0x7f1be92bc219 JVM_handle_linux_signal > @ 0x7f1bf7bbc340 (unknown) > @ 0x1443e9a testing::UnitTest::AddTestPartResult() > @ 0x1438b99 testing::internal::AssertHelper::operator=() > @ 0xf0b3bb > mesos::internal::tests::ContainerizerTest<>::TearDown() > @ 0x1461882 > testing::internal::HandleSehExceptionsInMethodIfSupported<>() > @ 0x145c6f8 > 
testing::internal::HandleExceptionsInMethodIfSupported<>() > @ 0x143de4a testing::Test::Run() > @ 0x143e584 testing::TestInfo::Run() > @ 0x143ebca testing::TestCase::Run() > @ 0x1445312 testing::internal::UnitTestImpl::RunAllTests() > @ 0x14624a7 > testing::internal::HandleSehExceptionsInMethodIfSupported<>() > @ 0x145d26e > testing::internal::HandleExceptionsInMethodIfSupported<>() > @ 0x14440ae testing::UnitTest::Run() > @ 0xd15cd4 RUN_ALL_TESTS() > @ 0xd158c1 main > @ 0x7f1bf7808ec5 (unknown) > @ 0x913009 (unknown) > {noformat} > My Vagrantfile generator; > {noformat} > #!/usr/bin/env bash > cat << EOF > Vagrantfile > # -*- mode: ruby -*-" > > # vi: set ft=ruby : > Vagrant.configure(2) do |config| > # Disable shared folder to prevent certain kernel module dependencies. > config.vm.synced_folder ".", "/vagrant", disabled: true > config.vm.box = "bento/ubuntu-14.04" > config.vm.hostname = "${PLATFORM_NAME}" > config.vm.provider "virtualbox" do |vb| > vb.memory = ${VAGRANT_MEM} > vb.cpus = ${VAGRANT_CPUS} > vb.customize ["modifyvm", :id, "--nictype1", "virtio"] > vb.customize ["modifyvm", :id, "--natdnshostresolver1", "on"] > vb.customize ["modifyvm", :id, "--natdnsproxy1", "on"] > end > config.vm.provider "vmware_fusion" do |vb| > vb.memory = ${VAGRANT_MEM} > vb.cpus = ${VAGRANT_CPUS} > end > config.vm.provision "file", source: "../test.sh", destination: "~/test.sh" > config.vm.provision "shell", inline: <<-SHELL > sudo apt-get update > sudo apt-get -y install openjdk-7-jdk autoconf libtool > sudo apt-get -y install build-essential python-dev python-boto \ > libcurl4-nss-dev libsasl2-dev maven \ > libapr1-dev libsvn-dev libssl-dev libevent-dev > sudo apt-get -y install git > sudo wget -qO- https://get.docker.com/ | sh > SHELL > end > EOF > {noformat} > The problem is kicking in frequently in my tests - I'ld say > 10% but less > than 50%. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3307) Configurable size of completed task / framework history
[ https://issues.apache.org/jira/browse/MESOS-3307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam B updated MESOS-3307: -- Reporter: Ian Babrou (was: Kevin Klues) > Configurable size of completed task / framework history > --- > > Key: MESOS-3307 > URL: https://issues.apache.org/jira/browse/MESOS-3307 > Project: Mesos > Issue Type: Bug >Reporter: Ian Babrou > Labels: mesosphere > > We are trying to make Mesos work with multiple frameworks and mesos-dns at the same > time. The goal is to have a set of frameworks per team / project on a single > Mesos cluster. > At this point our mesos state.json is at 4 MB and it takes a while to > assemble. 5 mesos-dns instances hit state.json every 5 seconds, effectively > pushing mesos-master CPU usage through the roof. It's at 100%+ all the time. > Here's the problem: > {noformat} > mesos λ curl -s http://mesos-master:5050/master/state.json | jq > .frameworks[].completed_tasks[].framework_id | sort | uniq -c | sort -n >1 "20150606-001827-252388362-5050-5982-0003" > 16 "20150606-001827-252388362-5050-5982-0005" > 18 "20150606-001827-252388362-5050-5982-0029" > 73 "20150606-001827-252388362-5050-5982-0007" > 141 "20150606-001827-252388362-5050-5982-0009" > 154 "20150820-154817-302720010-5050-15320-" > 289 "20150606-001827-252388362-5050-5982-0004" > 510 "20150606-001827-252388362-5050-5982-0012" > 666 "20150606-001827-252388362-5050-5982-0028" > 923 "20150116-002612-269165578-5050-32204-0003" > 1000 "20150606-001827-252388362-5050-5982-0001" > 1000 "20150606-001827-252388362-5050-5982-0006" > 1000 "20150606-001827-252388362-5050-5982-0010" > 1000 "20150606-001827-252388362-5050-5982-0011" > 1000 "20150606-001827-252388362-5050-5982-0027" > mesos λ fgrep 1000 -r src/master > src/master/constants.cpp:const size_t MAX_REMOVED_SLAVES = 10; > src/master/constants.cpp:const uint32_t MAX_COMPLETED_TASKS_PER_FRAMEWORK = > 1000; > {noformat} > Active tasks are just 6% of the state.json response: > {noformat} > mesos λ cat
~/temp/mesos-state.json | jq -c . | wc >1 14796 4138942 > mesos λ cat ~/temp/mesos-state.json | jq .frameworks[].tasks | jq -c . | wc > 16 37 252774 > {noformat} > I see four options that can improve the situation: > 1. Add query string param to exclude completed tasks from state.json and use > it in mesos-dns and similar tools. There is no need for mesos-dns to know > about completed tasks, it's just extra load on master and mesos-dns. > 2. Make history size configurable. > 3. Make JSON serialization faster. With 1s of tasks even without history > it would take a lot of time to serialize tasks for mesos-dns. Doing it every > 60 seconds instead of every 5 seconds isn't really an option. > 4. Create event bus for mesos master. Marathon has it and it'd be nice to > have it in Mesos. This way mesos-dns could avoid polling master state and > switch to listening for events. > All can be done independently. > Note to mesosphere folks: please start distributing debug symbols with your > distribution. I was asking for it for a while and it is really helpful: > https://github.com/mesosphere/marathon/issues/1497#issuecomment-104182501 > Perf report for leading master: > !http://i.imgur.com/iz7C3o0.png! > I'm on 0.23.0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
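The "6%" figure in MESOS-3307 follows from the two `wc` character counts (252774 of 4138942). The Python sketch below redoes that arithmetic and shows, on the client side, how much a payload shrinks once completed tasks are dropped. It is only an approximation of the `jq` pipeline, the helper names are hypothetical, and it is not the proposed master-side query parameter; it assumes a state dict shaped like the `frameworks[].tasks` / `frameworks[].completed_tasks` paths used in the `jq` commands.

```python
import json


def compact_size(obj) -> int:
    """Length of the compact JSON encoding, roughly what `jq -c . | wc -c` counts."""
    return len(json.dumps(obj, separators=(",", ":")))


def active_task_share(state: dict) -> float:
    """Fraction of the compact payload taken by active (not completed) tasks."""
    active = sum(compact_size(fw.get("tasks", []))
                 for fw in state.get("frameworks", []))
    return active / compact_size(state)


def without_completed_tasks(state: dict) -> dict:
    """Copy of the state with every framework's completed_tasks removed."""
    slim = dict(state)
    slim["frameworks"] = [
        {key: value for key, value in fw.items() if key != "completed_tasks"}
        for fw in state.get("frameworks", [])
    ]
    return slim


# Share computed from the character counts reported in the ticket.
reported_share = 252774 / 4138942
print(f"{reported_share:.1%}")
```

A consumer such as mesos-dns would still have to download the full response with this approach, which is why the ticket argues for filtering or limiting history on the master instead.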
[jira] [Commented] (MESOS-3809) Expose advertise_ip and advertise_port as command line options in mesos slave
[ https://issues.apache.org/jira/browse/MESOS-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15080861#comment-15080861 ] Bernd Mathiske commented on MESOS-3809: --- Unfortunately, this commit was indeed not cherry-picked into 0.26.0, but should have been, and the ticket shows up in the CHANGELOG. I'll update the CHANGELOG for 0.26.0, removing MESOS-3809 from it, and set the target version for this ticket to 0.27.0. > Expose advertise_ip and advertise_port as command line options in mesos slave > - > > Key: MESOS-3809 > URL: https://issues.apache.org/jira/browse/MESOS-3809 > Project: Mesos > Issue Type: Bug > Components: slave >Affects Versions: 0.25.0 >Reporter: Anindya Sinha >Assignee: Anindya Sinha >Priority: Minor > Labels: mesosphere > Fix For: 0.26.0 > > > advertise_ip and advertise_port are exposed as mesos master command line args > (MESOS-809). But the following use case makes it a candidate for adding as > command line args in mesos slave as well. > On Tue, Oct 27, 2015 at 7:43 PM, Xiaodong Zhang wrote: > It works! Thanks a lot. > From: haosdent > Reply-To: "u...@mesos.apache.org" > Date: Wednesday, October 28, 2015, 10:23 AM > To: user > Subject: Re: How to tell master which ip to connect. > Did you try `export LIBPROCESS_ADVERTISE_IP=xxx` and > `LIBPROCESS_ADVERTISE_PORT` when starting the slave? > On Wed, Oct 28, 2015 at 10:16 AM, Xiaodong Zhang wrote: > Hi team: > My scenario is like this: > My master nodes are deployed in AWS. My slaves are in Azure, so they > communicate via public IP. > I ran into trouble when the slaves try to register with the master. > The slaves can get the master’s public IP address and can send a register > request, but they can only send their private IP to the master (because they don’t > know their public IP, they cannot bind a public IP via the --ip flag), > so the master can’t connect to the slaves. How can a slave tell the master which IP > the master should connect to? (I can’t find any flag like --advertise_ip in master). 
[jira] [Commented] (MESOS-1478) Replace Master/Slave terminology
[ https://issues.apache.org/jira/browse/MESOS-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15080881#comment-15080881 ] Adam B commented on MESOS-1478: --- The email thread I started was intended as a discussion thread to gauge the sentiment of the community. As a code change, the final say belongs to the Apache Mesos PMC (not Mesosphere), and we had a private@ discussion/vote to use "Agent" in the new 1.0 HTTP API. The old "slave" binaries/APIs will continue to work for the foreseeable future (perhaps until Mesos 2.0). [~benjaminhindman] and [~darroyo] can provide an update on the progress/approach. See also the issues under this epic. > Replace Master/Slave terminology > > > Key: MESOS-1478 > URL: https://issues.apache.org/jira/browse/MESOS-1478 > Project: Mesos > Issue Type: Epic >Reporter: Clark Breyman >Assignee: Benjamin Hindman >Priority: Minor > Labels: mesosphere > > Inspired by the comments on this PR: > https://github.com/django/django/pull/2692 > TL;DR - Computers sharing work should be a good thing. Using the language of > human bondage and suffering is inappropriate in this context. It also has the > potential to alienate users and community members. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3809) Expose advertise_ip and advertise_port as command line options in mesos slave
[ https://issues.apache.org/jira/browse/MESOS-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15080903#comment-15080903 ] Harry Metske commented on MESOS-3809: - Thanks for the quick response. kind regards, Harry > Expose advertise_ip and advertise_port as command line options in mesos slave > - > > Key: MESOS-3809 > URL: https://issues.apache.org/jira/browse/MESOS-3809 > Project: Mesos > Issue Type: Bug > Components: slave >Affects Versions: 0.25.0 >Reporter: Anindya Sinha >Assignee: Anindya Sinha >Priority: Minor > Labels: mesosphere > > advertise_ip and advertise_port are exposed as mesos master command line args > (MESOS-809). But the following use case makes it a candidate for adding as > command line args in mesos slave as well. > On Tue, Oct 27, 2015 at 7:43 PM, Xiaodong Zhang wrote: > It works! Thanks a lot. > From: haosdent > Reply-To: "u...@mesos.apache.org" > Date: Wednesday, October 28, 2015, 10:23 AM > To: user > Subject: Re: How to tell master which ip to connect. > Did you try `export LIBPROCESS_ADVERTISE_IP=xxx` and > `LIBPROCESS_ADVERTISE_PORT` when starting the slave? > On Wed, Oct 28, 2015 at 10:16 AM, Xiaodong Zhang wrote: > Hi team: > My scenario is like this: > My master nodes are deployed in AWS and my slaves are in Azure, so they > communicate via public IP. > I ran into trouble when the slaves try to register with the master. > The slaves can get the master’s public IP address and can send the register > request, but they can only send their private IP to the master (because they don’t > know their public IP, they cannot bind a public IP via the --ip flag), > so the master can’t connect to the slaves. How can a slave tell the master which IP the > master should connect to? (I can’t find any flags like --advertise_ip in master.)
[jira] [Commented] (MESOS-3933) Use a simpler realm for "Unauthorized" HTTP responses.
[ https://issues.apache.org/jira/browse/MESOS-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15080900#comment-15080900 ] Abhishek Dasgupta commented on MESOS-3933: -- Can you be more specific about how to produce an "Unauthorized" response that includes "Mesos master"? Could you please provide test request data and a URL? > Use a simpler realm for "Unauthorized" HTTP responses. > -- > > Key: MESOS-3933 > URL: https://issues.apache.org/jira/browse/MESOS-3933 > Project: Mesos > Issue Type: Bug > Components: HTTP API >Reporter: Jan Schlicht >Priority: Trivial > Labels: easyfix, newbie > > Currently, if an HTTP request cannot be authorized, an {{Unauthorized}} > response is returned using "Mesos master" for the {{realm}} parameter. While > not strictly forbidden by the HTTP RFC, strings with spaces seem to be very > uncommon for the {{realm}} parameter. A simpler realm such as "Mesos" should > be used instead.
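For context, the {{realm}} parameter travels in the `WWW-Authenticate` header of a 401 response. The following is a minimal sketch, assuming the HTTP Basic scheme and a hypothetical helper named `basicChallenge`; it is not taken from the Mesos sources and only shows how the two realm strings discussed in the ticket would appear on the wire.

```cpp
#include <string>

// Builds the value of a WWW-Authenticate header for the HTTP Basic
// scheme. The realm is sent as a quoted string, so a realm containing
// a space is legal and is transmitted verbatim.
// Simplification: quotes and backslashes inside the realm are not
// escaped here; the inputs in this sketch do not contain any.
inline std::string basicChallenge(const std::string& realm) {
  return "Basic realm=\"" + realm + "\"";
}
```

Under these assumptions, `basicChallenge("Mesos master")` yields `Basic realm="Mesos master"`, while the proposed change would yield `Basic realm="Mesos"`.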
[jira] [Created] (MESOS-4276) Remove duplicate Mesos constructor
Benjamin Bannier created MESOS-4276: --- Summary: Remove duplicate Mesos constructor Key: MESOS-4276 URL: https://issues.apache.org/jira/browse/MESOS-4276 Project: Mesos Issue Type: Improvement Components: technical debt Reporter: Benjamin Bannier Assignee: Benjamin Bannier {{Mesos}} offers two almost-identical constructors {code} // TODO(vinod): Remove this in favor of the below constructor. Mesos(const std::string& master, const std::function& connected, const std::function & disconnected, const std::function & received); Mesos(const std::string& master, ContentType contentType, const std::function & connected, const std::function & disconnected, const std::function & received); {code} Here invocations of the first constructor can be replaced trivially with invocations of the second one with {{contentType = ContentType::PROTOBUF}}.
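To illustrate why the replacement is trivial, here is a small self-contained sketch in which the shorter constructor simply delegates to the longer one with `ContentType::PROTOBUF`. `SketchMesos`, the stand-in `ContentType` enum, and in particular the callback signatures are assumptions made for this example, since the template arguments of `std::function` are not visible in the snippet quoted above.

```cpp
#include <functional>
#include <string>

// All names below are stand-ins for illustration; they only mirror the
// shape of the two constructors quoted in the ticket.
enum class ContentType { PROTOBUF, JSON };

class SketchMesos {
public:
  // Shorter signature: a C++11 delegating constructor that forwards to
  // the full constructor with the content type fixed to PROTOBUF.
  SketchMesos(const std::string& master,
              const std::function<void()>& connected,
              const std::function<void()>& disconnected,
              const std::function<void(const std::string&)>& received)
    : SketchMesos(master, ContentType::PROTOBUF,
                  connected, disconnected, received) {}

  // Full signature: the only constructor that initializes members.
  SketchMesos(const std::string& master,
              ContentType contentType,
              const std::function<void()>& connected,
              const std::function<void()>& disconnected,
              const std::function<void(const std::string&)>& received)
    : master_(master),
      contentType_(contentType),
      connected_(connected),
      disconnected_(disconnected),
      received_(received) {}

  const std::string& master() const { return master_; }
  ContentType contentType() const { return contentType_; }

private:
  std::string master_;
  ContentType contentType_;
  std::function<void()> connected_;
  std::function<void()> disconnected_;
  std::function<void(const std::string&)> received_;
};
```

Because the two forms construct equivalent objects, each call site of the shorter constructor can be rewritten to pass `ContentType::PROTOBUF` explicitly, after which the shorter constructor can be deleted.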
[jira] [Created] (MESOS-4277) Provide constexpr Duration::min() and max()
Benjamin Bannier created MESOS-4277: --- Summary: Provide constexpr Duration::min() and max() Key: MESOS-4277 URL: https://issues.apache.org/jira/browse/MESOS-4277 Project: Mesos Issue Type: Improvement Components: stout Reporter: Benjamin Bannier Assignee: Benjamin Bannier Priority: Minor {{Duration}} could be implemented so that it can provide {{constexpr}} {{min}} and {{max}} functions. This addresses an existing {{TODO}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
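As a rough sketch of what the ticket proposes, the class below shows a duration-like type whose `min()` and `max()` are usable in constant expressions. `SketchDuration` is a hypothetical, heavily simplified stand-in, and the signed 64-bit nanosecond representation is an assumption of this example rather than a statement about stout's `Duration`.

```cpp
#include <cstdint>
#include <limits>

// Simplified stand-in for a duration type. It stores a signed
// nanosecond count and exposes constexpr bounds.
class SketchDuration {
public:
  constexpr SketchDuration() : nanos_(0) {}

  // Smallest and largest representable values, computable at compile time.
  static constexpr SketchDuration min() {
    return SketchDuration(std::numeric_limits<std::int64_t>::min());
  }
  static constexpr SketchDuration max() {
    return SketchDuration(std::numeric_limits<std::int64_t>::max());
  }

  constexpr std::int64_t ns() const { return nanos_; }

private:
  explicit constexpr SketchDuration(std::int64_t nanos) : nanos_(nanos) {}

  std::int64_t nanos_;
};

// Because min() and max() are constexpr, their relationship can be
// verified by the compiler instead of at run time.
static_assert(SketchDuration::min().ns() < SketchDuration::max().ns(),
              "min() must be smaller than max()");
```

The practical benefit is that such bounds can then appear in `static_assert`s and in the initializers of other `constexpr` objects.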
[jira] [Updated] (MESOS-4271) Consider replacing libtool with dolt to speed up build
[ https://issues.apache.org/jira/browse/MESOS-4271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Toenshoff updated MESOS-4271: -- Shepherd: Till Toenshoff > Consider replacing libtool with dolt to speed up build > -- > > Key: MESOS-4271 > URL: https://issues.apache.org/jira/browse/MESOS-4271 > Project: Mesos > Issue Type: Improvement >Reporter: Benjamin Bannier >Assignee: Benjamin Bannier >Priority: Minor > Labels: build > > Mesos uses a pretty standard autotools setup for the build so that > {{libtool}} is used extensively to abstract away the aspects of library > creation (both compiling source files, and creating the libraries). For some > versions of {{libtool}} its invocation can add considerably to the overall > build time. > Dolt provides a much more condensed implementation of {{libtool}}'s > functionality for modern platforms (<100 locs vs ~10 klocs), so that it can > run much faster. We should investigate whether activating dolt makes sense. > I tested dolt under OS X 10.10.5. I first primed ccache and then rebuilt > mesos-related objects, > {code} > ./configure --disable-python --disable-java # benchmark mostly C & C++ file > compile and link > make check GTEST_FILTER='' # prime ccache > make mostlyclean # remove most mesos objects and > libs > make -jN check GTEST_FILTER='' # rebuild > {code} > ||| user [s] | real [s]| sys [s]|| > | make -j10 (dolt)| 42.8±0.1 | 54.3±0.2 | 34.1±0.2 | > | make -j10 (libtool) | 65.6±0.3 | 148.7±1.1 | 108.5±1.0 | > | make -j1 (dolt) | 76.9±0.3 | 45.5±0.1 | 27.1±0.1 | > | make -j1 (libtool) | 168.2±2.3 | 97.5±1.5 | 75.8±1.3 | -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4271) Consider replacing libtool with dolt to speed up build
[ https://issues.apache.org/jira/browse/MESOS-4271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15081025#comment-15081025 ] Till Toenshoff commented on MESOS-4271: --- Thanks a bunch for measuring @bbannier. Given that this introduces another, new tool in our build-chain (even though there is a fallback to libtool), I guess we should make sure others have a chance to comment here as well - either with their results or other discussion points. > Consider replacing libtool with dolt to speed up build > -- > > Key: MESOS-4271 > URL: https://issues.apache.org/jira/browse/MESOS-4271 > Project: Mesos > Issue Type: Improvement >Reporter: Benjamin Bannier >Assignee: Benjamin Bannier >Priority: Minor > Labels: build > > Mesos uses a pretty standard autotools setup for the build so that > {{libtool}} is used extensively to abstract away the aspects of library > creation (both compiling source files, and creating the libraries). For some > versions of {{libtool}} its invocation can add considerably to the overall > build time. > Dolt provides a much more condensed implementation of {{libtool}}'s > functionality for modern platforms (<100 locs vs ~10 klocs), so that it can > run much faster. We should investigate whether activating dolt makes sense. > I tested dolt under OS X 10.10.5. I first primed ccache and then rebuilt > mesos-related objects, > {code} > ./configure --disable-python --disable-java # benchmark mostly C & C++ file > compile and link > make check GTEST_FILTER='' # prime ccache > make mostlyclean # remove most mesos objects and > libs > make -jN check GTEST_FILTER='' # rebuild > {code} > ||| user [s] | real [s]| sys [s]|| > | make -j10 (dolt)| 42.8±0.1 | 54.3±0.2 | 34.1±0.2 | > | make -j10 (libtool) | 65.6±0.3 | 148.7±1.1 | 108.5±1.0 | > | make -j1 (dolt) | 76.9±0.3 | 45.5±0.1 | 27.1±0.1 | > | make -j1 (libtool) | 168.2±2.3 | 97.5±1.5 | 75.8±1.3 | -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4274) libprocess build fail with libhttp-parser >= 2.0
Jocelyn De La Rosa created MESOS-4274: - Summary: libprocess build fail with libhttp-parser >= 2.0 Key: MESOS-4274 URL: https://issues.apache.org/jira/browse/MESOS-4274 Project: Mesos Issue Type: Bug Components: build, libprocess Affects Versions: 0.26.0 Environment: Debian 8 with package {{libhttp-parser-dev}} installed and libprocess configured with {{--disable-bundled}} Reporter: Jocelyn De La Rosa Priority: Minor Since Mesos 0.26, libprocess does not compile if the libhttp-parser version is >= 2.0. I traced the issue back to commit {{d347bf56c807d}}, which added the URL to {{http::Request}} but did not update the {{#if HTTP_PARSER_VERSION_MAJOR >= 2}} parts in {{3rdparty/libprocess/src/decoder.hpp}}.
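The failure mode described here is typical of version-gated code: a change is made in one preprocessor branch and the other branch is left behind. The sketch below illustrates the pattern with hypothetical names only (`SKETCH_PARSER_VERSION_MAJOR`, `SketchRequest`, `onUrl`, `onPath`); it is not the code from `decoder.hpp` and makes no claim about the actual callbacks involved.

```cpp
#include <string>

// Stand-in for the major-version macro that a parser header would
// define; it is defined here only so that the sketch compiles alone.
#ifndef SKETCH_PARSER_VERSION_MAJOR
#define SKETCH_PARSER_VERSION_MAJOR 2
#endif

// Hypothetical request type with a newly added `url` field.
struct SketchRequest {
  std::string url;
  std::string path;
};

#if SKETCH_PARSER_VERSION_MAJOR >= 2
// Branch for newer parser versions: one callback receives URL data.
// When the `url` field is introduced, this branch must fill it too.
inline void onUrl(SketchRequest& request, const std::string& data) {
  request.url += data;
}
#else
// Branch for older parser versions: the path arrives separately.
// Updating only this branch leaves the other one out of date.
inline void onPath(SketchRequest& request, const std::string& data) {
  request.path += data;
  request.url += data;
}
#endif
```

Building and testing against both sides of such a gate, for example once with the bundled dependency and once with the system package, is one way to catch a branch that was not updated.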
[jira] [Updated] (MESOS-4276) Remove duplicate Mesos constructor
[ https://issues.apache.org/jira/browse/MESOS-4276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Toenshoff updated MESOS-4276: -- Shepherd: Till Toenshoff > Remove duplicate Mesos constructor > - > > Key: MESOS-4276 > URL: https://issues.apache.org/jira/browse/MESOS-4276 > Project: Mesos > Issue Type: Improvement > Components: technical debt >Reporter: Benjamin Bannier >Assignee: Benjamin Bannier > > {{Mesos}} offers two almost-identical constructors > {code} > // TODO(vinod): Remove this in favor of the below constructor. > Mesos(const std::string& master, > const std::function& connected, > const std::function & disconnected, > const std::function & received); > Mesos(const std::string& master, > ContentType contentType, > const std::function & connected, > const std::function & disconnected, > const std::function & received); > {code} > Here invocations of the first constructor can be replaced trivially with > invocations of the second one with {{contentType = ContentType::PROTOBUF}}.
[jira] [Updated] (MESOS-4277) Provide constexpr Duration::min() and max()
[ https://issues.apache.org/jira/browse/MESOS-4277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Toenshoff updated MESOS-4277: -- Shepherd: Till Toenshoff > Provide constexpr Duration::min() and max() > --- > > Key: MESOS-4277 > URL: https://issues.apache.org/jira/browse/MESOS-4277 > Project: Mesos > Issue Type: Improvement > Components: stout, technical debt >Reporter: Benjamin Bannier >Assignee: Benjamin Bannier >Priority: Minor > > {{Duration}} could be implemented so that it can provide {{constexpr}} > {{min}} and {{max}} functions. > This addresses an existing {{TODO}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (MESOS-4271) Consider replacing libtool with dolt to speed up build
[ https://issues.apache.org/jira/browse/MESOS-4271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15081025#comment-15081025 ] Till Toenshoff edited comment on MESOS-4271 at 1/4/16 11:21 AM: Thanks a bunch for measuring [~bbannier]. Given that this introduces another, new tool in our build-chain (even though there is a fallback to libtool), I guess we should make sure others have a chance to comment here as well - either with their results or other discussion points. was (Author: tillt): Thanks a bunch for measuring @bbannier. Given that this introduces another, new tool in our build-chain (even though there is a fallback to libtool), I guess we should make sure others have a chance to comment here as well - either with their results or other discussion points. > Consider replacing libtool with dolt to speed up build > -- > > Key: MESOS-4271 > URL: https://issues.apache.org/jira/browse/MESOS-4271 > Project: Mesos > Issue Type: Improvement >Reporter: Benjamin Bannier >Assignee: Benjamin Bannier >Priority: Minor > Labels: build > > Mesos uses a pretty standard autotools setup for the build so that > {{libtool}} is used extensively to abstract away the aspects of library > creation (both compiling source files, and creating the libraries). For some > versions of {{libtool}} its invocation can add considerably to the overall > build time. > Dolt provides a much more condensed implementation of {{libtool}}'s > functionality for modern platforms (<100 locs vs ~10 klocs), so that it can > run much faster. We should investigate whether activating dolt makes sense. > I tested dolt under OS X 10.10.5. 
I first primed ccache and then rebuilt > mesos-related objects, > {code} > ./configure --disable-python --disable-java # benchmark mostly C & C++ file > compile and link > make check GTEST_FILTER='' # prime ccache > make mostlyclean # remove most mesos objects and > libs > make -jN check GTEST_FILTER='' # rebuild > {code} > ||| user [s] | real [s]| sys [s]|| > | make -j10 (dolt)| 42.8±0.1 | 54.3±0.2 | 34.1±0.2 | > | make -j10 (libtool) | 65.6±0.3 | 148.7±1.1 | 108.5±1.0 | > | make -j1 (dolt) | 76.9±0.3 | 45.5±0.1 | 27.1±0.1 | > | make -j1 (libtool) | 168.2±2.3 | 97.5±1.5 | 75.8±1.3 | -- This message was sent by Atlassian JIRA (v6.3.4#6332)