[jira] [Commented] (MESOS-5340) libevent builds may prevent new connections
[ https://issues.apache.org/jira/browse/MESOS-5340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15322899#comment-15322899 ] Benjamin Mahler commented on MESOS-5340:

Test committed:
{noformat}
commit 771ecf45382c09eee91539adb406657f71d84abb
Author: Benjamin Mahler
Date:   Fri May 13 14:39:46 2016 -0700

    Added a test for the SSL head-of-line blocking issue in MESOS-5340.

    Review: https://reviews.apache.org/r/47362
{noformat}

> libevent builds may prevent new connections
> -------------------------------------------
>
>                 Key: MESOS-5340
>                 URL: https://issues.apache.org/jira/browse/MESOS-5340
>             Project: Mesos
>          Issue Type: Bug
>          Components: security
>    Affects Versions: 1.0.0
>            Reporter: Till Toenshoff
>            Assignee: Benjamin Mahler
>            Priority: Blocker
>              Labels: mesosphere, security, ssl
>             Fix For: 1.0.0
>
> When using an SSL-enabled build of Mesos in combination with SSL-downgrading
> support, any connection that does not actually transmit data will hang the
> runnable (e.g. master).
> To reproduce the issue (on any platform)...
> Spin up a master with SSL-downgrading enabled:
> {noformat}
> $ export SSL_ENABLED=true
> $ export SSL_SUPPORT_DOWNGRADE=true
> $ export SSL_KEY_FILE=/path/to/your/foo.key
> $ export SSL_CERT_FILE=/path/to/your/foo.crt
> $ export SSL_CA_FILE=/path/to/your/ca.crt
> $ ./bin/mesos-master.sh --work_dir=/tmp/foo
> {noformat}
> Create some artificial HTTP request load to quickly spot the problem in
> both the master logs and the output of curl itself:
> {noformat}
> $ while true; do sleep 0.1; echo $( date +">%H:%M:%S.%3N"; curl -s -k -A "SSL Debug" http://localhost:5050/master/slaves; echo; date +"<%H:%M:%S.%3N"; echo); done
> {noformat}
> Now create a connection to the master that does not transmit any data:
> {noformat}
> $ telnet localhost 5050
> {noformat}
> You should now see the curl requests hanging; the master stops responding to
> new connections.
This will persist until either some data is transmitted via
> the above telnet connection or it is closed.
>
> This problem was initially observed when running Mesos on an AWS cluster
> with a load balancer enabled for the master node (the load balancer uses an
> idle, persistent connection). Such a connection naturally does not transmit
> any data as long as no external requests are routed via the load balancer.
> AWS allows setting up a timeout for those connections; in our test
> environment this duration was set to 60 seconds, and hence we saw our master
> repeatedly becoming unresponsive for 60 seconds, then getting "unstuck" for
> a brief period until it got stuck again.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
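The failure mode above (one idle connection stalling every request behind it) can be modeled in a few lines. The following is a hedged Python sketch, not libevent or Mesos code: a toy single-threaded server that waits for the first byte of each accepted connection (standing in for SSL-vs-plaintext downgrade detection) before it will accept the next one. The server, port, and byte-peek protocol are all illustrative assumptions.

```python
import socket
import threading
import time

def downgrade_server(listener, served):
    # Single-threaded accept loop that "peeks" at the first byte of each
    # connection before accepting the next one -- the head-of-line block.
    while True:
        conn, _ = listener.accept()
        first = conn.recv(1)            # blocks until this client sends data
        if first:
            conn.sendall(b"HTTP/1.0 200 OK\r\n\r\n")
        conn.close()
        served.append(time.monotonic())

listener = socket.socket()
listener.bind(("127.0.0.1", 0))
listener.listen(8)
port = listener.getsockname()[1]
served = []
threading.Thread(target=downgrade_server, args=(listener, served),
                 daemon=True).start()

idle = socket.create_connection(("127.0.0.1", port))   # like `telnet localhost 5050`

curl = socket.create_connection(("127.0.0.1", port))   # like the curl loop
curl.sendall(b"GET /master/slaves HTTP/1.0\r\n\r\n")
curl.settimeout(0.5)
try:
    reply = curl.recv(64)       # hangs: the server sits in recv() on `idle`
except socket.timeout:
    reply = b""                 # request stalled behind the idle connection

idle.sendall(b"x")              # the idle connection finally transmits...
curl.settimeout(5)
unblocked = curl.recv(64)       # ...and the stalled request completes
```

As in the telnet reproduction, the stalled request completes only once the idle connection transmits (or closes).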
[jira] [Commented] (MESOS-4279) Docker executor truncates task's output when the task is killed.
[ https://issues.apache.org/jira/browse/MESOS-4279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15321312#comment-15321312 ] Benjamin Mahler commented on MESOS-4279:

I posted two fixes related to this ticket.

The first is to send terminal status updates in the same manner as the command executor:
https://reviews.apache.org/r/48428/

The second is to eliminate the killing of the 'docker run' subprocess, which breaks the log redirection:
https://reviews.apache.org/r/48429/

Let me know if you have any feedback; [~jieyu] kindly agreed to review.

> Docker executor truncates task's output when the task is killed.
> ----------------------------------------------------------------
>
>                 Key: MESOS-4279
>                 URL: https://issues.apache.org/jira/browse/MESOS-4279
>             Project: Mesos
>          Issue Type: Bug
>          Components: containerization, docker
>    Affects Versions: 0.25.0, 0.26.0, 0.27.2, 0.28.1
>            Reporter: Martin Bydzovsky
>            Assignee: Benjamin Mahler
>            Priority: Critical
>              Labels: docker, mesosphere
>             Fix For: 1.0.0
>
> I'm implementing graceful restarts of our mesos-marathon-docker setup and I
> came across the following issue:
> (it was already discussed on
> https://github.com/mesosphere/marathon/issues/2876 and the folks from
> Mesosphere got to a point that it's probably a docker containerizer
> problem...)
> To sum it up:
> When I deploy a simple Python script to all mesos-slaves:
> {code}
> #!/usr/bin/python
> from time import sleep
> import signal
> import sys
> import datetime
>
> def sigterm_handler(_signo, _stack_frame):
>     print "got %i" % _signo
>     print datetime.datetime.now().time()
>     sys.stdout.flush()
>     sleep(2)
>     print datetime.datetime.now().time()
>     print "ending"
>     sys.stdout.flush()
>     sys.exit(0)
>
> signal.signal(signal.SIGTERM, sigterm_handler)
> signal.signal(signal.SIGINT, sigterm_handler)
>
> try:
>     print "Hello"
>     i = 0
>     while True:
>         i += 1
>         print datetime.datetime.now().time()
>         print "Iteration #%i" % i
>         sys.stdout.flush()
>         sleep(1)
> finally:
>     print "Goodbye"
> {code}
> and I run it through Marathon like:
> {code:javascript}
> data = {
>     args: ["/tmp/script.py"],
>     instances: 1,
>     cpus: 0.1,
>     mem: 256,
>     id: "marathon-test-api"
> }
> {code}
> During the app restart I get the expected result - the task receives SIGTERM
> and dies peacefully (during my script-specified 2-second period).
> But when I wrap this Python script in a Docker image:
> {code}
> FROM node:4.2
> RUN mkdir /app
> ADD . /app
> WORKDIR /app
> ENTRYPOINT []
> {code}
> and run the corresponding application via Marathon:
> {code:javascript}
> data = {
>     args: ["./script.py"],
>     container: {
>         type: "DOCKER",
>         docker: {
>             image: "bydga/marathon-test-api"
>         },
>         forcePullImage: true
>     },
>     cpus: 0.1,
>     mem: 256,
>     instances: 1,
>     id: "marathon-test-api"
> }
> {code}
> the task dies immediately during restart (issued from Marathon) without
> having a chance to do any cleanup.
[jira] [Comment Edited] (MESOS-4279) Docker executor truncates task's output when the task is killed.
[ https://issues.apache.org/jira/browse/MESOS-4279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15317514#comment-15317514 ] Benjamin Mahler edited comment on MESOS-4279 at 6/7/16 1:15 AM:

Apologies [~bydga] that this issue has remained open for this long. I'll take
on a fix here since the docker containerizer / executor is currently not well
maintained (Tim is no longer responsive).

was (Author: bmahler):
Apologies [~bydga] that this issue has remained open for this long. I'll take
on a fix here since the docker containerizer / executor is in need of a new
maintainer.

> Docker executor truncates task's output when the task is killed.
> ----------------------------------------------------------------
>
>                 Key: MESOS-4279
>                 URL: https://issues.apache.org/jira/browse/MESOS-4279
>             Project: Mesos
>          Issue Type: Bug
>          Components: containerization, docker
>    Affects Versions: 0.25.0, 0.26.0, 0.27.2, 0.28.1
>            Reporter: Martin Bydzovsky
>            Assignee: Benjamin Mahler
>            Priority: Critical
>              Labels: docker, mesosphere
>             Fix For: 1.0.0
[jira] [Assigned] (MESOS-5195) Docker executor: task logs lost on shutdown
[ https://issues.apache.org/jira/browse/MESOS-5195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler reassigned MESOS-5195:
--------------------------------------

    Assignee: Benjamin Mahler

This looks to be a duplicate of MESOS-4279. I'll take this on since we don't
currently have a responsive maintainer for the docker support.

> Docker executor: task logs lost on shutdown
> -------------------------------------------
>
>                 Key: MESOS-5195
>                 URL: https://issues.apache.org/jira/browse/MESOS-5195
>             Project: Mesos
>          Issue Type: Bug
>          Components: containerization, docker
>    Affects Versions: 0.27.2
>         Environment: Linux 4.4.2 "Ubuntu 14.04.2 LTS"
>            Reporter: Steven Schlansker
>            Assignee: Benjamin Mahler
>             Fix For: 1.0.0
>
> When you try to kill a task running in the Docker executor (in our case via
> Singularity), the task shuts down cleanly but the last logs to standard out /
> standard error are lost in teardown.
> For example, we run dumb-init. With debugging on, you can see it should
> write:
> {noformat}
> DEBUG("Forwarded signal %d to children.\n", signum);
> {noformat}
> If you attach strace to the process, you can see it clearly writes the text
> to stderr. But that message is lost and never written to the sandbox
> 'stderr' file.
> We believe the issue starts here, in the Docker executor.cpp:
> {code}
> void shutdown(ExecutorDriver* driver)
> {
>   cout << "Shutting down" << endl;
>
>   if (run.isSome() && !killed) {
>     // The docker daemon might still be in progress starting the
>     // container, therefore we kill both the docker run process
>     // and also ask the daemon to stop the container.
>
>     // Making a mutable copy of the future so we can call discard.
>     Future<Option<int>>(run.get()).discard();
>     stop = docker->stop(containerName, stopTimeout);
>     killed = true;
>   }
> }
> {code}
> Notice how the "run" future is discarded *before* the Docker daemon is told
> to stop -- now what will discarding it do?
> {code}
> void commandDiscarded(const Subprocess& s, const string& cmd)
> {
>   VLOG(1) << "'" << cmd << "' is being discarded";
>   os::killtree(s.pid(), SIGKILL);
> }
> {code}
> Oops, we just sent SIGKILL to the entire process tree...
> You can see another (harmless?) side effect in the Docker daemon logs; it
> never gets a chance to kill the task:
> {noformat}
> ERROR Handler for DELETE
> /v1.22/containers/mesos-f3bb39fe-8fd9-43d2-80a6-93df6a76807e-S2.0c509380-c326-4ff7-bb68-86a37b54f233
> returned error: No such container:
> mesos-f3bb39fe-8fd9-43d2-80a6-93df6a76807e-S2.0c509380-c326-4ff7-bb68-86a37b54f233
> {noformat}
> I suspect that the fix is to wait for 'docker->stop()' to complete before
> discarding the 'run' future.
> Happy to provide more information if necessary.
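The ordering problem described above can be modeled abstractly. The following is a hedged Python sketch, not the actual C++ executor: `Container`, `sigkill`, and `docker_stop` are illustrative stand-ins modeling the effect of discarding the `run` future (SIGKILL of the process tree via `killtree`) versus letting a graceful `docker->stop()` run first.

```python
# Hypothetical model of the executor shutdown ordering (not Mesos code).
class Container:
    def __init__(self):
        self.log = []
        self.running = True

    def write_final_logs(self):
        # The last lines a task writes during graceful shutdown, e.g.
        # dumb-init's "Forwarded signal %d to children."
        if self.running:
            self.log.append("Forwarded signal 15 to children.")

    def sigkill(self):          # what discarding the `run` future does:
        self.running = False    # killtree(SIGKILL), no chance to flush logs

    def docker_stop(self):      # graceful stop: SIGTERM, logs get flushed
        self.write_final_logs()
        self.running = False

def buggy_shutdown(c):
    c.sigkill()        # discard `run` first -> SIGKILL the whole tree
    c.docker_stop()    # too late: the process is already gone

def fixed_shutdown(c):
    c.docker_stop()    # let the daemon stop the container gracefully
    c.sigkill()        # discard `run` only after stop completes

a, b = Container(), Container()
buggy_shutdown(a)      # a.log stays empty: final output lost
fixed_shutdown(b)      # b.log keeps the final output
```

With the buggy ordering the final log line is lost; reversing the two calls preserves it, which matches the fix suggested in the ticket.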
[jira] [Assigned] (MESOS-4279) Docker executor truncates task's output when the task is killed.
[ https://issues.apache.org/jira/browse/MESOS-4279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler reassigned MESOS-4279:
--------------------------------------

    Assignee: Benjamin Mahler  (was: Martin Bydzovsky)

Apologies [~bydga] that this issue has remained open for this long. I'll take
on a fix here since the docker containerizer / executor is in need of a new
maintainer.

> Docker executor truncates task's output when the task is killed.
> ----------------------------------------------------------------
>
>                 Key: MESOS-4279
>                 URL: https://issues.apache.org/jira/browse/MESOS-4279
>             Project: Mesos
>          Issue Type: Bug
>          Components: containerization, docker
>    Affects Versions: 0.25.0, 0.26.0, 0.27.2, 0.28.1
>            Reporter: Martin Bydzovsky
>            Assignee: Benjamin Mahler
>            Priority: Critical
>              Labels: docker, mesosphere
>             Fix For: 1.0.0
[jira] [Created] (MESOS-5529) Distinguish non-revocable and revocable allocation guarantees.
Benjamin Mahler created MESOS-5529:
---------------------------------------

             Summary: Distinguish non-revocable and revocable allocation guarantees.
                 Key: MESOS-5529
                 URL: https://issues.apache.org/jira/browse/MESOS-5529
             Project: Mesos
          Issue Type: Epic
          Components: allocation
            Reporter: Benjamin Mahler

Currently, the notions of fair sharing and quota do not make a distinction
between revocable and non-revocable resources. However, this makes fair
sharing difficult, since we currently offer resources as non-revocable within
the fair share and cannot perform revocation when we need to restore fairness
or quota.

As we move towards providing guarantees for the particular resource types, we
may want to allow the operator to specify quota (absolutes) or shares
(relatives) for both revocable and non-revocable resources:

| |*Non-revocable*|*Revocable*|
|*Quota*|absolute guarantees for non-revocable resources (well suited for service-like, always-running workloads)|absolute guarantees for revocable resources (useful for expressing minimum requirements of batch workloads?)|
|*Fair Share*|relative guarantees for non-revocable resources (e.g. backwards compatibility with old behavior)|relative guarantees for revocable resources (e.g. well suited for fair sharing in a dynamic cluster)|

See MESOS-5526 for revocation support.
[jira] [Created] (MESOS-5528) Use inverse offers to reclaim resources from schedulers over their quota.
Benjamin Mahler created MESOS-5528:
---------------------------------------

             Summary: Use inverse offers to reclaim resources from schedulers over their quota.
                 Key: MESOS-5528
                 URL: https://issues.apache.org/jira/browse/MESOS-5528
             Project: Mesos
          Issue Type: Epic
          Components: allocation
            Reporter: Benjamin Mahler

As we move towards distinguishing non-revocable and revocable allocation of
resources, we need to ensure that the upper limits specified via quota are
enforced. For example, if a scheduler has quota for non-revocable resources
and only fair sharing is turned on for revocable resources, the scheduler
should not be able to consume more non-revocable resources than its quota
limit.

Even if mesos disallows this when tasks are launched, there are cases where
the scheduler can exceed its quota:

* Unreachable nodes that were not accounted for reconnect to the cluster with
existing resources allocated to the scheduler's role.
* The operator lowers the amount of quota for the role.

In these cases, and more generally, we need an always-running mechanism for
reclaiming excess quota allocation via inverse offers. The deadline should be
configurable by the operator.
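The reclamation arithmetic this ticket calls for can be sketched simply. The helper below is hypothetical, not a Mesos API: given a role's current allocation and its (possibly lowered) quota, it computes the excess that inverse offers would ask back.

```python
# Hedged sketch: compute the excess allocation to reclaim via inverse
# offers when a role's allocation exceeds its quota (e.g. after an
# unreachable agent re-registers, or after the operator lowers quota).
def excess_over_quota(allocated, quota):
    return {r: allocated[r] - quota.get(r, 0)
            for r in allocated
            if allocated[r] > quota.get(r, 0)}

# After the operator lowers quota, the role is over on cpus and mem,
# so those amounts would be requested back via inverse offers:
over = excess_over_quota({"cpus": 12, "mem": 4096, "disk": 100},
                         {"cpus": 10, "mem": 2048, "disk": 100})
```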
[jira] [Created] (MESOS-5527) Provide work conservation incentives for schedulers.
Benjamin Mahler created MESOS-5527:
---------------------------------------

             Summary: Provide work conservation incentives for schedulers.
                 Key: MESOS-5527
                 URL: https://issues.apache.org/jira/browse/MESOS-5527
             Project: Mesos
          Issue Type: Epic
          Components: allocation, framework
            Reporter: Benjamin Mahler

As we begin to add support for schedulers to revoke resources to obtain their
quota or fair share, we need to consider the case of non-cooperative or
malicious schedulers that cause excessive revocation, either by accident or
intentionally. For example, a malicious scheduler could keep a low allocation
below its fair share and revoke as many resources as it can in order to
disturb existing work as much as possible.

We can provide mitigation techniques, or incentives / penalties, for
schedulers that cause excessive revocation:

* Disallow revocation when resources are available to a scheduler. The
scheduler must choose available resources or wait until allocated resources
free up. This means picky schedulers may not obtain the resources they want.
* Penalize schedulers causing excessive revocation in order to incentivize
them to play nicely.
* Use a degree of pessimism to restrict which resources a scheduler can
revoke (e.g. only batch tasks that have not been running for a long time). If
we augment task information to know whether it is a service or a batch job,
we may be able to do better here.
* etc.

The techniques employed for work conservation in the presence of revocation
should be configurable, and users should be able to achieve their own custom
work conservation policies by implementing an allocator (or a subcomponent of
the existing allocator).
[jira] [Created] (MESOS-5526) Allow schedulers to revoke resources to obtain their quota or fair share.
Benjamin Mahler created MESOS-5526:
---------------------------------------

             Summary: Allow schedulers to revoke resources to obtain their quota or fair share.
                 Key: MESOS-5526
                 URL: https://issues.apache.org/jira/browse/MESOS-5526
             Project: Mesos
          Issue Type: Epic
          Components: allocation
            Reporter: Benjamin Mahler

In order to ensure fairness and quota guarantees are met in a dynamic
cluster, we need to ensure that schedulers can revoke existing revocable
allocations in order to obtain their fair share or their quota. Otherwise,
schedulers must wait (potentially forever!) until existing allocations are
freed. This is a policy that favors meeting the fairness and quota guarantees
in a bounded amount of time over work conservation.

As we expose resource constraints to schedulers (MESOS-5524), they will be
able to determine when Mesos will allow them to revoke resources. For
example:

* If a scheduler is below its fair share, the scheduler may revoke existing
revocable resources that are offered to it.
* If a scheduler is below its quota, it can revoke existing revocable
resources in order to consume them for quota in a non-revocable manner.

This is orthogonal to optimistic or pessimistic allocation, in that either
approach needs to allow the schedulers to perform revocation in this manner.
In the pessimistic approach, we may confine what the scheduler can revoke,
and in the optimistic approach, we may provide more choice to the scheduler.
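The two eligibility conditions in the ticket amount to a simple predicate. The following is a hedged sketch with illustrative names (not the Mesos allocator API): a scheduler may revoke revocable allocations when it is below its fair share or below its quota on some resource.

```python
# Hedged sketch of the revocation eligibility rule described above.
# All names and the dict-based resource representation are illustrative.
def may_revoke(allocation, fair_share, quota):
    below_share = any(allocation.get(r, 0) < fair_share.get(r, 0)
                      for r in fair_share)
    below_quota = any(allocation.get(r, 0) < quota.get(r, 0)
                      for r in quota)
    return below_share or below_quota

# A scheduler below its fair share on cpus may revoke:
eligible = may_revoke({"cpus": 2}, {"cpus": 4}, {})

# One at its fair share and quota may not:
satisfied = may_revoke({"cpus": 4}, {"cpus": 4}, {"cpus": 4})
```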
[jira] [Updated] (MESOS-5526) Allow schedulers to revoke resources to obtain their quota or fair share.
[ https://issues.apache.org/jira/browse/MESOS-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-5526:
-----------------------------------

    Component/s: framework api

> Allow schedulers to revoke resources to obtain their quota or fair share.
> --------------------------------------------------------------------------
>
>                 Key: MESOS-5526
>                 URL: https://issues.apache.org/jira/browse/MESOS-5526
>             Project: Mesos
>          Issue Type: Epic
>          Components: allocation, framework api
>            Reporter: Benjamin Mahler
[jira] [Created] (MESOS-5525) Allow schedulers to decide whether to consume resources as revocable or non-revocable.
Benjamin Mahler created MESOS-5525:
---------------------------------------

             Summary: Allow schedulers to decide whether to consume resources as revocable or non-revocable.
                 Key: MESOS-5525
                 URL: https://issues.apache.org/jira/browse/MESOS-5525
             Project: Mesos
          Issue Type: Epic
          Components: framework api, allocation
            Reporter: Benjamin Mahler

The idea here is that although some resources may only be consumed in a
revocable manner (e.g. oversubscribed resources, resources from "spot
instances", etc.), other resources may be consumed in a non-revocable manner
(e.g. a dedicated instance, an on-premise machine). However, a scheduler may
wish to consume these non-revocable resources in a revocable manner. For
example, if the scheduler has quota for non-revocable resources, it may not
want to use its quota for a particular task and may wish to launch it in a
revocable manner out of its fair share.

In order to support this, we should adjust the meaning of revocable and
non-revocable resources to allow schedulers to decide how to consume them.
The scheduler could choose to consume non-revocable resources in a revocable
manner in order to use its fair share of revocable resources rather than its
quota.
[jira] [Created] (MESOS-5524) Expose resource consumption constraints (quota, shares) to schedulers.
Benjamin Mahler created MESOS-5524:
---------------------------------------

             Summary: Expose resource consumption constraints (quota, shares) to schedulers.
                 Key: MESOS-5524
                 URL: https://issues.apache.org/jira/browse/MESOS-5524
             Project: Mesos
          Issue Type: Epic
          Components: scheduler api, allocation
            Reporter: Benjamin Mahler

Currently, schedulers do not have visibility into their quota or shares of
the cluster. By providing this information, we give the scheduler the ability
to make better decisions.

As we start to allow schedulers to decide how they'd like to use a particular
resource (e.g. as non-revocable or revocable), schedulers need visibility
into their quota and shares to make an effective decision (otherwise they may
accidentally exceed their quota and will not find out until mesos replies
with TASK_LOST and REASON_QUOTA_EXCEEDED).

We would start by exposing the following information:

* quota: e.g. cpus:10, mem:20, disk:40
* shares: e.g. cpus:20, mem:40, disk:80

Currently, quota is used for non-revocable resources, and the idea is to use
shares only for consuming revocable resources, since the number of shares
available to a role changes dynamically as resources come and go, frameworks
come and go, or the operator manipulates the amount of resources sectioned
off for quota.

By exposing quota and shares, the framework knows when it can consume
additional non-revocable resources (i.e. when it has fewer non-revocable
resources allocated to it than its quota) and when it can consume revocable
resources (always! but in the future, it cannot revoke another user's
revocable resources if the framework is above its fair share). This also
allows schedulers to determine whether they have sufficient quota assigned to
them, and to alert the operator if they need more to run safely.
Also, by viewing its fair share, the framework can expose monitoring
information that shows the discrepancy between how much it would like and its
fair share (note that the framework can actually exceed its fair share, but
in the future this will mean increased potential for revocation).
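The decision a scheduler could make once quota and shares are visible can be sketched as follows. This is a hedged illustration, not the scheduler API; the function and dict shapes are invented for the example.

```python
# Hedged sketch: with quota exposed, a scheduler can decide up front how
# to consume an offer instead of learning about an overrun via TASK_LOST
# with REASON_QUOTA_EXCEEDED. Names are illustrative, not Mesos API.
def consumption_kind(allocated, quota, demand):
    # Consume non-revocably while within quota headroom; otherwise fall
    # back to revocable consumption out of the role's fair share.
    headroom = {r: quota.get(r, 0) - allocated.get(r, 0) for r in demand}
    if all(headroom[r] >= demand[r] for r in demand):
        return "non-revocable"
    return "revocable"

# 8 of 10 cpus of quota used; a 2-cpu task still fits non-revocably:
within = consumption_kind({"cpus": 8}, {"cpus": 10}, {"cpus": 2})

# A 4-cpu task would exceed quota, so consume it revocably instead:
beyond = consumption_kind({"cpus": 8}, {"cpus": 10}, {"cpus": 4})
```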
[jira] [Commented] (MESOS-2082) Update the webui to include maintenance information.
[ https://issues.apache.org/jira/browse/MESOS-2082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15300569#comment-15300569 ] Benjamin Mahler commented on MESOS-2082:

Happy to help get this committed if someone assists with reviewing.
[~lins05] be sure to include some screenshots.

> Update the webui to include maintenance information.
> ----------------------------------------------------
>
>                 Key: MESOS-2082
>                 URL: https://issues.apache.org/jira/browse/MESOS-2082
>             Project: Mesos
>          Issue Type: Task
>          Components: webui
>            Reporter: Benjamin Mahler
>            Assignee: Shuai Lin
>              Labels: mesosphere, twitter
>
> The simplest thing here would probably be to include another tab in the
> header for maintenance information.
> We could also consider adding maintenance information inline in the slaves
> table. Depending on how this is done, the maintenance tab could actually be
> a subset of the slaves table: only those slaves for which there is
> maintenance information.
[jira] [Commented] (MESOS-5340) libevent builds may prevent new connections
[ https://issues.apache.org/jira/browse/MESOS-5340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15283203#comment-15283203 ] Benjamin Mahler commented on MESOS-5340:

Added a unit test for this issue here:
https://reviews.apache.org/r/47362/

> libevent builds may prevent new connections
> -------------------------------------------
>
>                 Key: MESOS-5340
>                 URL: https://issues.apache.org/jira/browse/MESOS-5340
>             Project: Mesos
>          Issue Type: Bug
>          Components: security
>    Affects Versions: 0.29.0, 0.28.1
>            Reporter: Till Toenshoff
>            Assignee: Benjamin Mahler
>            Priority: Blocker
>              Labels: mesosphere, security, ssl
>             Fix For: 0.29.0
[jira] [Updated] (MESOS-5330) Agent should backoff before connecting to the master
[ https://issues.apache.org/jira/browse/MESOS-5330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-5330:
-----------------------------------

    Shepherd: Benjamin Mahler

> Agent should backoff before connecting to the master
> -----------------------------------------------------
>
>                 Key: MESOS-5330
>                 URL: https://issues.apache.org/jira/browse/MESOS-5330
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: David Robinson
>            Assignee: David Robinson
>
> When an agent is started, it starts a background task (libprocess process?)
> to detect the leading master. When the leading master is detected (or
> changes), the [SocketManager's link() method is called and a TCP connection
> to the master is
> established|https://github.com/apache/mesos/blob/a138e2246a30c4b5c9bc3f7069ad12204dcaffbc/src/slave/slave.cpp#L954].
> The agent _then_ backs off before sending a ReRegisterSlave message via the
> newly established connection. The agent needs to back off _before_
> attempting to establish a TCP connection to the master, not before sending
> the first message over the connection.
> During scale tests at Twitter we discovered that agents can SYN flood the
> master upon leader changes; then the problem described in MESOS-5200 can
> occur where ephemeral connections are used, which exacerbates the problem.
> The end result is a lot of hosts setting up and tearing down TCP connections
> every slave_ping_timeout seconds (15 by default), connections failing to be
> established, and hosts being marked as unhealthy and being shut down. We
> observed ~800 passive TCP connections per second on the leading master
> during scale tests.
> The problem can be somewhat mitigated by tuning the kernel to handle a
> thundering herd of TCP connections, but ideally there would not be a
> thundering herd to begin with.
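The proposed fix, backing off before `connect()` rather than before the first message, is typically implemented as exponential backoff with jitter so that agents spread out their reconnects after a leader change. A hedged sketch with illustrative parameters (Mesos's actual backoff constants may differ):

```python
import random

# Hedged sketch: exponential backoff with jitter applied *before* the
# agent opens the TCP connection to a newly detected master, so a leader
# change does not become a SYN flood. Parameters are illustrative.
def connection_backoffs(attempts, base=0.5, cap=60.0, rng=random.random):
    delays = []
    for attempt in range(attempts):
        # Window doubles each attempt, bounded by the cap.
        window = min(cap, base * (2 ** attempt))
        # Full jitter: sleep a uniform-random slice of the window,
        # then attempt connect(); agents thus desynchronize.
        delays.append(rng() * window)
    return delays

delays = connection_backoffs(10)
```

Full jitter (rather than a fixed doubling schedule) is what actually spreads the herd: with identical deterministic delays, all agents would still reconnect in lockstep.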
[jira] [Updated] (MESOS-5377) Improve DRF behavior with scarce resources.
[ https://issues.apache.org/jira/browse/MESOS-5377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-5377:
-----------------------------------

    Description:

The allocator currently uses the notion of Weighted [Dominant Resource
Fairness|https://www.cs.berkeley.edu/~alig/papers/drf.pdf] (WDRF) to
establish a linear notion of fairness across allocation roles.

DRF behaves well for resources that are present within each machine in a
cluster (e.g. CPUs, memory, disk). However, some resources (e.g. GPUs) are
only present on a subset of machines in the cluster.

Consider the behavior when there are the following agents in a cluster:

1000 agents with (cpus:4,mem:1024,disk:1024)
1 agent with (gpus:1,cpus:4,mem:1024,disk:1024)

If a role wishes to use both GPU and non-GPU resources for tasks, consuming 1
GPU will lead DRF to consider the role to have a 100% share of the cluster,
since it consumes 100% of the GPUs in the cluster. This framework will then
not receive any other offers.

Among possible improvements, fairness can have an understanding of resource
packages. In a sense, there is 1 GPU package being competed on and 1000
non-GPU packages being competed on, and ideally a role's consumption of the
single GPU package does not have a large effect on the role's access to the
other 1000 non-GPU packages.

In the interim, we should consider having a recommended way to deal with
scarce resources in the current model.

  was:

The allocator currently uses the notion of Weighted [Dominant Resource
Fairness|https://www.cs.berkeley.edu/~alig/papers/drf.pdf] (WDRF) to
establish a linear notion of fairness across allocation roles.

DRF behaves well for resources that are present within each machine in a
cluster (e.g. CPUs, memory, disk). However, some resources (e.g. GPUs) are
only present on a subset of machines in the cluster.

Consider the behavior when there are the following agents in a cluster:

1000 agents with (cpus:4,mem:1024,disk:1024)
1 agent with (gpus:1,cpus:4,mem:1024,disk:1024)

If a role wishes to use both GPU and non-GPU resources for tasks, consuming 1
GPU will lead DRF to consider the role to have a 100% share of the cluster,
since it consumes 100% of the GPUs in the cluster. This framework will then
not receive any other offers.

Among possible improvements, fairness can have an understanding of resource
packages. In a sense, there is 1 GPU package being competed on and 1000
non-GPU packages being competed on, and consuming the GPU package does not
have a large effect on the role's access to the 1000 non-GPU packages.

In the interim, we should consider having a recommended way to deal with
scarce resources in the current model.

> Improve DRF behavior with scarce resources.
> -------------------------------------------
>
>                 Key: MESOS-5377
>                 URL: https://issues.apache.org/jira/browse/MESOS-5377
>             Project: Mesos
>          Issue Type: Epic
>          Components: allocation
>            Reporter: Benjamin Mahler
[jira] [Created] (MESOS-5377) Improve DRF behavior with scarce resources.
Benjamin Mahler created MESOS-5377: -- Summary: Improve DRF behavior with scarce resources. Key: MESOS-5377 URL: https://issues.apache.org/jira/browse/MESOS-5377 Project: Mesos Issue Type: Epic Components: allocation Reporter: Benjamin Mahler The allocator currently uses the notion of Weighted [Dominant Resource Fairness|https://www.cs.berkeley.edu/~alig/papers/drf.pdf] (WDRF) to establish a linear notion of fairness across allocation roles. DRF behaves well for resources that are present within each machine in a cluster (e.g. CPUs, memory, disk). However, some resources (e.g. GPUs) are only present on a subset of machines in the cluster. Consider the behavior when there are the following agents in a cluster: 1000 agents with (cpus:4,mem:1024,disk:1024) 1 agent with (gpus:1,cpus:4,mem:1024,disk:1024) If a role wishes to use both GPU and non-GPU resources for tasks, consuming 1 GPU will lead DRF to consider the role to have a 100% share of the cluster, since it consumes 100% of the GPUs in the cluster. This framework will then not receive any other offers. Among possible improvements, fairness can have understanding of resource packages. In a sense there is 1 GPU package that is competed on and 1000 non-GPU packages competed on, and consuming the GPU package does not have a large effect on the role's access to the 1000 non-GPU packages. In the interim, we should consider having a recommended way to deal with scarce resources in the current model. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5340) libevent builds may prevent new connections
[ https://issues.apache.org/jira/browse/MESOS-5340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-5340: --- Shepherd: Joris Van Remoortere Assignee: Benjamin Mahler [~jvanremoortere] I took a look and have a proposed a fix here: https://reviews.apache.org/r/47192/ > libevent builds may prevent new connections > --- > > Key: MESOS-5340 > URL: https://issues.apache.org/jira/browse/MESOS-5340 > Project: Mesos > Issue Type: Bug > Components: security >Affects Versions: 0.29.0, 0.28.1 >Reporter: Till Toenshoff >Assignee: Benjamin Mahler >Priority: Blocker > Labels: mesosphere, security, ssl > > When using an SSL-enabled build of Mesos in combination with SSL-downgrading > support, any connection that does not actually transmit data will hang the > runnable (e.g. master). > For reproducing the issue (on any platform)... > Spin up a master with enabled SSL-downgrading: > {noformat} > $ export SSL_ENABLED=true > $ export SSL_SUPPORT_DOWNGRADE=true > $ export SSL_KEY_FILE=/path/to/your/foo.key > $ export SSL_CERT_FILE=/path/to/your/foo.crt > $ export SSL_CA_FILE=/path/to/your/ca.crt > $ ./bin/mesos-master.sh --work_dir=/tmp/foo > {noformat} > Create some artificial HTTP request load for quickly spotting the problem in > both, the master logs as well as the output of CURL itself: > {noformat} > $ while true; do sleep 0.1; echo $( date +">%H:%M:%S.%3N"; curl -s -k -A "SSL > Debug" http://localhost:5050/master/slaves; echo ;date +"<%H:%M:%S.%3N"; > echo); done > {noformat} > Now create a connection to the master that does not transmit any data: > {noformat} > $ telnet localhost 5050 > {noformat} > You should now see the CURL requests hanging, the master stops responding to > new connections. This will persist until either some data is transmitted via > the above telnet connection or it is closed. 
> This problem was initially observed when running Mesos on an AWS cluster > with a load balancer enabled (which uses an idle, persistent connection) for > the master node. Such a connection naturally does not transmit any data as long > as there are no external requests routed via the load balancer. AWS allows > setting a timeout for those connections; in our test environment this > duration was set to 60 seconds, and hence we saw our master becoming > unresponsive for 60 seconds at a time, then getting "unstuck" for a brief > period until it got stuck again. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4658) process::Connection can lead to process::wait deadlock
[ https://issues.apache.org/jira/browse/MESOS-4658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-4658: --- Assignee: Benjamin Mahler (was: Anand Mazumdar) Description: The {{Connection}} abstraction is prone to deadlocks arising from the last reference to a {{Connection}} getting destructed within the {{ConnectionProcess}} execution context, at which point {{ConnectionProcess}} waits on itself (deadlock). Consider this example: {code} Option<Connection> connection = process::http::connect(...).get(); // When the ConnectionProcess completes the Future, if 'connection' // is the last copy of the Connection it will wait on itself! connection.disconnected() .onAny(defer(self(), &SomeFunc, connection)); connection.disconnect(); connection = None(); {code} In the above snippet, deadlock can occur as follows: 1. {{connection = None()}} executes; the last copy of the {{Connection}} remains within the disconnected Future. 2. {{ConnectionProcess::disconnect}} completes the disconnection Future and executes SomeFunc. The Future then clears the callbacks, which destructs the last copy of the {{Connection}}. 3. {{Connection::~Data}} waits on the {{ConnectionProcess}} from within the {{ConnectionProcess}} execution context. Deadlock. We do have a snippet in our existing code that alludes to such occurrences happening: https://github.com/apache/mesos/blob/master/3rdparty/libprocess/src/http.cpp#L1325 {code} // This is a one time request which will close the connection when // the response is received. Since 'Connection' is reference-counted, // we must keep a copy around until the disconnection occurs. Note // that in order to avoid a deadlock (Connection destruction occurring // from the ConnectionProcess execution context), we use 'async'. {code} was: The {{Connection}} abstraction is prone to deadlocks arising from the object being destroyed inside the same execution context. 
Consider this example: {code} Option<Connection> connection = process::http::connect(...).get(); connection.disconnected() .onAny(defer(self(), &SomeFunc, connection)); connection.disconnect(); connection = None(); {code} In the above snippet, if {{connection = None()}} gets executed before the actual dispatch to {{ConnectionProcess}} happens, you might lose the only existing reference to the {{Connection}} object inside {{ConnectionProcess::disconnect}}. This would lead to the destruction of the {{Connection}} object in the {{ConnectionProcess}} execution context. We do have a snippet in our existing code that alludes to such occurrences happening: https://github.com/apache/mesos/blob/master/3rdparty/libprocess/src/http.cpp#L1325 {code} // This is a one time request which will close the connection when // the response is received. Since 'Connection' is reference-counted, // we must keep a copy around until the disconnection occurs. Note // that in order to avoid a deadlock (Connection destruction occurring // from the ConnectionProcess execution context), we use 'async'. {code} AFAICT, for scenarios where we need to hold on to the {{Connection}} object for later, this approach does not suffice. Summary: process::Connection can lead to process::wait deadlock (was: process::Connection can lead to deadlock around execution in the same context.) > process::Connection can lead to process::wait deadlock > -- > > Key: MESOS-4658 > URL: https://issues.apache.org/jira/browse/MESOS-4658 > Project: Mesos > Issue Type: Bug > Components: HTTP API, libprocess >Reporter: Anand Mazumdar >Assignee: Benjamin Mahler > Labels: mesosphere > > The {{Connection}} abstraction is prone to deadlocks arising from the last > reference to a {{Connection}} getting destructed within the {{ConnectionProcess}} > execution context, at which point {{ConnectionProcess}} waits on itself > (deadlock). 
> Consider this example: > {code} > Option<Connection> connection = process::http::connect(...).get(); > // When the ConnectionProcess completes the Future, if 'connection' > // is the last copy of the Connection it will wait on itself! > connection.disconnected() > .onAny(defer(self(), &SomeFunc, connection)); > connection.disconnect(); > connection = None(); > {code} > In the above snippet, deadlock can occur as follows: > 1. {{connection = None()}} executes; the last copy of the {{Connection}} > remains within the disconnected Future. > 2. {{ConnectionProcess::disconnect}} completes the disconnection Future and > executes SomeFunc. The Future then clears the callbacks, which destructs the > last copy of the {{Connection}}. > 3. {{Connection::~Data}} waits on the {{ConnectionProcess}} from within the > {{ConnectionProcess}} execution context. Deadlock. > We do have a snippet in our existing code that a
[jira] [Commented] (MESOS-5263) pivot_root is not available on ARM
[ https://issues.apache.org/jira/browse/MESOS-5263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274701#comment-15274701 ] Benjamin Mahler commented on MESOS-5263: {noformat} commit 547f4a4d122253a42819d5746cf51593923a56bc Author: Tomasz Janiszewski Date: Fri May 6 13:42:08 2016 -0700 Removed architecture specific syscalls for pivot root. The workarounds were put in place for systems with an old glibc but a new kernel. The header provides the __NR_pivot_root symbol. Review: https://reviews.apache.org/r/46730/ {noformat} > pivot_root is not available on ARM > -- > > Key: MESOS-5263 > URL: https://issues.apache.org/jira/browse/MESOS-5263 > Project: Mesos > Issue Type: Bug >Reporter: Tomasz Janiszewski >Assignee: Tomasz Janiszewski > Fix For: 0.29.0 > > > When compiled on ARM, the build fails with an error. > The current code logic in src/linux/fs.cpp is: > {code} > #ifdef __NR_pivot_root > int ret = ::syscall(__NR_pivot_root, newRoot.c_str(), putOld.c_str()); > #elif __x86_64__ > // A workaround for systems that have an old glibc but have a new > // kernel. The magic number '155' is the syscall number for > // 'pivot_root' on the x86_64 architecture, see > // arch/x86/syscalls/syscall_64.tbl > int ret = ::syscall(155, newRoot.c_str(), putOld.c_str()); > #elif __powerpc__ || __ppc__ || __powerpc64__ || __ppc64__ > // A workaround for powerpc. The magic number '203' is the syscall > // number for 'pivot_root' on the powerpc architecture, see > // https://w3challs.com/syscalls/?arch=powerpc_64 > int ret = ::syscall(203, newRoot.c_str(), putOld.c_str()); > #else > #error "pivot_root is not available" > #endif > {code} > A possible solution is to include the `unistd.h` header -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5193) Recovery failed: Failed to recover registrar on reboot of mesos master
[ https://issues.apache.org/jira/browse/MESOS-5193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-5193: --- Attachment: full.log I've attached an interleaved version of the log where each line is prefixed with the node number. You can see the recovery failures of node3 then node1 then node2 towards the end. Interestingly, I took a look with [~jieyu] and it appears there may have been some message loss, or connectivity issues: (1) when node3 gets elected, node2 appears to be offline, it broadcasts an implicit promise request to node3 (itself) and node1. *This message is not received by node1 for some reason.* (2) after node3 dies, node1 broadcasts an implicit promise request to node1 (itself) and node2. *This message is not received by node2 for some reason.* After this point, only node2 remains, and we do not have quorum. {quote} Although, once a master process gets killed the service gets terminated as well. {quote} Can you fix that so that the masters are restarted? That is a requirement for running HA masters, otherwise we cannot maintain a quorum. > Recovery failed: Failed to recover registrar on reboot of mesos master > -- > > Key: MESOS-5193 > URL: https://issues.apache.org/jira/browse/MESOS-5193 > Project: Mesos > Issue Type: Bug > Components: master >Affects Versions: 0.22.0, 0.27.0 >Reporter: Priyanka Gupta > Labels: master, mesosphere > Attachments: full.log, node1.log, node1_after_work_dir.log, > node2.log, node2_after_work_dir.log, node3.log, node3_after_work_dir.log > > > Hi all, > We are using a 3 node cluster with mesos master, mesos slave and zookeeper on > all of them. We are using chronos on top of it. The problem is when we reboot > the mesos master leader, the other nodes try to get elected as leader but > fail with recovery registrar issue. 
> "Recovery failed: Failed to recover registrar: Failed to perform fetch within > 1mins" > The next node then tries to become the leader but again fails with the same error. > I am not sure about the issue. We are currently using mesos 0.22 and also > tried to upgrade to mesos 0.27, but the problem continues to happen. > /usr/sbin/mesos-master --work_dir=/tmp/mesos_dir > --zk=zk://node1:2181,node2:2181,node3:2181/mesos --quorum=2 > Can you please help us resolve this issue, as it's a production system. > Thanks, > Priyanka -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (MESOS-5193) Recovery failed: Failed to recover registrar on reboot of mesos master
[ https://issues.apache.org/jira/browse/MESOS-5193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264584#comment-15264584 ] Benjamin Mahler edited comment on MESOS-5193 at 4/29/16 7:19 PM: - [~prigupta] Looking at the logs, there was a ~ 3 minute window of time in which the masters were experiencing ZooKeeper connectivity issues (from 18:33 - 18:36). Have you noticed this? Also we require that the masters are run under supervision, are you ensuring that the master are being promptly restarted if they terminate? Since the recovery timeout is 1 minute by default, I would suggest a supervision restart that is much smaller, like 10 seconds. Were the masters restarted after the last recovery failures here? {noformat} Master 1: W0429 18:33:08.726205 2518 logging.cpp:88] RAW: Received signal SIGTERM from process 2938 of user 0; exiting I0429 18:33:28.846740 1083 main.cpp:230] Build: 2016-04-13 23:22:05 by screwdrv I0429 18:37:26.008154 1134 master.cpp:1723] Elected as the leading master! F0429 18:38:26.008847 1127 master.cpp:1457] Recovery failed: Failed to recover registrar: Failed to perform fetch within 1mins Master 2: W0429 18:36:04.716518 2410 logging.cpp:88] RAW: Received signal SIGTERM from process 3029 of user 0; exiting I0429 18:36:30.429669 1091 main.cpp:230] Build: 2016-04-13 23:22:05 by screwdrv I0429 18:38:34.699726 1144 master.cpp:1723] Elected as the leading master! F0429 18:39:34.715205 1139 master.cpp:1457] Recovery failed: Failed to recover registrar: Failed to perform fetch within 1mins Master 3: I0429 18:32:12.877344 7962 main.cpp:230] Build: 2016-04-13 23:22:05 by screwdrv I0429 18:36:16.489387 7963 master.cpp:1723] Elected as the leading master! F0429 18:37:16.490408 7967 master.cpp:1457] Recovery failed: Failed to recover registrar: Failed to perform fetch within 1mins {noformat} If they were restarted and the ZooKeeper connectivity was resolved, the masters should have been able to get back up and running. 
was (Author: bmahler): [~prigupta] Looking at the logs, there was a ~ 3 minute window of time in which the masters were experiencing ZooKeeper connectivity issues (from 18:33 - 18:36). Have you noticed this? Also we require that the masters are run under supervision, are you ensuring that the master are being promptly restarted if they terminate? Since the recovery timeout is 1 minute by default, I would suggest something much smaller, like 10 seconds. Were the masters restarted after the last recovery failures here? {noformat} Master 1: W0429 18:33:08.726205 2518 logging.cpp:88] RAW: Received signal SIGTERM from process 2938 of user 0; exiting I0429 18:33:28.846740 1083 main.cpp:230] Build: 2016-04-13 23:22:05 by screwdrv I0429 18:37:26.008154 1134 master.cpp:1723] Elected as the leading master! F0429 18:38:26.008847 1127 master.cpp:1457] Recovery failed: Failed to recover registrar: Failed to perform fetch within 1mins Master 2: W0429 18:36:04.716518 2410 logging.cpp:88] RAW: Received signal SIGTERM from process 3029 of user 0; exiting I0429 18:36:30.429669 1091 main.cpp:230] Build: 2016-04-13 23:22:05 by screwdrv I0429 18:38:34.699726 1144 master.cpp:1723] Elected as the leading master! F0429 18:39:34.715205 1139 master.cpp:1457] Recovery failed: Failed to recover registrar: Failed to perform fetch within 1mins Master 3: I0429 18:32:12.877344 7962 main.cpp:230] Build: 2016-04-13 23:22:05 by screwdrv I0429 18:36:16.489387 7963 master.cpp:1723] Elected as the leading master! F0429 18:37:16.490408 7967 master.cpp:1457] Recovery failed: Failed to recover registrar: Failed to perform fetch within 1mins {noformat} If they were restarted and the ZooKeeper connectivity was resolved, the masters should have been able to get back up and running. 
> Recovery failed: Failed to recover registrar on reboot of mesos master > -- > > Key: MESOS-5193 > URL: https://issues.apache.org/jira/browse/MESOS-5193 > Project: Mesos > Issue Type: Bug > Components: master >Affects Versions: 0.22.0, 0.27.0 >Reporter: Priyanka Gupta > Labels: master, mesosphere > Attachments: node1.log, node1_after_work_dir.log, node2.log, > node2_after_work_dir.log, node3.log, node3_after_work_dir.log > > > Hi all, > We are using a 3 node cluster with mesos master, mesos slave and zookeeper on > all of them. We are using chronos on top of it. The problem is when we reboot > the mesos master leader, the other nodes try to get elected as leader but > fail with recovery registrar issue. > "Recovery failed: Failed to recover registrar: Failed to perform fetch within > 1mins" > The next node then try to become the leader but again fails with same error. > I am not sure about the issue. We are c
[jira] [Commented] (MESOS-5193) Recovery failed: Failed to recover registrar on reboot of mesos master
[ https://issues.apache.org/jira/browse/MESOS-5193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264584#comment-15264584 ] Benjamin Mahler commented on MESOS-5193: [~prigupta] Looking at the logs, there was a ~ 3 minute window of time in which the masters were experiencing ZooKeeper connectivity issues (from 18:33 - 18:36). Have you noticed this? Also we require that the masters are run under supervision, are you ensuring that the master are being promptly restarted if they terminate? Since the recovery timeout is 1 minute by default, I would suggest something much smaller, like 10 seconds. Were the masters restarted after the last recovery failures here? {noformat} Master 1: W0429 18:33:08.726205 2518 logging.cpp:88] RAW: Received signal SIGTERM from process 2938 of user 0; exiting I0429 18:33:28.846740 1083 main.cpp:230] Build: 2016-04-13 23:22:05 by screwdrv I0429 18:37:26.008154 1134 master.cpp:1723] Elected as the leading master! F0429 18:38:26.008847 1127 master.cpp:1457] Recovery failed: Failed to recover registrar: Failed to perform fetch within 1mins Master 2: W0429 18:36:04.716518 2410 logging.cpp:88] RAW: Received signal SIGTERM from process 3029 of user 0; exiting I0429 18:36:30.429669 1091 main.cpp:230] Build: 2016-04-13 23:22:05 by screwdrv I0429 18:38:34.699726 1144 master.cpp:1723] Elected as the leading master! F0429 18:39:34.715205 1139 master.cpp:1457] Recovery failed: Failed to recover registrar: Failed to perform fetch within 1mins Master 3: I0429 18:32:12.877344 7962 main.cpp:230] Build: 2016-04-13 23:22:05 by screwdrv I0429 18:36:16.489387 7963 master.cpp:1723] Elected as the leading master! F0429 18:37:16.490408 7967 master.cpp:1457] Recovery failed: Failed to recover registrar: Failed to perform fetch within 1mins {noformat} If they were restarted and the ZooKeeper connectivity was resolved, the masters should have been able to get back up and running. 
> Recovery failed: Failed to recover registrar on reboot of mesos master > -- > > Key: MESOS-5193 > URL: https://issues.apache.org/jira/browse/MESOS-5193 > Project: Mesos > Issue Type: Bug > Components: master >Affects Versions: 0.22.0, 0.27.0 >Reporter: Priyanka Gupta > Labels: master, mesosphere > Attachments: node1.log, node1_after_work_dir.log, node2.log, > node2_after_work_dir.log, node3.log, node3_after_work_dir.log > > > Hi all, > We are using a 3 node cluster with mesos master, mesos slave and zookeeper on > all of them. We are using chronos on top of it. The problem is when we reboot > the mesos master leader, the other nodes try to get elected as leader but > fail with recovery registrar issue. > "Recovery failed: Failed to recover registrar: Failed to perform fetch within > 1mins" > The next node then try to become the leader but again fails with same error. > I am not sure about the issue. We are currently using mesos 0.22 and also > tried to upgrade to mesos 0.27 as well but the problem continues to happen. > /usr/sbin/mesos-master --work_dir=/tmp/mesos_dir > --zk=zk://node1:2181,node2:2181,node3:2181/mesos --quorum=2 > Can you please help us resolve this issue as its a production system. > Thanks, > Priyanka -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5263) pivot_root is not available on ARM
[ https://issues.apache.org/jira/browse/MESOS-5263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-5263: --- Shepherd: Benjamin Mahler > pivot_root is not available on ARM > -- > > Key: MESOS-5263 > URL: https://issues.apache.org/jira/browse/MESOS-5263 > Project: Mesos > Issue Type: Bug >Reporter: Tomasz Janiszewski >Assignee: Tomasz Janiszewski > Fix For: 0.29.0 > > > When compiled on ARM, the build fails with an error. > The current code logic in src/linux/fs.cpp is: > {code} > #ifdef __NR_pivot_root > int ret = ::syscall(__NR_pivot_root, newRoot.c_str(), putOld.c_str()); > #elif __x86_64__ > // A workaround for systems that have an old glibc but have a new > // kernel. The magic number '155' is the syscall number for > // 'pivot_root' on the x86_64 architecture, see > // arch/x86/syscalls/syscall_64.tbl > int ret = ::syscall(155, newRoot.c_str(), putOld.c_str()); > #elif __powerpc__ || __ppc__ || __powerpc64__ || __ppc64__ > // A workaround for powerpc. The magic number '203' is the syscall > // number for 'pivot_root' on the powerpc architecture, see > // https://w3challs.com/syscalls/?arch=powerpc_64 > int ret = ::syscall(203, newRoot.c_str(), putOld.c_str()); > #else > #error "pivot_root is not available" > #endif > {code} > A possible solution is to include the `unistd.h` header -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (MESOS-4869) /usr/libexec/mesos/mesos-health-check using/leaking a lot of memory
[ https://issues.apache.org/jira/browse/MESOS-4869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler reassigned MESOS-4869: -- Assignee: Benjamin Mahler > /usr/libexec/mesos/mesos-health-check using/leaking a lot of memory > --- > > Key: MESOS-4869 > URL: https://issues.apache.org/jira/browse/MESOS-4869 > Project: Mesos > Issue Type: Bug >Affects Versions: 0.27.1 >Reporter: Anthony Scalisi >Assignee: Benjamin Mahler >Priority: Critical > Labels: health-check > Fix For: 0.26.1, 0.25.1, 0.24.2, 0.28.1, 0.27.3 > > > We switched our health checks in Marathon from HTTP to COMMAND: > {noformat} > "healthChecks": [ > { > "protocol": "COMMAND", > "path": "/ops/ping", > "command": { "value": "curl --silent -f -X GET > http://$HOST:$PORT0/ops/ping > /dev/null" }, > "gracePeriodSeconds": 90, > "intervalSeconds": 2, > "portIndex": 0, > "timeoutSeconds": 5, > "maxConsecutiveFailures": 3 > } > ] > {noformat} > All our applications have the same health check (and /ops/ping endpoint). > Even though we have the issue on all our Mesos slaves, I'm going to focus on a > particular one: *mesos-slave-i-e3a9c724*. > The slave has 16 gigs of memory, with about 12 gigs allocated for 8 tasks: > !https://i.imgur.com/gbRf804.png! 
> Here is a *docker ps* on it: > {noformat} > root@mesos-slave-i-e3a9c724 # docker ps > CONTAINER IDIMAGE COMMAND CREATED >STATUS PORTS NAMES > 4f7c0aa8d03ajava:8 "/bin/sh -c 'JAVA_OPT" 6 hours ago >Up 6 hours 0.0.0.0:31926->8080/tcp > mesos-29e183be-f611-41b4-824c-2d05b052231b-S6.3dbb1004-5bb8-432f-8fd8-b863bd29341d > 66f2fc8f8056java:8 "/bin/sh -c 'JAVA_OPT" 6 hours ago >Up 6 hours 0.0.0.0:31939->8080/tcp > mesos-29e183be-f611-41b4-824c-2d05b052231b-S6.60972150-b2b1-45d8-8a55-d63e81b8372a > f7382f241fcejava:8 "/bin/sh -c 'JAVA_OPT" 6 hours ago >Up 6 hours 0.0.0.0:31656->8080/tcp > mesos-29e183be-f611-41b4-824c-2d05b052231b-S6.39731a2f-d29e-48d1-9927-34ab8c5f557d > 880934c0049ejava:8 "/bin/sh -c 'JAVA_OPT" 24 hours ago >Up 24 hours 0.0.0.0:31371->8080/tcp > mesos-29e183be-f611-41b4-824c-2d05b052231b-S6.23dfe408-ab8f-40be-bf6f-ce27fe885ee0 > 5eab1f8dac4ajava:8 "/bin/sh -c 'JAVA_OPT" 46 hours ago >Up 46 hours 0.0.0.0:31500->8080/tcp > mesos-29e183be-f611-41b4-824c-2d05b052231b-S6.5ac75198-283f-4349-a220-9e9645b313e7 > b63740fe56e7java:8 "/bin/sh -c 'JAVA_OPT" 46 hours ago >Up 46 hours 0.0.0.0:31382->8080/tcp > mesos-29e183be-f611-41b4-824c-2d05b052231b-S6.5d417f16-df24-49d5-a5b0-38a7966460fe > 5c7a9ea77b0ejava:8 "/bin/sh -c 'JAVA_OPT" 2 days ago >Up 2 days 0.0.0.0:31186->8080/tcp > mesos-29e183be-f611-41b4-824c-2d05b052231b-S6.b05043c5-44fc-40bf-aea2-10354e8f5ab4 > 53065e7a31adjava:8 "/bin/sh -c 'JAVA_OPT" 2 days ago >Up 2 days 0.0.0.0:31839->8080/tcp > mesos-29e183be-f611-41b4-824c-2d05b052231b-S6.f0a3f4c5-ecdb-4f97-bede-d744feda670c > {noformat} > Here is a *docker stats* on it: > {noformat} > root@mesos-slave-i-e3a9c724 # docker stats > CONTAINER CPU % MEM USAGE / LIMIT MEM % > NET I/O BLOCK I/O > 4f7c0aa8d03a2.93% 797.3 MB / 1.611 GB 49.50% > 1.277 GB / 1.189 GB 155.6 kB / 151.6 kB > 53065e7a31ad8.30% 738.9 MB / 1.611 GB 45.88% > 419.6 MB / 554.3 MB 98.3 kB / 61.44 kB > 5c7a9ea77b0e4.91% 1.081 GB / 1.611 GB 67.10% > 423 MB / 526.5 MB 3.219 MB / 61.44 kB > 
5eab1f8dac4a3.13% 1.007 GB / 1.611 GB 62.53% > 2.737 GB / 2.564 GB 6.566 MB / 118.8 kB > 66f2fc8f80563.15% 768.1 MB / 1.611 GB 47.69% > 258.5 MB / 252.8 MB 1.86 MB / 151.6 kB > 880934c0049e10.07% 735.1 MB / 1.611 GB 45.64% > 1.451 GB / 1.399 GB 573.4 kB / 94.21 kB > b63740fe56e712.04% 629 MB / 1.611 GB 39.06% > 10.29 GB / 9.344 GB 8.102 MB / 61.44 kB > f7382f241fce6.21% 505 MB / 1.611 GB 31.36% > 153.4 MB / 151.9 MB 5.837 MB / 94.21 kB > {noformat} > Not much else is running on the slave, yet the used memory doesn't map to the > tasks memory: > {noformat} > Mem:16047M used:13340M buffers:1139M cache:776M > {noformat} > If I
[jira] [Updated] (MESOS-4705) Linux 'perf' parsing logic may fail when OS distribution has perf backports.
[ https://issues.apache.org/jira/browse/MESOS-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-4705: --- Summary: Linux 'perf' parsing logic may fail when OS distribution has perf backports. (was: Slave failed to sample container with perf event) > Linux 'perf' parsing logic may fail when OS distribution has perf backports. > > > Key: MESOS-4705 > URL: https://issues.apache.org/jira/browse/MESOS-4705 > Project: Mesos > Issue Type: Bug > Components: cgroups, isolation >Affects Versions: 0.27.1 >Reporter: Fan Du >Assignee: Fan Du > Fix For: 0.29.0, 0.27.3, 0.28.2, 0.26.2 > > > When sampling container with perf event on Centos7 with kernel > 3.10.0-123.el7.x86_64, slave complained with below error spew: > {code} > E0218 16:32:00.591181 8376 perf_event.cpp:408] Failed to get perf sample: > Failed to parse perf sample: Failed to parse perf sample line > '25871993253,,cycles,mesos/5f23ffca-87ed-4ff6-84f2-6ec3d4098ab8,10059827422,100.00': > Unexpected number of fields > {code} > it's caused by the current perf format [assumption | > https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob;f=src/linux/perf.cpp;h=1c113a2b3f57877e132bbd65e01fb2f045132128;hb=HEAD#l430] > with kernel version below 3.12 > On 3.10.0-123.el7.x86_64 kernel, the format is with 6 tokens as below: > value,unit,event,cgroup,running,ratio > A local modification fixed this error on my test bed, please review this > ticket. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (MESOS-4705) Slave failed to sample container with perf event
[ https://issues.apache.org/jira/browse/MESOS-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15257195#comment-15257195 ] Benjamin Mahler edited comment on MESOS-4705 at 4/25/16 10:49 PM: -- [~fan.du] this is now committed, I'll backport this onto the stable branches for 0.28.x, 0.27.x, 0.26.x: {noformat} commit a5c81d4077400892cd3a5c306143f16903aac62c Author: fan du Date: Mon Apr 25 13:50:50 2016 -0700 Fixed the 'perf' parsing logic. Previously the 'perf' parsing logic used the kernel version to determine the token ordering. However, this approach breaks when distributions backport perf parsing changes onto older kernel versions. This updates the parsing logic to understand all existing formats. Co-authored with haosdent. Review: https://reviews.apache.org/r/44379/ {noformat} was (Author: bmahler): [~fan.du] this is now committed, I'll backport this onto the stable branches for 0.28.x, 0.27.x, 0.26.x. > Slave failed to sample container with perf event > > > Key: MESOS-4705 > URL: https://issues.apache.org/jira/browse/MESOS-4705 > Project: Mesos > Issue Type: Bug > Components: cgroups, isolation >Affects Versions: 0.27.1 >Reporter: Fan Du >Assignee: Fan Du > Fix For: 0.29.0, 0.27.3, 0.28.2, 0.26.2 > > > When sampling container with perf event on Centos7 with kernel > 3.10.0-123.el7.x86_64, slave complained with below error spew: > {code} > E0218 16:32:00.591181 8376 perf_event.cpp:408] Failed to get perf sample: > Failed to parse perf sample: Failed to parse perf sample line > '25871993253,,cycles,mesos/5f23ffca-87ed-4ff6-84f2-6ec3d4098ab8,10059827422,100.00': > Unexpected number of fields > {code} > it's caused by the current perf format [assumption | > https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob;f=src/linux/perf.cpp;h=1c113a2b3f57877e132bbd65e01fb2f045132128;hb=HEAD#l430] > with kernel version below 3.12 > On 3.10.0-123.el7.x86_64 kernel, the format is with 6 tokens as below: > 
value,unit,event,cgroup,running,ratio > A local modification fixed this error on my test bed, please review this > ticket. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
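The token-count-based parsing described in the commit message can be illustrated with a small sketch. This is not the committed Mesos code; `Sample` and `parseSampleLine` are hypothetical names, and the 3-token layout for older perf builds is an assumption for illustration — only the 6-token `value,unit,event,cgroup,running,ratio` layout is confirmed by this ticket:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Fields we care about from one line of CSV-style perf output.
struct Sample {
  std::string value;
  std::string event;
  std::string cgroup;
};

// Pick field positions by token count instead of kernel version, so
// distribution backports of perf formatting changes don't break parsing.
// Assumed layouts:
//   3 tokens: value,event,cgroup                       (older perf, assumption)
//   6 tokens: value,unit,event,cgroup,running,ratio    (e.g. 3.10.0-123.el7)
bool parseSampleLine(const std::string& line, Sample* sample) {
  std::vector<std::string> tokens;
  std::stringstream stream(line);
  std::string token;
  while (std::getline(stream, token, ',')) {
    tokens.push_back(token);  // Empty tokens (e.g. a blank unit) are kept.
  }

  switch (tokens.size()) {
    case 3:
      *sample = {tokens[0], tokens[1], tokens[2]};
      return true;
    case 6:
      *sample = {tokens[0], tokens[2], tokens[3]};
      return true;
    default:
      return false;  // Unexpected number of fields.
  }
}
```

On the failing line from the ticket, the blank second token is the (empty) unit, so the event and cgroup land at positions 2 and 3.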
[jira] [Updated] (MESOS-5255) Add GPUs to container resource consumption metrics.
[ https://issues.apache.org/jira/browse/MESOS-5255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-5255: --- Summary: Add GPUs to container resource consumption metrics. (was: Add support for GPU usage in metrics endpoint) > Add GPUs to container resource consumption metrics. > --- > > Key: MESOS-5255 > URL: https://issues.apache.org/jira/browse/MESOS-5255 > Project: Mesos > Issue Type: Task >Reporter: Kevin Klues > Labels: gpu > > Currently the usage callback in the Nvidia GPU isolator is unimplemented: > {noformat} > src/slave/containerizer/mesos/isolators/cgroups/devices/gpus/nvidia.cpp > {noformat} > It should use functionality from NVML to gather the current GPU usage and add > it to a ResourceStatistics object. It is still an open question as to exactly > what information we want to expose here (power, memory consumption, current > load, etc.). Whatever we decide on should be standard across different GPU > types, different GPU vendors, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2638) Add support for Optional parameters to protobuf handlers to wrap option fields
[ https://issues.apache.org/jira/browse/MESOS-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-2638: --- Description: We currently don't have a way to install a protobuf handler for an option field where the handler takes an Optional parameter of the 'option' field in the protobuf message. The goal is to be able to do: {code:title=example|borderStyle=solid} message Person { required string name = 1; option uint32_t age = 2; } void person(const std::string& name, const Option& age) { if (age.isSome()) { ... } } install( person, &Person::name, &Person::age); {code} We can then use this to test whether the field was provided, as opposed to capturing a reference to a default constructed value of the type. For now, the workaround is to have the handler take the entire message: {code} void person(const Person& person) { if (person.has_age()) { ... } } install(person); {code} was: We currently don't have a way to install a protobuf handler for an option field where the handler takes an Optional parameter of the 'option' field in the protobuf message. The goal is to be able to do: {code:title=example|borderStyle=solid} message Person { required string name = 1; option uint32_t age = 2; } void person(const std::string& name, const Option& age) { if (age.isKnown()) { ... } } install(person, &Person::name, &Person::age); {code} We can then use this to test whether the field was provided, as opposed to capturing a reference to a default constructed value of the type. > Add support for Optional parameters to protobuf handlers to wrap option fields > -- > > Key: MESOS-2638 > URL: https://issues.apache.org/jira/browse/MESOS-2638 > Project: Mesos > Issue Type: Improvement > Components: libprocess >Reporter: Joris Van Remoortere > > We currently don't have a way to install a protobuf handler for an option > field where the handler takes an Optional parameter of the 'option' field in > the protobuf message. 
The goal is to be able to do: > {code:title=example|borderStyle=solid} > message Person { > required string name = 1; > option uint32_t age = 2; > } > void person(const std::string& name, const Option& age) > { > if (age.isSome()) { ... } > } > install( > person, > &Person::name, > &Person::age); > {code} > We can then use this to test whether the field was provided, as opposed to > capturing a reference to a default constructed value of the type. > For now, the workaround is to have the handler take the entire message: > {code} > void person(const Person& person) > { > if (person.has_age()) { ... } > } > install(person); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
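The proposed API can be sketched without protobuf or stout. Everything below is a stand-in (a minimal `Option<T>` in the spirit of stout's, and a hand-rolled `Person` struct instead of a generated message); it only illustrates how an `install()` of the proposed kind could wrap an optional field into an `Option<T>` so the handler can distinguish an absent field from a default-constructed value:

```cpp
#include <functional>
#include <string>

// Minimal stand-in for stout's Option<T> (assumption, not the real type).
template <typename T>
class Option {
public:
  Option() = default;
  explicit Option(const T& t) : some_(true), t_(t) {}
  bool isSome() const { return some_; }
  const T& get() const { return t_; }

private:
  bool some_ = false;
  T t_{};
};

// Hand-rolled stand-in for the generated protobuf message.
struct Person {
  std::string name;
  bool has_age;
  unsigned int age;
};

// What the proposed install() would do under the hood: consult has_age()
// and hand the handler an Option<T> instead of a possibly-default value.
void install(
    const std::function<void(const std::string&, const Option<unsigned int>&)>& handler,
    const Person& person) {
  handler(person.name,
          person.has_age ? Option<unsigned int>(person.age)
                         : Option<unsigned int>());
}
```

The key property is that a message with `has_age == false` reaches the handler as a none `Option`, which the current reference-to-default workaround cannot express.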
[jira] [Commented] (MESOS-5031) Authorization Action enum does not support upgrades.
[ https://issues.apache.org/jira/browse/MESOS-5031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15246807#comment-15246807 ] Benjamin Mahler commented on MESOS-5031: [~adam-mesos] [~yongtang] see my comment in the review on preferring an explicit case statement in favor of using 'default': https://reviews.apache.org/r/45342/ See context in MESOS-2664 and MESOS-3754. > Authorization Action enum does not support upgrades. > > > Key: MESOS-5031 > URL: https://issues.apache.org/jira/browse/MESOS-5031 > Project: Mesos > Issue Type: Bug >Affects Versions: 0.29.0 >Reporter: Adam B >Assignee: Yong Tang > Labels: mesosphere, security > Fix For: 0.29.0 > > > We need to make the Action enum optional in authorization::Request, and add > an `UNKNOWN = 0;` enum value. See MESOS-4997 for details. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-2331) MasterSlaveReconciliationTest.ReconcileRace is flaky
[ https://issues.apache.org/jira/browse/MESOS-2331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15246251#comment-15246251 ] Benjamin Mahler commented on MESOS-2331: [~haosd...@gmail.com] good eye! I had the same conclusion last week when I took a look, but didn't send out a patch. Here is my version of the fix: https://reviews.apache.org/r/46339/ > MasterSlaveReconciliationTest.ReconcileRace is flaky > > > Key: MESOS-2331 > URL: https://issues.apache.org/jira/browse/MESOS-2331 > Project: Mesos > Issue Type: Bug > Components: test >Affects Versions: 0.22.0 >Reporter: Yan Xu >Assignee: Qian Zhang > Labels: flaky > > {noformat:title=} > [ RUN ] MasterSlaveReconciliationTest.ReconcileRace > Using temporary directory > '/tmp/MasterSlaveReconciliationTest_ReconcileRace_NE9nhV' > I0206 19:09:44.196542 32362 leveldb.cpp:175] Opened db in 38.230192ms > I0206 19:09:44.206826 32362 leveldb.cpp:182] Compacted db in 9.988493ms > I0206 19:09:44.207164 32362 leveldb.cpp:197] Created db iterator in 29979ns > I0206 19:09:44.207641 32362 leveldb.cpp:203] Seeked to beginning of db in > 4478ns > I0206 19:09:44.207929 32362 leveldb.cpp:272] Iterated through 0 keys in the > db in 737ns > I0206 19:09:44.208222 32362 replica.cpp:743] Replica recovered with log > positions 0 -> 0 with 1 holes and 0 unlearned > I0206 19:09:44.209132 32384 recover.cpp:448] Starting replica recovery > I0206 19:09:44.209524 32384 recover.cpp:474] Replica is in EMPTY status > I0206 19:09:44.211094 32384 replica.cpp:640] Replica in EMPTY status received > a broadcasted recover request > I0206 19:09:44.211385 32384 recover.cpp:194] Received a recover response from > a replica in EMPTY status > I0206 19:09:44.211902 32384 recover.cpp:565] Updating replica status to > STARTING > I0206 19:09:44.236177 32381 master.cpp:344] Master > 20150206-190944-16842879-36452-32362 (lucid) started on 127.0.1.1:36452 > I0206 19:09:44.236291 32381 master.cpp:390] Master only allowing > 
authenticated frameworks to register > I0206 19:09:44.236305 32381 master.cpp:395] Master only allowing > authenticated slaves to register > I0206 19:09:44.236327 32381 credentials.hpp:35] Loading credentials for > authentication from > '/tmp/MasterSlaveReconciliationTest_ReconcileRace_NE9nhV/credentials' > I0206 19:09:44.236601 32381 master.cpp:439] Authorization enabled > I0206 19:09:44.238539 32381 hierarchical_allocator_process.hpp:284] > Initialized hierarchical allocator process > I0206 19:09:44.238662 32381 whitelist_watcher.cpp:64] No whitelist given > I0206 19:09:44.239364 32381 master.cpp:1350] The newly elected leader is > master@127.0.1.1:36452 with id 20150206-190944-16842879-36452-32362 > I0206 19:09:44.239392 32381 master.cpp:1363] Elected as the leading master! > I0206 19:09:44.239413 32381 master.cpp:1181] Recovering from registrar > I0206 19:09:44.239645 32381 registrar.cpp:312] Recovering registrar > I0206 19:09:44.241142 32384 leveldb.cpp:305] Persisting metadata (8 bytes) to > leveldb took 29.029117ms > I0206 19:09:44.241189 32384 replica.cpp:322] Persisted replica status to > STARTING > I0206 19:09:44.241478 32384 recover.cpp:474] Replica is in STARTING status > I0206 19:09:44.243075 32384 replica.cpp:640] Replica in STARTING status > received a broadcasted recover request > I0206 19:09:44.243398 32384 recover.cpp:194] Received a recover response from > a replica in STARTING status > I0206 19:09:44.243964 32384 recover.cpp:565] Updating replica status to VOTING > I0206 19:09:44.255692 32384 leveldb.cpp:305] Persisting metadata (8 bytes) to > leveldb took 11.502759ms > I0206 19:09:44.255765 32384 replica.cpp:322] Persisted replica status to > VOTING > I0206 19:09:44.256009 32384 recover.cpp:579] Successfully joined the Paxos > group > I0206 19:09:44.256253 32384 recover.cpp:463] Recover process terminated > I0206 19:09:44.257669 32384 log.cpp:659] Attempting to start the writer > I0206 19:09:44.259944 32377 replica.cpp:476] Replica received 
implicit > promise request with proposal 1 > I0206 19:09:44.268805 32377 leveldb.cpp:305] Persisting metadata (8 bytes) to > leveldb took 8.45858ms > I0206 19:09:44.269067 32377 replica.cpp:344] Persisted promised to 1 > I0206 19:09:44.277974 32383 coordinator.cpp:229] Coordinator attemping to > fill missing position > I0206 19:09:44.279767 32383 replica.cpp:377] Replica received explicit > promise request for position 0 with proposal 2 > I0206 19:09:44.288940 32383 leveldb.cpp:342] Persisting action (8 bytes) to > leveldb took 9.128603ms > I0206 19:09:44.289294 32383 replica.cpp:678] Persisted action at 0 > I0206 19:09:44.296417 32377 replica.cpp:510] Replica received write request > for position 0 > I0206 19:09:44.296944 32377 leveldb.cpp:437] Reading po
[jira] [Updated] (MESOS-5060) Requesting /files/read.json with a negative length value causes subsequent /files requests to 404.
[ https://issues.apache.org/jira/browse/MESOS-5060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-5060: --- Shepherd: Benjamin Mahler I'll shepherd the change, happy to have greg help me with the reviews! > Requesting /files/read.json with a negative length value causes subsequent > /files requests to 404. > -- > > Key: MESOS-5060 > URL: https://issues.apache.org/jira/browse/MESOS-5060 > Project: Mesos > Issue Type: Bug >Affects Versions: 0.23.0 > Environment: Mesos 0.23.0 on CentOS 6, also Mesos 0.28.0 on OSX >Reporter: Tom Petr >Assignee: zhou xing >Priority: Minor > Fix For: 0.29.0 > > > I accidentally hit a slave's /files/read.json endpoint with a negative length > (ex. http://hostname:5051/files/read.json?path=XXX&offset=0&length=-100). The > HTTP request timed out after 30 seconds with nothing relevant in the slave > logs, and subsequent calls to any of the /files endpoints on that slave > immediately returned a HTTP 404 response. We ultimately got things working > again by restarting the mesos-slave process (checkpointing FTW!), but it'd be > wise to guard against negative lengths on the slave's end too. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
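A guard of the kind the ticket suggests might look like the following sketch. The names (`Validation`, `validateReadParams`) are illustrative, not the actual Mesos `/files` endpoint code; the point is to reject negative `offset`/`length` query parameters up front instead of passing them into the read path:

```cpp
#include <string>

// Result of validating the read.json query parameters.
struct Validation {
  bool ok;
  std::string error;  // Human-readable reason when ok == false.
};

// Reject negative values before they reach the file-read logic, so a
// request like ?offset=0&length=-100 gets a 400 instead of wedging
// subsequent /files requests.
Validation validateReadParams(long long offset, long long length) {
  if (offset < 0) {
    return {false, "Negative offset provided: " + std::to_string(offset)};
  }
  if (length < 0) {
    return {false, "Negative length provided: " + std::to_string(length)};
  }
  return {true, ""};
}
```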
[jira] [Commented] (MESOS-5200) agent->master messages use temporary TCP connections
[ https://issues.apache.org/jira/browse/MESOS-5200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15239946#comment-15239946 ] Benjamin Mahler commented on MESOS-5200: [~drobinson] Ok, in that case I'll go ahead and close this out as a duplicate of MESOS-1963. > agent->master messages use temporary TCP connections > > > Key: MESOS-5200 > URL: https://issues.apache.org/jira/browse/MESOS-5200 > Project: Mesos > Issue Type: Bug >Reporter: David Robinson > > Background info: When an agent is started it starts a background task > (libprocess process?) to detect the leading master. When the leading master > is detected (or changes) the [SocketManager's link() > method|https://github.com/apache/mesos/blob/master/3rdparty/libprocess/src/process.cpp#L1415] > [is > called|https://github.com/apache/mesos/blob/master/src/slave/slave.cpp#L942] > and a TCP connection to the master is established. The connection is used by > the agent to send messages to the master, and the master, upon receiving a > RegisterSlaveMessage/ReregisterSlaveMessage, establishes another TCP > connection back to the agent. Each TCP connection is uni-directional, the > agent writes messages on one connection and reads messages from the other, > and the master reads/writes from the opposite ends of the connections. > If the initial TCP connection to the master fails to be established then > temporary connections are used for all agent->master messages; each send() > causes a new TCP connection to be setup, the message sent, then the > connection torn down. If link() succeeds a persistent TCP connection is used > instead. > If agents do not use ZK to detect the master then the master detector "detects" the master immediately and attempts to connect immediately. The master may not be listening for connections at the time, or it could be > overwhelmed w/ TCP connection attempts, therefore the initial TCP connection > attempt fails. 
The agent does not attempt to establish a new persistent > connection as link() is only called when a new master is detected, which only > occurs once unless ZK is used. > It's possible for agents to overwhelm a master w/ TCP connections such that > agents cannot establish connections. When this occurs pong messages may not > be received by the master so the master shuts down agents thus killing any > tasks they were running. We have witnessed this scenario during scale/load > tests at Twitter. > The problem is trivial to reproduce: configure an agent to use a certain > master (\-\-master=10.20.30.40:5050), start the agent, wait several minutes > then start the master. All the agent->master messages will occur over > temporary connections. > The problem occurs less frequently in production because ZK is typically used > for master detection and a master only registers in ZK after it has started > listening on its socket. However, the scenario described above can also occur > when ZK is used – a thundering herd of 10,000+ slaves establishing TCP > connections to the master can result in some connection attempts failing and > agents using temporary connections. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-1963) Slave should use exited() to detect disconnection with Master.
[ https://issues.apache.org/jira/browse/MESOS-1963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-1963: --- Description: The slave already links with the master, but it does not use the built in exited() notification from libprocess to trigger re-registration. Of particular concern is that, if the socket breaks and subsequent messages are successfully sent on ephemeral sockets, then we don't re-register and re-link with the master. Inconsistency can arise as a result of this, since we currently rely on re-registration to reconcile state when messages are dropped. was: The slave already links with the master, but it does not use the built in exited() notification from libprocess to trigger re-registration. Of particular concern is that, if the socket breaks and subsequent messages are successfully sent on ephemeral sockets, then we don't re-register with the master. Inconsistency can arise as a result of this, since we currently rely on re-registration to reconcile state when messages are dropped. > Slave should use exited() to detect disconnection with Master. > -- > > Key: MESOS-1963 > URL: https://issues.apache.org/jira/browse/MESOS-1963 > Project: Mesos > Issue Type: Improvement > Components: master, slave >Reporter: Benjamin Mahler > Labels: reliability, twitter > > The slave already links with the master, but it does not use the built in > exited() notification from libprocess to trigger re-registration. > Of particular concern is that, if the socket breaks and subsequent messages > are successfully sent on ephemeral sockets, then we don't re-register and > re-link with the master. Inconsistency can arise as a result of this, since > we currently rely on re-registration to reconcile state when messages are > dropped. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
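The improvement described above can be sketched in miniature. This mimics the shape of a libprocess process but is entirely hypothetical, not the actual `Slave` implementation: on an `exited()` notification for the master's socket, re-link and re-register instead of silently falling back to temporary per-message connections.

```cpp
#include <string>

// Hypothetical sketch of MESOS-1963's suggestion: react to the broken
// socket (exited()) by re-establishing the persistent link and
// re-registering so state gets reconciled.
class SlaveLike {
public:
  explicit SlaveLike(const std::string& master) : master_(master) {}

  // Called by the runtime when the remote end of a linked socket closes.
  void exited(const std::string& pid) {
    if (pid == master_) {
      linked_ = false;
      relink();      // Re-establish the persistent connection...
      reregister();  // ...and re-register to reconcile dropped messages.
    }
  }

  bool linked() const { return linked_; }
  int reregistrations() const { return reregistrations_; }

private:
  void relink() { linked_ = true; }  // Stand-in for link(master_).
  void reregister() { ++reregistrations_; }

  std::string master_;
  bool linked_ = true;
  int reregistrations_ = 0;
};
```

Without this reaction, the agent keeps sending on ephemeral sockets and the master never learns that reconciliation is needed, which is exactly the inconsistency the ticket describes.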
[jira] [Commented] (MESOS-5200) agent->master messages use temporary TCP connections
[ https://issues.apache.org/jira/browse/MESOS-5200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15239852#comment-15239852 ] Benjamin Mahler commented on MESOS-5200: Looks like exited was indeed called (it should be called if a link can't be established), anything I'm missing? {noformat} I0412 23:53:02.831640 49996 slave.cpp:3528] master@10.20.30.40:5050 exited {noformat} > agent->master messages use temporary TCP connections > > > Key: MESOS-5200 > URL: https://issues.apache.org/jira/browse/MESOS-5200 > Project: Mesos > Issue Type: Bug >Reporter: David Robinson > > Background info: When an agent is started it starts a background task > (libprocess process?) to detect the leading master. When the leading master > is detected (or changes) the [SocketManager's link() > method|https://github.com/apache/mesos/blob/master/3rdparty/libprocess/src/process.cpp#L1415] > [is > called|https://github.com/apache/mesos/blob/master/src/slave/slave.cpp#L942] > and a TCP connection to the master is established. The connection is used by > the agent to send messages to the master, and the master, upon receiving a > RegisterSlaveMessage/ReregisterSlaveMessage, establishes another TCP > connection back to the agent. Each TCP connection is uni-directional, the > agent writes messages on one connection and reads messages from the other, > and the master reads/writes from the opposite ends of the connections. > If the initial TCP connection to the master fails to be established then > temporary connections are used for all agent->master messages; each send() > causes a new TCP connection to be setup, the message sent, then the > connection torn down. If link() succeeds a persistent TCP connection is used > instead. > If agents do not use ZK to detect the master then the master detector > "detects" the master immediately and attempts to connect immediately. 
The > master may not be listening for connections at the time, or it could be > overwhelmed w/ TCP connection attempts, therefore the initial TCP connection > attempt fails. The agent does not attempt to establish a new persistent > connection as link() is only called when a new master is detected, which only > occurs once unless ZK is used. > It's possible for agents to overwhelm a master w/ TCP connections such that > agents cannot establish connections. When this occurs pong messages may not > be received by the master so the master shuts down agents thus killing any > tasks they were running. We have witnessed this scenario during scale/load > tests at Twitter. > The problem is trivial to reproduce: configure an agent to use a certain > master (\-\-master=10.20.30.40:5050), start the agent, wait several minutes > then start the master. All the agent->master messages will occur over > temporary connections. > The problem occurs less frequently in production because ZK is typically used > for master detection and a master only registers in ZK after it has started > listening on its socket. However, the scenario described above can also occur > when ZK is used – a thundering herd of 10,000+ slaves establishing TCP > connections to the master can result in some connection attempts failing and > agents using temporary connections. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5131) Slave allows the resource estimator to send non-revocable resources.
[ https://issues.apache.org/jira/browse/MESOS-5131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-5131: --- Summary: Slave allows the resource estimator to send non-revocable resources. (was: DRF allocator crashes master with CHECK when resource is incorrect) > Slave allows the resource estimator to send non-revocable resources. > > > Key: MESOS-5131 > URL: https://issues.apache.org/jira/browse/MESOS-5131 > Project: Mesos > Issue Type: Bug > Components: allocation, oversubscription >Reporter: Zhitao Li >Assignee: Zhitao Li >Priority: Critical > > We were testing a custom resource estimator which broadcasts oversubscribed > resources, but they are not marked as "revocable". > This unfortunately triggered the following check in hierarchical allocator: > {quote} > void HierarchicalAllocatorProcess::updateSlave( > // Check that all the oversubscribed resources are revocable. > CHECK_EQ(oversubscribed, oversubscribed.revocable()); > {quote} > This definitely shouldn't happen in a production cluster. IMO, we should do > both of the following: > 1. Make sure an incorrect resource is not sent from the agent (even crashing the agent > process is better); > 2. Decline agent registration if its resources are incorrect, or even tell it > to shut down, and possibly remove this check. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5131) DRF allocator crashes master with CHECK when resource is incorrect
[ https://issues.apache.org/jira/browse/MESOS-5131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-5131: --- Shepherd: Benjamin Mahler > DRF allocator crashes master with CHECK when resource is incorrect > -- > > Key: MESOS-5131 > URL: https://issues.apache.org/jira/browse/MESOS-5131 > Project: Mesos > Issue Type: Bug > Components: allocation, oversubscription >Reporter: Zhitao Li >Assignee: Zhitao Li >Priority: Critical > > We were testing a custom resource estimator which broadcasts oversubscribed > resources, but they are not marked as "revocable". > This unfortunately triggered the following check in hierarchical allocator: > {quote} > void HierarchicalAllocatorProcess::updateSlave( > // Check that all the oversubscribed resources are revocable. > CHECK_EQ(oversubscribed, oversubscribed.revocable()); > {quote} > This definitely shouldn't happen in a production cluster. IMO, we should do > both of the following: > 1. Make sure an incorrect resource is not sent from the agent (even crashing the agent > process is better); > 2. Decline agent registration if its resources are incorrect, or even tell it > to shut down, and possibly remove this check. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
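Point (1) of the ticket — rejecting a bad estimator result on the agent side rather than tripping the master's `CHECK_EQ` — could be sketched as follows. `Resource` here is a plain stand-in for the Mesos protobuf, and `allRevocable` is a hypothetical helper, not the actual agent code:

```cpp
#include <string>
#include <vector>

// Plain stand-in for the Mesos Resource protobuf.
struct Resource {
  std::string name;
  double scalar;
  bool revocable;
};

// Agent-side guard: an oversubscribed estimate is only valid if every
// resource in it is revocable; otherwise it should be dropped (or the
// agent should fail loudly) instead of being forwarded to the master.
bool allRevocable(const std::vector<Resource>& oversubscribed) {
  for (const Resource& resource : oversubscribed) {
    if (!resource.revocable) {
      return false;  // Estimator bug: non-revocable oversubscription.
    }
  }
  return true;
}
```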
[jira] [Created] (MESOS-5183) Provide backup/restore functionality for the registry.
Benjamin Mahler created MESOS-5183: -- Summary: Provide backup/restore functionality for the registry. Key: MESOS-5183 URL: https://issues.apache.org/jira/browse/MESOS-5183 Project: Mesos Issue Type: Epic Components: master Reporter: Benjamin Mahler Priority: Critical Currently there is no built-in support for backup/restore of the registry state. The current suggestion is to back up the LevelDB directories across each master and to restore them. This can be error prone and it requires that operators deal directly with the underlying storage layer. Ideally, the master provides a means to extract the complete registry contents for backup purposes, and has the ability to restore its state from a backup. As a note, the {{/registrar(1)/registry}} endpoint currently provides an ability to extract the state as JSON. There is currently no built-in support for restoring from backups. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4705) Slave failed to sample container with perf event
[ https://issues.apache.org/jira/browse/MESOS-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15235933#comment-15235933 ] Benjamin Mahler commented on MESOS-4705: Which patch? This one? https://reviews.apache.org/r/44379/ It still does not contain the information related to perf stat formats that [~haosd...@gmail.com] provided earlier in this thread. Can you add that? With respect to https://reviews.apache.org/r/44255/, happy to discuss further, but let's do that outside of this ticket since it is not related. > Slave failed to sample container with perf event > > > Key: MESOS-4705 > URL: https://issues.apache.org/jira/browse/MESOS-4705 > Project: Mesos > Issue Type: Bug > Components: cgroups, isolation >Affects Versions: 0.27.1 >Reporter: Fan Du >Assignee: Fan Du > > When sampling container with perf event on Centos7 with kernel > 3.10.0-123.el7.x86_64, slave complained with below error spew: > {code} > E0218 16:32:00.591181 8376 perf_event.cpp:408] Failed to get perf sample: > Failed to parse perf sample: Failed to parse perf sample line > '25871993253,,cycles,mesos/5f23ffca-87ed-4ff6-84f2-6ec3d4098ab8,10059827422,100.00': > Unexpected number of fields > {code} > it's caused by the current perf format [assumption | > https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob;f=src/linux/perf.cpp;h=1c113a2b3f57877e132bbd65e01fb2f045132128;hb=HEAD#l430] > with kernel version below 3.12 > On 3.10.0-123.el7.x86_64 kernel, the format is with 6 tokens as below: > value,unit,event,cgroup,running,ratio > A local modification fixed this error on my test bed, please review this > ticket. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4705) Slave failed to sample container with perf event
[ https://issues.apache.org/jira/browse/MESOS-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15233098#comment-15233098 ] Benjamin Mahler commented on MESOS-4705: [~haosd...@gmail.com] thanks for investigating the kernel code and clarifying. [~fan.du] [~haosd...@gmail.com] This is the kind of information I'd like to see in the code so that our methodology is clear to the reader. If you'd like to remove kernel version checking as a part of this, that's fine as well so long as the explanation is clear and correct. > Slave failed to sample container with perf event > > > Key: MESOS-4705 > URL: https://issues.apache.org/jira/browse/MESOS-4705 > Project: Mesos > Issue Type: Bug > Components: cgroups, isolation >Affects Versions: 0.27.1 >Reporter: Fan Du >Assignee: Fan Du > > When sampling container with perf event on Centos7 with kernel > 3.10.0-123.el7.x86_64, slave complained with below error spew: > {code} > E0218 16:32:00.591181 8376 perf_event.cpp:408] Failed to get perf sample: > Failed to parse perf sample: Failed to parse perf sample line > '25871993253,,cycles,mesos/5f23ffca-87ed-4ff6-84f2-6ec3d4098ab8,10059827422,100.00': > Unexpected number of fields > {code} > it's caused by the current perf format [assumption | > https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob;f=src/linux/perf.cpp;h=1c113a2b3f57877e132bbd65e01fb2f045132128;hb=HEAD#l430] > with kernel version below 3.12 > On 3.10.0-123.el7.x86_64 kernel, the format is with 6 tokens as below: > value,unit,event,cgroup,running,ratio > A local modification fixed this error on my test bed, please review this > ticket. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5135) Update existing documentation to Include references to GPUs as a first class resource.
[ https://issues.apache.org/jira/browse/MESOS-5135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-5135: --- Issue Type: Task (was: Bug) Moving from a bug to a task. > Update existing documentation to Include references to GPUs as a first class > resource. > -- > > Key: MESOS-5135 > URL: https://issues.apache.org/jira/browse/MESOS-5135 > Project: Mesos > Issue Type: Task > Components: documentation >Reporter: Kevin Klues >Assignee: Kevin Klues > Labels: docs, gpu, mesosphere, resource > > Specifically, the documentation in the following files should be udated: > {noformat} > docs/attributes-resources.md > docs/monitoring.md > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5136) Update the default JSON representation of a Resource to include GPUs
[ https://issues.apache.org/jira/browse/MESOS-5136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-5136: --- Issue Type: Task (was: Bug) Moved this from a bug to a task. > Update the default JSON representation of a Resource to include GPUs > > > Key: MESOS-5136 > URL: https://issues.apache.org/jira/browse/MESOS-5136 > Project: Mesos > Issue Type: Task >Reporter: Kevin Klues >Assignee: Kevin Klues > Labels: gpu, json, mesosphere, resource > Fix For: 0.29.0 > > > The default JSON representation of a Resource currently lists a value of "0" > if no value is set on a first class SCALAR resource (i.e. cpus, mem, disk). > We should add GPUs in here as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5137) Remove 'dashboard.js' from the webui.
[ https://issues.apache.org/jira/browse/MESOS-5137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-5137: --- Issue Type: Task (was: Bug) Changed this from a Bug to a Task. > Remove 'dashboard.js' from the webui. > - > > Key: MESOS-5137 > URL: https://issues.apache.org/jira/browse/MESOS-5137 > Project: Mesos > Issue Type: Task >Reporter: Kevin Klues >Assignee: Kevin Klues > Labels: webui > > This file is no longer in use anywhere. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5115) Grant access to /dev/nvidiactl and /dev/nvidia-uvm in the Nvidia GPU isolator.
[ https://issues.apache.org/jira/browse/MESOS-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-5115: --- Description: Calls to 'nvidia-smi' fail inside a container even if access to a GPU has been granted. Moreover, access to /dev/nvidiactl is actually required for a container to do anything useful with a GPU even if it has access to it. We should grant/revoke access to /dev/nvidiactl and /dev/nvidia-uvm as GPUs are added and removed from a container in the Nvidia GPU isolator. was: Calls to 'nvidia-smi' fail inside a container even if access to a GPU has been granted. Moreover, access to /dev/nvidiactl is actually required for a container to do anything useful with a GPU even if it has access to it. We should grant/revoke access to /dev/nvidiactl as GPUs are added and removed from a container in the Nvidia GPU isolator. > Grant access to /dev/nvidiactl and /dev/nvidia-uvm in the Nvidia GPU isolator. > -- > > Key: MESOS-5115 > URL: https://issues.apache.org/jira/browse/MESOS-5115 > Project: Mesos > Issue Type: Bug > Components: isolation >Reporter: Kevin Klues >Assignee: Kevin Klues > Labels: gpu > Fix For: 0.29.0 > > > Calls to 'nvidia-smi' fail inside a container even if access to a GPU has > been granted. Moreover, access to /dev/nvidiactl is actually required for a > container to do anything useful with a GPU even if it has access to it. > > We should grant/revoke access to /dev/nvidiactl and /dev/nvidia-uvm as GPUs > are added and removed from a container in the Nvidia GPU isolator. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5115) Grant access to /dev/nvidiactl and /dev/nvidia-uvm in the Nvidia GPU isolator.
[ https://issues.apache.org/jira/browse/MESOS-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-5115: --- Component/s: isolation Summary: Grant access to /dev/nvidiactl and /dev/nvidia-uvm in the Nvidia GPU isolator. (was: Add support to grant access to /dev/nvidiactl in the Nvidia GPU isolator.) > Grant access to /dev/nvidiactl and /dev/nvidia-uvm in the Nvidia GPU isolator. > -- > > Key: MESOS-5115 > URL: https://issues.apache.org/jira/browse/MESOS-5115 > Project: Mesos > Issue Type: Bug > Components: isolation >Reporter: Kevin Klues >Assignee: Kevin Klues > Labels: gpu > > Calls to 'nvidia-smi' fail inside a container even if access to a GPU has > been granted. Moreover, access to /dev/nvidiactl is actually required for a > container to do anything useful with a GPU even if it has access to it. > > We should grant/revoke access to /dev/nvidiactl as GPUs are added and removed > from a container in the Nvidia GPU isolator. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4705) Slave failed to sample container with perf event
[ https://issues.apache.org/jira/browse/MESOS-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15227054#comment-15227054 ] Benjamin Mahler commented on MESOS-4705: I still don't follow the comment in that diff. For example, how do we know that certain vendors enhance it to 6 tokens? And how do we know which token positions to use for custom formats? If there is a specific OS we're supporting it would be great to document that for posterity. > Slave failed to sample container with perf event > > > Key: MESOS-4705 > URL: https://issues.apache.org/jira/browse/MESOS-4705 > Project: Mesos > Issue Type: Bug > Components: cgroups, isolation >Affects Versions: 0.27.1 >Reporter: Fan Du >Assignee: Fan Du > > When sampling container with perf event on Centos7 with kernel > 3.10.0-123.el7.x86_64, slave complained with below error spew: > {code} > E0218 16:32:00.591181 8376 perf_event.cpp:408] Failed to get perf sample: > Failed to parse perf sample: Failed to parse perf sample line > '25871993253,,cycles,mesos/5f23ffca-87ed-4ff6-84f2-6ec3d4098ab8,10059827422,100.00': > Unexpected number of fields > {code} > it's caused by the current perf format [assumption | > https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob;f=src/linux/perf.cpp;h=1c113a2b3f57877e132bbd65e01fb2f045132128;hb=HEAD#l430] > with kernel version below 3.12 > On 3.10.0-123.el7.x86_64 kernel, the format is with 6 tokens as below: > value,unit,event,cgroup,running,ratio > A local modification fixed this error on my test bed, please review this > ticket. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
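Regarding which token positions to use: one defensive approach is to split each perf line on commas and dispatch on the token count. A hypothetical sketch in plain C++ (not Mesos's actual src/linux/perf.cpp parser; the 6-token layout is the one quoted in the ticket, the 5-token layout is an assumed older variant):

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Split one perf CSV line into fields; empty tokens (",,") are preserved.
std::vector<std::string> splitFields(const std::string& line) {
  std::vector<std::string> fields;
  std::stringstream stream(line);
  std::string field;
  while (std::getline(stream, field, ',')) {
    fields.push_back(field);
  }
  return fields;
}

// Extracts the event name, tolerating both an assumed 5-token layout and
// the 6-token layout seen on 3.10.0-123.el7.x86_64; returns "" on an
// unexpected number of fields (the error in the log above).
std::string parseEvent(const std::string& line) {
  const std::vector<std::string> fields = splitFields(line);
  if (fields.size() == 6) {
    return fields[2];  // value,unit,event,cgroup,running,ratio
  }
  if (fields.size() == 5) {
    return fields[1];  // value,event,cgroup,running,ratio
  }
  return "";
}
```

This keeps the parser working across kernel variants without pinning it to one vendor's format.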
[jira] [Commented] (MESOS-4981) Framework (re-)register metric counters broken for calls made via scheduler driver
[ https://issues.apache.org/jira/browse/MESOS-4981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15227046#comment-15227046 ] Benjamin Mahler commented on MESOS-4981: [~fan.du] Hm.. I'm not sure I follow the difficulty here. Can't these metrics be distinguished by introspecting {{subscribe.framework_info.id}}? If id is present, it is a re-registration. If id is absent, it is a registration. > Framework (re-)register metric counters broken for calls made via scheduler > driver > -- > > Key: MESOS-4981 > URL: https://issues.apache.org/jira/browse/MESOS-4981 > Project: Mesos > Issue Type: Bug > Components: master >Reporter: Anand Mazumdar >Assignee: Fan Du > Labels: mesosphere > > The counters {{master/messages_register_framework}} and > {{master/messages_reregister_framework}} are no longer being incremented > after the scheduler driver started sending {{Call}} messages to the master in > Mesos 0.23. We should correctly be incrementing these counters for PID based > frameworks as was the case previously. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
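The distinction suggested above can be sketched as follows, with a plain struct standing in for the real FrameworkInfo protobuf (hypothetical model, not Mesos source):

```cpp
#include <cassert>
#include <string>

// Simplified stand-in for the Mesos FrameworkInfo message; hasId models
// the protobuf framework_info.has_id() accessor.
struct FrameworkInfo {
  bool hasId = false;
  std::string id;
};

// If an id is present the framework has registered before, so count the
// SUBSCRIBE call as a re-registration; otherwise count it as a
// first registration.
bool isReregistration(const FrameworkInfo& info) {
  return info.hasId;
}
```

With this, the two counters can be incremented from a single SUBSCRIBE code path by introspecting the id.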
[jira] [Commented] (MESOS-5030) Expose TaskInfo's metadata to ResourceUsage struct
[ https://issues.apache.org/jira/browse/MESOS-5030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15214725#comment-15214725 ] Benjamin Mahler commented on MESOS-5030: I'd suggest we define a stripped {{Task}} message within {{ResourceUsage}} and only expose minimal metadata (e.g. id, name, labels). > Expose TaskInfo's metadata to ResourceUsage struct > -- > > Key: MESOS-5030 > URL: https://issues.apache.org/jira/browse/MESOS-5030 > Project: Mesos > Issue Type: Improvement > Components: oversubscription >Reporter: Zhitao Li >Assignee: Zhitao Li > Labels: qos, uber > > So the QosController could use metadata information from TaskInfo. > Based on conversations from the Mesos work group, we would at least include: > - task id; > - name; > - labels; > (I think resources and kill_policy should probably also be included). > An alternative would be to just purge fields like `data`. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
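A sketch of what such a stripped nested message could look like; the nesting, field names, and field numbers are illustrative, not an actual Mesos definition:

```proto
// Hypothetical stripped-down task metadata nested in ResourceUsage,
// exposing only the minimal fields discussed above.
message ResourceUsage {
  message Task {
    required TaskID task_id = 1;
    required string name = 2;
    optional Labels labels = 3;
    repeated Resource resources = 4;
  }
}
```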
[jira] [Commented] (MESOS-5029) Add labels to ExecutorInfo
[ https://issues.apache.org/jira/browse/MESOS-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15214701#comment-15214701 ] Benjamin Mahler commented on MESOS-5029: Sounds good; rather than introducing a new test, it would be ideal to extend an existing test that ensures the QoSController is getting a complete ResourceUsage (if one exists). I would also suggest that we deprecate {{ExecutorInfo.source}} in favor of using labels. > Add labels to ExecutorInfo > -- > > Key: MESOS-5029 > URL: https://issues.apache.org/jira/browse/MESOS-5029 > Project: Mesos > Issue Type: Improvement >Reporter: Zhitao Li >Assignee: Zhitao Li >Priority: Minor > Labels: uber > > We want to allow frameworks to populate metadata on the ExecutorInfo object. > A use case would be custom labels inspected by the QosController. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5018) FrameworkInfo Capability enum does not support upgrades.
[ https://issues.apache.org/jira/browse/MESOS-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-5018: --- Fix Version/s: 0.27.3 0.28.1 > FrameworkInfo Capability enum does not support upgrades. > > > Key: MESOS-5018 > URL: https://issues.apache.org/jira/browse/MESOS-5018 > Project: Mesos > Issue Type: Bug >Affects Versions: 0.23.0, 0.23.1, 0.24.0, 0.24.1, 0.25.0, 0.26.0, 0.27.0, > 0.27.1, 0.28.0, 0.27.2, 0.26.1, 0.25.1 >Reporter: Benjamin Mahler >Assignee: Benjamin Mahler > Fix For: 0.29.0, 0.28.1, 0.27.3 > > > See MESOS-4997 for the general issue around enum usage. This ticket tracks > fixing the FrameworkInfo Capability enum to support upgrades in a backwards > compatible way. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3487) Running libprocess tests in a loop leads to unbounded memory growth.
[ https://issues.apache.org/jira/browse/MESOS-3487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15209685#comment-15209685 ] Benjamin Mahler commented on MESOS-3487: Linking in MESOS-5021, which is one source of a leak. > Running libprocess tests in a loop leads to unbounded memory growth. > > > Key: MESOS-3487 > URL: https://issues.apache.org/jira/browse/MESOS-3487 > Project: Mesos > Issue Type: Bug > Components: libprocess >Reporter: Benjamin Mahler >Assignee: Diana Arroyo > Labels: newbie > > Was doing some repeat testing on a patch to check for flakiness and noticed > that the libprocess tests appear to have a leak that leads to unbounded > memory growth. > I checked the stout tests as well, they appear ok. > Notice the large RSS for the libprocess tests: > {noformat} > PID USER PR NI VIRT RES SHR S %CPU %MEMTIME+ COMMAND > 55133 root 20 0 479m 9.9m 6860 S 56.8 0.0 0:02.36 tests > 16410 root 20 0 575m 152m 6864 S 60.1 0.2 6:09.70 tests > 61363 root 20 0 606m 304m 6948 R 74.7 0.4 15:50.11 tests > 32836 root 20 0 116m 9032 5580 S 88.3 0.0 3:46.32 stout-tests > {noformat} > Commands to reproduce: > {noformat} > $ sudo ./3rdparty/libprocess/tests --gtest_repeat=-1 --gtest_break_on_failure > $ sudo ./3rdparty/libprocess/3rdparty/stout-tests --gtest_repeat=-1 > --gtest_break_on_failure > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5021) Memory leak in subprocess when 'environment' argument is provided.
[ https://issues.apache.org/jira/browse/MESOS-5021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-5021: --- Fix Version/s: 0.24.2 0.25.1 0.26.1 0.23.2 0.27.3 > Memory leak in subprocess when 'environment' argument is provided. > -- > > Key: MESOS-5021 > URL: https://issues.apache.org/jira/browse/MESOS-5021 > Project: Mesos > Issue Type: Bug > Components: libprocess, slave >Affects Versions: 0.23.0, 0.23.1, 0.24.0, 0.24.1, 0.25.0, 0.26.0, 0.27.0, > 0.27.1, 0.28.0, 0.27.2 >Reporter: Benjamin Mahler >Assignee: Benjamin Mahler >Priority: Blocker > Fix For: 0.26.1, 0.25.1, 0.24.2, 0.28.1, 0.27.3, 0.23.2 > > > A memory leak in process::subprocess was introduced here: > https://github.com/apache/mesos/commit/14b49f31840ff1523b31007c21b12c604700323f > This was found when [~jieyu] and I examined a memory leak in the health check > program (see MESOS-4869). > The leak is here: > https://github.com/apache/mesos/blob/0.28.0/3rdparty/libprocess/src/subprocess.cpp#L451-L456 > {code} > // Like above, we need to construct the environment that we'll pass > // to 'os::execvpe' as it might not be async-safe to perform the > // memory allocations. > char** envp = os::raw::environment(); > if (environment.isSome()) { > // NOTE: We add 1 to the size for a NULL terminator. > envp = new char*[environment.get().size() + 1]; > size_t index = 0; > foreachpair (const string& key, const string& value, environment.get()) { > string entry = key + "=" + value; > envp[index] = new char[entry.size() + 1]; > strncpy(envp[index], entry.c_str(), entry.size() + 1); > ++index; > } > envp[index] = NULL; > } > ... > // Need to delete 'envp' if we had environment variables passed to > // us and we needed to allocate the space. > if (environment.isSome()) { > CHECK_NE(os::raw::environment(), envp); > delete[] envp; // XXX Does not delete the sub arrays. 
> } > {code} > Auditing the code, it appears to affect a number of locations: > * > [docker::run|https://github.com/apache/mesos/blob/0.28.0/src/docker/docker.cpp#L661-L668] > * [health check > binary|https://github.com/apache/mesos/blob/0.28.0/src/health-check/main.cpp#L177-L205] > * > [liblogrotate|https://github.com/apache/mesos/blob/0.28.0/src/slave/container_loggers/lib_logrotate.cpp#L137-L194] > * Docker containerizer: > [here|https://github.com/apache/mesos/blob/0.28.0/src/slave/containerizer/docker.cpp#L1207-L1220] > and > [here|https://github.com/apache/mesos/blob/0.28.0/src/slave/containerizer/docker.cpp#L1119-L1131] > * [External > containerizer|https://github.com/apache/mesos/blob/0.28.0/src/slave/containerizer/external_containerizer.cpp#L479-L483] > * [Posix > launcher|https://github.com/apache/mesos/blob/0.28.0/src/slave/containerizer/mesos/launcher.cpp#L131-L141] > and [Linux > launcher|https://github.com/apache/mesos/blob/0.28.0/src/slave/containerizer/mesos/linux_launcher.cpp#L314-L324] > * > [Fetcher|https://github.com/apache/mesos/blob/0.28.0/src/slave/containerizer/fetcher.cpp#L768-L773] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
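The XXX comment above pinpoints the bug: {{delete[] envp}} frees only the outer array, leaking every {{new char[...]}} entry. A standalone sketch of the allocation pattern and the matching cleanup (illustrative only, not the actual Mesos patch):

```cpp
#include <cassert>
#include <cstring>
#include <map>
#include <string>

// Mirrors the quoted allocation: one outer array plus one heap string per
// "KEY=value" entry, NULL-terminated for os::execvpe-style consumers.
char** allocateEnvironment(
    const std::map<std::string, std::string>& environment) {
  char** envp = new char*[environment.size() + 1];  // +1 for the terminator.
  size_t index = 0;
  for (const auto& entry : environment) {
    const std::string variable = entry.first + "=" + entry.second;
    envp[index] = new char[variable.size() + 1];
    strncpy(envp[index], variable.c_str(), variable.size() + 1);
    ++index;
  }
  envp[index] = nullptr;
  return envp;
}

// The fix: free each inner entry before the outer array -- the step that
// a bare 'delete[] envp' omits.
void freeEnvironment(char** envp) {
  for (size_t i = 0; envp[i] != nullptr; ++i) {
    delete[] envp[i];
  }
  delete[] envp;
}
```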
[jira] [Assigned] (MESOS-5021) Memory leak in subprocess when 'environment' argument is provided.
[ https://issues.apache.org/jira/browse/MESOS-5021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler reassigned MESOS-5021: -- Assignee: Benjamin Mahler > Memory leak in subprocess when 'environment' argument is provided. > -- > > Key: MESOS-5021 > URL: https://issues.apache.org/jira/browse/MESOS-5021 > Project: Mesos > Issue Type: Bug > Components: libprocess, slave >Affects Versions: 0.23.0, 0.23.1, 0.24.0, 0.24.1, 0.25.0, 0.26.0, 0.27.0, > 0.27.1, 0.28.0, 0.27.2 >Reporter: Benjamin Mahler >Assignee: Benjamin Mahler >Priority: Blocker > Fix For: 0.28.1 > > > A memory leak in process::subprocess was introduced here: > https://github.com/apache/mesos/commit/14b49f31840ff1523b31007c21b12c604700323f > This was found when [~jieyu] and I examined a memory leak in the health check > program (see MESOS-4869). > The leak is here: > https://github.com/apache/mesos/blob/0.28.0/3rdparty/libprocess/src/subprocess.cpp#L451-L456 > {code} > // Like above, we need to construct the environment that we'll pass > // to 'os::execvpe' as it might not be async-safe to perform the > // memory allocations. > char** envp = os::raw::environment(); > if (environment.isSome()) { > // NOTE: We add 1 to the size for a NULL terminator. > envp = new char*[environment.get().size() + 1]; > size_t index = 0; > foreachpair (const string& key, const string& value, environment.get()) { > string entry = key + "=" + value; > envp[index] = new char[entry.size() + 1]; > strncpy(envp[index], entry.c_str(), entry.size() + 1); > ++index; > } > envp[index] = NULL; > } > ... > // Need to delete 'envp' if we had environment variables passed to > // us and we needed to allocate the space. > if (environment.isSome()) { > CHECK_NE(os::raw::environment(), envp); > delete[] envp; // XXX Does not delete the sub arrays. 
> } > {code} > Auditing the code, it appears to affect a number of locations: > * > [docker::run|https://github.com/apache/mesos/blob/0.28.0/src/docker/docker.cpp#L661-L668] > * [health check > binary|https://github.com/apache/mesos/blob/0.28.0/src/health-check/main.cpp#L177-L205] > * > [liblogrotate|https://github.com/apache/mesos/blob/0.28.0/src/slave/container_loggers/lib_logrotate.cpp#L137-L194] > * Docker containerizer: > [here|https://github.com/apache/mesos/blob/0.28.0/src/slave/containerizer/docker.cpp#L1207-L1220] > and > [here|https://github.com/apache/mesos/blob/0.28.0/src/slave/containerizer/docker.cpp#L1119-L1131] > * [External > containerizer|https://github.com/apache/mesos/blob/0.28.0/src/slave/containerizer/external_containerizer.cpp#L479-L483] > * [Posix > launcher|https://github.com/apache/mesos/blob/0.28.0/src/slave/containerizer/mesos/launcher.cpp#L131-L141] > and [Linux > launcher|https://github.com/apache/mesos/blob/0.28.0/src/slave/containerizer/mesos/linux_launcher.cpp#L314-L324] > * > [Fetcher|https://github.com/apache/mesos/blob/0.28.0/src/slave/containerizer/fetcher.cpp#L768-L773] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5021) Memory leak in subprocess when 'environment' argument is provided.
[ https://issues.apache.org/jira/browse/MESOS-5021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-5021: --- Fix Version/s: 0.28.1 > Memory leak in subprocess when 'environment' argument is provided. > -- > > Key: MESOS-5021 > URL: https://issues.apache.org/jira/browse/MESOS-5021 > Project: Mesos > Issue Type: Bug > Components: libprocess, slave >Affects Versions: 0.23.0, 0.23.1, 0.24.0, 0.24.1, 0.25.0, 0.26.0, 0.27.0, > 0.27.1, 0.28.0, 0.27.2 >Reporter: Benjamin Mahler >Priority: Blocker > Fix For: 0.28.1 > > > A memory leak in process::subprocess was introduced here: > https://github.com/apache/mesos/commit/14b49f31840ff1523b31007c21b12c604700323f > This was found when [~jieyu] and I examined a memory leak in the health check > program (see MESOS-4869). > The leak is here: > https://github.com/apache/mesos/blob/0.28.0/3rdparty/libprocess/src/subprocess.cpp#L451-L456 > {code} > // Like above, we need to construct the environment that we'll pass > // to 'os::execvpe' as it might not be async-safe to perform the > // memory allocations. > char** envp = os::raw::environment(); > if (environment.isSome()) { > // NOTE: We add 1 to the size for a NULL terminator. > envp = new char*[environment.get().size() + 1]; > size_t index = 0; > foreachpair (const string& key, const string& value, environment.get()) { > string entry = key + "=" + value; > envp[index] = new char[entry.size() + 1]; > strncpy(envp[index], entry.c_str(), entry.size() + 1); > ++index; > } > envp[index] = NULL; > } > ... > // Need to delete 'envp' if we had environment variables passed to > // us and we needed to allocate the space. > if (environment.isSome()) { > CHECK_NE(os::raw::environment(), envp); > delete[] envp; // XXX Does not delete the sub arrays. 
> } > {code} > Auditing the code, it appears to affect a number of locations: > * > [docker::run|https://github.com/apache/mesos/blob/0.28.0/src/docker/docker.cpp#L661-L668] > * [health check > binary|https://github.com/apache/mesos/blob/0.28.0/src/health-check/main.cpp#L177-L205] > * > [liblogrotate|https://github.com/apache/mesos/blob/0.28.0/src/slave/container_loggers/lib_logrotate.cpp#L137-L194] > * Docker containerizer: > [here|https://github.com/apache/mesos/blob/0.28.0/src/slave/containerizer/docker.cpp#L1207-L1220] > and > [here|https://github.com/apache/mesos/blob/0.28.0/src/slave/containerizer/docker.cpp#L1119-L1131] > * [External > containerizer|https://github.com/apache/mesos/blob/0.28.0/src/slave/containerizer/external_containerizer.cpp#L479-L483] > * [Posix > launcher|https://github.com/apache/mesos/blob/0.28.0/src/slave/containerizer/mesos/launcher.cpp#L131-L141] > and [Linux > launcher|https://github.com/apache/mesos/blob/0.28.0/src/slave/containerizer/mesos/linux_launcher.cpp#L314-L324] > * > [Fetcher|https://github.com/apache/mesos/blob/0.28.0/src/slave/containerizer/fetcher.cpp#L768-L773] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4869) /usr/libexec/mesos/mesos-health-check using/leaking a lot of memory
[ https://issues.apache.org/jira/browse/MESOS-4869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15209666#comment-15209666 ] Benjamin Mahler commented on MESOS-4869: Hi [~scalp42], thank you for reporting this! [~jieyu] and I took a look at the health check program and found that there is a memory leak in some of the library code it uses. I filed MESOS-5021 to follow up on it. We leak the environment variables each time a health check process is run, which is likely the leak you've observed. > /usr/libexec/mesos/mesos-health-check using/leaking a lot of memory > --- > > Key: MESOS-4869 > URL: https://issues.apache.org/jira/browse/MESOS-4869 > Project: Mesos > Issue Type: Bug >Affects Versions: 0.27.1 >Reporter: Anthony Scalisi >Priority: Critical > > We switched our health checks in Marathon from HTTP to COMMAND: > {noformat} > "healthChecks": [ > { > "protocol": "COMMAND", > "path": "/ops/ping", > "command": { "value": "curl --silent -f -X GET > http://$HOST:$PORT0/ops/ping > /dev/null" }, > "gracePeriodSeconds": 90, > "intervalSeconds": 2, > "portIndex": 0, > "timeoutSeconds": 5, > "maxConsecutiveFailures": 3 > } > ] > {noformat} > All our applications have the same health check (and /ops/ping endpoint). > Even though we have the issue on all our Mesos slaves, I'm going to focus on a > particular one: *mesos-slave-i-e3a9c724*. > The slave has 16 gigs of memory, with about 12 gigs allocated for 8 tasks: > !https://i.imgur.com/gbRf804.png!
> Here is a *docker ps* on it: > {noformat} > root@mesos-slave-i-e3a9c724 # docker ps > CONTAINER IDIMAGE COMMAND CREATED >STATUS PORTS NAMES > 4f7c0aa8d03ajava:8 "/bin/sh -c 'JAVA_OPT" 6 hours ago >Up 6 hours 0.0.0.0:31926->8080/tcp > mesos-29e183be-f611-41b4-824c-2d05b052231b-S6.3dbb1004-5bb8-432f-8fd8-b863bd29341d > 66f2fc8f8056java:8 "/bin/sh -c 'JAVA_OPT" 6 hours ago >Up 6 hours 0.0.0.0:31939->8080/tcp > mesos-29e183be-f611-41b4-824c-2d05b052231b-S6.60972150-b2b1-45d8-8a55-d63e81b8372a > f7382f241fcejava:8 "/bin/sh -c 'JAVA_OPT" 6 hours ago >Up 6 hours 0.0.0.0:31656->8080/tcp > mesos-29e183be-f611-41b4-824c-2d05b052231b-S6.39731a2f-d29e-48d1-9927-34ab8c5f557d > 880934c0049ejava:8 "/bin/sh -c 'JAVA_OPT" 24 hours ago >Up 24 hours 0.0.0.0:31371->8080/tcp > mesos-29e183be-f611-41b4-824c-2d05b052231b-S6.23dfe408-ab8f-40be-bf6f-ce27fe885ee0 > 5eab1f8dac4ajava:8 "/bin/sh -c 'JAVA_OPT" 46 hours ago >Up 46 hours 0.0.0.0:31500->8080/tcp > mesos-29e183be-f611-41b4-824c-2d05b052231b-S6.5ac75198-283f-4349-a220-9e9645b313e7 > b63740fe56e7java:8 "/bin/sh -c 'JAVA_OPT" 46 hours ago >Up 46 hours 0.0.0.0:31382->8080/tcp > mesos-29e183be-f611-41b4-824c-2d05b052231b-S6.5d417f16-df24-49d5-a5b0-38a7966460fe > 5c7a9ea77b0ejava:8 "/bin/sh -c 'JAVA_OPT" 2 days ago >Up 2 days 0.0.0.0:31186->8080/tcp > mesos-29e183be-f611-41b4-824c-2d05b052231b-S6.b05043c5-44fc-40bf-aea2-10354e8f5ab4 > 53065e7a31adjava:8 "/bin/sh -c 'JAVA_OPT" 2 days ago >Up 2 days 0.0.0.0:31839->8080/tcp > mesos-29e183be-f611-41b4-824c-2d05b052231b-S6.f0a3f4c5-ecdb-4f97-bede-d744feda670c > {noformat} > Here is a *docker stats* on it: > {noformat} > root@mesos-slave-i-e3a9c724 # docker stats > CONTAINER CPU % MEM USAGE / LIMIT MEM % > NET I/O BLOCK I/O > 4f7c0aa8d03a2.93% 797.3 MB / 1.611 GB 49.50% > 1.277 GB / 1.189 GB 155.6 kB / 151.6 kB > 53065e7a31ad8.30% 738.9 MB / 1.611 GB 45.88% > 419.6 MB / 554.3 MB 98.3 kB / 61.44 kB > 5c7a9ea77b0e4.91% 1.081 GB / 1.611 GB 67.10% > 423 MB / 526.5 MB 3.219 MB / 61.44 kB > 
5eab1f8dac4a3.13% 1.007 GB / 1.611 GB 62.53% > 2.737 GB / 2.564 GB 6.566 MB / 118.8 kB > 66f2fc8f80563.15% 768.1 MB / 1.611 GB 47.69% > 258.5 MB / 252.8 MB 1.86 MB / 151.6 kB > 880934c0049e10.07% 735.1 MB / 1.611 GB 45.64% > 1.451 GB / 1.399 GB 573.4 kB / 94.21 kB > b63740fe56e712.04% 629 MB / 1.611 GB 39.06% > 10.29 GB / 9.344 GB 8.102 MB / 61.44 kB > f7382f241fce6.21% 505 MB / 1.611 GB 31.36% > 153.4 MB / 151.9
[jira] [Created] (MESOS-5021) Memory leak in subprocess when 'environment' argument is provided.
Benjamin Mahler created MESOS-5021: -- Summary: Memory leak in subprocess when 'environment' argument is provided. Key: MESOS-5021 URL: https://issues.apache.org/jira/browse/MESOS-5021 Project: Mesos Issue Type: Bug Components: libprocess, slave Affects Versions: 0.27.2, 0.28.0, 0.27.1, 0.27.0, 0.26.0, 0.25.0, 0.24.1, 0.24.0, 0.23.1, 0.23.0 Reporter: Benjamin Mahler Priority: Blocker A memory leak in process::subprocess was introduced here: https://github.com/apache/mesos/commit/14b49f31840ff1523b31007c21b12c604700323f This was found when [~jieyu] and I examined a memory leak in the health check program (see MESOS-4869). The leak is here: https://github.com/apache/mesos/blob/0.28.0/3rdparty/libprocess/src/subprocess.cpp#L451-L456 {code} // Like above, we need to construct the environment that we'll pass // to 'os::execvpe' as it might not be async-safe to perform the // memory allocations. char** envp = os::raw::environment(); if (environment.isSome()) { // NOTE: We add 1 to the size for a NULL terminator. envp = new char*[environment.get().size() + 1]; size_t index = 0; foreachpair (const string& key, const string& value, environment.get()) { string entry = key + "=" + value; envp[index] = new char[entry.size() + 1]; strncpy(envp[index], entry.c_str(), entry.size() + 1); ++index; } envp[index] = NULL; } ... // Need to delete 'envp' if we had environment variables passed to // us and we needed to allocate the space. if (environment.isSome()) { CHECK_NE(os::raw::environment(), envp); delete[] envp; // XXX Does not delete the sub arrays. 
} {code} Auditing the code, it appears to affect a number of locations: * [docker::run|https://github.com/apache/mesos/blob/0.28.0/src/docker/docker.cpp#L661-L668] * [health check binary|https://github.com/apache/mesos/blob/0.28.0/src/health-check/main.cpp#L177-L205] * [liblogrotate|https://github.com/apache/mesos/blob/0.28.0/src/slave/container_loggers/lib_logrotate.cpp#L137-L194] * Docker containerizer: [here|https://github.com/apache/mesos/blob/0.28.0/src/slave/containerizer/docker.cpp#L1207-L1220] and [here|https://github.com/apache/mesos/blob/0.28.0/src/slave/containerizer/docker.cpp#L1119-L1131] * [External containerizer|https://github.com/apache/mesos/blob/0.28.0/src/slave/containerizer/external_containerizer.cpp#L479-L483] * [Posix launcher|https://github.com/apache/mesos/blob/0.28.0/src/slave/containerizer/mesos/launcher.cpp#L131-L141] and [Linux launcher|https://github.com/apache/mesos/blob/0.28.0/src/slave/containerizer/mesos/linux_launcher.cpp#L314-L324] * [Fetcher|https://github.com/apache/mesos/blob/0.28.0/src/slave/containerizer/fetcher.cpp#L768-L773] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4981) Framework (re-)register metric counters broken for calls made via scheduler driver
[ https://issues.apache.org/jira/browse/MESOS-4981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15209630#comment-15209630 ] Benjamin Mahler commented on MESOS-4981: I don't follow why you're subtracting in [r/45097|https://reviews.apache.org/r/45097/], it seems like a hack? Seeing the counter for registrations go up and then back down is going to cause confusion. > Framework (re-)register metric counters broken for calls made via scheduler > driver > -- > > Key: MESOS-4981 > URL: https://issues.apache.org/jira/browse/MESOS-4981 > Project: Mesos > Issue Type: Bug > Components: master >Reporter: Anand Mazumdar >Assignee: Fan Du > Labels: mesosphere > > The counters {{master/messages_register_framework}} and > {{master/messages_reregister_framework}} are no longer being incremented > after the scheduler driver started sending {{Call}} messages to the master in > Mesos 0.23. We should correctly be incrementing these counters for PID based > frameworks as was the case previously. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-1571) Signal escalation timeout is not configurable.
[ https://issues.apache.org/jira/browse/MESOS-1571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-1571: --- Issue Type: Improvement (was: Bug) > Signal escalation timeout is not configurable. > -- > > Key: MESOS-1571 > URL: https://issues.apache.org/jira/browse/MESOS-1571 > Project: Mesos > Issue Type: Improvement >Reporter: Niklas Quarfot Nielsen >Assignee: Alexander Rukletsov > Labels: mesosphere > > Even though the executor shutdown grace period is set to a larger interval, > the signal escalation timeout will still be 3 seconds. It should either be > configurable or dependent on EXECUTOR_SHUTDOWN_GRACE_PERIOD. > Thoughts? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3243) Replace NULL with nullptr
[ https://issues.apache.org/jira/browse/MESOS-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-3243: --- Issue Type: Improvement (was: Bug) > Replace NULL with nullptr > - > > Key: MESOS-3243 > URL: https://issues.apache.org/jira/browse/MESOS-3243 > Project: Mesos > Issue Type: Improvement >Reporter: Michael Park >Assignee: Tomasz Janiszewski > > As part of the C++ upgrade, it would be nice to move our use of {{NULL}} over > to use {{nullptr}}. I think it would be an interesting exercise to do this > with {{clang-modernize}} using the [nullptr > transform|http://clang.llvm.org/extra/UseNullptrTransform.html] (although > it's probably just as easy to use {{sed}}). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4997) Current approach to protobuf enums does not support upgrades.
[ https://issues.apache.org/jira/browse/MESOS-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15209408#comment-15209408 ] Benjamin Mahler commented on MESOS-4997: [~a10gupta] great! We can look at each enum one by one. Could you file tickets under this epic for each enum that needs to be fixed? You can follow the format of MESOS-5018 that I filed for FrameworkInfo Capability. > Current approach to protobuf enums does not support upgrades. > - > > Key: MESOS-4997 > URL: https://issues.apache.org/jira/browse/MESOS-4997 > Project: Mesos > Issue Type: Epic > Components: technical debt >Reporter: Benjamin Mahler >Priority: Critical > > Some users were opting in to the recently introduced > [TASK_KILLING_STATE|https://github.com/apache/mesos/blob/0.28.0/include/mesos/v1/mesos.proto#L259-L272] > capability introduced in 0.28.0. When the scheduler tries to register with > the TASK_KILLING_STATE capability against a 0.27.0 master, the master drops > the message and prints the following: > {noformat} > [libprotobuf ERROR google/protobuf/message_lite.cc:123] Can't parse message > of type "mesos.scheduler.Call" because it is missing required fields: > subscribe.framework_info.capabilities[0].type > {noformat} > It turns out that our approach to enums in general does not allow for > backwards compatibility. For example: > {code} > message Capability { > enum Type { > REVOCABLE_RESOURCES = 1; > TASK_KILLING_STATE = 2; // New! > } > required Type type = 1; > } > {code} > Using a required enum is problematic because protobuf will strip unknown enum > values during de-serialization: > https://developers.google.com/protocol-buffers/docs/proto#updating > {quote} > enum is compatible with int32, uint32, int64, and uint64 in terms of wire > format (note that values will be truncated if they don't fit), but be aware > that client code may treat them differently when the message is deserialized.
> Notably, unrecognized enum values are discarded when the message is > deserialized, which makes the field's has.. accessor return false and its > getter return the first value listed in the enum definition. However, an > integer field will always preserve its value. Because of this, you need to be > very careful when upgrading an integer to an enum in terms of receiving out > of bounds enum values on the wire. > {quote} > The suggestion on the protobuf mailing list is to use optional enum fields > and include an UNKNOWN value as the first entry in the enum list (and/or > explicitly specifying it as the default): > https://groups.google.com/forum/#!msg/protobuf/NhUjBfDyGmY/pf294zMi2bIJ > The updated version of Capability would be: > {code} > message Capability { > enum Type { > UNKNOWN = 0; > REVOCABLE_RESOURCES = 1; > TASK_KILLING_STATE = 2; // New! > } > optional Type type = 1; > } > {code} > Note that the first entry in an enum list is the default value, even if its > number is not the lowest (unless {{\[default = \]}} is explicitly > specified). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
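On the receiving side, the value of the UNKNOWN-as-first-entry convention is that an enum value sent by a newer peer deserializes to a default the receiver can skip instead of failing the whole message. A simplified plain-C++ model (not generated protobuf code):

```cpp
#include <cassert>

// Plain enum standing in for the generated Capability::Type; UNKNOWN = 0
// is the default that absorbs values from newer senders.
enum class CapabilityType {
  UNKNOWN = 0,
  REVOCABLE_RESOURCES = 1,
  TASK_KILLING_STATE = 2,
};

// A receiver can ignore capabilities it does not recognize rather than
// dropping the message, which is the upgrade path the ticket argues for.
bool isSupported(CapabilityType type) {
  switch (type) {
    case CapabilityType::REVOCABLE_RESOURCES:
    case CapabilityType::TASK_KILLING_STATE:
      return true;
    default:  // UNKNOWN or any future value.
      return false;
  }
}
```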
[jira] [Updated] (MESOS-5018) FrameworkInfo Capability enum does not support upgrades.
[ https://issues.apache.org/jira/browse/MESOS-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-5018: --- Affects Version/s: 0.28.0 > FrameworkInfo Capability enum does not support upgrades. > > > Key: MESOS-5018 > URL: https://issues.apache.org/jira/browse/MESOS-5018 > Project: Mesos > Issue Type: Bug >Affects Versions: 0.23.0, 0.23.1, 0.24.0, 0.24.1, 0.25.0, 0.26.0, 0.27.0, > 0.27.1, 0.28.0, 0.27.2, 0.26.1, 0.25.1 >Reporter: Benjamin Mahler >Assignee: Benjamin Mahler > Fix For: 0.29.0 > > > See MESOS-4997 for the general issue around enum usage. This ticket tracks > fixing the FrameworkInfo Capability enum to support upgrades in a backwards > compatible way. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-5018) FrameworkInfo Capability enum does not support upgrades.
Benjamin Mahler created MESOS-5018: -- Summary: FrameworkInfo Capability enum does not support upgrades. Key: MESOS-5018 URL: https://issues.apache.org/jira/browse/MESOS-5018 Project: Mesos Issue Type: Bug Affects Versions: 0.27.2, 0.27.1, 0.27.0, 0.26.0, 0.25.0, 0.24.1, 0.24.0, 0.23.1, 0.23.0, 0.26.1, 0.25.1 Reporter: Benjamin Mahler Assignee: Benjamin Mahler See MESOS-4997 for the general issue around enum usage. This ticket tracks fixing the FrameworkInfo Capability enum to support upgrades in a backwards compatible way. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4997) Current approach to protobuf enums does not support upgrades.
[ https://issues.apache.org/jira/browse/MESOS-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-4997: --- Epic Name: Enum Upgrade > Current approach to protobuf enums does not support upgrades. > - > > Key: MESOS-4997 > URL: https://issues.apache.org/jira/browse/MESOS-4997 > Project: Mesos > Issue Type: Epic > Components: technical debt >Reporter: Benjamin Mahler >Priority: Critical > > Some users were opting in to the recently introduced > [TASK_KILLING_STATE|https://github.com/apache/mesos/blob/0.28.0/include/mesos/v1/mesos.proto#L259-L272] > capability introduced in 0.28.0. When the scheduler tries to register with > the TASK_KILLING_STATE capability against a 0.27.0 master, the master drops > the message and prints the following: > {noformat} > [libprotobuf ERROR google/protobuf/message_lite.cc:123] Can't parse message > of type "mesos.scheduler.Call" because it is missing required fields: > subscribe.framework_info.capabilities[0].type > {noformat} > It turns out that our approach to enums in general does not allow for > backwards compatibility. For example: > {code} > message Capability { > enum Type { > REVOCABLE_RESOURCES = 1; > TASK_KILLING_STATE = 2; // New! > } > required Type type = 1; > } > {code} > Using a required enum is problematic because protobuf will strip unknown enum > values during de-serialization: > https://developers.google.com/protocol-buffers/docs/proto#updating > {quote} > enum is compatible with int32, uint32, int64, and uint64 in terms of wire > format (note that values will be truncated if they don't fit), but be aware > that client code may treat them differently when the message is deserialized. > Notably, unrecognized enum values are discarded when the message is > deserialized, which makes the field's has.. accessor return false and its > getter return the first value listed in the enum definition. However, an > integer field will always preserve its value.
Because of this, you need to be > very careful when upgrading an integer to an enum in terms of receiving out > of bounds enum values on the wire. > {quote} > The suggestion on the protobuf mailing list is to use optional enum fields > and include an UNKNOWN value as the first entry in the enum list (and/or > explicitly specifying it as the default): > https://groups.google.com/forum/#!msg/protobuf/NhUjBfDyGmY/pf294zMi2bIJ > The updated version of Capability would be: > {code} > message Capability { > enum Type { > UNKNOWN = 0; > REVOCABLE_RESOURCES = 1; > TASK_KILLING_STATE = 2; // New! > } > optional Type type = 1; > } > {code} > Note that the first entry in an enum list is the default value, even if it's > number is not the lowest (unless {{\[default = \]}} is explicitly > specified). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
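The stripping behavior quoted above, and why the optional-plus-UNKNOWN pattern avoids it, can be sketched with a small simulation (plain Python rather than the protobuf library; the dictionaries and function below are illustrative stand-ins, not real APIs):

```python
# Illustrative simulation of proto2 enum semantics: a decoder with a
# *closed* set of known enum values discards out-of-range values on
# deserialization, while an integer field would preserve them.
# All names here are hypothetical, for illustration only.

OLD_CAPABILITIES = {1: "REVOCABLE_RESOURCES"}             # 0.27.0 schema
NEW_CAPABILITIES = {1: "REVOCABLE_RESOURCES",
                    2: "TASK_KILLING_STATE"}              # 0.28.0 schema

def decode_enum(wire_value, known_values, default=None):
    """proto2-style: unknown enum values are dropped on deserialization,
    leaving the field unset (or at its declared default)."""
    if wire_value in known_values:
        return known_values[wire_value]
    # has_type() would return False here; for a *required* field this
    # makes the whole message unparseable -- the libprotobuf error above.
    return default

# A 0.28.0 scheduler sends TASK_KILLING_STATE (wire value 2):
assert decode_enum(2, NEW_CAPABILITIES) == "TASK_KILLING_STATE"

# A 0.27.0 master doesn't know value 2; the required field ends up unset:
assert decode_enum(2, OLD_CAPABILITIES) is None

# With `optional Type type` plus UNKNOWN as the default, the message
# still parses and the unknown capability degrades gracefully:
assert decode_enum(2, OLD_CAPABILITIES, default="UNKNOWN") == "UNKNOWN"
```

The key point the simulation captures: with `required`, an unknown value breaks parsing of the entire message; with `optional` plus an UNKNOWN default, the receiver simply sees UNKNOWN and can ignore the capability.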
[jira] [Updated] (MESOS-4997) Current approach to protobuf enums does not support upgrades.
[ https://issues.apache.org/jira/browse/MESOS-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-4997: --- Issue Type: Epic (was: Bug) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4997) Current approach to protobuf enums does not support upgrades.
[ https://issues.apache.org/jira/browse/MESOS-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15209386#comment-15209386 ] Benjamin Mahler commented on MESOS-4997: We should probably convert this to an epic and file tickets for each enum that we'd like to fix. For now, I've fixed the capability one here: https://reviews.apache.org/r/45151/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4721) Expose allocation algorithm latency via a metric.
[ https://issues.apache.org/jira/browse/MESOS-4721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-4721: --- Description: The allocation algorithm has grown to become fairly expensive; gaining visibility into its latency enables monitoring and alerting. Similar allocator timing-related information is already exposed in the log, but should also be exposed via an endpoint. was:Similar allocator timing-related information is already exposed in the log, but should also be exposed via an endpoint. Summary: Expose allocation algorithm latency via a metric. (was: Add allocator metric for allocation duration) > Expose allocation algorithm latency via a metric. > - > > Key: MESOS-4721 > URL: https://issues.apache.org/jira/browse/MESOS-4721 > Project: Mesos > Issue Type: Improvement > Components: allocation >Reporter: Benjamin Bannier >Assignee: Benjamin Bannier > Labels: mesosphere > > The allocation algorithm has grown to become fairly expensive; gaining > visibility into its latency enables monitoring and alerting. > Similar allocator timing-related information is already exposed in the log, > but should also be exposed via an endpoint. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
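The proposed metric amounts to timing each allocation run and exporting the duration. A minimal sketch; all names here (the `Timer` class, `allocate`, and the metric key) are assumptions for illustration, not actual Mesos or libprocess APIs:

```python
import time

class Timer:
    """Minimal latency metric: remembers the duration of the last run."""
    def __init__(self, name):
        self.name = name
        self.last_seconds = None

    def time(self, fn, *args):
        start = time.monotonic()
        result = fn(*args)
        self.last_seconds = time.monotonic() - start
        return result

def allocate(agents, frameworks):
    # Stand-in for the (expensive) allocation algorithm.
    return [(a, f) for a in agents for f in frameworks]

# Hypothetical metric name; the timed value would then be exported via
# a metrics endpoint for monitoring and alerting.
allocation_run = Timer("allocator/allocation_run")
offers = allocation_run.time(allocate, range(3), range(2))
assert len(offers) == 6
assert allocation_run.last_seconds >= 0.0
```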
[jira] [Updated] (MESOS-4720) Add allocator metrics for total vs offered/allocated resources.
[ https://issues.apache.org/jira/browse/MESOS-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-4720: --- Summary: Add allocator metrics for total vs offered/allocated resources. (was: Add allocator metric for current allocation breakdown) > Add allocator metrics for total vs offered/allocated resources. > --- > > Key: MESOS-4720 > URL: https://issues.apache.org/jira/browse/MESOS-4720 > Project: Mesos > Issue Type: Improvement > Components: allocation >Reporter: Benjamin Bannier >Assignee: Benjamin Bannier > Labels: mesosphere > > Exposing the current allocation breakdown as seen by the allocator will allow > us to correlate the corresponding metrics in the master with what the > allocator sees. We should expose at least allocated or available, and total. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5001) Prefix allocator metrics with "mesos/" to better support custom allocator metrics.
[ https://issues.apache.org/jira/browse/MESOS-5001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-5001: --- Description: There currently exists only a single allocator metric named {code} 'allocator/event_queue_dispatches' {code} In order to support different allocator implementations (the "mesos" allocator being the default one included in the project currently), it would be better to rename the metric so that allocator metrics are prefixed with the allocator implementation name: {code} allocator/mesos/event_queue_dispatches {code} This is consistent with the approach taken for containerizer metrics, where the mesos containerizer exposes its metrics under a "mesos/" prefix. was:The allocator metric {{allocator/event_queue_dispatches}} is specific to the Mesos allocator, while the name does not clearly communicate that fact. It seems we should be more clear, like we e.g., already do for the containerizers, e.g., a name like {{allocator/mesos/event_queue_dispatches}}. Summary: Prefix allocator metrics with "mesos/" to better support custom allocator metrics. (was: Allocator metric 'allocator/event_queue_dispatches' should be marked as Mesos allocator-specific ) > Prefix allocator metrics with "mesos/" to better support custom allocator > metrics. 
> -- > > Key: MESOS-5001 > URL: https://issues.apache.org/jira/browse/MESOS-5001 > Project: Mesos > Issue Type: Improvement > Components: allocation >Affects Versions: 0.23.0, 0.24.0, 0.25.0, 0.26.0, 0.27.0, 0.28.0 >Reporter: Benjamin Bannier >Assignee: Benjamin Bannier > Labels: mesosphere > > There currently exists only a single allocator metric named > {code} > 'allocator/event_queue_dispatches' > {code} > In order to support different allocator implementations (the "mesos" > allocator being the default one included in the project currently), it would > be better to rename the metric so that allocator metrics are prefixed with > the allocator implementation name: > {code} > allocator/mesos/event_queue_dispatches > {code} > This is consistent with the approach taken for containerizer metrics, where the > mesos containerizer exposes its metrics under a "mesos/" prefix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4997) Current approach to protobuf enums does not support upgrades.
[ https://issues.apache.org/jira/browse/MESOS-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-4997: --- Component/s: technical debt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4997) Current approach to protobuf enums does not support upgrades.
Benjamin Mahler created MESOS-4997: -- Summary: Current approach to protobuf enums does not support upgrades. Key: MESOS-4997 URL: https://issues.apache.org/jira/browse/MESOS-4997 Project: Mesos Issue Type: Bug Reporter: Benjamin Mahler Priority: Critical Some users were opting in to the [TASK_KILLING_STATE|https://github.com/apache/mesos/blob/0.28.0/include/mesos/v1/mesos.proto#L259-L272] capability introduced in 0.28.0. When the scheduler tries to register with the TASK_KILLING_STATE capability against a 0.27.0 master, the master drops the message and prints the following: {noformat} [libprotobuf ERROR google/protobuf/message_lite.cc:123] Can't parse message of type "mesos.scheduler.Call" because it is missing required fields: subscribe.framework_info.capabilities[0].type {noformat} It turns out that our approach to enums in general does not allow for backwards compatibility. For example: {code} message Capability { enum Type { REVOCABLE_RESOURCES = 1; TASK_KILLING_STATE = 2; // New! } required Type type = 1; } {code} Using a required enum is problematic because protobuf will strip unknown enum values during de-serialization: https://developers.google.com/protocol-buffers/docs/proto#updating {quote} enum is compatible with int32, uint32, int64, and uint64 in terms of wire format (note that values will be truncated if they don't fit), but be aware that client code may treat them differently when the message is deserialized. Notably, unrecognized enum values are discarded when the message is deserialized, which makes the field's has.. accessor return false and its getter return the first value listed in the enum definition. However, an integer field will always preserve its value. Because of this, you need to be very careful when upgrading an integer to an enum in terms of receiving out of bounds enum values on the wire. 
{quote} The suggestion on the protobuf mailing list is to use optional enum fields and include an UNKNOWN value as the first entry in the enum list (and/or explicitly specifying it as the default): https://groups.google.com/forum/#!msg/protobuf/NhUjBfDyGmY/pf294zMi2bIJ The updated version of Capability would be: {code} message Capability { enum Type { UNKNOWN = 0; REVOCABLE_RESOURCES = 1; TASK_KILLING_STATE = 2; // New! } optional Type type = 1; } {code} Note that the first entry in an enum list is the default value, even if its number is not the lowest (unless {{\[default = \]}} is explicitly specified). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4980) Expose metrics in the agent for directory garbage collection.
Benjamin Mahler created MESOS-4980: -- Summary: Expose metrics in the agent for directory garbage collection. Key: MESOS-4980 URL: https://issues.apache.org/jira/browse/MESOS-4980 Project: Mesos Issue Type: Epic Components: slave Reporter: Benjamin Mahler Rather than deleting immediately, the agent garbage-collects directories that are no longer needed, based on a combination of time and machine-level disk usage. There are currently no metrics providing visibility into this process, so monitoring and alerting are not possible. This means errors can go unnoticed, as was the case with MESOS-4979. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
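As a hedged sketch of the time-plus-disk-usage policy described above: delay deletion of unused directories, shrinking the delay as disk usage grows so collection happens sooner under disk pressure. The linear scaling rule below is an assumption for illustration, not the agent's actual formula:

```python
from datetime import timedelta

# Hypothetical policy: full delay when the disk is empty, approaching
# zero as the disk fills. MAX_DELAY is an invented configuration value.
MAX_DELAY = timedelta(days=7)

def gc_delay(disk_usage_fraction):
    """Return how long to keep an unused directory before GC'ing it,
    given disk usage in [0.0, 1.0]."""
    headroom = max(0.0, 1.0 - disk_usage_fraction)
    return MAX_DELAY * headroom

assert gc_delay(0.0) == timedelta(days=7)
assert gc_delay(0.5) == timedelta(days=3, hours=12)
assert gc_delay(1.0) == timedelta(0)
```

Metrics such as the number of paths pending deletion or the number of GC failures would give exactly the visibility into this process that the ticket asks for.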
[jira] [Commented] (MESOS-4740) Improve master metrics/snapshot performance
[ https://issues.apache.org/jira/browse/MESOS-4740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15191486#comment-15191486 ] Benjamin Mahler commented on MESOS-4740: If you can't reproduce the slowness, then it seems more likely that the metrics computation isn't inherently slow, no? The implication of it being slow only sometimes seems to be that sometimes the master and/or allocator are backlogged. "Complete waste of CPU cycles" assumes that the only thing we care about is how many CPU cycles are needed to accomplish our work. We care about much more than just that, for example, how simple and understandable is the code? By introducing event-driven counters, we'll be making the code more complicated. If we want to make such a tradeoff, we first have to establish a basis for it (there are endless places where we could reduce cpu cycles) and measure what we're improving (how large is the improvement). I'm not saying we shouldn't do it, but please first do a deeper analysis here and use benchmarks to demonstrate that the improvement is worth it. For example, https://reviews.apache.org/r/44675/ seems misdirected; I would be very surprised if this has a non-negligible impact on what you're seeing. > Improve master metrics/snapshot performance > -- > > Key: MESOS-4740 > URL: https://issues.apache.org/jira/browse/MESOS-4740 > Project: Mesos > Issue Type: Task >Reporter: Cong Wang >Assignee: Cong Wang > > [~drobinson] noticed retrieving metrics/snapshot statistics could be very > inefficient. 
> {noformat} > [user@server ~]$ time curl -s localhost:5050/metrics/snapshot > real 0m35.654s > user 0m0.019s > sys 0m0.011s > {noformat} > MESOS-1287 introduces a timeout parameter for this query, but > metric collectors like ours are not aware of such a URL-specific > parameter, so we need to: > 1) Always have a timeout, with some default value. > 2) Investigate why master metrics/snapshot could take such a long time to > complete under load. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4740) Improve master metrics/snapshot performance
[ https://issues.apache.org/jira/browse/MESOS-4740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190096#comment-15190096 ] Benjamin Mahler commented on MESOS-4740: Hey [~wangcong] [~idownes], did you guys investigate why computing metrics takes 30 seconds? It's entirely possible that allocator backup / master backup is where most of the 30 seconds is spent, at which point optimizing the gauge computation would be useless. Or was it the case that computing all of the gauge values took 30 seconds? (Seems unlikely). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4705) Slave failed to sample container with perf event
[ https://issues.apache.org/jira/browse/MESOS-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15188413#comment-15188413 ] Benjamin Mahler commented on MESOS-4705: Switching from the perf binary to the headers sounds good but is an orthogonal issue and a larger change. It would be ideal to first fix the version parsing issues in the existing code, since it should be a small change and could be backported to previous releases if necessary. [~fan.du] Can you update your patch to address the concerns? > Slave failed to sample container with perf event > > > Key: MESOS-4705 > URL: https://issues.apache.org/jira/browse/MESOS-4705 > Project: Mesos > Issue Type: Bug > Components: cgroups, isolation >Affects Versions: 0.27.1 >Reporter: Fan Du >Assignee: Fan Du > > When sampling a container with perf event on CentOS 7 with kernel > 3.10.0-123.el7.x86_64, the slave complained with the below error: > {code} > E0218 16:32:00.591181 8376 perf_event.cpp:408] Failed to get perf sample: > Failed to parse perf sample: Failed to parse perf sample line > '25871993253,,cycles,mesos/5f23ffca-87ed-4ff6-84f2-6ec3d4098ab8,10059827422,100.00': > Unexpected number of fields > {code} > It's caused by the current perf format [assumption | > https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob;f=src/linux/perf.cpp;h=1c113a2b3f57877e132bbd65e01fb2f045132128;hb=HEAD#l430] > for kernel versions below 3.12. > On the 3.10.0-123.el7.x86_64 kernel, the format has 6 tokens, as below: > value,unit,event,cgroup,running,ratio > A local modification fixed this error on my test bed; please review this > ticket. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
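Based on the sample line quoted in the ticket, parsing the 6-field layout might look like the following sketch. The field layout (value,unit,event,cgroup,running,ratio) is taken directly from the ticket; this is not the actual Mesos parser in src/linux/perf.cpp, and other kernel versions emit a different number of fields, which is exactly the bug reported here:

```python
def parse_perf_line(line):
    """Parse one CSV line of cgroup-scoped `perf stat` output, assuming
    the 6-field layout quoted in this ticket. Raises on any other field
    count, mirroring the 'Unexpected number of fields' error."""
    fields = line.split(",")
    if len(fields) != 6:
        raise ValueError("Unexpected number of fields: %d" % len(fields))
    value, unit, event, cgroup, running, ratio = fields
    return {"event": event, "cgroup": cgroup, "value": int(value)}

sample = ("25871993253,,cycles,"
          "mesos/5f23ffca-87ed-4ff6-84f2-6ec3d4098ab8,10059827422,100.00")
parsed = parse_perf_line(sample)
assert parsed["event"] == "cycles"
assert parsed["value"] == 25871993253
assert parsed["cgroup"].startswith("mesos/")
```

A version-tolerant fix along the lines the ticket proposes would select the expected field count based on the detected kernel/perf version before splitting.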
[jira] [Commented] (MESOS-4709) Enable compiler optimization by default
[ https://issues.apache.org/jira/browse/MESOS-4709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15184256#comment-15184256 ] Benjamin Mahler commented on MESOS-4709: Linked in MESOS-1985 for some context on why this was changed originally. > Enable compiler optimization by default > --- > > Key: MESOS-4709 > URL: https://issues.apache.org/jira/browse/MESOS-4709 > Project: Mesos > Issue Type: Improvement > Components: general >Reporter: Neil Conway >Assignee: Neil Conway > Labels: autoconf, configure, mesosphere > > At present, Mesos defaults to compiling with "-O0"; to enable compiler > optimizations, the user needs to specify "--enable-optimize" when running > {{configure}}. > We should change the default for the following reasons: > (1) The autoconf default for CFLAGS/CXXFLAGS is "-O2 -g". Anecdotally, > I think most software packages compile with a reasonable level of > optimizations enabled by default. > (2) I think we should make the default configure flags appropriate for > end-users (rather than Mesos developers): developers will be familiar > enough with Mesos to tune the configure flags according to their own > preferences. > (3) The performance consequences of not enabling compiler > optimizations can be pretty severe: 5x in a benchmark I just ran, and > we've seen between 2x and 30x (!) performance differences for some > real-world workloads. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4705) Slave failed to sample container with perf event
[ https://issues.apache.org/jira/browse/MESOS-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-4705: --- Shepherd: Benjamin Mahler Sorry for the delay, thanks for looking into this! I left some comments on the review. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4447) Updated reserved() API
[ https://issues.apache.org/jira/browse/MESOS-4447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172278#comment-15172278 ] Benjamin Mahler commented on MESOS-4447: Hm.. I can't tell why we're doing this change: https://reviews.apache.org/r/42590/diff/3#1 If I remember correctly, we have these two overloads because the return types are different. For the first function {{hashmap<string, Resources> reserved()}}, we want to obtain a mapping of the reserved resources, indexed by the role. The second function {{Resources reserved(string role)}} is equivalent to an entry in the map returned by the first function. What are the issues with these and why are you trying to consolidate them? > Updated reserved() API > -- > > Key: MESOS-4447 > URL: https://issues.apache.org/jira/browse/MESOS-4447 > Project: Mesos > Issue Type: Bug >Reporter: Guangya Liu >Assignee: Guangya Liu > > There are some problems with the current {{reserved}} API: > {code} > hashmap<string, Resources> Resources::reserved() const > { > hashmap<string, Resources> result; > foreach (const Resource& resource, resources) { > if (isReserved(resource)) { > result[resource.role()] += resource; > } > } > return result; > } > Resources Resources::reserved(const string& role) const > { > return filter(lambda::bind(isReserved, lambda::_1, role)); > } > bool Resources::isReserved( > const Resource& resource, > const Option<string>& role) > { > if (role.isSome()) { > return !isUnreserved(resource) && role.get() == resource.role(); > } else { > return !isUnreserved(resource); > } > } > {code} > This means {{reserved(const string& role)}} cannot be passed a None() > parameter to get all reserved resources in flattened form. > The proposed solution is to remove {{reserved()}} and update {{reserved(const string& > role)}} to {{reserved(const Option<string>& role = None())}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
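The consolidation discussed in the ticket, a single accessor whose role argument is optional, can be sketched in Python (the dict-based resource representation and function names are illustrative only, not the Mesos C++ API):

```python
# A resource is "reserved" when its role is not the default '*' role.
def is_reserved(resource, role=None):
    if role is not None:
        return resource["role"] != "*" and resource["role"] == role
    return resource["role"] != "*"

def reserved(resources, role=None):
    """role=None returns all reserved resources -- the case the ticket
    says the current C++ overloads cannot express in one signature."""
    return [r for r in resources if is_reserved(r, role)]

resources = [{"name": "cpus", "role": "*"},     # unreserved
             {"name": "cpus", "role": "ads"},   # reserved for 'ads'
             {"name": "mem",  "role": "etl"}]   # reserved for 'etl'

assert len(reserved(resources)) == 2           # all reserved resources
assert len(reserved(resources, "ads")) == 1    # a single role's share
```

The trade-off raised in the comment still applies: this collapses two overloads with different return shapes (a per-role map versus a single role's resources) into one flat list, so callers wanting the map must build it themselves.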
[jira] [Updated] (MESOS-4776) Libprocess metrics/snapshot endpoint rate limiting should be configurable.
[ https://issues.apache.org/jira/browse/MESOS-4776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-4776: --- Description: Currently the {{/metrics/snapshot}} endpoint in libprocess has a [hard-coded|https://github.com/apache/mesos/blob/0.27.1/3rdparty/libprocess/include/process/metrics/metrics.hpp#L52] rate limit of 2 requests per second: {code} MetricsProcess() : ProcessBase("metrics"), limiter(2, Seconds(1)) {} {code} This should be configurable via a libprocess environment variable so that users can control this when initializing libprocess. Summary: Libprocess metrics/snapshot endpoint rate limiting should be configurable. (was: It should be possible to disable rate limiting of the metrics endpoint for tests) > Libprocess metrics/snapshot endpoint rate limiting should be configurable. > -- > > Key: MESOS-4776 > URL: https://issues.apache.org/jira/browse/MESOS-4776 > Project: Mesos > Issue Type: Improvement > Components: libprocess >Reporter: Benjamin Bannier >Assignee: Benjamin Bannier > > Currently the {{/metrics/snapshot}} endpoint in libprocess has a > [hard-coded|https://github.com/apache/mesos/blob/0.27.1/3rdparty/libprocess/include/process/metrics/metrics.hpp#L52] > rate limit of 2 requests per second: > {code} > MetricsProcess() > : ProcessBase("metrics"), > limiter(2, Seconds(1)) {} > {code} > This should be configurable via a libprocess environment variable so that > users can control this when initializing libprocess. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
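The hard-coded {{limiter(2, Seconds(1))}} is a limit of 2 requests per one-second window. A sketch of such a limiter with the limit read from an environment variable, as the ticket proposes (the variable name and the sliding-window implementation are assumptions, not the libprocess code):

```python
import os
import time
from collections import deque

class RateLimiter:
    """Sliding-window limiter: at most `limit` requests per `window` secs."""
    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.stamps = deque()  # timestamps of recently admitted requests

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        # Drop admissions that have aged out of the window.
        while self.stamps and now - self.stamps[0] >= self.window:
            self.stamps.popleft()
        if len(self.stamps) < self.limit:
            self.stamps.append(now)
            return True
        return False

# Limit read from the environment at initialization time, defaulting to
# the current hard-coded value of 2 (variable name invented here):
limit = int(os.environ.get("METRICS_SNAPSHOT_RATE_LIMIT", "2"))
limiter = RateLimiter(limit, 1.0)
assert limiter.allow(now=0.0) is True
assert limiter.allow(now=0.1) is True
assert limiter.allow(now=0.2) is False   # third request within the window
assert limiter.allow(now=1.5) is True    # earlier requests have aged out
```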
[jira] [Updated] (MESOS-4776) It should be possible to disable rate limiting of the metrics endpoint for tests
[ https://issues.apache.org/jira/browse/MESOS-4776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-4776: --- Component/s: libprocess > It should be possible to disable rate limiting of the metrics endpoint for > tests > > > Key: MESOS-4776 > URL: https://issues.apache.org/jira/browse/MESOS-4776 > Project: Mesos > Issue Type: Improvement > Components: libprocess >Reporter: Benjamin Bannier >Assignee: Benjamin Bannier > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4673) Agent fails to shutdown after re-registering period timed-out.
[ https://issues.apache.org/jira/browse/MESOS-4673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-4673: --- Component/s: docker > Agent fails to shutdown after re-registering period timed-out. > -- > > Key: MESOS-4673 > URL: https://issues.apache.org/jira/browse/MESOS-4673 > Project: Mesos > Issue Type: Bug > Components: docker >Reporter: Jan Schlicht >Assignee: Jan Schlicht > Labels: mesosphere > > Under certain conditions, when a mesos agent loses connection to the master > for an extended period of time (say a switch fails), the master will > de-register the agent, and then when the agent comes back up, refuse to let > it register: {{Slave asked to shut down by master@10.102.25.1:5050 because > 'Slave attempted to re-register after removal'}}. > The agent doesn't seem to be able to properly shut down and remove running > tasks as it should do to register as a new agent. Hence this message will > persist until it's resolved by manual intervention. > This seems to be caused by Docker tasks that couldn't shut down cleanly when > the agent is asked to shut down running tasks to be able to register as a new > agent with the master. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4770) Investigate performance improvements for 'Resources' class.
Benjamin Mahler created MESOS-4770: -- Summary: Investigate performance improvements for 'Resources' class. Key: MESOS-4770 URL: https://issues.apache.org/jira/browse/MESOS-4770 Project: Mesos Issue Type: Improvement Reporter: Benjamin Mahler Priority: Critical Currently we have performance issues under heavy usage of the {{Resources}} class, and we tend to work around them in the caller code (e.g. by reducing the number of Resources arithmetic operations). The implementation of {{Resources}} currently consists of wrapping underlying {{Resource}} protobuf objects and manipulating them. This is fairly expensive compared to doing things more directly with C++ objects. This ticket is to explore the performance improvement of using C++ objects more directly instead of working off of {{Resource}} objects. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4767) Apply batching to allocation events to reduce allocator backlogging.
Benjamin Mahler created MESOS-4767: -- Summary: Apply batching to allocation events to reduce allocator backlogging. Key: MESOS-4767 URL: https://issues.apache.org/jira/browse/MESOS-4767 Project: Mesos Issue Type: Improvement Components: allocation Reporter: Benjamin Mahler Per the [discussion|https://issues.apache.org/jira/browse/MESOS-3157?focusedCommentId=14728377&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14728377] that came out of MESOS-3157, we'd like to batch together outstanding allocation dispatches in order to avoid backing up the allocator. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4694) DRFAllocator takes very long to allocate resources with a large number of frameworks
[ https://issues.apache.org/jira/browse/MESOS-4694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-4694: --- Issue Type: Improvement (was: Bug) > DRFAllocator takes very long to allocate resources with a large number of > frameworks > > > Key: MESOS-4694 > URL: https://issues.apache.org/jira/browse/MESOS-4694 > Project: Mesos > Issue Type: Improvement > Components: allocation >Affects Versions: 0.26.0, 0.27.0, 0.27.1 >Reporter: Dario Rexin >Assignee: Dario Rexin > > With a growing number of connected frameworks, the allocation time grows to > very high numbers. The addition of quota in 0.27 had an additional impact on > these numbers. Running `mesos-tests.sh --benchmark > --gtest_filter=HierarchicalAllocator_BENCHMARK_Test.DeclineOffers` gives us > the following numbers: > {noformat} > [==] Running 1 test from 1 test case. > [--] Global test environment set-up. > [--] 1 test from HierarchicalAllocator_BENCHMARK_Test > [ RUN ] HierarchicalAllocator_BENCHMARK_Test.DeclineOffers > Using 2000 slaves and 200 frameworks > round 0 allocate took 2.921202secs to make 200 offers > round 1 allocate took 2.85045secs to make 200 offers > round 2 allocate took 2.823768secs to make 200 offers > {noformat} > Increasing the number of frameworks to 2000: > {noformat} > [==] Running 1 test from 1 test case. > [--] Global test environment set-up. > [--] 1 test from HierarchicalAllocator_BENCHMARK_Test > [ RUN ] HierarchicalAllocator_BENCHMARK_Test.DeclineOffers > Using 2000 slaves and 2000 frameworks > round 0 allocate took 28.209454secs to make 2000 offers > round 1 allocate took 28.469419secs to make 2000 offers > round 2 allocate took 28.138086secs to make 2000 offers > {noformat} > I was able to reduce this time by a substantial amount. After applying the > patches: > {noformat} > [==] Running 1 test from 1 test case. > [--] Global test environment set-up. 
> [--] 1 test from HierarchicalAllocator_BENCHMARK_Test > [ RUN ] HierarchicalAllocator_BENCHMARK_Test.DeclineOffers > Using 2000 slaves and 200 frameworks > round 0 allocate took 1.016226secs to make 2000 offers > round 1 allocate took 1.102729secs to make 2000 offers > round 2 allocate took 1.102624secs to make 2000 offers > {noformat} > And with 2000 frameworks: > {noformat} > [==] Running 1 test from 1 test case. > [--] Global test environment set-up. > [--] 1 test from HierarchicalAllocator_BENCHMARK_Test > [ RUN ] HierarchicalAllocator_BENCHMARK_Test.DeclineOffers > Using 2000 slaves and 2000 frameworks > round 0 allocate took 12.563203secs to make 2000 offers > round 1 allocate took 12.437517secs to make 2000 offers > round 2 allocate took 12.470708secs to make 2000 offers > {noformat} > The patches do 3 things to improve the performance of the allocator. > 1) The total values in the DRFSorter will be pre-calculated per resource type > 2) In the allocate method, when no resources are available to allocate, we > break out of the innermost loop to prevent looping over a large number of > frameworks when we have nothing to allocate > 3) When a framework suppresses offers, we remove it from the sorter instead > of just calling continue in the allocation loop - this greatly improves > performance in the sorter and prevents looping over frameworks that don't > need resources > Assuming that most of the frameworks behave nicely and suppress offers when > they have nothing to schedule, it is fair to assume that point 3) has the > biggest impact on the performance. If we suppress offers for 90% of the > frameworks in the benchmark test, we see the following numbers: > {noformat} > [==] Running 1 test from 1 test case. > [--] Global test environment set-up. 
> [--] 1 test from HierarchicalAllocator_BENCHMARK_Test > [ RUN ] HierarchicalAllocator_BENCHMARK_Test.DeclineOffers > Using 200 slaves and 2000 frameworks > round 0 allocate took 11626us to make 200 offers > round 1 allocate took 22890us to make 200 offers > round 2 allocate took 21346us to make 200 offers > {noformat} > And for 200 frameworks: > {noformat} > [==] Running 1 test from 1 test case. > [--] Global test environment set-up. > [--] 1 test from HierarchicalAllocator_BENCHMARK_Test > [ RUN ] HierarchicalAllocator_BENCHMARK_Test.DeclineOffers > Using 2000 slaves and 2000 frameworks > round 0 allocate took 1.11178secs to make 2000 offers > round 1 allocate took 1.062649secs to make 2000 offers > round 2 allocate took 1.080181secs to make 2000 offers > {noformat} > Review requests: > https://reviews.apache.org/r/43665/ > https://reviews.apache.org/r/43666/ > https://
[jira] [Created] (MESOS-4766) Improve allocator performance.
Benjamin Mahler created MESOS-4766: -- Summary: Improve allocator performance. Key: MESOS-4766 URL: https://issues.apache.org/jira/browse/MESOS-4766 Project: Mesos Issue Type: Epic Components: allocation Reporter: Benjamin Mahler Priority: Critical This is an epic to track the various tickets around improving the performance of the allocator, including the following: * Preventing unnecessary backup of the allocator. * Reducing the cost of allocations and allocator state updates. * Improving performance of the DRF sorter. * More benchmarking to simulate scenarios with performance issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (MESOS-4547) Introduce TASK_KILLING state.
[ https://issues.apache.org/jira/browse/MESOS-4547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15155681#comment-15155681 ] Benjamin Mahler edited comment on MESOS-4547 at 2/23/16 6:11 PM: - {noformat} commit 022be0a833dfb58de958a80eabc89fa9334782e0 Author: Abhishek Dasgupta Date: Sat Feb 20 12:56:24 2016 +0100 Introduced a TASK_KILLING state. TASK_KILLING can be used to signify that the kill request has been received by the executor, but the task is not yet killed. This is similar to how TASK_STARTING indicates the launch request has been received by the executor, but the task is not yet launched. This new state will be guarded by a framework capability in order to ensure that we do not break older frameworks. Review: https://reviews.apache.org/r/43487/ {noformat} {noformat} commit cae9162a198a4a298fee920c57dbd128731529e2 Author: Abhishek Dasgupta Date: Sat Feb 20 13:09:16 2016 +0100 Added a framework capability to guard TASK_KILLING. Frameworks must opt-in to receive TASK_KILLING. For now, it will be entirely the responsibility of the executor to check the capability before sending a TASK_KILLING update. Review: https://reviews.apache.org/r/43488/ {noformat} {noformat} commit a30233b994dd1a77eb8ef37525b5aa7b6ecdf3bd Author: Abhishek Dasgupta Date: Sat Feb 20 14:25:15 2016 +0100 Updated the command / docker executors to send TASK_KILLING. Review: https://reviews.apache.org/r/43489/ {noformat} {noformat} commit ee86b13633a9469629dbd79681d0776b6020f76a Author: Benjamin Mahler Date: Sat Feb 20 16:18:22 2016 +0100 Added command executor tests for TASK_KILLING. {noformat} {noformat} commit 978ccb5dd637f0e1577ecae1e21973f50429b04c Author: Benjamin Mahler Date: Sat Feb 20 17:28:58 2016 +0100 Added docker executor tests for TASK_KILLING. {noformat} {noformat} commit 1488f16d283f69b7dc96feaee91b04a09012ca4a Author: Benjamin Mahler Date: Sat Feb 20 17:35:30 2016 +0100 Added TASK_KILLING to the API changes in the CHANGELOG. 
{noformat} {noformat} commit 8b5137ddbb33417518dee32066c0fb552d05d046 Author: Benjamin Mahler Date: Mon Feb 22 06:16:50 2016 +0100 Updated the HA framework guide for TASK_KILLING. Review: https://reviews.apache.org/r/43821 {noformat} {noformat} commit b1092b005b03d231b18806679a8e1d58a09f3004 Author: Benjamin Mahler Date: Mon Feb 22 10:03:30 2016 +0100 Updated webui to reflect the new TASK_KILLING state. Review: https://reviews.apache.org/r/43888 {noformat} was (Author: bmahler): {noformat} commit 022be0a833dfb58de958a80eabc89fa9334782e0 Author: Abhishek Dasgupta Date: Sat Feb 20 12:56:24 2016 +0100 Introduced a TASK_KILLING state. TASK_KILLING can be used to signify that the kill request has been received by the executor, but the task is not yet killed. This is similar to how TASK_STARTING indicates the launch request has been received by the executor, but the task is not yet launched. This new state will be guarded by a framework capability in order to ensure that we do not break older frameworks. Review: https://reviews.apache.org/r/43487/ {noformat} {noformat} commit cae9162a198a4a298fee920c57dbd128731529e2 Author: Abhishek Dasgupta Date: Sat Feb 20 13:09:16 2016 +0100 Added a framework capability to guard TASK_KILLING. Frameworks must opt-in to receive TASK_KILLING. For now, it will be entirely the responsibility of the executor to check the capability before sending a TASK_KILLING update. Review: https://reviews.apache.org/r/43488/ {noformat} {noformat} commit a30233b994dd1a77eb8ef37525b5aa7b6ecdf3bd Author: Abhishek Dasgupta Date: Sat Feb 20 14:25:15 2016 +0100 Updated the command / docker executors to send TASK_KILLING. Review: https://reviews.apache.org/r/43489/ {noformat} {noformat} commit ee86b13633a9469629dbd79681d0776b6020f76a Author: Benjamin Mahler Date: Sat Feb 20 16:18:22 2016 +0100 Added command executor tests for TASK_KILLING. 
{noformat} {noformat} commit 978ccb5dd637f0e1577ecae1e21973f50429b04c Author: Benjamin Mahler Date: Sat Feb 20 17:28:58 2016 +0100 Added docker executor tests for TASK_KILLING. {noformat} {noformat} commit 1488f16d283f69b7dc96feaee91b04a09012ca4a Author: Benjamin Mahler Date: Sat Feb 20 17:35:30 2016 +0100 Added TASK_KILLING to the API changes in the CHANGELOG. {noformat} > Introduce TASK_KILLING state. > - > > Key: MESOS-4547 > URL: https://issues.apache.org/jira/browse/MESOS-4547 > Project: Mesos > Issue Type: Improvement >Reporter: Benjamin Mahler >Assignee: Abhishek Dasgupta > Labels: mesosphere > Fix For: 0.28.0 > > > Currently there is no state to express that a task i
[jira] [Created] (MESOS-4729) Support 'delay' calls to direct or deferred functions.
Benjamin Mahler created MESOS-4729: -- Summary: Support 'delay' calls to direct or deferred functions. Key: MESOS-4729 URL: https://issues.apache.org/jira/browse/MESOS-4729 Project: Mesos Issue Type: Improvement Components: libprocess Reporter: Benjamin Mahler Currently the {{delay}} primitive in libprocess can only be called for a member function of a Process, and will execute in that Process' execution context. However, we may want to delay to an unspecified execution context (handled by libprocess) or direct functions, in the same way that defer supports these use cases: {code} // Execute directly from the timeout completion context. delay(timeout, []() { exit(1); }); // Execute within a deferred execution context. // Libprocess will manage callback executions (currently handled // via a single Executor [1]). delay(timeout, defer([]() { exit(1); })); {code} \[1\] [Libprocess callback Executor| https://github.com/apache/mesos/blob/0.27.0/3rdparty/libprocess/include/process/executor.hpp#L25-L66] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4664) Add allocator metrics.
Benjamin Mahler created MESOS-4664: -- Summary: Add allocator metrics. Key: MESOS-4664 URL: https://issues.apache.org/jira/browse/MESOS-4664 Project: Mesos Issue Type: Improvement Components: allocation Reporter: Benjamin Mahler Priority: Critical There are currently no metrics that provide visibility into the allocator, except for the event queue size. This makes monitoring and debugging allocation behavior in a multi-framework setup difficult. Some thoughts for initial metrics to add: * How many allocation runs have completed? (counter) * Current allocation breakdown: allocated / available / total (gauges) * Current maximum shares (gauges) * How many active filters are there for the role / framework? (gauges) * How many frameworks are suppressing offers? (gauges) * How long does an allocation run take? (timers) * Maintenance related metrics: ** How many maintenance events are active? (gauges) ** How many maintenance events are scheduled but not active? (gauges) * Quota related metrics: ** How much quota is set for each role? (gauges) ** How much quota is satisfied? How much unsatisfied? (gauges) Some of these are already exposed from the master's metrics, but we should not assume this within the allocator. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3078) Recovered resources are not re-allocated until the next allocation delay.
[ https://issues.apache.org/jira/browse/MESOS-3078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-3078: --- Shepherd: Benjamin Mahler Assignee: (was: Klaus Ma) > Recovered resources are not re-allocated until the next allocation delay. > - > > Key: MESOS-3078 > URL: https://issues.apache.org/jira/browse/MESOS-3078 > Project: Mesos > Issue Type: Improvement > Components: allocation >Reporter: Benjamin Mahler > > Currently, when resources are recovered, we do not perform an allocation for > that slave. Rather, we wait until the next allocation interval. > For small task, high throughput frameworks, this can have a significant > impact on overall throughput, see the following thread: > http://markmail.org/thread/y6mzfwzlurv6nik3 > We should consider immediately performing a re-allocation for the slave upon > resource recovery. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4656) strings::split behaves incorrectly when n=1
Benjamin Mahler created MESOS-4656: -- Summary: strings::split behaves incorrectly when n=1 Key: MESOS-4656 URL: https://issues.apache.org/jira/browse/MESOS-4656 Project: Mesos Issue Type: Bug Components: stout Reporter: Benjamin Mahler Assignee: Benjamin Mahler While looking at the patches for MESOS-3833, I noticed that the code for strings::split behaves incorrectly for n=1 (maximum number of tokens). Adding the following test case demonstrates the issue: {code} TEST(StringsTest, SplitNOne) { vector<string> tokens = strings::split("foo,bar,,,", ",", 1); ASSERT_EQ(1u, tokens.size()); EXPECT_EQ("foo,bar,,,", tokens[0]); } {code} This fails as follows: {noformat} [ RUN ] StringsTest.SplitNOne ../../../../3rdparty/libprocess/3rdparty/stout/tests/strings_tests.cpp:357: Failure Value of: tokens.size() Actual: 5 Expected: 1u Which is: 1 [ FAILED ] StringsTest.SplitNOne (0 ms) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3833) /help endpoints do not work for nested paths
[ https://issues.apache.org/jira/browse/MESOS-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15140592#comment-15140592 ] Benjamin Mahler commented on MESOS-3833: Yes sorry for the delay, [~gyliu] please email me at bmah...@apache.org when you need reviews :) Just gave you a review, let me know when you've updated! > /help endpoints do not work for nested paths > > > Key: MESOS-3833 > URL: https://issues.apache.org/jira/browse/MESOS-3833 > Project: Mesos > Issue Type: Bug > Components: HTTP API >Reporter: Anand Mazumdar >Assignee: Guangya Liu >Priority: Minor > Labels: mesosphere, newbie > > Mesos displays the list of all supported endpoints starting at a given path > prefix using the {{/help}} suffix, e.g. {{master:5050/help}}. > It seems that the {{help}} functionality is broken for URL's having nested > paths e.g. {{master:5050/help/master/machine/down}}. The response returned is: > {quote} > Malformed URL, expecting '/help/id/name/' > {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4627) Improve Ranges parsing to handle single values.
Benjamin Mahler created MESOS-4627: -- Summary: Improve Ranges parsing to handle single values. Key: MESOS-4627 URL: https://issues.apache.org/jira/browse/MESOS-4627 Project: Mesos Issue Type: Improvement Reporter: Benjamin Mahler Users expect to be able to write a single value entry when specifying ports: {noformat} ./bin/mesos-slave.sh --resources="ports:[80, 100-120]" --master=localhost:5050 ... Failed to determine slave resources: Failed to parse resource ports value [80, 100-120] error Expecting one or more "ranges" {noformat} We should improve our parsing ability here. We should also consider stringifying using this more succinct format. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4626) Support Nvidia GPUs with filesystem isolation enabled.
Benjamin Mahler created MESOS-4626: -- Summary: Support Nvidia GPUs with filesystem isolation enabled. Key: MESOS-4626 URL: https://issues.apache.org/jira/browse/MESOS-4626 Project: Mesos Issue Type: Task Components: isolation Reporter: Benjamin Mahler When filesystem isolation is enabled, containers that use Nvidia GPU resources need access to GPU libraries residing on the host. We'll need to provide a means for operators to inject the necessary volumes into *all* containers that use "gpus" resources. See the nvidia-docker project for more details: [nvidia-docker/tools/src/nvidia/volumes.go|https://github.com/NVIDIA/nvidia-docker/blob/fda10b2d27bf5578cc5337c23877f827e4d1ed77/tools/src/nvidia/volumes.go#L50-L103] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4623) Add a stub Nvidia GPU isolator.
[ https://issues.apache.org/jira/browse/MESOS-4623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-4623: --- Story Points: 3 > Add a stub Nvidia GPU isolator. > --- > > Key: MESOS-4623 > URL: https://issues.apache.org/jira/browse/MESOS-4623 > Project: Mesos > Issue Type: Task > Components: isolation >Reporter: Benjamin Mahler > > We'll first wire up a skeleton Nvidia GPU isolator, which needs to be guarded > by a configure flag due to the dependency on NVML. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3368) Add device support in cgroups abstraction
[ https://issues.apache.org/jira/browse/MESOS-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-3368: --- Story Points: 3 > Add device support in cgroups abstraction > - > > Key: MESOS-3368 > URL: https://issues.apache.org/jira/browse/MESOS-3368 > Project: Mesos > Issue Type: Task >Reporter: Niklas Quarfot Nielsen > > Add support for [device > cgroups|https://www.kernel.org/doc/Documentation/cgroup-v1/devices.txt] to > aid isolators controlling access to devices. > In the future, we could think about how to enumerate and control access to > devices as a resource or via task/container policy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4625) Implement Nvidia GPU isolation w/o filesystem isolation enabled.
[ https://issues.apache.org/jira/browse/MESOS-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-4625: --- Story Points: 5 > Implement Nvidia GPU isolation w/o filesystem isolation enabled. > > > Key: MESOS-4625 > URL: https://issues.apache.org/jira/browse/MESOS-4625 > Project: Mesos > Issue Type: Task > Components: isolation >Reporter: Benjamin Mahler > > The Nvidia GPU isolator will need to use the device cgroup to restrict access > to GPU resources, and will need to recover this information after agent > failover. For now this will require that the operator specifies the GPU > devices via a flag. > Handling filesystem isolation requires that we provide mechanisms for > operators to inject volumes with the necessary libraries into all containers > using GPU resources; we'll tackle this in a separate ticket. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4625) Implement Nvidia GPU isolation w/o filesystem isolation enabled.
Benjamin Mahler created MESOS-4625: -- Summary: Implement Nvidia GPU isolation w/o filesystem isolation enabled. Key: MESOS-4625 URL: https://issues.apache.org/jira/browse/MESOS-4625 Project: Mesos Issue Type: Task Components: isolation Reporter: Benjamin Mahler The Nvidia GPU isolator will need to use the device cgroup to restrict access to GPU resources, and will need to recover this information after agent failover. For now this will require that the operator specifies the GPU devices via a flag. Handling filesystem isolation requires that we provide mechanisms for operators to inject volumes with the necessary libraries into all containers using GPU resources; we'll tackle this in a separate ticket. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4624) Add allocation metrics for "gpus" resources.
Benjamin Mahler created MESOS-4624: -- Summary: Add allocation metrics for "gpus" resources. Key: MESOS-4624 URL: https://issues.apache.org/jira/browse/MESOS-4624 Project: Mesos Issue Type: Task Components: master, slave Reporter: Benjamin Mahler Allocation metrics are currently hard-coded to include only {{\["cpus", "mem", "disk"\]}} resources. We'll need to add "gpus" to the list to start, possibly following up on the TODO to remove the hard-coding. See: https://github.com/apache/mesos/blob/0.27.0/src/master/metrics.cpp#L266-L269 https://github.com/apache/mesos/blob/0.27.0/src/slave/metrics.cpp#L123-L126 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4623) Add a stub Nvidia GPU isolator.
Benjamin Mahler created MESOS-4623: -- Summary: Add a stub Nvidia GPU isolator. Key: MESOS-4623 URL: https://issues.apache.org/jira/browse/MESOS-4623 Project: Mesos Issue Type: Task Components: isolation Reporter: Benjamin Mahler We'll first wire up a skeleton Nvidia GPU isolator, which needs to be guarded by a configure flag due to the dependency on NVML. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4547) Introduce TASK_KILLING state.
[ https://issues.apache.org/jira/browse/MESOS-4547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-4547: --- Shepherd: Benjamin Mahler > Introduce TASK_KILLING state. > - > > Key: MESOS-4547 > URL: https://issues.apache.org/jira/browse/MESOS-4547 > Project: Mesos > Issue Type: Improvement >Reporter: Benjamin Mahler >Assignee: Abhishek Dasgupta > Labels: mesosphere > > Currently there is no state to express that a task is being killed, but is > not yet killed (see MESOS-4140). In a similar way to how we have > TASK_STARTING to indicate the task is starting but not yet running, a > TASK_KILLING state would indicate the task is being killed but is not yet > killed. > This would need to be guarded by a framework capability to protect old > frameworks that cannot understand the TASK_KILLING state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (MESOS-4547) Introduce TASK_KILLING state.
[ https://issues.apache.org/jira/browse/MESOS-4547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler reassigned MESOS-4547: -- Assignee: Abhishek Dasgupta Done. I'm looking to have this land soon, so let me know if you're not able to get started. > Introduce TASK_KILLING state. > - > > Key: MESOS-4547 > URL: https://issues.apache.org/jira/browse/MESOS-4547 > Project: Mesos > Issue Type: Improvement >Reporter: Benjamin Mahler >Assignee: Abhishek Dasgupta > Labels: mesosphere > > Currently there is no state to express that a task is being killed, but is > not yet killed (see MESOS-4140). In a similar way to how we have > TASK_STARTING to indicate the task is starting but not yet running, a > TASK_KILLING state would indicate the task is being killed but is not yet > killed. > This would need to be guarded by a framework capability to protect old > frameworks that cannot understand the TASK_KILLING state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)