[jira] [Updated] (MESOS-1550) MesosSchedulerDriver should never, ever, call 'stop'.
[ https://issues.apache.org/jira/browse/MESOS-1550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Mahler updated MESOS-1550:
    Fix Version/s: 0.19.1

MesosSchedulerDriver should never, ever, call 'stop'.

Key: MESOS-1550
URL: https://issues.apache.org/jira/browse/MESOS-1550
Project: Mesos
Issue Type: Bug
Components: framework, java api, python api
Affects Versions: 0.14.0, 0.14.1, 0.14.2, 0.17.0, 0.16.0, 0.15.0, 0.18.0, 0.19.0
Reporter: Benjamin Hindman
Priority: Critical
Fix For: 0.19.1

Using MesosSchedulerDriver.stop causes the master to unregister the framework. The library should never make this decision for a framework, it should defer to the framework itself.

--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MESOS-1550) MesosSchedulerDriver should never, ever, call 'stop'.
[ https://issues.apache.org/jira/browse/MESOS-1550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14046397#comment-14046397 ]

Bill Farner commented on MESOS-1550:

Somewhat related: does it make sense to also follow up with an API change/deprecation to make the behavior of {{stop()}} more obvious (i.e. rename the method)? Aurora also bumped into the unintended consequences years ago, and I doubt we were the last.

MesosSchedulerDriver should never, ever, call 'stop'.

Key: MESOS-1550
URL: https://issues.apache.org/jira/browse/MESOS-1550
Project: Mesos
Issue Type: Bug
Components: framework, java api, python api
Affects Versions: 0.14.0, 0.14.1, 0.14.2, 0.17.0, 0.16.0, 0.15.0, 0.18.0, 0.19.0
Reporter: Benjamin Hindman
Priority: Critical
Fix For: 0.19.1

Using MesosSchedulerDriver.stop causes the master to unregister the framework. The library should never make this decision for a framework, it should defer to the framework itself.
[jira] [Resolved] (MESOS-1342) Add authorization support.
[ https://issues.apache.org/jira/browse/MESOS-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kone resolved MESOS-1342.
    Resolution: Fixed
    Fix Version/s: 0.20.0

Add authorization support.

Key: MESOS-1342
URL: https://issues.apache.org/jira/browse/MESOS-1342
Project: Mesos
Issue Type: Epic
Components: master, security
Reporter: Vinod Kone
Assignee: Vinod Kone
Fix For: 0.20.0

This adds support for authorizing frameworks to run tasks as certain users, receive offers for certain roles and clients to access HTTP end points.
[jira] [Created] (MESOS-1552) Mesos javadoc should include .proto javadoc
Kevin Sweeney created MESOS-1552:

Summary: Mesos javadoc should include .proto javadoc
Key: MESOS-1552
URL: https://issues.apache.org/jira/browse/MESOS-1552
Project: Mesos
Issue Type: Story
Components: documentation, java api
Reporter: Kevin Sweeney

The Java API documentation on the website (http://mesos.apache.org/api/latest/java/) should include protobuf documentation. protoc automatically generates javadoc based on comments in the .proto so this should be a matter of wiring.
[jira] [Assigned] (MESOS-1550) MesosSchedulerDriver should never, ever, call 'stop'.
[ https://issues.apache.org/jira/browse/MESOS-1550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Hindman reassigned MESOS-1550:
    Assignee: Benjamin Hindman

https://reviews.apache.org/r/23142

MesosSchedulerDriver should never, ever, call 'stop'.

Key: MESOS-1550
URL: https://issues.apache.org/jira/browse/MESOS-1550
Project: Mesos
Issue Type: Bug
Components: framework, java api, python api
Affects Versions: 0.14.0, 0.14.1, 0.14.2, 0.17.0, 0.16.0, 0.15.0, 0.18.0, 0.19.0
Reporter: Benjamin Hindman
Assignee: Benjamin Hindman
Priority: Critical

Using MesosSchedulerDriver.stop causes the master to unregister the framework. The library should never make this decision for a framework, it should defer to the framework itself.
[jira] [Updated] (MESOS-1539) No longer able to spin up Mesos master in local mode
[ https://issues.apache.org/jira/browse/MESOS-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Mahler updated MESOS-1539:
    Target Version/s: 0.19.1
    Fix Version/s: (was: 0.19.1)

No longer able to spin up Mesos master in local mode

Key: MESOS-1539
URL: https://issues.apache.org/jira/browse/MESOS-1539
Project: Mesos
Issue Type: Bug
Components: java api
Affects Versions: 0.19.0
Environment: Ubuntu 14.04 / Mac OS X against Mesos 0.19.0
Reporter: Sunil Shah
Assignee: Benjamin Mahler
Fix For: 0.20.0

JVM frameworks such as Marathon use the local master mode for testing purposes (passed through as the `--master local` parameter). This doesn't work in Mesos 0.19.0 because of the new mandatory registry and quorum parameters. There is no way to set these for local masters; the master emits the following message before terminating the framework: `--work_dir needed for replicated log based registry`.
[jira] [Commented] (MESOS-1517) Maintain a queue of messages that arrive before the master recovers.
[ https://issues.apache.org/jira/browse/MESOS-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14046537#comment-14046537 ]

Benjamin Mahler commented on MESOS-1517:

There are only a few types of messages involved here. When a master fails over, the slaves and frameworks will try to re-register before doing anything else. This means that if we queue up the messages we'll only be reducing the need for frameworks and slaves to retry registration, which is already something that is required of them.

So I think this change would mostly be beneficial for our integration tests, where the retries are not desirable. :)

Maintain a queue of messages that arrive before the master recovers.

Key: MESOS-1517
URL: https://issues.apache.org/jira/browse/MESOS-1517
Project: Mesos
Issue Type: Improvement
Components: master
Reporter: Benjamin Mahler
Labels: reliability

Currently when the master is recovering, we drop all incoming messages. If slaves and frameworks knew about the leading master only once it has recovered, then we would only expect to see messages after we've recovered.

We previously considered enqueuing all messages through the recovery future, but this has the downside of forcing all messages to go through the master's queue twice:

{code}
// TODO(bmahler): Consider instead re-enqueuing *all* messages
// through recover(). What are the performance implications of
// the additional queueing delay and the accumulated backlog
// of messages post-recovery?
if (!recovered.get().isReady()) {
  VLOG(1) << "Dropping '" << event.message->name
          << "' message since not recovered yet";
  ++metrics.dropped_messages;
  return;
}
{code}

However, an easy solution to this problem is to maintain an explicit queue of incoming messages that gets flushed once we finish recovery. This ensures that all messages post-recovery are processed normally.
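
The explicit queue proposed in MESOS-1517 could look roughly like the sketch below. This is only an illustration of the idea, not the Mesos implementation: `Message`, `RecoveryQueue`, `receive`, `finishRecovery` and `pending` are hypothetical names, and the real master dispatches libprocess events rather than plain structs. Messages that arrive before recovery are buffered instead of dropped, then handed to the normal handler in arrival order once recovery completes; later messages bypass the buffer entirely.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <string>
#include <utility>

// Hypothetical stand-in for an incoming master message (not a Mesos type).
struct Message {
  std::string name;
};

// Buffers messages that arrive during recovery and flushes them, in
// arrival order, once recovery has finished.
class RecoveryQueue {
public:
  explicit RecoveryQueue(std::function<void(const Message&)> handler)
    : handler_(std::move(handler)) {}

  // Called for every incoming message.
  void receive(Message message) {
    if (!recovered_) {
      // Not recovered yet: hold the message instead of dropping it.
      pending_.push_back(std::move(message));
      return;
    }
    // Already recovered: process normally, with no extra queueing.
    handler_(message);
  }

  // Called once when recovery completes; drains the backlog in order.
  void finishRecovery() {
    recovered_ = true;
    while (!pending_.empty()) {
      Message message = std::move(pending_.front());
      pending_.pop_front();
      handler_(message);
    }
  }

  // Number of messages still waiting for recovery to finish.
  std::size_t pending() const { return pending_.size(); }

private:
  std::function<void(const Message&)> handler_;
  std::deque<Message> pending_;
  bool recovered_ = false;
};
```

Only messages that arrive during recovery pay the buffering cost, which is the difference from re-enqueuing every message through recover(). The sketch leaves the backlog unbounded; whether a real master would need a cap is not addressed in the ticket.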