>
> Largely because of a requirement to bring everything back up in a certain
> order


I don't think they need to be brought back up in a certain order. You just
need to restart all of them. The only requirement is that all masters
should be running at 0.19.0.
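
For the quorum side of this, here is a rough sketch of the arithmetic for
the 3-master case, assuming the usual rule that --quorum should be a strict
majority of the masters. The helper name majority_quorum is made up for
illustration; it is not part of Mesos.

```python
def majority_quorum(num_masters: int) -> int:
    """Smallest number of masters that forms a strict majority."""
    if num_masters < 1:
        raise ValueError("need at least one master")
    return num_masters // 2 + 1


# Print the quorum size for a few common ensemble sizes.
for n in (1, 3, 5):
    print(f"{n} masters -> quorum {majority_quorum(n)}")
```

With 3 masters that gives a quorum of 2, so the replicated log can only be
initialised once at least 2 of the 3 masters are up on 0.19.0.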

> I'd also be very interested in a zookeeper implementation


I think there is an issue with the ZK implementation. Ben Mahler can
probably expand on that.

- Jie


On Fri, Jun 13, 2014 at 12:32 AM, Tom Arnfeld <t...@duedil.com> wrote:

> Hey Dave (and the group),
>
> I have to say for me it was a little fiddly to upgrade a 0.18.2
> cluster to 0.19.0. Largely because of a requirement to bring
> everything back up in a certain order (I had to lower the quorum count
> to 1) otherwise mesos failed to get a majority vote to initialise the
> log (I had 3 masters).
>
> I'd also be very interested in a zookeeper implementation - and
> perhaps some improved documentation around the log.
>
> Cheers,
>
> Tom.
>
> > On 13 Jun 2014, at 08:17, Dick Davies <d...@hellooperator.net> wrote:
> >
> > I thought I read that there was going to be a registry implementation
> > backed by zookeeper; does anyone know why that was dropped?
> >
> > Really excited to see the containerizer features rolling in, but the
> > quorum looks at first glance to make Mesos a little harder to operate
> > ("This means adding or removing masters must be done carefully!") - I
> > understand the benefits but was hoping we could get by with the
> > zookeeper registry.
> >
> >
> >> On 13 June 2014 03:49, Dave Lester <daveles...@gmail.com> wrote:
> >> Hi All,
> >>
> >> Below is a blog post that Ben Mahler wrote as release manager for Mesos
> >> 0.19.0; it was published on the Mesos site today.
> >>
> >> I know that not everyone follows @ApacheMesos on Twitter (even though
> >> you should!), so I wanted to make sure it was also shared on the user@
> >> list.
> >>
> >> Cheers,
> >> Dave
> >>
> >>
> >> Apache Mesos 0.19.0 Released
> >>
> >> The latest Mesos release, 0.19.0, is now available for download. This
> >> new version includes the following features and improvements:
> >>
> >> - The master now persists the list of registered slaves in a durable
> >>   replicated manner using the Registrar and the replicated log.
> >> - Alpha support for custom container technologies has been added with
> >>   the ExternalContainerizer.
> >> - Metrics reporting has been overhauled and is now exposed on
> >>   <ip:port>/metrics/snapshot.
> >> - Slave Authentication: optionally, only authenticated slaves can
> >>   register with the master.
> >> - Numerous bug fixes and stability improvements.
> >>
> >> Full release notes are available on JIRA.
> >>
> >> Registrar
> >>
> >> Mesos 0.19.0 introduces the “Registrar”: the master now persists the
> >> list of registered slaves in a durable replicated manner. The previous
> >> lack of durable state was an intentional design decision that
> >> simplified failover and allowed masters to be run and migrated with
> >> ease. However, the stateless design had issues:
> >>
> >> - In the event of a dual failure (slave fails while master is down),
> >>   no lost task notifications are sent. This leads to a task running
> >>   according to the framework but unknown to Mesos.
> >> - When a new master is elected, we may allow rogue slaves to
> >>   re-register with the master. This leads to tasks running on the
> >>   slave that are not known to the framework.
> >>
> >> Persisting the list of registered slaves allows failed-over masters to
> >> detect slaves that do not re-register, and notify frameworks
> >> accordingly. It also allows us to prevent rogue slaves from
> >> re-registering, terminating the rogue tasks in the process.
> >>
> >> The state is persisted using the replicated log (available since 0.9.0).
> >>
> >> External Containerization
> >>
> >> As alluded to during the containerization / isolation refactor in
> >> 0.18.0, the ExternalContainerizer has landed in this release. This
> >> provides alpha level support for custom containerization.
> >>
> >> Developers can implement their own external containerizers to provide
> >> support for custom container technologies. Initial Docker support is now
> >> available through some community driven external containerizers: Docker
> >> Containerizer for Mesos by Tom Arnfeld and Deimos by Jason Dusek. Please
> >> reach out on the mailing lists with questions!
> >>
> >> Metrics
> >>
> >> Previously, Mesos components had to use custom metrics code and custom
> >> HTTP endpoints for exposing metrics. This made it difficult to expose
> >> additional system metrics and often required having an endpoint for
> >> each libprocess Process (Actor) for which metrics were desired. Having
> >> metrics spread across endpoints was operationally complex.
> >>
> >> We needed a consistent, simple, and global way to expose metrics, which
> >> led to the creation of a metrics library within libprocess. All metrics
> >> are now exposed via /metrics/snapshot. The /stats.json endpoint remains
> >> for backwards compatibility.
> >>
> >> Upgrading
> >>
> >> For backwards compatibility, the “Registrar” will be enabled in a phased
> >> manner. By default, the “Registrar” is write-only in 0.19.0 and will be
> >> read/write in 0.20.0.
> >>
> >> If running in high-availability mode with ZooKeeper, operators must now
> >> specify the --work_dir for the master, along with the --quorum size of
> >> the ensemble of masters. This means adding or removing masters must be
> >> done carefully! The best practice is to only ever add or remove a
> >> single master at a time and to allow a small amount of time for the
> >> replicated log to catch up on the new master. Maintenance documentation
> >> will be added to reflect this.
> >>
> >> Please refer to the upgrades document, which details how to perform an
> >> upgrade from 0.18.x.
> >>
> >> Future Work
> >>
> >> Thanks to the Registrar, reconciliation primitives can now be provided
> >> to ensure that the state of tasks between Mesos and frameworks is kept
> >> consistent. This will remove the need for frameworks to implement
> >> out-of-band task reconciliation to inspect the state of slaves.
> >> Reconciliation work is being tracked at MESOS-1407.
> >>
> >> The addition of state through the Registrar opens up a rich set of
> >> features that were previously not possible due to the lack of
> >> persistent state in the master. These include:
> >>
> >> - Cluster maintenance primitives (MESOS-1474)
> >> - Repair automation (MESOS-695)
> >> - Global resource reservations
> >>
> >> Getting Involved
> >>
> >> We encourage you to try out this release, and let us know what you
> >> think and if you hit any issues on the user mailing list. You can also
> >> get in touch with us via @ApacheMesos or via mailing lists and IRC.
> >>
> >> Thanks
> >>
> >> Thanks to the 32 contributors who made 0.19.0 possible:
> >>
> >> Ashutosh Jain, Adam B, Alexandra Sava, Anton Lindström, Archana kumari,
> >> Benjamin Hindman, Benjamin Mahler, Bernardo Gomez Palacio, Bernd
> >> Mathiske, Charlie Carson, Chengwei Yang, Chi Zhang, Dave Lester,
> >> Dominic Hamon, Ian Downes, Isabel Jimenez, Jake Farrell, Jameel
> >> Al-Aziz, Jiang Yan Xu, Jie Yu, Nikita Vetoshkin, Niklas Q. Nielsen,
> >> Ritwik Yadav, Sam Taha, Steven Phung, Till Toenshoff, Timothy St.
> >> Clair, Tobi Knaup, Tom Arnfeld, Tom Galloway, Vinod Kone, Vinson Lee
>
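
On the /metrics/snapshot endpoint mentioned in the release notes quoted
above: a minimal sketch of how one might poll it, assuming the endpoint
returns a flat JSON object mapping metric names to numbers. The function
names, the "master/" prefix and the example URL are illustrative
assumptions, not something taken from the announcement.

```python
import json
import urllib.request


def fetch_snapshot(base_url: str, timeout: float = 5.0) -> dict:
    """GET <base_url>/metrics/snapshot and decode the JSON body."""
    url = base_url.rstrip("/") + "/metrics/snapshot"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def select_metrics(snapshot: dict, prefix: str) -> dict:
    """Keep only the metrics whose name starts with `prefix`."""
    return {k: v for k, v in snapshot.items() if k.startswith(prefix)}


# Example (hypothetical host and port, needs a reachable master):
#   snapshot = fetch_snapshot("http://localhost:5050")
#   print(select_metrics(snapshot, "master/"))
```

Keeping the fetch and the filtering separate means the filtering can be
checked against a saved snapshot without a running cluster.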
