Hi Weitao,

I came up with this architecture as a way of distributing our application
across multiple nodes. Pre-Mesos, our application, delivered as a single
VMware VM, was not easily scalable. By breaking out the several application
components as Docker containers, we are now able (within limits imposed
chiefly by the application itself) to distribute & run those containers
across the several nodes in the Mesos cluster. Application containers that
need to talk to each other are connected via Weave's "overlay" (veth)
network.

Not surprisingly, this architecture has some of the benefits that you'd
expect from Mesos, chief among them high availability (more on this
below), scalability, and hybrid cloud deployment.

The core unit of deployment is an Ubuntu image (14.04 LTS) that I've
configured with the appropriate components:

Zookeeper
Mesos-master
Mesos-slave
Marathon
Docker
Weave
SSH (including RSA keys)
Our application

This image is presently downloaded by a customer as a VMware .ova file. We
typically ask the customer to convert the resulting VM to a so-called
VMware template from which she can easily deploy multiple VMs as needed.
Please note that although we've started with VMware as our virtualization
platform, I've successfully run cluster nodes on both EC2 and Azure.

I tend to describe the Ubuntu image as "polymorphic", i.e., it can be told
to assume one of two roles, either a "master" role or a "slave" role. A
master runs ZK, mesos-master, and Marathon. A slave runs mesos-slave,
Docker, Weave, and the application.

We presently offer 3 canned deployment options:

   1. single-host, no HA
   2. multi-host, no HA (1 master, 3 slaves)
   3. multi-host, HA     (3 masters, 3 slaves)

The single-host, no HA option exists chiefly to mimic the original
pre-Mesos deployment. But it has the added virtue, thanks to Mesos, of
allowing us to dynamically "grow" from a single host to multiple hosts.
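
Mechanically, that growth is nothing exotic: a new host simply starts a
mesos-slave pointed at the existing masters' ZooKeeper, and its resources
become schedulable. A minimal sketch (the ZooKeeper address is
illustrative):

    mesos-slave --master=zk://10.0.0.1:2181/mesos --containerizers=docker,mesos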

The multi-host, no HA option is presently geared toward a sharded MongoDB
backend where each slave runs a mongod container that is a single partition
(shard) of the larger database. This deployment option also lends itself
very nicely to adding a new slave node at the cluster level, and a new
mongod container at the application level - all without any downtime
whatsoever.
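
To give a flavor of that last step: once Marathon has launched the new
mongod container, the shard is registered with the mongos router from a
mongo shell. A one-line sketch, not a recipe (the new shard's Weave
address is illustrative):

    // run against the mongos router; 10.4.0.5 is the new mongod's Weave address
    sh.addShard("10.4.0.5:27017")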

The multi-host, HA option offers the probably familiar *cluster-level* high
availability. I stress "cluster-level" because I think we have to
distinguish between HA at that level & HA at the application level. The
former is realized by the 3 master hosts, i.e., you can lose a master and a
new one will be elected (via ZooKeeper), thereby keeping the cluster up &
running. But, to
my mind, at least, application level HA requires some co-operation on the
part of the application itself (e.g., checkpoint/restart). That said, it
*is* almost magical to watch Mesos re-launch an application container that
has crashed. But whether or not that re-launch results in coherent
application behavior is another matter.

An important home-grown component here is a Java program that automates
these functions:

create cluster - configures a host for a given role and starts Mesos
services (via SSH)
start application - distributes application containers across slave hosts
(via the Marathon REST API)
stop application - stops those containers (again via the Marathon REST API)
stop cluster - stops Mesos services (again via SSH)
destroy cluster - deconfigures the host, after which it has no defined
role (again via SSH)
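
For the Marathon-facing calls, here is a minimal sketch of the "start
application" step - not our production code - using nothing but the JDK.
It POSTs a Marathon app definition to /v2/apps; the app id, image,
resources, master hostname, and Weave address are all illustrative:

    import java.io.OutputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.nio.charset.StandardCharsets;

    public class StartApplication {
        public static void main(String[] args) throws Exception {
            // Minimal Marathon app definition: one Docker container. The
            // WEAVE_CIDR env var is how the container is handed its overlay
            // address (more on this below). All values are illustrative.
            String appJson = "{"
                + "\"id\":\"/mongod-shard-1\","
                + "\"cpus\":1.0,\"mem\":2048,\"instances\":1,"
                + "\"env\":{\"WEAVE_CIDR\":\"10.4.0.5/8\"},"
                + "\"container\":{\"type\":\"DOCKER\","
                + "\"docker\":{\"image\":\"mongo:3.0\",\"network\":\"BRIDGE\"}}"
                + "}";

            URL url = new URL("http://mesos-master-1:8080/v2/apps");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestMethod("POST");
            conn.setRequestProperty("Content-Type", "application/json");
            conn.setDoOutput(true);
            try (OutputStream out = conn.getOutputStream()) {
                out.write(appJson.getBytes(StandardCharsets.UTF_8));
            }
            // Marathon answers 201 Created when it accepts the definition.
            System.out.println("Marathon responded: " + conn.getResponseCode());
        }
    }

"stop application" is the mirror image: a DELETE against /v2/apps/<app-id>.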


As I write, I see Ajay's e-mail arrive about Calico. I am aware of this
project and it seems quite solid. But I've never understood the need to
"worry about networking containers in multihost setup". Weave runs as a
Docker container and It Just Works. I've "woven" together slave nodes in a
cluster that spanned 3 different datacenters, one of them in EC2, without
any difficulty. Yes, I do have to assign Weave IP addresses to the several
containers, but this is hardly onerous. In fact, I've found it "liberating"
to select such addresses from a /8 CIDR address space, assigning them to
containers based on the container's purpose (e.g., MongoDB shard containers
might live at 10.4.0.X, etc.). Ultimately, this assignment boils down to
setting an environment variable that Marathon (or the mesos-slave executor)
will use when creating the container via "docker run".
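
Concretely, the assignment can take either of two equivalent forms (the
address and image are illustrative):

    # weave wraps "docker run" and attaches the container at the given address
    weave run 10.4.0.5/8 -d mongo:3.0

    # or, with Weave's Docker API proxy in front of the Docker daemon, the
    # address rides along as an environment variable - this is the form
    # that Marathon effectively uses
    docker run -d -e WEAVE_CIDR=10.4.0.5/8 mongo:3.0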

There is a whole lot more that I could say about the internals of this
architecture. But, if you're still interested, I'll await further questions
from you.

HTH.

Cordially,

Paul


On Thu, Nov 26, 2015 at 7:16 AM, Paul <arach...@gmail.com> wrote:

> Gladly, Weitao. It'd be my pleasure.
>
> But give me a few hours to find some free time.
>
> I am today tasked with cooking a Thanksgiving turkey.
>
> But I will try to find the time before noon today (I'm on the right coast
> in the USA).
>
> -Paul
>
> > On Nov 25, 2015, at 11:26 PM, Weitao <zhouwtl...@gmail.com> wrote:
> >
> > Hi, Paul. Can you share the overall experience about the arch with us? I
> am trying to do a similar thing
> >
> >
> >> On Nov 26, 2015, at 09:47, Paul <arach...@gmail.com> wrote:
> >>
> >> experience
>
