Hi Weitao, I came up with this architecture as a way of distributing our application across multiple nodes. Pre-Mesos, our application, delivered as a single VMware VM, was not easily scalable. By breaking out the several application components as Docker containers, we are now able (within limits imposed chiefly by the application itself) to distribute & run those containers across the several nodes in the Mesos cluster. Application containers that need to talk to each other are connected via Weave's "overlay" (veth) network.
Not surprisingly, this architecture has some of the benefits that you'd expect from Mesos, chief among them being high availability (more on this below), scalability, and hybrid Cloud deployment.

The core unit of deployment is an Ubuntu image (14.04 LTS) that I've configured with the appropriate components:

- Zookeeper
- Mesos-master
- Mesos-slave
- Marathon
- Docker
- Weave
- SSH (including RSA keys)
- Our application

This image is presently downloaded by a customer as a VMware .ova file. We typically ask the customer to convert the resulting VM to a so-called VMware template from which she can easily deploy multiple VMs as needed. Please note that although we've started with VMware as our virtualization platform, I've successfully run cluster nodes on both EC2 and Azure.

I tend to describe the Ubuntu image as "polymorphic", i.e., it can be told to assume one of two roles, either a "master" role or a "slave" role. A master runs ZK, mesos-master, and Marathon. A slave runs mesos-slave, Docker, Weave, and the application.

We presently offer 3 canned deployment options:

1. single-host, no HA
2. multi-host, no HA (1 master, 3 slaves)
3. multi-host, HA (3 masters, 3 slaves)

The single-host, no HA option exists chiefly to mimic the original pre-Mesos deployment. But it has the added virtue, thanks to Mesos, of allowing us to dynamically "grow" from a single host to multiple hosts.

The multi-host, no HA option is presently geared toward a sharded MongoDB backend where each slave runs a mongod container that is a single partition (shard) of the larger database. This deployment option also lends itself very nicely to adding a new slave node at the cluster level, and a new mongod container at the application level, all without any downtime whatsoever.

The multi-host, HA option offers the probably familiar *cluster-level* high availability. I stress "cluster-level" because I think we have to distinguish between HA at that level & HA at the application level.
The former is realized by the 3 master hosts, i.e., you can lose a master and a new one will self-elect, thereby keeping the cluster up & running. But, to my mind, at least, application-level HA requires some co-operation on the part of the application itself (e.g., checkpoint/restart). That said, it *is* almost magical to watch Mesos re-launch an application container that has crashed. But whether or not that re-launch results in coherent application behavior is another matter.

An important home-grown component here is a Java program that automates these functions:

- create cluster: configures a host for a given role and starts Mesos services. This is done via SSH.
- start application: distributes application containers across slave hosts. This is done by talking to the Marathon REST API.
- stop application: again, via the Marathon REST API.
- stop cluster: stops Mesos services. Again, via SSH.
- destroy cluster: deconfigures the host (after which it has no defined role); again, SSH.

As I write, I see Ajay's e-mail arrive about Calico. I am aware of this project and it seems quite solid. But I've never understood the need to "worry about networking containers in multihost setup". Weave runs as a Docker container and It Just Works. I've "woven" together slave nodes in a cluster that spanned 3 different datacenters, one of them in EC2, without any difficulty.

Yes, I do have to assign Weave IP addresses to the several containers, but this is hardly onerous. In fact, I've found it "liberating" to select such addresses from a /8 CIDR address space, assigning them to containers based on the container's purpose (e.g., MongoDB shard containers might live at 10.4.0.X, etc.). Ultimately, this assignment boils down to setting an environment variable that Marathon (or the mesos-slave executor) will use when creating the container via "docker run".

There is a whole lot more that I could say about the internals of this architecture.
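To make the address assignment above a little more concrete, here is a minimal sketch in Python (not the Java program described earlier). It picks a Weave address by container purpose and places it in the "env" section of a Marathon app definition. Several things here are assumptions and do not come from the text: the variable name WEAVE_CIDR (the one read by Weave's Docker API proxy), the "app-server" range, the image name, and the helper names themselves. Only the 10.4.0.X range for MongoDB shards and the use of the Marathon REST API are taken from the description above.

```python
import ipaddress
import json
import urllib.request

# Hypothetical purpose -> subnet map inside 10.0.0.0/8. Only the MongoDB
# shard range (10.4.0.X) is from the text; "app-server" is made up.
PURPOSE_SUBNETS = {
    "mongo-shard": ipaddress.ip_network("10.4.0.0/24"),
    "app-server": ipaddress.ip_network("10.5.0.0/24"),
}


def weave_cidr(purpose, index, prefix_len=8):
    """Return the address of the index-th container of a purpose, e.g. '10.4.0.1/8'."""
    subnet = PURPOSE_SUBNETS[purpose]
    address = subnet.network_address + index
    # Reject the network address, the broadcast address, and anything outside the range.
    if index < 1 or address not in subnet or address == subnet.broadcast_address:
        raise ValueError(f"index {index} does not fit in {subnet}")
    return f"{address}/{prefix_len}"


def marathon_app(app_id, image, purpose, index, cpus=0.5, mem=512):
    """Build a Marathon app definition (a plain dict) for one Docker container."""
    return {
        "id": app_id,
        "cpus": cpus,
        "mem": mem,
        "instances": 1,
        "container": {"type": "DOCKER", "docker": {"image": image}},
        # Assumption: the Weave proxy picks this variable up at "docker run" time.
        "env": {"WEAVE_CIDR": weave_cidr(purpose, index)},
    }


def submit(marathon_url, app):
    """POST the definition to Marathon's /v2/apps endpoint and return the JSON reply."""
    request = urllib.request.Request(
        marathon_url.rstrip("/") + "/v2/apps",
        data=json.dumps(app).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

For example, marathon_app("/mongo/shard1", "mongo:3.0", "mongo-shard", 1) yields a definition whose env carries the address 10.4.0.1/8; calling submit() with the URL of a Marathon master would then ask Marathon to launch it. Keeping the purpose-to-range map in one place is what makes the "address by purpose" convention easy to audit.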
But, if you're still interested, I'll await further questions from you.

HTH.

Cordially,

Paul

On Thu, Nov 26, 2015 at 7:16 AM, Paul <arach...@gmail.com> wrote:

> Gladly, Weitao. It'd be my pleasure.
>
> But give me a few hours to find some free time.
>
> I am today tasked with cooking a Thanksgiving turkey.
>
> But I will try to find the time before noon today (I'm on the right coast
> in the USA).
>
> -Paul
>
>
> > On Nov 25, 2015, at 11:26 PM, Weitao <zhouwtl...@gmail.com> wrote:
> >
> > Hi, Paul. Can you share the total experience about the arch with us? I
> > am trying to do a similar thing.
> >
> >
> >> On Nov 26, 2015, at 09:47, Paul <arach...@gmail.com> wrote:
> >>
> >> experience