I think it makes sense for each datacenter to stay independently managed by
its own local Mesos master. But a framework like Marathon could be aware of
peer Marathon instances in other datacenters, and a Marathon app config
could specify which datacenters to deploy to; the framework would then
coordinate with its peers to carry out the deployment.
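
For illustration, such an app definition might look like the sketch below.
To be clear, this is hypothetical: Marathon has no "datacenters" field
today, and that key name is invented here; the other fields are ordinary
Marathon app-definition fields.

```json
{
  "id": "/web/frontend",
  "instances": 6,
  "cpus": 0.5,
  "mem": 256,
  "container": {
    "type": "DOCKER",
    "docker": { "image": "nginx" }
  },
  "datacenters": ["us-central1", "europe-west1"]
}
```

Each Marathon instance would act only on the portion of the spec that names
its own datacenter, which fits the "only care about configuration that
applies to their own cluster" idea below.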

Of course, this means any advanced framework scheduler would need to
provide its own distributed coordination/config service (not ideal), so
cross-datacenter discovery/coordination functionality may make sense as a
service provided to frameworks by the Mesos core, even if the master itself
doesn't actively coordinate with other masters.

This probably requires some non-trivial changes:

1) Datacenter-local masters that are aware of other Mesos clusters and
share configuration with them, but only care about configuration that
applies to their own cluster/datacenter.

2) A coordination/config service that emphasizes AP over C, at least across
datacenter boundaries. So probably not zookeeper or etcd. If we had
pluggable coordinators, this could be implemented with Cassandra
(LOCAL_QUORUM writes are consistent within a datacenter but partition
tolerant across datacenters), though watches/locks would be less efficient
than with a consensus-based store.

With this setup a datacenter or two could drop out without impacting quorum
availability within the remaining datacenters, and WAN latency to a remote
master/coordinator becomes a non-issue.
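
A back-of-the-envelope sketch of why this holds (plain Python, not Mesos or
Cassandra code; the function names are mine):

```python
# With a Cassandra-style LOCAL_QUORUM, a write only needs acks from a
# quorum of replicas in the coordinator's own datacenter, so remote-DC
# partitions are irrelevant to local availability.

def local_quorum(replication_factor):
    """Quorum size within one datacenter: floor(RF/2) + 1."""
    return replication_factor // 2 + 1

def write_succeeds(reachable_local_replicas, replication_factor):
    """A LOCAL_QUORUM write ignores remote datacenters entirely."""
    return reachable_local_replicas >= local_quorum(replication_factor)

# 3 replicas per DC: quorum of 2 within the local DC.
assert local_quorum(3) == 2

# All 3 local replicas up, every remote DC partitioned away: still fine.
assert write_succeeds(3, 3)

# One local replica down: 2 >= 2, still fine.
assert write_succeeds(2, 3)

# Contrast with a single global ensemble (ZooKeeper-style): 5 nodes spread
# over 3 DCs need 3 reachable nodes total, so losing the 2 DCs that hold
# 4 of the nodes leaves the survivor (1 node) without quorum.
def global_quorum(ensemble_size):
    return ensemble_size // 2 + 1

assert global_quorum(5) == 3
assert not (1 >= global_quorum(5))
```

The trade, as noted above, is that you give up cheap cross-DC watches and
locks in exchange for that independence.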


On Tue, Aug 26, 2014 at 9:43 AM, Justin Holmes <justin.hol...@opencredo.com>
wrote:

> Hi
>
> We have been running Mesos in Google Compute with slaves in us-central1
> and europe-west1, and masters in europe-west1; response times between the
> zones have been around 100-110ms.
>
> I am interested in running masters across zones and will evaluate
> DRBD/Ceph/GlusterFS for multi-site master.
>
> I am also wondering if anyone has tuned master election with Zookeeper
> across zones, and also whether we can switch out the Zookeeper dependency
> and use etcd/Cassandra.
>
> Kind regards
>
>
> On 26 August 2014 15:19, Yaron Rosenbaum <ya...@whatson-social.com> wrote:
>
>> Hi
>>
>> Here's a crazy idea:
>> Is it possible / has anyone tried to run Mesos where the slaves are in
>> radically different network zones? For example: A few slaves on Azure, a
>> few slaves on AWS, and a bunch of other slaves on premises etc.
>>
>>    - Assuming it's possible, is it possible to define resource
>>    requirements for tasks, in terms of 'access to network resource A with
>>    less than X latency and throughput between i and m' for example?
>>    - Masters would probably have to be 'close' to each other, to prevent
>>    'split-brain', true or not ?
>>       - If so, then how does one assure Master HA ?
>>
>>
>> I've been thinking about this for a while, and can't find a reason 'why
>> not'.
>>
>> Please share your thoughts on the subject.
>>
>> (Y)
>>
>>
>
>
> --
>
> *Justin Holmes*, Consultant
>
> Open Credo Ltd – Delivering emerging technology today
>
> Mobile: +44 (0) 7863173405
> Main: +44 (0) 20 3603 2680
>
> justin.hol...@opencredo.com
> http://twitter.com/DevOpsScientist
> http://www.opencredo.com
>
>
> Registered Office:  5-11 Lavington St., London SE1 0NZ.
> Registered in UK. No 3943999
>
>
