Regions are designed to be completely separate from one another, so no, as far as I'm aware there is no way to sync secondary storage data between them. I don't think you'd want to do that anyway, as it defeats the purpose of keeping one cloud region isolated from another.
- Si
________________________________________
From: Logan Barfield <lbarfi...@tqhosting.com>
Sent: Wednesday, January 07, 2015 2:00 PM
To: dev@cloudstack.apache.org
Cc: us...@cloudstack.apache.org
Subject: Re: Multi-Datacenter Deployment

A followup here: You can't have secondary storage that spans regions (e.g.,
templates/snapshots in sync), even with S3/Swift, correct? If not, that's
another downside to regions on top of the account sync.

It seems like the best solution to prevent weird split-brain/HA issues would
be to have at least 3 databases set up as master/master/master with quorum.
That way if two sites lose contact and re-establish there's a 2/1 majority
saying the hosts are all reachable. Would hopefully prevent the ones that
lost contact from kicking off HA immediately.

I don't even know how feasible that would be; maybe with Galera? Even then
it would have to be on a table level since there would be a conflict, for
instance:
- Given sites 1, 2, and 3, where site 1 loses contact with site 2 and comes
back up
- Site 1: Thinks site 1 is up and site 2 is down.
- Site 2: Thinks site 2 is up and site 1 is down.
- Site 3: Thinks all sites are up.

In the above case the least harmful thing would be to push site 3 to the
other two, but since all three sites have different data it may just hang
instead.

This is going to drive me nuts. :D

Thank You,

Logan Barfield
Tranquil Hosting

On Wed, Jan 7, 2015 at 12:57 PM, Simon Weller <swel...@ena.com> wrote:

> See inline.
> ________________________________________
> From: Logan Barfield <lbarfi...@tqhosting.com>
> Sent: Wednesday, January 07, 2015 11:43 AM
> To: dev@cloudstack.apache.org
> Cc: us...@cloudstack.apache.org
> Subject: Re: Multi-Datacenter Deployment
>
> I appreciate the explanation. That seems to confirm what I was thinking,
> that until regions are working 100% we'll just have to make sure the
> DC-to-DC links are as stable/redundant as possible to prevent HA issues.
> If we increase the HA delay it shouldn't be a major issue, and it will
> still be better than nothing.
>
> For us it probably also makes sense to not worry about having management
> servers in each DC for now. If we have a big enough outage in our primary
> DC to affect access to the management server we probably have bigger
> problems to worry about.
>
>
> Yeah, I agree. Even with Mgmt down, it's not going to stop any existing
> services from running or functioning as long as the clusters are healthy.
>
> - Si
>
> Much appreciated!
>
>
> Thank You,
>
> Logan Barfield
> Tranquil Hosting
>
> On Wed, Jan 7, 2015 at 12:15 PM, Simon Weller <swel...@ena.com> wrote:
>
> > Logan,
> >
> > We currently run CS in multiple geographically separate DCs, and may be
> > able to give you a little insight into things.
> >
> > We run KVM in advanced networking mode, with CLVM clusters backed onto
> > Dell Compellent SANs. We currently have different DCs running different
> > zones per DC, in a single region. We've been running CS in production now
> > since 4.0 prior to regions, so that functionality (along with its
> > limitations) hasn't been something we've adopted yet. We run our
> > Management (with multiple clustered nodes) out of 1 DC, and have a backup
> > set of Management nodes in another DC should we need to invoke BCDR in
> > the event the primary Management nodes become unavailable.
> >
> > Your concerns regarding HA problems are well founded. We run our own
> > nationwide MPLS backbone, and therefore have multiple high capacity
> > bandwidth paths between our different DCs, and even with that capacity
> > and fault tolerant design, we've seen issues where Management has
> > attempted to invoke HA due to brief loss of connectivity (typically due
> > to maintenance or grooming activity), and this can be quite problematic.
> > VPN tunnels are going to be very challenging for you, and you really
> > need to look at VPLS or some other technology that can layer on top of a
> > resilient infrastructure with multiple paths and fast failover (e.g.
> > MPLS Fast Reroute).
> >
> > Ideally, regions should solve this with dedicated local management nodes,
> > but until the syncing is sorted out, and those newer releases are stable,
> > there isn't much option beyond using a single region right now, short of
> > setting up completely separate CS instances per DC.
> >
> > Hope this helps a little.
> >
> > - Si
> >
> > ________________________________________
> > From: Logan Barfield <lbarfi...@tqhosting.com>
> > Sent: Tuesday, January 06, 2015 1:45 PM
> > To: dev@cloudstack.apache.org; us...@cloudstack.apache.org
> > Subject: Multi-Datacenter Deployment
> >
> > We are currently running a single location CloudStack deployment:
> > - 1 Hardware firewall
> > - 1 Management/Database Server
> > - 1 NFS staging store (for S3 secondary storage)
> > - Ceph RBD for primary storage
> > - 4 Hypervisors
> > - 1 Zone/Pod/Cluster
> >
> > We are looking to expand our deployment to other datacenters, and I'm
> > trying to determine the best way to go about it. The documentation is a
> > bit lacking for multi-site deployments.
> >
> > Our goal for the multi-site deployment is to have a zone for each site
> > (e.g. US East, US West, Europe) that our customers can use to deploy
> > instances in their preferred geographic area.
> >
> > Since we don't want to have different accounts for every datacenter, I
> > don't think using Regions makes sense for us (and I'm not sure what
> > they're actually good for without keeping accounts/users/domains in
> > sync).
> >
> > Right now I'm thinking our setup will be as follows:
> > - Firewall, Management Server, NFS staging server, primary storage, and
> > Hypervisors in each datacenter.
> > - All Management servers will be on the same management network.
> > - Management servers will be connected via site-to-site VPN links over
> > WAN.
> > - MySQL replication (Percona?) will be set up on the management servers.
> > Having an odd number of servers to protect against split brain, and
> > keeping redundant database backups.
> > - One region (default)
> > - One zone for each datacenter
> > - Geo-enabled DNS to direct customers to the nearest Management server
> > - Object storage for secondary storage across cloud.
> >
> > My primary concerns with this setup are:
> > - I haven't really seen multi-site deployment details anywhere.
> > - Potential for split-brain.
> > - How will HA be handled (e.g., if a VPN link goes down and one of the
> > remote management servers can't contact a host, will it try to initiate
> > HA?) - This sort of goes along with the split brain problem.
> >
> > Are my assumptions here sound, or is there a standard recommended way of
> > doing multi-site deployments?
> >
> > Any suggestions are much appreciated.
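
For anyone following the three-site quorum idea from the followup at the top of the thread, here is a rough Python sketch of the 2/1 majority vote it describes. It is only an illustration: the function name majority_view and the site labels are made up, it does not touch CloudStack or Galera in any way, and it assumes that sites 1 and 2 can both still reach site 3 (the thread does not say either way).

```python
def majority_view(views):
    """Combine per-site reachability reports into one majority decision.

    views maps an observing site to a dict of {target_site: is_reachable}.
    A target counts as up only if a strict majority of all known sites
    report it as reachable. Sites that send no report cast no "up" votes.
    """
    # Every site mentioned anywhere, as observer or as target.
    sites = set(views)
    for report in views.values():
        sites.update(report)

    needed = len(sites) // 2 + 1  # strict majority, e.g. 2 of 3

    decision = {}
    for target in sorted(sites):
        up_votes = sum(1 for report in views.values() if report.get(target, False))
        decision[target] = up_votes >= needed
    return decision


# Scenario from the thread: site 1 and site 2 lose contact with each other,
# while site 3 still sees everything.
# ASSUMPTION: sites 1 and 2 both still see site 3.
views = {
    "site1": {"site1": True, "site2": False, "site3": True},
    "site2": {"site1": False, "site2": True, "site3": True},
    "site3": {"site1": True, "site2": True, "site3": True},
}

for site, is_up in majority_view(views).items():
    print(site, "up" if is_up else "down")
```

With these inputs every site gets two or more "up" votes, so nothing would be declared down and HA would not be kicked off by the two sites that briefly lost sight of each other. This only covers the voting step; it says nothing about how the three databases would reconcile their conflicting rows afterwards, which is the harder problem raised in the followup.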