Here is what I think I could do with shard allocation filtering. Set the following attributes at the node level in each DC:

In DC 1:

Node 1:
node.group1 = xxx
node.group2 = yyy
node.group4 = zzz

Node 2:
node.group2 = yyy
node.group3 = zzz
node.group5 = bbb

Node 3:
node.group1 = xxx
node.group2 = zzz
node.group6 = ccc

In DC 2:

Node 4:
node.group1 = xxx
node.group4 = aaa
node.group6 = ccc

Node 5:
node.group2 = yyy
node.group5 = bbb
node.group6 = ccc

Node 6:
node.group3 = zzz
node.group4 = aaa
node.group5 = bbb

All clients in DC 1 would send requests with one of the following settings, chosen in round-robin fashion:

index.routing.allocation.include.group1 = xxx
OR
index.routing.allocation.include.group2 = yyy
OR
index.routing.allocation.include.group3 = zzz

Similarly, clients in DC 2 would send requests with one of:

index.routing.allocation.include.group4 = aaa
OR
index.routing.allocation.include.group5 = bbb
OR
index.routing.allocation.include.group6 = ccc

again in round-robin fashion, so as to distribute the load evenly across the nodes.

Will this work, or should I just do snapshot/restore across DCs?

Regards,
Parag

On Wednesday, February 11, 2015 at 4:56:08 PM UTC-8, Parag Shah wrote:
>
> Hi all,
>
> I am trying to set up async replication with a multi-DC setup. Here is
> my scenario:
>
> 1. I will have 3 nodes in each DC. For now, assume I have only 2 DCs.
> 2. I would like to use async replication, such that it writes
> synchronously to the primary shard, but writes to only one node
> asynchronously in the remote DC.
> 3. I would like to have 2 replicas. So, 3 copies in all.
>
> I have looked at the cluster awareness settings here
> <http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-cluster.html>.
>
> However, I am not sure how it will work in practice, because if I simply
> set "async" replication on each indexing request, it may choose to
> replicate all 3 copies locally (including the primary) or, even worse, 1
> copy locally and 2 copies in the remote DC.
>
> If I use the DC as the cluster awareness attribute, it will create only 1
> copy in the local DC and another one in the remote DC. That won't satisfy
> my 3-copy requirement.
>
> If I create an additional attribute, zone, like this:
>
> cluster.routing.allocation.awareness.attributes: dc, zone
>
> and divide my DC into zone 1 (1 node?) and zone 2 (2 nodes?) and
> vice versa in the remote DC, I might be able to distribute the load
> between zones in a DC and between DCs for the 3rd copy. However, there is
> still the possibility that 2 of its replica shards might be allocated in
> the remote DC, which will prevent me from obtaining local (DC) quorums.
>
> Can someone shed light on how shard allocation works when you have more
> than one cluster allocation attribute?
>
> Also, if there is some known best practice for doing async replication
> with multi-DC such that local quorums are always possible for reads/writes,
> it will be greatly appreciated. In summary, I am looking for 3 copies
> including the primary: 2 in the local DC and 1 in the remote DC, and I
> want to write synchronously only to the primary shard.
>
> Thanks in advance,
> Parag
>
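
As a quick sanity check of the node attribute layout proposed at the top of this message, here is a minimal Python sketch that lists which nodes each include filter would select. It does not talk to Elasticsearch. It assumes an include filter on a single attribute simply matches nodes whose attribute equals the given value, and the "dc" key, NODES, FILTERS and matching_nodes are names made up for this illustration. The attribute values are copied exactly as written above, without correction.

```python
# Node attributes exactly as listed in the message (not corrected).
# The "dc" key is a hypothetical label added here only for reporting.
NODES = {
    "Node 1": {"dc": "DC 1", "group1": "xxx", "group2": "yyy", "group4": "zzz"},
    "Node 2": {"dc": "DC 1", "group2": "yyy", "group3": "zzz", "group5": "bbb"},
    "Node 3": {"dc": "DC 1", "group1": "xxx", "group2": "zzz", "group6": "ccc"},
    "Node 4": {"dc": "DC 2", "group1": "xxx", "group4": "aaa", "group6": "ccc"},
    "Node 5": {"dc": "DC 2", "group2": "yyy", "group5": "bbb", "group6": "ccc"},
    "Node 6": {"dc": "DC 2", "group3": "zzz", "group4": "aaa", "group5": "bbb"},
}

# The six include filters from the message, as (attribute, value) pairs.
FILTERS = [
    ("group1", "xxx"), ("group2", "yyy"), ("group3", "zzz"),
    ("group4", "aaa"), ("group5", "bbb"), ("group6", "ccc"),
]


def matching_nodes(attribute, value):
    """Return the names of nodes whose `attribute` equals `value`.

    Assumption: this plain equality test stands in for how a single
    index.routing.allocation.include.<attribute> filter selects nodes.
    """
    return [name for name, attrs in NODES.items() if attrs.get(attribute) == value]


if __name__ == "__main__":
    for attribute, value in FILTERS:
        names = matching_nodes(attribute, value)
        placement = ", ".join(f"{n} ({NODES[n]['dc']})" for n in names)
        print(f"include.{attribute} = {value}: {placement}")
```

Under these assumptions, group1, group2, group5 and group6 each select three nodes, two in one DC and one in the other. With the values as written, group3 = zzz and group4 = aaa each select only two nodes, because Node 3 carries node.group2 = zzz and Node 1 carries node.group4 = zzz. If those two are typos for node.group3 = zzz and node.group4 = aaa, every group would cover two nodes in one DC and one in the other.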