Here is what I think I could do with shard allocation filtering:

Use the following properties at the node level in each DC:

In DC 1:

Node 1:

node.group1 = xxx
node.group2 = yyy
node.group4 = aaa

Node 2:

node.group2 = yyy
node.group3 = zzz
node.group5 = bbb

Node 3:

node.group1 = xxx
node.group3 = zzz
node.group6 = ccc


In DC 2:

Node 4:

node.group1 = xxx
node.group4 = aaa
node.group6 = ccc

Node 5: 

node.group2 = yyy
node.group5 = bbb
node.group6 = ccc

Node 6:

node.group3 = zzz
node.group4 = aaa
node.group5 = bbb
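
To sanity-check this layout, here is a minimal Python sketch. It is only a transcription of the attributes I intend (with group4 = aaa on Node 1 and group3 = zzz on Node 3), and the names NODES, GROUPS and nodes_per_dc are made up for illustration. It counts, for each group value, how many matching nodes sit in each DC. The idea is that every group should match two nodes in its "home" DC and one node in the other DC.

```python
# Hypothetical transcription of the intended node attributes.
# "dc" is only bookkeeping for this check, not an attribute from the layout.
NODES = {
    "node1": {"dc": 1, "attrs": {"group1": "xxx", "group2": "yyy", "group4": "aaa"}},
    "node2": {"dc": 1, "attrs": {"group2": "yyy", "group3": "zzz", "group5": "bbb"}},
    "node3": {"dc": 1, "attrs": {"group1": "xxx", "group3": "zzz", "group6": "ccc"}},
    "node4": {"dc": 2, "attrs": {"group1": "xxx", "group4": "aaa", "group6": "ccc"}},
    "node5": {"dc": 2, "attrs": {"group2": "yyy", "group5": "bbb", "group6": "ccc"}},
    "node6": {"dc": 2, "attrs": {"group3": "zzz", "group4": "aaa", "group5": "bbb"}},
}

# Each group attribute and the single value it takes in this layout.
GROUPS = {
    "group1": "xxx", "group2": "yyy", "group3": "zzz",
    "group4": "aaa", "group5": "bbb", "group6": "ccc",
}


def nodes_per_dc(group, value):
    """Count the nodes carrying group=value, keyed by DC number."""
    counts = {}
    for node in NODES.values():
        if node["attrs"].get(group) == value:
            counts[node["dc"]] = counts.get(node["dc"], 0) + 1
    return counts


if __name__ == "__main__":
    for group, value in GROUPS.items():
        print(group, value, nodes_per_dc(group, value))
```

With 2 replicas and exactly three eligible nodes per group, I would expect one copy of each shard on each matching node, since two copies of the same shard are never placed on the same node. As far as I can tell, the filter does not control which of the three copies becomes the primary.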


All clients in DC 1 would send requests with one of these settings:

index.routing.allocation.include.group1 = xxx
OR 
index.routing.allocation.include.group2 = yyy
OR 
index.routing.allocation.include.group3 = zzz

in round robin fashion

Similarly, clients in DC 2 would send requests with one of these settings:

index.routing.allocation.include.group4 = aaa
OR
index.routing.allocation.include.group5 = bbb
OR 
index.routing.allocation.include.group6 = ccc

again in round robin fashion, so as to distribute the load evenly across 
the nodes.
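
A rough Python sketch of the round-robin part, under some assumptions: index.routing.allocation.include.* is an index-level setting, so I assume the filter would be chosen when an index is created rather than on every indexing request. The names DC_FILTERS and settings_cycle are hypothetical. The sketch only builds the settings body and does not talk to a cluster.

```python
from itertools import cycle

# Group filters each DC's clients would rotate through (from the lists above).
DC_FILTERS = {
    "dc1": [("group1", "xxx"), ("group2", "yyy"), ("group3", "zzz")],
    "dc2": [("group4", "aaa"), ("group5", "bbb"), ("group6", "ccc")],
}


def settings_cycle(dc, replicas=2):
    """Yield index settings forever, rotating through the DC's group filters."""
    for group, value in cycle(DC_FILTERS[dc]):
        yield {
            # 2 replicas plus the primary gives 3 copies in all.
            "index.number_of_replicas": replicas,
            # Restrict the index's shards to nodes carrying this attribute.
            f"index.routing.allocation.include.{group}": value,
        }


if __name__ == "__main__":
    dc1 = settings_cycle("dc1")
    for _ in range(4):
        print(next(dc1))
```

Each new index created from DC 1 would take the next settings body from the generator, so consecutive indices land on different triples of nodes.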

Will this work or should I just do snapshot/restore across DCs?

Regards
Parag

On Wednesday, February 11, 2015 at 4:56:08 PM UTC-8, Parag Shah wrote:
>
> Hi all,
>
>      I am trying to set up async replication with a multi-DC setup. Here is 
> my scenario:
>
> 1. I will have 3 nodes in each DC. For now, assume I have only 2 DCs. 
> 2. I would like to use async replication, such that it writes 
> synchronously to the primary shard, but writes to only one node 
> asynchronously in the remote DC.
> 3. I would like to have 2 replicas. So, 3 copies in all.
>
> I have looked at the cluster awareness settings here 
> <http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-cluster.html>.
>  
> However, I am not sure how it will work in practice, because if I simply 
> set "async" replication on each indexing request, it may choose to 
> replicate all 3 copies locally (including the primary) or even worse, 1 
> copy locally and 2 copies in the remote DC.
>
> If I use the dc as the cluster awareness attribute, it will create only 1 
> copy in the local dc and another one in the remote DC. That won't satisfy 
> my 3-copy requirement.
>
> If I create an additional attribute which is zone like this:
>
> cluster.routing.allocation.awareness.attributes: dc, zone
>
> and divide my DC into zone 1 (1 node?) and zone 2 (2 nodes?) and 
> vice-versa in the remote DC,  I might be able to distribute the load 
> between zones in a DC and between DCs for the 3rd copy. However, there is 
> still the possibility that 2 of its replica shards might be allocated in 
> the remote DC, which will prevent me from obtaining local (DC) quorums.
>
> Can someone shed light on how shard allocation works when you have more 
> than one cluster allocation awareness attribute?
>
> Also, if there is some known best practice for doing Async Replication 
> with MultiDC such that local quorums are always possible for reads/writes, 
> it will be greatly appreciated. In summary, I am looking for 3 copies 
> including primary. 2 in the local DC and 1 in the remote DC and want to 
> write synchronously only to the primary shard.
>
> Thanks in advance,
> Parag
>
