Asynchronous, one-way data replication to stand-alone nodes?

Pelle Jansson Mon, 01 Feb 2016 02:28:40 -0800

Hello.

I am not sure whether this has been asked before. I've tried searching, and either 
I am terrible at searching or the question hasn't been asked.

We are trying to figure out how to solve the following problem:

We have systems in various regions in AWS, and we are trying to find a way to 
sync configuration data for various applications we have running there. This 
configuration data is located in an existing 5-node Cassandra cluster today. It 
is a single, small keyspace with just a few MB of data (for now, anyway).

It might look something like this (I hope formatting survives):

A     B
 \   /
  C*       Centralized, existing, 5-node cluster
 /   \
D     E

Where A, B, D, and E are single nodes (or small clusters) in, for example, AWS.

So A, B, D and E would get data pushed to them from C*, but they should not 
sync any data back to C*.

One thought is to run a stand-alone, single Cassandra node (or a small cluster) 
in each region and sync the data over. One way to do this is to use 
sstableloader, but is there a more convenient, more "out-of-the-box" way to get 
the data to these nodes?
Would it be possible to somehow use the Multi DC feature in this setup, and 
would it be possible to prevent the stand-alone nodes from syncing their data 
back to the "main" DC in that case? Or are there features of Cassandra that can 
do this that we've missed?

I hope I've made myself clear ;)

Thanks in advance for any input on this.

Br,
Pelle Jansson
