Re: multi datacenter replication

Adron Hall Mon, 13 May 2013 12:44:46 -0700

Hey Pat,

 A few answers, thoughts and questions.


1. Each bucket supports replication (from v1.1/1.2 onward). In 1.1 and above
there is a repl property that accepts true or false. True turns on both
realtime and fullsync. 1.2 and above adds the options realtime, fullsync,
or both. Enabling the property from the command line:

curl -v -XPUT -H "Content-Type: application/json" \
-d '{"props":{"repl":true}}' \
http://127.0.0.1:8091/riak/my_bucket

2. Running both styles of replication should be fine. Both are on by
default. In the situation you describe, leaving realtime on all the time
should work well, with fullsync run only when the ship docks and connects
at a higher speed, using something to trigger it.
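
A rough sketch of the "something to trigger that" idea, in Python. The
bandwidth threshold, the function names, and the exact riak-repl command
are assumptions on my part; check the riak-repl documentation for your
Riak version before relying on the command.

```python
import subprocess

# Assumption: the command that starts a fullsync on the hub cluster.
# Verify it against the riak-repl docs for the version you run.
FULLSYNC_COMMAND = ["riak-repl", "start-fullsync"]


def should_fullsync(bandwidth_kbps, min_kbps=10_000):
    """Return True when the measured link is fast enough for a full sync."""
    return bandwidth_kbps >= min_kbps


def maybe_trigger_fullsync(bandwidth_kbps, min_kbps=10_000, run=subprocess.run):
    """Start a fullsync only on a fast link; realtime is left untouched."""
    if not should_fullsync(bandwidth_kbps, min_kbps):
        return False
    run(FULLSYNC_COMMAND, check=True)
    return True
```

How the bandwidth gets measured (a dock-side network check, a cron job, a
link-up hook) is left open here; the `run` argument exists only so the
decision logic can be exercised without a Riak node.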

A few additional questions:

   - I recall we spoke about data sizes of 3-5k per ship, but then there
   was all of the data that would go along with each client. Could you
   elaborate on the counts, sizes, connections to other elements and
   related data? What types/pieces would need to go to each ship, etc.?
   - For the connection to each ship over the satellite link, what do the
   bandwidth, latency and other characteristics look like? Latency times of
   800, 1500, 6000 or possibly higher, 8000 or 10000?

Another Architectural Thought:

   - One idea that stands out would be to use Riak as the primary cluster
   but to implement a client that does replication itself specifically for a
   bucket (or buckets). It seems like, from my understanding so far, that the
   client might be the key mechanism to control any type of replication - with
   or without MDC being used. Basically following a standard hub-and-spoke
   server & client model.
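
To make that idea a bit more concrete, here is a minimal sketch of a
client that writes to the hub and then forwards only selected buckets to a
spoke. Everything in it is hypothetical: `hub_put` and `spoke_put` stand in
for whatever Riak client calls you would actually use, and the in-memory
queue would need to be durable in a real deployment.

```python
from collections import deque


class HubSpokeClient:
    """Writes to the hub first, then forwards selected buckets to a spoke.

    hub_put and spoke_put are hypothetical callables taking
    (bucket, key, value); they would wrap a real Riak client library.
    """

    def __init__(self, hub_put, spoke_put, replicated_buckets):
        self.hub_put = hub_put
        self.spoke_put = spoke_put
        self.replicated_buckets = set(replicated_buckets)
        self.pending = deque()  # writes waiting for the spoke link

    def put(self, bucket, key, value):
        """Store on the hub and queue the write if the bucket is replicated."""
        self.hub_put(bucket, key, value)
        if bucket in self.replicated_buckets:
            self.pending.append((bucket, key, value))

    def flush(self, max_items=None):
        """Forward queued writes to the spoke; stop at the first failure."""
        sent = 0
        while self.pending and (max_items is None or sent < max_items):
            bucket, key, value = self.pending[0]
            try:
                self.spoke_put(bucket, key, value)
            except OSError:
                break  # link dropped; keep the item queued for the next flush
            self.pending.popleft()
            sent += 1
        return sent
```

Calling `flush(max_items=...)` with a small limit on the slow link and
without a limit on the fast one is one way such a client could prioritize
traffic, with or without MDC in the picture.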

Hope that helps, cheers!

-Adron




On Fri, May 10, 2013 at 9:45 AM, Patrick Christopher <
[email protected]> wrote:

> Hi,
> I’m working on an application that will be spread across many (150-200)
> data centers.  I had a great chat with Adron at the Seattle Riak Office
> Hours and I think that Riak can provide the backbone of the solution.
> Adron is a great help but I have come away with (or have come up with) two
> more questions.
>
> 1.    Does riak support specifying bucket level multi data center
> replication? There is a single master data center that has all of the data,
> and that central cluster replicates a single different bucket to each of
> the remote data centers.  It's a hub/spoke model where something at the hub
> has a view of the full data set and something at a spoke end only has a
> view of a single, unique bucket.
>
> 2.    What would be the best way to set up a priority replication
> strategy?  There is always a link between the main dc and the spoke dcs,
> but sometimes it's a big fast link and we'd want to do a full replication
> and sometimes it's the equivalent of a 56k modem and we only want to
> replicate time critical data.  I think riak can handle this by using the
> real-time sync for critical data and the full-sync for a full sync.  Will
> that work or is that asking for trouble running both styles?
>
>
> And there was a small note in the docs
> <http://docs.basho.com/riakee/latest/cookbooks/Multi-Data-Center-Replication-Architecture/>
> that says, " ...there are two primary modes of operation..." are there other,
> secondary modes of replication or is this me over-parsing the docs?
>
>
> Thank you,
>
> Pat
>
>
>
> _______________________________________________
> riak-users mailing list
> [email protected]
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>


-- 
*Adron B Hall*
Blog <http://compositecode.com/>, Adron.Me <http://adron.me/>,
@adron<http://twitter.com/adron>
with Basho <http://basho.com/> @Basho <https://twitter.com/basho>

