Absolutely. Let's keep ourselves in sync on that. I'm very curious how the implementation moves forward, and I'll definitely keep you and the team in the loop on Riak features, especially replication features.
Cheers!
-Adron

On Thu, May 16, 2013 at 11:53 AM, Patrick Christopher < [email protected]> wrote:

> Hi Adron,
> Thank you for the reply. The licensing is a concern, but that was for
> later.
>
> I think there will have to be a cluster of some sort on the remote dcs as
> the connection between the edge dcs and the central dc is limited to a
> small 1-2mbps link that is shared with many other higher priority
> applications. The remote dc spends most of its time (70-80%ish) with only
> the small link. When the small link is all we have, we would want the edge
> dc to reliably store information locally until it can connect to the large
> link.
>
> This is probably not a great fit for riak right now. If you do have any
> other suggestions, I'd love to hear them and I'll keep an eye out for how
> replication grows in the coming months.
>
> Pat
>
>
> On Wed, May 15, 2013 at 12:15 PM, Adron Hall <[email protected]> wrote:
>
>> Currently, as the replication exists today I don't believe the
>> replication service would do exactly that. (anyone else on list, plz
>> correct me if I'm wrong here). However in the coming months we have that
>> capability in the road map.
>>
>> However I'm just a little hesitant to suggest committing an entire Riak
>> Cluster at each remote point solely for replication. There's also the
>> licensing that comes into play with the multi-data center replication also.
>> Ideally we'd have PS (Professional Services) or your team work together to
>> build clients to connect to a main Riak Cluster.
>>
>> Being that each cluster should have a minimal of 5 nodes, having a
>> cluster of 5 nodes at each remote point would seem like overkill - however
>> it would replicate, it would just be that a lot of nodes & a lot of
>> clusters is a lot for the volume of data. A client could be dramatically
>> more minimalistic, it wouldn't require 5 nodes for each remote cluster, and
>> could prospectively be dramatically cheaper & more efficient in the end.
>>
>> I'll loop you in with some others that could elaborate on this
>> architecture and see which direction to aim for.
>>
>> Cheers,
>> -Adron
>>
>>
>> On Mon, May 13, 2013 at 5:20 PM, Patrick Christopher <
>> [email protected]> wrote:
>>
>>> Hi Adron,
>>> Thanks for the reply!
>>>
>>> The architectural thought that you pose is the idea I'm going for but
>>> having riak do all of the replication, not a new client. The model would
>>> be:
>>> - single riak cluster
>>> - bucket a replicates to data center remoteA
>>> - bucket b replicates to data center remoteB
>>> - remoteB will never access data from a
>>> - remoteA will never access data from b
>>>
>>> Does riak support that? I've not seen any database that supports that
>>> model on its own.
>>>
>>> Pat
>>>
>>>
>>> On Mon, May 13, 2013 at 12:42 PM, Adron Hall <[email protected]> wrote:
>>>
>>>> Hey Pat,
>>>>
>>>> A few answers, thoughts and questions.
>>>>
>>>> 1. Each bucket allows (if after v1.1/1.2) replication. In 1.1 and above
>>>> there is a repl value that accepts a true or false value. True turns on
>>>> realtime and fullsync abilities. In 1.2 above has additional boolean
>>>> parameters of realtime, fullsync, or both. Enabling the property via
>>>> command line:
>>>>
>>>> curl -v -XPUT -H "Content-Type: application/json" \
>>>> -d '{"props":{"repl":true}}' \
>>>> http://127.0.0.1:8091/riak/my_bucket
>>>>
>>>> 2. Running both styles of replication should be fine. They're on by
>>>> default to start. In the particular situation you describe - using realtime
>>>> on all the time should work well and then only using the fullsync when the
>>>> ship docks and connects at a higher speed - using something to trigger
>>>> that.
>>>>
>>>> A few additional questions:
>>>>
>>>> - I recall we spoke about data sizes of 3-5k per ship, but then
>>>> there was all of the data that would go along with each client, could you
>>>> provide more elaboration around what count, sizes, connections to other
>>>> elements and related information data? Why type/pieces would need to go to
>>>> each ship, etc.
>>>> - For the connections to each ship during satellite link what does
>>>> the bandwidth, latency and other characteristics look like? Latency times
>>>> of 800, 1500, 6000 or possibly higher 8000, 10000?
>>>>
>>>> Another Architectural Thought:
>>>>
>>>> - One idea that stands out would be to use Riak as the primary
>>>> cluster but to implement a client that does replication itself specifically
>>>> for a bucket (or buckets). It seems like, from my understanding so far,
>>>> that the client might be the key mechanism to control any type of
>>>> replication - with or without MDC being used. Basically following a
>>>> standard hub-and-spoke server & client model.
>>>>
>>>> Hope that helps, cheers!
>>>>
>>>> -Adron
>>>>
>>>>
>>>>
>>>>
>>>> On Fri, May 10, 2013 at 9:45 AM, Patrick Christopher <
>>>> [email protected]> wrote:
>>>>
>>>>> Hi,
>>>>> I’m working on an application that will be spread across many
>>>>> (150-200) data centers. I had a great chat with Adron at the Seattle
>>>>> Riak Office Hours and I think that Riak can provide the backbone of the
>>>>> solution. Adron is a great help but I have come away with (or have come up
>>>>> with) two more questions.
>>>>>
>>>>> 1. Does riak support specifying bucket level multi data center
>>>>> replication? There is a single master data center that has all of the data,
>>>>> and that central cluster replicates a single different bucket to each of
>>>>> the remote data centers.
>>>>> Its a hub/spoke model where something at the hub
>>>>> has a view of the full data set and something at a spoke end only has a
>>>>> view of a single, unique bucket.
>>>>>
>>>>> 2. What would be the best way to setup a priority replication
>>>>> strategy? There is always a link between the main dc and the spoke
>>>>> dcs, but sometimes its a big fast link and we'd want to do a full
>>>>> replication and sometimes its the equivalent of a 56k modem and we only
>>>>> want to replicate time critical data. I think riak can handle this by
>>>>> using the real-time sync for critical data and the full-sync for a full
>>>>> sync. Will that work or is that asking for trouble running both styles?
>>>>>
>>>>>
>>>>> And there was a small note in the
>>>>> docs<http://docs.basho.com/riakee/latest/cookbooks/Multi-Data-Center-Replication-Architecture/> that
>>>>> says, " ...there are two primary modes of operation..." are there
>>>>> other, secondary modes of replication or is this me over-parsing the docs?
>>>>>
>>>>>
>>>>> Thank you,
>>>>>
>>>>> Pat
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> riak-users mailing list
>>>>> [email protected]
>>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> *Adron B Hall*
>>>> Blog <http://compositecode.com/>, Adron.Me <http://adron.me/>,
>>>> @adron<http://twitter.com/adron>
>>>> with Basho <http://basho.com/> @Basho <https://twitter.com/basho>
>>>>
>>>
>>>
>>
>>
>> --
>> *Adron B Hall*
>> Blog <http://compositecode.com/>, Adron.Me <http://adron.me/>,
>> @adron<http://twitter.com/adron>
>> with Basho <http://basho.com/> @Basho <https://twitter.com/basho>
>>
>
>
--
*Adron B Hall*
Blog <http://compositecode.com/>, Adron.Me <http://adron.me/>,
@adron<http://twitter.com/adron>
with Basho <http://basho.com/> @Basho <https://twitter.com/basho>
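
The hub-and-spoke client idea discussed in this thread (one bucket per remote data center, time-critical data over the slow link, everything else when the fast link is up) could be sketched roughly as below. This is only an illustration under stated assumptions, not Riak's replication API: the names `Update`, `HubReplicator`, `routes`, `critical`, and the `send` callback are all hypothetical, and the sketch has no retry or persistence handling.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Update:
    bucket: str
    key: str
    value: bytes
    critical: bool = False  # hypothetical flag marking time-critical data


@dataclass
class HubReplicator:
    # Hypothetical routing table: each bucket maps to exactly one spoke,
    # so a spoke only ever receives data from its own bucket.
    routes: dict
    pending: dict = field(default_factory=dict)

    def enqueue(self, update):
        """Queue an update for the single spoke that owns its bucket."""
        spoke = self.routes[update.bucket]
        self.pending.setdefault(spoke, deque()).append(update)

    def flush(self, spoke, fast_link, send):
        """Send queued updates for one spoke and return how many were sent.

        On the fast link everything is sent; on the slow link only
        critical updates go out and the rest stay queued, in order.
        """
        queue = self.pending.get(spoke, deque())
        kept = deque()
        sent = 0
        while queue:
            update = queue.popleft()
            if fast_link or update.critical:
                send(spoke, update)  # caller-supplied transport (assumption)
                sent += 1
            else:
                kept.append(update)
        self.pending[spoke] = kept
        return sent
```

A caller would map bucket `a` to `remoteA` and bucket `b` to `remoteB` in `routes`, call `flush(spoke, fast_link=False, send=...)` while only the small link is available, and call it again with `fast_link=True` once the large link comes up.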
