Absolutely. Let's stay in sync on that. I'm very curious how the
implementation moves forward, and I'll definitely keep you and the team in
the loop on Riak features, especially replication features.

Cheers!
-Adron


On Thu, May 16, 2013 at 11:53 AM, Patrick Christopher <
[email protected]> wrote:

> Hi Adron,
> Thank you for the reply.  The licensing is a concern, but that was for
> later.
>
> I think there will have to be a cluster of some sort on the remote dcs as
> the connection between the edge dcs and the central dc is limited to a
> small 1-2 Mbps link that is shared with many other higher-priority
> applications.  The remote dc spends most of its time (70-80%ish) with only
> the small link.  When the small link is all we have, we would want the edge
> dc to reliably store information locally until it can connect to the large
> link.
>
> This is probably not a great fit for riak right now.  If you do have any
> other suggestions, I'd love to hear them and I'll keep an eye out for how
> replication grows in the coming months.
>
> Pat
>
>
> On Wed, May 15, 2013 at 12:15 PM, Adron Hall <[email protected]> wrote:
>
>> As replication exists today, I don't believe the replication service
>> would do exactly that (anyone else on the list, please correct me if I'm
>> wrong here). However, that capability is on the roadmap for the coming
>> months.
>>
>> However, I'm a little hesitant to suggest committing an entire Riak
>> cluster at each remote point solely for replication. Licensing also comes
>> into play with multi-data center replication. Ideally we'd have PS
>> (Professional Services) or your team work together to build clients that
>> connect to a main Riak cluster.
>>
>> Since each cluster should have a minimum of 5 nodes, a cluster of 5 nodes
>> at each remote point would seem like overkill. It would replicate, but
>> that many nodes and clusters is a lot for the volume of data. A client
>> could be far more minimal: it wouldn't require 5 nodes for each remote
>> cluster, and could potentially be much cheaper & more efficient in the end.
>>
>> I'll loop you in with some others that could elaborate on this
>> architecture and see which direction to aim for.
>>
>> Cheers,
>> -Adron
>>
>>
>> On Mon, May 13, 2013 at 5:20 PM, Patrick Christopher <
>> [email protected]> wrote:
>>
>>> Hi Adron,
>>> Thanks for the reply!
>>>
>>> The architectural thought that you pose is the idea I'm going for, but
>>> with riak doing all of the replication rather than a new client.  The
>>> model would be:
>>>   - single riak cluster
>>>   - bucket a replicates to data center remoteA
>>>   - bucket b replicates to data center remoteB
>>>   - remoteB will never access data from a
>>>   - remoteA will never access data from b
>>>
>>> Does riak support that?  I've not seen any database that supports that
>>> model on its own.
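The model described above can be written down as a small routing table. This is only an illustrative sketch of the idea, not something Riak provides: `ROUTES`, `target_for`, and `may_read` are hypothetical names, and the bucket and data center names are the placeholders from the list above.

```python
# Hypothetical routing table for the hub/spoke model; not a Riak API.
ROUTES = {
    "a": "remoteA",  # bucket a replicates only to data center remoteA
    "b": "remoteB",  # bucket b replicates only to data center remoteB
}


def target_for(bucket):
    """Return the one remote data center a bucket replicates to, or None."""
    return ROUTES.get(bucket)


def may_read(remote, bucket):
    """A remote may only see the bucket that is routed to it."""
    return ROUTES.get(bucket) == remote
```

With this table, `may_read("remoteB", "a")` is false, which is the "remoteB will never access data from a" rule; a bucket with no entry is simply never replicated.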
>>>
>>> Pat
>>>
>>>
>>> On Mon, May 13, 2013 at 12:42 PM, Adron Hall <[email protected]> wrote:
>>>
>>>> Hey Pat,
>>>>
>>>>  A few answers, thoughts and questions.
>>>>
>>>> 1. Each bucket allows replication (from v1.1/1.2 on). In 1.1 and above
>>>> there is a repl property that accepts true or false; true turns on both
>>>> realtime and fullsync. Versions 1.2 and above also accept realtime,
>>>> fullsync, or both. To set the property from the command line:
>>>>
>>>> curl -v -XPUT -H "Content-Type: application/json" \
>>>> -d '{"props":{"repl":true}}' \
>>>> http://127.0.0.1:8091/riak/my_bucket
>>>>
>>>> 2. Running both styles of replication should be fine; they're both on
>>>> by default. In the situation you describe, leaving realtime on all the
>>>> time should work well, with fullsync run only when the ship docks and
>>>> connects at a higher speed, using something to trigger it.
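The trigger mentioned in point 2 could be as simple as a check on the measured link speed. A minimal sketch, under stated assumptions: the threshold `FULLSYNC_MIN_KBPS` and the function `fullsync_command` are hypothetical, how the link speed is measured is left out, and the `riak-repl start-fullsync` invocation is an assumption that should be checked against the replication version actually installed. The function only returns the command; it does not run anything.

```python
FULLSYNC_MIN_KBPS = 10_000  # hypothetical cutoff for "docked, fast link"


def fullsync_command(link_kbps, min_kbps=FULLSYNC_MIN_KBPS):
    """Return the command to start a fullsync, or None on a slow link.

    Realtime replication is assumed to stay on regardless; this only
    decides whether the link is fast enough to also start a fullsync.
    """
    if link_kbps >= min_kbps:
        # Assumed CLI invocation; verify against your riak-repl version.
        return ["riak-repl", "start-fullsync"]
    return None
```

A scheduler or link-up hook could call this periodically and pass the result to `subprocess.run` when it is not `None`.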
>>>>
>>>> A few additional questions:
>>>>
>>>>    - I recall we spoke about data sizes of 3-5k per ship, but then
>>>>    there was all of the data that would go along with each client. Could
>>>>    you elaborate on the counts, sizes, connections to other elements,
>>>>    and related data? What type/pieces would need to go to each ship,
>>>>    etc.?
>>>>    - For the connections to each ship during satellite link, what do
>>>>    the bandwidth, latency and other characteristics look like? Latency
>>>>    times of 800, 1500, 6000 or possibly higher, 8000, 10000?
>>>>
>>>> Another Architectural Thought:
>>>>
>>>>    - One idea that stands out would be to use Riak as the primary
>>>>    cluster but to implement a client that does replication itself,
>>>>    specifically for a bucket (or buckets). It seems like, from my
>>>>    understanding so far, that the client might be the key mechanism to
>>>>    control any type of replication, with or without MDC being used.
>>>>    Basically following a standard hub-and-spoke server & client model.
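One way to picture the client-driven approach is a small store-and-forward outbox on the spoke side. This is a rough sketch under assumptions, not a Riak client: `SpokeOutbox` and its methods are hypothetical, `send` stands in for whatever call would write a record to the hub cluster, and the queues are in memory only, whereas a real spoke would need durable local storage.

```python
from collections import deque


class SpokeOutbox:
    """Holds records at a spoke until the hub link can take them."""

    def __init__(self, send):
        self._send = send         # callable(bucket, key, value) -> bool
        self._critical = deque()  # time-critical records, sent on any link
        self._bulk = deque()      # everything else, sent only on a fast link

    def put(self, bucket, key, value, critical=False):
        """Queue a record locally instead of writing to the hub directly."""
        queue = self._critical if critical else self._bulk
        queue.append((bucket, key, value))

    def flush(self, fast_link):
        """Send what the current link allows; return how many were sent."""
        sent = 0
        queues = [self._critical, self._bulk] if fast_link else [self._critical]
        for queue in queues:
            while queue:
                if not self._send(*queue[0]):
                    return sent   # send failed; keep the rest queued
                queue.popleft()
                sent += 1
        return sent

    def pending(self):
        """Number of records still waiting for the hub."""
        return len(self._critical) + len(self._bulk)
```

On the slow link only the critical queue drains; calling `flush(fast_link=True)` once the fast link is available sends the backlog as well. A record is removed from its queue only after `send` reports success.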
>>>>
>>>> Hope that helps, cheers!
>>>>
>>>> -Adron
>>>>
>>>>
>>>>
>>>>
>>>> On Fri, May 10, 2013 at 9:45 AM, Patrick Christopher <
>>>> [email protected]> wrote:
>>>>
>>>>> Hi,
>>>>> I’m working on an application that will be spread across many
>>>>> (150-200) data centers.  I had a great chat with Adron at the Seattle
>>>>> Riak Office Hours and I think that Riak can provide the backbone of the
>>>>> solution.  Adron is a great help but I have come away with (or have
>>>>> come up with) two more questions.
>>>>>
>>>>> 1. Does riak support specifying bucket level multi data center
>>>>> replication? There is a single master data center that has all of the
>>>>> data, and that central cluster replicates a single different bucket to
>>>>> each of the remote data centers.  It's a hub/spoke model where something
>>>>> at the hub has a view of the full data set and something at a spoke end
>>>>> only has a view of a single, unique bucket.
>>>>>
>>>>> 2. What would be the best way to set up a priority replication
>>>>> strategy?  There is always a link between the main dc and the spoke
>>>>> dcs, but sometimes it's a big fast link and we'd want to do a full
>>>>> replication and sometimes it's the equivalent of a 56k modem and we only
>>>>> want to replicate time critical data.  I think riak can handle this by
>>>>> using the real-time sync for critical data and the full-sync for a full
>>>>> sync.  Will that work or is running both styles asking for trouble?
>>>>>
>>>>>
>>>>> And there was a small note in the docs
>>>>> <http://docs.basho.com/riakee/latest/cookbooks/Multi-Data-Center-Replication-Architecture/>
>>>>> that says, "...there are two primary modes of operation...". Are there
>>>>> other, secondary modes of replication or is this me over-parsing the docs?
>>>>>
>>>>>
>>>>> Thank you,
>>>>>
>>>>> Pat
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> riak-users mailing list
>>>>> [email protected]
>>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> *Adron B Hall*
>>>>  Blog <http://compositecode.com/>, Adron.Me <http://adron.me/>, 
>>>> @adron<http://twitter.com/adron>
>>>> with Basho <http://basho.com/> @Basho <https://twitter.com/basho>
>>>>
>>>
>>>
>>
>>
>>
>
>


