On Nov 29, 2012, at 9:46 AM, Felix Terkhorn <[email protected]> wrote:
> Thanks, Eric. This makes sense. Right now we’re indeed handling the update
> on the client side.
>
> Your point about considering normalization in cases where we have to
> duplicate a lot of data is well taken. This actually applies to a couple of
> use cases for us already – in one, we only have a single copy of the data;
> in another, we could see as many as 10-50 copies.
>
> For the first case, we can stick to client-side updates of both copies for
> now. For the second case, if we actually see that many copies starting to
> crop up, we can certainly de-duplicate things a bit.
>
> In case we do go about experimenting with post-commit hooks… my
> understanding was that they can only be written in Erlang. I’m not entirely
> sure how this syncs up with your mention of “the number of pooled
> connections in javascript.” Forgive me if I’m missing something obvious,
> there. Is there some config setting that I can visit to find out how many
> pooled connections we have available to handle those post-commit hooks?

Sorry, you're correct: post-commit has no JavaScript option, only pre-commit.

> -f
>
> From: Eric Redmond [mailto:[email protected]]
> Sent: Thursday, November 29, 2012 11:17 AM
> To: Felix Terkhorn
> Cc: [email protected]
> Subject: Re: Best practice -- duplicating and syncing objects
>
> There's no general best practice for keeping denormalized data in sync,
> beyond the obvious case, which is to update all values through whatever
> client you use to update one. If your number of keys is small, this is not
> going to be a hard hit on your updates. If you have an unbounded number of
> keys, you may consider normalizing your data model a bit to reduce
> duplicate data.
>
> Correct, you do not have to wait for a post-commit to fire (actually, you
> can't).
>
> You could functionally update objects in a post-commit, though I don't know
> how commonly this is done. If the post-commit job is long-running, you
> might run out of pooled connections in javascript.
> You'd also have to be very careful to avoid your aforementioned "infinite
> loop", since whether by link walking or post-commit hooks, you still run
> the risk of objects updating each other recursively.
>
> Eric
>
> On Nov 29, 2012, at 7:53 AM, Felix Terkhorn <[email protected]> wrote:
>
> > Greetings!
> >
> > In the event that we have several documents, [A1, A2, …, An], which
> > contain the same data accessed via different keys, what is the best
> > practice for keeping the data in sync?
> >
> > Also, do post-commit hooks fire after the client receives a successful
> > 201 or 200 status on a PUT? That is to say, we don’t have to wait for all
> > post-commit hooks to fire in order for our client to receive an HTTP
> > success status, right?
> >
> > That’s our assumption, and if true, we’d like to exploit that fact in
> > order to keep the response time of the PUT low. Basically, the client
> > could PUT A1, and we could let Riak handle the necessary updates in a
> > post-processing step.
> >
> > We could keep the list [A1, A2, …, An] somewhere else, and simply walk
> > that list every time any document in the list is updated, excluding the
> > document itself. Is this a standard approach?
> >
> > We thought of linking objects together, and having them update each
> > other on post-commit, but that seems like it will bring us into
> > infinite-loop territory. :-D
> >
> > Thanks,
> > Felix
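[Editor's note] Eric's recommendation — update every copy through whatever client you use to update one — can be sketched as below. This is a hedged illustration, not Basho code: `KVStore`, `update_all_copies`, and the key names are hypothetical stand-ins; with a real Riak cluster each `put` would be an HTTP PUT against the bucket/key.

```python
# Sketch of client-side fan-out for duplicated objects [A1, A2, ..., An].
# KVStore is a hypothetical in-memory stand-in for a Riak client; in
# practice each put() would be an HTTP PUT to the corresponding key.

class KVStore:
    """Minimal stand-in for a key/value bucket."""
    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


def update_all_copies(store, copy_keys, updated_key, value):
    """Write `value` to the key the client just updated, then walk the
    rest of the duplicate list and write the same value to each copy.
    Skipping `updated_key` during the walk is what keeps the copies
    from updating each other recursively."""
    store.put(updated_key, value)
    for key in copy_keys:
        if key == updated_key:  # exclude the originating document
            continue
        store.put(key, value)


store = KVStore()
copies = ["A1", "A2", "A3"]
update_all_copies(store, copies, "A1", {"color": "blue"})
print(all(store.get(k) == {"color": "blue"} for k in copies))  # True
```

With a bounded, small list of copies this is exactly the "obvious case" Eric describes; the cost is one extra write per duplicate on every update.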
_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
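[Editor's note] Felix's infinite-loop worry applies to any scheme where copies trigger updates of each other. One common guard, sketched here in Python against a plain dict (real Riak post-commit hooks would have to be written in Erlang, per the thread), is to make propagation idempotent: skip the write when the stored value already matches, so a chain of mutual updates terminates. `propagate` and the key names are hypothetical.

```python
# Hedged sketch: breaking update cycles by making propagation idempotent.
# If A1's hook updates A2 and A2's hook updates A1, comparing before
# writing stops the second round trip. The dict stands in for the store.

def propagate(store, src_key, dst_key):
    """Copy src's value to dst only if it differs. Returns True when a
    write happened; a skipped write means no further hook would fire."""
    value = store[src_key]
    if store.get(dst_key) == value:
        return False  # already in sync: the cycle terminates here
    store[dst_key] = value
    return True


store = {"A1": "blue"}
first = propagate(store, "A1", "A2")   # writes A2
second = propagate(store, "A2", "A1")  # values already match: no write
print(first, second)  # True False
```

This only bounds the loop when updates converge to the same value; it does not replace the simpler rule of excluding the originating document from the walk.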
