Thanks everyone for these replies (and also Aphyr, off-list).  It has helped me 
confirm my suspicions and sounds like I'm on the right track.

For one of my keys, I am doing sort of a manual "last write wins" by having 
the reader sort siblings by timestamp, then by vtag, to deterministically 
select the same sibling every time.  The reason for keeping the other siblings 
around is they may contain the only references to other keys created along 
with them.  A separate cleanup process can then be sure to delete the referred 
keys before removing the siblings.  And of course the algorithm used to 
determine the winning sibling is shared by both the read function and the 
cleanup function.

On Wednesday, November 02, 2011 07:52:03 AM Bob Ippolito wrote:
> An approach similar to #1 is implemented in statebox
> http://github.com/mochi/statebox - basically the trick is to store an
> operation queue along with the data, and to put some constraints on how
> operations must work so that they can be repeated for conflict resolution.
> 
> On Wednesday, November 2, 2011, Erik Søe Sørensen <[email protected]> wrote:
> > What you'd usually do is somewhere between 2) and 3) - namely, accept
> 
> that siblings might occur (although rarely). Also, you'd have a resolution
> function with the property (besides being deterministic) that
> reconciliating two identical siblings would yield the same - i.e., f(X,X) =
> X.
> 
> > ________________________________________
> > Fra: [email protected] [
> 
> [email protected]] P&#229; vegne af Justin Karneges [
> [email protected]]
> 
> > Sendt: 1. november 2011 21:34
> > Til: [email protected]
> > Emne: safely resolving conflicts on read
> > 
> > Hi,
> > 
> > http://wiki.basho.com/Vector-Clocks.html contains this text:
> > 
> > "It should be noted that if you are trying to resolve conflicts
> 
> automatically,
> 
> > you can end up in a condition with which two clients are simultaneously
> > resolving and creating new conflicts."
> > 
> > If conflict resolution is moved to the reader, I'm curious what
> > strategies people use to avoid this kind of feet stomping.
> > 
> > Some ideas that have come to mind:
> > 
> > 1) Have a deterministic way of deriving a correct/unified/winner value
> 
> from
> 
> > multiple siblings, such that any reading client would always arrive at
> > the same answer.  Simply use this derived answer as the value read, but
> > don't attempt to write a corrected value into Riak.  The value could
> > eventually
> 
> be
> 
> > corrected at the time of a necessary write as opposed to a read reaction.
> > 
> > 2) Determine a correct value per #1 above, but then attempt to write this
> > value back into Riak in such a way that if multiple nodes were to write
> 
> the
> 
> > same value simultaneously then they don't create siblings.  I'm not sure
> 
> if
> 
> > this is possible in Riak?
> > 
> > 3) Determine a correct value per #1 above, and allow exactly one node to
> 
> ever
> 
> > immediately write corrected values after a read.  Something like "if
> 
> hash(key)
> 
> > % node_count == current_node then do_correction".  Since value correction
> 
> is
> 
> > not vital to availability, there's no harm if the owning node is down.  I
> 
> just
> 
> > figure this would allow for self-healing over time.
> > 
> > Maybe there are other ways.  What do people really do?
> > 
> > Thanks,
> > Justin
> > 
> > _______________________________________________
> > riak-users mailing list
> > [email protected]
> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> > 
> > _______________________________________________
> > riak-users mailing list
> > [email protected]
> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

Reply via email to