On Jul 31, 2011, at 12:58 PM, Simon Slavin wrote:

> These two go together. Multi-master replication (one example of which is a
> document store) is relatively easy. Datestamp every value (document) and
> whichever one has the latest date is the one you want.

I hear that a lot, but it makes me pretty uncomfortable. People like timestamps because there's some kind of implied causality. It's not there in reality, though, and this type of resolution can lead to harmful (i.e. data loss) results. Just because something happened after something else doesn't mean that it had all the same knowledge that went into the first decision. And those are the best cases.

Lamport clocks (and the more general vector clocks) exist because they *explicitly* state causality.

That is, if I have perfectly synchronized clocks and two applications running on machines immediately next to each other (trying to avoid the relativity argument[0]), event A can occur that changes the data to a particular state and can be picked up by server one, but not server two. Server one and server two can go to change the data at roughly the same time, but server two was slightly slower and it came in last. Now server two is just eating data *because* it's reacting more slowly. Your timestamp-automated conflict resolver favors slower machines that stay behind.

With explicit causality, you state that state B succeeds state A because we knew about state A, regardless of when we made our decision. If state B' tries to succeed state A without knowing about state B, then it can happen on an isolated system, but will introduce a conflict when it learns that something had already done this. Now we have two successors for state A, and only your application's conflict resolver can make sense of what it means for what state the document should be in. (Note that in the case of CouchDB, if state B and state B' were the same, this would be recognized as not a conflict, but that's rather a special case.)

[0]: I always try to avoid the relativity argument, but when you're dealing with systems that are far apart, whether an event happened before another event is entirely up to the perspective of the observer.
My CouchDB on Mars might see and react to an event hours before we can observe it on Earth. About one (Earth) year later, Earth might observe and react to this event hours before Mars can. In both cases, an event coming from the other direction would have flipped its arrival order between the two.

While that may seem like a silly thing to discuss, the exact same thing happens locally. The theoretical floor of ping time between the east and west coast of the US is about 18 milliseconds. In practice, you'll get something closer to 40ms. Do you know how much stuff happens within 40ms? From the perspective of the east coast, anything happening on the east coast will appear to occur considerably sooner than anything happening at "the same time" on the west coast.

--
dustin sallings

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
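
As a rough illustration of the contrast drawn in the message above (timestamp-based resolution versus explicit causality), here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `Version` record, the helper names `descends`, `compare`, `write` and `last_write_wins`, and the node names "one" and "two" are hypothetical, and this is not how CouchDB or any particular database implements conflict detection. It only shows that a vector clock can report "B and B' both succeed A, neither knows the other", while a latest-timestamp rule silently keeps whichever write arrived last.

```python
from dataclasses import dataclass


@dataclass
class Version:
    value: str
    clock: dict       # vector clock: node name -> number of writes seen
    timestamp: float  # wall-clock time of the write (used only by LWW)


def descends(a, b):
    """True if clock `a` has seen every write recorded in clock `b`."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def compare(a, b):
    """Classify clock `a` relative to clock `b`."""
    if a == b:
        return "equal"
    if descends(a, b):
        return "after"      # a succeeds b: a knew about b
    if descends(b, a):
        return "before"     # b succeeds a
    return "conflict"       # neither knew about the other


def write(parent, node, value, timestamp):
    """Create a successor of `parent`, written on `node`."""
    clock = dict(parent.clock)
    clock[node] = clock.get(node, 0) + 1
    return Version(value, clock, timestamp)


def last_write_wins(x, y):
    """Timestamp-only resolution: keep whichever write is newer."""
    return x if x.timestamp >= y.timestamp else y


# State A exists; both servers start from it.
state_a = Version("A", {"one": 1}, 100.0)

# Server one produces B from A. Server two, a bit slower, produces B'
# from A without ever having seen B.
b = write(state_a, "one", "B", 101.0)
b_prime = write(state_a, "two", "B'", 101.5)

print(compare(b.clock, state_a.clock))     # B succeeds A
print(compare(b.clock, b_prime.clock))     # two successors of A
print(last_write_wins(b, b_prime).value)   # the slower write is kept
```

In this sketch the timestamp rule keeps B' purely because it arrived later, discarding B without anyone noticing. The vector-clock comparison instead reports a conflict, which leaves the decision to the application's own resolver.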