Jon, On Mon, Nov 19, 2012 at 3:28 AM, Jon Perez <jbperez...@yahoo.com> wrote: > > > Why exactly are these reasons? I suspect they have to do with performance. > If so, how exactly does performance degrade when nodes are spread out > between hosts with latencies in the tens to hundred plus milliseconds (as > is > typical over WANs)? >
Yes, latency is a big reason. Every Riak requests involves N vnodes. If those vnodes are spread across different regions with varying latencies then your deviation grows and higher percentiles go through the roof. For some this many be okay, but the wider you go the more unpredictable your latency profile becomes. Predictable latency is key to many applications built on top of Riak. Many of these apps are web-apps with tight constraints of the maximum time any request should take. Given that most webapps are made up of many components behind the scenes it is vital the each individual component deliver as predictably as possible so that the developers can have some confidence on the end-to-end latency of a request. This is a point made very clear in the Dynamo paper which heavily influenced the design of Riak. > Or does it have more to do with reliability of connections and overhead in > retrying them? I can understand that Bashio has a vested interest in > promoting Riak Enterprise for this, but it would be nice if the technical > details of why were actually laid out in detail. > This is another reason. A node that is dead is indistinguishable from a node that is simply taking a really long time to respond. Once you spread nodes across a WAN the chance for network failure, and thus network partitions, becomes much greater. Riak is designed to always be available for writes but you still want to avoid partitions as much as possible. Partitions are one of the primary causes of siblings, potentially generating lots of sibling resolution. Partitions also cause additional load to be placed on the nodes. Say you had a 6-node cluster configured as 2 3-node clusters in different data centers. If the link between the data centers goes down or becomes too slow you'd end up with a partition between the 2 3-node clusters and each would have to take on the load that was being served by the 6-node cluster. This includes things like disk space, file descriptors, open ports, memory usage, CPU usage, network utilization, etc. > From my limited experience with Riak, getting multiple nodes within a > cluster going is extremely simple, whereas going multiple clusters is a > very > different story and requires a new layer of understanding. It's too bad > that there is a distinction between nodes over "WANs" and "LANs". I guess > the holy grail of dbs is still some ways off, although Riak seems to be the > closest fit right now. > The way Riak is designed is far from the most efficient way to replicate data across a WAN. A lot of that code was/is written with assumptions of LAN and fairly predictable latency. This is one of the reasons we have a separate piece of software that performs this task. This problem is not as easy as some people think. You should checkout Andrew Thompson's RICON talk on Riak's WAN replication. http://vimeo.com/52016325 -Z
_______________________________________________ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com