Jon,

On Mon, Nov 19, 2012 at 3:28 AM, Jon Perez <jbperez...@yahoo.com> wrote:
>
>
> Why exactly are these reasons?  I suspect they have to do with performance.
> If so, how exactly does performance degrade when nodes are spread out
> between hosts with latencies in the tens to hundred plus milliseconds (as
> is
> typical over WANs)?
>

Yes, latency is a big reason.  Every Riak requests involves N vnodes.  If
those vnodes are spread across different regions with varying latencies
then your deviation grows and higher percentiles go through the roof.  For
some this many be okay, but the wider you go the more unpredictable your
latency profile becomes.  Predictable latency is key to many applications
built on top of Riak.  Many of these apps are web-apps with tight
constraints of the maximum time any request should take.  Given that most
webapps are made up of many components behind the scenes it is vital the
each individual component deliver as predictably as possible so that the
developers can have some confidence on the end-to-end latency of a request.
 This is a point made very clear in the Dynamo paper which heavily
influenced the design of Riak.


> Or does it have more to do with reliability of connections and overhead in
> retrying them?  I can understand that Bashio has a vested interest in
> promoting Riak Enterprise for this, but it would be nice if the technical
> details of why were actually laid out in detail.
>

This is another reason.  A node that is dead is indistinguishable from a
node that is simply taking a really long time to respond.  Once you spread
nodes across a WAN the chance for network failure, and thus network
partitions, becomes much greater.  Riak is designed to always be available
for writes but you still want to avoid partitions as much as possible.
 Partitions are one of the primary causes of siblings, potentially
generating lots of sibling resolution.  Partitions also cause additional
load to be placed on the nodes.  Say you had a 6-node cluster configured as
2 3-node clusters in different data centers.  If the link between the data
centers goes down or becomes too slow you'd end up with a partition between
the 2 3-node clusters and each would have to take on the load that was
being served by the 6-node cluster.  This includes things like disk space,
file descriptors, open ports, memory usage, CPU usage, network utilization,
etc.



> From my limited experience with Riak, getting multiple nodes within a
> cluster going is extremely simple, whereas going multiple clusters is a
> very
> different story and requires a new layer of understanding.  It's too bad
> that there is a distinction between nodes over "WANs" and "LANs".  I guess
> the holy grail of dbs is still some ways off, although Riak seems to be the
> closest fit right now.
>


The way Riak is designed is far from the most efficient way to replicate
data across a WAN.  A lot of that code was/is written with assumptions of
LAN and fairly predictable latency.  This is one of the reasons we have a
separate piece of software that performs this task.  This problem is not as
easy as some people think.  You should checkout Andrew Thompson's RICON
talk on Riak's WAN replication.

http://vimeo.com/52016325

-Z
_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

Reply via email to