system freezes for 2 minutes after network partition

shared mailinglists Mon, 24 Oct 2011 05:55:06 -0700

Hi,
I am investigating the feasibility of using Riak for an application where
there is a firm requirement for the system to remain available in the face
of node failures and network partitions. Up to now, I have found Riak to be
quite resilient to node failures, but network partitions are causing more
problems.


My setup involves a cluster of five nodes, each running Riak 1.0 on a
separate Linux box on a common subnet. Let's call the cluster [A,B,C,D,E].
On a sixth box, I run the Java PB Cluster Client 1.0.1. The client fires
fetch and store requests at the cluster using various keys and a single bucket,
and everything works fine under normal operating conditions. The bucket has
the default properties N=3, W=2, R=2.

During the run, I use iptables to simulate a network partition in which one
node in the cluster is disconnected from the other four (but all five remain
connected to the client). To disconnect from node A, for example, I run:

sudo /sbin/iptables -A INPUT -s <node A> -j REJECT
sudo /sbin/iptables -A OUTPUT -d <node A> -j REJECT

So at this point we have in effect a majority cluster [A,B,C,D] and a
minority cluster [E]. My expectation was that any request from the client to
the minority cluster will fail because a quorum cannot be obtained for either
read or write requests. On the other hand, I expect requests to the majority
cluster to succeed because there will always be at least two of the three
copies stored within the majority cluster. Therefore I expect very little
impact on the performance of the system, beyond the need to retry 20% of the
requests. Is this a reasonable expectation, or am I missing something
important here ?
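
To make the reasoning above concrete, here is a rough sketch in Python (not
Riak code). It assumes a simplified model: each key's three replicas sit on
three distinct nodes, a request succeeds only if at least R (or W) of those
replicas are reachable inside the partition that receives it, and the client
spreads requests evenly over the five nodes. The helper names are made up for
illustration, and Riak's real replica placement and fallback behaviour may
differ from this model.

```python
from itertools import combinations

NODES = ["A", "B", "C", "D", "E"]
N, R, W = 3, 2, 2  # bucket properties quoted in the post


def reachable_replicas(replica_nodes, partition):
    """Count the replicas that live on nodes inside the given partition."""
    return sum(1 for node in replica_nodes if node in partition)


def quorum_summary(partition):
    """Return (ok, total): how many possible replica sets still meet R and W.

    Assumption: every 3-node subset of the cluster is a possible replica set.
    """
    replica_sets = list(combinations(NODES, N))
    ok = sum(
        1
        for replica_set in replica_sets
        if reachable_replicas(replica_set, partition) >= max(R, W)
    )
    return ok, len(replica_sets)


majority = {"A", "B", "C", "D"}
minority = {"E"}

print("majority:", quorum_summary(majority))
print("minority:", quorum_summary(minority))
# Assumption: the client picks each of the five nodes equally often.
print("share of requests sent to the minority:", len(minority) / len(NODES))
```

Under these assumptions every replica set keeps at least two copies inside the
majority partition and none can reach quorum inside the minority, which is
where the 20% retry estimate comes from.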

What actually happens after the partition is that the whole system freezes
up for some time. For one minute there appears to be no processing done and
nothing is written to the logs. Then after one minute the nodes in the
majority cluster show the following in their logs:

2011-10-24 08:4:10.098 [error] <0.19280.0> ** Node 'riak@<node E>' not responding **
** Removing (timedout) connection **

The node in the minority cluster has similar log entries, except that it
finds four nodes not responding.

If the network remains partitioned, processing stops for a total of two
minutes before resuming. When the partition is repaired, as expected I see a
lot of handoff activity in the logs and the whole cluster soon becomes
consistent.

The main concern I have is that processing stops altogether for 2 minutes
when the network is partitioned. Can you explain why this is happening, and
whether there is something I can do to allow processing to continue ? As far as
I can see, it ought to be possible for the majority partition to continue
processing requests without interruption, and without any errors being
generated.

Thanks in advance for your help.

Malcolm

_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
