Hi,

We are using Cassandra 2.0.14 with Hector 1.1.4. Each node in the cluster runs a Cassandra instance and an application that uses Hector.
I would like suggestions on the approach we are taking to throttle Cassandra load.

Problem Statement:
Misbehaving clients can bring down a Cassandra cluster by putting excessive load on it. We want to prevent the cluster from being overloaded.

Solution Proposed:

1. Run a test for each application scenario involving Cassandra. Keep increasing the number of requests in each scenario until performance starts to deteriorate, and note the maximum number of connections reached during the test. For example:

   Scenario A = 60
   Scenario B = 70
   Scenario C = 90

   Set rpc_max_threads = max(all scenarios) = 90.

2. In Hector, set MaxActive connections per host = 90.

3. As Hector maintains connections PER HOST, the number of connections opened by a Hector client on a node increases with cluster size. For example:

   On a 3 node cluster, each Hector client will open a total of 90 * 3 connections.
   On a 15 node cluster, each Hector client will open a total of 90 * 15 connections.

   So we have set rpc_server_type=hsha to support a large number of client connections. We are not sure whether https://issues.apache.org/jira/i#browse/CASSANDRA-7309 is a concern.

4. At the application level, we check cluster load by ADDING UP the active connections created by Hector to EACH node of the cluster. If the total is already around 95% of (90 * number of nodes), we reject new tasks to prevent overload.

5. We see that Hector only closes idle connections when a client is borrowed from the pool, and immediately after closing an idle connection it creates a new one. So once the number of connections has increased, it seldom goes down, and the connections remain open (except in a few exception scenarios). As a result, we can't rely on the ThriftClients JMX metric exposed by Cassandra to know the number of ACTIVE connections, because ThriftClients shows open connections rather than active ones.

Questions:

Is there a better way to know the number of active Cassandra connections on a Cassandra node? Or a way to check Cassandra load so that we can stop submitting tasks when a node is already overloaded?

I am looking for suggestions on the above approach and for more ideas on throttling Cassandra load.

Thanks,
Anuj
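
P.S. To make the check in step 4 concrete, here is a rough Java sketch of what we have in mind. The class name LoadGate and the method admit are made up for illustration and are not part of Hector. The sketch assumes the per-host active connection counts have already been collected into a map, and it leaves out how those counts are read from Hector.

```java
import java.util.Map;

/**
 * Hypothetical helper (not a Hector class): decides whether a new task
 * may be submitted, based on active connections summed over all hosts.
 */
final class LoadGate {
    private final int maxActivePerHost;   // e.g. 90, same as Hector MaxActive per host
    private final double rejectFraction;  // e.g. 0.95, reject at or above this share

    LoadGate(int maxActivePerHost, double rejectFraction) {
        if (maxActivePerHost <= 0) {
            throw new IllegalArgumentException("maxActivePerHost must be positive");
        }
        if (rejectFraction <= 0.0 || rejectFraction > 1.0) {
            throw new IllegalArgumentException("rejectFraction must be in (0, 1]");
        }
        this.maxActivePerHost = maxActivePerHost;
        this.rejectFraction = rejectFraction;
    }

    /**
     * @param activePerHost active connection count keyed by host name
     *                      (assumed to be gathered elsewhere)
     * @return true if the task may proceed, false if it should be rejected
     */
    boolean admit(Map<String, Integer> activePerHost) {
        long total = 0;
        for (int active : activePerHost.values()) {
            total += active;
        }
        // Cluster-wide ceiling: per-host limit times number of known hosts.
        long capacity = (long) maxActivePerHost * activePerHost.size();
        // With no known hosts, capacity is 0 and the task is rejected.
        return total < rejectFraction * capacity;
    }
}
```

With maxActivePerHost = 90 and rejectFraction = 0.95 on a 3 node cluster, the cutoff is 0.95 * 270 = 256.5, so a total of 255 active connections is admitted and a total of 257 is rejected.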