[jira] [Created] (CASSANDRA-5218) Log explosion when another cluster node is down and remaining node is overloaded.

2013-02-04 Thread Sergey Olefir (JIRA)

Sergey Olefir created CASSANDRA-5218:

Summary: Log explosion when another cluster node is down and remaining node is overloaded.
Key: CASSANDRA-5218
URL: https://issues.apache.org/jira/browse/CASSANDRA-5218
Project: Cassandra
Issue Type: Bug
Affects Versions: 1.1.7
Reporter: Sergey Olefir

I have a Cassandra 1.1.7 cluster with 4 nodes in 2 datacenters (2+2). Replication
is configured as DC1:2,DC2:2 (i.e. every node holds the entire data set).

I am load-testing counter increments at a rate of about 10k per second. All
writes are directed to the two nodes in DC1 (the DC2 nodes are basically backup).
In total there are 100 separate clients, each executing 1-2 batch updates per second.

We wanted to test what happens if one node goes down, so we brought one node
down in DC1 (i.e. the node that was handling half of the incoming writes).

This led to a complete explosion of logs on the remaining alive node in DC1.

Hundreds of megabytes of logs accumulate within an hour, all saying essentially
the same thing:
ERROR [ReplicateOnWriteStage:5653390] 2013-01-22 12:44:33,611 AbstractCassandraDaemon.java (line 135) Exception in thread Thread[ReplicateOnWriteStage:5653390,5,main]
java.lang.RuntimeException: java.util.concurrent.TimeoutException
    at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1275)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.util.concurrent.TimeoutException
    at org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:311)
    at org.apache.cassandra.service.StorageProxy$7$1.runMayThrow(StorageProxy.java:585)
    at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1271)
    ... 3 more

The logs are completely swamped with this and are thus unusable. The logging
volume may also degrade the node's performance.

According to Aaron Morton:
{quote}The error is the coordinator node protecting itself.

Basically it cannot handle the volume of local writes + the writes for HH. The
number of in flight hints is greater than…

private static volatile int maxHintsInProgress = 1024 * Runtime.getRuntime().availableProcessors();{quote}
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/node-down-log-explosion-tp7584932p7584957.html
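
To illustrate the mechanism described in the quote, here is a minimal sketch of load shedding with an in-flight counter. Only the limit formula (1024 per available processor) comes from the quoted field. The class HintLimiter and its methods are hypothetical names made up for this example; this is not Cassandra's actual StorageProxy code.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of load shedding with an in-flight counter; not Cassandra code. */
public class HintLimiter {
    private final int maxInProgress;
    private final AtomicInteger inProgress = new AtomicInteger();

    public HintLimiter(int maxInProgress) {
        this.maxInProgress = maxInProgress;
    }

    /** Same formula as the quoted field: 1024 per available processor. */
    public static int defaultLimit() {
        return 1024 * Runtime.getRuntime().availableProcessors();
    }

    /**
     * Tries to reserve a slot for one in-flight hint.
     * Returns false when the limit is already reached, in which case the
     * caller is expected to shed the request (in the trace above this
     * appears to surface as a TimeoutException).
     */
    public boolean tryAcquire() {
        if (inProgress.incrementAndGet() > maxInProgress) {
            inProgress.decrementAndGet(); // undo the reservation
            return false;
        }
        return true;
    }

    /** Frees a slot once the hint has been written or abandoned. */
    public void release() {
        inProgress.decrementAndGet();
    }

    public int inProgress() {
        return inProgress.get();
    }
}
```

With this shape, every rejected request produces one failure, so a sustained overload at 10k writes per second can produce a matching flood of identical exceptions.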

I think there are two issues here:
(a) the same exception occurring for the same reason doesn't need to be spammed
into the log many times per second;
(b) the exception message ought to be clearer about the cause -- i.e. in this case
some message about "overload" or "load shedding" might be appropriate.
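
As a rough illustration of point (a), the sketch below throttles repeated messages per key and reports how many were suppressed in between. The class LogThrottle and its method are hypothetical names for this example, and the time-window approach is only one possible way to do it, not something taken from Cassandra.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: emit at most one log line per key per interval. */
public class LogThrottle {
    private static final class State {
        long lastEmitMillis;
        long suppressed;
    }

    private final long intervalMillis;
    private final Map<String, State> states = new HashMap<>();

    public LogThrottle(long intervalMillis) {
        this.intervalMillis = intervalMillis;
    }

    /**
     * Returns -1 if a message for this key should be suppressed. Otherwise
     * returns the number of messages suppressed since the last emitted one
     * (0 for the first message), so the caller can log a single line such as
     * "overloaded, shedding load (N similar errors suppressed)".
     */
    public synchronized long tryEmit(String key, long nowMillis) {
        State s = states.get(key);
        if (s == null) {
            s = new State();
            s.lastEmitMillis = nowMillis;
            states.put(key, s);
            return 0;
        }
        if (nowMillis - s.lastEmitMillis < intervalMillis) {
            s.suppressed++;
            return -1;
        }
        long skipped = s.suppressed;
        s.suppressed = 0;
        s.lastEmitMillis = nowMillis;
        return skipped;
    }
}
```

A caller would pass something like the exception class name plus the throwing stage as the key, and System.currentTimeMillis() as the timestamp. Taking the time as a parameter keeps the sketch easy to test.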

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Created] (CASSANDRA-5080) cassandra-cli doesn't support JMX authentication.

2012-12-20 Thread Sergey Olefir (JIRA)

Sergey Olefir created CASSANDRA-5080:


Summary: cassandra-cli doesn't support JMX authentication.
Key: CASSANDRA-5080
URL: https://issues.apache.org/jira/browse/CASSANDRA-5080
Project: Cassandra
Issue Type: Bug
Components: Tools
Affects Versions: 1.1.7, 1.1.6
Reporter: Sergey Olefir


It seems that cassandra-cli doesn't support JMX user authentication.

Specifically, I set out to secure our Cassandra cluster a little: I added
Cassandra-level authentication (which cassandra-cli does support), but then
discovered that nodetool was still completely unprotected. So I went ahead and
secured JMX (via -Dcom.sun.management.jmxremote.password.file and
-Dcom.sun.management.jmxremote.access.file). Nodetool supports JMX
authentication via the -u and -pw options.

However it seems that cassandra-cli doesn't support JMX authentication, e.g.:
{quote}
apache-cassandra-1.1.6\bin>cassandra-cli -h hostname -u experiment -pw password
Starting Cassandra Client
Connected to: "db" on hostname/9160
Welcome to Cassandra CLI version 1.1.6

[experiment@unknown] show keyspaces;
WARNING: Could not connect to the JMX on hostname:7199, information won't be shown.

Keyspace: system:
  Replication Strategy: org.apache.cassandra.locator.LocalStrategy
  Durable Writes: true
Options: [replication_factor:1]
... (rest of keyspace output snipped)
{quote}

Neither "help connect;" nor "cassandra-cli --help" seems to indicate that there is
any way to specify JMX login information.
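
For reference, a JMX client written in Java normally passes the login through the connection environment under the standard JMXConnector.CREDENTIALS key. The sketch below shows that call using the usual RMI service URL form and the port 7199 from the output above. The class name JmxAuthExample and its helper methods are hypothetical, and this is not cassandra-cli's code, only an indication of what a fix would probably need to do.

```java
import java.util.HashMap;
import java.util.Map;
import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

/** Hypothetical sketch of an authenticated JMX connection. */
public class JmxAuthExample {

    /** Builds the conventional RMI-based JMX service URL for host:port. */
    public static String serviceUrl(String host, int port) {
        return "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi";
    }

    /** Puts user and password into the environment map the connector expects. */
    public static Map<String, Object> credentialsEnv(String user, String password) {
        Map<String, Object> env = new HashMap<>();
        env.put(JMXConnector.CREDENTIALS, new String[] { user, password });
        return env;
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 4) {
            System.err.println("usage: JmxAuthExample <host> <port> <user> <password>");
            return;
        }
        JMXServiceURL url = new JMXServiceURL(serviceUrl(args[0], Integer.parseInt(args[1])));
        // Without the credentials entry, a secured JMX port rejects the connection.
        try (JMXConnector connector = JMXConnectorFactory.connect(url, credentialsEnv(args[2], args[3]))) {
            MBeanServerConnection mbeans = connector.getMBeanServerConnection();
            System.out.println("Connected, MBean count: " + mbeans.getMBeanCount());
        }
    }
}
```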

