Permanent ReadTimeout

Ja Sam Mon, 12 Jan 2015 06:37:53 -0800

*Environment*


   - Cassandra 2.1.0
   - 5 nodes in one DC (DC_A), 4 nodes in second DC (DC_B)
   - 2500 writes per second; I write only to DC_A with LOCAL_QUORUM
   - minimal reads (usually none, sometimes a few)

*Problem*

After a few weeks of running, I cannot read any data from my cluster
because I get a ReadTimeoutException like the following:

ERROR [Thrift:15] 2015-01-07 14:16:21,124
CustomTThreadPoolServer.java:219 - Error occurred during processing of
message.
com.google.common.util.concurrent.UncheckedExecutionException:
java.lang.RuntimeException:
org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed
out - received only 2 responses.

To be precise, this is not the only problem in my cluster. The second one is
described here: Cassandra GC takes 30 seconds and hangs node
<http://stackoverflow.com/questions/27843538/cassandra-gc-takes-30-seconds-and-hangs-node>
and I will try the fix from CASSANDRA-6541
<http://issues.apache.org/jira/browse/CASSANDRA-6541>, as leshkin suggested.

*Diagnosis*

I tried some of the tools presented by Jon Haddad at
http://rustyrazorblade.com/2014/09/cassandra-summit-recap-diagnosing-problems-in-production/
and got some strange results.


I ran the same query in DC_A and DC_B with tracing enabled. The query is
simple:

   SELECT * FROM X.customer_events
   WHERE customer='1234567' AND utc_day=16447
     AND bucket IN (1,2,3,4,5,6,7,8,9,10);

The table is defined as follows:

  CREATE TABLE drev_maelstrom.customer_events (
    customer text,
    utc_day int,
    bucket int,
    event_time bigint,
    event_id blob,
    event_type int,
    event blob,
    PRIMARY KEY ((customer, utc_day, bucket), event_time, event_id, event_type)[...]
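
Since the partition key includes bucket, every value in the IN list addresses a separate partition. A minimal Python sketch of how the query text could be generated for an arbitrary set of buckets follows. The helper name build_events_query is hypothetical, and the %s placeholders assume a client that binds customer and utc_day separately (as the DataStax Python driver does for simple statements):

```python
def build_events_query(buckets):
    """Return the CQL text for one customer/day across the given buckets.

    customer and utc_day are left as %s placeholders to be bound by the
    client; the bucket values are integers and are inlined in the IN list.
    """
    bucket_list = ", ".join(str(int(b)) for b in buckets)
    return (
        "SELECT * FROM X.customer_events "
        "WHERE customer = %s AND utc_day = %s "
        f"AND bucket IN ({bucket_list})"
    )


# The ten buckets used in the query above.
query = build_events_query(range(1, 11))
print(query)
```

Each additional bucket in the list is one more partition the coordinator has to read, so the list length directly affects the cost of the query.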

Results of the query:

1) In DC_B the query finished in less than 0.22 seconds; in DC_A it took more
than 2.5 seconds (~10 times longer). -> the problem is that bucket can range
from -128 to 256

2) In DC_B it checked ~1000 SSTables, with trace lines like:

   Bloom filter allows skipping sstable 50372 [SharedPool-Worker-7] | 2015-01-12 13:51:49.467001 | 192.168.71.198 |           4782

while in DC_A they look like:

   Bloom filter allows skipping sstable 118886 [SharedPool-Worker-5] | 2015-01-12 14:01:39.520001 | 192.168.61.199 |          25527

3) The total number of records was the same in both DCs.
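
To compare the two DCs more systematically, the "Bloom filter allows skipping sstable" rows can be counted per node from saved trace output. The following is only a rough Python sketch: it assumes the trace was copied from cqlsh with one row per line and the columns in the order activity | timestamp | source | source_elapsed, and the function name count_bloom_skips is made up for illustration:

```python
from collections import Counter

SKIP_MARKER = "Bloom filter allows skipping sstable"


def count_bloom_skips(trace_text):
    """Count bloom-filter skip rows per source node in cqlsh trace text."""
    skips = Counter()
    for line in trace_text.splitlines():
        if SKIP_MARKER not in line:
            continue
        # Assumed column order: activity | timestamp | source | source_elapsed
        columns = [c.strip() for c in line.split("|")]
        source = columns[2] if len(columns) >= 4 else "unknown"
        skips[source] += 1
    return skips


# The two trace rows quoted above, one per DC.
sample = (
    "Bloom filter allows skipping sstable 50372 [SharedPool-Worker-7] | "
    "2015-01-12 13:51:49.467001 | 192.168.71.198 | 4782\n"
    "Bloom filter allows skipping sstable 118886 [SharedPool-Worker-5] | "
    "2015-01-12 14:01:39.520001 | 192.168.61.199 | 25527\n"
)
print(count_bloom_skips(sample))
```

Running this over the full trace from each DC would show how many SSTables each node had to consider for the same query.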


*Question*

The question is quite simple: how can I speed up DC_A? It is my primary
DC; DC_B is mostly for backup, and there are a lot of network partitions
between A and B.

Maybe I should check something more, but I have no idea what it
should be.
