Hi

My system is a 4-node, 64-bit Cassandra cluster with a 6 GB heap per node,
default configuration (which means 1/3 of the heap for memtables),
replication factor 3, writes at consistency level ALL, reads at ONE.
When I run a stress load test, I get TimedOutException, some operations
fail, and all traffic hangs for a while.

When I ran a 32-bit Cassandra with 1 GB of memory in standalone mode, I
did not see "stop the world" behavior this frequently.

So I wonder which kinds of operations can hang a Cassandra system.

How can I collect information for tuning?

From the system log and the documentation, I guess there are three types
of operations:
1) Flushing a memtable when it reaches its maximum size
2) Compacting SSTables (why?)
3) Java GC
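
One way to see how often each of these three kinds of events occurs is to count the matching system.log entries per minute and compare the busy minutes with the times of the client timeouts. The following is only a minimal sketch: the marker substrings are copied from the log excerpts in this message, the category names and the function name are made up for illustration, and it assumes each entry sits on a single line in the real system.log (the excerpts here are wrapped by the mail client). The compaction sample line is shortened to one SSTable.

```python
import re
from collections import Counter

# Substrings copied from the log excerpts in this message.
# The category names ("flush", "compaction", "gc") are labels chosen for this sketch.
MARKERS = {
    "flush": "Enqueuing flush of",
    "compaction": "Compacting",
    "gc": "GC for ",
}

# Captures the timestamp down to the minute, e.g. "2012-05-28 08:02".
MINUTE = re.compile(r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2},\d{3}")

def events_per_minute(lines):
    """Count flush, compaction and GC log entries per (minute, kind)."""
    counts = Counter()
    for line in lines:
        stamp = MINUTE.search(line)
        if stamp is None:
            continue  # continuation line or unrelated text
        for kind, marker in MARKERS.items():
            if marker in line:
                counts[(stamp.group(1), kind)] += 1
                break
    return counts

# Sample entries taken from this message, joined onto one line each.
sample_log = [
    " INFO [main] 2012-05-25 16:12:17,054 ColumnFamilyStore.java (line 688) Enqueuing flush of Memtable-LocationInfo@1229893321(53/66 serialized/live bytes, 2 ops)",
    " INFO [CompactionExecutor:441] 2012-05-28 08:02:55,345 CompactionTask.java (line 112) Compacting [SSTableReader(path='/var/proclog/raw/cassandra/data/myks/queue-hb-41-Data.db')]",
    " INFO [ScheduledTasks:1] 2012-05-28 08:02:54,980 GCInspector.java (line 123) GC for ConcurrentMarkSweep: 728 ms for 2 collections, 3594946600 used; max is 6274678784",
    " INFO [ScheduledTasks:1] 2012-05-28 08:41:34,030 GCInspector.java (line 123) GC for ParNew: 1668 ms for 1 collections, 4171503448 used; max is 6274678784",
    " INFO [ScheduledTasks:1] 2012-05-28 08:41:48,978 GCInspector.java (line 123) GC for ParNew: 1087 ms for 1 collections, 2623067496 used; max is 6274678784",
]

for (minute, kind), n in sorted(events_per_minute(sample_log).items()):
    print(minute, kind, n)
```

To run it against a real log, pass an open file object instead of the sample list, for example `events_per_minute(open("system.log"))`, where the file path is whatever the node actually uses.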

system.log:
 INFO [main] 2012-05-25 16:12:17,054 ColumnFamilyStore.java (line 688)
Enqueuing flush of Memtable-LocationInfo@1229893321(53/66 serialized/live
bytes, 2 ops)
 INFO [FlushWriter:1] 2012-05-25 16:12:17,054 Memtable.java (line 239)
Writing Memtable-LocationInfo@1229893321(53/66 serialized/live bytes, 2 ops)
 INFO [FlushWriter:1] 2012-05-25 16:12:17,166 Memtable.java (line 275)
Completed flushing
/var/proclog/raw/cassandra/data/system/LocationInfo-hb-2-Data.db (163 bytes)
...

 INFO [CompactionExecutor:441] 2012-05-28 08:02:55,345 CompactionTask.java
(line 112) Compacting
[SSTableReader(path='/var/proclog/raw/cassandra/data/myks/queue-hb-41-Data.db'),
SSTableReader(path='/var/proclog/raw/cassandra/data/myks/queue-hb-32-Data.db'),
SSTableReader(path='/var/proclog/raw/cassandra/data/myks/queue-hb-37-Data.db'),
SSTableReader(path='/var/proclog/raw/cassandra/data/myks/queue-hb-53-Data.db')]
...

 WARN [ScheduledTasks:1] 2012-05-28 08:02:26,619 GCInspector.java (line
146) Heap is 0.7993011015621736 full.  You may need to reduce memtable
and/or cache sizes.  Cassandra will now flush up to the two largest
memtables to free up memory.  Adjust flush_largest_memtables_at threshold
in cassandra.yaml if you don't want Cassandra to do this automatically
 INFO [ScheduledTasks:1] 2012-05-28 08:02:54,980 GCInspector.java (line
123) GC for ConcurrentMarkSweep: 728 ms for 2 collections, 3594946600 used;
max is 6274678784
 INFO [ScheduledTasks:1] 2012-05-28 08:41:34,030 GCInspector.java (line
123) GC for ParNew: 1668 ms for 1 collections, 4171503448 used; max is
6274678784
 INFO [ScheduledTasks:1] 2012-05-28 08:41:48,978 GCInspector.java (line
123) GC for ParNew: 1087 ms for 1 collections, 2623067496 used; max is
6274678784
 INFO [ScheduledTasks:1] 2012-05-28 08:41:48,987 GCInspector.java (line
123) GC for ConcurrentMarkSweep: 3198 ms for 3 collections, 2623361280
used; max is 6274678784
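
The GCInspector lines above already carry the numbers needed for a first look at GC behavior: total collection time, number of collections, heap used afterwards, and maximum heap. The following is a minimal sketch that extracts them, assuming the line format shown in this message and one entry per line in the real system.log; the function and field names are made up for illustration. Note that ParNew collections stop the application, while ConcurrentMarkSweep runs largely concurrently, so the time reported for it may not all be pause time.

```python
import re

# Matches GCInspector summary lines in the format shown in this message.
GC_LINE = re.compile(
    r"GC for (?P<collector>\w+): (?P<ms>\d+) ms for (?P<count>\d+) collections, "
    r"(?P<used>\d+) used; max is (?P<max>\d+)"
)

def parse_gc_lines(lines):
    """Return one dict per GCInspector summary line found in `lines`."""
    events = []
    for line in lines:
        m = GC_LINE.search(line)
        if m is None:
            continue  # not a GC summary line
        ms, count = int(m["ms"]), int(m["count"])
        used, heap_max = int(m["used"]), int(m["max"])
        events.append({
            "collector": m["collector"],
            "total_ms": ms,
            "collections": count,
            # Average time per collection; guards against a zero count.
            "ms_per_collection": ms / count if count else 0.0,
            # Fraction of the maximum heap still in use after the collection.
            "heap_fraction": used / heap_max,
        })
    return events

# The four GC entries from this message, joined onto one line each.
sample_gc = [
    " INFO [ScheduledTasks:1] 2012-05-28 08:02:54,980 GCInspector.java (line 123) GC for ConcurrentMarkSweep: 728 ms for 2 collections, 3594946600 used; max is 6274678784",
    " INFO [ScheduledTasks:1] 2012-05-28 08:41:34,030 GCInspector.java (line 123) GC for ParNew: 1668 ms for 1 collections, 4171503448 used; max is 6274678784",
    " INFO [ScheduledTasks:1] 2012-05-28 08:41:48,978 GCInspector.java (line 123) GC for ParNew: 1087 ms for 1 collections, 2623067496 used; max is 6274678784",
    " INFO [ScheduledTasks:1] 2012-05-28 08:41:48,987 GCInspector.java (line 123) GC for ConcurrentMarkSweep: 3198 ms for 3 collections, 2623361280 used; max is 6274678784",
]

for e in parse_gc_lines(sample_gc):
    print(f'{e["collector"]:20s} {e["ms_per_collection"]:8.1f} ms/collection '
          f'heap {e["heap_fraction"]:.0%} used')
```

ParNew collections that take longer than a second, as in the two 08:41 entries, are worth comparing against the timestamps of the TimedOutException failures on the client side.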


Timeout Exception:
Caused by: org.apache.cassandra.thrift.TimedOutException: null
        at
org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:19495)
~[na:na]
        at
org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:1035)
~[na:na]
        at
org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:1009)
~[na:na]
        at
me.prettyprint.cassandra.service.KeyspaceServiceImpl$1.execute(KeyspaceServiceImpl.java:95)
~[na:na]
        ... 64 common frames omitted

BRs
//Tang Weiqiang
