Hi, I am running a load test against a 5-node Cassandra cluster (EC2, single region, 15 GB RAM per node, Cassandra 2.0.6, replication factor 3). My Java program uses Java driver version 2.0.6 and runs 2000 rounds of batch write queries, each containing 8 inserts, 8 updates and 8 deletes.

When I run the test, I usually see very high CPU usage on some of the nodes, and sometimes I get a timeout:

com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during write query at consistency ONE (1 replica were required but only 0 acknowledged the write)

My cluster has a 10000 ms write timeout limit. I did some CPU sampling with VisualVM and noticed that the CPU spends the majority of its time inside org.apache.cassandra.db.marshal.AbstractCompositeType.compare(). In the snapshot, the call tree looks like this most of the time:

java.lang.Thread.State: RUNNABLE
    at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:98)
    at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:35)
    at java.util.Arrays.binarySearch0(Arrays.java:1585)
    at java.util.Arrays.binarySearch(Arrays.java:1570)
    at org.apache.cassandra.db.RangeTombstoneList.searchInternal(RangeTombstoneList.java:236)
    at org.apache.cassandra.db.RangeTombstoneList.isDeleted(RangeTombstoneList.java:210)
    at org.apache.cassandra.db.DeletionInfo.isDeleted(DeletionInfo.java:136)
    at org.apache.cassandra.db.DeletionInfo.isDeleted(DeletionInfo.java:123)
    at org.apache.cassandra.db.AtomicSortedColumns.addAllWithSizeDelta(AtomicSortedColumns.java:193)
    at org.apache.cassandra.db.Memtable.resolve(Memtable.java:194)
    at org.apache.cassandra.db.Memtable.put(Memtable.java:158)
    at org.apache.cassandra.db.ColumnFamilyStore.apply(ColumnFamilyStore.java:891)
    at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:368)
    at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:333)
    at org.apache.cassandra.db.RowMutation.apply(RowMutation.java:206)
    at org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:56)
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:60)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

So it looks related to tombstone search, even though I am not seeing any tombstone warning messages in Cassandra's system log.

Per http://www.datastax.com/dev/blog/how-cassandra-deals-with-replica-failure, a write timeout is not really an error. But with my load and a 10-second write timeout limit, hitting the timeout at all seems problematic.

Do you have any hints or pointers on how I should approach this issue?

Thanks.
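
P.S. In case it helps to picture the workload, below is a stripped-down sketch of the shape of one batch. It is not my actual test code: the table name ks.load_test, the columns pk, ck and val, and the choice of which rows each statement touches are all placeholders, and I am assuming a CQL BEGIN BATCH here purely for illustration. The sketch only builds the CQL text and does not call the driver.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchSketch {
    // Hypothetical table name, used only for illustration.
    static final String TABLE = "ks.load_test";

    /** Builds the CQL text of one batch: 8 inserts, 8 updates, 8 deletes. */
    static String buildBatch(int round) {
        List<String> stmts = new ArrayList<>();
        // Placeholder key choice: insert into this round's partition,
        // update the previous round's rows, delete the rows before that.
        for (int i = 0; i < 8; i++) {
            stmts.add(String.format(
                "INSERT INTO %s (pk, ck, val) VALUES (%d, %d, 'new');", TABLE, round, i));
        }
        for (int i = 0; i < 8; i++) {
            stmts.add(String.format(
                "UPDATE %s SET val = 'changed' WHERE pk = %d AND ck = %d;", TABLE, round - 1, i));
        }
        for (int i = 0; i < 8; i++) {
            stmts.add(String.format(
                "DELETE FROM %s WHERE pk = %d AND ck = %d;", TABLE, round - 2, i));
        }
        StringBuilder sb = new StringBuilder("BEGIN BATCH\n");
        for (String s : stmts) {
            sb.append("  ").append(s).append('\n');
        }
        return sb.append("APPLY BATCH;").toString();
    }

    public static void main(String[] args) {
        // The real test runs 2000 rounds; print a single batch here.
        System.out.println(buildBatch(2));
    }
}
```

The real test sends 2000 such batches through the Java driver, so each round adds 8 more deletes on top of the inserts and updates.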