[ https://issues.apache.org/jira/browse/CASSANDRA-11814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Terry Ma updated CASSANDRA-11814:
---------------------------------

Description:

We have a Cassandra cluster of 16 nodes (m4.2xlarge) on AWS. The total data size is currently 2.5 TB. There is a very large table named inbox_musical with the following schema:

{code}
CREATE TABLE feeds.inbox_musical (
    owner_id bigint,
    activity_id bigint,
    insert_time timestamp,
    PRIMARY KEY (owner_id, activity_id)
) WITH CLUSTERING ORDER BY (activity_id DESC)
    AND bloom_filter_fp_chance = 0.1
    AND caching = '{"keys":"ALL", "rows_per_partition":"20"}'
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 15552000
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99.0PERCENTILE';
{code}

owner_id is the partition key and activity_id is the clustering key. 95% of queries return within 10 ms, but a few reads time out (> 5 s). The slow queries look like this:

{code:sql}
SELECT activity_id FROM inbox_musical WHERE owner_id=5582322 ORDER BY activity_id DESC LIMIT 100;
{code}

There are only 230 rows in this partition, yet the query takes seconds to complete.
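For reference, the trace below was captured with per-statement tracing enabled. A minimal sketch of reproducing that with the DataStax Java driver 2.x (the driver version and contact point are assumptions, not part of the original report):

{code:java}
import com.datastax.driver.core.*;

// Hedged sketch: run the slow query with tracing enabled and dump the
// trace events. Assumes the DataStax Java driver 2.x API.
public class TraceSlowRead {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("172.31.8.188").build(); // assumed contact point
             Session session = cluster.connect("feeds")) {
            Statement stmt = new SimpleStatement(
                    "SELECT activity_id FROM inbox_musical WHERE owner_id=5582322 " +
                    "ORDER BY activity_id DESC LIMIT 100");
            stmt.enableTracing();
            ResultSet rs = session.execute(stmt);
            QueryTrace trace = rs.getExecutionInfo().getQueryTrace();
            System.out.printf("trace %s, total %d us%n", trace.getTraceId(), trace.getDurationMicros());
            for (QueryTrace.Event e : trace.getEvents())
                System.out.printf("%s | %s | %d%n",
                        e.getDescription(), e.getSource(), e.getSourceElapsedMicros());
        }
    }
}
{code}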
The query tracing:

{code}
activity | timestamp | source | source_elapsed
---------+-----------+--------+----------------
Execute CQL3 query | 2016-05-17 11:21:26.986000 | 172.31.8.188 | 0
READ message received from /172.31.8.188 [MessagingService-Incoming-/172.31.8.188] | 2016-05-17 11:21:26.985000 | 172.31.15.14 | 13
Executing single-partition query on inbox_musical [SharedPool-Worker-4] | 2016-05-17 11:21:26.985000 | 172.31.15.14 | 115
READ message received from /172.31.8.188 [MessagingService-Incoming-/172.31.8.188] | 2016-05-17 11:21:26.986000 | 172.31.4.135 | 13
Acquiring sstable references [SharedPool-Worker-4] | 2016-05-17 11:21:26.986000 | 172.31.15.14 | 137
Executing single-partition query on inbox_musical [SharedPool-Worker-1] | 2016-05-17 11:21:26.986000 | 172.31.4.135 | 113
Merging memtable tombstones [SharedPool-Worker-4] | 2016-05-17 11:21:26.986000 | 172.31.15.14 | 161
Parsing SELECT activity_id FROM inbox_musical WHERE owner_id=5582322 ORDER BY activity_id DESC LIMIT 100; [SharedPool-Worker-3] | 2016-05-17 11:21:26.986000 | 172.31.8.188 | 66
Acquiring sstable references [SharedPool-Worker-1] | 2016-05-17 11:21:26.986000 | 172.31.4.135 | 136
Key cache hit for sstable 18948 [SharedPool-Worker-4] | 2016-05-17 11:21:26.986000 | 172.31.15.14 | 199
Preparing statement [SharedPool-Worker-3] | 2016-05-17 11:21:26.986000 | 172.31.8.188 | 99
Merging memtable tombstones [SharedPool-Worker-1] | 2016-05-17 11:21:26.986000 | 172.31.4.135 | 155
Seeking to partition beginning in data file [SharedPool-Worker-4] | 2016-05-17 11:21:26.986000 | 172.31.15.14 | 205
Read-repair DC_LOCAL [SharedPool-Worker-3] | 2016-05-17 11:21:26.986000 | 172.31.8.188 | 142
Bloom filter allows skipping sstable 20159 [SharedPool-Worker-1] | 2016-05-17 11:21:26.986000 | 172.31.4.135 | 179
Key cache hit for sstable 18924 [SharedPool-Worker-4] | 2016-05-17 11:21:26.986000 | 172.31.15.14 | 459
reading data from /172.31.4.135 [SharedPool-Worker-3] | 2016-05-17 11:21:26.986000 | 172.31.8.188 | 160
Key cache hit for sstable 20134 [SharedPool-Worker-1] | 2016-05-17 11:21:26.986000 | 172.31.4.135 | 195
Seeking to partition beginning in data file [SharedPool-Worker-4] | 2016-05-17 11:21:26.986000 | 172.31.15.14 | 468
reading data from /172.31.15.14 [SharedPool-Worker-3] | 2016-05-17 11:21:26.986000 | 172.31.8.188 | 212
Seeking to partition beginning in data file [SharedPool-Worker-1] | 2016-05-17 11:21:26.986001 | 172.31.4.135 | 202
Key cache hit for sstable 18547 [SharedPool-Worker-4] | 2016-05-17 11:21:26.986001 | 172.31.15.14 | 782
Sending READ message to /172.31.4.135 [MessagingService-Outgoing-/172.31.4.135] | 2016-05-17 11:21:26.986001 | 172.31.8.188 | 249
Key cache hit for sstable 19782 [SharedPool-Worker-1] | 2016-05-17 11:21:26.986001 | 172.31.4.135 | 376
Seeking to partition beginning in data file [SharedPool-Worker-4] | 2016-05-17 11:21:26.986001 | 172.31.15.14 | 792
Sending READ message to /172.31.15.14 [MessagingService-Outgoing-/172.31.15.14] | 2016-05-17 11:21:26.986001 | 172.31.8.188 | 260
Seeking to partition beginning in data file [SharedPool-Worker-1] | 2016-05-17 11:21:26.986001 | 172.31.4.135 | 388
Skipped 0/3 non-slice-intersecting sstables, included 0 due to tombstones [SharedPool-Worker-4] | 2016-05-17 11:21:26.986001 | 172.31.15.14 | 963
Key cache hit for sstable 16500 [SharedPool-Worker-1] | 2016-05-17 11:21:26.986001 | 172.31.4.135 | 477
Merging data from memtables and 3 sstables [SharedPool-Worker-4] | 2016-05-17 11:21:26.986001 | 172.31.15.14 | 973
Seeking to partition beginning in data file [SharedPool-Worker-1] | 2016-05-17 11:21:26.986001 | 172.31.4.135 | 484
Skipped 0/4 non-slice-intersecting sstables, included 0 due to tombstones [SharedPool-Worker-1] | 2016-05-17 11:21:26.986001 | 172.31.4.135 | 573
Merging data from memtables and 3 sstables [SharedPool-Worker-1] | 2016-05-17 11:21:26.986001 | 172.31.4.135 | 582
Read 101 live and 0 tombstone cells [SharedPool-Worker-1] | 2016-05-17 11:21:31.805000 | 172.31.4.135 | 819660
Enqueuing response to /172.31.8.188 [SharedPool-Worker-1] | 2016-05-17 11:21:31.831000 | 172.31.4.135 | 844972
Sending REQUEST_RESPONSE message to /172.31.8.188 [MessagingService-Outgoing-/172.31.8.188] | 2016-05-17 11:21:31.868000 | 172.31.4.135 | 882380
REQUEST_RESPONSE message received from /172.31.4.135 [MessagingService-Incoming-/172.31.4.135] | 2016-05-17 11:21:31.962000 | 172.31.8.188 | 976119
Processing response from /172.31.4.135 [SharedPool-Worker-2] | 2016-05-17 11:21:31.962000 | 172.31.8.188 | 976166
Request complete | 2016-05-17 11:21:32.014093 | 172.31.8.188 | 28093
{code}
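The interesting part of the trace is the jump in source_elapsed on 172.31.4.135 from 582 µs at "Merging data from memtables and 3 sstables" to 819660 µs at "Read 101 live and 0 tombstone cells": nearly all of the time is spent inside the merge, not in I/O setup or messaging. A hedged sketch of pulling the raw events out of system_traces.events and printing per-source gaps to localize such stalls (the session UUID is a placeholder, to be taken from the trace id):

{code:java}
import com.datastax.driver.core.*;
import java.util.*;

// Sketch: read the raw trace events for one tracing session from
// system_traces.events and print the gap between consecutive
// source_elapsed values per source. The session UUID is a placeholder.
public class TraceGaps {
    public static void main(String[] args) {
        UUID sessionId = UUID.fromString("00000000-0000-0000-0000-000000000000"); // placeholder
        try (Cluster cluster = Cluster.builder().addContactPoint("172.31.8.188").build(); // assumed contact point
             Session session = cluster.connect()) {
            Map<String, Integer> lastElapsed = new HashMap<>();
            for (Row row : session.execute(
                    "SELECT activity, source, source_elapsed FROM system_traces.events WHERE session_id = ?",
                    sessionId)) {
                String source = row.getInet("source").getHostAddress();
                int elapsed = row.getInt("source_elapsed");
                int gap = elapsed - lastElapsed.getOrDefault(source, 0);
                lastElapsed.put(source, elapsed);
                System.out.printf("%8d us (+%8d) %s  %s%n",
                        elapsed, gap, source, row.getString("activity"));
            }
        }
    }
}
{code}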
I used jstack to capture some thread dumps while Cassandra was executing queries like the one above, and it appears Cassandra spends that time in the following methods:

{code}
"SharedPool-Worker-5" #221 daemon prio=5 os_prio=0 tid=0x00007f177517a000 nid=0x8b6 runnable [0x00007f1471e5b000]
   java.lang.Thread.State: RUNNABLE
    at org.apache.cassandra.db.RangeTombstoneList.addInternal(RangeTombstoneList.java:689)
    at org.apache.cassandra.db.RangeTombstoneList.insertFrom(RangeTombstoneList.java:649)
    at org.apache.cassandra.db.RangeTombstoneList.add(RangeTombstoneList.java:170)
    at org.apache.cassandra.db.RangeTombstoneList.add(RangeTombstoneList.java:142)
    at org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:239)
    at org.apache.cassandra.db.ArrayBackedSortedColumns.delete(ArrayBackedSortedColumns.java:490)
    at org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:153)
    at org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:185)
    at org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:157)
    at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:146)
    at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:125)
    at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:99)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
    at org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:264)
    at org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:109)
    at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:82)
    at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:69)
    at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:317)
    at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:61)
    at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:2011)
    at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1815)
    at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:360)
    at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:85)
{code}

I executed these queries while the servers were under very low load.
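The hot frames are RangeTombstoneList.insertFrom/addInternal, which insert each incoming range tombstone into a sorted array. A standalone illustration (not Cassandra code; the access pattern is my assumption) of why that can degrade: one arraycopy shift per insert turns n insertions arriving in an unfavorable order into O(n^2) work:

{code:java}
import java.util.Arrays;

// Standalone illustration (not Cassandra code): inserting n items into a
// sorted array with one arraycopy shift per insert is O(n^2) when items
// arrive in descending order, since every insert shifts the whole array.
public class SortedArrayInsertDemo {
    static long[] starts = new long[16];
    static int size = 0;

    static void add(long start) {
        int idx = Arrays.binarySearch(starts, 0, size, start);
        if (idx < 0) idx = -idx - 1;                  // insertion point
        if (size == starts.length)
            starts = Arrays.copyOf(starts, size * 2); // grow
        System.arraycopy(starts, idx, starts, idx + 1, size - idx); // shift tail
        starts[idx] = start;
        size++;
    }

    public static void main(String[] args) {
        int n = 50_000;
        long t0 = System.nanoTime();
        for (long v = n; v > 0; v--) // descending arrival order: worst case
            add(v);
        System.out.printf("inserted %d entries in %d ms%n",
                n, (System.nanoTime() - t0) / 1_000_000);
    }
}
{code}

If something like this happens while collating the partition's range tombstones, it could explain seconds-long reads on a partition holding only 230 rows.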
> some queries are very slow on a small partition
> -----------------------------------------------
>
>                 Key: CASSANDRA-11814
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11814
>             Project: Cassandra
>          Issue Type: Bug
>          Components: CQL
>         Environment: os: ubuntu server 14.04
>                      cassandra: 2.2.5 (apt-get installed from datastax repo)
>                      server: ec2 m4.2xlarge
>            Reporter: Terry Ma
>            Priority: Minor
>              Labels: performance
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)