Could you send cfhistograms and cfstats relevant to the read column family?

That could help

Carlos Alonso | Software Engineer | @calonso <https://twitter.com/calonso>

On 17 October 2015 at 16:15, Brice Figureau <
brice+cassan...@daysofwonder.com> wrote:

> Hi,
>
> I've read all I could find on how cassandra works, I'm still wondering why
> the following query takes more than 5s to return on a simple (and modest) 3
> nodes cassandra 2.1.9 cluster:
>
> SELECT sequence_nr, used
> FROM messages
> WHERE persistence_id = 'session-SW' AND partition_nr = 0;
>
> The schema of this table:
> 
> CREATE TABLE akka.messages (
>     persistence_id text,
>     partition_nr bigint,
>     sequence_nr bigint,
>     message blob,
>     used boolean static,
>     PRIMARY KEY ((persistence_id, partition_nr), sequence_nr)
> ) WITH CLUSTERING ORDER BY (sequence_nr ASC)
>     AND bloom_filter_fp_chance = 0.01
>     AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
>     AND comment = ''
>     AND compaction = {'class':
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'}
>     AND compression = {'sstable_compression':
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
>     AND dclocal_read_repair_chance = 0.1
>     AND default_time_to_live = 0
>     AND gc_grace_seconds = 864000
>     AND max_index_interval = 2048
>     AND memtable_flush_period_in_ms = 0
>     AND min_index_interval = 128
>     AND read_repair_chance = 0.0
>     AND speculative_retry = '99.0PERCENTILE';
>
>
> This is a query from Akka cassandra journal (
> https://github.com/krasserm/akka-persistence-cassandra) which returns 33
> rows for the moment.
>
> The time doesn't vary if I use a QUORUM consistency or ONE.
> Note that the same query but with a different `persistence_id` are much
> faster (it might depend on the number of rows returned).
>
> Here's the (redacted) trace taken with cqlsh:
>
> activity
>                         | timestamp                  | source         |
> source_elapsed
>
> ---------------------------------------------------------------------------------------------------+----------------------------+----------------+----------------
>
>       Execute CQL3 query | 2015-10-17 17:01:05.681000 | 192.168.168.26 |
>           0
>   Parsing
> ..................................................................
> [SharedPool-Worker-1] | 2015-10-17 17:01:05.683000 | 192.168.168.26 |
>        84
>             READ message received from /192.168.168.26
> [MessagingService-Incoming-/192.168.168.26] | 2015-10-17 17:01:05.683000
> | 192.168.168.29 |             69
>                                                          Preparing
> statement [SharedPool-Worker-1] | 2015-10-17 17:01:05.683000 |
> 192.168.168.26 |            215
>                                            reading data from /
> 192.168.168.29 [SharedPool-Worker-1] | 2015-10-17 17:01:05.683000 |
> 192.168.168.26 |           1189
>                Sending READ message to /192.168.168.29
> [MessagingService-Outgoing-/192.168.168.29] | 2015-10-17 17:01:05.683000
> | 192.168.168.26 |           1317
>                                 Executing single-partition query on
> messages [SharedPool-Worker-1] | 2015-10-17 17:01:05.684000 |
> 192.168.168.29 |            175
>                                                 Acquiring sstable
> references [SharedPool-Worker-1] | 2015-10-17 17:01:05.684000 |
> 192.168.168.29 |            189
>                                                  Merging memtable
> tombstones [SharedPool-Worker-1] | 2015-10-17 17:01:05.685000 |
> 192.168.168.29 |            204
>                                                Key cache hit for sstable
> 434 [SharedPool-Worker-1] | 2015-10-17 17:01:05.685000 | 192.168.168.29 |
>           257
>                                  Seeking to partition beginning in data
> file [SharedPool-Worker-1] | 2015-10-17 17:01:05.685000 | 192.168.168.29 |
>           269
>                                                Key cache hit for sstable
> 432 [SharedPool-Worker-1] | 2015-10-17 17:01:05.685000 | 192.168.168.29 |
>           514
>                                  Seeking to partition beginning in data
> file [SharedPool-Worker-1] | 2015-10-17 17:01:05.686000 | 192.168.168.29 |
>           527
>                                                Key cache hit for sstable
> 433 [SharedPool-Worker-1] | 2015-10-17 17:01:05.686000 | 192.168.168.29 |
>           779
>                                  Seeking to partition beginning in data
> file [SharedPool-Worker-1] | 2015-10-17 17:01:05.686000 | 192.168.168.29 |
>           789
>    Skipped 0/3 non-slice-intersecting sstables, included 0 due to
> tombstones [SharedPool-Worker-1] | 2015-10-17 17:01:05.686000 |
> 192.168.168.29 |            929
>                                   Merging data from memtables and 3
> sstables [SharedPool-Worker-1] | 2015-10-17 17:01:05.687000 |
> 192.168.168.29 |            956
>                                    speculating read retry on /
> 192.168.168.26 [SharedPool-Worker-1] | 2015-10-17 17:01:05.724000 |
> 192.168.168.26 |          42423
>                Sending READ message to /192.168.168.26
> [MessagingService-Outgoing-/192.168.168.26] | 2015-10-17 17:01:05.724000
> | 192.168.168.26 |          42573
>             READ message received from /192.168.168.26
> [MessagingService-Incoming-/192.168.168.26] | 2015-10-17 17:01:05.724000
> | 192.168.168.26 |          42763
>                                 Executing single-partition query on
> messages [SharedPool-Worker-2] | 2015-10-17 17:01:05.724000 |
> 192.168.168.26 |          42855
>                                                 Acquiring sstable
> references [SharedPool-Worker-2] | 2015-10-17 17:01:05.725000 |
> 192.168.168.26 |          42868
>                                                  Merging memtable
> tombstones [SharedPool-Worker-2] | 2015-10-17 17:01:05.725000 |
> 192.168.168.26 |          42934
>                                     Bloom filter allows skipping sstable
> 443 [SharedPool-Worker-2] | 2015-10-17 17:01:05.725000 | 192.168.168.26 |
>         42965
>                                                Key cache hit for sstable
> 442 [SharedPool-Worker-2] | 2015-10-17 17:01:05.725000 | 192.168.168.26 |
>         42986
>                                  Seeking to partition beginning in data
> file [SharedPool-Worker-2] | 2015-10-17 17:01:05.725000 | 192.168.168.26 |
>         42991
>    Skipped 0/2 non-slice-intersecting sstables, included 0 due to
> tombstones [SharedPool-Worker-2] | 2015-10-17 17:01:05.726000 |
> 192.168.168.26 |          43115
>                                   Merging data from memtables and 1
> sstables [SharedPool-Worker-2] | 2015-10-17 17:01:05.726000 |
> 192.168.168.26 |          43137
>                                           Read 33 live and 0 tombstone
> cells [SharedPool-Worker-2] | 2015-10-17 17:01:11.239000 | 192.168.168.26
> |         557248
>                                        Enqueuing response to /
> 192.168.168.26 [SharedPool-Worker-2] | 2015-10-17 17:01:11.244000 |
> 192.168.168.26 |         562973
>    Sending REQUEST_RESPONSE message to /192.168.168.26
> [MessagingService-Outgoing-/192.168.168.26] | 2015-10-17 17:01:11.245000
> | 192.168.168.26 |         563209
> REQUEST_RESPONSE message received from /192.168.168.26
> [MessagingService-Incoming-/192.168.168.26] | 2015-10-17 17:01:11.302000
> | 192.168.168.26 |         620575
>                                     Processing response from /
> 192.168.168.26 [SharedPool-Worker-2] | 2015-10-17 17:01:11.302000 |
> 192.168.168.26 |         620792
>                                           Read 33 live and 0 tombstone
> cells [SharedPool-Worker-1] | 2015-10-17 17:01:11.330000 | 192.168.168.29
> |         646481
>                                        Enqueuing response to /
> 192.168.168.26 [SharedPool-Worker-1] | 2015-10-17 17:01:11.335000 |
> 192.168.168.29 |         651628
>    Sending REQUEST_RESPONSE message to /192.168.168.26
> [MessagingService-Outgoing-/192.168.168.26] | 2015-10-17 17:01:11.335000
> | 192.168.168.29 |         651769
>
>         Request complete | 2015-10-17 17:01:11.312485 | 192.168.168.26 |
>      631485
>
>
> I'm not sure I read correctly the trace, but it seems the time is taken on
> the:
> "Read 33 live and 0 tombstone cells"
>
> What does this step do?
> How can I troubleshoot it?
>
> The cluster configuration is the default one, with the following changes:
> concurrent_reads and writes at 16 (because it's a very modest hardware, but
> I used 32 without any differences), increased timeouts, and using hsha.
>
> Note that I had to increase the timeouts for the request to succeed and be
> able to trace it (otherwise it was always timing out).
>
> Note that the same request on the same data loaded from a snapshot on a
> 1-node cassandra cluster running in a VM on my laptop doesn't exhibit the
> same issue (it's instantaneous).
>
> Also, a non-related question: in what unit is the column "source_elapsed"?
> And is the "timestamp" taken at the start of the operation or at the end?
>
> Thanks for any answers,
> --
> Brice Figureau
>

Reply via email to