It looks like nodes .113 and .116 have a problem. Repairing system_auth,
which only contains 5 users, should not take that long. Run with just
nodetool repair system_auth (without the -pr flag).

But first, investigate why those two nodes are slow to respond. Cheers!
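
P.S. A rough sketch of what I have in mind, in case it helps (plain nodetool
commands; exact output differs a bit between 2.1 minor versions):

    # full repair of just the auth keyspace (all ranges this node replicates,
    # not only the primary range)
    nodetool repair system_auth

    # quick look at the two suspect nodes (.113 and .116)
    nodetool status
    nodetool tpstats
    nodetool compactionstats

Large pending counts in tpstats (for example in the read or request-response
stages) would point at overload on those nodes.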

On Thu, Aug 31, 2017 at 3:00 AM, Chuck Reynolds <creyno...@ancestry.com>
wrote:

> select * from users;
>
> OK here's the trace.  The times are super long.
>
>  name  | super
> -------+-------
>  user1 | False
>  user2 |  True
>  user3 |  True
>  user4 | False
>  user5 |  True
>
> (5 rows)
>
> Tracing session: 55a4aa50-8da3-11e7-adbb-e7bbc3a8a72e
>
> activity | timestamp | source | source_elapsed
> ---------+-----------+--------+----------------
>
> Execute CQL3 query | 2017-08-30 10:50:31.413000 | xx.xx.xx.113 | 0
> READ message received from /xx.xx.xx.113 [MessagingService-Incoming-/xx.xx.xx.113] | 2017-08-30 10:50:31.398000 | xx.xx.xx.107 | 66
> Executing single-partition query on users [SharedPool-Worker-1] | 2017-08-30 10:50:31.399000 | xx.xx.xx.107 | 110
> Acquiring sstable references [SharedPool-Worker-1] | 2017-08-30 10:50:31.399000 | xx.xx.xx.107 | 114
> Merging memtable tombstones [SharedPool-Worker-1] | 2017-08-30 10:50:31.400000 | xx.xx.xx.107 | 121
> Key cache hit for sstable 288 [SharedPool-Worker-1] | 2017-08-30 10:50:31.400000 | xx.xx.xx.107 | 129
> Seeking to partition beginning in data file [SharedPool-Worker-1] | 2017-08-30 10:50:31.400000 | xx.xx.xx.107 | 130
> Skipped 0/1 non-slice-intersecting sstables, included 0 due to tombstones [SharedPool-Worker-1] | 2017-08-30 10:50:31.401000 | xx.xx.xx.107 | 209
> Merging data from memtables and 1 sstables [SharedPool-Worker-1] | 2017-08-30 10:50:31.401000 | xx.xx.xx.107 | 211
> Read 1 live and 0 tombstone cells [SharedPool-Worker-1] | 2017-08-30 10:50:31.402000 | xx.xx.xx.107 | 226
> Enqueuing response to /xx.xx.xx.113 [SharedPool-Worker-1] | 2017-08-30 10:50:31.402000 | xx.xx.xx.107 | 321
> Sending REQUEST_RESPONSE message to /xx.xx.xx.113 [MessagingService-Outgoing-/xx.xx.xx.113] | 2017-08-30 10:50:31.402000 | xx.xx.xx.107 | 417
> Parsing select * from users; [SharedPool-Worker-2] | 2017-08-30 10:50:31.414000 | xx.xx.xx.113 | 22
> Preparing statement [SharedPool-Worker-2] | 2017-08-30 10:50:31.415000 | xx.xx.xx.113 | 58
> reading data from /xx.xx.xx.107 [SharedPool-Worker-2] | 2017-08-30 10:50:31.415000 | xx.xx.xx.113 | 950
> Sending READ message to /xx.xx.xx.107 [MessagingService-Outgoing-/xx.xx.xx.107] | 2017-08-30 10:50:31.415000 | xx.xx.xx.113 | 1017
> REQUEST_RESPONSE message received from /xx.xx.xx.107 [MessagingService-Incoming-/xx.xx.xx.107] | 2017-08-30 10:50:31.415000 | xx.xx.xx.113 | 1744
> Processing response from /xx.xx.xx.107 [SharedPool-Worker-1] | 2017-08-30 10:50:31.416000 | xx.xx.xx.113 | 1805
> Computing ranges to query [SharedPool-Worker-2] | 2017-08-30 10:50:31.416000 | xx.xx.xx.113 | 1853
> Submitting range requests on 63681 ranges with a concurrency of 10056 (0.009944752 rows per range expected) [SharedPool-Worker-2] | 2017-08-30 10:50:31.424000 | xx.xx.xx.113 | 11427
> PAGED_RANGE message received from /xx.xx.xx.113 [MessagingService-Incoming-/xx.xx.xx.113] | 2017-08-30 10:51:25.002000 | xx.xx.xx.116 | 28
> Executing seq scan across 1 sstables for [min(-9223372036854775808), min(-9223372036854775808)] [SharedPool-Worker-1] | 2017-08-30 10:51:25.002000 | xx.xx.xx.116 | 82
> Read 1 live and 0 tombstone cells [SharedPool-Worker-1] | 2017-08-30 10:51:25.003000 | xx.xx.xx.116 | 178
> Read 1 live and 0 tombstone cells [SharedPool-Worker-1] | 2017-08-30 10:51:25.003000 | xx.xx.xx.116 | 186
> Read 1 live and 0 tombstone cells [SharedPool-Worker-1] | 2017-08-30 10:51:25.003000 | xx.xx.xx.116 | 191
> Read 1 live and 0 tombstone cells [SharedPool-Worker-1] | 2017-08-30 10:51:25.003000 | xx.xx.xx.116 | 194
> Read 1 live and 0 tombstone cells [SharedPool-Worker-1] | 2017-08-30 10:51:25.004000 | xx.xx.xx.116 | 198
> Scanned 5 rows and matched 5 [SharedPool-Worker-1] | 2017-08-30 10:51:25.004000 | xx.xx.xx.116 | 224
> Enqueuing response to /xx.xx.xx.113 [SharedPool-Worker-1] | 2017-08-30 10:51:25.004000 | xx.xx.xx.116 | 240
> Sending REQUEST_RESPONSE message to /xx.xx.xx.113 [MessagingService-Outgoing-/xx.xx.xx.113] | 2017-08-30 10:51:25.004000 | xx.xx.xx.116 | 302
> Enqueuing request to /xx.xx.xx.116 [SharedPool-Worker-2] | 2017-08-30 10:51:25.014000 | xx.xx.xx.113 | 601103
> Submitted 1 concurrent range requests covering 63681 ranges [SharedPool-Worker-2] | 2017-08-30 10:51:25.014000 | xx.xx.xx.113 | 601120
> Sending PAGED_RANGE message to /xx.xx.xx.116 [MessagingService-Outgoing-/xx.xx.xx.116] | 2017-08-30 10:51:25.015000 | xx.xx.xx.113 | 601190
> REQUEST_RESPONSE message received from /xx.xx.xx.116 [MessagingService-Incoming-/xx.xx.xx.116] | 2017-08-30 10:51:25.015000 | xx.xx.xx.113 | 601771
> Processing response from /xx.xx.xx.116 [SharedPool-Worker-1] | 2017-08-30 10:51:25.015000 | xx.xx.xx.113 | 601824
> Request complete | 2017-08-30 10:51:25.014874 | xx.xx.xx.113 | 601874
>
>
>
>
>
> From: Oleksandr Shulgin <oleksandr.shul...@zalando.de>
> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Date: Wednesday, August 30, 2017 at 10:42 AM
> To: User <user@cassandra.apache.org>
> Subject: Re: system_auth replication factor in Cassandra 2.1
>
>
>
> On Wed, Aug 30, 2017 at 6:40 PM, Chuck Reynolds <creyno...@ancestry.com>
> wrote:
>
> How many users do you have (or expect to be found in system_auth.users)?
>
>   5 users.
>
> What are the current RF for system_auth and consistency level you are
> using in cqlsh?
>
>  135 in one DC and 227 in the other DC.  Consistency level one
>
>
>
> Still very surprising...
>
>
>
> Did you try to obtain a trace of a timing-out query (with TRACING ON)?
>
> Tracing timed out even though I increased it to 120 seconds.
>
>
>
> Even if cqlsh doesn't print the trace because of timeout, you should
> still be able to find something in system_traces.
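>
> For example, something along these lines (a sketch; replace <session-id>
> with the session id that TRACING ON printed for your query):
>
>     select session_id, started_at, duration, request
>       from system_traces.sessions;
>
>     select activity, source, source_elapsed
>       from system_traces.events
>      where session_id = <session-id>;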
>
>
>
> --
>
> Alex
>
>
>
