Hi,

I remember that a client using the native protocol gets notified too early about Cassandra being ready, due to the following issue: https://issues.apache.org/jira/browse/CASSANDRA-8236
which looks similar, but the above was marked as fixed in 2.2.

Thomas

From: Riccardo Ferrari <ferra...@gmail.com>
Sent: Wednesday, 12 September 2018 18:25
To: user@cassandra.apache.org
Subject: Re: Read timeouts when performing rolling restart

Hi Alain,

Thank you for chiming in!

I was thinking of performing the 'start_native_transport=false' test as well, and indeed the issue does not show up. Starting a node with native transport disabled and letting it cool down led to no timeout exceptions and no dropped messages, simply a crystal-clean startup. Agreed, it is a workaround.

# About upgrading:
Yes, I desperately want to upgrade, despite it being a long and slow task. Just reviewing all the changes from 3.0.6 to 3.0.17 is going to be a huge pain. Off the top of your head, any breaking changes I should absolutely take care of reviewing?

# describecluster output:
YES, they agree on the same schema version.

# keyspaces:
system WITH replication = {'class': 'LocalStrategy'}
system_schema WITH replication = {'class': 'LocalStrategy'}
system_auth WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'}
system_distributed WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'}
system_traces WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '2'}
<custom1> WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'}
<custom2> WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'}

# Snitch
Ec2Snitch

## About Snitch and replication:
- We have the default DC and all nodes are in the same rack.
- We are planning to move to GossipingPropertyFileSnitch, configuring cassandra-rackdc.properties accordingly.
-- This should be a transparent change, correct?
- Once switched to GPFS, we plan to move to 'NetworkTopologyStrategy' with a 'us-xxxx' DC and replica counts as before.
- Then adding a new DC inside the VPC, but this is another story...

Any concerns here?
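For reference, the workaround discussed above can be made explicit in configuration. A minimal sketch, assuming the stock cassandra.yaml layout (the setting and the nodetool verbs exist in 3.0; everything else about your deployment is an assumption):

```yaml
# cassandra.yaml, temporary setting for the restart window only:
# keep the client (native protocol) port closed while the node warms up;
# gossip and internode messaging still start normally.
start_native_transport: false
```

Once the node has settled (Up/Normal in nodetool status, pending compactions drained), `nodetool enablebinary` opens the native transport at runtime and `nodetool statusbinary` should report "running". Remember to revert the setting before the next regular restart, or clients will never be admitted automatically.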
# nodetool status <ks>
--  Address   Load       Tokens  Owns (effective)  Host ID                               Rack
UN  10.x.x.a  177 GB     256     50.3%             d8bfe4ad-8138-41fe-89a4-ee9a043095b5  rr
UN  10.x.x.b  152.46 GB  256     51.8%             7888c077-346b-4e09-96b0-9f6376b8594f  rr
UN  10.x.x.c  159.59 GB  256     49.0%             329b288e-c5b5-4b55-b75e-fbe9243e75fa  rr
UN  10.x.x.d  162.44 GB  256     49.3%             07038c11-d200-46a0-9f6a-6e2465580fb1  rr
UN  10.x.x.e  174.9 GB   256     50.5%             c35b5d51-2d14-4334-9ffc-726f9dd8a214  rr
UN  10.x.x.f  194.71 GB  256     49.2%             f20f7a87-d5d2-4f38-a963-21e24167b8ac  rr

# gossipinfo
/10.x.x.a
  STATUS:827:NORMAL,-1350078789194251746
  LOAD:289986:1.90078037902E11
  SCHEMA:281088:af4461c3-d269-39bc-9d03-3566031c1e0a
  DC:6:<some-ec2-dc>
  RACK:8:rr
  RELEASE_VERSION:4:3.0.6
  SEVERITY:290040:0.5934718251228333
  NET_VERSION:1:10
  HOST_ID:2:d8bfe4ad-8138-41fe-89a4-ee9a043095b5
  RPC_READY:868:true
  TOKENS:826:<hidden>
/10.x.x.b
  STATUS:16:NORMAL,-1023229528754013265
  LOAD:7113:1.63730480619E11
  SCHEMA:10:af4461c3-d269-39bc-9d03-3566031c1e0a
  DC:6:<some-ec2-dc>
  RACK:8:rr
  RELEASE_VERSION:4:3.0.6
  SEVERITY:7274:0.5988024473190308
  NET_VERSION:1:10
  HOST_ID:2:7888c077-346b-4e09-96b0-9f6376b8594f
  TOKENS:15:<hidden>
/10.x.x.c
  STATUS:732:NORMAL,-1117172759238888547
  LOAD:245839:1.71409806942E11
  SCHEMA:237168:af4461c3-d269-39bc-9d03-3566031c1e0a
  DC:6:<some-ec2-dc>
  RACK:8:rr
  RELEASE_VERSION:4:3.0.6
  SEVERITY:245989:0.0
  NET_VERSION:1:10
  HOST_ID:2:329b288e-c5b5-4b55-b75e-fbe9243e75fa
  RPC_READY:763:true
  TOKENS:731:<hidden>
/10.x.x.d
  STATUS:14:NORMAL,-1004942496246544417
  LOAD:313125:1.74447964917E11
  SCHEMA:304268:af4461c3-d269-39bc-9d03-3566031c1e0a
  DC:6:<some-ec2-dc>
  RACK:8:rr
  RELEASE_VERSION:4:3.0.6
  SEVERITY:313215:0.25641027092933655
  NET_VERSION:1:10
  HOST_ID:2:07038c11-d200-46a0-9f6a-6e2465580fb1
  RPC_READY:56:true
  TOKENS:13:<hidden>
/10.x.x.e
  STATUS:520:NORMAL,-1058809960483771749
  LOAD:276118:1.87831573032E11
  SCHEMA:267327:af4461c3-d269-39bc-9d03-3566031c1e0a
  DC:6:<some-ec2-dc>
  RACK:8:rr
  RELEASE_VERSION:4:3.0.6
  SEVERITY:276217:0.32786884903907776
  NET_VERSION:1:10
  HOST_ID:2:c35b5d51-2d14-4334-9ffc-726f9dd8a214
  RPC_READY:550:true
  TOKENS:519:<hidden>
/10.x.x.f
  STATUS:1081:NORMAL,-1039671799603495012
  LOAD:239114:2.09082017545E11
  SCHEMA:230229:af4461c3-d269-39bc-9d03-3566031c1e0a
  DC:6:<some-ec2-dc>
  RACK:8:rr
  RELEASE_VERSION:4:3.0.6
  SEVERITY:239180:0.5665722489356995
  NET_VERSION:1:10
  HOST_ID:2:f20f7a87-d5d2-4f38-a963-21e24167b8ac
  RPC_READY:1118:true
  TOKENS:1080:<hidden>

## About load and tokens:
- While load is pretty even, this does not apply to tokens; I guess we have some tables with uneven distribution. This should not be the case for the high-load tables, as their partition keys are built from 'id + <some time format>'.
- I was not able to find documentation about the numbers printed next to LOAD, SCHEMA, SEVERITY, RPC_READY ... Is there any doc around?

# Tombstones
No ERRORs, only WARNs about one very specific table that we are aware of. It is an append-only table read by Spark from a batch job. (I guess it is a read_repair_chance or DTCS misconfiguration.)

## Closing note!
We are on very old m1.xlarge instances, 4 vCPUs with RAID0 (stripe) over the 4 spinning drives, and some changes to cassandra.yaml:
- dynamic_snitch: false
- concurrent_reads: 48
- concurrent_compactors: 1 (was 2)
- disk_optimization_strategy: spinning

I have some concerns about the number of concurrent_compactors, what do you think?

Thanks!

The contents of this e-mail are intended for the named addressee only. It contains information that may be confidential. Unless you are the named addressee or an authorized designee, you may not copy or use it, or disclose it to anyone else. If you received it in error please notify us immediately and then destroy it. Dynatrace Austria GmbH (registration number FN 91482h) is a company registered in Linz whose registered office is at 4040 Linz, Austria, Freistädterstraße 313
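A quick sanity check on the ~50% "Owns (effective)" figures in the nodetool status output above: with RF=3 on 6 nodes in a single DC, each node is expected to own RF/N of the data on average, so the numbers line up. A minimal sketch of the arithmetic (plain awk, not a Cassandra tool; RF and node count are taken from the keyspace definitions and status output above):

```shell
# Expected average effective ownership per node = 100 * RF / node_count.
rf=3       # replication_factor of the <custom> keyspaces above
nodes=6    # nodes in the cluster per nodetool status
awk -v rf="$rf" -v n="$nodes" 'BEGIN { printf "%.1f%%\n", 100 * rf / n }'
# prints: 50.0%
```

Per-node deviations from that average (49.0% to 51.8% here) come from how the 256 vnode tokens happen to split the ring, not from table-level skew.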