[ https://issues.apache.org/jira/browse/CASSANDRA-5914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams reopened CASSANDRA-5914:
-----------------------------------------

> Failed replace_node bootstrap leaves gossip in weird state ; possible perf
> problem
> ----------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-5914
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5914
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: 1.2.8
>            Reporter: Chris Burroughs
>
> A node was down for a week or two due to hardware disk failure. I tried to
> use replace_node to bring up a new node on the same physical host with the
> same IPs. (rbranson suspected that using the same IP may be issue prone.)
> This failed due to "unable to find sufficient sources for streaming range".
> However, gossip for the to-be-replaced node was left in a funky state:
> {noformat}
> /64.215.255.182
>   RACK:NOP
>   NET_VERSION:6
>   HOST_ID:4f3b214b-b03e-46eb-8214-5fab2662a06b
>   RELEASE_VERSION:1.2.8
>   DC:IAD
>   INTERNAL_IP:10.15.2.182
>   SCHEMA:59adb24e-f3cd-3e02-97f0-5b395827453f
>   RPC_ADDRESS:0.0.0.0
> {noformat}
> (See CASSANDRA-5913 for cosmetic issue with nt:status.)
> This seems (A) confusing and (B) the failed replace_token correlated with
> 95th percentile read latency for this cluster going from 8k microseconds to
> around 200k microseconds (on both DCs in a multi-dc cluster reading at
> CL.ONE). I don't have a good theory for the correlation but performance was
> bad for over an hour and returned to normal once a successful replace_token
> was performed.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)