Hi all,

after a deep investigation, we found that we are hitting this issue:
https://issues.apache.org/jira/browse/CASSANDRA-8058

Jiri Horky

On 10/20/2015 12:00 PM, Jiri Horky wrote:
> Hi all,
>
> we are experiencing strange behavior when trying to bootstrap a
> new node. The problem is that the Recent Write Latency goes to 2s on all
> the other Cassandra nodes (which are receiving user traffic), which
> corresponds to our setting of "write_request_timeout_in_ms: 2000".
>
> We use Cassandra 2.0.10 and are trying to convert to vnodes and increase
> the replication factor. So we are adding a new node in a new DC (marked as
> DCXA) as the only node in the new DC, with replication factor 3. The reason
> for the higher RF is that we will be converting another 2 existing servers
> to the new DC (vnodes) and we want them to get all the data.
>
> The replication settings look like this:
> ALTER KEYSPACE slw WITH replication = {
>   'class': 'NetworkTopologyStrategy',
>   'DC4': '1',
>   'DC5': '1',
>   'DC2': '1',
>   'DC3': '1',
>   'DC0': '1',
>   'DC1': '1',
>   'DC0A': '3',
>   'DC1A': '3',
>   'DC2A': '3',
>   'DC3A': '3',
>   'DC4A': '3',
>   'DC5A': '3'
> };
>
> We were adding the nodes to DC0A->DC4A without any effects on existing
> nodes (DCX without A). When we try to add DC5A, the above-mentioned
> problem happens, 100% reproducibly.
>
> I tried to increase concurrent_writes from 32 to 128 on the
> old nodes, and also tried to increase the number of flush writers, both
> with no effect. The strange thing is that the load, CPU usage, GC, network
> throughput - everything is fine on the old nodes which are reporting 2s
> of write latency. Nodetool tpstats does not show any blocked/pending
> operations.
>
> I think I must be hitting some limit (because of the overall number of
> replicas?) somewhere.
>
> Any input would be greatly appreciated.
>
> Thanks
> Jirka H.
>