Hi all,

After a deeper investigation, we found that we are hitting this issue:

https://issues.apache.org/jira/browse/CASSANDRA-8058

Jiri Horky

On 10/20/2015 12:00 PM, Jiri Horky wrote:
> Hi all,
>
> we are experiencing strange behavior when trying to bootstrap a new
> node. The problem is that the Recent Write Latency goes to 2s on all
> the other Cassandra nodes (which are receiving user traffic), which
> corresponds to our setting of "write_request_timeout_in_ms: 2000".
>
> We use Cassandra 2.0.10 and are trying to convert to vnodes and
> increase the replication factor. So we are adding a new node in a new
> DC (marked as DCXA) as the only node in that DC, with replication
> factor 3. The reason for the higher RF is that we will be converting
> another 2 existing servers to the new DC (vnodes) and we want them to
> get all the data.
>
> The replication settings look like this:
> ALTER KEYSPACE slw WITH replication = {
>   'class': 'NetworkTopologyStrategy',
>   'DC4': '1',
>   'DC5': '1',
>   'DC2': '1',
>   'DC3': '1',
>   'DC0': '1',
>   'DC1': '1',
>   'DC0A': '3',
>   'DC1A': '3',
>   'DC2A': '3',
>   'DC3A': '3',
>   'DC4A': '3',
>   'DC5A': '3'
> };
>
> We added the nodes to DC0A through DC4A without any effect on the
> existing nodes (DCX without the A suffix). When we try to add the DC5A
> node, the above-mentioned problem happens, 100% reproducibly.
>
> I tried to increase concurrent_writes from 32 to 128 on the old nodes,
> and also tried to increase the number of flush writers, both with no
> effect. The strange thing is that the load, CPU usage, GC, and network
> throughput are all fine on the old nodes that are reporting 2s of
> write latency. nodetool tpstats does not show any blocked/pending
> operations.
>
> I think I must be hitting some limit somewhere (because of the overall
> number of replicas?).
>
> Any input would be greatly appreciated.
>
> Thanks
> Jirka H.
>
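
For reference, the "overall number of replicas" mentioned in the quoted
message can be tallied directly from the replication settings. The
sketch below is only arithmetic on the values from the ALTER KEYSPACE
statement; the function name total_replicas is made up for illustration,
and the tally is not a diagnosis (the cause turned out to be the JIRA
issue linked above). It also assumes every DC already has at least as
many nodes as its replication factor, which was not yet true for the new
DCs during the migration.

```python
# Replication factors copied from the ALTER KEYSPACE statement quoted above.
replication = {
    "DC0": 1, "DC1": 1, "DC2": 1, "DC3": 1, "DC4": 1, "DC5": 1,
    "DC0A": 3, "DC1A": 3, "DC2A": 3, "DC3A": 3, "DC4A": 3, "DC5A": 3,
}


def total_replicas(rf_by_dc):
    """Sum the per-DC replication factors (hypothetical helper)."""
    return sum(rf_by_dc.values())


# Upper bound on the replicas each write to keyspace "slw" is sent to,
# assuming each DC has at least as many nodes as its replication factor.
total = total_replicas(replication)
print(total)
```

With these settings, each write is replicated up to 6 x 1 + 6 x 3 = 24
times across the twelve DCs.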
