[jira] [Commented] (CASSANDRA-10423) Paxos/LWT failures when moving node

Roger Schildmeijer (JIRA) Thu, 01 Oct 2015 05:32:09 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-10423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14939755#comment-14939755
 ]


Roger Schildmeijer commented on CASSANDRA-10423:
------------------------------------------------

I have 48 nodes in total: four DCs with 12 nodes each, using RF=3. The LWT rate 
is very low, in the single digits per second, so my LWT usage is unlikely to 
cause much contention. 

From a single node with 1-2 months of uptime:
org.apache.cassandra.metrics.ClientRequest.CASWrite.ConditionNotMet: 946
org.apache.cassandra.metrics.ClientRequest.CASWrite.ContentionHistogram: the 
50th through 99th percentiles are all 1.0, max is 12 (count 787)
org.apache.cassandra.metrics.ClientRequest.CASWrite.Timeouts: has remained 
static since the node move was done.
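
For reference, the "7 replica were required" figure in the error quoted in the 
issue description is consistent with this topology if RF=3 is set in each of 
the four DCs (an assumption; the comment only says "RF=3"). A minimal sketch 
of the quorum arithmetic, with hypothetical DC names:

```python
def quorum(replicas: int) -> int:
    """Return the majority size for `replicas` replicas: floor(n / 2) + 1."""
    if replicas < 1:
        raise ValueError("replicas must be positive")
    return replicas // 2 + 1


# Assumption: RF=3 is configured in each of the four DCs.
# The DC names are hypothetical placeholders.
rf_per_dc = {"dc1": 3, "dc2": 3, "dc3": 3, "dc4": 3}

total_replicas = sum(rf_per_dc.values())        # replicas across all DCs
serial_quorum = quorum(total_replicas)          # SERIAL spans every DC
local_serial_quorum = quorum(rf_per_dc["dc1"])  # LOCAL_SERIAL stays in one DC

print(total_replicas, serial_quorum, local_serial_quorum)
```

Under that assumption a SERIAL operation needs a majority of 12 replicas, 
which is 7, matching the count in the WriteTimeoutException. This only 
covers the steady-state arithmetic; it does not model what happens to the 
participant set while a node is moving.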



> Paxos/LWT failures when moving node
> -----------------------------------
>
>                 Key: CASSANDRA-10423
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10423
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: Cassandra version: 2.0.14
> Java-driver version: 2.0.11
>            Reporter: Roger Schildmeijer
>
> While moving a node (nodetool move <newtoken>), we noticed that about 50% of 
> LWT requests started failing. The java-driver (version 2.0.11) returned 
> com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout 
> during write query at consistency SERIAL (7 replica were required but only 0 
> acknowledged the write). The cluster was not under heavy load.
> All of the failed LWT requests took just over 1 s. That timing, together 
> with the WriteTimeoutException, suggests that this code path is being hit:
> https://github.com/apache/cassandra/blob/cassandra-2.0.14/src/java/org/apache/cassandra/service/StorageProxy.java#L268
> I can't explain why, though. Why would there be more CAS contention just 
> because a node is moving?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
