[ 
https://issues.apache.org/jira/browse/CASSANDRA-13810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17127408#comment-17127408
 ] 

ZhaoYang commented on CASSANDRA-13810:
--------------------------------------

I am thinking:
1. to remove the debug log from "HintVerbHandler" or change it a no-spam logger.
2. to use exponential backoff when retrying failed hints rather than every 10s.
3. to decrease hint replay page size (512kb), otherwise large amount of 
succeeded hints will be replayed when retrying a page that may contain only 1 
failed hint.

> Overload because of hint pressure + MVs
> ---------------------------------------
>
>                 Key: CASSANDRA-13810
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13810
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Feature/Materialized Views
>            Reporter: Tom van der Woerdt
>            Priority: Urgent
>              Labels: materializedviews
>             Fix For: 3.0.x, 3.11.x, 4.x
>
>
> Cluster setup: 3 DCs, 20 Cassandra nodes each, all 3.0.14, with approx. 200GB 
> data per machine. Many tables have MVs associated.
> During some maintenance we did a rolling restart of all nodes in the cluster. 
> This caused a buildup of hints/batches, as expected. Most nodes came back 
> just fine, except for two nodes.
> These two nodes came back with a loadavg of >100, and 'nodetool tpstats' 
> showed a million (not exaggerating) MutationStage tasks per second(!). It was 
> clear that these were mostly (all?) mutations coming from hints, as indicated 
> by thousands of log entries per second in debug.log :
> {noformat}
> DEBUG [SharedPool-Worker-107] 2017-08-27 13:16:51,098 HintVerbHandler.java:95 
> - Failed to apply hint
> java.util.concurrent.CompletionException: 
> org.apache.cassandra.exceptions.WriteTimeoutException: Operation timed out - 
> received only 0 responses.
>     at 
> java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
>  ~[na:1.8.0_144]
>     at 
> java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
>  ~[na:1.8.0_144]
>     at 
> java.util.concurrent.CompletableFuture.uniAccept(CompletableFuture.java:647) 
> ~[na:1.8.0_144]
>     at 
> java.util.concurrent.CompletableFuture$UniAccept.tryFire(CompletableFuture.java:632)
>  ~[na:1.8.0_144]
>     at 
> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
>  ~[na:1.8.0_144]
>     at 
> java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
>  ~[na:1.8.0_144]
>     at org.apache.cassandra.db.Keyspace.applyInternal(Keyspace.java:481) 
> ~[apache-cassandra-3.0.14.jar:3.0.14]
>     at 
> org.apache.cassandra.db.Keyspace.lambda$applyInternal$0(Keyspace.java:495) 
> ~[apache-cassandra-3.0.14.jar:3.0.14]
>     at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_144]
>     at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
>  ~[apache-cassandra-3.0.14.jar:3.0.14]
>     at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
> ~[apache-cassandra-3.0.14.jar:3.0.14]
>     at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_144]
> Caused by: org.apache.cassandra.exceptions.WriteTimeoutException: Operation 
> timed out - received only 0 responses.
>     ... 6 common frames omitted
> {noformat}
> After reading the relevant code, it seems that a hint is considered 
> droppable, and in the mutation path when the table contains a MV and the lock 
> fails to acquire and the mutation is droppable, it throws a WTE without 
> waiting until the timeout expires. This explains why Cassandra is able to 
> process a million mutations per second without actually considering them 
> 'dropped' in the 'nodetool tpstats' output.
> I managed to recover the two nodes by stopping handoffs on all nodes in the 
> cluster and reenabling them one at a time. It's likely that the hint/batchlog 
> settings were sub-optimal on this cluster, but I think that the retry 
> behavior(?) of hints should be improved as it's hard to express hint 
> throughput in kb/s when the mutations can involve MVs.
> More data available upon request -- I'm not sure which bits are relevant and 
> which aren't.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

Reply via email to