[ https://issues.apache.org/jira/browse/CASSANDRA-11891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tyler Hobbs updated CASSANDRA-11891: ------------------------------------ Resolution: Fixed Fix Version/s: (was: 3.0.x) (was: 3.x) 3.0.7 3.7 Status: Resolved (was: Patch Available) Thanks. All of the failing tests appear to be known failures or known flaky tests, so I've committed this to 3.0 as [{{5794fb3556a07a662a8a79cfd692eaa459e9066b}}|https://github.com/apache/cassandra/commit/5794fb3556a07a662a8a79cfd692eaa459e9066b] and merged up to 3.7 and trunk. > WriteTimeout during commit log replay due to MV lock > ---------------------------------------------------- > > Key: CASSANDRA-11891 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11891 > Project: Cassandra > Issue Type: Bug > Components: Lifecycle, Local Write-Read Paths > Reporter: Tyler Hobbs > Assignee: Tyler Hobbs > Priority: Critical > Fix For: 3.7, 3.0.7 > > > During commit log replay, if there are materialized views, it's possible for > contention on the MV lock to cause a {{WriteTimeoutException}}. This makes > commit log replay fail, which of course prevents the node from starting up. > This generally means that the operator has to move the commitlog segments to > avoid replay. > Here's a stacktrace of this happening on 3.0.5: > {noformat} > ERROR [main] 2016-05-25 15:10:31,120 CassandraDaemon.java:692 - Exception > encountered during startup > java.lang.RuntimeException: java.util.concurrent.ExecutionException: > org.apache.cassandra.exceptions.WriteTimeoutException: Operation timed out - > received only 0 responses. > at org.apache.cassandra.utils.Throwables.maybeFail(Throwables.java:50) > ~[apache-cassandra-3.0.5.jar:3.0.5] > at > org.apache.cassandra.utils.FBUtilities.waitOnFutures(FBUtilities.java:372) > ~[apache-cassandra-3.0.5.jar:3.0.5] > at > org.apache.cassandra.db.commitlog.CommitLogReplayer.replayMutation(CommitLogReplayer.java:624) > ~[apache-cassandra-3.0.5.jar:3.0.5] > at > org.apache.cassandra.db.commitlog.CommitLogReplayer.replaySyncSection(CommitLogReplayer.java:511) > ~[apache-cassandra-3.0.5.jar:3.0.5] > at > org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:406) > ~[apache-cassandra-3.0.5.jar:3.0.5] > at > org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:153) > ~[apache-cassandra-3.0.5.jar:3.0.5] > at > org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:189) > ~[apache-cassandra-3.0.5.jar:3.0.5] > at > org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:169) > ~[apache-cassandra-3.0.5.jar:3.0.5] > at > org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:283) > [apache-cassandra-3.0.5.jar:3.0.5] > at > org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:551) > [apache-cassandra-3.0.5.jar:3.0.5] > at > org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:679) > [apache-cassandra-3.0.5.jar:3.0.5] > Caused by: java.util.concurrent.ExecutionException: > org.apache.cassandra.exceptions.WriteTimeoutException: Operation timed out - > received only 0 responses. > at > org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.get(AbstractLocalAwareExecutorService.java:200) > ~[apache-cassandra-3.0.5.jar:3.0.5] > at > org.apache.cassandra.utils.FBUtilities.waitOnFutures(FBUtilities.java:365) > ~[apache-cassandra-3.0.5.jar:3.0.5] > ... 9 common frames omitted > Suppressed: java.util.concurrent.ExecutionException: > org.apache.cassandra.exceptions.WriteTimeoutException: Operation timed out - > received only 0 responses. > ... 11 common frames omitted > Caused by: org.apache.cassandra.exceptions.WriteTimeoutException: > Operation timed out - received only 0 responses. > at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:431) > at > org.apache.cassandra.db.Keyspace.lambda$apply$62(Keyspace.java:443) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at > org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164) > at > org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) > at java.lang.Thread.run(Thread.java:745) > {noformat} > We should ignore the {{write_rpc_timeout}} setting while acquiring MV locks > if we're on the commitlog replay path. -- This message was sent by Atlassian JIRA (v6.3.4#6332)