[ https://issues.apache.org/jira/browse/CASSANDRA-16880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Josh McKenzie updated CASSANDRA-16880:
--------------------------------------
    Change Category: Operability
         Complexity: Low Hanging Fruit
      Fix Version/s: 4.0.x
             Status: Open  (was: Triage Needed)

> Catch read repair timeouts and add metrics to indicate they occurred
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-16880
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16880
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Observability/Metrics
>            Reporter: Josh McKenzie
>            Assignee: Josh McKenzie
>            Priority: Normal
>             Fix For: 4.0.x
>
>
> When we fire off async read repairs onto their own executor, they may time out,
> and nothing stops the resulting timeout exception from propagating all the way
> up to CassandraDaemon's uncaught exception handler. When this happens, we log
> at ERROR.
> A timeout isn't great, but it's not an ERROR, so we should trap these
> exceptions instead and add some metrics around this occurrence.
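
The behaviour the issue description asks for could be sketched roughly as follows. This is only an illustration in plain Java, not Cassandra's actual code: the class `ReadRepairTimeoutSketch`, the exception `RepairTimeoutException`, the `timeouts` counter, and the helper `trapTimeouts` are all hypothetical names, and an `AtomicLong` stands in for whatever metric type the real patch would use. The wrapper catches only the timeout, counts it, and lets any other exception keep propagating to the uncaught exception handler.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class ReadRepairTimeoutSketch {
    // Hypothetical stand-in for the timeout exception an async read repair may throw.
    static class RepairTimeoutException extends RuntimeException {
        RepairTimeoutException(String message) {
            super(message);
        }
    }

    // Hypothetical counter standing in for a real metric.
    static final AtomicLong timeouts = new AtomicLong();

    // Wraps a task so that a timeout is counted instead of escaping the executor thread.
    // Any other exception is deliberately not caught and still propagates.
    static Runnable trapTimeouts(Runnable task, AtomicLong counter) {
        return () -> {
            try {
                task.run();
            } catch (RepairTimeoutException e) {
                counter.incrementAndGet();
            }
        };
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService executor = Executors.newSingleThreadExecutor();

        // Simulate an async read repair that times out on its own executor.
        Runnable failingRepair = () -> {
            throw new RepairTimeoutException("simulated read repair timeout");
        };
        executor.execute(trapTimeouts(failingRepair, timeouts));

        executor.shutdown();
        executor.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println("read repair timeouts: " + timeouts.get());
    }
}
```

Wrapping at submission time keeps the timeout handling next to the executor rather than inside each repair task, which is one possible design; the actual fix may hook in elsewhere.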