Simon Zhou created CASSANDRA-13387:
--------------------------------------

             Summary: Metrics for repair
                 Key: CASSANDRA-13387
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13387
             Project: Cassandra
          Issue Type: Improvement
            Reporter: Simon Zhou
            Assignee: Simon Zhou
            Priority: Minor
We're missing metrics for repair, especially for errors. From what I have observed, an exception thrown during repair is caught by the UncaughtExceptionHandler set in CassandraDaemon and counted under StorageMetrics.exceptions, so repair failures are not reported separately. Here is one example:

{code}
ERROR [AntiEntropyStage:1] 2017-03-27 18:17:08,385 CassandraDaemon.java:207 - Exception in thread Thread[AntiEntropyStage:1,5,main]
java.lang.RuntimeException: Parent repair session with id = 8c85d260-1319-11e7-82a2-25090a89015f has failed.
	at org.apache.cassandra.service.ActiveRepairService.getParentRepairSession(ActiveRepairService.java:377) ~[apache-cassandra-3.0.10.jar:3.0.10]
	at org.apache.cassandra.service.ActiveRepairService.removeParentRepairSession(ActiveRepairService.java:392) ~[apache-cassandra-3.0.10.jar:3.0.10]
	at org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:172) ~[apache-cassandra-3.0.10.jar:3.0.10]
	at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:67) ~[apache-cassandra-3.0.10.jar:3.0.10]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_121]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_121]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_121]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_121]
	at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
{code}
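To illustrate the gap described above, here is a minimal sketch of how an uncaught-exception handler could keep a repair-specific error count next to the generic one. It is not Cassandra code: the class `RepairMetricsSketch`, both counters, and the idea of classifying by thread name are assumptions made for illustration. Only the `AntiEntropyStage` thread-name prefix comes from the log line in the report.

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Illustration only. All names here are hypothetical; this is not the
 * handler installed by CassandraDaemon.
 */
final class RepairMetricsSketch {
    // Stand-in for the generic counter (StorageMetrics.exceptions in the report).
    static final AtomicLong genericExceptions = new AtomicLong();

    // Hypothetical repair-specific counter that the report says is missing.
    static final AtomicLong repairExceptions = new AtomicLong();

    // Thread-name prefix taken from the log line "[AntiEntropyStage:1]".
    static final String REPAIR_STAGE_PREFIX = "AntiEntropyStage";

    private RepairMetricsSketch() {}

    /** Counts every uncaught exception, and separately those from repair threads. */
    static void record(Thread thread, Throwable error) {
        genericExceptions.incrementAndGet();
        // Assumption: repair work can be recognised by its thread name.
        if (thread.getName().startsWith(REPAIR_STAGE_PREFIX)) {
            repairExceptions.incrementAndGet();
        }
    }

    /** Adapts record() to the JDK handler interface. */
    static Thread.UncaughtExceptionHandler handler() {
        return RepairMetricsSketch::record;
    }
}
```

Such a handler could be installed with `Thread.setDefaultUncaughtExceptionHandler(RepairMetricsSketch.handler())`. Matching on the thread name is only a heuristic; instrumenting the repair code paths directly would likely be more reliable.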