[ https://issues.apache.org/jira/browse/CASSANDRA-13636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Jirsa updated CASSANDRA-13636:
-----------------------------------
    Description: 
After having a node go down and restarted, I ran an incremental repair. It exited with the following error:
{code:java}
[2017-06-26 04:01:12,241] Repair command #39 finished in 0 seconds
[2017-06-26 04:01:12,250] Starting repair command #40, repairing keyspace fgp with repair options (parallelism: parallel, primary range: false, incremental: true, job threads: 1, ColumnFamilies: [], dataCenters: [], hosts: [], # of ranges: 790)
[2017-06-26 04:01:12,468] Did not get positive replies from all endpoints. List of failed endpoint(s): [10.0.2.13]
[2017-06-26 04:01:12,469] Repair command #40 finished with error
error: Repair job has failed with the error message: [2017-06-26 04:01:12,468] Did not get positive replies from all endpoints. List of failed endpoint(s): [10.0.2.13]
– StackTrace –
java.lang.RuntimeException: Repair job has failed with the error message: [2017-06-26 04:01:12,468] Did not get positive replies from all endpoints. List of failed endpoint(s): [10.0.2.13]
    at org.apache.cassandra.tools.RepairRunner.progress(RepairRunner.java:115)
    at org.apache.cassandra.utils.progress.jmx.JMXNotificationProgressListener.handleNotification(JMXNotificationProgressListener.java:77)
    at com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.dispatchNotification(ClientNotifForwarder.java:583)
    at com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.doRun(ClientNotifForwarder.java:533)
    at com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.run(ClientNotifForwarder.java:452)
    at com.sun.jmx.remote.internal.ClientNotifForwarder$LinearExecutor$1.run(ClientNotifForwarder.java:108){code}
On the node that did not provide a positive reply, the logs showed:
{code:java}
INFO 04:01:12 repair #176c3760-5a24-11e7-868e-97841c634787 Starting anticompaction for system_traces.events on 0/[] sstables
INFO 04:01:12 repair #176c3760-5a24-11e7-868e-97841c634787 Completed anticompaction successfully
INFO 04:01:12 repair #176c3760-5a24-11e7-868e-97841c634787 Completed anticompaction successfully
ERROR 04:01:12 Table with id ffcc2ef0-3122-11e7-8f76-b3cac7d588b7 was dropped during prepare phase of repair
{code}
I was unable to find documentation which describes this situation or how to recover from this situation. Running a full repair results in the same error.

  was:
After having a node go down and restarted, I ran an incremental repair. It exited with the following error:

[2017-06-26 04:01:12,241] Repair command #39 finished in 0 seconds
[2017-06-26 04:01:12,250] Starting repair command #40, repairing keyspace fgp with repair options (parallelism: parallel, primary range: false, incremental: true, job threads: 1, ColumnFamilies: [], dataCenters: [], hosts: [], # of ranges: 790)
[2017-06-26 04:01:12,468] Did not get positive replies from all endpoints. List of failed endpoint(s): [10.0.2.13]
[2017-06-26 04:01:12,469] Repair command #40 finished with error
error: Repair job has failed with the error message: [2017-06-26 04:01:12,468] Did not get positive replies from all endpoints. List of failed endpoint(s): [10.0.2.13]
-- StackTrace --
java.lang.RuntimeException: Repair job has failed with the error message: [2017-06-26 04:01:12,468] Did not get positive replies from all endpoints. List of failed endpoint(s): [10.0.2.13]
    at org.apache.cassandra.tools.RepairRunner.progress(RepairRunner.java:115)
    at org.apache.cassandra.utils.progress.jmx.JMXNotificationProgressListener.handleNotification(JMXNotificationProgressListener.java:77)
    at com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.dispatchNotification(ClientNotifForwarder.java:583)
    at com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.doRun(ClientNotifForwarder.java:533)
    at com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.run(ClientNotifForwarder.java:452)
    at com.sun.jmx.remote.internal.ClientNotifForwarder$LinearExecutor$1.run(ClientNotifForwarder.java:108)

On the node that did not provide a positive reply, the logs showed:

INFO 04:01:12 [repair #176c3760-5a24-11e7-868e-97841c634787] Starting anticompaction for system_traces.events on 0/[] sstables
INFO 04:01:12 [repair #176c3760-5a24-11e7-868e-97841c634787] Completed anticompaction successfully
INFO 04:01:12 [repair #176c3760-5a24-11e7-868e-97841c634787] Completed anticompaction successfully
ERROR 04:01:12 Table with id ffcc2ef0-3122-11e7-8f76-b3cac7d588b7 was dropped during prepare phase of repair

I was unable to find documentation which describes this situation or how to recover from this situation. Running a full repair results in the same error.


> No documentation on how to handle error in repair
> -------------------------------------------------
>
> Key: CASSANDRA-13636
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13636
> Project: Cassandra
> Issue Type: Improvement
> Components: Documentation and Website
> Environment: 6 nodes running Apache Cassandra 3.0.13 running in
> docker environment.
> Reporter: David Ryan
> Priority: Minor
>
> After having a node go down and restarted, I ran an incremental repair. It
> exited with the following error:
> {code:java}
> [2017-06-26 04:01:12,241] Repair command #39 finished in 0 seconds
> [2017-06-26 04:01:12,250] Starting repair command #40, repairing keyspace fgp
> with repair options (parallelism: parallel, primary range: false,
> incremental: true, job threads: 1, ColumnFamilies: [], dataCenters: [],
> hosts: [], # of ranges: 790)
> [2017-06-26 04:01:12,468] Did not get positive replies from all endpoints.
> List of failed endpoint(s): [10.0.2.13]
> [2017-06-26 04:01:12,469] Repair command #40 finished with error
> error: Repair job has failed with the error message: [2017-06-26
> 04:01:12,468] Did not get positive replies from all endpoints. List of failed
> endpoint(s): [10.0.2.13]
> – StackTrace –
> java.lang.RuntimeException: Repair job has failed with the error message:
> [2017-06-26 04:01:12,468] Did not get positive replies from all endpoints.
> List of failed endpoint(s): [10.0.2.13]
> at org.apache.cassandra.tools.RepairRunner.progress(RepairRunner.java:115)
> at
> org.apache.cassandra.utils.progress.jmx.JMXNotificationProgressListener.handleNotification(JMXNotificationProgressListener.java:77)
> at
> com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.dispatchNotification(ClientNotifForwarder.java:583)
> at
> com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.doRun(ClientNotifForwarder.java:533)
> at
> com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.run(ClientNotifForwarder.java:452)
> at
> com.sun.jmx.remote.internal.ClientNotifForwarder$LinearExecutor$1.run(ClientNotifForwarder.java:108){code}
> On the node that did not provide a positive reply, the logs showed:
> {code:java}
> INFO 04:01:12 repair #176c3760-5a24-11e7-868e-97841c634787 Starting
> anticompaction for system_traces.events on 0/[] sstables
> INFO 04:01:12 repair #176c3760-5a24-11e7-868e-97841c634787 Completed
> anticompaction successfully
> INFO 04:01:12 repair #176c3760-5a24-11e7-868e-97841c634787 Completed
> anticompaction successfully
> ERROR 04:01:12 Table with id ffcc2ef0-3122-11e7-8f76-b3cac7d588b7 was dropped
> during prepare phase of repair
> {code}
> I was unable to find documentation which describes this situation or how to
> recover from this situation. Running a full repair results in the same error.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)