Hello,

The situation is as follows:
Repair was started on node X on this keyspace with --full -pr. The repair fails on node Y. Node Y has debug logging enabled (DEBUG on org.apache.cassandra), and I’m looking at debug.log. I see the following messages related to this repair request:

-----------
DEBUG [AntiEntropyStage:1] 2018-01-02 17:52:12,530 RepairMessageVerbHandler.java:114 - Validating ValidationRequest{gcBefore=1511473932} org.apache.cassandra.repair.messages.ValidationRequest@5a17430c
DEBUG [ValidationExecutor:4] 2018-01-02 17:52:12,531 StorageService.java:3321 - Forcing flush on keyspace mykeyspace, CF mytable
DEBUG [MemtablePostFlush:54] 2018-01-02 17:52:12,531 ColumnFamilyStore.java:954 - forceFlush requested but everything is clean in mytable
ERROR [ValidationExecutor:4] 2018-01-02 17:52:12,532 Validator.java:268 - Failed creating a merkle tree for [repair #1df000a0-effa-11e7-8361-b7c9edfbfc33 on mykeyspace/mytable, [(6917529027641081856,-9223372036854775808]]], /123.123.123.123 (see log for details)
-----------

Then the same appears for another table, and after that comes the following, which basically indicates that the repair “master” has told this node to abort, right?

-----------
DEBUG [AntiEntropyStage:1] 2018-01-02 17:52:12,563 RepairMessageVerbHandler.java:142 - Got anticompaction request AnticompactionRequest{parentRepairSession=1de949e0-effa-11e7-8361-b7c9edfbfc33} org.apache.cassandra.repair.messages.AnticompactionRequest@5dc8beea
ERROR [AntiEntropyStage:1] 2018-01-02 17:52:12,563 RepairMessageVerbHandler.java:168 - Got error, removing parent repair session
ERROR [AntiEntropyStage:1] 2018-01-02 17:52:12,564 CassandraDaemon.java:228 - Exception in thread Thread[AntiEntropyStage:1,5,main]
java.lang.RuntimeException: java.lang.RuntimeException: Parent repair session with id = 1de949e0-effa-11e7-8361-b7c9edfbfc33 has failed.
    at org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:171) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:66) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_111]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_111]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_111]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_111]
    at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81) [apache-cassandra-3.11.0.jar:3.11.0]
    at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_111]
Caused by: java.lang.RuntimeException: Parent repair session with id = 1de949e0-effa-11e7-8361-b7c9edfbfc33 has failed.
    at org.apache.cassandra.service.ActiveRepairService.getParentRepairSession(ActiveRepairService.java:409) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.service.ActiveRepairService.doAntiCompaction(ActiveRepairService.java:444) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:143) ~[apache-cassandra-3.11.0.jar:3.11.0]
    ... 7 common frames omitted
-----------

But that is almost everything in the log, and I don’t really see what the original problem is. Cassandra flushes the table to start building the merkle tree, and in the next millisecond it already fails the repair, without logging a proper exception or error that explains the problem.

The Cassandra version is 3.11.0.

Any ideas?

Cheers,
Hannu