We have restarted the nodes in the cluster, and that doesn’t seem to help at all.

We ran repair separately for each table, and that usually seems to go through,
but running a repair on the whole keyspace doesn’t.
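
For anyone who wants to reproduce the per-table workaround, here is a minimal sketch. It assumes `nodetool` is on the PATH and that the `--full -pr` options from the original report are wanted. The helper names `repair_commands` and `run_repairs`, and the table name `othertable`, are made up for illustration; only `mykeyspace` and `mytable` come from the logs.

```python
import subprocess
from typing import List


def repair_commands(keyspace: str, tables: List[str]) -> List[List[str]]:
    """Build one `nodetool repair` command per table (hypothetical helper)."""
    return [
        ["nodetool", "repair", "--full", "-pr", keyspace, table]
        for table in tables
    ]


def run_repairs(keyspace: str, tables: List[str], dry_run: bool = True) -> List[str]:
    """Run the repairs one table at a time; return the tables that failed.

    With dry_run=True (the default) the commands are only printed,
    so nothing is executed against the cluster.
    """
    failed = []
    for cmd in repair_commands(keyspace, tables):
        if dry_run:
            print(" ".join(cmd))
            continue
        result = subprocess.run(cmd)
        if result.returncode != 0:
            failed.append(cmd[-1])  # last element is the table name
    return failed


if __name__ == "__main__":
    # "othertable" is a placeholder, not a table from the original report.
    run_repairs("mykeyspace", ["mytable", "othertable"])
```

Running tables one at a time like this also makes it easier to see which table, if any, makes the keyspace-wide repair fail.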

Anything anyone?

Hannu
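
The debug.log excerpt quoted further down shows the flush and the merkle tree error only a millisecond apart. A rough sketch for measuring that gap on a larger log could look like the following. It assumes each log entry has been rejoined onto a single line (the mail client wrapped them), and the names `parse_line` and `ms_from_flush_to_error` are invented for this example.

```python
import re
from datetime import datetime, timedelta

# Matches lines such as:
# DEBUG [ValidationExecutor:4] 2018-01-02 17:52:12,531 StorageService.java:3321 - ...
LINE_RE = re.compile(
    r"^(?P<level>[A-Z]+)\s+\[(?P<thread>[^\]]+)\]\s+"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+(?P<rest>.*)$"
)


def parse_line(line):
    """Return (level, thread, timestamp, rest), or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S,%f")
    return m.group("level"), m.group("thread"), ts, m.group("rest")


def ms_from_flush_to_error(lines):
    """Whole milliseconds between the first 'Forcing flush' and the next ERROR line."""
    flush_ts = None
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue  # stack trace lines and separators are skipped
        level, _thread, ts, rest = parsed
        if flush_ts is None and "Forcing flush" in rest:
            flush_ts = ts
        elif flush_ts is not None and level == "ERROR":
            return (ts - flush_ts) // timedelta(milliseconds=1)
    return None


if __name__ == "__main__":
    # Two entries taken from the quoted log; the ERROR message text is shortened.
    sample = [
        "DEBUG [ValidationExecutor:4] 2018-01-02 17:52:12,531 "
        "StorageService.java:3321 - Forcing flush on keyspace mykeyspace, CF mytable",
        "ERROR [ValidationExecutor:4] 2018-01-02 17:52:12,532 "
        "Validator.java:268 - Failed creating a merkle tree",
    ]
    print(ms_from_flush_to_error(sample))  # prints 1
```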

> On 3 Jan 2018, at 23:24, Hannu Kröger <hkro...@gmail.com> wrote:
> 
> I can certainly try that. No problem there.
> 
> However, wouldn’t we then get this kind of error if that were the case:
> java.lang.RuntimeException: Cannot start multiple repair sessions over the 
> same sstables
> ?
> 
> Hannu
> 
>> On 3 Jan 2018, at 20:50, Nandakishore Tokala <nandakishore.tok...@gmail.com> wrote:
>> 
>> hi Hannu,
>> 
>> I think some of the repairs are hanging there. Please restart all the nodes 
>> in the cluster and then start the repair.
>> 
>> 
>> Thanks
>> Nanda
>> 
>> On Wed, Jan 3, 2018 at 9:35 AM, Hannu Kröger <hkro...@gmail.com> wrote:
>> Additional notes:
>> 
>> 1) If I run the repair just on those tables, it works fine
>> 2) Those tables are empty
>> 
>> Hannu
>> 
>> > On 3 Jan 2018, at 18:23, Hannu Kröger <hkro...@gmail.com> wrote:
>> >
>> > Hello,
>> >
>> > The situation is as follows:
>> >
>> > A repair was started on node X on this keyspace with --full -pr. The repair 
>> > fails on node Y.
>> >
>> > Node Y has debug logging on (DEBUG on org.apache.cassandra) and I’m 
>> > looking at the debug.log. I see the following messages related to this 
>> > repair request:
>> >
>> > -----------
>> > DEBUG [AntiEntropyStage:1] 2018-01-02 17:52:12,530 
>> > RepairMessageVerbHandler.java:114 - Validating 
>> > ValidationRequest{gcBefore=1511473932} 
>> > org.apache.cassandra.repair.messages.ValidationRequest@5a17430c
>> > DEBUG [ValidationExecutor:4] 2018-01-02 17:52:12,531 
>> > StorageService.java:3321 - Forcing flush on keyspace mykeyspace, CF mytable
>> > DEBUG [MemtablePostFlush:54] 2018-01-02 17:52:12,531 
>> > ColumnFamilyStore.java:954 - forceFlush requested but everything is clean 
>> > in mytable
>> > ERROR [ValidationExecutor:4] 2018-01-02 17:52:12,532 Validator.java:268 - 
>> > Failed creating a merkle tree for [repair 
>> > #1df000a0-effa-11e7-8361-b7c9edfbfc33 on mykeyspace/mytable, 
>> > [(6917529027641081856,-9223372036854775808]]], /123.123.123.123 (see log for details)
>> > -----------
>> >
>> > Then the same appears for another table, and after that come the messages 
>> > below, which indicate that the repair “master” has basically told this node 
>> > to abort, right?
>> >
>> > -----------
>> > DEBUG [AntiEntropyStage:1] 2018-01-02 17:52:12,563 
>> > RepairMessageVerbHandler.java:142 - Got anticompaction request 
>> > AnticompactionRequest{parentRepairSession=1de949e0-effa-11e7-8361-b7c9edfbfc33}
>> >  org.apache.cassandra.repair.messages.AnticompactionRequest@5dc8beea
>> > ERROR [AntiEntropyStage:1] 2018-01-02 17:52:12,563 
>> > RepairMessageVerbHandler.java:168 - Got error, removing parent repair 
>> > session
>> > ERROR [AntiEntropyStage:1] 2018-01-02 17:52:12,564 
>> > CassandraDaemon.java:228 - Exception in thread 
>> > Thread[AntiEntropyStage:1,5,main]
>> > java.lang.RuntimeException: java.lang.RuntimeException: Parent repair 
>> > session with id = 1de949e0-effa-11e7-8361-b7c9edfbfc33 has failed.
>> >        at 
>> > org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:171)
>> >  ~[apache-cassandra-3.11.0.jar:3.11.0]
>> >        at 
>> > org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:66)
>> >  ~[apache-cassandra-3.11.0.jar:3.11.0]
>> >        at 
>> > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
>> > ~[na:1.8.0_111]
>> >        at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
>> > ~[na:1.8.0_111]
>> >        at 
>> > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>> >  ~[na:1.8.0_111]
>> >        at 
>> > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>> >  [na:1.8.0_111]
>> >        at 
>> > org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81)
>> >  [apache-cassandra-3.11.0.jar:3.11.0]
>> >        at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_111]
>> > Caused by: java.lang.RuntimeException: Parent repair session with id = 
>> > 1de949e0-effa-11e7-8361-b7c9edfbfc33 has failed.
>> >        at 
>> > org.apache.cassandra.service.ActiveRepairService.getParentRepairSession(ActiveRepairService.java:409)
>> >  ~[apache-cassandra-3.11.0.jar:3.11.0]
>> >        at 
>> > org.apache.cassandra.service.ActiveRepairService.doAntiCompaction(ActiveRepairService.java:444)
>> >  ~[apache-cassandra-3.11.0.jar:3.11.0]
>> >        at 
>> > org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:143)
>> >  ~[apache-cassandra-3.11.0.jar:3.11.0]
>> >        ... 7 common frames omitted
>> > -----------
>> >
>> > But that is almost everything in the log, and I don’t really see what the 
>> > original problem is.
>> >
>> > Cassandra flushes the table to start building the merkle tree, and in the 
>> > next millisecond it already fails the repair, but without a proper 
>> > exception or error logging about the problem.
>> >
>> > Cassandra version is 3.11.0.
>> >
>> > Any ideas?
>> >
>> > Cheers,
>> > Hannu
>> 
>> 
>> -- 
>> Thanks & Regards,
>> Nanda Kishore
> 
