[ 
https://issues.apache.org/jira/browse/CASSANDRA-7560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067331#comment-14067331
 ] 

Yuki Morishita commented on CASSANDRA-7560:
-------------------------------------------

Thanks.
Can you also attach jstack output from the replica nodes?
I want to check whether validation compaction is still running.
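
A minimal sketch of that check, assuming the thread dump has been saved to a text file and that validation compaction threads carry a name beginning with "ValidationExecutor" (that prefix is an assumption to verify against your Cassandra version; the function name find_threads is hypothetical):

```python
import re

# A thread entry in a jstack dump starts with the quoted thread name,
# followed on a later line by "java.lang.Thread.State: <STATE>".
THREAD_HEADER = re.compile(r'^"(?P<name>[^"]+)"')
STATE_LINE = re.compile(r'^\s+java\.lang\.Thread\.State: (?P<state>\S+)')


def find_threads(dump_text, name_prefix="ValidationExecutor"):
    """Return (thread_name, state) pairs for threads whose name starts with name_prefix.

    name_prefix is an assumed pool name, not taken from the report.
    """
    found = []
    pending = None  # index of the matched thread still waiting for its state line
    for line in dump_text.splitlines():
        header = THREAD_HEADER.match(line)
        if header:
            name = header.group("name")
            if name.startswith(name_prefix):
                found.append((name, "UNKNOWN"))
                pending = len(found) - 1
            else:
                pending = None
            continue
        state = STATE_LINE.match(line)
        if state and pending is not None:
            found[pending] = (found[pending][0], state.group("state"))
            pending = None
    return found
```

The dump itself can be captured on each replica with jstack <pid> redirected to a file. An empty result would suggest that no validation thread is alive on that node, while a RUNNABLE entry would suggest the validation is still in progress.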

> 'nodetool repair -pr' leads to indefinitely hanging AntiEntropySession
> ----------------------------------------------------------------------
>
>                 Key: CASSANDRA-7560
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7560
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Vladimir Avram
>         Attachments: cassandra_daemon.log, nodetool_command.log
>
>
> Running {{nodetool repair -pr}} sometimes hangs on one of the resulting 
> AntiEntropySessions.
> The system logs show the repair command starting:
> {noformat}
>  INFO [Thread-3079] 2014-07-15 02:22:56,514 StorageService.java (line 2569) 
> Starting repair command #1, repairing 256 ranges for keyspace x
> {noformat}
> You can then see a few AntiEntropySessions completing with:
> {noformat}
> INFO [AntiEntropySessions:2] 2014-07-15 02:28:12,766 RepairSession.java (line 
> 282) [repair #eefb3c30-0bc6-11e4-83f7-a378978d0c49] session completed 
> successfully
> {noformat}
> Finally, at some point we reach an AntiEntropySession that hangs just before 
> requesting the merkle trees for the next column family in line for repair. We 
> first see the previous CF finish, and then the whole repair session hangs 
> with no visible progress or errors on this or any of the related nodes:
> {noformat}
> INFO [AntiEntropyStage:1] 2014-07-15 02:38:20,325 RepairSession.java (line 
> 221) [repair #8f85c1b0-0bc8-11e4-83f7-a378978d0c49] previous_cf is fully 
> synced
> {noformat}
> Notes:
> * Single DC 6 node cluster with an average load of 86 GB per node.
> * This appears to be random; it does not always happen on the same CF or on 
> the same session.
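
The symptom in the quoted report (a session whose last log entry is "is fully synced" with no later "session completed" entry) can be looked for mechanically. The following is only a sketch: it assumes each log entry sits on one physical line in the log file (the excerpts above are wrapped by the mailer) and that the message wording matches the excerpts. The function name unfinished_sessions is hypothetical.

```python
import re

# Matches the "[repair #<session id>] <message>" part of a RepairSession log entry.
REPAIR_LINE = re.compile(r"\[repair #(?P<sid>[0-9a-f-]+)\] (?P<msg>.*)")


def unfinished_sessions(log_lines):
    """Map each repair session id without a 'session completed' entry to its last message."""
    last_message = {}
    completed = set()
    for line in log_lines:
        match = REPAIR_LINE.search(line)
        if not match:
            continue
        sid = match.group("sid")
        msg = match.group("msg").strip()
        last_message[sid] = msg
        # Wording taken from the excerpt in the report.
        if msg.startswith("session completed"):
            completed.add(sid)
    return {sid: msg for sid, msg in last_message.items() if sid not in completed}
```

A session that is merely still running will also appear in the result, so the output is a list of candidates to compare against timestamps, not proof of a hang.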



--
This message was sent by Atlassian JIRA
(v6.2#6252)
