[ https://issues.apache.org/jira/browse/CASSANDRA-12280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Benjamin Roth resolved CASSANDRA-12280. --------------------------------------- Resolution: Not A Bug Reproduced In: 3.7, 3.0.8, 3.9 (was: 3.0.8, 3.7, 3.9) Seems like this was a configuration and/or kernel issue. For all those who are interested what I did to fix that issue: - Set tcp keepalive to these values: https://docs.datastax.com/en/cassandra/2.0/cassandra/troubleshooting/trblshootIdleFirewall.html. Before the total timeout was at about 15min. 90s as in the article fails faster. Maybe that helps that things to resume faster causing less hangs. - otc_coalescing_strategy = DISABLED. Don't know if it had a lot of impact but it had no negative effects. I should double test that but ... I'm glad it works right now. - Removed a suspicious host. That was the only host with a different kernel and it seemed that streams from / to this host hung more often than others. After these changes I did not recognize any (obvious and noticeable) hangs any more. > nodetool repair hangs > --------------------- > > Key: CASSANDRA-12280 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12280 > Project: Cassandra > Issue Type: Bug > Reporter: Benjamin Roth > > nodetool repair hangs when repairing a keyspace, does not hang when > repairting table/mv by table/mv. > Command executed (both variants make it hang): > nodetool repair likes like dislike_by_source_mv like_by_contact_mv > match_valid_mv like_out dislike match match_by_contact_mv like_valid_mv > like_out_by_source_mv > OR > nodetool repair likes > Logs: > https://gist.github.com/brstgt/bf8b20fa1942d29ab60926ede7340b75 > Nodetool output: > https://gist.github.com/brstgt/3aa73662da4b0190630ac1aad6c90a6f > Schema: > https://gist.github.com/brstgt/3fd59e0166f86f8065085532e3638097 -- This message was sent by Atlassian JIRA (v6.3.4#6332)