[ https://issues.apache.org/jira/browse/CASSANDRA-11845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15291536#comment-15291536 ]
Paulo Motta commented on CASSANDRA-11845:
-----------------------------------------

[~vin01] can you check the output of {{nodetool compactionstats}} on the receiving node, and check whether secondary indexes are being rebuilt?

> Hanging repair in cassandra 2.2.4
> ---------------------------------
>
>                 Key: CASSANDRA-11845
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11845
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Streaming and Messaging
>         Environment: Centos 6
>            Reporter: vin01
>            Priority: Minor
>
> So after increasing the streaming_timeout_in_ms value to 3 hours, I was able
> to avoid the SocketTimeout errors I was getting earlier
> (https://issues.apache.org/jira/browse/CASSANDRA-11826), but now the issue
> is that the repair just stays stuck.
> Current status:
> [2016-05-19 05:52:50,835] Repair session a0e590e1-1d99-11e6-9d63-b717b380ffdd for range (-3309358208555432808,-3279958773585646585] finished (progress: 54%)
> [2016-05-19 05:53:09,446] Repair session a0e590e3-1d99-11e6-9d63-b717b380ffdd for range (8149151263857514385,8181801084802729407] finished (progress: 55%)
> [2016-05-19 05:53:13,808] Repair session a0e5b7f1-1d99-11e6-9d63-b717b380ffdd for range (3372779397996730299,3381236471688156773] finished (progress: 55%)
> [2016-05-19 05:53:27,543] Repair session a0e5b7f3-1d99-11e6-9d63-b717b380ffdd for range (-4182952858113330342,-4157904914928848809] finished (progress: 55%)
> [2016-05-19 05:53:41,128] Repair session a0e5df00-1d99-11e6-9d63-b717b380ffdd for range (6499366179019889198,6523760493740195344] finished (progress: 55%)
> It is now 10:46:25, almost 5 hours since it got stuck right there.
> Earlier I could see the repair sessions in system.log, but no logs are coming
> in right now; all I get in the logs are the regular index summary
> redistribution messages.
> Last repair-related logs I saw:
> INFO [RepairJobTask:5] 2016-05-19 05:53:41,125 RepairJob.java:152 - [repair #a0e5df00-1d99-11e6-9d63-b717b380ffdd] TABLE_NAME is fully synced
> INFO [RepairJobTask:5] 2016-05-19 05:53:41,126 RepairSession.java:279 - [repair #a0e5df00-1d99-11e6-9d63-b717b380ffdd] Session completed successfully
> INFO [RepairJobTask:5] 2016-05-19 05:53:41,126 RepairRunnable.java:232 - Repair session a0e5df00-1d99-11e6-9d63-b717b380ffdd for range (6499366179019889198,6523760493740195344] finished
> It's an incremental repair, and in "nodetool netstats" output I can see entries like:
> Repair e3055fb0-1d9d-11e6-9d63-b717b380ffdd
>     /Node-2
>         Receiving 8 files, 1093461 bytes total. Already received 8 files, 1093461 bytes total
>             /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-80872-big-Data.db 399475/399475 bytes(100%) received from idx:0/Node-2
>             /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-80879-big-Data.db 53809/53809 bytes(100%) received from idx:0/Node-2
>             /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-80878-big-Data.db 89955/89955 bytes(100%) received from idx:0/Node-2
>             /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-80881-big-Data.db 168790/168790 bytes(100%) received from idx:0/Node-2
>             /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-80886-big-Data.db 107785/107785 bytes(100%) received from idx:0/Node-2
>             /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-80880-big-Data.db 52889/52889 bytes(100%) received from idx:0/Node-2
>             /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-80884-big-Data.db 148882/148882 bytes(100%) received from idx:0/Node-2
>             /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-80883-big-Data.db 71876/71876 bytes(100%) received from idx:0/Node-2
>         Sending 5 files, 863321 bytes total. Already sent 5 files, 863321 bytes total
>             /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/la-73168-big-Data.db 161895/161895 bytes(100%) sent to idx:0/Node-2
>             /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/la-72604-big-Data.db 399865/399865 bytes(100%) sent to idx:0/Node-2
>             /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/la-73147-big-Data.db 149066/149066 bytes(100%) sent to idx:0/Node-2
>             /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/la-72682-big-Data.db 126000/126000 bytes(100%) sent to idx:0/Node-2
>             /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/la-73173-big-Data.db 26495/26495 bytes(100%) sent to idx:0/Node-2
> Repair c0c8af20-1d9c-11e6-9d63-b717b380ffdd
>     /Node-3
>         Receiving 11 files, 13896288 bytes total. Already received 11 files, 13896288 bytes total
>             /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-79186-big-Data.db 1598874/1598874 bytes(100%) received from idx:0/Node-3
>             /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-79196-big-Data.db 736365/736365 bytes(100%) received from idx:0/Node-3
>             /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-79197-big-Data.db 326558/326558 bytes(100%) received from idx:0/Node-3
>             /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-79187-big-Data.db 1484827/1484827 bytes(100%) received from idx:0/Node-3
>             /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-79180-big-Data.db 393636/393636 bytes(100%) received from idx:0/Node-3
>             /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-79184-big-Data.db 825459/825459 bytes(100%) received from idx:0/Node-3
>             /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-79188-big-Data.db 3568782/3568782 bytes(100%) received from idx:0/Node-3
>             /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-79182-big-Data.db 271222/271222 bytes(100%) received from idx:0/Node-3
>             /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-79193-big-Data.db 4315497/4315497 bytes(100%) received from idx:0/Node-3
>             /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-79183-big-Data.db 19775/19775 bytes(100%) received from idx:0/Node-3
>             /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-79192-big-Data.db 355293/355293 bytes(100%) received from idx:0/Node-3
>         Sending 5 files, 9444101 bytes total. Already sent 5 files, 9444101 bytes total
>             /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/la-73168-big-Data.db 1796825/1796825 bytes(100%) sent to idx:0/Node-3
>             /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/la-72604-big-Data.db 4549996/4549996 bytes(100%) sent to idx:0/Node-3
>             /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/la-73147-big-Data.db 1658881/1658881 bytes(100%) sent to idx:0/Node-3
>             /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/la-72682-big-Data.db 1418335/1418335 bytes(100%) sent to idx:0/Node-3
>             /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/la-73173-big-Data.db 20064/20064 bytes(100%) sent to idx:0/Node-3
> Read Repair Statistics:
> Attempted: 1142
> Mismatch (Blocking): 0
> Mismatch (Background): 0
> Pool Name                    Active   Pending      Completed
> Large messages                  n/a         0            779
> Small messages                  n/a         0       14756609
> Gossip messages                 n/a         0         119647
> The last three rows ("Large messages", "Small messages" and "Gossip
> messages") keep changing: "Large messages" has incremented by 2 in the last
> 5 hours, and the other two change more frequently.
> I am unable to figure out whether the repair is still making progress or is
> stuck. If it's stuck, what should my course of action be to get that table
> repaired?

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
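Since the netstats byte counters are the only visible signal here, one way to distinguish a stalled stream from a merely slow one is to capture two {{nodetool netstats}} snapshots a few minutes apart and diff the per-file counters. A minimal sketch of that idea (the parsing regex is an assumption based on the output format quoted above, not a stable nodetool interface, and {{is_progressing}} is a hypothetical helper name):

```python
import re

def transfer_counters(netstats_output):
    """Map each streamed -Data.db file to its transferred-byte counter,
    parsed from lines like:
      .../tmp-la-80872-big-Data.db 399475/399475 bytes(100%) received from idx:0/Node-2
    """
    return {m.group(1): int(m.group(2))
            for m in re.finditer(r"(\S+-Data\.db)\s+(\d+)/(\d+)\s+bytes",
                                 netstats_output)}

def is_progressing(snapshot_a, snapshot_b):
    """True if any file's counter advanced, or a new file appeared,
    between two netstats snapshots taken some minutes apart."""
    a = transfer_counters(snapshot_a)
    b = transfer_counters(snapshot_b)
    return any(b.get(f, n) > n for f, n in a.items()) or bool(set(b) - set(a))

# Illustrative snapshots (shortened paths, made-up byte counts):
earlier = "/data/.../tmp-la-80872-big-Data.db 399475/500000 bytes(79%) received from idx:0/Node-2"
later   = "/data/.../tmp-la-80872-big-Data.db 450000/500000 bytes(90%) received from idx:0/Node-2"
print(is_progressing(earlier, later))  # True: the byte counter advanced
```

If repeated snapshots show every counter pinned at 100% with nothing new appearing, the streams themselves have finished and the hang is elsewhere (which is why checking {{nodetool compactionstats}} for post-stream work such as secondary index rebuilds is the natural next step).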