If some streaming sessions fail on decommission, decommission hangs
-------------------------------------------------------------------

                 Key: CASSANDRA-3730
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3730
             Project: Cassandra
          Issue Type: Bug
          Components: Core
    Affects Versions: 1.1
         Environment: FreeBSD
            Reporter: Vitalii Tymchyshyn


Currently Cassandra does not handle StreamOutSession failures, e.g.:

    // Instead of just not calling the callback on failure, we could have
    // allow to register a specific callback for failures, but we leave
    // that to a future ticket (likely CASSANDRA-3112)
    if (callback != null && success)
        callback.run();

This means that if, during decommission, a node receiving decommission data fails or (my case) the node that is trying to decommission becomes overloaded, the streaming session fails and the decommission is never told about it. This makes it hard to decommission overloaded nodes, because I have to restart the node in order to restart the decommission.

I also see the following errors, because the file streaming tasks try to get a streaming session that has already been closed by gossip:

ERROR [Streaming to /10.112.0.216:1] 2012-01-11 15:57:28,882 AbstractCassandraDaemon.java (line 138) Fatal exception in thread Thread[Streaming to /10.112.0.216:1,5,main]
java.lang.NullPointerException
    at org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:97)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:679)
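
To illustrate the behaviour this report asks for, here is a minimal sketch of a session that accepts a separate failure callback, so that a caller such as decommission is notified either way instead of waiting forever. This is not Cassandra's actual code: the class name StreamSessionSketch, its constructor, and the close(boolean) method are hypothetical names chosen for the example, and the real fix may look quite different.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a streaming session; not the real StreamOutSession.
class StreamSessionSketch {
    private final Runnable onSuccess; // may be null
    private final Runnable onFailure; // may be null
    private final AtomicBoolean closed = new AtomicBoolean(false);

    StreamSessionSketch(Runnable onSuccess, Runnable onFailure) {
        this.onSuccess = onSuccess;
        this.onFailure = onFailure;
    }

    // Closes the session and notifies the caller of the outcome.
    // Only the first call has any effect, so a late close (for example one
    // triggered by gossip after a failure) cannot fire a second callback.
    void close(boolean success) {
        if (!closed.compareAndSet(false, true))
            return;
        Runnable callback = success ? onSuccess : onFailure;
        if (callback != null)
            callback.run();
    }
}
```

With something along these lines, the decommission code could pass a failure callback that aborts or retries the operation, rather than blocking on a success callback that never arrives.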