[ https://issues.apache.org/jira/browse/CASSANDRA-8611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14709372#comment-14709372 ]
Eric Lubow commented on CASSANDRA-8611:
---------------------------------------

I've seen streams hang for days on EC2 as well. This is especially problematic when you are trying to add capacity. Typically, if nothing has happened in an hour, the stream is probably hung, and waiting another hour is of little benefit. One thing to keep in mind about a two-hour timeout is that on smaller datasets the stream timeout would be longer than the entire bootstrap of the machine takes. I think it would be safe to bring things down to an hour, which is still very conservative.

> give streaming_socket_timeout_in_ms a non-zero default
> ------------------------------------------------------
>
>                 Key: CASSANDRA-8611
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8611
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jeremy Hanna
>            Assignee: Benjamin Lerer
>
> Sometimes, as mentioned in CASSANDRA-8472, streams will hang. We have
> streaming_socket_timeout_in_ms, which can retry after a timeout. It would be
> good to give it a non-zero default value. We don't want to paper over problems,
> but streams sometimes hang and you don't want long-running streaming
> operations to just fail, as in repairs or bootstraps.
> streaming_socket_timeout_in_ms should be based on the TCP idle timeout, so it
> shouldn't be a problem to set it on the order of minutes. Also, the socket
> should only be open during the actual streaming and not during operations
> such as Merkle tree generation. We can set it to a conservative value and
> people can set it more aggressively as needed. Disabling as a default, in my
> opinion, is too conservative.
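
For reference, the one-hour value suggested in the comment could be written in cassandra.yaml roughly as follows. This is only a sketch: the figure is the commenter's proposal converted to milliseconds, not a confirmed default, and the file layout around it is assumed.

```yaml
# cassandra.yaml (fragment)
# Assumed value: one hour, as proposed in the comment (60 * 60 * 1000 ms).
# The two-hour alternative under discussion would be 7200000.
# A value of 0 corresponds to the disabled default the ticket describes.
streaming_socket_timeout_in_ms: 3600000
```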