[ https://issues.apache.org/jira/browse/CASSANDRA-13608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tania S Engel updated CASSANDRA-13608:
--------------------------------------
    Fix Version/s:     (was: 3.10)

> Connection closed/reopened during join causes Cassandra stream to close
> -----------------------------------------------------------------------
>
>                 Key: CASSANDRA-13608
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13608
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Streaming and Messaging
>         Environment: Cassandra 3.10. Windows Server 2016, 32 GB RAM, 2 TB hard
> disk, RAID10 with 4 spindles, 8 cores
>            Reporter: Tania S Engel
>         Attachments: Cassandra 3.10 Join with lots GC collection leads to
> socket closure and join hang.mht, Cassandra 3.10 Join with lots GC collection
> leads to socket closure and join hang.pdf, Cassandra 3.10 Join with lots GC
> collection leads to socket closure and join hang.txt
>
>
> We start a JOIN bootstrap. The primary seed node streams to the replica. The
> replica requires some GC cleanup and experiences frequent pauses, including a
> 12-second old-gen collection following a memtable flush. Both the replica and
> the primary show _MessagingService IOException: An existing connection was
> forcibly closed by the remote host_. The replica's MessagingService-Outgoing
> reestablishes the connection immediately, but the primary's
> StreamKeepAliveExecutor throws a _java.RuntimeException: Outgoing stream
> handler has been closed_. From that point forward, the replica stays in JOIN
> mode, sending keep-alive messages to the primary. The primary receives the
> keep-alives but does not send its own, and it repeatedly fails to send a hints
> file to the replica. It seems this limping condition would continue
> indefinitely, but it stops when we stop Cassandra on the replica. If we restart
> Cassandra on the replica, the JOIN picks up again but fails with
> _java.io.IOException: Corrupt value length 355151036 encountered, as it exceeds
> the maximum of 268435456, which is set via max_value_size_in_mb in
> cassandra.yaml_.
> We have not increased this value because we do not have values that large in
> our data, so we presume the value is indeed corrupt and that moving past it
> would not be a good idea. Please see the attachment for details.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)