[jira] [Assigned] (CASSANDRA-5769) Not all STATUS_CHANGE UP events reported via the native protocol
[ https://issues.apache.org/jira/browse/CASSANDRA-5769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis reassigned CASSANDRA-5769:

    Assignee: Sylvain Lebresne  (was: Tyler Hobbs)

> Not all STATUS_CHANGE UP events reported via the native protocol
>
>                 Key: CASSANDRA-5769
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5769
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.2.5, 1.2.6
>         Environment: Ubuntu 12.04, x86, 64 bit
>            Reporter: Duncan Sands
>            Assignee: Sylvain Lebresne
>            Priority: Minor
>
> Not all gossip UP events are pushed to native protocol users who have
> registered for them. This seems to be a native protocol issue because nodes
> themselves get the UP event (as seen in their logs). I can consistently
> reproduce this issue as follows:
> 1) connect a client to a cluster node ("node1") using the native protocol,
> register for TOPOLOGY_CHANGE and STATUS_CHANGE events. (Probably you only
> need to register for STATUS_CHANGE to see this, however my client registers
> for both).
> 2) on another node ("node2"), send SIGSTOP to the Cassandra process.
> 3) after about 30 seconds the client gets pushed a STATUS_CHANGE DOWN event
> for the stopped node.
> 4) on node2, send SIGCONT to the Cassandra process.
> 5) wait forever to get a STATUS_CHANGE UP event. This is the failure: no event
> is ever received.
> Observe that node1 does know that node2 is back up: in its system log I see
> for example
> INFO [GossipStage:1] 2013-07-17 14:27:41,238 Gossiper.java (line 771) InetAddress /172.18.34.169 is now UP
> shortly after sending SIGCONT to the stopped process.
>
> To eliminate the possibility that my client is at fault, I performed the
> following sanity check:
> 2') on node2, stopped Cassandra nicely using: sudo service cassandra stop
> 4') on node2, restarted Cassandra using: sudo service cassandra start
> In this case the client soon after gets a STATUS_CHANGE DOWN event followed
> by a STATUS_CHANGE UP event for node2.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
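
For anyone scripting steps 2 and 4 of the report, here is a minimal Python sketch of freezing and resuming a process with SIGSTOP and SIGCONT. The helper names and the 40-second default are assumptions made for illustration (the report only says the DOWN event arrives after about 30 seconds), and the PID of the Cassandra process on node2 has to be looked up separately.

```python
import os
import signal
import time


def pause_process(pid: int) -> None:
    """Freeze the process in place (step 2); it gets no chance to shut down cleanly."""
    os.kill(pid, signal.SIGSTOP)


def resume_process(pid: int) -> None:
    """Let the frozen process continue where it left off (step 4)."""
    os.kill(pid, signal.SIGCONT)


def pause_then_resume(pid: int, pause_seconds: float = 40.0) -> None:
    """Keep the process stopped long enough for its peers to mark it DOWN.

    The default of 40 seconds is an assumption chosen to exceed the
    "about 30 seconds" mentioned in the report.
    """
    pause_process(pid)
    time.sleep(pause_seconds)
    resume_process(pid)
```

Calling pause_then_resume(cassandra_pid) on node2, where cassandra_pid is a placeholder for the real process ID, should be equivalent to sending the two signals by hand. It needs permission to signal the target process.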
[jira] [Assigned] (CASSANDRA-5769) Not all STATUS_CHANGE UP events reported via the native protocol
[ https://issues.apache.org/jira/browse/CASSANDRA-5769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis reassigned CASSANDRA-5769:

    Assignee: Tyler Hobbs

Tyler, can you reproduce?
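
To reproduce without a full driver, step 1 of the report comes down to sending a REGISTER frame over the native protocol connection. The Python sketch below builds such a frame. It assumes version 1 of the native protocol (the version spoken by Cassandra 1.2) as I read its specification: an 8-byte header of version, flags, stream, opcode and body length, followed by a string list of event types. The function names are hypothetical. The sketch only builds the bytes; a real client must first complete the STARTUP exchange and then read the EVENT frames the server pushes.

```python
import struct

PROTOCOL_V1_REQUEST = 0x01  # version byte for requests in protocol v1 (assumed)
OPCODE_REGISTER = 0x0B      # REGISTER opcode in protocol v1 (assumed)


def encode_string(text: str) -> bytes:
    """Encode a [string]: 2-byte big-endian length, then UTF-8 bytes."""
    data = text.encode("utf-8")
    return struct.pack(">H", len(data)) + data


def encode_string_list(items) -> bytes:
    """Encode a [string list]: 2-byte count, then each [string]."""
    return struct.pack(">H", len(items)) + b"".join(encode_string(i) for i in items)


def build_register_frame(event_types, stream_id: int = 1) -> bytes:
    """Build a REGISTER request frame for the given event types."""
    body = encode_string_list(event_types)
    # Header: version, flags (none set), stream id (signed byte), opcode, body length.
    header = struct.pack(
        ">BBbBi", PROTOCOL_V1_REQUEST, 0x00, stream_id, OPCODE_REGISTER, len(body)
    )
    return header + body


frame = build_register_frame(["TOPOLOGY_CHANGE", "STATUS_CHANGE"])
```

Registering for both event types mirrors what the reporter's client does; passing only ["STATUS_CHANGE"] would test the guess that a single registration is enough to show the problem.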