[jira] [Commented] (CASSANDRA-9584) Decommissioning a node on Windows sends the wrong schema change event
[ https://issues.apache.org/jira/browse/CASSANDRA-9584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14588770#comment-14588770 ]

Kishan Karunaratne commented on CASSANDRA-9584:

This turned out to be a CCM bug, where decommission would fail silently: https://github.com/pcmanus/ccm/issues/307

Decommissioning a node on Windows sends the wrong schema change event
---------------------------------------------------------------------

Key: CASSANDRA-9584
URL: https://issues.apache.org/jira/browse/CASSANDRA-9584
Project: Cassandra
Issue Type: Bug
Environment: C* 2.2.0-rc1 | python-driver 2.6.0-rc1 | Windows Server 2012 R2 64-bit
Reporter: Kishan Karunaratne
Assignee: Joshua McKenzie
Fix For: 2.2.x

Decommissioning a node on Windows sends the wrong schema change event:

{noformat}
cassandra.connection: DEBUG: Message pushed from server: EventMessage(event_type=u'STATUS_CHANGE', trace_id=None, event_args={'change_type': u'DOWN', 'address': ('127.0.0.2', 9042)}, stream_id=-1)
{noformat}

On Linux I get the correct event:

{noformat}
cassandra.connection: DEBUG: Message pushed from server: EventMessage(event_type=u'TOPOLOGY_CHANGE', trace_id=None, event_args={'change_type': u'REMOVED_NODE', 'address': ('127.0.0.2', 9042)}, stream_id=-1)
{noformat}

We are using ccmlib node.py.decommission() which calls nodetool decommission:

{noformat}
def decommission(self):
    self.nodetool("decommission")
    self.status = Status.DECOMMISIONNED
    self._update_config()
{noformat}

Interestingly, it does seem to work (correctly?) on CCM CLI:

{noformat}
PS C:\Users\Administrator> ccm status
Cluster: '2.2'
--
node1: UP
node3: UP
node2: UP

PS C:\Users\Administrator> ccm node1 ring
Starting NodeTool
Datacenter: datacenter1
==
Address    Rack   Status  State   Load       Owns  Token
                                                   3074457345618258602
127.0.0.1  rack1  Up      Normal  62.43 KB   ?     -9223372036854775808
127.0.0.2  rack1  Up      Normal  104.87 KB  ?     -3074457345618258603
127.0.0.3  rack1  Up      Normal  83.67 KB   ?     3074457345618258602

Note: Non-system keyspaces don't have the same replication settings, effective ownership information is meaningless

PS C:\Users\Administrator> ccm node2 decommission
PS C:\Users\Administrator> ccm status
Cluster: '2.2'
--
node1: UP
node3: UP
node2: DECOMMISIONNED

PS C:\Users\Administrator> ccm node1 ring
Starting NodeTool
Datacenter: datacenter1
==
Address    Rack   Status  State   Load      Owns  Token
                                                  3074457345618258602
127.0.0.1  rack1  Up      Normal  67.11 KB  ?     -9223372036854775808
127.0.0.3  rack1  Up      Normal  88.35 KB  ?     3074457345618258602
{noformat}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
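A wrapper that ignores a failed nodetool call will still mark the node as decommissioned, which is consistent with the silent failure described in this comment. The following is a minimal sketch of a stricter wrapper, assuming Python 3.5+ and the standard `subprocess` module. `CommandFailed`, `run_checked`, and the `mark_decommissioned` callback are hypothetical names used for illustration; this is not ccm's API and not the actual fix from the linked ccm issue.

```python
import subprocess
import sys


class CommandFailed(RuntimeError):
    """Raised when an external command exits with a non-zero status."""


def run_checked(argv, timeout=None):
    """Run argv and return its stdout; raise CommandFailed on a non-zero exit."""
    proc = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
        timeout=timeout,
    )
    if proc.returncode != 0:
        raise CommandFailed(
            "%r exited with %d: %s" % (argv, proc.returncode, proc.stderr.strip())
        )
    return proc.stdout


def decommission(nodetool_argv, mark_decommissioned):
    """Update local status only after the decommission command succeeded.

    nodetool_argv is the (assumed) command prefix, e.g. the nodetool
    executable plus host and port options; mark_decommissioned is a
    hypothetical callback that records the new node status.
    """
    run_checked(list(nodetool_argv) + ["decommission"])
    mark_decommissioned()
```

With this shape, a failing command raises before the status is touched, so the caller sees the error instead of a node that is only recorded as decommissioned.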
[jira] [Commented] (CASSANDRA-9584) Decommissioning a node on Windows sends the wrong schema change event
[ https://issues.apache.org/jira/browse/CASSANDRA-9584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14586265#comment-14586265 ]

Joshua McKenzie commented on CASSANDRA-9584:

I am unable to reproduce this locally on win8, on 2.2.0-rc1, 2.2-HEAD, or trunk, using 2.6.0c1:

{noformat}
C:\src\python-driver>git branch
* (detached from 2.6.0c1)

C:\src\decomTest>grep Event *.out
2.2-rc.out:Message pushed from server: EventMessage(event_type=u'TOPOLOGY_CHANGE', trace_id=None, event_args={'change_type': u'REMOVED_NODE', 'address': ('127.0.0.2', 9042)}, stream_id=-1)
2.2.out:Message pushed from server: EventMessage(event_type=u'TOPOLOGY_CHANGE', trace_id=None, event_args={'change_type': u'REMOVED_NODE', 'address': ('127.0.0.2', 9042)}, stream_id=-1)
trunk.out:Message pushed from server: EventMessage(event_type=u'TOPOLOGY_CHANGE', trace_id=None, event_args={'change_type': u'REMOVED_NODE', 'address': ('127.0.0.2', 9042)}, stream_id=-1)
{noformat}

The only outstanding possibility is that we somehow send the wrong message on server12 where we send the correct one on win8, but I'm *very* skeptical of that, both because a) the message generation logic here has nothing to do with the underlying OS, to my knowledge, and b) server12 and win8 share a large portion of their libs and kernel logic.

[~kishkaru]: is there a possibility there's something else off in your testing environment?
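The comparison in this thread comes down to which event type and change type appear in the driver's debug log. As a rough illustration, the sketch below extracts both fields from a logged `EventMessage(...)` line with a regular expression. It only parses the textual log format quoted in this thread; `parse_event` and `is_removed_node_event` are hypothetical helper names, not part of the python-driver.

```python
import re

# Matches the repr-style log text quoted in this thread, e.g.
# EventMessage(event_type=u'TOPOLOGY_CHANGE', ..., {'change_type': u'REMOVED_NODE', ...})
EVENT_RE = re.compile(
    r"event_type=u?'(?P<event_type>\w+)'"
    r".*?'change_type': u?'(?P<change_type>\w+)'"
)


def parse_event(line):
    """Return (event_type, change_type) for a logged event line, or None."""
    match = EVENT_RE.search(line)
    if match is None:
        return None
    return match.group("event_type"), match.group("change_type")


def is_removed_node_event(line):
    """True if the line reports the event expected after a decommission."""
    return parse_event(line) == ("TOPOLOGY_CHANGE", "REMOVED_NODE")
```

Running `is_removed_node_event` over each line of the captured `.out` files would give the same answer as the `grep Event` check, but as a boolean that a test can assert on.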
[jira] [Commented] (CASSANDRA-9584) Decommissioning a node on Windows sends the wrong schema change event
[ https://issues.apache.org/jira/browse/CASSANDRA-9584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14586373#comment-14586373 ]

Joshua McKenzie commented on CASSANDRA-9584:

Tested on Server 2012 and I'm seeing the correct event (note: I have gnuwin32 tools installed, hence grep above and here):

{noformat}
jmckenzie@WIN-PERF01 c:\src\decomTest>grep Event 2.2.0-rc1.out
Message pushed from server: EventMessage(event_type=u'TOPOLOGY_CHANGE', trace_id=None, event_args={'change_type': u'REMOVED_NODE', 'address': ('127.0.0.2', 9042)}, stream_id=-1)
{noformat}
[jira] [Commented] (CASSANDRA-9584) Decommissioning a node on Windows sends the wrong schema change event
[ https://issues.apache.org/jira/browse/CASSANDRA-9584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14586540#comment-14586540 ]

Joshua McKenzie commented on CASSANDRA-9584:

Can further confirm that the failing long test that prompted this ticket also passes for me locally:

{noformat}
test_token_aware (tests.integration.long.test_loadbalancingpolicies.LoadBalancingPolicyTests) ...
Started: node1 with pid: 6340
Started: node3 with pid: 3536
Started: node2 with pid: 5396
SUCCESS: The process with PID 5396 has been terminated.
Creating session
Started: node2 with pid: 6396
SUCCESS: The process with PID 6396 has been terminated.
Started: node2 with pid: 4140
SUCCESS: The process with PID 4140 has been terminated.
SUCCESS: The process with PID 6340 has been terminated.
SUCCESS: The process with PID 3536 has been terminated.
ok
--
Ran 1 test in 211.360s

OK
{noformat}

I did have to make some modifications to the test as it's hard-coded expecting the tokens to be assigned to node 2 and mine were going to node 3, but after inverting the node2/node3 checks the test passes. Will try it on the Server2012 perf machine as well to confirm.

This is with ccm master and cassandra-2.2.0-rc1. If it passes on the server as well I'm going to close this as cannot reproduce, as it looks like something's up with the testing environment.
[jira] [Commented] (CASSANDRA-9584) Decommissioning a node on Windows sends the wrong schema change event
[ https://issues.apache.org/jira/browse/CASSANDRA-9584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14582259#comment-14582259 ]

Joshua McKenzie commented on CASSANDRA-9584:

So to clarify, it fails through ccmlib, works through ccm on command-line, and ??? with regular nodetool decommission from the command-line?
[jira] [Commented] (CASSANDRA-9584) Decommissioning a node on Windows sends the wrong schema change event
[ https://issues.apache.org/jira/browse/CASSANDRA-9584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14582302#comment-14582302 ]

Kishan Karunaratne commented on CASSANDRA-9584:

After changing the default connect port (7199) to the Windows listen port (7100), I was able to use nodetool from the CLI directly to decommission the first node, 127.0.0.1. While CCM's node status was not updated, I was able to verify via nodetool status that the node no longer exists in the ring. However, the Java process still exists for the decommissioned node. Furthermore, I'm still able to query the decommissioned node through both CCM:

{noformat}
PS C:\Users\Administrator> ccm node1 nodetool status
Starting NodeTool
Datacenter: datacenter1
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load      Tokens  Owns  Host ID                               Rack
UN  127.0.0.2  62.48 KB  1       ?     4cb1b80e-a83e-4754-9d1c-80afcfe1cc4a  rack1
UN  127.0.0.3  62.48 KB  1       ?     d8dd050d-cf88-4c45-97c4-f785db3a1c56  rack1

Note: Non-system keyspaces don't have the same replication settings, effective ownership information is meaningless

PS C:\Users\Administrator> ccm node1 ring
Starting NodeTool
Datacenter: datacenter1
==
Address    Rack   Status  State   Load      Owns  Token
                                                  3074457345618258602
127.0.0.2  rack1  Up      Normal  62.48 KB  ?     -3074457345618258603
127.0.0.3  rack1  Up      Normal  62.48 KB  ?     3074457345618258602

Note: Non-system keyspaces don't have the same replication settings, effective ownership information is meaningless
{noformat}

and through nodetool directly:

{noformat}
PS C:\Users\jenkins\git\cassandra\bin> .\nodetool -p 7100 -h 127.0.0.1 status
Starting NodeTool
Datacenter: datacenter1
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load      Tokens  Owns  Host ID                               Rack
UN  127.0.0.2  62.48 KB  1       ?     4cb1b80e-a83e-4754-9d1c-80afcfe1cc4a  rack1
UN  127.0.0.3  62.48 KB  1       ?     d8dd050d-cf88-4c45-97c4-f785db3a1c56  rack1

Note: Non-system keyspaces don't have the same replication settings, effective ownership information is meaningless

PS C:\Users\jenkins\git\cassandra\bin> .\nodetool -p 7100 -h 127.0.0.1 ring
Starting NodeTool
Datacenter: datacenter1
==
Address    Rack   Status  State   Load      Owns  Token
                                                  3074457345618258602
127.0.0.2  rack1  Up      Normal  62.48 KB  ?     -3074457345618258603
127.0.0.3  rack1  Up      Normal  62.48 KB  ?     3074457345618258602

Note: Non-system keyspaces don't have the same replication settings, effective ownership information is meaningless
{noformat}

I wasn't able to decommission the other nodes, as I get:

{noformat}
nodetool: Failed to connect to '127.0.0.2:7100' - ConnectException: 'Connection refused: connect'.
{noformat}
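The "Connection refused" error above means nothing was listening on 127.0.0.2:7100 at that moment. One way to find out which host and port combinations actually accept connections is a plain TCP probe. The sketch below assumes Python 3 and uses only the standard `socket` module; `port_is_open` and `open_ports` are hypothetical helper names, and the candidate ports in the usage note are guesses to try, not ports confirmed anywhere in this thread.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable hosts, and timeouts.
        return False


def open_ports(host, candidate_ports, timeout=2.0):
    """Return the subset of candidate_ports that accept connections on host."""
    return [p for p in candidate_ports if port_is_open(host, p, timeout)]
```

For example, `open_ports("127.0.0.2", [7100, 7199, 7200])` would report which of those candidates is worth passing to `nodetool -p`; a successful TCP connect does not prove the listener is JMX, only that something is bound there.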
[jira] [Commented] (CASSANDRA-9584) Decommissioning a node on Windows sends the wrong schema change event
[ https://issues.apache.org/jira/browse/CASSANDRA-9584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14582318#comment-14582318 ]

Kishan Karunaratne commented on CASSANDRA-9584:

When I stopped and started the cluster via CCM, the decommissioned node returned to the ring. Is this expected behavior?