[ https://issues.apache.org/jira/browse/CASSANDRA-18555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17732609#comment-17732609 ]
Stefan Miklosovic edited comment on CASSANDRA-18555 at 6/14/23 4:18 PM:
------------------------------------------------------------------------

AFAIK that property is there if you want to join a node which was decommissioned. I am talking about decommissioning a node which you decommissioned and you want to execute decommissioning again - all of this is done while Cassandra process is still running.

The trunk's logic is like this:

{code:java}
private void prepareToJoin() throws ConfigurationException
{
    if (!joined)
    {
        Map<ApplicationState, VersionedValue> appStates = new EnumMap<>(ApplicationState.class);

        if (SystemKeyspace.wasDecommissioned())
        {
            if (OVERRIDE_DECOMMISSION.getBoolean())
            {
                logger.warn("This node was decommissioned, but overriding by operator request.");
                SystemKeyspace.setBootstrapState(SystemKeyspace.BootstrapState.COMPLETED);
            }
            else
            {
                throw new ConfigurationException("This node was decommissioned and will not rejoin the ring unless -D" + OVERRIDE_DECOMMISSION.getKey() + "=true has been set, or all existing data is removed and the node is bootstrapped again");
            }
        }
{code}


was (Author: smiklosovic):
    AFAIK that property is there if you want to join a node which was decommissioned. I am talking about decommissioning a node which you decommissioned and you want to execute decommissioning again.

The trunk's logic is like this:

{code:java}
private void prepareToJoin() throws ConfigurationException
{
    if (!joined)
    {
        Map<ApplicationState, VersionedValue> appStates = new EnumMap<>(ApplicationState.class);

        if (SystemKeyspace.wasDecommissioned())
        {
            if (OVERRIDE_DECOMMISSION.getBoolean())
            {
                logger.warn("This node was decommissioned, but overriding by operator request.");
                SystemKeyspace.setBootstrapState(SystemKeyspace.BootstrapState.COMPLETED);
            }
            else
            {
                throw new ConfigurationException("This node was decommissioned and will not rejoin the ring unless -D" + OVERRIDE_DECOMMISSION.getKey() + "=true has been set, or all existing data is removed and the node is bootstrapped again");
            }
        }
{code}

> A new nodetool/JMX command that tells whether node's decommission failed or
> not
> -------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-18555
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18555
>             Project: Cassandra
>          Issue Type: Task
>          Components: Observability/JMX
>            Reporter: Jaydeepkumar Chovatia
>            Assignee: Jaydeepkumar Chovatia
>            Priority: Normal
>          Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> Currently, when a node is being decommissioned and if any failure happens,
> then an exception is thrown back to the caller.
> But Cassandra's decommission takes considerable time ranging from minutes to
> hours to days. There are various scenarios in which the caller may need to
> probe the status again:
> * The caller times out
> * It is not possible to keep the caller hanging for such a long time
> And if the caller does not know what happened internally, then it cannot
> retry, etc., leading to other issues.
> So, in this ticket, I am going to add a new nodetool/JMX command that can be
> invoked by the caller anytime, and it will return the correct status.
> It might look like a smaller change, but when we need to operate Cassandra at
> scale in a large-scale fleet, then this becomes a bottleneck and requires
> constant operator intervention.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
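
For illustration only, the status probe described in the quoted ticket could be sketched roughly as below. This is not the code from the ticket or from Cassandra: the class DecommissionStatusTracker, its State enum, and the methods decommission() and status() are hypothetical names made up for this sketch. It assumes that a failed decommission may be retried while the process keeps running, and that a completed one may not be started again.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical helper, not part of Cassandra: remembers the outcome of the last decommission attempt. */
final class DecommissionStatusTracker
{
    /** Assumed set of states; the real command may report different values. */
    enum State { NOT_STARTED, IN_PROGRESS, COMPLETED, FAILED }

    private final AtomicReference<State> state = new AtomicReference<>(State.NOT_STARTED);

    /**
     * Runs the given decommission action and records how it ended.
     * A failure is rethrown to the caller, but the FAILED state stays
     * behind so that a later probe can still see it.
     */
    void decommission(Runnable action)
    {
        // Assumption: only a fresh start or a retry after a failure is allowed.
        if (!state.compareAndSet(State.NOT_STARTED, State.IN_PROGRESS)
            && !state.compareAndSet(State.FAILED, State.IN_PROGRESS))
            throw new IllegalStateException("Cannot decommission from state " + state.get());

        try
        {
            action.run();
            state.set(State.COMPLETED);
        }
        catch (RuntimeException | Error e)
        {
            state.set(State.FAILED);
            throw e;
        }
    }

    /** What a status probe (for example a JMX getter behind a nodetool command) could return. */
    State status()
    {
        return state.get();
    }
}
```

A caller that timed out could later call status() instead of waiting on the original invocation, and retry only if the result is FAILED.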