[jira] [Commented] (CASSANDRA-18555) A new nodetool/JMX command that tells whether node's decommission failed or not

Stefan Miklosovic (Jira) Wed, 14 Jun 2023 09:41:04 -0700


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-18555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17732630#comment-17732630
 ]


Stefan Miklosovic commented on CASSANDRA-18555:
-----------------------------------------------

I will go with mine. I have a few cosmetic fixes to add.

For example, if we do this in the console:
{code:java}
$ nodetool decommission
$ nodetool decommission
{code}
If the first invocation finishes OK, I think we should log on the second 
invocation that the node is already decommissioned. What value is there in 
running decommission again after the node was successfully decommissioned?

With the current patch, the second invocation returns immediately, but the user 
is not told that the node was already decommissioned.

On the other hand, if we want to follow *nix principles, the second invocation 
should not emit anything because, technically, it succeeded too. But I think 
a user may find it handy to know that the node was already decommissioned, so 
running the command a second time achieves nothing.
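
As a rough illustration of the behaviour discussed above, here is a minimal Java sketch. It is not the patch itself: the class DecommissionSketch, the Mode enum, the streamData parameter, and the message strings are all hypothetical names chosen for this example and are not Cassandra's StorageService API. It only shows the idea that a repeated call returns immediately but tells the caller why, and that a failure stays visible through a status that can be queried later.

```java
// Hypothetical sketch only; none of these names come from Cassandra.
final class DecommissionSketch {

    // Assumed set of states for illustration.
    enum Mode { NORMAL, LEAVING, DECOMMISSIONED, DECOMMISSION_FAILED }

    private Mode mode = Mode.NORMAL;

    /**
     * Runs the (assumed) streaming work and returns the message a user would see.
     * A call on an already decommissioned node does no work and says so.
     */
    synchronized String decommission(Runnable streamData) {
        if (mode == Mode.DECOMMISSIONED) {
            // Second invocation: return immediately, but notify the user.
            return "Node was already decommissioned.";
        }
        mode = Mode.LEAVING;
        try {
            streamData.run();
            mode = Mode.DECOMMISSIONED;
            return "Decommission finished.";
        } catch (RuntimeException e) {
            // Remember the failure so it can be queried later, then rethrow.
            mode = Mode.DECOMMISSION_FAILED;
            throw e;
        }
    }

    /** Status that a caller could probe at any time, for example after a timeout. */
    synchronized Mode status() {
        return mode;
    }
}
```

In this sketch, a caller that timed out could call status() later and retry only when it reports DECOMMISSION_FAILED. Whether the second invocation should print the message or stay silent is exactly the open question above.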

> A new nodetool/JMX command that tells whether node's decommission failed or 
> not
> -------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-18555
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18555
>             Project: Cassandra
>          Issue Type: Task
>          Components: Observability/JMX
>            Reporter: Jaydeepkumar Chovatia
>            Assignee: Jaydeepkumar Chovatia
>            Priority: Normal
>          Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> Currently, if any failure happens while a node is being decommissioned, an 
> exception is thrown back to the caller.
> But Cassandra's decommission takes considerable time, ranging from minutes to 
> hours to days. There are various scenarios in which the caller may need to 
> probe the status again:
>  * The caller times out
>  * It is not possible to keep the caller hanging for such a long time
> And if the caller does not know what happened internally, then it cannot 
> retry, etc., leading to other issues.
> So, in this ticket, I am going to add a new nodetool/JMX command that can be 
> invoked by the caller anytime, and it will return the correct status.
> It might look like a small change, but when we need to operate Cassandra at 
> scale in a large fleet, the lack of such a command becomes a bottleneck and 
> requires constant operator intervention.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
