[jira] [Commented] (CASSANDRA-4162) nodetool disablegossip does not prevent gossip delivery of writes via already-initiated hinted handoff

Robert Coli (Commented) (JIRA) Wed, 18 Apr 2012 10:09:13 -0700

    [ https://issues.apache.org/jira/browse/CASSANDRA-4162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256705#comment-13256705 ]


Robert Coli commented on CASSANDRA-4162:
----------------------------------------

> Restarting with -Dcassandra.join_ring=false will do that.

It will also incur a sizable startup penalty, one far more severe in Cassandra 
than in most other databases. I can only speak for myself, but I don't want to 
pay a startup penalty (which in the real world can be, say, a half hour of 
clock time!) if I don't have to. I think most operators who use 
"disablegossip" and "disablethrift" have a goal of removing a node from the 
cluster while keeping it running, in order to avoid this startup penalty.

While I now understand that "dead" has a very specific meaning in Cassandra 
which relates only to Gossip state, I think it is unambiguous that, given the 
typical semantic meaning of "dead" and "alive", people do not expect a "dead" 
node to be accepting writes. As explicated in "The Princess Bride," there is a 
significant difference between "mostly dead" and "all dead."

"
Miracle Max: Whoo-hoo-hoo, look who knows so much. It just so happens that your 
friend here is only MOSTLY dead. There's a big difference between mostly dead 
and all dead. Mostly dead is slightly alive. With all dead, well, with all dead 
there's usually only one thing you can do. 

Inigo Montoya: What's that? 

Miracle Max: Go through his clothes and look for loose change.
"

My goal with this ticket is to establish the best practice for an operator who 
wants to make sure his node is not receiving traffic, but is still up and 
capable of compacting or rejoining the cluster without paying the startup 
penalty. So far, the best solution seems to be using iptables to firewall off 
port 7000.
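
A minimal sketch of that firewall approach, assuming a Linux host with iptables and assuming that inter-node traffic uses TCP port 7000 as stated above. The helper name `storage_port_rules` is hypothetical, and blocking the OUTPUT direction as well is my own assumption rather than something the ticket specifies. The script only prints the commands (a dry run), since applying them requires root:

```python
import shlex

STORAGE_PORT = 7000  # port named in the comment above; adjust if your cluster differs


def storage_port_rules(port=STORAGE_PORT, block=True):
    """Return iptables argv lists that drop (or stop dropping) TCP traffic on `port`.

    Hypothetical helper for illustration; it builds commands but never runs them.
    """
    action = "-A" if block else "-D"  # -A appends a rule, -D deletes the matching rule
    return [
        # Drop packets arriving for the inter-node port, including packets on
        # connections that were already established (no state match is used).
        ["iptables", action, "INPUT", "-p", "tcp", "--dport", str(port), "-j", "DROP"],
        # Assumption: also stop this node from reaching other nodes on that port.
        ["iptables", action, "OUTPUT", "-p", "tcp", "--dport", str(port), "-j", "DROP"],
    ]


# Dry run: show what would be executed to isolate the node, then to restore it.
for argv in storage_port_rules(block=True) + storage_port_rules(block=False):
    print(shlex.join(argv))
```

Because the rules match on port alone rather than on connection state, they would also cut off a hint delivery session that is already in progress, which is the case nodetool disablegossip misses.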

It is difficult to understand the purpose of "disablethrift" and 
"disablegossip" if the combination of the two does not render the node "all 
dead." I believe most operators will expect them to render a node "all dead." 
At the very minimum, it seems inappropriate to state in the help that nodetool 
disablegossip renders a node "dead" when in fact it renders it "mostly dead."
                
> nodetool disablegossip does not prevent gossip delivery of writes via 
> already-initiated hinted handoff
> ------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-4162
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4162
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.9
>         Environment: reported on IRC, believe it was a linux environment, 
> nick "rhone", cassandra 1.0.8
>            Reporter: Robert Coli
>            Priority: Minor
>              Labels: gossip
>
> This ticket derives from #cassandra, where aaron_morton and I assisted a user 
> who had run "disablethrift" and "disablegossip" and was confused as to why he 
> was seeing writes to his node.
> Aaron and I went through a series of debugging questions, and the user 
> verified that there was traffic on the gossip port. His node was showing as 
> down from the perspective of other nodes, and nodetool also showed that 
> gossip was not active.
> Aaron read the code and had the user turn debug logging on. The user saw 
> Hinted Handoff messages being delivered and Aaron confirmed in the code that 
> a hinted handoff delivery session only checks gossip state when it first 
> starts. As a result, it will continue to deliver hints and disregard gossip 
> state on the target node.
> Per the nodetool docs:
> "
> disablegossip          - Disable gossip (effectively marking the node dead)
> "
> I believe most people will be using disablegossip and disablethrift for 
> operational reasons, and propose that they do not expect HH delivery to 
> continue, via gossip, when they have run "disablegossip".
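
The behavior described in the ticket can be modeled with a toy simulation. This is not Cassandra's code: `Target`, `deliver_hints`, and `disable_gossip_after` are hypothetical names, and the `recheck` flag stands in for one possible fix (re-checking liveness before each hint), which the ticket does not prescribe.

```python
class Target:
    """Hypothetical stand-in for the node that receives hints."""

    def __init__(self):
        self.alive = True    # what gossip reports about this node
        self.received = []   # hints that actually arrived


def deliver_hints(target, hints, recheck=False, on_sent=None):
    """Deliver hints to target and return how many were delivered.

    With recheck=False, liveness is checked only when the session starts,
    which models the behavior reported in the ticket.
    """
    if not target.alive:     # the single check made at session start
        return 0
    delivered = 0
    for hint in hints:
        if recheck and not target.alive:
            break            # alternative: stop once the node is marked dead
        target.received.append(hint)
        delivered += 1
        if on_sent:
            on_sent(delivered)
    return delivered


def disable_gossip_after(target, n):
    """Return a callback that marks target dead after n hints, mid-session."""
    def callback(delivered):
        if delivered == n:
            target.alive = False
    return callback


reported = Target()
print(deliver_hints(reported, range(10), on_sent=disable_gossip_after(reported, 3)))

rechecked = Target()
print(deliver_hints(rechecked, range(10), recheck=True,
                    on_sent=disable_gossip_after(rechecked, 3)))
```

In the first call all ten hints arrive even though the node is marked dead after the third; in the second call delivery stops at three.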

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
