[ 
https://issues.apache.org/jira/browse/CASSANDRA-6590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892699#comment-13892699
 ] 

Brandon Williams commented on CASSANDRA-6590:
---------------------------------------------

I'm not sure what the reason is for the block in handleMajorStateChange, but 
because the endpoint state is added before that, the check for it will never 
find null. As a result it always says the node restarted, even when this is 
the first time the node has been seen. (We should keep the 'UP' message there 
so it stays easy to look for.)
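
To illustrate the ordering problem described above, here is a minimal sketch. 
It is not the Gossiper code: the class, the map, and both method names are 
hypothetical stand-ins, and the only point is that a null check placed after 
the state has been stored can never take the "first seen" branch.

```java
import java.util.HashMap;
import java.util.Map;

public class FirstSeenCheck {
    // Hypothetical stand-in for the gossiper's endpoint-state map.
    static final Map<String, String> endpointStateMap = new HashMap<>();

    // Ordering described in the comment: the state is stored first,
    // so the lookup that follows can never return null.
    static String storeThenCheck(String endpoint, String state) {
        endpointStateMap.put(endpoint, state);
        String existing = endpointStateMap.get(endpoint);
        return existing != null ? "restarted" : "first seen";
    }

    // One possible alternative (an assumption, not the actual fix):
    // Map.put returns the previous value, or null if there was none.
    static String checkThenStore(String endpoint, String state) {
        String previous = endpointStateMap.put(endpoint, state);
        return previous != null ? "restarted" : "first seen";
    }

    public static void main(String[] args) {
        System.out.println(storeThenCheck("10.0.0.1", "gen1"));
        System.out.println(checkThenStore("10.0.0.2", "gen1"));
        System.out.println(checkThenStore("10.0.0.2", "gen2"));
    }
}
```

With the first ordering, a brand-new endpoint is reported as restarted; with 
the second, only an endpoint that already had state is.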

I think the if (!localState.isAlive()) check is problematic, because while it 
got rid of the repeated UP messages, it also seems to introduce a race 
where sometimes some nodes would end up in a cluster by themselves.  I briefly 
tried making Echo verbs droppable in CASSANDRA-6661 instead, but that didn't 
help, so I'm not sure why we're seemingly building these requests up, or if 
something else is making realMarkAlive fire so much.
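
As a rough illustration of why a guard like if (!localState.isAlive()) 
silences the repeated UP messages, here is a toy model. Everything in it is 
hypothetical (the EndpointState class, the onEchoReply method, the counter); 
it only shows the de-duplication effect and does not attempt to reproduce the 
race mentioned above.

```java
public class MarkAliveGuard {
    // Hypothetical, simplified endpoint state with only a liveness flag.
    static class EndpointState {
        private boolean alive = false;
        boolean isAlive() { return alive; }
        void markAlive() { alive = true; }
    }

    private int upMessages = 0;

    // Stand-in for realMarkAlive: flips the flag and "logs" an UP message.
    private void realMarkAlive(EndpointState state) {
        state.markAlive();
        upMessages++;
    }

    // Called once per echo reply; the guard skips endpoints already alive.
    void onEchoReply(EndpointState state, boolean guarded) {
        if (guarded && state.isAlive()) {
            return;
        }
        realMarkAlive(state);
    }

    // Simulates a number of echo replies for a single endpoint and
    // returns how many UP messages were produced.
    static int countUpMessages(int replies, boolean guarded) {
        MarkAliveGuard gossiper = new MarkAliveGuard();
        EndpointState state = new EndpointState();
        for (int i = 0; i < replies; i++) {
            gossiper.onEchoReply(state, guarded);
        }
        return gossiper.upMessages;
    }

    public static void main(String[] args) {
        System.out.println("unguarded: " + countUpMessages(5, false));
        System.out.println("guarded:   " + countUpMessages(5, true));
    }
}
```

In this model every reply produces an UP message without the guard, and only 
the first one does with it, which says nothing about why so many replies 
arrive in the first place.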

Finally, I think we'll need a separate yaml option, since removing things in a 
minor release is kind of mean to upgraders who don't catch it and then find 
that their server won't start.



> Gossip does not heal after a temporary partition at startup
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-6590
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6590
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Brandon Williams
>            Assignee: Vijay
>             Fix For: 2.0.6
>
>         Attachments: 0001-CASSANDRA-6590.patch, 0001-logging-for-6590.patch, 
> 6590_disable_echo.txt
>
>
> See CASSANDRA-6571 for background.  If a node is partitioned on startup when 
> the echo command is sent, but then the partition heals, the halves of the 
> partition will never mark each other up despite being able to communicate.  
> This stems from CASSANDRA-3533.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
