Cameron Zemek created CASSANDRA-19580:
-----------------------------------------

             Summary: Unable to contact any seeds with node in hibernate status
                 Key: CASSANDRA-19580
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19580
             Project: Cassandra
          Issue Type: Bug
            Reporter: Cameron Zemek


We have a customer running into the error 'Unable to contact any seeds!'. I have 
been able to reproduce this issue by killing Cassandra while it is joining, which 
puts the node into hibernate status. Once a node is in hibernate, it no longer 
receives any SYN messages from other nodes during startup, and because it sends 
only itself as a digest in outbound SYN messages, it never receives any states in 
any of the ACK replies. So once it gets to the `seenAnySeed` check, it fails 
because the endpointStateMap is empty.
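
To make the failure mode concrete, here is a toy model of the exchange described above. It is only a sketch: the class `HibernateGossipSketch`, the string-based endpoint names and states, and both method signatures are hypothetical and are not Cassandra's real classes. It also assumes the seed answers only for endpoints named in the SYN and does not echo the sender's own state back.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the SYN/ACK exchange from this report; not Cassandra's real gossip code. */
final class HibernateGossipSketch
{
    /** Hypothetical seed-side handler: replies only with states for endpoints named in the SYN. */
    static Map<String, String> ackFor(String sender, List<String> synDigests, Map<String, String> seedStates)
    {
        Map<String, String> ack = new HashMap<>();
        for (String endpoint : synDigests)
        {
            // Assumption: the sender's own state is never echoed back to it.
            if (endpoint.equals(sender))
                continue;
            String state = seedStates.get(endpoint);
            if (state != null)
                ack.put(endpoint, state);
        }
        return ack;
    }

    /** Simplified stand-in for the seenAnySeed check: true if any seed appears in the state map. */
    static boolean seenAnySeed(Map<String, String> endpointStateMap, List<String> seeds)
    {
        for (String seed : seeds)
            if (endpointStateMap.containsKey(seed))
                return true;
        return false;
    }

    public static void main(String[] args)
    {
        Map<String, String> seedStates = new HashMap<>();
        seedStates.put("seed1", "NORMAL");
        seedStates.put("node2", "NORMAL");
        seedStates.put("node3", "hibernate");

        // node3 restarts and sends a SYN whose only digest is itself.
        Map<String, String> endpointStateMap = ackFor("node3", Arrays.asList("node3"), seedStates);
        System.out.println("states received: " + endpointStateMap.size());
        System.out.println("seenAnySeed: " + seenAnySeed(endpointStateMap, Arrays.asList("seed1")));
    }
}
```

In this model the restarting node ends up with an empty state map, so the simplified `seenAnySeed` check returns false, which matches the behaviour reported here.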

 

A workaround is copying the system.peers table from another node, but this is 
less than ideal. I tested modifying maybeGossipToSeed as follows:
{code:java}
    /* Possibly gossip to a seed for facilitating partition healing */
    private void maybeGossipToSeed(MessageOut<GossipDigestSyn> prod)
    {
        int size = seeds.size();
        if (size > 0)
        {
            if (size == 1 && seeds.contains(FBUtilities.getBroadcastAddress()))
            {
                return;
            }

            if (liveEndpoints.size() == 0)
            {
                List<GossipDigest> gDigests = prod.payload.gDigests;
                if (gDigests.size() == 1 && gDigests.get(0).endpoint.equals(FBUtilities.getBroadcastAddress()))
                {
                    /* Our only digest is ourselves: send an empty digest list instead (same as the shadow round SYN) */
                    gDigests = new ArrayList<GossipDigest>();
                    GossipDigestSyn digestSynMessage = new GossipDigestSyn(DatabaseDescriptor.getClusterName(),
                                                                           DatabaseDescriptor.getPartitionerName(),
                                                                           gDigests);
                    MessageOut<GossipDigestSyn> message = new MessageOut<GossipDigestSyn>(MessagingService.Verb.GOSSIP_DIGEST_SYN,
                                                                                          digestSynMessage,
                                                                                          GossipDigestSyn.serializer);
                    sendGossip(message, seeds);
                }
                else
                {
                    sendGossip(prod, seeds);
                }
            }
            else
            {
                /* Gossip with the seed with some probability. */
                double probability = seeds.size() / (double) (liveEndpoints.size() + unreachableEndpoints.size());
                double randDbl = random.nextDouble();
                if (randDbl <= probability)
                    sendGossip(prod, seeds);
            }
        }
    }
 {code}
The only problem is that this is the same as the SYN from the shadow round. It 
does resolve the issue, however, as the node then receives an ACK with all the 
states.
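
The effect of swapping in an empty digest list can be sketched with another toy model. Everything here is hypothetical (the class `EmptySynSketch`, the string endpoints, both method names). It assumes, as this report observes, that a seed answers an empty digest list the way it answers a shadow round SYN, by returning every state it knows.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the modified SYN; not Cassandra's real gossip code. */
final class EmptySynSketch
{
    /** Mirrors the patch's decision: with no live endpoints and only our own digest, send an empty list. */
    static List<String> digestsForSeed(List<String> digests, String self, int liveEndpointCount)
    {
        if (liveEndpointCount == 0 && digests.size() == 1 && digests.get(0).equals(self))
            return new ArrayList<>();
        return digests;
    }

    /** Hypothetical seed: an empty digest list gets every known state, otherwise only the listed endpoints. */
    static Map<String, String> ackFor(List<String> synDigests, Map<String, String> seedStates)
    {
        if (synDigests.isEmpty())
            return new LinkedHashMap<>(seedStates);

        Map<String, String> ack = new LinkedHashMap<>();
        for (String endpoint : synDigests)
            if (seedStates.containsKey(endpoint))
                ack.put(endpoint, seedStates.get(endpoint));
        return ack;
    }

    public static void main(String[] args)
    {
        Map<String, String> seedStates = new LinkedHashMap<>();
        seedStates.put("seed1", "NORMAL");
        seedStates.put("node2", "NORMAL");

        // node3 has no live endpoints and only its own digest, so the SYN goes out empty.
        List<String> syn = digestsForSeed(Arrays.asList("node3"), "node3", 0);
        System.out.println("states received: " + ackFor(syn, seedStates).keySet());
    }
}
```

Under these assumptions the restarting node learns about the seed and the other nodes from a single ACK, which is why the check then passes. The sketch says nothing about whether reusing the shadow round form outside the shadow round is safe.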



