Jean-Daniel Cryans created KUDU-1934:
----------------------------------------

             Summary: tservers aggressively try to reconnect to masters
                 Key: KUDU-1934
                 URL: https://issues.apache.org/jira/browse/KUDU-1934
             Project: Kudu
          Issue Type: Bug
          Components: tserver
    Affects Versions: 1.3.0
            Reporter: Jean-Daniel Cryans


Related to KUDU-1933: I had mismatched 1.3 snapshots between the master and the 
tservers, which caused the tservers to retry connecting to the master 
indefinitely. Since they retry as fast as they can, the logs quickly filled with:

{noformat}
I0307 23:55:21.228502 70832 heartbeater.cc:291] Connected to a master server at ve0120.halxg.cloudera.com:7051
I0307 23:55:21.228528 70832 heartbeater.cc:359] Registering TS with master...
I0307 23:55:21.228865 70832 heartbeater.cc:389] Master ve0120.halxg.cloudera.com:7051 requested a full tablet report, sending...
W0307 23:55:21.346961 70832 heartbeater.cc:499] Failed to heartbeat to ve0120.halxg.cloudera.com:7051: Remote error: Failed to send heartbeat to master: Not authorized: invalid CSR: CSR did not contain expected username. (CSR: '' RPC: 'kudu')
I0307 23:55:22.347733 70832 heartbeater.cc:291] Connected to a master server at ve0120.halxg.cloudera.com:7051
I0307 23:55:22.347757 70832 heartbeater.cc:359] Registering TS with master...
I0307 23:55:22.348042 70832 heartbeater.cc:389] Master ve0120.halxg.cloudera.com:7051 requested a full tablet report, sending...
W0307 23:55:22.467021 70832 heartbeater.cc:499] Failed to heartbeat to ve0120.halxg.cloudera.com:7051: Remote error: Failed to send heartbeat to master: Not authorized: invalid CSR: CSR did not contain expected username. (CSR: '' RPC: 'kudu')
{noformat}

Sounds like we should do backoff retries.
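
For illustration only, here is a rough sketch of what backoff retries could 
look like, assuming a capped exponential schedule with optional jitter. It is 
written in Python for brevity and is not Kudu's heartbeater code: the names 
backoff_delay_s and heartbeat_until_success, and the 1 second base and 60 
second cap, are all hypothetical choices made for this example.

```python
import random
import time


def backoff_delay_s(consecutive_failures, base_s=1.0, cap_s=60.0, jitter=0.0):
    """Delay before the next attempt, given the number of failures so far.

    All parameters are illustrative assumptions, not Kudu settings.
    """
    if consecutive_failures <= 0:
        return 0.0
    # Double the delay for each consecutive failure; clamp the exponent so
    # the intermediate value stays small even after many failures.
    exponent = min(consecutive_failures - 1, 30)
    delay = min(cap_s, base_s * 2 ** exponent)
    # Optional jitter shrinks the delay by a random factor in (1 - jitter, 1]
    # so that many tservers do not all retry at the same instant.
    return delay * (1.0 - jitter * random.random())


def heartbeat_until_success(send_heartbeat, max_attempts, sleep=time.sleep):
    """Call send_heartbeat() until it returns True or attempts run out.

    Returns (succeeded, delays), where delays lists the sleeps performed.
    """
    delays = []
    for attempt in range(1, max_attempts + 1):
        if send_heartbeat():
            return True, delays
        if attempt < max_attempts:
            delay = backoff_delay_s(attempt)
            delays.append(delay)
            sleep(delay)
    return False, delays
```

With these assumed defaults and no jitter, a master that keeps rejecting the 
heartbeat would be retried after 1, 2, 4, 8, 16, 32 and then 60 seconds, 
rather than at a fixed short interval.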



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
