[ https://issues.apache.org/jira/browse/HBASE-21164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mingliang Liu updated HBASE-21164:
----------------------------------
    Attachment: HBASE-21164.006.patch

> reportForDuty should do (exponential) backoff rather than retry every 3 seconds (default).
> ------------------------------------------------------------------------------------------
>
>          Key: HBASE-21164
>          URL: https://issues.apache.org/jira/browse/HBASE-21164
>      Project: HBase
>   Issue Type: Improvement
>   Components: regionserver
>     Reporter: stack
>     Assignee: Mingliang Liu
>     Priority: Minor
>  Attachments: HBASE-21164.005.patch, HBASE-21164.006.patch,
>               HBASE-21164.branch-2.1.001.patch, HBASE-21164.branch-2.1.002.patch,
>               HBASE-21164.branch-2.1.003.patch, HBASE-21164.branch-2.1.004.patch
>
>
> RegionServers do reportForDuty on startup to tell the Master they are available.
> If the Master is still initializing, which can take a while on a big cluster,
> particularly if something is amiss, the log message every three seconds is
> annoying and of no use. Back off on failure, up to a reasonable maximum period.
> Here is an example:
> {code}
> 2018-09-06 14:01:39,312 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty to master=vc0207.halxg.cloudera.com,22001,1536266763109 with port=22001, startcode=1536266763109
> 2018-09-06 14:01:39,312 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty failed; sleeping and then retrying.
> ....
> {code}
> For example, I am looking at a large cluster now that had a backlog of
> procedure WALs. It is taking a couple of hours to recreate the procedure state
> because there are millions of procedures outstanding. Meanwhile, the Master
> log is just full of the above message, one every three seconds...
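
To make the requested behaviour concrete, here is a minimal sketch of a capped exponential backoff. It is not taken from any of the attached patches: the class name `ReportForDutyBackoff`, the method `sleepMillis`, and the five-minute cap used in the usage note are assumptions for illustration only. The three-second starting value is the default retry interval mentioned in the issue.

```java
/**
 * Hypothetical helper illustrating capped exponential backoff.
 * Not the code from the HBASE-21164 patches; names and the cap are assumptions.
 */
final class ReportForDutyBackoff {
  private ReportForDutyBackoff() {}

  /**
   * Returns how long to sleep before retry number {@code attempt} (0-based):
   * initialMs doubled once per previous attempt, never exceeding maxMs.
   */
  static long sleepMillis(long initialMs, long maxMs, int attempt) {
    if (initialMs <= 0 || maxMs < initialMs || attempt < 0) {
      throw new IllegalArgumentException("invalid backoff parameters");
    }
    long sleep = initialMs;
    for (int i = 0; i < attempt && sleep < maxMs; i++) {
      // Compare against maxMs / 2 before doubling so the multiply cannot overflow.
      sleep = (sleep > maxMs / 2) ? maxMs : sleep * 2;
    }
    return sleep;
  }
}
```

With a 3000 ms start and an assumed 300000 ms cap, successive failures would sleep 3 s, 6 s, 12 s, and so on until the cap is reached, so a Master that takes hours to initialize would see a retry (and a log line) every few minutes rather than every three seconds.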