[jira] [Updated] (HBASE-21164) reportForDuty should do (exponential) backoff rather than retry every 3 seconds (default).

2018-09-13 Thread Mingliang Liu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HBASE-21164:
--
Attachment: HBASE-21164.007.patch

> reportForDuty should do (exponential) backoff rather than retry every 3
> seconds (default).
> --
>
> Key: HBASE-21164
> URL: https://issues.apache.org/jira/browse/HBASE-21164
> Project: HBase
> Issue Type: Improvement
> Components: regionserver
> Reporter: stack
> Assignee: Mingliang Liu
> Priority: Minor
> Attachments: HBASE-21164.005.patch, HBASE-21164.006.patch, 
> HBASE-21164.007.patch, HBASE-21164.branch-2.1.001.patch, 
> HBASE-21164.branch-2.1.002.patch, HBASE-21164.branch-2.1.003.patch, 
> HBASE-21164.branch-2.1.004.patch
>
>
> RegionServers call reportForDuty on startup to tell the Master they are
> available. If the Master is still initializing, which can take a while on a
> big cluster, particularly if something is amiss, the message logged every
> three seconds is annoying and serves no purpose. Back off on failure, up to a
> reasonable maximum period. Here is an example:
> {code}
> 2018-09-06 14:01:39,312 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty to master=vc0207.halxg.cloudera.com,22001,1536266763109 with port=22001, startcode=1536266763109
> 2018-09-06 14:01:39,312 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty failed; sleeping and then retrying.
> {code}
> For example, I am looking at a large cluster now that had a backlog of
> procedure WALs. Recreating the procedure state is taking a couple of hours
> because there are millions of procedures outstanding. Meanwhile, the Master
> log is full of the above message, every three seconds...
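
A minimal sketch of the capped exponential backoff the issue asks for, in Java.
This is not the code from any of the attached patches: the class name
ReportForDutyBackoff, the method delayMillis, and the 5-minute cap used in the
note below are assumptions made for illustration. Only the 3-second base
interval comes from the issue text.

```java
/**
 * Hypothetical helper for illustration only; not taken from the
 * HBASE-21164 patches.
 */
final class ReportForDutyBackoff {
    private ReportForDutyBackoff() {}

    /**
     * Returns the sleep time before retry number {@code attempt} (0-based):
     * baseMillis doubled once per previous attempt, never exceeding maxMillis.
     */
    static long delayMillis(int attempt, long baseMillis, long maxMillis) {
        if (attempt < 0 || baseMillis <= 0 || maxMillis < baseMillis) {
            throw new IllegalArgumentException(
                "need attempt >= 0 and 0 < baseMillis <= maxMillis");
        }
        long delay = baseMillis;
        // Stop doubling as soon as the cap is reached, so large attempt
        // counts cost nothing extra.
        for (int i = 0; i < attempt && delay < maxMillis; i++) {
            // Saturate at the cap instead of doubling past it; this also
            // keeps the multiplication from overflowing a long.
            delay = (delay > maxMillis / 2) ? maxMillis : delay * 2;
        }
        return delay;
    }
}
```

With a 3000 ms base and an assumed 300000 ms cap, the delays would run 3 s, 6 s,
12 s, 24 s, 48 s, 96 s, 192 s and then stay at 300 s, so a Master that takes
hours to initialize would see a retry every few minutes rather than every three
seconds.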



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21164) reportForDuty should do (exponential) backoff rather than retry every 3 seconds (default).

2018-09-13 Thread Mingliang Liu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HBASE-21164:
--
Attachment: HBASE-21164.006.patch



[jira] [Updated] (HBASE-21164) reportForDuty should do (exponential) backoff rather than retry every 3 seconds (default).

2018-09-11 Thread Mingliang Liu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HBASE-21164:
--
Attachment: HBASE-21164.005.patch



[jira] [Updated] (HBASE-21164) reportForDuty should do (exponential) backoff rather than retry every 3 seconds (default).

2018-09-07 Thread Mingliang Liu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HBASE-21164:
--
Attachment: HBASE-21164.branch-2.1.004.patch



[jira] [Updated] (HBASE-21164) reportForDuty should do (exponential) backoff rather than retry every 3 seconds (default).

2018-09-07 Thread Mingliang Liu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HBASE-21164:
--
Attachment: HBASE-21164.branch-2.1.003.patch



[jira] [Updated] (HBASE-21164) reportForDuty should do (exponential) backoff rather than retry every 3 seconds (default).

2018-09-07 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21164:
--
Status: Patch Available  (was: Open)



[jira] [Updated] (HBASE-21164) reportForDuty should do (exponential) backoff rather than retry every 3 seconds (default).

2018-09-07 Thread Mingliang Liu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HBASE-21164:
--
Attachment: HBASE-21164.branch-2.1.002.patch



[jira] [Updated] (HBASE-21164) reportForDuty should do (exponential) backoff rather than retry every 3 seconds (default).

2018-09-06 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21164:
--
Description: 
RegionServers call reportForDuty on startup to tell the Master they are
available. If the Master is still initializing, which can take a while on a
big cluster, particularly if something is amiss, the message logged every
three seconds is annoying and serves no purpose. Back off on failure, up to a
reasonable maximum period. Here is an example:

{code}
2018-09-06 14:01:39,312 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty to master=vc0207.halxg.cloudera.com,22001,1536266763109 with port=22001, startcode=1536266763109
2018-09-06 14:01:39,312 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty failed; sleeping and then retrying.
{code}

For example, I am looking at a large cluster now that had a backlog of
procedure WALs. Recreating the procedure state is taking a couple of hours
because there are millions of procedures outstanding. Meanwhile, the Master
log is full of the above message, every three seconds...

  was:
RegionServers call reportForDuty on startup to tell the Master they are
available. If the Master is still initializing, which can take a while on a
big cluster, particularly if something is amiss, the message logged every
three seconds is annoying and serves no purpose. Back off on failure, up to a
reasonable maximum period. Here is an example:

{code}
2018-09-06 14:01:39,312 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty to master=vc0207.halxg.cloudera.com,22001,1536266763109 with port=22001, startcode=1536266763109
2018-09-06 14:01:39,312 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty failed; sleeping and then retrying.
{code}




[jira] [Updated] (HBASE-21164) reportForDuty should do (exponential) backoff rather than retry every 3 seconds (default).

2018-09-06 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21164:
--
Summary: reportForDuty should do (exponential) backoff rather than retry
every 3 seconds (default). (was: reportForDuty should do backoff rather than
retry every 3 seconds (default).)
