Viraj Jasani created HBASE-29180:
------------------------------------

             Summary: Apply fail-fast retry limit for UnknownHostException
                 Key: HBASE-29180
                 URL: https://issues.apache.org/jira/browse/HBASE-29180
             Project: HBase
          Issue Type: Sub-task
    Affects Versions: 2.5.11
            Reporter: Viraj Jasani


As part of HBASE-28638, a fail-fast retry limit was introduced for errors 
such as CallQueueTooBigException, SaslException, and ConnectionClosedException. 
This limits the number of retries that RSProcedureDispatcher performs while 
executing remote procedures. Because the region open/close fails on the remote 
server, we also trigger SCP for the target server.

We recently came across UnknownHostException as another case in which the 
remote calls can get stuck retrying forever:
{code:java}
WARN  [RSProcedureDispatcher-pool-98034] procedure.RSProcedureDispatcher - 
request to rs1.xyz,60020,1739254267238 failed due to 
java.net.UnknownHostException: Call to address=rs1.xyz:60020 failed on local 
exception: java.net.UnknownHostException: rs1.xyz:60020 could not be resolved, 
try=2867, retrying... , request params: open_region {
  open_info {
    region {
...
... {code}
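
To make the proposal concrete, here is a minimal sketch of how a fail-fast 
check could treat UnknownHostException. It is not the RSProcedureDispatcher 
implementation: the class name, the method names, and the limit value of 10 
are all assumptions made for illustration only.

```java
import java.io.IOException;
import java.net.UnknownHostException;

// Hypothetical sketch; none of these names are taken from the HBase code base.
final class FailFastRetrySketch {

  // Assumed value for illustration; the actual limit used by HBase may differ.
  static final int FAIL_FAST_RETRY_LIMIT = 10;

  // Returns true if the error, or any of its causes, is an UnknownHostException.
  // The cause chain is walked because RPC layers often wrap the original error.
  static boolean isFailFastError(Throwable error) {
    for (Throwable t = error; t != null; t = t.getCause()) {
      if (t instanceof UnknownHostException) {
        return true;
      }
    }
    return false;
  }

  // Returns true once a fail-fast error has been retried up to the limit,
  // so the caller can stop retrying instead of looping indefinitely.
  static boolean shouldGiveUp(Throwable error, int attempt) {
    return isFailFastError(error) && attempt >= FAIL_FAST_RETRY_LIMIT;
  }

  public static void main(String[] args) {
    IOException wrapped =
        new IOException("Call failed", new UnknownHostException("rs1.xyz:60020"));
    System.out.println(shouldGiveUp(wrapped, 2));
    System.out.println(shouldGiveUp(wrapped, FAIL_FAST_RETRY_LIMIT));
  }
}
```

Under these assumptions, a dispatcher would keep retrying other errors as 
before, but would stop after a bounded number of attempts when the target 
host cannot be resolved, rather than reaching counts like try=2867.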



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
