Viraj Jasani created HBASE-29180:
------------------------------------
Summary: Apply fail-fast retry limit for UnknownHostException
Key: HBASE-29180
URL: https://issues.apache.org/jira/browse/HBASE-29180
Project: HBase
Issue Type: Sub-task
Affects Versions: 2.5.11
Reporter: Viraj Jasani
As part of HBASE-28638, a fail-fast retry limit was introduced for errors
such as CallQueueTooBigException, SaslException, and ConnectionClosedException.
It limits the number of retries that RSProcedureDispatcher performs while
executing remote procedures. Because the region open/close fails on the remote
server, we also trigger SCP (ServerCrashProcedure) for the target server.
We recently came across UnknownHostException as another case where the
remote calls can get stuck retrying forever:
{code:java}
WARN [RSProcedureDispatcher-pool-98034] procedure.RSProcedureDispatcher -
request to rs1.xyz,60020,1739254267238 failed due to
java.net.UnknownHostException: Call to address=rs1.xyz:60020 failed on local
exception: java.net.UnknownHostException: rs1.xyz:60020 could not be resolved,
try=2867, retrying... , request params: open_region {
open_info {
region {
...
... {code}
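To illustrate the proposed behavior, here is a minimal sketch of a fail-fast
check for UnknownHostException. It is not the RSProcedureDispatcher code: the
class name, the method names, and the limit of 10 are assumptions made for
illustration only. It walks the cause chain because, as in the log above, the
exception can arrive wrapped inside another exception.

```java
import java.io.IOException;
import java.net.UnknownHostException;

// Hypothetical helper, not part of HBase: decides when to stop retrying a
// remote call that keeps failing because the host cannot be resolved.
final class FailFastRetrySketch {
  // Assumed value for illustration; the real limit would be configurable.
  static final int FAIL_FAST_RETRY_LIMIT = 10;

  // Returns true if 'error' or any of its causes is an instance of 'type'.
  // The depth bound guards against a cyclic cause chain.
  static boolean hasCause(Throwable error, Class<? extends Throwable> type) {
    int depth = 0;
    for (Throwable cur = error; cur != null && depth < 32; cur = cur.getCause()) {
      if (type.isInstance(cur)) {
        return true;
      }
      depth++;
    }
    return false;
  }

  // Returns true once a call failing with UnknownHostException has reached
  // the fail-fast limit. Other errors are left to the normal retry policy.
  static boolean shouldGiveUp(Throwable error, int attempt) {
    return hasCause(error, UnknownHostException.class)
        && attempt >= FAIL_FAST_RETRY_LIMIT;
  }

  public static void main(String[] args) {
    IOException wrapped = new IOException(
        "Call to address=rs1.xyz:60020 failed on local exception",
        new UnknownHostException("rs1.xyz:60020 could not be resolved"));
    System.out.println("attempt 3 gives up: " + shouldGiveUp(wrapped, 3));
    System.out.println("attempt 2867 gives up: " + shouldGiveUp(wrapped, 2867));
  }
}
```

With a check like this, a call such as the one logged above at try=2867 would
have stopped retrying once it reached the limit, and the dispatcher could then
treat the failure the same way as the other fail-fast errors.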