Hi Guys,

I am using HBase 1.2.0 on a Kerberos-secured Cloudera CDH 5.8 cluster.
I have a persistent application that authenticates using a keytab and creates
an HBase connection. Our code also takes care of re-authentication and of
recreating a broken connection.
The code worked fine with previous versions of HBase. However, what we see
with HBase 1.2 is that after 24 hours the HBase connection stops working,
giving the following error:

org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after
attempts=2, exceptions:
Tue Feb 13 12:57:51 PST 2018,
RpcRetryingCaller{globalStartTime=1518555467140, pause=100, retries=2},
org.apache.hadoop.hbase.exceptions.ConnectionClosingException: Call to
pdmcdh01.xyz.com/192.168.145.62:60020 failed on local exception:
org.apache.hadoop.hbase.exceptions.ConnectionClosingException: Connection
to pdmcdh01.xyz.com/192.168.145.62:60020 is closing. Call id=137,
waitTime=11
Tue Feb 13 12:58:01 PST 2018,
RpcRetryingCaller{globalStartTime=1518555467140, pause=100, retries=2},
org.apache.hadoop.hbase.exceptions.ConnectionClosingException: Call to
pdmcdh01.xyz.com/192.168.145.62:60020 failed on local exception:
org.apache.hadoop.hbase.exceptions.ConnectionClosingException: Connection
to pdmcdh01.xyz.com/192.168.145.62:60020 is closing. Call id=139,
waitTime=13

        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:147)
        at org.apache.hadoop.hbase.client.HTable.get(HTable.java:935)
        at org.apache.hadoop.hbase.client.HTable.get(HTable.java:901)
Our code re-authenticates and creates the connection again, but it still keeps
failing:
org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after
attempts=2, exceptions:
Wed Feb 21 14:30:31 PST 2018,
RpcRetryingCaller{globalStartTime=1519252219159, pause=100, retries=2},
java.io.IOException: Couldn't setup connection for p...@hadoop.xyz.com to
hbase/pdmcdh01.xyz....@hadoop.xyz.com
Wed Feb 21 14:30:31 PST 2018,
RpcRetryingCaller{globalStartTime=1519252219159, pause=100, retries=2},
org.apache.hadoop.hbase.ipc.FailedServerException: This server is in the
failed servers list: pdmcdh01.xyz.com/192.168.145.62:60020

        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:147)
        at org.apache.hadoop.hbase.client.HTable.get(HTable.java:935)
        at org.apache.hadoop.hbase.client.HTable.get(HTable.java:901)
I know that the client keeps a server in the failed servers list for a few
seconds in order to avoid too many connection attempts. So I waited and tried
again after some time, but I still got the same error.
Once we restart our application, everything works fine again for the
next 24 hours.
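
To make my understanding of the failed servers list concrete, here is a
minimal sketch of the idea in plain Java. It is not HBase's actual
implementation: the class name FailedServerList, its methods, and the
expiry window passed to the constructor are all assumptions made for
illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative only: a time-limited "failed servers" list, not HBase's real class. */
public class FailedServerList {
    // Maps a server address to the time (in ms) at which its entry expires.
    private final Map<String, Long> expiryByAddress = new ConcurrentHashMap<>();
    private final long expiryMillis;

    public FailedServerList(long expiryMillis) {
        this.expiryMillis = expiryMillis;
    }

    /** Record a failed connection attempt to the address at time nowMillis. */
    public void addFailure(String address, long nowMillis) {
        expiryByAddress.put(address, nowMillis + expiryMillis);
    }

    /** True while the address is still inside its back-off window. */
    public boolean isFailed(String address, long nowMillis) {
        Long expiry = expiryByAddress.get(address);
        if (expiry == null) {
            return false; // never failed, or already cleaned up
        }
        if (nowMillis >= expiry) {
            // Window has passed: drop the entry so a new attempt is allowed.
            expiryByAddress.remove(address, expiry);
            return false;
        }
        return true;
    }
}
```

With a list like this, a server stops being rejected once its window has
passed, but it is added again as soon as the next connection attempt fails.
If the connection setup itself keeps failing, that would explain why waiting
did not help in our case.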

This 24-hour interval suggests that it could be related to the Kerberos
ticket expiry time; however, there is nothing in the logs to indicate a
Kerberos authentication issue.
Moreover, we are handling the exception and trying to authenticate and
create the connection again, but nothing works until we restart the JVM. This
is very strange.

I would really appreciate any help or pointers on this issue.

Thanks a lot
Apratim
