Wellington Chevreuil created HBASE-20586:
--------------------------------------------

             Summary: SyncTable tool: Add support for cross-realm remote clusters
                 Key: HBASE-20586
                 URL: https://issues.apache.org/jira/browse/HBASE-20586
             Project: HBase
          Issue Type: Improvement
          Components: mapreduce
            Reporter: Wellington Chevreuil
            Assignee: Wellington Chevreuil


One possible scenario for HashTable/SyncTable is to synchronize different 
clusters, for instance, when replication has been enabled but data already 
existed, or when replication issues have caused long replication lags.

For secured clusters under different Kerberos realms (with cross-realm trust 
properly set up), though, the current SyncTable version fails to authenticate 
with the remote cluster when trying to read HashTable outputs (when 
*sourcehashdir* is remote) and also when trying to read table data on the 
remote cluster (when *sourcezkcluster* is remote).
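
For reference, a cross-realm run of the tool would be invoked roughly like 
this (host names and paths are placeholders):
{noformat}
hbase org.apache.hadoop.hbase.mapreduce.SyncTable \
  --sourcezkcluster=remote-zk1,remote-zk2,remote-zk3:2181:/hbase \
  hdfs://remote-nn:8020/hashes/testTable testTable testTable
{noformat}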

The HDFS error looks like this:
{noformat}
INFO mapreduce.Job: Task Id : attempt_1524358175778_105392_m_000000_0, Status : FAILED

Error: java.io.IOException: Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]; Host Details : local host is: "local-host/1.1.1.1"; destination host is: "remote-nn":8020;
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
        at org.apache.hadoop.ipc.Client.call(Client.java:1506)
        at org.apache.hadoop.ipc.Client.call(Client.java:1439)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
        at com.sun.proxy.$Proxy13.getBlockLocations(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:256)
...
        at org.apache.hadoop.hbase.mapreduce.HashTable$TableHash.readPropertiesFile(HashTable.java:144)
        at org.apache.hadoop.hbase.mapreduce.HashTable$TableHash.read(HashTable.java:105)
        at org.apache.hadoop.hbase.mapreduce.SyncTable$SyncMapper.setup(SyncTable.java:188)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
...
Caused by: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
{noformat}
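The above can be sorted if the SyncTable job acquires a delegation token (DT) 
for the remote NameNode at submission time, so that map tasks can authenticate 
there via TOKEN. A minimal sketch of the idea using Hadoop's TokenCache (the 
path is a placeholder, and how this would be wired into SyncTable's job setup 
is an assumption):
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.security.TokenCache;

public class RemoteHdfsTokenSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = Job.getInstance(conf, "syncTable");

    // sourcehashdir on the remote cluster's HDFS (placeholder URI)
    Path sourceHashDir = new Path("hdfs://remote-nn:8020/hashes/testTable");

    // Asks the NameNode owning each given path for a delegation token and
    // stores it in the job's credentials, so that map tasks can authenticate
    // via TOKEN instead of needing Kerberos tickets of their own.
    TokenCache.obtainTokensForNamenodes(
        job.getCredentials(), new Path[] { sourceHashDir }, conf);
  }
}
{code}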
Once the HDFS-related authentication is done, it's also necessary to 
authenticate against the remote HBase cluster, as the below error would arise:
{noformat}
INFO mapreduce.Job: Task Id : attempt_1524358175778_172414_m_000000_0, Status : FAILED
Error: org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get the location
        at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:326)
...
        at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:867)
        at org.apache.hadoop.hbase.mapreduce.SyncTable$SyncMapper.syncRange(SyncTable.java:331)
...
Caused by: java.io.IOException: Could not set up IO Streams to remote-rs-host/1.1.1.2:60020
        at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:786)
...
Caused by: java.lang.RuntimeException: SASL authentication failed. The most likely cause is missing or invalid credentials. Consider 'kinit'.
...
Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
...
{noformat}
The above would need additional authentication logic against the remote HBase 
cluster, i.e. also obtaining an HBase delegation token for the remote cluster; 
a sketch of the idea follows.
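
A minimal sketch, assuming the TableMapReduceUtil.initCredentialsForCluster 
variant that takes a Configuration (available in recent HBase versions; quorum 
values below are placeholders):
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.mapreduce.Job;

public class RemoteHBaseTokenSketch {
  public static void main(String[] args) throws Exception {
    Configuration localConf = HBaseConfiguration.create();
    Job job = Job.getInstance(localConf, "syncTable");

    // Configuration pointing at the remote cluster, built from the same
    // cluster key format sourcezkcluster uses: quorum:client-port:znode-parent
    // (placeholder values).
    Configuration remoteConf = HBaseConfiguration.createClusterConf(
        localConf, "remote-zk1,remote-zk2,remote-zk3:2181:/hbase");

    // Delegation token for the local (target) cluster...
    TableMapReduceUtil.initCredentials(job);
    // ...and one for the remote (source) cluster, so that map tasks can scan
    // the source table without Kerberos credentials of their own.
    TableMapReduceUtil.initCredentialsForCluster(job, remoteConf);
  }
}
{code}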


