[ 
https://issues.apache.org/jira/browse/HBASE-8675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15021535#comment-15021535
 ] 

Sumit Nigam commented on HBASE-8675:
------------------------------------

Is this guaranteed to be caused by the Kerberos server being unreachable? I have 
a similar problem, but my error message is:

15/11/15 15:46:53 ERROR client.ZooKeeperSaslClient: An error: 
(java.security.PrivilegedActionException: javax.security.sasl.SaslException: 
GSS initiate failed [Caused by GSSException: No valid credentials provided 
(Mechanism level: Connection reset)]) occurred when evaluating Zookeeper Quorum 
Member's  received SASL token. Zookeeper Client will go to AUTH_FAILED state.
15/11/15 15:46:53 ERROR zookeeper.ClientCnxn: SASL authentication with 
Zookeeper Quorum member failed: javax.security.sasl.SaslException: An error: 
(java.security.PrivilegedActionException: javax.security.sasl.SaslException: 
GSS initiate failed [Caused by GSSException: No valid credentials provided 
(Mechanism level: Connection reset)]) occurred when evaluating Zookeeper Quorum 
Member's  received SASL token. Zookeeper Client will go to AUTH_FAILED state.


The mechanism level points to a connection reset. Is that error being reported 
for the Kerberos server, or for the ZooKeeper client's inability to connect to 
the ZooKeeper quorum?
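
One way to narrow this down might be to check, from the client host, whether 
the Kerberos server and the ZooKeeper quorum members each accept TCP 
connections. Below is a minimal sketch; the class name ReachabilityProbe is 
hypothetical, and the host:port targets are supplied on the command line (88 
is the usual Kerberos port and 2181 the usual ZooKeeper client port, but both 
may differ in a given cluster). A successful connect only shows the port is 
open at that moment; it does not rule out a reset later in the exchange.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical helper: checks whether a TCP connection to host:port can be opened. */
final class ReachabilityProbe {

    /** Returns true if a TCP connection is established within timeoutMillis. */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers refused connections, timeouts, and unresolvable host names.
            return false;
        }
    }

    public static void main(String[] args) {
        // Each argument is expected as host:port, e.g. the KDC and one quorum member.
        for (String target : args) {
            int sep = target.lastIndexOf(':');
            if (sep < 0) {
                System.out.println("skipping (expected host:port): " + target);
                continue;
            }
            String host = target.substring(0, sep);
            int port = Integer.parseInt(target.substring(sep + 1));
            System.out.println(target + " reachable: " + canConnect(host, port, 3000));
        }
    }
}
```

If the Kerberos target fails while the quorum members succeed (or the other 
way round), that at least suggests which side to look at first.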

> Two active Hmasters for AUTH_FAILED in secure hbase cluster
> -----------------------------------------------------------
>
>                 Key: HBASE-8675
>                 URL: https://issues.apache.org/jira/browse/HBASE-8675
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: Liu Shaohui
>            Priority: Critical
>         Attachments: HBASE-8675-0.94-v1.patch
>
>
> In our production cluster, because of a network problem reaching the Kerberos 
> server, the ZooKeeperWatcher in the active HMaster fails to authenticate, gets 
> a connection event of AUTH_FAILED, and loses the master lock. But the 
> ZooKeeper watcher ignores the event, so the old active HMaster stays active. 
> After the network problem is fixed, the backup HMaster gets the master lock 
> and becomes active. There are then two active HMasters in the cluster.
> 2013-05-30 09:44:21,004 ERROR 
> org.apache.zookeeper.client.ZooKeeperSaslClient: An error: 
> (java.security.PrivilegedActionException: javax.security.sasl.SaslException: 
> GSS initiate failed [Caused by GSSException: No valid credentials provided 
> (Mechanism level: krb1.xiaomi.net)]) occurred when evaluating Zookeeper 
> Quorum Member's  received SASL token. Zookeeper Client will go to AUTH_FAILED 
> state.
> 2013-05-30 09:54:07,755 WARN org.apache.hadoop.hbase.zookeeper.ZKUtil: 
> hconnection-0x3e10d98be405bc Unable to set watcher on znode /hbase/master
> org.apache.zookeeper.KeeperException$AuthFailedException: KeeperErrorCode = 
> AuthFailed for /hbase/master
>         at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:123)
>         at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>         at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1036)
>         at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:166)
>         at 
> org.apache.hadoop.hbase.zookeeper.ZKUtil.watchAndCheckExists(ZKUtil.java:231)
>         at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.start(ZooKeeperNodeTracker.java:76)
>         at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.ensureZookeeperTrackers(HConnectionManager.java:595)
>         at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:850)
>         at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:825)
>         at org.apache.hadoop.hbase.client.HTable.finishSetup(HTable.java:286)
>         at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:201)
>         at 
> org.apache.hadoop.hbase.catalog.MetaReader.getHTable(MetaReader.java:200)
>         at 
> org.apache.hadoop.hbase.catalog.MetaReader.getMetaHTable(MetaReader.java:226)
>         at 
> org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:705)
>         at 
> org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:183)
>         at 
> org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:168)
>         at 
> org.apache.hadoop.hbase.master.CatalogJanitor.getSplitParents(CatalogJanitor.java:123)
>         at 
> org.apache.hadoop.hbase.master.CatalogJanitor.scan(CatalogJanitor.java:134)
>         at 
> org.apache.hadoop.hbase.master.CatalogJanitor.chore(CatalogJanitor.java:92)
>         at org.apache.hadoop.hbase.Chore.run(Chore.java:67)
>         at java.lang.Thread.run(Thread.java:662)
> I want to just abort the HMaster server on AuthFailed or SaslAuthenticated. 
> Any better ideas about this issue? 
> Since ZooKeeperWatcher is used in many classes, will aborting cause more 
> problems? Are there any other problems we need to consider? 
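
The abort-on-auth-failure idea in the quoted description can be sketched 
roughly as follows. This is only an illustration under stated assumptions: 
ConnState, AbortHook and AuthFailedAbortingWatcher are hypothetical stand-ins, 
not the ZooKeeper or HBase classes, and the sketch reacts only to the 
auth-failed state. It is not the attached patch.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical stand-in for the connection states a watcher may receive. */
enum ConnState { SYNC_CONNECTED, DISCONNECTED, EXPIRED, AUTH_FAILED }

/** Hypothetical hook through which the watcher asks its owner to shut down. */
interface AbortHook {
    void abort(String reason);
}

/** Sketch of a watcher that aborts its owner instead of ignoring AUTH_FAILED. */
final class AuthFailedAbortingWatcher {
    private final AbortHook owner;

    AuthFailedAbortingWatcher(AbortHook owner) {
        this.owner = owner;
    }

    /** Called for every connection-state event delivered to the watcher. */
    void onConnectionEvent(ConnState state) {
        switch (state) {
            case AUTH_FAILED:
                // Stop the process so it cannot keep acting as active master.
                owner.abort("ZooKeeper authentication failed");
                break;
            default:
                // Other states are left to whatever handling already exists.
                break;
        }
    }

    public static void main(String[] args) {
        AtomicReference<String> abortReason = new AtomicReference<>();
        AuthFailedAbortingWatcher watcher = new AuthFailedAbortingWatcher(abortReason::set);

        watcher.onConnectionEvent(ConnState.DISCONNECTED);
        System.out.println("after DISCONNECTED: " + abortReason.get());

        watcher.onConnectionEvent(ConnState.AUTH_FAILED);
        System.out.println("after AUTH_FAILED: " + abortReason.get());
    }
}
```

Because the watcher is shared by many classes, the owner passed in as the 
abort hook decides what aborting means in each place; a client-side owner 
could, for example, log and reconnect rather than exit.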



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
