[ https://issues.apache.org/jira/browse/ZOOKEEPER-4334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Emil Kleszcz updated ZOOKEEPER-4334:
------------------------------------
    Description:
I faced an issue while trying to use alternative host aliases with a ZooKeeper quorum when SASL is enabled. The errors I get in the ZooKeeper log are the following:
```
2021-07-12 21:04:46,437 [myid:3] - WARN [NIOWorkerThread-3:ZooKeeperServer@1661] - Client /<IP addr>:37368 failed to SASL authenticate: {}
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: Failure unspecified at GSS-API level (Mechanism level: Checksum failed)]
	at com.sun.security.sasl.gsskerb.GssKrb5Server.evaluateResponse(GssKrb5Server.java:199)
	at org.apache.zookeeper.server.ZooKeeperSaslServer.evaluateResponse(ZooKeeperSaslServer.java:49)
	at org.apache.zookeeper.server.ZooKeeperServer.processSasl(ZooKeeperServer.java:1650)
	at org.apache.zookeeper.server.ZooKeeperServer.processPacket(ZooKeeperServer.java:1599)
	at org.apache.zookeeper.server.NIOServerCnxn.readRequest(NIOServerCnxn.java:379)
	at org.apache.zookeeper.server.NIOServerCnxn.readPayload(NIOServerCnxn.java:182)
	at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:339)
	at org.apache.zookeeper.server.NIOServerCnxnFactory$IOWorkRequest.doWork(NIOServerCnxnFactory.java:522)
	at org.apache.zookeeper.server.WorkerService$ScheduledWorkRequest.run(WorkerService.java:154)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: GSSException: Failure unspecified at GSS-API level (Mechanism level: Checksum failed)
	at sun.security.jgss.krb5.Krb5Context.acceptSecContext(Krb5Context.java:856)
	at sun.security.jgss.GSSContextImpl.acceptSecContext(GSSContextImpl.java:342)
	at sun.security.jgss.GSSContextImpl.acceptSecContext(GSSContextImpl.java:285)
	at com.sun.security.sasl.gsskerb.GssKrb5Server.evaluateResponse(GssKrb5Server.java:167)
	... 11 more
Caused by: KrbException: Checksum failed
	at sun.security.krb5.internal.crypto.Aes256CtsHmacSha1EType.decrypt(Aes256CtsHmacSha1EType.java:102)
	at sun.security.krb5.internal.crypto.Aes256CtsHmacSha1EType.decrypt(Aes256CtsHmacSha1EType.java:94)
	at sun.security.krb5.EncryptedData.decrypt(EncryptedData.java:175)
	at sun.security.krb5.KrbApReq.authenticate(KrbApReq.java:281)
	at sun.security.krb5.KrbApReq.<init>(KrbApReq.java:149)
	at sun.security.jgss.krb5.InitSecContextToken.<init>(InitSecContextToken.java:108)
	at sun.security.jgss.krb5.Krb5Context.acceptSecContext(Krb5Context.java:829)
	... 14 more
Caused by: java.security.GeneralSecurityException: Checksum failed
	at sun.security.krb5.internal.crypto.dk.AesDkCrypto.decryptCTS(AesDkCrypto.java:451)
	at sun.security.krb5.internal.crypto.dk.AesDkCrypto.decrypt(AesDkCrypto.java:272)
	at sun.security.krb5.internal.crypto.Aes256.decrypt(Aes256.java:76)
	at sun.security.krb5.internal.crypto.Aes256CtsHmacSha1EType.decrypt(Aes256CtsHmacSha1EType.java:100)
	... 20 more
```
What did I do?
1) Created host aliases for each quorum node (a, b, c): zk1, zk2, zk3.
2) Changed zoo.cfg from
    server.1=a
    server.2=b
    server.3=c
to:
    server.1=zk1
    server.2=zk2
    server.3=zk3
(At this stage, after restarting the ensemble, everything works as expected.)
3) Generated a new keytab, zookeeper.keytab, containing both the alias-based and the host-based principals.
4) Changed the jaas.conf (server) definition from:
    Server {
      com.sun.security.auth.module.Krb5LoginModule required
      useKeyTab=true
      keyTab="/etc/zookeeper/conf/zookeeper.keytab"
      storeKey=true
      useTicketCache=false
      principal="zookeeper/a.com@COM";
    };
to
    Server {
      com.sun.security.auth.module.Krb5LoginModule required
      useKeyTab=true
      keyTab="/etc/zookeeper/conf/zookeeper.keytab"
      storeKey=true
      useTicketCache=false
      principal="zookeeper/zk1.com@COM";
    };
From that moment, after restarting the quorum members, I get the above error.

Now, why do I do this?
To allow other services such as zkfc, hbase, hdfs and yarn to connect to the quorum using aliases. Interestingly, without changing the zookeeper principal, hbase works perfectly, but the other 3 services fail with:
```
<2021-07-12T20:45:19.491+0200> <INFO> <org.apache.zookeeper.ZooKeeper>: <Initiating client connection, connectString=zk01.com:2181,zk02.com:2181,zk03.com:2181 sessionTimeout=10000 watcher=org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef@3246fb96>
<2021-07-12T20:45:19.519+0200> <INFO> <org.apache.zookeeper.Login>: <Client successfully logged in.>
<2021-07-12T20:45:19.521+0200> <INFO> <org.apache.zookeeper.Login>: <TGT refresh thread started.>
<2021-07-12T20:45:19.524+0200> <INFO> <org.apache.zookeeper.Login>: <TGT valid starting at: Mon Jul 12 20:45:19 CEST 2021>
<2021-07-12T20:45:19.524+0200> <INFO> <org.apache.zookeeper.Login>: <TGT expires: Tue Jul 13 21:45:19 CEST 2021>
<2021-07-12T20:45:19.524+0200> <INFO> <org.apache.zookeeper.Login>: <TGT refresh sleeping until: Tue Jul 13 17:05:16 CEST 2021>
<2021-07-12T20:45:19.524+0200> <INFO> <org.apache.zookeeper.client.ZooKeeperSaslClient>: <Client will use GSSAPI as SASL mechanism.>
<2021-07-12T20:45:19.530+0200> <INFO> <org.apache.zookeeper.ClientCnxn>: <Opening socket connection to server zk02.com/<ip addr>:2181. Will attempt to SASL-authenticate using Login Context section 'Client'>
<2021-07-12T20:45:19.535+0200> <INFO> <org.apache.zookeeper.ClientCnxn>: <Socket connection established to zk02.com/<ip addr>:2181, initiating session>
<2021-07-12T20:45:19.543+0200> <INFO> <org.apache.zookeeper.ClientCnxn>: <Session establishment complete on server zk02.com/<ip addr>:2181, sessionid = 0x200247870fb0007, negotiated timeout = 10000>
<2021-07-12T20:45:19.561+0200> <ERROR> <org.apache.zookeeper.client.ZooKeeperSaslClient>: <SASL authentication failed using login context 'Client' with exception: {}>
javax.security.sasl.SaslException: Error in authenticating with a Zookeeper Quorum member: the quorum member's saslToken is null.
	at org.apache.zookeeper.client.ZooKeeperSaslClient.createSaslToken(ZooKeeperSaslClient.java:279)
	at org.apache.zookeeper.client.ZooKeeperSaslClient.respondToServer(ZooKeeperSaslClient.java:242)
	at org.apache.zookeeper.ClientCnxn$SendThread.readResponse(ClientCnxn.java:805)
	at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:94)
	at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:366)
	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1145)
<2021-07-12T20:45:19.564+0200> <INFO> <org.apache.zookeeper.ClientCnxn>: <Unable to read additional data from server sessionid 0x200247870fb0007, likely server has closed socket, closing socket connection and attempting reconnect>
<2021-07-12T20:45:19.671+0200> <INFO> <org.apache.hadoop.ha.ActiveStandbyElector>: <Session connected.>
<2021-07-12T20:45:19.672+0200> <ERROR> <org.apache.hadoop.hdfs.tools.DFSZKFailoverController>: <DFSZKFailOverController exiting due to earlier exception java.io.IOException: Couldn't determine existence of znode
```
When I change the zookeeper principal, hbase starts failing with this error, while the other services (except for zookeeper itself) somehow work fine. After that, I cannot connect manually to the zk quorum using zkCli or zookeeper-client, whichever combination of principals I try.

I wonder if that may have something to do with the "Server environment:host.name=" pointing to the canonical name (and not the alias) during the startup. The same happens after specifying the alias with clientPortAddress=.

> SASL authentication fails when using host aliases
> -------------------------------------------------
>
>                 Key: ZOOKEEPER-4334
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4334
>             Project: ZooKeeper
>          Issue Type: Bug
>    Affects Versions: 3.6.1
>            Reporter: Emil Kleszcz
>            Priority: Critical
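
The host.name hypothesis in the description can be checked outside ZooKeeper with a small Java sketch. This is not ZooKeeper code and only illustrates the suspected mechanism: it assumes a Kerberos service principal of the form service/host@REALM and compares the principal built from an alias with the one built from the canonical name the JVM resolves for that alias. The class name `PrincipalSketch`, the helper names, and the realm `COM` (taken from the anonymized jaas.conf in the report) are illustrative.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Locale;

class PrincipalSketch {
    // Builds a principal string of the form service/host@REALM (host lower-cased).
    static String servicePrincipal(String service, String host, String realm) {
        return service + "/" + host.toLowerCase(Locale.ROOT) + "@" + realm;
    }

    // Returns the canonical name the JVM resolver reports for a host,
    // or the input unchanged if the name cannot be resolved.
    static String canonicalName(String host) {
        try {
            return InetAddress.getByName(host).getCanonicalHostName();
        } catch (UnknownHostException e) {
            return host;
        }
    }

    public static void main(String[] args) {
        // Pass an alias such as zk1.com as the first argument.
        String alias = args.length > 0 ? args[0] : "localhost";
        String canonical = canonicalName(alias);
        System.out.println("alias principal:     " + servicePrincipal("zookeeper", alias, "COM"));
        System.out.println("canonical principal: " + servicePrincipal("zookeeper", canonical, "COM"));
        // If these differ, a peer that canonicalizes the host name would ask the KDC
        // for a ticket for a different principal than the one named in jaas.conf.
        System.out.println("same host name:      " + alias.equalsIgnoreCase(canonical));
    }
}
```

If the two printed principals differ for an alias, a service ticket issued for one of them cannot be decrypted with the key of the other, which would be consistent with a "Checksum failed" error on the server side. Whether each client in the report actually canonicalizes the host name is not confirmed by the logs above.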