This may get you further. https://community.hortonworks.com/content/supportkb/150574/how-to-enable-debug-logging-for-beeline.html
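If the 401 below needs more client-side detail than beeline prints by default, one hedged way to turn up logging (the linked KB article covers the distribution-specific steps; HADOOP_ROOT_LOGGER being honored by the beeline wrapper is an assumption about HDP-style installs):

    # Turn up client-side logging, then re-run the failing connection.
    # --verbose is a standard beeline flag; HADOOP_ROOT_LOGGER is the usual
    # Hadoop logging knob and is assumed to be picked up by the hive/beeline
    # launcher scripts on this distribution.
    export HADOOP_ROOT_LOGGER=DEBUG,console
    beeline --verbose=true -u 'jdbc:hive2://hahost.localdomain:10001/;principal=HTTP/_HOST@CLUSTER;transportMode=http;httpPath=cliservice'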
From: rabii lamriq <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Wednesday, January 16, 2019 at 2:06 AM
To: "[email protected]" <[email protected]>
Subject: Re: Load balancing of Hiveserver2 through Knox

Thank you all for your feedback.

For load balancing through ZooKeeper via Knox, I think there would be a performance impact if Knox ran this check on every new connection, but we could optimize it by having Knox check the ZooKeeper namespace only once per new session (client). Alternatively, could we develop something internal to Knox that does not rely on ZooKeeper?

Otherwise, the second solution I tested is the one Kevin described: a load balancer behind Knox, in front of HS2. However, I have a problem with Kerberos authentication. Below is my configuration using HAProxy:

1) Servers used

hahost = HAProxy server
hs2host1 = HS2 instance 1
hs2host2 = HS2 instance 2

2) Set up HAProxy with a minimal configuration

[root@hahost ~]# vim /etc/haproxy/haproxy.cfg

# This is the setup for HS2. beeline clients connect to hahost:10001.
# HAProxy will balance connections among the servers listed below.
listen hiveserver2 :10001
  mode tcp
  option tcplog
  balance source
  server hs2_instance1 hs2host1.localdomain:10001
  server hs2_instance2 hs2host2.localdomain:10001

3) Create a new service principal for HAProxy and for the two HS2 instances on the KDC server

ipa service-add HTTP/hahost.localdomain@CLUSTER
ipa service-add HTTP/hs2host1.localdomain@CLUSTER
ipa service-add HTTP/hs2host2.localdomain@CLUSTER

4) Generate a new keytab containing the HTTP/<HS2 FQDN> service for every HS2 server plus the HTTP/<load_balancer> service:

ipa-getkeytab -s <Kerberos_server> -p HTTP/hahost.localdomain@CLUSTER -k /tmp/spnego.service.keytab
ipa-getkeytab -s <Kerberos_server> -p HTTP/hs2host1.localdomain@CLUSTER -k /tmp/spnego.service.keytab
ipa-getkeytab -s <Kerberos_server> -p HTTP/hs2host2.localdomain@CLUSTER -k /tmp/spnego.service.keytab

5) Copy this keytab to the HS2 servers

scp /tmp/spnego.service.keytab hs2host1.localdomain:/etc/security/keytabs/spnego.service.keytab
scp /tmp/spnego.service.keytab hs2host2.localdomain:/etc/security/keytabs/spnego.service.keytab

6) Set owner and permissions:

chown root:hadoop /etc/security/keytabs/spnego.service.keytab
chmod 440 /etc/security/keytabs/spnego.service.keytab

7) Confirm:

[root@hs2host1 ~]# klist -kt /etc/security/keytabs/spnego.service.keytab
Keytab name: FILE:/etc/security/keytabs/spnego.service.keytab
KVNO Timestamp         Principal
---- ----------------- --------------------------------------------------------
   1 01/14/19 09:54:06 HTTP/hahost.localdomain@CLUSTER
   1 01/14/19 09:54:06 HTTP/hahost.localdomain@CLUSTER
   1 01/14/19 09:54:06 HTTP/hahost.localdomain@CLUSTER
   1 01/14/19 09:54:06 HTTP/hahost.localdomain@CLUSTER
   4 01/14/19 09:54:18 HTTP/hs2host1.localdomain@CLUSTER
   4 01/14/19 09:54:18 HTTP/hs2host1.localdomain@CLUSTER
   4 01/14/19 09:54:18 HTTP/hs2host1.localdomain@CLUSTER
   4 01/14/19 09:54:18 HTTP/hs2host1.localdomain@CLUSTER
   9 01/14/19 09:54:33 HTTP/hs2host2.localdomain@CLUSTER
   9 01/14/19 09:54:33 HTTP/hs2host2.localdomain@CLUSTER
   9 01/14/19 09:54:33 HTTP/hs2host2.localdomain@CLUSTER
   9 01/14/19 09:54:33 HTTP/hs2host2.localdomain@CLUSTER

[root@hs2host2 ~]# klist -kt /etc/security/keytabs/spnego.service.keytab
Keytab name: FILE:/etc/security/keytabs/spnego.service.keytab
KVNO Timestamp         Principal
---- ----------------- --------------------------------------------------------
   1 01/14/19 09:55:12 HTTP/hahost.localdomain@CLUSTER
   1 01/14/19 09:55:12 HTTP/hahost.localdomain@CLUSTER
   1 01/14/19 09:55:12 HTTP/hahost.localdomain@CLUSTER
   1 01/14/19 09:55:12 HTTP/hahost.localdomain@CLUSTER
   4 01/14/19 09:55:33 HTTP/hs2host1.localdomain@CLUSTER
   4 01/14/19 09:55:33 HTTP/hs2host1.localdomain@CLUSTER
   4 01/14/19 09:55:33 HTTP/hs2host1.localdomain@CLUSTER
   4 01/14/19 09:55:33 HTTP/hs2host1.localdomain@CLUSTER
   9 01/14/19 09:55:33 HTTP/hs2host2.localdomain@CLUSTER
   9 01/14/19 09:55:33 HTTP/hs2host2.localdomain@CLUSTER
   9 01/14/19 09:55:33 HTTP/hs2host2.localdomain@CLUSTER
   9 01/14/19 09:55:33 HTTP/hs2host2.localdomain@CLUSTER
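An aside on step 7 (a hedged sketch, not part of the original steps): the repeated klist lines are just one entry per encryption type, but the KVNO and enctypes stored in the keytab must match what the KDC currently holds for each principal, or ticket decryption fails later with a "Checksum failed" error like the one below.

    # Compare keytab key versions/enctypes against the KDC's current KVNOs.
    klist -kte /etc/security/keytabs/spnego.service.keytab   # -e shows enctypes too
    # kvno needs a valid TGT first; the service keytab itself can provide one
    kinit -kt /etc/security/keytabs/spnego.service.keytab HTTP/hahost.localdomain@CLUSTER
    kvno HTTP/hahost.localdomain@CLUSTER      # compare with the keytab's KVNO (1 above)
    kvno HTTP/hs2host1.localdomain@CLUSTER    # expect 4
    kvno HTTP/hs2host2.localdomain@CLUSTER    # expect 9

If the KDC reports a higher KVNO than the keytab (for example because ipa-getkeytab was re-run later, which rotates the key), the keytab has to be regenerated and redistributed.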
8) From Ambari, change these parameters in core-site.xml:

hadoop.proxyuser.HTTP.groups=*
hadoop.proxyuser.HTTP.hosts=*

9) Restart HDFS and Hive from Ambari.

10) Execute kinit on each HS2 instance

kinit -kt /etc/security/keytabs/spnego.service.keytab HTTP/hahost.localdomain@CLUSTER
kinit -kt /etc/security/keytabs/spnego.service.keytab HTTP/hs2host1.localdomain@CLUSTER
kinit -kt /etc/security/keytabs/spnego.service.keytab HTTP/hs2host2.localdomain@CLUSTER

11) Test the connection to HS2 from beeline

beeline -u 'jdbc:hive2://hahost.localdomain:10001/;principal=HTTP/_HOST@CLUSTER;transportMode=http;httpPath=cliservice'

19/01/16 10:55:46 [main]: ERROR jdbc.HiveConnection: Error opening session
org.apache.thrift.transport.TTransportException: HTTP Response code: 401
    at org.apache.thrift.transport.THttpClient.flushUsingHttpClient(THttpClient.java:262)
    at org.apache.thrift.transport.THttpClient.flush(THttpClient.java:313)
    at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:73)
    at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:62)
    at org.apache.hive.service.cli.thrift.TCLIService$Client.send_OpenSession(TCLIService.java:158)
    at org.apache.hive.service.cli.thrift.TCLIService$Client.OpenSession(TCLIService.java:150)
    at org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:622)
    at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:221)
    at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
    at java.sql.DriverManager.getConnection(DriverManager.java:664)
    at java.sql.DriverManager.getConnection(DriverManager.java:208)
    at org.apache.hive.beeline.DatabaseConnection.connect(DatabaseConnection.java:146)
    at org.apache.hive.beeline.DatabaseConnection.getConnection(DatabaseConnection.java:211)
    at org.apache.hive.beeline.Commands.connect(Commands.java:1204)
    at org.apache.hive.beeline.Commands.connect(Commands.java:1100)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hive.beeline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:54)
    at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:998)
    at org.apache.hive.beeline.BeeLine.initArgs(BeeLine.java:717)
    at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:779)
    at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:493)
    at org.apache.hive.beeline.BeeLine.main(BeeLine.java:476)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
Error: Could not establish connection to jdbc:hive2://hahost.localdomain:10001/;principal=HTTP/_HOST@CLUSTER;transportMode=http;httpPath=cliservice: HTTP Response code: 401 (state=08S01,code=0)
Beeline version 1.2.1000.2.6.5.0-292 by Apache Hive
0: jdbc:hive2://hahost.localdomain (closed)>
12) HS2 log

2019-01-16 10:55:46,744 INFO  [HiveServer2-HttpHandler-Pool: Thread-48]: thrift.ThriftHttpServlet (ThriftHttpServlet.java:doPost(145)) - Could not validate cookie sent, will try to generate a new cookie
2019-01-16 10:55:46,745 INFO  [HiveServer2-HttpHandler-Pool: Thread-48]: thrift.ThriftHttpServlet (ThriftHttpServlet.java:doKerberosAuth(398)) - Failed to authenticate with http/_HOST kerberos principal, trying with hive/_HOST kerberos principal
2019-01-16 10:55:46,745 ERROR [HiveServer2-HttpHandler-Pool: Thread-48]: thrift.ThriftHttpServlet (ThriftHttpServlet.java:doKerberosAuth(406)) - Failed to authenticate with hive/_HOST kerberos principal
2019-01-16 10:55:46,746 ERROR [HiveServer2-HttpHandler-Pool: Thread-48]: thrift.ThriftHttpServlet (ThriftHttpServlet.java:doPost(209)) - Error:
org.apache.hive.service.auth.HttpAuthenticationException: java.lang.reflect.UndeclaredThrowableException
    at org.apache.hive.service.cli.thrift.ThriftHttpServlet.doKerberosAuth(ThriftHttpServlet.java:407)
    at org.apache.hive.service.cli.thrift.ThriftHttpServlet.doPost(ThriftHttpServlet.java:159)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
    at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:565)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:479)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:225)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1031)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:406)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:186)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:965)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:111)
    at org.eclipse.jetty.server.Server.handle(Server.java:349)
    at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:449)
    at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:925)
    at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:857)
    at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
    at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:76)
    at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:609)
    at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:45)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.reflect.UndeclaredThrowableException
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1887)
    at org.apache.hive.service.cli.thrift.ThriftHttpServlet.doKerberosAuth(ThriftHttpServlet.java:404)
    ... 23 more
Caused by: org.apache.hive.service.auth.HttpAuthenticationException: Kerberos authentication failed:
    at org.apache.hive.service.cli.thrift.ThriftHttpServlet$HttpKerberosServerAction.run(ThriftHttpServlet.java:463)
    at org.apache.hive.service.cli.thrift.ThriftHttpServlet$HttpKerberosServerAction.run(ThriftHttpServlet.java:412)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
    ... 24 more
Caused by: GSSException: Failure unspecified at GSS-API level (Mechanism level: Checksum failed)
    at sun.security.jgss.krb5.Krb5Context.acceptSecContext(Krb5Context.java:856)
    at sun.security.jgss.GSSContextImpl.acceptSecContext(GSSContextImpl.java:342)
    at sun.security.jgss.GSSContextImpl.acceptSecContext(GSSContextImpl.java:285)
    at org.apache.hive.service.cli.thrift.ThriftHttpServlet$HttpKerberosServerAction.run(ThriftHttpServlet.java:451)
    ... 28 more
Caused by: KrbException: Checksum failed
    at sun.security.krb5.internal.crypto.Aes256CtsHmacSha1EType.decrypt(Aes256CtsHmacSha1EType.java:102)
    at sun.security.krb5.internal.crypto.Aes256CtsHmacSha1EType.decrypt(Aes256CtsHmacSha1EType.java:94)
    at sun.security.krb5.EncryptedData.decrypt(EncryptedData.java:175)
    at sun.security.krb5.KrbApReq.authenticate(KrbApReq.java:281)
    at sun.security.krb5.KrbApReq.<init>(KrbApReq.java:149)
    at sun.security.jgss.krb5.InitSecContextToken.<init>(InitSecContextToken.java:108)
    at sun.security.jgss.krb5.Krb5Context.acceptSecContext(Krb5Context.java:829)
    ... 31 more
Caused by: java.security.GeneralSecurityException: Checksum failed
    at sun.security.krb5.internal.crypto.dk.AesDkCrypto.decryptCTS(AesDkCrypto.java:451)
    at sun.security.krb5.internal.crypto.dk.AesDkCrypto.decrypt(AesDkCrypto.java:272)
    at sun.security.krb5.internal.crypto.Aes256.decrypt(Aes256.java:76)
    at sun.security.krb5.internal.crypto.Aes256CtsHmacSha1EType.decrypt(Aes256CtsHmacSha1EType.java:100)
    ... 37 more
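Two checks that can help localize a failure like the one above (hedged sketches, not from the thread; the hostnames match the setup above, everything else is an assumption):

    # 1) Exercise the SPNEGO handshake through the load balancer without the
    #    Hive JDBC driver (requires a curl build with GSS-API/SPNEGO support).
    #    A 401 here as well points at Kerberos rather than at Hive.
    kinit user@CLUSTER    # any user principal with a valid password
    curl --negotiate -u : -v http://hahost.localdomain:10001/cliservice

    # 2) HS2 does not take its SPNEGO identity from the ticket cache populated
    #    by the kinit in step 10; it reads these standard hive-site.xml
    #    properties at startup, so they must point at the merged keytab. Note
    #    that Hadoop-style _HOST substitution expands to each HS2 host's own
    #    FQDN, not the load balancer's name:
    #      hive.server2.authentication.spnego.principal = HTTP/_HOST@CLUSTER
    #      hive.server2.authentication.spnego.keytab    = /etc/security/keytabs/spnego.service.keytab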
On Tue, Jan 15, 2019 at 9:36 PM, Kevin Risden <[email protected]> wrote:

The main issue with load balancing HS2 is that HS2 is stateful. This means you need to keep the same user going back to the same HS2 instance. If you connect to HS2 via JDBC, when the next set of rows is requested it must go back to the same HS2 instance. Knox doesn't currently support load balancing backends. Adding an HTTP load balancer behind Knox, in front of HS2, can be done, but you need to be careful with Kerberos and sticky sessions.

Kevin Risden

On Tue, Jan 15, 2019 at 11:16 AM David Villarreal <[email protected]> wrote:
>
> Hi Rabii,
>
> There is a lot to think about here. I don't think checking ZooKeeper on
> every request/connection would be a good design, but maybe, if there is a
> way to identify a new client session, we could design it to check
> ZooKeeper then. We would also need to see what the performance impact
> would be. But I do like the concept. Just keep in mind that for ZooKeeper,
> I don't think this is a true load balancer in the Hive code; I believe it
> randomly returns a host:port for a registered HiveServer2 instance.
>
> Best regards,
>
> David
>
> From: rabii lamriq <[email protected]>
> Reply-To: "[email protected]" <[email protected]>
> Date: Tuesday, January 15, 2019 at 1:01 AM
> To: "[email protected]" <[email protected]>
> Subject: Load balancing of Hiveserver2 through Knox
>
> Hi,
>
> I am using Knox to connect to HS2, but Knox ensures only HA, not load
> balancing.
>
> In fact, I noticed that there is load balancing when I connect to HS2
> using ZooKeeper directly; but with Knox, Knox contacts ZooKeeper to get an
> available HS2 instance and then uses that instance for all connections.
>
> My question is: can we do anything to make Knox contact ZooKeeper on each
> new connection, in order to get a different instance for each new
> connection to HS2?
>
> Best,
>
> Rabii
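For comparison with the behavior Rabii describes when going through ZooKeeper directly: with service discovery, each new connection asks ZooKeeper for a registered HS2 instance, which is what spreads sessions across instances. A sketch (the ZooKeeper quorum hosts and namespace here are assumptions for this cluster):

    # Each connection attempt resolves an HS2 instance from the znodes under
    # /hiveserver2 -- effectively a random pick, as David notes above.
    beeline -u 'jdbc:hive2://zk1:2181,zk2:2181,zk3:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2'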
