Thank you all for your feedback. For load balancing using ZooKeeper via
Knox, I think there would be a performance impact if we initiated this
check on every new connection, but we could mitigate that by having Knox
check the ZooKeeper namespace once per new session (client). Alternatively,
could we develop something internal to Knox that does not use ZooKeeper?
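For reference, the Knox HaProvider entry that drives this ZooKeeper
discovery looks something like the following in the topology file (the
ZooKeeper hostnames here are placeholders):

<provider>
    <role>ha</role>
    <name>HaProvider</name>
    <enabled>true</enabled>
    <param>
        <name>HIVE</name>
        <value>maxFailoverAttempts=3;failoverSleep=1000;enabled=true;zookeeperEnsemble=zk1:2181,zk2:2181,zk3:2181;zookeeperNamespace=hiveserver2</value>
    </param>
</provider>

Knox resolves one HS2 instance from the zookeeperNamespace and keeps using
it until a failover is triggered, which is the behavior described above.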
Otherwise, the second solution I tested, as reported by Kevin, is to put a
load balancer behind Knox in front of HS2, but I have a problem with
Kerberos authentication. Below is my configuration using HAProxy:
1) Servers used
hahost = HAProxy server
hs2host1 = HS2 instance 1
hs2host2 = HS2 instance 2
2) Set up HAProxy with a minimal configuration
[root@hahost ~]# vim /etc/haproxy/haproxy.cfg
# This is the setup for HS2. Beeline clients connect to hahost:10001.
# HAProxy will balance connections among the list of servers listed below.
listen hiveserver2 :10001
mode tcp
option tcplog
balance source
server hs2_instance1 hs2host1.localdomain:10001
server hs2_instance2 hs2host2.localdomain:10001
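Note: "balance source" hashes the client's source IP, so a given client
keeps landing on the same HS2 instance; this matters because HS2 is
stateful, as Kevin points out below. Assuming a reasonably recent HAProxy,
appending "check" to each server line adds basic TCP health checks, so a
downed instance is taken out of rotation:

server hs2_instance1 hs2host1.localdomain:10001 check
server hs2_instance2 hs2host2.localdomain:10001 check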
3) Create a new service principal for HAProxy and for the two HS2 instances
from the KDC server
ipa service-add HTTP/hahost.localdomain@CLUSTER
ipa service-add HTTP/hs2host1.localdomain@CLUSTER
ipa service-add HTTP/hs2host2.localdomain@CLUSTER
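Each principal can then be verified from the KDC side with, for example:

ipa service-show HTTP/hahost.localdomain@CLUSTER

which should show the principal name and whether a key has been provisioned.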
4) Generate a single keytab containing the HTTP/<HS2 FQDN> keys for all HS2
servers and the HTTP/<load_balancer FQDN> key:
ipa-getkeytab -s <Kerberos_server> -p HTTP/hahost.localdomain@CLUSTER -k /tmp/spnego.service.keytab
ipa-getkeytab -s <Kerberos_server> -p HTTP/hs2host1.localdomain@CLUSTER -k /tmp/spnego.service.keytab
ipa-getkeytab -s <Kerberos_server> -p HTTP/hs2host2.localdomain@CLUSTER -k /tmp/spnego.service.keytab
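Note: each plain ipa-getkeytab call generates a new key on the KDC and bumps
the principal's KVNO, invalidating any keytab retrieved earlier for that
principal; that is likely why the KVNOs in the klist output below differ
(1, 4, 9). If a key already exists and must stay shared, retrieve mode (-r)
fetches the existing key instead of regenerating it, assuming the caller has
permission to retrieve it:

ipa-getkeytab -r -s <Kerberos_server> -p HTTP/hs2host1.localdomain@CLUSTER -k /tmp/spnego.service.keytab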
5) Copy this keytab to the HS2 servers
scp /tmp/spnego.service.keytab hs2host1.localdomain:/etc/security/keytabs/spnego.service.keytab
scp /tmp/spnego.service.keytab hs2host2.localdomain:/etc/security/keytabs/spnego.service.keytab
6) Set the correct owner and permissions:
chown root:hadoop /etc/security/keytabs/spnego.service.keytab
chmod 440 /etc/security/keytabs/spnego.service.keytab
7) Confirm:
[root@hs2host1 ~]# klist -kt /etc/security/keytabs/spnego.service.keytab
Keytab name: FILE:/etc/security/keytabs/spnego.service.keytab
KVNO Timestamp         Principal
---- ----------------- --------------------------------------------------------
1 01/14/19 09:54:06 HTTP/hahost.localdomain@CLUSTER
1 01/14/19 09:54:06 HTTP/hahost.localdomain@CLUSTER
1 01/14/19 09:54:06 HTTP/hahost.localdomain@CLUSTER
1 01/14/19 09:54:06 HTTP/hahost.localdomain@CLUSTER
4 01/14/19 09:54:18 HTTP/hs2host1.localdomain@CLUSTER
4 01/14/19 09:54:18 HTTP/hs2host1.localdomain@CLUSTER
4 01/14/19 09:54:18 HTTP/hs2host1.localdomain@CLUSTER
4 01/14/19 09:54:18 HTTP/hs2host1.localdomain@CLUSTER
9 01/14/19 09:54:33 HTTP/hs2host2.localdomain@CLUSTER
9 01/14/19 09:54:33 HTTP/hs2host2.localdomain@CLUSTER
9 01/14/19 09:54:33 HTTP/hs2host2.localdomain@CLUSTER
9 01/14/19 09:54:33 HTTP/hs2host2.localdomain@CLUSTER
[root@hs2host2 ~]# klist -kt /etc/security/keytabs/spnego.service.keytab
Keytab name: FILE:/etc/security/keytabs/spnego.service.keytab
KVNO Timestamp         Principal
---- ----------------- --------------------------------------------------------
1 01/14/19 09:55:12 HTTP/hahost.localdomain@CLUSTER
1 01/14/19 09:55:12 HTTP/hahost.localdomain@CLUSTER
1 01/14/19 09:55:12 HTTP/hahost.localdomain@CLUSTER
1 01/14/19 09:55:12 HTTP/hahost.localdomain@CLUSTER
4 01/14/19 09:55:33 HTTP/hs2host1.localdomain@CLUSTER
4 01/14/19 09:55:33 HTTP/hs2host1.localdomain@CLUSTER
4 01/14/19 09:55:33 HTTP/hs2host1.localdomain@CLUSTER
4 01/14/19 09:55:33 HTTP/hs2host1.localdomain@CLUSTER
9 01/14/19 09:55:33 HTTP/hs2host2.localdomain@CLUSTER
9 01/14/19 09:55:33 HTTP/hs2host2.localdomain@CLUSTER
9 01/14/19 09:55:33 HTTP/hs2host2.localdomain@CLUSTER
9 01/14/19 09:55:33 HTTP/hs2host2.localdomain@CLUSTER
8) From Ambari, set these parameters (in core-site.xml)
hadoop.proxyuser.HTTP.groups=*
hadoop.proxyuser.HTTP.hosts=*
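In XML form, the equivalent core-site.xml entries are:

<property>
  <name>hadoop.proxyuser.HTTP.groups</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.HTTP.hosts</name>
  <value>*</value>
</property>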
9) Restart HDFS and Hive from Ambari
10) Run kinit on each HS2 instance
kinit -kt /etc/security/keytabs/spnego.service.keytab HTTP/hahost.localdomain@CLUSTER
kinit -kt /etc/security/keytabs/spnego.service.keytab HTTP/hs2host1.localdomain@CLUSTER
kinit -kt /etc/security/keytabs/spnego.service.keytab HTTP/hs2host2.localdomain@CLUSTER
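The resulting ticket cache can be checked on each host with:

[root@hs2host1 ~]# klist

which should report a default principal of HTTP/hs2host1.localdomain@CLUSTER
and a valid krbtgt/CLUSTER@CLUSTER ticket.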
11) Test the connection to HS2 from beeline
beeline -u 'jdbc:hive2://hahost.localdomain:10001/;principal=HTTP/_HOST@CLUSTER;transportMode=http;httpPath=cliservice'
19/01/16 10:55:46 [main]: ERROR jdbc.HiveConnection: Error opening session
org.apache.thrift.transport.TTransportException: HTTP Response code: 401
        at org.apache.thrift.transport.THttpClient.flushUsingHttpClient(THttpClient.java:262)
        at org.apache.thrift.transport.THttpClient.flush(THttpClient.java:313)
        at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:73)
        at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:62)
        at org.apache.hive.service.cli.thrift.TCLIService$Client.send_OpenSession(TCLIService.java:158)
        at org.apache.hive.service.cli.thrift.TCLIService$Client.OpenSession(TCLIService.java:150)
        at org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:622)
        at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:221)
        at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
        at java.sql.DriverManager.getConnection(DriverManager.java:664)
        at java.sql.DriverManager.getConnection(DriverManager.java:208)
        at org.apache.hive.beeline.DatabaseConnection.connect(DatabaseConnection.java:146)
        at org.apache.hive.beeline.DatabaseConnection.getConnection(DatabaseConnection.java:211)
        at org.apache.hive.beeline.Commands.connect(Commands.java:1204)
        at org.apache.hive.beeline.Commands.connect(Commands.java:1100)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hive.beeline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:54)
        at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:998)
        at org.apache.hive.beeline.BeeLine.initArgs(BeeLine.java:717)
        at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:779)
        at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:493)
        at org.apache.hive.beeline.BeeLine.main(BeeLine.java:476)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
Error: Could not establish connection to jdbc:hive2://hahost.localdomain:10001/;principal=HTTP/_HOST@CLUSTER;transportMode=http;httpPath=cliservice: HTTP Response code: 401 (state=08S01,code=0)
Beeline version 1.2.1000.2.6.5.0-292 by Apache Hive
0: jdbc:hive2://hahost.localdomain (closed)>
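To isolate HAProxy from the Kerberos problem, the same URL shape can be
pointed at a single HS2 instance directly, bypassing the proxy:

beeline -u 'jdbc:hive2://hs2host1.localdomain:10001/;principal=HTTP/_HOST@CLUSTER;transportMode=http;httpPath=cliservice'

If that connection succeeds, the failure is specific to requests arriving
via hahost.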
12) HS2 log
2019-01-16 10:55:46,744 INFO [HiveServer2-HttpHandler-Pool: Thread-48]: thrift.ThriftHttpServlet (ThriftHttpServlet.java:doPost(145)) - Could not validate cookie sent, will try to generate a new cookie
2019-01-16 10:55:46,745 INFO [HiveServer2-HttpHandler-Pool: Thread-48]: thrift.ThriftHttpServlet (ThriftHttpServlet.java:doKerberosAuth(398)) - Failed to authenticate with http/_HOST kerberos principal, trying with hive/_HOST kerberos principal
2019-01-16 10:55:46,745 ERROR [HiveServer2-HttpHandler-Pool: Thread-48]: thrift.ThriftHttpServlet (ThriftHttpServlet.java:doKerberosAuth(406)) - Failed to authenticate with hive/_HOST kerberos principal
2019-01-16 10:55:46,746 ERROR [HiveServer2-HttpHandler-Pool: Thread-48]: thrift.ThriftHttpServlet (ThriftHttpServlet.java:doPost(209)) - Error:
org.apache.hive.service.auth.HttpAuthenticationException: java.lang.reflect.UndeclaredThrowableException
        at org.apache.hive.service.cli.thrift.ThriftHttpServlet.doKerberosAuth(ThriftHttpServlet.java:407)
        at org.apache.hive.service.cli.thrift.ThriftHttpServlet.doPost(ThriftHttpServlet.java:159)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
        at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:565)
        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:479)
        at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:225)
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1031)
        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:406)
        at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:186)
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:965)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:111)
        at org.eclipse.jetty.server.Server.handle(Server.java:349)
        at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:449)
        at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:925)
        at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:857)
        at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
        at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:76)
        at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:609)
        at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:45)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.reflect.UndeclaredThrowableException
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1887)
        at org.apache.hive.service.cli.thrift.ThriftHttpServlet.doKerberosAuth(ThriftHttpServlet.java:404)
        ... 23 more
Caused by: org.apache.hive.service.auth.HttpAuthenticationException: Kerberos authentication failed:
        at org.apache.hive.service.cli.thrift.ThriftHttpServlet$HttpKerberosServerAction.run(ThriftHttpServlet.java:463)
        at org.apache.hive.service.cli.thrift.ThriftHttpServlet$HttpKerberosServerAction.run(ThriftHttpServlet.java:412)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
        ... 24 more
Caused by: GSSException: Failure unspecified at GSS-API level (Mechanism level: Checksum failed)
        at sun.security.jgss.krb5.Krb5Context.acceptSecContext(Krb5Context.java:856)
        at sun.security.jgss.GSSContextImpl.acceptSecContext(GSSContextImpl.java:342)
        at sun.security.jgss.GSSContextImpl.acceptSecContext(GSSContextImpl.java:285)
        at org.apache.hive.service.cli.thrift.ThriftHttpServlet$HttpKerberosServerAction.run(ThriftHttpServlet.java:451)
        ... 28 more
Caused by: KrbException: Checksum failed
        at sun.security.krb5.internal.crypto.Aes256CtsHmacSha1EType.decrypt(Aes256CtsHmacSha1EType.java:102)
        at sun.security.krb5.internal.crypto.Aes256CtsHmacSha1EType.decrypt(Aes256CtsHmacSha1EType.java:94)
        at sun.security.krb5.EncryptedData.decrypt(EncryptedData.java:175)
        at sun.security.krb5.KrbApReq.authenticate(KrbApReq.java:281)
        at sun.security.krb5.KrbApReq.<init>(KrbApReq.java:149)
        at sun.security.jgss.krb5.InitSecContextToken.<init>(InitSecContextToken.java:108)
        at sun.security.jgss.krb5.Krb5Context.acceptSecContext(Krb5Context.java:829)
        ... 31 more
Caused by: java.security.GeneralSecurityException: Checksum failed
        at sun.security.krb5.internal.crypto.dk.AesDkCrypto.decryptCTS(AesDkCrypto.java:451)
        at sun.security.krb5.internal.crypto.dk.AesDkCrypto.decrypt(AesDkCrypto.java:272)
        at sun.security.krb5.internal.crypto.Aes256.decrypt(Aes256.java:76)
        at sun.security.krb5.internal.crypto.Aes256CtsHmacSha1EType.decrypt(Aes256CtsHmacSha1EType.java:100)
        ... 37 more
On Tue, Jan 15, 2019 at 9:36 PM, Kevin Risden <[email protected]> wrote:
> The main issue with load balancing HS2 is that HS2 is stateful. This
> means that you need to keep the same user going back to the same HS2
> instance. If you connect to HS2 via JDBC, when the next set of rows is
> requested it must go back to the same HS2 instance.
>
> Knox doesn't support load balancing backends currently. Adding an HTTP
> load balancer behind Knox before HS2 can be done, but you need to be
> careful with Kerberos and sticky sessions.
>
>
> Kevin Risden
>
> On Tue, Jan 15, 2019 at 11:16 AM David Villarreal
> <[email protected]> wrote:
> >
> > Hi Rabii,
> >
> >
> >
> > There is a lot to think about here. I don’t think checking zookeeper on
> > every request/connection would be a good design, but maybe if there is a
> > way to identify a new client session we could design it to go check
> > zookeeper. We would also need to see what the performance impact would
> > be. But I do like the concept. Just keep in mind that for zookeeper, I
> > don’t think this is a true load balancer in the hive code; I believe it
> > randomly returns a host:port for a registered hiveserver2 instance.
> >
> >
> >
> > Best regards,
> >
> >
> >
> > David
> >
> > From: rabii lamriq <[email protected]>
> > Reply-To: "[email protected]" <[email protected]>
> > Date: Tuesday, January 15, 2019 at 1:01 AM
> > To: "[email protected]" <[email protected]>
> > Subject: Load balancing of Hiveserver2 through Knox
> >
> >
> >
> > Hi
> >
> >
> >
> > I am using Knox to connect to HS2, but Knox ensures only HA, not load
> > balancing.
> >
> >
> >
> > In fact, I noticed that there is load balancing when I connect to HS2
> > using ZooKeeper directly, but with Knox, Knox connects to ZooKeeper to
> > get an available HS2 instance and then uses that instance for all
> > connections.
> >
> >
> >
> > My question is: can we do anything to make Knox connect to ZooKeeper on
> > each new connection, in order to get a different HS2 instance for each
> > new connection?
> >
> >
> >
> > Best
> >
> > Rabii
>