The main issue with load balancing HS2 is that HS2 is stateful, so a given user has to keep going back to the same HS2 instance. For example, if you connect to HS2 via JDBC, the request for the next set of rows must go to the same HS2 instance that served the previous one.
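To make the stickiness requirement concrete, here is a minimal Python sketch. It is not Knox or Hive code: the `StickyRouter` class, the session IDs, and the host:port strings are all hypothetical, and it assumes the balancer can see some stable per-session identifier. A new session gets a randomly chosen backend, and every later request for that session goes to the same one.

```python
import random


class StickyRouter:
    """Hypothetical router that pins each client session to one backend."""

    def __init__(self, backends, rng=None):
        if not backends:
            raise ValueError("at least one backend is required")
        self._backends = list(backends)
        self._rng = rng or random.Random()
        self._assigned = {}  # session id -> backend chosen for that session

    def route(self, session_id):
        """Return the backend for this session, choosing one on first use."""
        backend = self._assigned.get(session_id)
        if backend is None:
            # New session: pick a registered backend at random.
            backend = self._rng.choice(self._backends)
            self._assigned[session_id] = backend
        return backend

    def close(self, session_id):
        """Forget a finished session so its mapping does not leak."""
        self._assigned.pop(session_id, None)


if __name__ == "__main__":
    # Hypothetical host:port values, for illustration only.
    router = StickyRouter(["hs2-a.example.com:10001", "hs2-b.example.com:10001"])
    first = router.route("session-1")
    # A later fetch in the same session lands on the same instance.
    print(first == router.route("session-1"))
```

Load is only spread across sessions here, never within one, which is the most a stateful backend allows.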
Knox doesn't currently support load balancing across backends. You can add an HTTP load balancer behind Knox, in front of HS2, but you need to be careful with Kerberos and sticky sessions.

Kevin Risden

On Tue, Jan 15, 2019 at 11:16 AM David Villarreal <[email protected]> wrote:
>
> Hi Rabii,
>
> There is a lot to think about here. I don’t think checking ZooKeeper on
> every request/connection would be a good design, but if there is a way to
> identify a new client session, we could design it to check ZooKeeper then.
> We would also need to see what the performance impact would be. But I do
> like the concept. Just keep in mind that I don’t think ZooKeeper is a true
> load balancer in the Hive code. I believe it randomly returns a host:port
> for a registered HiveServer2 instance.
>
> Best regards,
>
> David
>
> From: rabii lamriq <[email protected]>
> Reply-To: "[email protected]" <[email protected]>
> Date: Tuesday, January 15, 2019 at 1:01 AM
> To: "[email protected]" <[email protected]>
> Subject: Load balancing of Hiveserver2 through Knox
>
> Hi
>
> I am using Knox to connect to HS2, but Knox provides only HA, not load
> balancing.
>
> In fact, I noticed that there is load balancing when I connect to HS2
> using ZooKeeper only, but with Knox, Knox connects to ZooKeeper to get an
> available instance of HS2, then uses this instance for all connections.
>
> My question is: is there anything we can do to make Knox connect to
> ZooKeeper on each new connection, in order to get a different instance for
> each new connection to HS2?
>
> Best
> Rabii
