Qiheng He created HIVE-28424:
--------------------------------

             Summary: The Docker Image of HiveServer2 should provide an env 
variable defining the `hostname:port` passed into the znode
                 Key: HIVE-28424
                 URL: https://issues.apache.org/jira/browse/HIVE-28424
             Project: Hive
          Issue Type: Improvement
            Reporter: Qiheng He


* The Docker Image of HiveServer2 should provide an environment variable 
defining the *hostname:port* passed into the znode.
 * This requirement may seem strange at first glance. It comes from a small 
service orchestration scenario related to 
[https://github.com/dbeaver/dbeaver/issues/22777] .
 * To enable ZooKeeper Service Discovery for HiveServer2 with the 
*apache/hive:4.0.0* Docker Image, I apparently need to overwrite the 
*hive-site.xml* in that image. I tested what needs to be done to achieve this 
at [https://github.com/linghengqian/hivesever2-v400-sd-test] . First I need to 
define a docker-compose file to pull in the ZooKeeper server.

{code:yaml}
services:
  zookeeper-server:
    image: zookeeper:3.9.2-jre-17
    restart: always
    ports:
      - "2181:2181"
  hive-server2:
    image: apache/hive:4.0.0
    restart: always
    hostname: '127.0.0.1'
    depends_on:
      zookeeper-server:
        condition: service_started
    environment:
      SERVICE_NAME: hiveserver2
      HIVE_CUSTOM_CONF_DIR: /hive_custom_conf
    ports:
      - "10000:10000"
      - "10002:10002"
    volumes:
      - ./hive-custom-conf:/hive_custom_conf 
{code}
 - Setting the hostname of the hive-server2 container to *127.0.0.1* already 
compromises the local Docker network. I have to do it because HiveServer2 
always passes its own hostname plus *:10000* to the znode on the ZooKeeper 
server, and this value cannot be changed externally. In this setup, the znode 
*/hiveserver2/serverUri=localhost:10000;version=4.0.0;sequence=0000000000* has 
the content 
*hive.server2.instance.uri=localhost:10000;hive.server2.authentication=NONE;hive.server2.transport.mode=binary;hive.server2.thrift.sasl.qop=auth;hive.server2.thrift.bind.host=localhost;hive.server2.thrift.port=10000;hive.server2.use.SSL=false*.
 This can be observed in a ZooKeeper web UI, for example by deploying the 
*elkozmon/zoonavigator:1.1.3* Docker container.
- At this point, I also need to mount a *hive-site.xml* into the HiveServer2 
container. Most of its content duplicates 
https://github.com/apache/hive/blob/rel/release-4.0.0/packaging/src/docker/conf/hive-site.xml,
 but since only one *hive-site.xml* seems to be used at a time, I have to 
repeat those definitions.

{code:xml}
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <property>
        <name>hive.server2.enable.doAs</name>
        <value>false</value>
    </property>
    <property>
        <name>hive.tez.exec.inplace.progress</name>
        <value>false</value>
    </property>
    <property>
        <name>hive.tez.exec.print.summary</name>
        <value>true</value>
    </property>
    <property>
        <name>hive.exec.scratchdir</name>
        <value>/opt/hive/scratch_dir</value>
    </property>
    <property>
        <name>hive.user.install.directory</name>
        <value>/opt/hive/install_dir</value>
    </property>
    <property>
        <name>tez.runtime.optimize.local.fetch</name>
        <value>true</value>
    </property>
    <property>
        <name>hive.exec.submit.local.task.via.child</name>
        <value>false</value>
    </property>
    <property>
        <name>mapreduce.framework.name</name>
        <value>local</value>
    </property>
    <property>
        <name>tez.local.mode</name>
        <value>true</value>
    </property>
    <property>
        <name>hive.metastore.warehouse.dir</name>
        <value>/opt/hive/data/warehouse</value>
    </property>
    <property>
        <name>metastore.metastore.event.db.notification.api.auth</name>
        <value>false</value>
    </property>

    <property>
        <name>hive.server2.support.dynamic.service.discovery</name>
        <value>true</value>
    </property>
    <property>
        <name>hive.zookeeper.quorum</name>
        <value>zookeeper-server:2181</value>
    </property>
</configuration>
{code}

- At this point, from outside the Docker Compose network, I can connect to the 
deployed HiveServer2 in DBeaver via the jdbcUrl 
*jdbc:hive2://localhost:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;*.
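
For reference, the same connection can be attempted from plain Java. This is a minimal sketch, not the code from the linked test repository: it assumes the Hive JDBC driver is on the classpath, and the class and method names (`ZkDiscoveryUrl`, `build`) are made up for illustration. Only the URL format is taken from the description above.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical helper; the names are illustrative and not part of Hive.
final class ZkDiscoveryUrl {

    // Builds a JDBC URL that points at the ZooKeeper quorum rather than at HiveServer2 itself.
    static String build(String zkQuorum, String namespace) {
        return "jdbc:hive2://" + zkQuorum
                + "/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=" + namespace + ";";
    }

    public static void main(String[] args) {
        // 2181 is the ZooKeeper port that the compose file publishes to the host.
        String url = build("localhost:2181", "hiveserver2");
        System.out.println("Connecting with " + url);
        // With service discovery, the driver looks up the znode and dials the host:port stored there.
        try (Connection connection = DriverManager.getConnection(url)) {
            System.out.println("Connected: " + !connection.isClosed());
        } catch (SQLException e) {
            // Also reached when no Hive JDBC driver is available on the classpath.
            System.out.println("Connection failed: " + e.getMessage());
        }
    }
}
```

Note that the URL never names HiveServer2 directly, so the client depends entirely on the *hostname:port* that HiveServer2 wrote into the znode.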
 
- But suppose the docker compose file is defined as follows, where I only 
changed the hostnames of the two containers in the same Docker network.

{code:yaml}
services:
  zookeeper-server:
    image: zookeeper:3.9.2-jre-17
    hostname: 'zookeeper-server'
    restart: always
    ports:
      - "2181:2181"
  hive-server2:
    image: apache/hive:4.0.0
    restart: always
    hostname: 'server2.hive.com'
    depends_on:
      zookeeper-server:
        condition: service_started
    environment:
      SERVICE_NAME: hiveserver2
      HIVE_CUSTOM_CONF_DIR: /hive_custom_conf
    ports:
      - "10000:10000"
      - "10002:10002"
    volumes:
      - ./hive-custom-conf:/hive_custom_conf 
{code}
- With this change, the jdbcUrl 
*jdbc:hive2://localhost:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;*
 still connects to ZooKeeper, but not to HiveServer2. This is because the 
ZooKeeper server now contains only the znode 
*/hiveserver2/serverUri=server2.hive.com:10000;version=4.0.0;sequence=0000000000*,
 and its content is 
*hive.server2.instance.uri=server2.hive.com:10000;hive.server2.authentication=NONE;hive.server2.transport.mode=binary;hive.server2.thrift.sasl.qop=auth;hive.server2.thrift.bind.host=server2.hive.com;hive.server2.thrift.port=10000;hive.server2.use.SSL=false*.
 However, *server2.hive.com:10000* is not accessible from outside the Docker 
network, which hurts the local debugging experience.

{code:java}
com.zaxxer.hikari.pool.HikariPool$PoolInitializationException: Failed to initialize pool: Could not open client transport for any of the Server URI's in ZooKeeper: Socket is closed by peer.
        at com.zaxxer.hikari.pool.HikariPool.throwPoolInitializationException(HikariPool.java:596)
        at com.zaxxer.hikari.pool.HikariPool.checkFailFast(HikariPool.java:582)
        at com.zaxxer.hikari.pool.HikariPool.<init>(HikariPool.java:115)
        at com.zaxxer.hikari.HikariDataSource.<init>(HikariDataSource.java:81)
        at com.lingh.HiveTest.test(HiveTest.java:20)
        at java.base/java.lang.reflect.Method.invoke(Method.java:580)
        at java.base/java.util.ArrayList.forEach(ArrayList.java:1597)
        at java.base/java.util.ArrayList.forEach(ArrayList.java:1597)
Caused by: java.sql.SQLException: Could not open client transport for any of the Server URI's in ZooKeeper: Socket is closed by peer.
        at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:420)
        at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:285)
        at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:94)
        at com.zaxxer.hikari.util.DriverDataSource.getConnection(DriverDataSource.java:121)
        at com.zaxxer.hikari.pool.PoolBase.newConnection(PoolBase.java:364)
        at com.zaxxer.hikari.pool.PoolBase.newPoolEntry(PoolBase.java:206)
        at com.zaxxer.hikari.pool.HikariPool.createPoolEntry(HikariPool.java:476)
        at com.zaxxer.hikari.pool.HikariPool.checkFailFast(HikariPool.java:561)
        ... 6 more
Caused by: org.apache.hive.org.apache.thrift.transport.TTransportException: Socket is closed by peer.
        at org.apache.hive.org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:184)
        at org.apache.hive.org.apache.thrift.transport.TTransport.readAll(TTransport.java:109)
        at org.apache.hive.org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:151)
        at org.apache.hive.org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:272)
        at org.apache.hive.org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:39)
        at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:512)
        at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:382)
        ... 13 more
{code}
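
The failure can be narrowed down without the JDBC driver. The sketch below parses the znode content quoted above into key/value pairs and then tries a plain TCP connection to the advertised *hive.server2.instance.uri*. It is only a diagnostic sketch under stated assumptions: the class and method names (`AdvertisedEndpoint`, `parse`, `canConnect`) are hypothetical, the payload is copied from this report, and the 2-second timeout is an arbitrary choice.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical diagnostic helper; not part of Hive or ZooKeeper.
final class AdvertisedEndpoint {

    // Splits "k1=v1;k2=v2;..." (the znode content format shown in this report) into a map.
    static Map<String, String> parse(String znodeContent) {
        Map<String, String> properties = new LinkedHashMap<>();
        for (String pair : znodeContent.split(";")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                properties.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
            }
        }
        return properties;
    }

    // Returns true if a TCP connection to host:port succeeds within timeoutMillis.
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            // An unresolvable host name makes connect() throw UnknownHostException, an IOException.
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Payload copied from the znode content in this report.
        String content = "hive.server2.instance.uri=server2.hive.com:10000;"
                + "hive.server2.authentication=NONE;hive.server2.transport.mode=binary;"
                + "hive.server2.thrift.sasl.qop=auth;hive.server2.thrift.bind.host=server2.hive.com;"
                + "hive.server2.thrift.port=10000;hive.server2.use.SSL=false";
        String uri = parse(content).get("hive.server2.instance.uri");
        int colon = uri.lastIndexOf(':');
        String host = uri.substring(0, colon);
        int port = Integer.parseInt(uri.substring(colon + 1));
        System.out.println(uri + " reachable from this machine: " + canConnect(host, port, 2000));
    }
}
```

A successful TCP connection does not prove that the peer is HiveServer2; it only shows where a client outside the Docker network ends up when it follows the advertised address.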

- I could not find any way in the documentation to change the HiveServer2 
hostname and port that the Docker Image passes into the ZooKeeper znode. It 
would be nice to have an easier way to change them, such as an environment 
variable on the Docker image.
- I have set up a small unit test at 
https://github.com/linghengqian/hivesever2-v400-sd-test for testing, and the 
instructions for running it are in the README.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
