Qiheng He created HIVE-28424:
--------------------------------

             Summary: The Docker Image of HiveServer2 should provide an env variable defining the `hostname:port` passed into the znode
                 Key: HIVE-28424
                 URL: https://issues.apache.org/jira/browse/HIVE-28424
             Project: Hive
          Issue Type: Improvement
            Reporter: Qiheng He

* The Docker Image of HiveServer2 should provide an environment variable defining the *hostname:port* passed into the znode.
* This request may seem a bit strange at first glance; it comes from a small service orchestration scenario related to [https://github.com/dbeaver/dbeaver/issues/22777] .
* For the HiveServer2 Docker Image *apache/hive:4.0.0*, enabling ZooKeeper Service Discovery apparently requires overwriting the *hive-site.xml* inside the image. I tested what needs to be done to achieve this at [https://github.com/linghengqian/hivesever2-v400-sd-test] . First I need to define a docker-compose file that pulls in the ZooKeeper server.
{code:yaml}
services:
  zookeeper-server:
    image: zookeeper:3.9.2-jre-17
    restart: always
    ports:
      - "2181:2181"
  hive-server2:
    image: apache/hive:4.0.0
    restart: always
    hostname: '127.0.0.1'
    depends_on:
      zookeeper-server:
        condition: service_started
    environment:
      SERVICE_NAME: hiveserver2
      HIVE_CUSTOM_CONF_DIR: /hive_custom_conf
    ports:
      - "10000:10000"
      - "10002:10002"
    volumes:
      - ./hive-custom-conf:/hive_custom_conf
{code}
- Setting the hostname of the hive-server2 Docker container to *127.0.0.1* already compromises the local Docker network. The reason is that HiveServer2 always registers (its hostname + :10000) in the znode on the ZooKeeper server, and this cannot be changed externally. In this setup, the znode at */hiveserver2/serverUri=localhost:10000;version=4.0.0;sequence=0000000000* has the content *hive.server2.instance.uri=localhost:10000;hive.server2.authentication=NONE;hive.server2.transport.mode=binary;hive.server2.thrift.sasl.qop=auth;hive.server2.thrift.bind.host=localhost;hive.server2.thrift.port=10000;hive.server2.use.SSL=false* . This can be observed in a web UI for ZooKeeper by running a Docker container from the *elkozmon/zoonavigator:1.1.3* image.
- At this point, I also need to mount a *hive-site.xml* into the HiveServer2 container.
Most of its content duplicates https://github.com/apache/hive/blob/rel/release-4.0.0/packaging/src/docker/conf/hive-site.xml, but since it does not seem possible to combine multiple copies of *hive-site.xml*, I can only repeat those definitions.
{code:xml}
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <property>
        <name>hive.server2.enable.doAs</name>
        <value>false</value>
    </property>
    <property>
        <name>hive.tez.exec.inplace.progress</name>
        <value>false</value>
    </property>
    <property>
        <name>hive.tez.exec.print.summary</name>
        <value>true</value>
    </property>
    <property>
        <name>hive.exec.scratchdir</name>
        <value>/opt/hive/scratch_dir</value>
    </property>
    <property>
        <name>hive.user.install.directory</name>
        <value>/opt/hive/install_dir</value>
    </property>
    <property>
        <name>tez.runtime.optimize.local.fetch</name>
        <value>true</value>
    </property>
    <property>
        <name>hive.exec.submit.local.task.via.child</name>
        <value>false</value>
    </property>
    <property>
        <name>mapreduce.framework.name</name>
        <value>local</value>
    </property>
    <property>
        <name>tez.local.mode</name>
        <value>true</value>
    </property>
    <property>
        <name>hive.metastore.warehouse.dir</name>
        <value>/opt/hive/data/warehouse</value>
    </property>
    <property>
        <name>metastore.metastore.event.db.notification.api.auth</name>
        <value>false</value>
    </property>
    <property>
        <name>hive.server2.support.dynamic.service.discovery</name>
        <value>true</value>
    </property>
    <property>
        <name>hive.zookeeper.quorum</name>
        <value>zookeeper-server:2181</value>
    </property>
</configuration>
{code}
- At this point, from outside the Docker Compose network, I can connect to the deployed HiveServer2 in DBeaver via the jdbcUrl *jdbc:hive2://localhost:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;*.
- Now suppose the docker-compose file is defined as follows. The only change is the hostnames of the two containers in the same Docker network.
{code:yaml}
services:
  zookeeper-server:
    image: zookeeper:3.9.2-jre-17
    hostname: 'zookeeper-server'
    restart: always
    ports:
      - "2181:2181"
  hive-server2:
    image: apache/hive:4.0.0
    restart: always
    hostname: 'server2.hive.com'
    depends_on:
      zookeeper-server:
        condition: service_started
    environment:
      SERVICE_NAME: hiveserver2
      HIVE_CUSTOM_CONF_DIR: /hive_custom_conf
    ports:
      - "10000:10000"
      - "10002:10002"
    volumes:
      - ./hive-custom-conf:/hive_custom_conf
{code}
- With this setup, the jdbcUrl *jdbc:hive2://localhost:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;* still connects to ZooKeeper, but not to HiveServer2. The ZooKeeper server now contains only the znode */hiveserver2/serverUri=server2.hive.com:10000;version=4.0.0;sequence=0000000000*, whose content is *hive.server2.instance.uri=server2.hive.com:10000;hive.server2.authentication=NONE;hive.server2.transport.mode=binary;hive.server2.thrift.sasl.qop=auth;hive.server2.thrift.bind.host=server2.hive.com;hive.server2.thrift.port=10000;hive.server2.use.SSL=false*. Since *server2.hive.com:10000* is not accessible from outside the Docker network, the client fails as shown below, which hurts the local debugging experience.
{code:java}
com.zaxxer.hikari.pool.HikariPool$PoolInitializationException: Failed to initialize pool: Could not open client transport for any of the Server URI's in ZooKeeper: Socket is closed by peer.
    at com.zaxxer.hikari.pool.HikariPool.throwPoolInitializationException(HikariPool.java:596)
    at com.zaxxer.hikari.pool.HikariPool.checkFailFast(HikariPool.java:582)
    at com.zaxxer.hikari.pool.HikariPool.<init>(HikariPool.java:115)
    at com.zaxxer.hikari.HikariDataSource.<init>(HikariDataSource.java:81)
    at com.lingh.HiveTest.test(HiveTest.java:20)
    at java.base/java.lang.reflect.Method.invoke(Method.java:580)
    at java.base/java.util.ArrayList.forEach(ArrayList.java:1597)
    at java.base/java.util.ArrayList.forEach(ArrayList.java:1597)
Caused by: java.sql.SQLException: Could not open client transport for any of the Server URI's in ZooKeeper: Socket is closed by peer.
    at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:420)
    at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:285)
    at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:94)
    at com.zaxxer.hikari.util.DriverDataSource.getConnection(DriverDataSource.java:121)
    at com.zaxxer.hikari.pool.PoolBase.newConnection(PoolBase.java:364)
    at com.zaxxer.hikari.pool.PoolBase.newPoolEntry(PoolBase.java:206)
    at com.zaxxer.hikari.pool.HikariPool.createPoolEntry(HikariPool.java:476)
    at com.zaxxer.hikari.pool.HikariPool.checkFailFast(HikariPool.java:561)
    ... 6 more
Caused by: org.apache.hive.org.apache.thrift.transport.TTransportException: Socket is closed by peer.
    at org.apache.hive.org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:184)
    at org.apache.hive.org.apache.thrift.transport.TTransport.readAll(TTransport.java:109)
    at org.apache.hive.org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:151)
    at org.apache.hive.org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:272)
    at org.apache.hive.org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:39)
    at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:512)
    at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:382)
    ... 13 more
{code}
- I could not find any way in the documentation to change the HiveServer2 hostname and port that the Docker Image registers in the ZooKeeper znode. It would be nice to have an easier way to change them, such as an environment variable on the Docker Image.
- I have set up a small unit test at https://github.com/linghengqian/hivesever2-v400-sd-test for testing; the instructions for running it are in the README.
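
To illustrate why the second Docker Compose setup fails, here is a minimal Python sketch that parses the znode content quoted in this report and extracts the endpoint a service discovery client would be told to dial. It is not Hive's client code; the function names *parse_znode_payload* and *advertised_endpoint* are hypothetical, and the payload is simply the string copied from the znode above.

```python
def parse_znode_payload(payload: str) -> dict[str, str]:
    """Split a 'key=value;key=value' znode payload into a dict."""
    pairs = (item.split("=", 1) for item in payload.split(";") if "=" in item)
    return {key: value for key, value in pairs}


def advertised_endpoint(payload: str) -> tuple[str, int]:
    """Return the (host, port) taken from hive.server2.instance.uri."""
    uri = parse_znode_payload(payload)["hive.server2.instance.uri"]
    host, _, port = uri.rpartition(":")
    return host, int(port)


# Znode content copied from the second scenario in this report.
payload = (
    "hive.server2.instance.uri=server2.hive.com:10000;"
    "hive.server2.authentication=NONE;"
    "hive.server2.transport.mode=binary;"
    "hive.server2.thrift.sasl.qop=auth;"
    "hive.server2.thrift.bind.host=server2.hive.com;"
    "hive.server2.thrift.port=10000;"
    "hive.server2.use.SSL=false"
)

print(advertised_endpoint(payload))  # ('server2.hive.com', 10000)
```

The extracted host is the container hostname, which only resolves inside the Docker network. A client on the host machine can read the znode through the published port 2181 but cannot reach the address it contains, which is why an externally configurable *hostname:port* would help.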