[ https://issues.apache.org/jira/browse/HIVE-28295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17870342#comment-17870342 ]
Qiheng He commented on HIVE-28295:
----------------------------------

According to the discussion at HIVE-28418, the use of embedded HiveServer2 is no longer encouraged and the related documents will be deleted. The current issue can be closed.

> HiveServer2 JDBC driver's embedded mode cannot be used in JDK22
> ---------------------------------------------------------------
>
> Key: HIVE-28295
> URL: https://issues.apache.org/jira/browse/HIVE-28295
> Project: Hive
> Issue Type: Bug
> Reporter: Qiheng He
> Priority: Major
>
> - HiveServer2 JDBC driver's embedded mode cannot be used in JDK22. I am largely new to Hive and I am not sure whether this is a documentation issue or a bug in Hive. I came from [https://github.com/apache/shardingsphere/issues/29052] and I am trying to write unit tests for the SQL parsing module of Hive for Apache ShardingSphere under GraalVM Native Image.
> - I noticed that [https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-JDBC] mentions that, in addition to starting HiveServer2 through Docker, an embedded HiveServer2 can also be started through the JDBC driver, just like H2database. The corresponding documentation does not mention the Maven modules involved, and the following part of it seems to be outdated.
> {code:bash}
> # To run the program in embedded mode, we need the following additional jars in the classpath
> # from hive/build/dist/lib
> # hive-exec*.jar
> # hive-metastore*.jar
> # antlr-runtime-3.0.1.jar
> # derby.jar
> # jdo2-api-2.1.jar
> # jpox-core-1.2.2.jar
> # jpox-rdbms-1.2.2.jar
> # and from hadoop/build
> # hadoop-core*.jar
> # as well as hive/build/dist/conf, any HIVE_AUX_JARS_PATH set,
> # and hadoop jars necessary to run MR jobs (eg lzo codec)
> {code}
> - I guessed that {*}jpox-core-1.2.2.jar{*} and {*}jpox-rdbms-1.2.2.jar{*} refer to {*}jpox:jpox-core:1.2.0-beta-5{*} and {*}jpox:jpox-rdbms:1.2.0-beta-5{*} from Maven Central. But when I wrote the unit test and looked at the error log, I realized that what I actually needed was {*}org.datanucleus:datanucleus-api-jdo:5.2.9{*} and {*}org.datanucleus:datanucleus-rdbms:5.2.10{*}. The document does not seem to mention datanucleus at all.
> {code:bash}
> Caused by: MetaException(message:Got exception: org.apache.hadoop.hive.metastore.api.MetaException java.lang.ClassNotFoundException: org.datanucleus.api.jdo.JDOPersistenceManagerFactory)
> {code}
> - {*}hive-exec.jar{*} and {*}hive-metastore.jar{*}, represented by {*}org.apache.hive:hive-exec:4.0.0{*} and {*}org.apache.hive:hive-metastore:4.0.0{*}, do not seem to contain the Java class {*}org.apache.hive.jdbc.HiveDriver{*} that is necessary to create {*}jdbc:hive2:///{*}. It seems that {*}org.apache.hive:hive-jdbc:4.0.0{*} and {*}org.apache.hive:hive-service:4.0.0{*} are always required.
> - The requirement for {*}hadoop-core.jar{*} appears to be outdated; what is actually required is the shaded package from {*}org.apache.hadoop:hadoop-client-runtime:3.3.6{*}.
> - Even after dealing with these issues, I still don't understand why creating an embedded HiveServer2 via a JDBC URL throws additional errors.
> I created a Git repository with minimal unit tests at https://github.com/linghengqian/hive-embedded-mode-test . To run the unit tests under JDK22, run the following commands on an {*}Ubuntu 22.04.4{*} machine with git and {*}SDKMAN!{*} installed.
> {code:bash}
> sdk install java 22.0.1-graalce
> sdk use java 22.0.1-graalce
> git clone g...@github.com:linghengqian/hive-embedded-mode-test.git
> cd ./hive-embedded-mode-test/
> ./mvnw clean test
> {code}
> - I used only the following dependencies. I also set {*}--add-opens=java.base/java.net=ALL-UNNAMED{*} via {*}maven-surefire-plugin{*} to get around Hive's limitations.
> {code:bash}
> org.apache.hive:hive-jdbc:4.0.0
> org.apache.hive:hive-service:4.0.0
> org.apache.hadoop:hadoop-client-runtime:3.3.6
> org.datanucleus:datanucleus-api-jdo:5.2.9
> org.datanucleus:datanucleus-rdbms:5.2.10
> {code}
> - The core logic of the unit test {*}com.lingh.HiveTest{*} is to create a HiveServer2 that contains a database named {*}demo_ds_0{*} and to execute some test SQL against it. Refer to https://github.com/linghengqian/hive-embedded-mode-test/blob/master/src/test/java/com/lingh/HiveTest.java .
> - The error log is recorded at https://github.com/linghengqian/hive-embedded-mode-test/blob/master/README.md .
> {code:bash}
> Caused by: MetaException(message:Got exception: org.apache.hadoop.hive.metastore.api.MetaException Version information not found in metastore.)
>     at org.apache.hadoop.hive.metastore.utils.MetaStoreUtils.throwMetaException(MetaStoreUtils.java:193)
>     at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.callEmbeddedMetastore(HiveMetaStoreClient.java:311)
>     at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:222)
>     at org.apache.hadoop.hive.ql.metadata.HiveMetaStoreClientWithLocalCache.<init>(HiveMetaStoreClientWithLocalCache.java:118)
>     at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:154)
>     at java.base/jdk.internal.reflect.DirectConstructorHandleAccessor.newInstance(DirectConstructorHandleAccessor.java:62)
>     ... 30 more
> {code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
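
Editorial sketch, not part of the original message: the first failure quoted in the issue is a ClassNotFoundException, so a quick way to see which of the reported artifacts are still missing is to probe the classpath before opening {*}jdbc:hive2:///{*}. The class name EmbeddedHiveClasspathCheck and its helper methods are hypothetical; the two probed class names and their Maven coordinates are the ones named in the issue. This is only a rough check under those assumptions. It does not cover hive-service or hadoop-client-runtime, and it does not address the later "Version information not found in metastore" error.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: reports which artifacts from the issue appear to be absent.
public class EmbeddedHiveClasspathCheck {

    // Class name -> Maven coordinate that the issue reports as providing it.
    public static final Map<String, String> REQUIRED = new LinkedHashMap<>();
    static {
        REQUIRED.put("org.apache.hive.jdbc.HiveDriver",
                "org.apache.hive:hive-jdbc:4.0.0");
        REQUIRED.put("org.datanucleus.api.jdo.JDOPersistenceManagerFactory",
                "org.datanucleus:datanucleus-api-jdo:5.2.9");
    }

    // Returns true if the class can be located, without running its static initializers.
    public static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false,
                    EmbeddedHiveClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Returns the coordinates whose probe class could not be located, in map order.
    public static List<String> missingArtifacts(Map<String, String> required) {
        List<String> missing = new ArrayList<>();
        for (Map.Entry<String, String> entry : required.entrySet()) {
            if (!isOnClasspath(entry.getKey())) {
                missing.add(entry.getValue());
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        List<String> missing = missingArtifacts(REQUIRED);
        if (missing.isEmpty()) {
            System.out.println("All probed classes were found on the classpath.");
        } else {
            for (String coordinate : missing) {
                System.out.println("Probe class not found; consider adding: " + coordinate);
            }
        }
    }
}
```

Run from the test classpath of the project, this prints one line per probed artifact that could not be found, which may help separate plain dependency gaps from the metastore schema error shown in the final stack trace.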