liang yu created HIVE-27917:
-------------------------------

             Summary: hive metastore: MetaStoreDirectSql waste time on Testing 
connection to MYSQL Database
                 Key: HIVE-27917
                 URL: https://issues.apache.org/jira/browse/HIVE-27917
             Project: Hive
          Issue Type: Improvement
            Reporter: liang yu
            Assignee: liang yu


Using Hive 3.1.3 and Hadoop 3.3.4.

 

Description:

When hive-cli is used to execute a SQL statement and then exit, the Hive client 
takes more than 1 minute to finish connecting to the Hive metastore. This stalls 
my serial jobs, because each execution creates a new client and then exits.

 

Analysis:

When a client connects to the Hive metastore server with direct SQL enabled, 
the server initializes a MetaStoreDirectSql object. The traces below show how 
it is created:

 
{code:java}
org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getProductName(MetaStoreDirectSql.java:184)
org.apache.hadoop.hive.metastore.MetaStoreDirectSql.<init>(MetaStoreDirectSql.java:184)
org.apache.hadoop.hive.metastore.ObjectStore.initializeHelper(ObjectStore.java:499)
org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:421)
org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:376)
org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:79)
org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:139)
org.apache.hadoop.hive.metastore.RawStoreProxy.<init>(RawStoreProxy.java:59)
org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:67)
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStoreForConf(HiveMetaStore.java:720)
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMSForConf(HiveMetaStore.java:698)
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:692)
{code}
 

 
 
{code:java}
org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:216)
org.apache.hadoop.hive.metastore.MetaStoreDirectSql.ensureDbInit(MetaStoreDirectSql.java:240)
org.apache.hadoop.hive.metastore.MetaStoreDirectSql.<init>(MetaStoreDirectSql.java:184)
org.apache.hadoop.hive.metastore.ObjectStore.initializeHelper(ObjectStore.java:499)
org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:421)
org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:376)
org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:79)
org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:139)
org.apache.hadoop.hive.metastore.RawStoreProxy.<init>(RawStoreProxy.java:59)
org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:67)
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStoreForConf(HiveMetaStore.java:720)
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMSForConf(HiveMetaStore.java:698)
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:692)
{code}
In both places, the Hive metastore server connects to MySQL and executes a 
simple SQL statement. Each call takes tens of milliseconds, which is wasted 
time because a lock is held during initialization. As more and more clients 
connect to the server, they block on that lock and get stuck.
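
To make the blocking effect concrete, here is a minimal model (not Hive code) of clients that all arrive at the same moment and run their initialization one at a time under a single lock. The class name, method name, and the numbers in the usage note are assumptions for illustration only; the issue itself only states that each probe takes "tens of milliseconds".

```java
/**
 * Toy model, not Hive code: N clients arrive at t=0 and each must hold
 * one shared lock for holdMillis while its store is initialized.
 */
final class InitLockModel {

    /** Returns the time (ms) at which each client finishes its locked init. */
    static long[] completionTimes(int clients, long holdMillis) {
        long[] done = new long[clients];
        long clock = 0;
        for (int i = 0; i < clients; i++) {
            // Client i can only start once client i-1 has released the lock.
            clock += holdMillis;
            done[i] = clock;
        }
        return done;
    }
}
```

Under this model, 100 simultaneous clients with a 50 ms locked section leave the last client waiting 5000 ms, so shortening the locked section shortens every later client's wait proportionally.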
 
Solution:
We use MySQL as our database, so we don't need getProductName to determine the 
database type. Instead, add a new configuration option that passes the database 
type.
We have our own DBAs, so we don't need ensureDbInit to verify that the MySQL 
database is ready for connections.
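
The first part of the proposal could look roughly like the sketch below. It is not the actual patch: the configuration key, the DbType enum, and the Probe interface are all hypothetical names invented here, and the probe merely stands in for the JDBC round trip that getProductName performs. When the key is unset or unrecognized, the sketch falls back to probing, so existing behavior is preserved.

```java
import java.util.Locale;
import java.util.Optional;
import java.util.Properties;

/** Hypothetical sketch of config-driven database type selection; not Hive's API. */
final class DbTypeFromConfig {

    /** Illustrative set of database kinds; names are assumptions. */
    enum DbType { MYSQL, ORACLE, SQLSERVER, DERBY, POSTGRES, OTHER }

    /** Hypothetical key; this is not an existing Hive property. */
    static final String DB_TYPE_KEY = "example.metastore.direct.sql.db.type";

    /** Stand-in for the expensive call that asks the database for its product name. */
    interface Probe {
        DbType detect();
    }

    /** Reads the configured type; empty if the key is unset or not recognized. */
    static Optional<DbType> configuredType(Properties conf) {
        String raw = conf.getProperty(DB_TYPE_KEY);
        if (raw == null || raw.trim().isEmpty()) {
            return Optional.empty();
        }
        try {
            return Optional.of(DbType.valueOf(raw.trim().toUpperCase(Locale.ROOT)));
        } catch (IllegalArgumentException e) {
            return Optional.empty();
        }
    }

    /** Uses the configured type if present; only otherwise pays for the probe. */
    static DbType resolve(Properties conf, Probe probe) {
        return configuredType(conf).orElseGet(probe::detect);
    }
}
```

With the key set to mysql, resolve returns MYSQL without invoking the probe at all. A similar boolean option could gate the ensureDbInit check, which is the second half of the proposal.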
 
 
 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
