[ 
https://issues.apache.org/jira/browse/HIVE-20740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16666237#comment-16666237
 ] 

Vihang Karajgaonkar commented on HIVE-20740:
--------------------------------------------

Spent some more time on this. Based on what I understand, there is a race in 
the existing code which could lead to an NPE. E.g.:

Thread A calls the {{setConf}} method with new datasource properties. Thread A 
acquires {{proplock}} and sets {{pmf = null}}.

Meanwhile, Thread B is in the {{getPMF}} method which, even though it is static 
synchronized, doesn't know whether {{pmf}} is being updated by some other 
thread in {{setConf}}, because a static synchronized method locks the class 
monitor rather than {{proplock}}. This could lead to an NPE on Thread B, or to 
some error on Thread A when it tries to get a {{PersistenceManager}} from the 
uninitialized pmf.
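
To illustrate the interleaving described above, here is a minimal sketch. It is 
not the ObjectStore code: {{RaceSketch}}, {{resetPmf}} and 
{{readerSeesNullDuringReset}} are hypothetical names, and a plain {{Object}} 
stands in for the factory. It only assumes what is stated above, namely that 
the writer holds a separate lock object while the reader goes through a static 
synchronized method. The latches are there purely to force the interleaving.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for the shared static state; not the real Hive class.
class RaceSketch {
    private static final ReentrantLock propLock = new ReentrantLock();
    private static Object pmf = new Object(); // stands in for the factory

    // Reader path: guarded by the RaceSketch.class monitor only.
    static synchronized Object getPmf() {
        return pmf;
    }

    // Writer path: guarded by propLock only.
    static void resetPmf(CountDownLatch cleared, CountDownLatch release) {
        propLock.lock();
        try {
            pmf = null;
            cleared.countDown();  // pmf is null and propLock is still held
            await(release);       // stay inside the critical section
            pmf = new Object();   // re-initialize before unlocking
        } finally {
            propLock.unlock();
        }
    }

    // Returns true if the reader saw null while the writer still held propLock.
    static boolean readerSeesNullDuringReset() {
        CountDownLatch cleared = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread writer = new Thread(() -> resetPmf(cleared, release));
        writer.start();
        await(cleared);
        Object seen = getPmf();   // not blocked: it takes a different lock
        release.countDown();
        try {
            writer.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return seen == null;
    }

    private static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In this sketch the reader returns {{null}} even though both paths are 
individually "locked", which is the kind of window that could surface as an NPE 
in a caller that dereferences the result.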

The good news is that this is very unlikely to happen. When datasource 
properties are changed, the HMS service needs a restart, which means there are 
no other threads requesting {{pmf}} until HMS is up and running.

Still, having {{pmf}} and {{props}} as private members that need to be 
synchronized when updated can be handled more cleanly if we move them to a 
separate wrapper class which deals with these details. The latest patch 
refactors some of this code and hopefully makes it cleaner than the current 
implementation.
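
As a rough sketch of the wrapper idea (this is not the content of the patch; 
{{FactoryHolder}}, {{getOrUpdate}} and the {{builder}} function are 
hypothetical names), the properties and the factory can live behind a single 
monitor so that they are always read and replaced together. The sketch 
deliberately leaves out closing the old factory and clearing class loader 
caches, which the real code also has to handle.

```java
import java.util.Properties;
import java.util.function.Function;

// Hypothetical wrapper; names are illustrative and not taken from the patch.
final class FactoryHolder<F> {
    // Assumed hook that would create the factory from datasource properties.
    private final Function<Properties, F> builder;
    private Properties props; // guarded by "this"
    private F factory;        // guarded by "this"

    FactoryHolder(Function<Properties, F> builder) {
        this.builder = builder;
    }

    // Returns the factory for the given properties, rebuilding it only when
    // the properties differ from the ones used for the current factory.
    synchronized F getOrUpdate(Properties newProps) {
        if (factory == null || !newProps.equals(props)) {
            // Build first, then publish both fields under the same monitor,
            // so no caller can observe a cleared or half-updated pair.
            F rebuilt = builder.apply(newProps);
            props = (Properties) newProps.clone();
            factory = rebuilt;
        }
        return factory;
    }
}
```

Because every reader and writer goes through the same synchronized method, 
there is no window in which one thread sees a {{null}} factory while another is 
in the middle of replacing it.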

> Remove global lock in ObjectStore.setConf method
> ------------------------------------------------
>
>                 Key: HIVE-20740
>                 URL: https://issues.apache.org/jira/browse/HIVE-20740
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Vihang Karajgaonkar
>            Assignee: Vihang Karajgaonkar
>            Priority: Major
>         Attachments: HIVE-20740.01.patch, HIVE-20740.02.patch, 
> HIVE-20740.04.patch, HIVE-20740.05.patch
>
>
> The ObjectStore#setConf method has a global lock which can block other 
> clients in concurrent workloads.
> {code}
>   @Override
>   @SuppressWarnings("nls")
>   public void setConf(Configuration conf) {
>     // Although an instance of ObjectStore is accessed by one thread, there may
>     // be many threads with ObjectStore instances. So the static variables
>     // pmf and prop need to be protected with locks.
>     pmfPropLock.lock();
>     try {
>       isInitialized = false;
>       this.conf = conf;
>       this.areTxnStatsSupported = MetastoreConf.getBoolVar(conf, ConfVars.HIVE_TXN_STATS_ENABLED);
>       configureSSL(conf);
>       Properties propsFromConf = getDataSourceProps(conf);
>       boolean propsChanged = !propsFromConf.equals(prop);
>       if (propsChanged) {
>         if (pmf != null){
>           clearOutPmfClassLoaderCache(pmf);
>           if (!forTwoMetastoreTesting) {
>             // close the underlying connection pool to avoid leaks
>             pmf.close();
>           }
>         }
>         pmf = null;
>         prop = null;
>       }
>       assert(!isActiveTransaction());
>       shutdown();
>       // Always want to re-create pm as we don't know if it were created by the
>       // most recent instance of the pmf
>       pm = null;
>       directSql = null;
>       expressionProxy = null;
>       openTrasactionCalls = 0;
>       currentTransaction = null;
>       transactionStatus = TXN_STATUS.NO_STATE;
>       initialize(propsFromConf);
>       String partitionValidationRegex =
>           MetastoreConf.getVar(this.conf, ConfVars.PARTITION_NAME_WHITELIST_PATTERN);
>       if (partitionValidationRegex != null && !partitionValidationRegex.isEmpty()) {
>         partitionValidationPattern = Pattern.compile(partitionValidationRegex);
>       } else {
>         partitionValidationPattern = null;
>       }
>       // Note, if metrics have not been initialized this will return null, which means we aren't
>       // using metrics.  Thus we should always check whether this is non-null before using.
>       MetricRegistry registry = Metrics.getRegistry();
>       if (registry != null) {
>         directSqlErrors = Metrics.getOrCreateCounter(MetricsConstants.DIRECTSQL_ERRORS);
>       }
>       this.batchSize = MetastoreConf.getIntVar(conf, ConfVars.RAWSTORE_PARTITION_BATCH_SIZE);
>       if (!isInitialized) {
>         throw new RuntimeException(
>         "Unable to create persistence manager. Check dss.log for details");
>       } else {
>         LOG.debug("Initialized ObjectStore");
>       }
>     } finally {
>       pmfPropLock.unlock();
>     }
>   }
> {code}
> The {{pmfPropLock}} is a static object and it disallows any other new 
> connection to HMS which is trying to instantiate ObjectStore. We should 
> either remove the lock or reduce the scope of the lock so that it is held for 
> a very small amount of time.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
