Zhenhua Wang created SPARK-22394:
------------------------------------

             Summary: Redundant synchronization for metastore access
                 Key: SPARK-22394
                 URL: https://issues.apache.org/jira/browse/SPARK-22394
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 2.3.0
            Reporter: Zhenhua Wang

In Spark 1.x, metastore access was synchronized at [line 229 in ClientWrapper|https://github.com/apache/spark/blob/branch-1.6/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/ClientWrapper.scala#L229] (now at [line 203 in HiveClientImpl|https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala#L203]).

In Spark 2.x, HiveExternalCatalog was introduced by [SPARK-13080|https://github.com/apache/spark/pull/11293], which added an extra level of synchronization at [line 95|https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala#L95].

That is, we now have two levels of synchronization: one on HiveExternalCatalog and the other on the IsolatedClientLoader used by HiveClientImpl. Since both HiveExternalCatalog and IsolatedClientLoader are shared among all Spark sessions, every metastore call is already serialized by the inner lock. I think the extra level of synchronization in HiveExternalCatalog is therefore redundant and can be removed.
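
To illustrate the two levels of locking described in this issue, here is a minimal sketch in plain Java. It is not Spark code: SketchClient, SketchCatalog, loaderLock, and both getTable* methods are hypothetical stand-ins for HiveClientImpl, HiveExternalCatalog, and the shared IsolatedClientLoader. It assumes each catalog method makes exactly one client call.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the shared metastore client (not Spark's HiveClientImpl).
final class SketchClient {
    // Plays the role of the shared IsolatedClientLoader that the client locks on.
    private final Object loaderLock = new Object();
    final List<String> trace = new ArrayList<>();

    String getTable(String name) {
        synchronized (loaderLock) { // inner level: every metastore call is serialized here
            trace.add("client:" + name);
            return "table:" + name;
        }
    }
}

// Hypothetical stand-in for the catalog layer (not Spark's HiveExternalCatalog).
final class SketchCatalog {
    private final SketchClient client;

    SketchCatalog(SketchClient client) {
        this.client = client;
    }

    // Outer level: takes the catalog monitor first, then the client lock as well.
    synchronized String getTableDoubleLocked(String name) {
        return client.getTable(name);
    }

    // Same call without the outer monitor; the client lock still serializes it.
    String getTableSingleLocked(String name) {
        return client.getTable(name);
    }
}

final class NestedLockSketch {
    public static void main(String[] args) throws InterruptedException {
        SketchClient client = new SketchClient();
        SketchCatalog catalog = new SketchCatalog(client);

        // Four threads share one catalog and one client, like sessions sharing both objects.
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            final int id = i;
            workers[i] = new Thread(() -> {
                for (int j = 0; j < 100; j++) {
                    catalog.getTableSingleLocked("t" + id);
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            w.join();
        }

        // The unsynchronized ArrayList is only mutated under loaderLock, so no update is lost.
        System.out.println(client.trace.size()); // prints 400
    }
}
```

In this sketch the single-locked path is as safe as the double-locked one, because the shared inner lock already guards the only mutable state. The sketch does not show whether any real catalog method relies on the outer lock to keep several client calls atomic; that would need to be checked before removing it.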