[ https://issues.apache.org/jira/browse/DRILL-4126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jinfeng Ni resolved DRILL-4126.
-------------------------------
    Resolution: Fixed

Fixed in commit: https://git-wip-us.apache.org/repos/asf?p=drill.git;a=commit;h=02ce44bcab37bce0176e472b194d16acb6d167b7

> Adding HiveMetaStore caching when impersonation is enabled.
> ------------------------------------------------------------
>
>                 Key: DRILL-4126
>                 URL: https://issues.apache.org/jira/browse/DRILL-4126
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Jinfeng Ni
>            Assignee: Jinfeng Ni
>
> Currently, HiveMetaStore caching is used only when impersonation is disabled;
> in that case all HiveMetaStore calls go through
> NonCloseableHiveClientWithCaching [1]. When impersonation is enabled,
> caching is not used for HiveMetaStore access.
>
> This can significantly increase planning time when the Hive storage plugin
> is enabled, or slow down queries against INFORMATION_SCHEMA. Depending on
> the number of databases and tables in the Hive storage plugin, the planning
> time or the INFORMATION_SCHEMA query time can become unacceptable. It gets
> even worse if the Hive metastore runs on a different node from the drillbit,
> which makes each metastore access slower still.
>
> We are seeing planning time, or execution time for INFORMATION_SCHEMA
> queries, of 30~60 seconds. Such long planning or execution times prevent
> Drill from behaving "interactively" for these queries.
>
> We should enable caching when impersonation is used. As long as the
> authorizer verifies that the user has access to the databases/tables, we
> should serve the data from the cache. Doing so should reduce the number of
> API calls to HiveMetaStore.
>
> [1]
> https://github.com/apache/drill/blob/master/contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/DrillHiveMetaStoreClient.java#L299

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
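
The idea proposed in the issue description (check authorization for each user, then serve metadata from a shared cache) can be sketched roughly as follows. This is only an illustration and not the code from the commit: the class AuthorizedCachingClient, its fields, and the getTableNames method are hypothetical names, and the metastore call and the authorizer are replaced by plain Java functional interfaces.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiPredicate;
import java.util.function.Function;

// Hypothetical sketch: these names do not come from Drill or Hive.
final class AuthorizedCachingClient {
    // Shared cache: database name -> table names. Metadata is the same for
    // every user, so it can be cached once and reused across users.
    private final Map<String, List<String>> tableCache = new ConcurrentHashMap<>();

    // Stand-in for the (slow, possibly remote) metastore lookup.
    private final Function<String, List<String>> metastoreLookup;

    // Stand-in for the authorizer: (user, database) -> is access allowed?
    private final BiPredicate<String, String> authorizer;

    AuthorizedCachingClient(Function<String, List<String>> metastoreLookup,
                            BiPredicate<String, String> authorizer) {
        this.metastoreLookup = metastoreLookup;
        this.authorizer = authorizer;
    }

    List<String> getTableNames(String user, String database) {
        // The authorization check runs on every call and is never cached here,
        // so a cache hit cannot leak metadata to an unauthorized user.
        if (!authorizer.test(user, database)) {
            throw new SecurityException(
                "User " + user + " may not access database " + database);
        }
        // Only a cache miss reaches the metastore.
        return tableCache.computeIfAbsent(database, metastoreLookup);
    }
}
```

In this sketch two authorized users asking for the same database trigger a single metastore lookup, while an unauthorized user is rejected before the cache is consulted. Cache expiry and invalidation are left out for brevity.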