[ https://issues.apache.org/jira/browse/SPARK-27899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xiao Li reassigned SPARK-27899:
-------------------------------

    Assignee: Lantao Jin

> Make HiveMetastoreClient.getTableObjectsByName available in
> ExternalCatalog/SessionCatalog API
> ----------------------------------------------------------------------------------------------
>
>                 Key: SPARK-27899
>                 URL: https://issues.apache.org/jira/browse/SPARK-27899
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.4.0
>            Reporter: Juliusz Sompolski
>            Assignee: Lantao Jin
>            Priority: Major
>
> The new Spark ThriftServer SparkGetTablesOperation implemented in
> https://github.com/apache/spark/pull/22794 does a catalog.getTableMetadata
> request for every table. This can get very slow for large schemas (~50ms per
> table with an external Hive metastore).
> Hive ThriftServer GetTablesOperation uses
> HiveMetastoreClient.getTableObjectsByName to get table information in bulk,
> but we don't expose that through our APIs that go through Hive ->
> HiveClientImpl (HiveClient) -> HiveExternalCatalog (ExternalCatalog) ->
> SessionCatalog.
> If we added and exposed getTableObjectsByName through our catalog APIs, we
> could resolve that performance problem in SparkGetTablesOperation.
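
For illustration only, the difference between one metadata request per table and a single bulk request can be sketched as below. This is a minimal Python model, not Spark or Hive code: `FakeMetastore`, `list_tables_one_by_one`, and `list_tables_in_bulk` are hypothetical names, the in-memory dictionary stands in for an external Hive metastore, and skipping unknown table names in the bulk call is an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class FakeMetastore:
    """Hypothetical in-memory stand-in for an external Hive metastore."""
    tables: Dict[str, dict]
    round_trips: int = 0  # counts simulated remote calls

    def get_table(self, db: str, name: str) -> dict:
        # Models a per-table lookup: one remote call per table.
        self.round_trips += 1
        return self.tables[f"{db}.{name}"]

    def get_table_objects_by_name(self, db: str, names: List[str]) -> List[dict]:
        # Models a bulk lookup: one remote call for the whole batch.
        # Assumption of this sketch: unknown names are silently skipped.
        self.round_trips += 1
        return [self.tables[f"{db}.{n}"] for n in names if f"{db}.{n}" in self.tables]


def list_tables_one_by_one(ms: FakeMetastore, db: str, names: List[str]) -> List[dict]:
    """Fetch metadata with one request per table."""
    return [ms.get_table(db, n) for n in names]


def list_tables_in_bulk(ms: FakeMetastore, db: str, names: List[str]) -> List[dict]:
    """Fetch metadata for all tables with a single request."""
    return ms.get_table_objects_by_name(db, names)


if __name__ == "__main__":
    names = [f"t{i}" for i in range(100)]
    ms = FakeMetastore({f"db.{n}": {"name": n} for n in names})

    list_tables_one_by_one(ms, "db", names)
    print("per-table round trips:", ms.round_trips)

    ms.round_trips = 0
    list_tables_in_bulk(ms, "db", names)
    print("bulk round trips:", ms.round_trips)
```

Under the issue's figure of roughly 50 ms per table, the per-table path in this model would cost about 100 x 50 ms = 5 s for 100 tables, while the bulk path pays for a single round trip.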