[ https://issues.apache.org/jira/browse/FLINK-20416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17272523#comment-17272523 ]
Jark Wu commented on FLINK-20416:
---------------------------------

[~lirui], I agree with you. Changing {{HiveMetastoreClientWrapper}} directly is not a good approach. Maybe we can introduce a {{HiveMetastoreClient}} interface that defines the basic metastore methods and supports two implementations, {{CachingHiveMetastoreClient}} and {{DefaultHiveMetastoreClient}}. The {{CachingHiveMetastoreClient}} holds the cache entries and delegates method calls to the underlying {{HiveMetastoreClient}}.

> Need a cached catalog for HiveCatalog
> -------------------------------------
>
> Key: FLINK-20416
> URL: https://issues.apache.org/jira/browse/FLINK-20416
> Project: Flink
> Issue Type: Improvement
> Components: Connectors / Common, Connectors / Hive, Table SQL / API, Table SQL / Planner
> Reporter: Sebastian Liu
> Priority: Major
> Labels: pull-request-available
> Attachments: hms cache.jpg
>
>
> For OLAP scenarios, there are usually analytical queries whose running time is relatively short. These queries are also sensitive to latency. In the current Blink SQL processing, the parse/validate/optimize stages all need metadata from the catalog API, but each request to the catalog re-runs the underlying metadata query.
>
> We may need a cached catalog that caches the table schema and statistics to avoid unnecessary repeated metadata requests.
> Design doc: [https://docs.google.com/document/d/1oL8HUpv2WaF6OkFvbH5iefXkOJB__Dal_bYsIZJA_Gk/edit?usp=sharing]
> I have submitted a related PR that adds a generic cached catalog, which can delegate to other implementations of {{AbstractCatalog}}.
> [https://github.com/apache/flink/pull/14260]

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
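
For illustration, the interface split suggested in the comment could look roughly like the sketch below. Only the three type names come from the comment; the single {{getTableSchema}} method, the {{invalidate}} helper, the call counter, and the use of a plain in-memory map are assumptions made for this example and are not the actual Flink or Hive API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical, heavily reduced interface: a real metastore client exposes many
// more methods. getTableSchema is an assumed stand-in for any metadata lookup.
interface HiveMetastoreClient {
    String getTableSchema(String database, String table);
}

// Stand-in for the non-caching client. It fabricates a schema string instead of
// talking to a metastore and counts calls so the caching effect is observable.
final class DefaultHiveMetastoreClient implements HiveMetastoreClient {
    private final AtomicInteger remoteCalls = new AtomicInteger();

    @Override
    public String getTableSchema(String database, String table) {
        remoteCalls.incrementAndGet();
        return "schema-of-" + database + "." + table;
    }

    int remoteCallCount() {
        return remoteCalls.get();
    }
}

// Holds the cache entries and delegates cache misses to the underlying client.
final class CachingHiveMetastoreClient implements HiveMetastoreClient {
    private final HiveMetastoreClient delegate;
    private final Map<List<String>, String> schemaCache = new ConcurrentHashMap<>();

    CachingHiveMetastoreClient(HiveMetastoreClient delegate) {
        this.delegate = delegate;
    }

    @Override
    public String getTableSchema(String database, String table) {
        // The delegate is only consulted when the key is not cached yet.
        return schemaCache.computeIfAbsent(
                Arrays.asList(database, table),
                key -> delegate.getTableSchema(database, table));
    }

    // Assumed helper: drop one entry, e.g. after the table has been altered.
    void invalidate(String database, String table) {
        schemaCache.remove(Arrays.asList(database, table));
    }
}
```

With this shape, callers depend only on {{HiveMetastoreClient}}, so wrapping a {{DefaultHiveMetastoreClient}} in a {{CachingHiveMetastoreClient}} does not require touching {{HiveMetastoreClientWrapper}}. Expiry, size limits, and invalidation on DDL are left out of the sketch and would need to be decided in the design.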