Todd Lipcon created HIVE-19715:
----------------------------------
Summary: Consolidated and flexible API for fetching partition
metadata from HMS
Key: HIVE-19715
URL: https://issues.apache.org/jira/browse/HIVE-19715
Project: Hive
Issue Type: New Feature
Components: Standalone Metastore
Reporter: Todd Lipcon

Currently, the HMS thrift API exposes 17 different APIs for fetching
partition-related information. This is a combinatorial explosion: each API has
variants with and without "auth" info, by pspecs vs. names, by filters, by
exprs, etc. Keeping all of these separate APIs around is a long-term
maintenance burden and is confusing for consumers.

Additionally, even with all of these APIs, none is granular enough to fetch
only the information needed for a particular use case. For example, in some use
cases it may be beneficial to fetch only the partition locations without
wasting effort fetching statistics, etc.

This JIRA proposes that we add a new "one API to rule them all" for fetching
partition info. The request and response would be encapsulated in structs. Some
desirable properties:
- the request should be able to specify which pieces of information are
  required (e.g. location, properties, etc.)
- in the case of partition parameters, the request should be able to do either
  whitelisting or blacklisting (e.g. to exclude the large incremental column
  stats HLL data dumped in there by Impala)
- the request should optionally specify auth info (to encompass the "with_auth"
  variants)
- the request should be able to designate the set of partitions to access
  through one of several different methods (e.g. "all", list<name>, expr,
  part_vals, etc.)
- the struct should be easily evolvable so that new pieces of info can be added
- the response should be designed in such a way as to avoid transferring
  redundant information for common cases (e.g. simple "dictionary coding" of
  strings like parameter names, etc.)
- the API should support some form of pagination for tables with large
  partition counts
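
To make the properties above more concrete, here is a rough Python sketch of
what such a request/response pair might look like. It is only an illustration
and not a proposed thrift definition: every name in it (PartitionRequest,
PartitionResponse, ParamFilter, get_partitions) is hypothetical, the "metastore"
is a plain in-memory dict, the page token is assumed to be a simple integer
offset into the name-sorted partition list, and only the "all" and list<name>
selection methods are modelled.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# All names here are hypothetical; none of them exist in the HMS thrift API.


@dataclass
class ParamFilter:
    """Whitelist (exclude=False) or blacklist (exclude=True) of parameter keys."""
    keys: frozenset
    exclude: bool = False

    def apply(self, params: Dict[str, str]) -> Dict[str, str]:
        if self.exclude:
            return {k: v for k, v in params.items() if k not in self.keys}
        return {k: v for k, v in params.items() if k in self.keys}


@dataclass
class PartitionRequest:
    db: str
    table: str
    fields: Tuple[str, ...] = ("location",)    # which pieces of info to return
    names: Optional[List[str]] = None           # None means "all partitions"
    param_filter: Optional[ParamFilter] = None  # only used for "parameters"
    user: Optional[str] = None                  # optional auth info (unused here)
    page_token: int = 0                         # assumed: offset into the result
    page_size: int = 100


@dataclass
class PartitionResponse:
    key_dict: List[str]             # each distinct parameter name, sent once
    partitions: List[dict]          # parameters are keyed by index into key_dict
    next_page_token: Optional[int]  # None when there are no more pages


def get_partitions(store: Dict[str, dict], req: PartitionRequest) -> PartitionResponse:
    """store maps partition name -> {"location": str, "parameters": dict}."""
    if req.names is None:
        selected = sorted(store)
    else:
        selected = [n for n in req.names if n in store]
    page = selected[req.page_token:req.page_token + req.page_size]

    key_dict: List[str] = []
    key_ids: Dict[str, int] = {}
    rows = []
    for name in page:
        part = store[name]
        row = {"name": name}
        if "location" in req.fields:
            row["location"] = part["location"]
        if "parameters" in req.fields:
            params = part["parameters"]
            if req.param_filter is not None:
                params = req.param_filter.apply(params)
            encoded = {}
            for key, value in params.items():
                # Dictionary coding: replace each parameter name by a small id.
                if key not in key_ids:
                    key_ids[key] = len(key_dict)
                    key_dict.append(key)
                encoded[key_ids[key]] = value
            row["parameters"] = encoded
        rows.append(row)

    end = req.page_token + len(page)
    next_token = end if end < len(selected) else None
    return PartitionResponse(key_dict, rows, next_token)
```

A caller that only needs locations would leave fields at its default; a caller
that wants parameters but not a large stats blob would pass something like
ParamFilter(frozenset({"big_stats_blob"}), exclude=True), where "big_stats_blob"
is a made-up key, and would keep calling with the returned next_page_token
until it comes back as None.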