[ https://issues.apache.org/jira/browse/HIVE-19715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vihang Karajgaonkar updated HIVE-19715:
---------------------------------------
    Attachment: HIVE-19715-design-doc.pdf

> Consolidated and flexible API for fetching partition metadata from HMS
> ----------------------------------------------------------------------
>
>                 Key: HIVE-19715
>                 URL: https://issues.apache.org/jira/browse/HIVE-19715
>             Project: Hive
>          Issue Type: New Feature
>          Components: Standalone Metastore
>            Reporter: Todd Lipcon
>            Assignee: Vihang Karajgaonkar
>            Priority: Major
>         Attachments: HIVE-19715-design-doc.pdf
>
>
> Currently, the HMS thrift API exposes 17 different APIs for fetching
> partition-related information. There is somewhat of a combinatorial explosion
> going on, where each API has variants with and without "auth" info, by pspecs
> vs names, by filters, by exprs, etc. Having all of these separate APIs long
> term is a maintenance burden and also more confusing for consumers.
> Additionally, even with all of these APIs, there is a lack of granularity in
> fetching only the information needed for a particular use case. For example,
> in some use cases it may be beneficial to only fetch the partition locations
> without wasting effort fetching statistics, etc.
> This JIRA proposes that we add a new "one API to rule them all" for fetching
> partition info. The request and response would be encapsulated in structs.
> Some desirable properties:
> - the request should be able to specify which pieces of information are
> required (eg location, properties, etc)
> - in the case of partition parameters, the request should be able to do
> either whitelisting or blacklisting (eg to exclude large incremental column
> stats HLL dumped in there by Impala)
> - the request should optionally specify auth info (to encompass the
> "with_auth" variants)
> - the request should be able to designate the set of partitions to access
> through one of several different methods (eg "all", list<name>, expr,
> part_vals, etc)
> - the struct should be easily evolvable so that new pieces of info can be
> added
> - the response should be designed in such a way as to avoid transferring
> redundant information for common cases (eg simple "dictionary coding" of
> strings like parameter names, etc)
> - the API should support some form of pagination for tables with large
> partition counts
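
To make a few of the properties above concrete, here is a rough Python sketch of what a single request/response pair might look like. It is only an illustration under stated assumptions, not the design in the attached document and not the HMS Thrift API: every name in it (PartitionRequest, PartitionResponse, fetch_partitions, filter_params, the "big_stats_blob" key) is hypothetical, and the metastore is stood in for by a plain dict mapping partition name to its location and parameters. It covers field selection, parameter whitelisting/blacklisting, selecting by "all" or by a list of names, dictionary coding of parameter names, and offset-based pagination; auth info and expr-based selection are left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical types for illustration only; nothing here is taken from HMS.


@dataclass
class PartitionRequest:
    db: str
    table: str
    # Which pieces of information to return for each partition.
    fields: List[str] = field(default_factory=lambda: ["location"])
    param_include: Optional[List[str]] = None  # whitelist of parameter keys
    param_exclude: Optional[List[str]] = None  # blacklist of parameter keys
    part_names: Optional[List[str]] = None     # None selects all partitions
    page_token: int = 0                        # offset into the selected names
    page_size: int = 100


@dataclass
class PartitionResponse:
    strings: List[str]                   # each distinct parameter name, sent once
    partitions: List[Dict[str, object]]  # rows refer to strings by index
    next_page_token: Optional[int]       # None when there are no more pages


def filter_params(params, include, exclude):
    """Apply the whitelist first (if given), then drop blacklisted keys."""
    keys = list(params) if include is None else [k for k in include if k in params]
    excluded = set(exclude or [])
    return {k: params[k] for k in keys if k not in excluded}


def fetch_partitions(store, req):
    """Answer one request against `store`, a dict of name -> partition info."""
    if req.part_names is None:
        names = sorted(store)
    else:
        names = [n for n in req.part_names if n in store]
    page = names[req.page_token:req.page_token + req.page_size]

    strings: List[str] = []
    index: Dict[str, int] = {}

    def code(s):
        # Dictionary coding: assign each distinct string a small integer.
        if s not in index:
            index[s] = len(strings)
            strings.append(s)
        return index[s]

    rows = []
    for name in page:
        part = store[name]
        row: Dict[str, object] = {"name": name}
        if "location" in req.fields:
            row["location"] = part["location"]
        if "parameters" in req.fields:
            kept = filter_params(part["parameters"], req.param_include,
                                 req.param_exclude)
            row["parameters"] = {code(k): v for k, v in kept.items()}
        rows.append(row)

    end = req.page_token + len(page)
    next_token = end if end < len(names) else None
    return PartitionResponse(strings, rows, next_token)
```

A caller that only needs locations would leave fields at its default and never pay for parameters; one that wants parameters minus a large stats entry would pass fields=["location", "parameters"] and param_exclude=["big_stats_blob"]. New optional fields can be added to the request without changing the call signature, which is the sense of "evolvable" assumed here.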