David Maughan created HIVE-12158:
------------------------------------
Summary: Add methods to HCatClient for partition synchronization
Key: HIVE-12158
URL: https://issues.apache.org/jira/browse/HIVE-12158
Project: Hive
Issue Type: Improvement
Components: HCatalog
Reporter: David Maughan
Priority: Minor
We have a use case in which a batch job running outside of Hive creates or
updates a list of partitions, and we would like to synchronize those partitions
with the Hive MetaStore. We would like to use the HCatalog {{HCatClient}}, but
it does not currently seem to support this, although it is possible using the
{{HiveMetaStoreClient}} directly. I am proposing to add the following methods
to {{HCatClient}} and {{HCatClientHMSImpl}}:
1. A method for altering partitions. The implementation would delegate to
{{HiveMetaStoreClient#alter_partitions}}. I've used "update" instead of "alter"
in the name so it's consistent with the {{HCatClient#updateTableSchema}} method.
{code}
public void updatePartitions(List<HCatPartition> partitions) throws HCatException
{code}
2. A method for adding or altering partitions, depending on whether they
already exist. The implementation would split the given list into existing
partitions and new partitions, using {{HiveMetaStoreClient#getPartitionsByNames}}
and {{Warehouse#makePartName}} to determine existence. The appropriate add and
update calls would then be issued:
{code}
public void addOrUpdatePartitions(List<HCatPartition> partitions) throws HCatException
{code}
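To make the split in item 2 concrete, here is a minimal, self-contained Java sketch of the idea. It deliberately does not use the real Hive classes: {{PartitionSketch}}, {{MetaStoreSketch}}, {{InMemoryStore}} and {{PartitionSync}} are hypothetical stand-ins invented for illustration, and the partition name is built in a simplified {{col=val/col=val}} form that may differ from what {{Warehouse#makePartName}} actually produces (for example, around escaping). An actual implementation in {{HCatClientHMSImpl}} would presumably call the metastore client instead.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for HCatPartition: only the partition spec matters here.
final class PartitionSketch {
    final Map<String, String> spec; // ordered partition column -> value

    PartitionSketch(Map<String, String> spec) {
        this.spec = spec;
    }

    // Convenience factory taking alternating column/value arguments.
    static PartitionSketch of(String... columnsAndValues) {
        if (columnsAndValues.length % 2 != 0) {
            throw new IllegalArgumentException("expected column/value pairs");
        }
        Map<String, String> spec = new LinkedHashMap<>();
        for (int i = 0; i < columnsAndValues.length; i += 2) {
            spec.put(columnsAndValues[i], columnsAndValues[i + 1]);
        }
        return new PartitionSketch(spec);
    }

    // Simplified partition name; an assumption, not the real makePartName logic.
    String name() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : spec.entrySet()) {
            if (sb.length() > 0) {
                sb.append('/');
            }
            sb.append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }
}

// Hypothetical stand-in for the few metastore operations the proposal relies on.
interface MetaStoreSketch {
    Set<String> existingPartitionNames(List<String> candidateNames);
    void addPartitions(List<PartitionSketch> partitions);
    void alterPartitions(List<PartitionSketch> partitions);
}

// In-memory store so the sketch can be exercised without a metastore.
final class InMemoryStore implements MetaStoreSketch {
    final Map<String, PartitionSketch> byName = new LinkedHashMap<>();
    final List<String> log = new ArrayList<>(); // records each call and its batch size

    public Set<String> existingPartitionNames(List<String> candidateNames) {
        Set<String> found = new HashSet<>(candidateNames);
        found.retainAll(byName.keySet());
        return found;
    }

    public void addPartitions(List<PartitionSketch> partitions) {
        for (PartitionSketch p : partitions) {
            byName.put(p.name(), p);
        }
        log.add("add:" + partitions.size());
    }

    public void alterPartitions(List<PartitionSketch> partitions) {
        for (PartitionSketch p : partitions) {
            byName.put(p.name(), p);
        }
        log.add("alter:" + partitions.size());
    }
}

final class PartitionSync {
    // Splits the input into new and existing partitions, then issues one call per group.
    static void addOrUpdatePartitions(MetaStoreSketch store, List<PartitionSketch> partitions) {
        List<String> names = new ArrayList<>();
        for (PartitionSketch p : partitions) {
            names.add(p.name());
        }
        Set<String> existing = store.existingPartitionNames(names);

        List<PartitionSketch> toAdd = new ArrayList<>();
        List<PartitionSketch> toUpdate = new ArrayList<>();
        for (PartitionSketch p : partitions) {
            (existing.contains(p.name()) ? toUpdate : toAdd).add(p);
        }
        if (!toAdd.isEmpty()) {
            store.addPartitions(toAdd);
        }
        if (!toUpdate.isEmpty()) {
            store.alterPartitions(toUpdate);
        }
    }
}
```

One design point the sketch leaves open is atomicity: the add and the update are issued as two separate calls, so a failure in the second would leave the first applied. Whether that is acceptable for {{addOrUpdatePartitions}} is a question for the real implementation.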
Are these acceptable? Are there any standards that should be followed here?