[
https://issues.apache.org/jira/browse/HIVE-8371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Thiruvel Thirumoolan reassigned HIVE-8371:
------------------------------------------
Assignee: Thiruvel Thirumoolan
> HCatStorer should fail by default when publishing to an existing partition
> --------------------------------------------------------------------------
>
> Key: HIVE-8371
> URL: https://issues.apache.org/jira/browse/HIVE-8371
> Project: Hive
> Issue Type: Bug
> Components: HCatalog
> Affects Versions: 0.13.0, 0.14.0, 0.13.1
> Reporter: Thiruvel Thirumoolan
> Assignee: Thiruvel Thirumoolan
> Labels: hcatalog, partition
>
> In Hive 0.12 and earlier (or in earlier HCatalog releases), HCatStorer would
> fail if the partition already existed (either before launching the job or
> during commit, depending on the partitioning). HIVE-6406 changed that behavior
> so that it appends by default. This causes data quality issues, since a rerun
> (or duplicate run) no longer fails as it used to and instead just appends to
> the partition.
> A preferable approach would be to keep HCatStorer's earlier behavior (fail
> on a duplicate publish) and support append through an option. Overwrite
> could be implemented in a similar fashion. E.g.:
> store A into 'db.table' using
> org.apache.hive.hcatalog.pig.HCatStorer('partspec', '', ' -append');
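
The proposed semantics (fail by default, with append and overwrite as explicit opt-ins) can be sketched as a small model. This is only an illustration and not HCatalog code: the table is a plain in-memory dict, and the names `PublishMode`, `PartitionExistsError`, and `publish` are hypothetical.

```python
from enum import Enum


class PublishMode(Enum):
    FAIL = "fail"            # proposed default: reject a duplicate publish
    APPEND = "append"        # opt-in: add rows to the existing partition
    OVERWRITE = "overwrite"  # opt-in: replace the partition's contents


class PartitionExistsError(Exception):
    """Raised when publishing to an existing partition in FAIL mode."""


def publish(table, partition, rows, mode=PublishMode.FAIL):
    """Publish rows to a partition of a hypothetical in-memory table (a dict)."""
    if partition not in table:
        # A new partition is always accepted, whatever the mode.
        table[partition] = list(rows)
    elif mode is PublishMode.FAIL:
        # A duplicate publish is an error unless the caller opted in.
        raise PartitionExistsError(f"partition {partition!r} already exists")
    elif mode is PublishMode.APPEND:
        table[partition].extend(rows)
    else:  # PublishMode.OVERWRITE
        table[partition] = list(rows)
    return table
```

With the default mode, a rerun of the same job raises an error instead of silently duplicating data; a caller has to pass `PublishMode.APPEND` to get the behavior introduced by HIVE-6406.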