[ https://issues.apache.org/jira/browse/HIVE-27951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Denys Kuzmenko updated HIVE-27951:
----------------------------------
    Status: Patch Available  (was: Open)

> hcatalog dynamic partitioning fails with partition already exist error when exist parent partitions path
> --------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-27951
>                 URL: https://issues.apache.org/jira/browse/HIVE-27951
>             Project: Hive
>          Issue Type: Bug
>          Components: HCatalog
>    Affects Versions: 4.0.0-beta-1
>            Reporter: Yi Zhang
>            Assignee: Yi Zhang
>            Priority: Critical
>              Labels: pull-request-available
>
> If a table has multiple partition keys and a partition (part1=x1, part2=y1)
> already exists, inserting into a new partition (part1=x1, part2=y2) makes
> the HCatalog FileOutputCommitterContainer throw a "path already exists" error.
>
> To reproduce:
> create table source(id int, part1 string, part2 string);
> create table target(id int) partitioned by (part1 string, part2 string);
> insert into table source values (1, "x1", "y1"), (2, "x1", "y2");
>
> pig -useHCatalog
> A = load 'source' using org.apache.hive.hcatalog.pig.HCatLoader();
> B = filter A by (part2 == 'y1');
> -- the following succeeds
> store B into 'target' USING org.apache.hive.hcatalog.pig.HCatStorer();
> -- the following fails with a duplicate publishing error
> C = filter A by (part2 == 'y2');
> store C into 'target' USING org.apache.hive.hcatalog.pig.HCatStorer();
>
>
> ```
> Partition already present with given partition key values : Data already exists in /user/hive/warehouse/target_data/part1=x1, duplicate publish not possible.
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.commitJob(PigOutputCommitter.java:243)
>     at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.handleJobCommit(CommitterEventHandler.java:286)
>
> Caused by: org.apache.hive.hcatalog.common.HCatException : 2002 : Partition already present with given partition key values : Data already exists in /user/hive/warehouse/target_data/part1=x1, duplicate publish not possible.
>     at org.apache.hive.hcatalog.mapreduce.FileOutputCommitterContainer.moveTaskOutputs(FileOutputCommitterContainer.java:564)
>     at org.apache.hive.hcatalog.mapreduce.FileOutputCommitterContainer.registerPartitions(FileOutputCommitterContainer.java:949)
>     at org.apache.hive.hcatalog.mapreduce.FileOutputCommitterContainer.commitJob(FileOutputCommitterContainer.java:273)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.commitJob(PigOutputCommitter.java:241)
> ```



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
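
The error message names the parent directory `part1=x1`, not the new leaf partition. A minimal sketch of why a duplicate check made at the parent level would misfire in this scenario is given here. It is an illustrative model only, not the actual `FileOutputCommitterContainer` logic: the class `PartitionPathCheck` and all of its methods are hypothetical, and the set of existing directories is assumed from the reproduction steps.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Illustrative model only; hypothetical names, not Hive or HCatalog code. */
public class PartitionPathCheck {

    /** Builds the full partition path, e.g. root/part1=x1/part2=y2, from an ordered spec. */
    static String leafPath(String tableRoot, Map<String, String> spec) {
        StringBuilder sb = new StringBuilder(tableRoot);
        for (Map.Entry<String, String> e : spec.entrySet()) {
            sb.append('/').append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }

    /** Builds the path of the first partition level only, e.g. root/part1=x1. */
    static String firstLevelPath(String tableRoot, Map<String, String> spec) {
        Map.Entry<String, String> first = spec.entrySet().iterator().next();
        return tableRoot + "/" + first.getKey() + "=" + first.getValue();
    }

    /** Duplicate check against the parent directory: fires whenever a sibling partition exists. */
    static boolean conflictsAtFirstLevel(Set<String> existingDirs, String tableRoot,
                                         Map<String, String> spec) {
        return existingDirs.contains(firstLevelPath(tableRoot, spec));
    }

    /** Duplicate check against the full partition path: fires only for a true duplicate. */
    static boolean conflictsAtLeaf(Set<String> existingDirs, String tableRoot,
                                   Map<String, String> spec) {
        return existingDirs.contains(leafPath(tableRoot, spec));
    }

    public static void main(String[] args) {
        String root = "/user/hive/warehouse/target_data";

        // Directories assumed to exist after the first store (part2 == 'y1').
        Set<String> existing = new HashSet<>(Arrays.asList(
                root + "/part1=x1",
                root + "/part1=x1/part2=y1"));

        // Partition written by the second store (part2 == 'y2').
        Map<String, String> newSpec = new LinkedHashMap<>();
        newSpec.put("part1", "x1");
        newSpec.put("part2", "y2");

        System.out.println(conflictsAtFirstLevel(existing, root, newSpec)); // prints "true"
        System.out.println(conflictsAtLeaf(existing, root, newSpec));       // prints "false"
    }
}
```

In this model, only the leaf-level check distinguishes the new partition (part1=x1, part2=y2) from the existing one (part1=x1, part2=y1). Whether the real committer behaves this way is for the attached patch to confirm.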