yigress opened a new pull request, #4979:
URL: https://github.com/apache/hive/pull/4979
### What changes were proposed in this pull request?
Clean up the code for readability.
Fix the failure that occurs when the parent partition path already exists
during a multi-partition dynamic insert.
### Why are the changes needed?
Fix a bug:
if a table partitioned on multiple keys already has a partition (part1=x1,
part2=y1), inserting into a new partition (part1=x1, part2=y2) makes the
HCatalog FileOutputCommitterContainer throw a "path part1=x1 already exists"
error. This happens because the path check stops at the parent level. To
reproduce:
pig -useHCatalog
A = load 'source' using org.apache.hive.hcatalog.pig.HCatLoader();
B = filter A by (part2 == 'y1');
-- the following store succeeds
store B into 'target' USING org.apache.hive.hcatalog.pig.HCatStorer();
-- the following store fails with a duplicate publish error
C = filter A by (part2 == 'y2');
store C into 'target' USING org.apache.hive.hcatalog.pig.HCatStorer();
```
Partition already present with given partition key values : Data already exists in /user/hive/warehouse/target/part1=x1, duplicate publish not possible.
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.commitJob(PigOutputCommitter.java:243)
    at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.handleJobCommit(CommitterEventHandler.java:286)
Caused by: org.apache.hive.hcatalog.common.HCatException : 2002 : Partition already present with given partition key values : Data already exists in /user/hive/warehouse/target/part1=x1, duplicate publish not possible.
    at org.apache.hive.hcatalog.mapreduce.FileOutputCommitterContainer.moveTaskOutputs(FileOutputCommitterContainer.java:564)
    at org.apache.hive.hcatalog.mapreduce.FileOutputCommitterContainer.registerPartitions(FileOutputCommitterContainer.java:949)
    at org.apache.hive.hcatalog.mapreduce.FileOutputCommitterContainer.commitJob(FileOutputCommitterContainer.java:273)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.commitJob(PigOutputCommitter.java:241)
```
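The parent-level check described above can be illustrated with a small, self-contained Java sketch. It is not the code in `FileOutputCommitterContainer`: the class `PartitionPathCheck` and its methods are hypothetical names, and it uses `java.nio.file` on a local temporary directory instead of the Hadoop `FileSystem` API. It only contrasts a check that stops at the first existing directory with one that looks at the full partition path.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class PartitionPathCheck {

    // Hypothetical helper: resolve the leaf directory for a partition spec
    // such as {"part1=x1", "part2=y2"} under the table root.
    static Path partitionDir(Path tableRoot, String... partSpecs) {
        Path dir = tableRoot;
        for (String spec : partSpecs) {
            dir = dir.resolve(spec);
        }
        return dir;
    }

    // Illustrates the reported behavior (an assumption about its shape, not
    // the Hive source): the walk reports a duplicate as soon as any directory
    // on the way down exists, so a shared parent such as part1=x1 is enough.
    static boolean isDuplicateParentLevel(Path tableRoot, String... partSpecs) {
        Path dir = tableRoot;
        for (String spec : partSpecs) {
            dir = dir.resolve(spec);
            if (Files.exists(dir)) {
                return true;
            }
        }
        return false;
    }

    // Illustrates the intended behavior: only the full partition path counts.
    static boolean isDuplicateLeafLevel(Path tableRoot, String... partSpecs) {
        return Files.exists(partitionDir(tableRoot, partSpecs));
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("target");
        // Existing partition (part1=x1, part2=y1).
        Files.createDirectories(partitionDir(root, "part1=x1", "part2=y1"));

        // New partition (part1=x1, part2=y2) shares only the parent directory.
        System.out.println(isDuplicateParentLevel(root, "part1=x1", "part2=y2")); // true
        System.out.println(isDuplicateLeafLevel(root, "part1=x1", "part2=y2"));   // false
        // Re-publishing the existing partition is still detected.
        System.out.println(isDuplicateLeafLevel(root, "part1=x1", "part2=y1"));   // true
    }
}
```

With the leaf-level check, a second store into (part1=x1, part2=y2) is no longer rejected because of the shared part1=x1 directory, while a genuine duplicate publish of (part1=x1, part2=y1) is still caught.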
### Does this PR introduce _any_ user-facing change?
No
### Is the change a dependency upgrade?
No
### How was this patch tested?
Updated the unit tests to cover the case affected by the bug:
mvn clean test -Dtest=TestHCatExternalDynamicPartitioned,TestHCatDynamicPartitioned,TestHCatPartitioned,TestHCatNonPartitioned
Also tested locally with pig -useHCatalog.