Kevin Wilfong created HIVE-3149:
-----------------------------------
Summary: Dynamically generated paritions deleted by BlockMergeTask
Key: HIVE-3149
URL: https://issues.apache.org/jira/browse/HIVE-3149
Project: Hive
Issue Type: Bug
Components: Query Processor
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
Priority: Critical
Fix For: 0.10.0
When creating partitions in a table using dynamic partitions and a Block Merge
Task is executed at the end of the query, some partitions may be lost.
Specifically if the values of two or more dynamic partition keys end in the
same sequence of numbers, all but the largest will be dropped.
I was not able to confirm it, but I suspect that if a map reduce job is
speculated as part of the merge, the duplicate data will not be deleted either.
E.g.
insert overwrite table merge_dynamic_part partition (ds = '2008-04-08', hr)
select key, value, if(key % 2 == 0, 'a1', 'b1') as hr from srcpart_merge_dp_rc
where ds = '2008-04-08';
In this query, if a Block Merge Task is executed at the end, only one of the
partitions ds=2008-04-08/hr=a1 and ds=2008-04-08/hr=b1 will appear in the final
table.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira