[
https://issues.apache.org/jira/browse/HIVE-29166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dmitriy Fingerman updated HIVE-29166:
-------------------------------------
Description:
The attached SQL script with a repeated MERGE query generates duplicates.
If either of the following two changes is made to the script, then there are no
duplicates:
# hive.auto.convert.join=true –> hive.auto.convert.join=false
# The order of columns in CLUSTER BY is changed to match the order of columns in
CREATE TABLE. In the attached script the two orders differ; if they match, there are no duplicates.
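As a minimal sketch of the first change, assuming the property is overridden at session level before the MERGE statements run (the attached script may set it in a different place):
{code:sql}
-- Assumption: session-level override, placed ahead of the MERGE statements.
-- Turns off Hive's automatic conversion of common joins to map joins.
set hive.auto.convert.join=false;
{code}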
It was also found that a query like the one below returns wrong results:
{code:sql}
select * from omsexternal_order_mapping_backup
left outer join omsexternal_order_mapping__2025_08_26_03__transactional on
...{code}
This is what the MERGE query does under the hood.
> Repeated MERGE query generates duplicates
> -----------------------------------------
>
> Key: HIVE-29166
> URL: https://issues.apache.org/jira/browse/HIVE-29166
> Project: Hive
> Issue Type: Bug
> Reporter: Dmitriy Fingerman
> Priority: Major
> Attachments: merge_duplicates.q