[ 
https://issues.apache.org/jira/browse/HIVE-28700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18012152#comment-18012152
 ] 

Stamatis Zampetakis commented on HIVE-28700:
--------------------------------------------

[~dengzh] Can you please add a few more details to the ticket explaining what 
the problem is and why the patch fixes it? The exact repro steps are very 
useful, but a short description of the bug and the solution would help others 
determine whether they are hitting this problem.

> MRCompactor may cause data loss when performing the major compaction
> --------------------------------------------------------------------
>
>                 Key: HIVE-28700
>                 URL: https://issues.apache.org/jira/browse/HIVE-28700
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 4.0.0, 4.0.1
>            Reporter: Zhihua Deng
>            Assignee: Zhihua Deng
>            Priority: Blocker
>              Labels: hive-4.1.0-must, pull-request-available
>             Fix For: 4.1.0
>
>
> Steps to repro:
> set mapreduce.job.reduces=7;
> create table ext(a int);
> insert into table ext values(1),(2),(3),(3),(3),(3),(4),(5),(6),(7);
> create table full_acid(a int) stored as orc tblproperties("transactional"="true");
> insert overwrite table full_acid select * from ext where a = 3;
> insert into table full_acid select * from ext where a != 3 group by a;
> select * from full_acid;
> alter table full_acid compact 'major' and wait;
> select * from full_acid;
> After the major compaction, the full_acid table is missing the records with a = 3.
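
One way to check whether a table is affected is to compare per-value row counts before and after the compaction. The query below is only an illustrative sketch and is not part of the original report; the alias cnt is an arbitrary name. The expected counts in the comments are derived from the values inserted in the repro steps.

```sql
-- Illustrative check, not taken from the ticket.
-- Run once before and once after:
--   alter table full_acid compact 'major' and wait;
select a, count(*) as cnt
from full_acid
group by a
order by a;
-- Expected from the two inserts in the repro steps:
--   a = 3 has cnt 4; a = 1, 2, 4, 5, 6, 7 each have cnt 1 (10 rows in total).
-- Per the report, on affected versions the a = 3 group is absent
-- after the major compaction.
```

If the two results differ, the compaction changed the table contents, which a major compaction should never do.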



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
