[ 
https://issues.apache.org/jira/browse/HIVE-21164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16882646#comment-16882646
 ] 

Sankar Hariappan commented on HIVE-21164:
-----------------------------------------

[~vgumashta]
Yes, removing MoveTask will impact replication.
We add a notification log entry for ACID writes using the 
Hive.addWriteNotificationLog() method, so we need to invoke it right after 
the data is copied to the target location. Compaction is not affected, 
because we don't replicate compacted data files; instead, we expect the user 
to enable compaction on the target cluster. So we only need to invoke 
Hive.addWriteNotificationLog() for the other ACID writes.
This would affect only incremental replication of write operations on ACID 
tables. However, the failed tests suggest that even bootstrap replication 
doesn't replicate the data properly. Those failures could be caused by a bug 
in the insert logic. We may need to validate that the writes are correct on 
the source cluster.
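
To make the ordering concrete, here is a rough, self-contained sketch of the 
rule described above: log the write once the files are in their final 
location, and skip compaction output. It does not call Hive; NotificationLogger, 
WriteKind, afterFilesInPlace() and the file names are hypothetical stand-ins 
chosen for illustration, and the real Hive.addWriteNotificationLog() signature 
may differ.

```java
import java.util.ArrayList;
import java.util.List;

public class WriteNotificationSketch {
    /** Kinds of ACID write; illustrative names, not Hive's own enum. */
    enum WriteKind { INSERT, UPDATE, DELETE, COMPACTION }

    /** Hypothetical stand-in for the call to Hive.addWriteNotificationLog(). */
    interface NotificationLogger {
        void logWrite(String table, long writeId, List<String> files);
    }

    /**
     * Intended to run once the data files are at their final location.
     * Returns true if a notification was logged for the write.
     */
    static boolean afterFilesInPlace(WriteKind kind, String table, long writeId,
                                     List<String> files, NotificationLogger logger) {
        if (kind == WriteKind.COMPACTION) {
            // Compacted files are not replicated, so nothing is logged.
            return false;
        }
        logger.logWrite(table, writeId, files);
        return true;
    }

    /** Runs one insert and one compaction and returns what was logged. */
    static List<String> demo() {
        List<String> logged = new ArrayList<>();
        NotificationLogger logger =
            (table, writeId, files) -> logged.add(table + ":" + writeId + ":" + files.size());
        afterFilesInPlace(WriteKind.INSERT, "t1", 5L,
                          List.of("delta_5_5/bucket_00000"), logger);
        afterFilesInPlace(WriteKind.COMPACTION, "t1", 6L,
                          List.of("base_6/bucket_00000"), logger);
        return logged;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [t1:5:1]
    }
}
```

Only the insert is logged in this sketch; the compaction call returns without 
touching the logger, which mirrors the expectation that compaction is enabled 
separately on the target cluster.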


> ACID: explore how we can avoid a move step during inserts/compaction
> --------------------------------------------------------------------
>
>                 Key: HIVE-21164
>                 URL: https://issues.apache.org/jira/browse/HIVE-21164
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>    Affects Versions: 3.1.1
>            Reporter: Vaibhav Gumashta
>            Assignee: Vaibhav Gumashta
>            Priority: Major
>         Attachments: HIVE-21164.1.patch, HIVE-21164.2.patch, 
> HIVE-21164.3.patch, HIVE-21164.4.patch, HIVE-21164.5.patch, HIVE-21164.6.patch
>
>
> Currently, we write compacted data to a temporary location and then move the 
> files to a final location, which is an expensive operation on some cloud file 
> systems. Since HIVE-20823 is already in, it can control the visibility of 
> compacted data for the readers. Therefore, we can perhaps avoid writing data 
> to a temporary location and directly write compacted data to the intended 
> final path.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)