[ 
https://issues.apache.org/jira/browse/HIVE-14841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15668011#comment-15668011
 ] 

Sergey Shelukhin commented on HIVE-14841:
-----------------------------------------

Is it possible to do the work on the branch? This causes immense conflicts with 
the hive-14535 branch, and I see tons of comments with FIXMEs and such that 
propose moving code around and refactoring this and that.
I think this should be done on the branch and merged once when ready, so that 
conflicts with parallel changes to the code affected by the moves are minimized.

> Replication - Phase 2
> ---------------------
>
>                 Key: HIVE-14841
>                 URL: https://issues.apache.org/jira/browse/HIVE-14841
>             Project: Hive
>          Issue Type: New Feature
>          Components: repl
>    Affects Versions: 2.1.0
>            Reporter: Sushanth Sowmyan
>            Assignee: Sushanth Sowmyan
>
> Per the email sent out to the dev list, the current implementation of 
> replication in Hive has certain drawbacks, for instance:
> * Replication follows a rubberbanding pattern, wherein different tables/ptns 
> can be in a different/mixed state on the destination, so that unless all 
> events are caught up on, we do not have an equivalent warehouse. Thus, this 
> only satisfies DR cases, not load balancing usecases, and the secondary 
> warehouse is really only seen as a backup, rather than as a live warehouse 
> that trails the primary.
> * The base implementation is a naive implementation, and has several 
> performance problems, including a large amount of duplication of data for 
> subsequent events, as mentioned in HIVE-13348, having to copy out entire 
> partitions/tables when just a delta of files might be sufficient/etc. Also, 
> using EXPORT/IMPORT allows us a simple implementation, but at the cost of 
> tons of temporary space, much of which is not actually applied at the 
> destination.
> Thus, to track this, we now create a new branch (repl2) and an uber-jira (this 
> one) to track experimental development towards improvement of this situation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
