[ 
https://issues.apache.org/jira/browse/HIVE-20911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

anishek reassigned HIVE-20911:
------------------------------

    Assignee: anishek

> External Table Replication for Hive
> -----------------------------------
>
>                 Key: HIVE-20911
>                 URL: https://issues.apache.org/jira/browse/HIVE-20911
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 4.0.0
>            Reporter: anishek
>            Assignee: anishek
>            Priority: Critical
>             Fix For: 4.0.0
>
>
> External tables are currently not replicated as part of Hive replication. 
> This JIRA aims to enable that.
> Approach:
> * The target cluster will have a top-level base directory config under 
> which all data relevant to external tables is copied. It will be provided via 
> the *with* clause in the *repl load* command. This base path will be prefixed 
> to the path of the same external table on the source cluster.
> * Changes to an external table's directories can happen without Hive 
> knowing about them, so we can't capture the relevant events whenever data is 
> added or removed. We will therefore have to copy the data from the source 
> path to the target path for external tables every time we run incremental 
> replication.
> ** This will require incremental *repl dump* to create an additional 
> file *\_external\_tables\_info* with data in the following form:
> {code}
> OperationType,tableName,base64Encoded(tableDataLocation)
> {code}
> where OperationType is one of (ADD, REMOVE).
> ** *repl load* will look up all the external tables on the target and remove 
> the tables listed with the REMOVE type in the above file.
> ** For the remaining tables it will create tasks to copy the corresponding 
> paths from source to target, along with the existing tasks for incremental 
> load.
> * New external tables will be created, with their data copied, as part of 
> the regular tasks during incremental load, applying the base directory prefix.
> * Bootstrap will also create / copy these external tables as part of its 
> regular workflow, applying the base directory prefix.
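
The approach above can be sketched in a few lines of Python. This is only an illustration of the proposed file format and path handling, not Hive's implementation. The names `ExternalTableEntry`, `encode_entry`, `parse_entry`, `target_location` and `plan_load`, the example cluster URI, and the example base directory are all hypothetical. Two assumptions go beyond the description: only the path component of the source location is kept when the base directory is prefixed, and a table that has a REMOVE entry gets no copy task.

```python
import base64
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple
from urllib.parse import urlparse

VALID_OPERATIONS = ("ADD", "REMOVE")


@dataclass(frozen=True)
class ExternalTableEntry:
    operation: str   # "ADD" or "REMOVE"
    table_name: str
    location: str    # decoded table data location on the source cluster


def encode_entry(operation: str, table_name: str, location: str) -> str:
    """Build one line: OperationType,tableName,base64Encoded(tableDataLocation)."""
    if operation not in VALID_OPERATIONS:
        raise ValueError(f"unknown operation type: {operation}")
    encoded = base64.b64encode(location.encode("utf-8")).decode("ascii")
    return f"{operation},{table_name},{encoded}"


def parse_entry(line: str) -> ExternalTableEntry:
    """Parse one line of the file back into its three fields."""
    # The standard base64 alphabet contains no commas, so splitting is safe.
    operation, table_name, encoded = line.strip().split(",", 2)
    if operation not in VALID_OPERATIONS:
        raise ValueError(f"unknown operation type: {operation}")
    location = base64.b64decode(encoded).decode("utf-8")
    return ExternalTableEntry(operation, table_name, location)


def target_location(base_dir: str, source_location: str) -> str:
    """Prefix the target base directory to the source table path."""
    # Assumption: scheme and authority of the source URI are dropped.
    source_path = urlparse(source_location).path
    return base_dir.rstrip("/") + "/" + source_path.lstrip("/")


def plan_load(
    lines: Iterable[str], base_dir: str, tables_on_target: Set[str]
) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Return (tables to remove on target, (source, target) copy pairs)."""
    entries = [parse_entry(line) for line in lines if line.strip()]
    removed = {e.table_name for e in entries if e.operation == "REMOVE"}
    # Only tables that actually exist on the target need to be removed.
    to_remove = sorted(removed & set(tables_on_target))
    # Assumption: a table with a REMOVE entry is not copied.
    copies = [
        (e.location, target_location(base_dir, e.location))
        for e in entries
        if e.operation == "ADD" and e.table_name not in removed
    ]
    return to_remove, copies


if __name__ == "__main__":
    dump_lines = [
        encode_entry("ADD", "sales", "hdfs://source-nn:8020/data/ext/sales"),
        encode_entry("REMOVE", "old_logs", "hdfs://source-nn:8020/data/ext/old_logs"),
    ]
    to_remove, copies = plan_load(dump_lines, "/replica/base", {"sales", "old_logs"})
    print("remove on target:", to_remove)
    for source, target in copies:
        print("copy", source, "->", target)
```

Base64-encoding the location keeps the comma-separated line unambiguous even when a path contains commas or other special characters, which is presumably why the format encodes that field.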



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
