[jira] [Updated] (HIVE-21763) Incremental replication to allow changing include/exclude tables list in replication policy.

Sankar Hariappan (JIRA) Tue, 04 Jun 2019 20:59:44 -0700


     [ https://issues.apache.org/jira/browse/HIVE-21763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Sankar Hariappan updated HIVE-21763:
------------------------------------
    Description: 
- REPL DUMP takes two policy inputs in addition to the existing FROM and WITH 
clauses.
{code}
- REPL DUMP <current_repl_policy> [REPLACE <previous_repl_policy>] FROM 
<last_repl_id> WITH <key_values_list>;
- current_repl_policy and previous_repl_policy can be in any format mentioned 
in Point-4.
- The REPLACE clause takes the previous repl policy as input. If the REPLACE 
clause is absent, the policy remains unchanged.
- The rest of the format remains the same.
{code}
- Now, REPL DUMP on this DB will replicate the tables based on 
current_repl_policy.
- Single table replication of format <db_name>.t1 doesn’t allow changing the 
policy dynamically. So the REPLACE clause is not allowed if 
previous_repl_policy is of this format.
- If any table is added dynamically, either due to a change in the regular 
expression or by being added to the include list, it should be bootstrapped 
using an independent table-level replication policy.
{code}
- Hive will automatically figure out the list of newly included tables by 
comparing the current_repl_policy and previous_repl_policy inputs, and will 
combine the bootstrap dump for the added tables into the incremental dump. A 
"_bootstrap" directory can be created in the dump dir to hold all tables to 
be bootstrapped.
- If a table is renamed, it may get dynamically added to or removed from 
replication based on the defined replication policy + include/exclude list. 
So, Hive will bootstrap a table that becomes included only after a rename.
{code}
- REPL LOAD on an incremental dump should check for the "_bootstrap" directory, 
perform a bootstrap load of the tables in it first, and then continue with the 
incremental load based on the events directories.
- REPL LOAD should check for changes in the repl policy and drop the 
tables/views excluded in the new policy compared to the previous policy.
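
The policy comparison and load ordering described above can be sketched roughly as follows. This is only an illustration, not Hive's implementation: the `ReplPolicy` class, the `diff_policies` and `load_order` helpers, the modelling of a policy as include/exclude regular expressions over table names, and the assumption that event directories are named by numeric event id are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplPolicy:
    """Hypothetical model of a replication policy: a database plus
    include/exclude regular expressions over table names."""
    db: str
    include: str = ".*"  # assumed default: include every table
    exclude: str = ""    # assumed default: exclude nothing

    def covers(self, table: str) -> bool:
        """True if the table matches the include list and not the exclude list."""
        if not re.fullmatch(self.include, table):
            return False
        return not (self.exclude and re.fullmatch(self.exclude, table))


def diff_policies(current, previous, tables):
    """Return (to_bootstrap, to_drop) for the given table names.

    to_bootstrap: covered by the current policy but not the previous one.
    to_drop:      covered by the previous policy but not the current one.
    """
    to_bootstrap = sorted(t for t in tables
                          if current.covers(t) and not previous.covers(t))
    to_drop = sorted(t for t in tables
                     if previous.covers(t) and not current.covers(t))
    return to_bootstrap, to_drop


def load_order(dump_entries):
    """Order dump-dir entries so "_bootstrap" comes before the event dirs.

    Assumes event directories are named by numeric event id; any other
    entry is ignored in this sketch.
    """
    bootstrap = [e for e in dump_entries if e == "_bootstrap"]
    events = sorted((e for e in dump_entries if e.isdigit()), key=int)
    return bootstrap + events
```

For example, with a previous policy including `t[0-9]+` and a current policy including `t[0-9]+|orders` while excluding `t2`, the tables `t1`, `t2`, `orders` and `tmp` would yield `orders` as the table to bootstrap and `t2` as the table to drop on the target.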

  was:
- REPL DUMP takes two policy inputs in addition to the existing FROM and WITH 
clauses.
{code}
- REPL DUMP <current_repl_policy> [REPLACE <previous_repl_policy>] FROM 
<last_repl_id> WITH <key_values_list>;
- current_repl_policy and previous_repl_policy can be in any format mentioned 
in Point-4.
- The REPLACE clause takes the previous repl policy as input. If the REPLACE 
clause is absent, the policy remains unchanged.
- The rest of the format remains the same.
{code}
- Now, REPL DUMP on this DB will replicate the tables based on 
current_repl_policy.
- If any table is added dynamically, either due to a change in the regular 
expression or by being added to the include list, it should be bootstrapped 
using an independent table-level replication policy.
{code}
- Hive will automatically figure out the list of newly included tables by 
comparing the current_repl_policy and previous_repl_policy inputs, and will 
combine the bootstrap dump for the added tables into the incremental dump. A 
"_bootstrap" directory can be created in the dump dir to hold all tables to 
be bootstrapped.
- If a table is renamed, it may get dynamically added to or removed from 
replication based on the defined replication policy + include/exclude list. 
So, Hive will bootstrap a table that becomes included only after a rename.
{code}
- REPL LOAD on an incremental dump should check for the "_bootstrap" directory, 
perform a bootstrap load of the tables in it first, and then continue with the 
incremental load based on the events directories.
- REPL LOAD should check for changes in the repl policy and drop the 
tables/views excluded in the new policy compared to the previous policy.


> Incremental replication to allow changing include/exclude tables list in 
> replication policy.
> --------------------------------------------------------------------------------------------
>
>                 Key: HIVE-21763
>                 URL: https://issues.apache.org/jira/browse/HIVE-21763
>             Project: Hive
>          Issue Type: Sub-task
>          Components: repl
>            Reporter: Sankar Hariappan
>            Assignee: Sankar Hariappan
>            Priority: Major
>              Labels: DR, Replication
>
> - REPL DUMP takes two policy inputs in addition to the existing FROM and WITH 
> clauses.
> {code}
> - REPL DUMP <current_repl_policy> [REPLACE <previous_repl_policy>] FROM 
> <last_repl_id> WITH <key_values_list>;
> - current_repl_policy and previous_repl_policy can be in any format mentioned 
> in Point-4.
> - The REPLACE clause takes the previous repl policy as input. If the REPLACE 
> clause is absent, the policy remains unchanged.
> - The rest of the format remains the same.
> {code}
> - Now, REPL DUMP on this DB will replicate the tables based on 
> current_repl_policy.
> - Single table replication of format <db_name>.t1 doesn’t allow changing the 
> policy dynamically. So the REPLACE clause is not allowed if 
> previous_repl_policy is of this format.
> - If any table is added dynamically, either due to a change in the regular 
> expression or by being added to the include list, it should be bootstrapped 
> using an independent table-level replication policy.
> {code}
> - Hive will automatically figure out the list of newly included tables by 
> comparing the current_repl_policy and previous_repl_policy inputs, and will 
> combine the bootstrap dump for the added tables into the incremental dump. A 
> "_bootstrap" directory can be created in the dump dir to hold all tables to 
> be bootstrapped.
> - If a table is renamed, it may get dynamically added to or removed from 
> replication based on the defined replication policy + include/exclude list. 
> So, Hive will bootstrap a table that becomes included only after a rename.
> {code}
> - REPL LOAD on an incremental dump should check for the "_bootstrap" 
> directory, perform a bootstrap load of the tables in it first, and then 
> continue with the incremental load based on the events directories.
> - REPL LOAD should check for changes in the repl policy and drop the 
> tables/views excluded in the new policy compared to the previous policy.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
