[ https://issues.apache.org/jira/browse/HIVE-21763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sankar Hariappan updated HIVE-21763:
------------------------------------
    Description:
- REPL DUMP takes 2 inputs along with the existing FROM and WITH clauses.
{code}
- REPL DUMP <current_repl_policy> [REPLACE <previous_repl_policy>] FROM <last_repl_id> WITH <key_values_list>;
- current_repl_policy and previous_repl_policy can be any format mentioned in Point-4.
- The REPLACE clause is to be supported to take the previous repl policy as input. If the REPLACE clause is absent, the policy remains unchanged.
- The rest of the format remains the same.
{code}
- Now, REPL DUMP on this DB will replicate the tables based on current_repl_policy.
- Single-table replication of the format <db_name>.t1 doesn’t allow changing the policy dynamically, so the REPLACE clause is not allowed if previous_repl_policy is of this format.
- Any table that is added dynamically, either due to a change in the regular expression or by being added to the include list, should be bootstrapped using an independent table-level replication policy.
{code}
- Hive will automatically figure out the list of newly included tables by comparing the current_repl_policy and previous_repl_policy inputs, and will combine the bootstrap dump for the added tables into the incremental dump. A "_bootstrap" directory can be created in the dump dir to hold all tables to be bootstrapped.
- If a table is renamed, it may get dynamically added to or removed from replication based on the defined replication policy plus the include/exclude list. So, Hive will perform a bootstrap for any table that becomes included after the rename.
{code}
- REPL LOAD on an incremental dump should check for the "_bootstrap" directory, perform a bootstrap load on its tables first, and then continue with the incremental load based on the events directories.
- REPL LOAD should check for changes in the repl policy and drop the tables/views excluded in the new policy compared to the previous policy.

  was:
- REPL DUMP takes 2 inputs along with the existing FROM and WITH clauses.
{code}
- REPL DUMP <current_repl_policy> [REPLACE <previous_repl_policy>] FROM <last_repl_id> WITH <key_values_list>;
- current_repl_policy and previous_repl_policy can be any format mentioned in Point-4.
- The REPLACE clause is to be supported to take the previous repl policy as input. If the REPLACE clause is absent, the policy remains unchanged.
- The rest of the format remains the same.
{code}
- Now, REPL DUMP on this DB will replicate the tables based on current_repl_policy.
- Any table that is added dynamically, either due to a change in the regular expression or by being added to the include list, should be bootstrapped using an independent table-level replication policy.
{code}
- Hive will automatically figure out the list of newly included tables by comparing the current_repl_policy and previous_repl_policy inputs, and will combine the bootstrap dump for the added tables into the incremental dump. A "_bootstrap" directory can be created in the dump dir to hold all tables to be bootstrapped.
- If a table is renamed, it may get dynamically added to or removed from replication based on the defined replication policy plus the include/exclude list. So, Hive will perform a bootstrap for any table that becomes included after the rename.
{code}
- REPL LOAD on an incremental dump should check for the "_bootstrap" directory, perform a bootstrap load on its tables first, and then continue with the incremental load based on the events directories.
- REPL LOAD should check for changes in the repl policy and drop the tables/views excluded in the new policy compared to the previous policy.


> Incremental replication to allow changing include/exclude tables list in
> replication policy.
> --------------------------------------------------------------------------------------------
>
>                 Key: HIVE-21763
>                 URL: https://issues.apache.org/jira/browse/HIVE-21763
>             Project: Hive
>          Issue Type: Sub-task
>          Components: repl
>            Reporter: Sankar Hariappan
>            Assignee: Sankar Hariappan
>            Priority: Major
>              Labels: DR, Replication
>
> - REPL DUMP takes 2 inputs along with the existing FROM and WITH clauses.
> {code}
> - REPL DUMP <current_repl_policy> [REPLACE <previous_repl_policy>] FROM
> <last_repl_id> WITH <key_values_list>;
> - current_repl_policy and previous_repl_policy can be any format mentioned in
> Point-4.
> - The REPLACE clause is to be supported to take the previous repl policy as
> input. If the REPLACE clause is absent, the policy remains unchanged.
> - The rest of the format remains the same.
> {code}
> - Now, REPL DUMP on this DB will replicate the tables based on
> current_repl_policy.
> - Single-table replication of the format <db_name>.t1 doesn’t allow changing
> the policy dynamically, so the REPLACE clause is not allowed if
> previous_repl_policy is of this format.
> - Any table that is added dynamically, either due to a change in the regular
> expression or by being added to the include list, should be bootstrapped
> using an independent table-level replication policy.
> {code}
> - Hive will automatically figure out the list of newly included tables by
> comparing the current_repl_policy and previous_repl_policy inputs, and will
> combine the bootstrap dump for the added tables into the incremental dump.
> A "_bootstrap" directory can be created in the dump dir to hold all tables
> to be bootstrapped.
> - If a table is renamed, it may get dynamically added to or removed from
> replication based on the defined replication policy plus the include/exclude
> list. So, Hive will perform a bootstrap for any table that becomes included
> after the rename.
> {code}
> - REPL LOAD on an incremental dump should check for the "_bootstrap"
> directory, perform a bootstrap load on its tables first, and then continue
> with the incremental load based on the events directories.
> - REPL LOAD should check for changes in the repl policy and drop the
> tables/views excluded in the new policy compared to the previous policy.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
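
For illustration only, the policy comparison described in this issue (finding tables that need a bootstrap on REPL DUMP and tables to drop on REPL LOAD) could be sketched as follows. This is not Hive's implementation. The `ReplPolicy` class, its `include`/`exclude` regex tuples, and the `diff_policies` helper are hypothetical names, and the policy format is an assumption because the issue defers the actual format to "Point-4".

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplPolicy:
    """Hypothetical model of a replication policy: a database name plus
    include/exclude regular expressions matched against table names."""
    db_name: str
    include: tuple = (".*",)
    exclude: tuple = ()

    def covers(self, table: str) -> bool:
        # A table is replicated if it fully matches some include pattern
        # and does not fully match any exclude pattern.
        included = any(re.fullmatch(p, table) for p in self.include)
        excluded = any(re.fullmatch(p, table) for p in self.exclude)
        return included and not excluded


def diff_policies(current: ReplPolicy, previous: ReplPolicy, tables):
    """Return (to_bootstrap, to_drop) for the given table names.

    to_bootstrap: covered by the current policy but not the previous one.
    to_drop:      covered by the previous policy but not the current one.
    """
    to_bootstrap = sorted(
        t for t in tables if current.covers(t) and not previous.covers(t)
    )
    to_drop = sorted(
        t for t in tables if previous.covers(t) and not current.covers(t)
    )
    return to_bootstrap, to_drop


# Example with made-up table names in a made-up database.
previous = ReplPolicy("sales", include=("t[0-9]+",), exclude=("t3",))
current = ReplPolicy("sales", include=("t[0-9]+", "orders"), exclude=("t1",))
tables = ["t1", "t2", "t3", "orders", "audit"]

to_bootstrap, to_drop = diff_policies(current, previous, tables)
print(to_bootstrap)  # ['orders', 't3']
print(to_drop)       # ['t1']
```

In this sketch, `t3` and `orders` would go under the "_bootstrap" directory of the incremental dump, while `t1` would be dropped on the target during REPL LOAD. A renamed table can be handled the same way by evaluating its new name against both policies.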