Sankar Hariappan created HIVE-21286:
---------------------------------------
Summary: Hive should support clean-up of incrementally
bootstrapped tables when retrying from a different dump.
Key: HIVE-21286
URL: https://issues.apache.org/jira/browse/HIVE-21286
Project: Hive
Issue Type: Bug
Components: repl
Affects Versions: 4.0.0
Reporter: Sankar Hariappan
Assignee: Sankar Hariappan
If external tables are enabled for replication on an existing repl policy, then
bootstrapping of external tables is combined with the incremental dump.
If the incremental bootstrap load fails with a non-retryable error, the user
has to manually drop all the external tables before trying with another
bootstrap dump. For full bootstrap, to retry with a different dump, we suggest
that the user drop the DB, but in this case they need to manually drop all the
external tables, which is not user friendly. So, this needs to be handled on
the Hive side as follows.
REPL LOAD takes an additional config (passed by the user in the WITH clause)
that tells it to drop all the tables which are part of this bootstrap dump.
There are 4 possible cases.
1. Only external tables - Drop all external tables before triggering bootstrap
load.
2. Only ACID/MM tables - Drop all ACID/MM tables before triggering bootstrap
load.
3. Both external and ACID/MM tables - Drop both external and ACID/MM tables
before triggering bootstrap load.
4. Table level replication with bootstrap - Drop all the tables that match the
diff between the previous and current repl policy (pattern+include/exclude
list) before triggering bootstrap load.
Configuration: hive.repl.bootstrap.cleanup.type=
{1=external_tables, 2=transactional_tables,
3=external_and_transactional_tables, 4=table_level}
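The four cases above can be sketched as a small selection function. This is only an illustrative model in Python, not Hive code: the names CleanupType, Table and tables_to_drop are hypothetical, the policies are modelled as plain predicates on the table name, and the "diff" for case 4 is assumed to mean tables matched by the current policy but not by the previous one.

```python
from dataclasses import dataclass
from enum import IntEnum


class CleanupType(IntEnum):
    """Values proposed for hive.repl.bootstrap.cleanup.type in this issue."""
    EXTERNAL_TABLES = 1
    TRANSACTIONAL_TABLES = 2
    EXTERNAL_AND_TRANSACTIONAL_TABLES = 3
    TABLE_LEVEL = 4


@dataclass(frozen=True)
class Table:
    """Hypothetical stand-in for a table in the target database."""
    name: str
    external: bool = False
    transactional: bool = False  # ACID/MM


def tables_to_drop(tables, cleanup_type, in_old_policy=None, in_new_policy=None):
    """Return the names of tables to drop before the bootstrap load.

    in_old_policy / in_new_policy are hypothetical predicates (name -> bool)
    standing in for the previous and current repl policy; they are only
    needed for TABLE_LEVEL.
    """
    cleanup_type = CleanupType(cleanup_type)  # raises ValueError if not 1..4
    if cleanup_type is CleanupType.EXTERNAL_TABLES:
        return [t.name for t in tables if t.external]
    if cleanup_type is CleanupType.TRANSACTIONAL_TABLES:
        return [t.name for t in tables if t.transactional]
    if cleanup_type is CleanupType.EXTERNAL_AND_TRANSACTIONAL_TABLES:
        return [t.name for t in tables if t.external or t.transactional]
    # TABLE_LEVEL: assumed to mean "in the current policy, not in the previous one".
    if in_old_policy is None or in_new_policy is None:
        raise ValueError("TABLE_LEVEL cleanup needs both policies")
    return [t.name for t in tables
            if in_new_policy(t.name) and not in_old_policy(t.name)]
```

For example, with one external, one ACID and one plain managed table, type 3 would select the first two, while type 4 depends only on which table names the two policies match.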