[ 
https://issues.apache.org/jira/browse/FALCON-1390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15034945#comment-15034945
 ] 

Balu Vellanki commented on FALCON-1390:
---------------------------------------

{code}
1. Get list of tables for specified DB.
2. Check whether the tables returned from step 1 exist on the target Hive server.
3. If they exist, don’t bootstrap and get the last event id.
{code}

This will not work in all scenarios. HiveServer deletes notification log
entries older than N days (N is configurable). If the table exists on the
target and its last event id is 100, you cannot assume that all events after
100 are still available in the NOTIFICATION_LOG table on the source. The log
might only contain notifications starting at 1000, in which case events 101
through 999 are gone and incremental replication would silently skip them.
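
The gap condition described above can be sketched as a small check. This is
only an illustration in Python, not Falcon or Hive code: the function name and
both parameters are hypothetical, and it assumes event ids are consecutive
integers, so it errs on the side of reporting a gap.

```python
from typing import Optional


def incremental_replication_is_safe(
    target_last_event_id: int,
    source_oldest_event_id: Optional[int],
) -> bool:
    """Return True if no event after target_last_event_id was purged.

    source_oldest_event_id is the smallest event id still present in the
    source notification log, or None if the log is empty (hypothetical
    inputs; how they are fetched is outside this sketch).
    """
    if source_oldest_event_id is None:
        # Empty log: we cannot prove that nothing was purged, so be
        # conservative and report that a bootstrap is needed.
        return False
    # The next event the target needs is target_last_event_id + 1.
    # If the source log starts later than that, events were lost.
    return source_oldest_event_id <= target_last_event_id + 1
```

With the numbers from the example, a target at event 100 and a source log that
starts at 1000 gives `False`, so the existing table cannot simply be reused.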

Any task that replaces manual bootstrap will have to export the entire table
data on the source, copy it to the target Hive and import it there. Once this
task is complete, the bootstrap process will have to set the last_event_id on
the target Hive table.
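
The ordering of those steps could look roughly like the following Python
sketch. All names here are hypothetical placeholders: the four callables stand
in for whatever performs the export, copy, import and property update, and
which event id should be recorded is left to the caller because the comment
does not specify it.

```python
from typing import Callable


def bootstrap_table(
    export_on_source: Callable[[], str],
    copy_to_target: Callable[[str], str],
    import_on_target: Callable[[str], None],
    set_last_event_id: Callable[[int], None],
    event_id_to_record: int,
) -> None:
    """Run a full bootstrap, then record the event id on the target.

    Every argument is a hypothetical stand-in supplied by the caller.
    """
    # 1. Dump the entire table on the source; returns the dump location.
    source_dump = export_on_source()
    # 2. Copy the dump to the target cluster; returns the new location.
    target_dump = copy_to_target(source_dump)
    # 3. Import the copied data into the target Hive.
    import_on_target(target_dump)
    # 4. Only after a successful import, record the event id so that
    #    later incremental replication knows where to resume.
    set_last_event_id(event_id_to_record)
```

Because the event id is set last, a failure in any earlier step raises before
the target is marked as bootstrapped.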

> Develop Auto Bootstrapping for HiveDR
> -------------------------------------
>
>                 Key: FALCON-1390
>                 URL: https://issues.apache.org/jira/browse/FALCON-1390
>             Project: Falcon
>          Issue Type: New Feature
>    Affects Versions: 0.7
>            Reporter: Peeyush Bishnoi
>            Assignee: Peeyush Bishnoi
>         Attachments: AutoBootstrap_DB_Table.pdf
>
>
> Currently Hive DR requires a manual bootstrap of the Database and Table to be
> replicated if they are not available on the target cluster. It would be good
> to automate the Database and Table bootstrap so that the user does not have
> to perform it manually.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
