[ 
https://issues.apache.org/jira/browse/HUDI-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinoth Chandar reassigned HUDI-2477:
------------------------------------

    Assignee: sivabalan narayanan  (was: Manoj Govindassamy)

> Restore fails after adding rollback plan and rollback.requested instant w/ 
> metadata enabled
> -------------------------------------------------------------------------------------------
>
>                 Key: HUDI-2477
>                 URL: https://issues.apache.org/jira/browse/HUDI-2477
>             Project: Apache Hudi
>          Issue Type: Task
>          Components: Writer Core
>            Reporter: sivabalan narayanan
>            Assignee: sivabalan narayanan
>            Priority: Blocker
>             Fix For: 0.11.0
>
>
> A restore triggers a rollback of N commits and then commits the restore 
> itself. None of the rollbacks are published to the timeline. 
> Since the rollback.requested instant was added, however, restore breaks 
> when metadata is enabled. 
> Here is what happens:
> Restore
>     Schedule a rollback for each of the N commits. This publishes 
> rollback.requested instants to the timeline. We cannot skip publishing 
> them, because the rollback action executor depends on them. 
>     Trigger the rollback action executor, which executes the rollback. 
> This time the rollbacks may not be published, so there is no completed 
> rollback instant. 
>     To finalize the restore, apply the changes to the metadata table 
> before committing the restore to the data table. This is where the issue 
> is. We check whether bootstrapping is required. Chances are that the last 
> instant synced to the metadata table is no longer active in the data 
> table, so a bootstrap is triggered. But bootstrap is allowed only if there 
> are no pending operations in the data table. All of the rollbacks surface 
> as pending operations, and hence we fail here. 
>  
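The restore sequence described above can be modelled with a small, self-contained sketch. Every name in it (RestoreSketch, Instant, rollbackForRestore, syncRestoreToMetadata) is hypothetical and is not Hudi's API. It also assumes that rolling back a commit removes that commit's instant from the active timeline, which is one reading of "last synced instant ... is not active anymore".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RestoreSketch {
    // Hypothetical stand-in for a timeline entry; this is not Hudi's HoodieInstant.
    static final class Instant {
        final String time;
        final String action;
        final boolean completed;

        Instant(String time, String action, boolean completed) {
            this.time = time;
            this.action = action;
            this.completed = completed;
        }
    }

    // Rollback as done inside a restore: the rolled-back commits leave the
    // timeline (assumption) and one rollback instant per commit is published
    // in a pending state that is never completed.
    static List<Instant> rollbackForRestore(List<Instant> timeline, List<String> commitTimes) {
        List<Instant> result = new ArrayList<>();
        for (Instant instant : timeline) {
            if (!commitTimes.contains(instant.time)) {
                result.add(instant);
            }
        }
        for (String commitTime : commitTimes) {
            result.add(new Instant("rb-" + commitTime, "rollback", false));
        }
        return result;
    }

    // Bootstrap is needed when the last synced instant is gone from the timeline.
    static boolean bootstrapRequired(List<Instant> timeline, String lastSyncedTime) {
        for (Instant instant : timeline) {
            if (instant.time.equals(lastSyncedTime)) {
                return false;
            }
        }
        return true;
    }

    static boolean hasPendingInstants(List<Instant> timeline) {
        for (Instant instant : timeline) {
            if (!instant.completed) {
                return true;
            }
        }
        return false;
    }

    // Finalizing the restore: sync to metadata, bootstrapping first if required.
    static String syncRestoreToMetadata(List<Instant> timeline, String lastSyncedTime) {
        if (!bootstrapRequired(timeline, lastSyncedTime)) {
            return "SYNCED";
        }
        if (hasPendingInstants(timeline)) {
            return "FAILED_PENDING_INSTANTS"; // bootstrap refused
        }
        return "BOOTSTRAPPED";
    }

    public static void main(String[] args) {
        List<Instant> timeline = new ArrayList<>();
        timeline.add(new Instant("001", "commit", true));
        timeline.add(new Instant("002", "commit", true));
        timeline.add(new Instant("003", "commit", true));

        List<Instant> afterRollback = rollbackForRestore(timeline, Arrays.asList("002", "003"));
        // The metadata table was last synced at 003, which has been rolled back.
        System.out.println(syncRestoreToMetadata(afterRollback, "003"));
    }
}
```

In this model the run prints FAILED_PENDING_INSTANTS: the pending rollback instants block the very bootstrap that the restore needs.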
> This could also be an issue when we work with bootstrap on the original 
> dataset: you bootstrap, and for some reason you want to roll back the 
> bootstrap. That can end up in this state too. 
> To illustrate: 
> bootstrap
>     The changes are also applied to metadata. There is only one commit.
> rollback bootstrap
>     This is a restore operation, so we first do a rollback, which creates 
> a rollback.requested instant. 
>     To finalize the restore, we try to apply the restore to metadata. 
>         This goes into the bootstrap code path. The last synced instant 
> is not found in the data timeline. We assume it is archived, so we trigger 
> a re-bootstrap and delete the metadata table. 
>         We then try to do the actual bootstrap. But since there is a 
> pending operation in the data timeline (rollback.requested), no bootstrap 
> is performed, and the state remains as is, i.e. the metadata table is 
> deleted. Actually applying the restore commit then fails. 
>  
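The ordering problem in the bootstrap-rollback case (the metadata table is deleted before the pending-operation check runs) can be sketched as follows. The names RebootstrapSketch, MetadataTable and tryRebootstrap are hypothetical; the sketch only models the order of steps described above, not Hudi's actual code.

```java
public class RebootstrapSketch {
    // Hypothetical model of the metadata table: only whether it exists.
    static final class MetadataTable {
        boolean exists = true;
    }

    // Returns true when the metadata table is usable afterwards.
    static boolean tryRebootstrap(MetadataTable metadata,
                                  boolean lastSyncedInstantOnTimeline,
                                  int pendingInstants) {
        if (lastSyncedInstantOnTimeline) {
            return true; // no re-bootstrap needed
        }
        // Last synced instant is assumed archived: delete the table first.
        metadata.exists = false;
        if (pendingInstants > 0) {
            // Bootstrap is refused, and the table stays deleted.
            return false;
        }
        metadata.exists = true; // bootstrap from scratch
        return true;
    }

    public static void main(String[] args) {
        MetadataTable metadata = new MetadataTable();
        // One pending rollback.requested instant; the bootstrap commit is gone.
        boolean ok = tryRebootstrap(metadata, false, 1);
        System.out.println("bootstrapped=" + ok + ", metadataExists=" + metadata.exists);
    }
}
```

The run prints bootstrapped=false, metadataExists=false, which matches the stuck state described in the issue.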
>  
> We also need to decide whether, even with metadata enabled, we should leave 
> the rollback instant in the timeline, or clean it up after committing the 
> restore to the timeline. 
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
