[jira] [Assigned] (HUDI-2477) Restore fails after adding rollback plan and rollback.requested instant w/ metadata enabled
[ https://issues.apache.org/jira/browse/HUDI-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinoth Chandar reassigned HUDI-2477:

    Assignee: sivabalan narayanan  (was: Manoj Govindassamy)

> Restore fails after adding rollback plan and rollback.requested instant w/ metadata enabled
> ---
>
>                 Key: HUDI-2477
>                 URL: https://issues.apache.org/jira/browse/HUDI-2477
>             Project: Apache Hudi
>          Issue Type: Task
>          Components: Writer Core
>            Reporter: sivabalan narayanan
>            Assignee: sivabalan narayanan
>            Priority: Blocker
>             Fix For: 0.11.0
>
>
> A restore triggers a rollback of N commits and then finally commits the
> restore. None of the rollbacks are published to the timeline.
> But since we added the rollback.requested instant, restore breaks with
> metadata enabled.
> Here is what is happening:
> Restore
>   1. Schedule a rollback for each of the N commits. This publishes
>      rollback.requested instants to the timeline. We can't skip this
>      publishing, because the rollback action executor depends on it.
>   2. Trigger the rollback action executor, which executes the rollback.
>      This time we may not publish the rollbacks, so there won't be a
>      rollback completed instant.
>   3. To finalize the restore, we apply the changes to the metadata table
>      before we can commit the restore to the data table. This is where the
>      issue is. We check whether bootstrapping is required. Chances are that
>      the last instant synced to the metadata table is no longer active in
>      the data table, so this triggers a bootstrap. But we allow a bootstrap
>      only if there are no pending operations in the data table. All the
>      rollbacks surface as pending operations, and hence we fail here.
>
> This could also be an issue when we bootstrap the original dataset: you
> bootstrap, and for some reason you want to roll the bootstrap back. This
> might end up in the same state.
> To illustrate clearly:
> bootstrap
>   - Also applies the changes to metadata. There is only one commit.
> rollback bootstrap
>   - This is a restore operation, so we first do a rollback, which creates a
>     rollback.requested instant.
>   - To finalize the restore, we try to apply the restore to metadata.
>       - This goes into the bootstrap code path. The last synced instant is
>         not found in the data timeline. We assume it was archived, so we
>         trigger a re-bootstrap and delete the metadata table.
>       - We then try to do the actual bootstrap. But since there is a
>         pending operation in the data timeline (rollback.requested), we
>         will not do any bootstrap at all. So the state remains, i.e. the
>         metadata table is deleted.
>       - Actually applying the restore commit then fails.
>
>
> We also need to think about whether, even if metadata is enabled, we should
> leave the rollback instant in the timeline itself or clean it up after
> committing the restore to the timeline.
>
>

--
This message was sent by Atlassian Jira
(v8.20.1#820001)
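
The failure sequence described in the issue can be sketched as a small, self-contained simulation. This is only an illustrative model under stated assumptions, not Hudi code: `Instant`, `DataTimeline`, `MetadataTable`, `restore`, and every method on them are hypothetical names invented for this sketch, and the instant times are made up. It reproduces the "rollback bootstrap" case: one bootstrap commit, a restore that leaves a rollback.requested instant behind, and a metadata sync that deletes the metadata table and then refuses to bootstrap.

```python
from dataclasses import dataclass, field

# All names here are hypothetical; none of them are Hudi APIs.


@dataclass(frozen=True)
class Instant:
    time: str    # made-up instant time, e.g. "001"
    action: str  # "commit", "rollback" or "restore"
    state: str   # "requested", "inflight" or "completed"


@dataclass
class DataTimeline:
    instants: list = field(default_factory=list)

    def add(self, time, action, state):
        self.instants.append(Instant(time, action, state))

    def remove_commit(self, time):
        self.instants = [i for i in self.instants
                         if not (i.time == time and i.action == "commit")]

    def contains_commit(self, time):
        return any(i.time == time and i.action == "commit"
                   for i in self.instants)

    def pending(self):
        return [i for i in self.instants if i.state != "completed"]


@dataclass
class MetadataTable:
    exists: bool = True
    last_synced: str = None

    def sync_restore(self, timeline):
        """Apply a restore to the metadata table; return True on success."""
        if self.exists and timeline.contains_commit(self.last_synced):
            return True  # last synced instant still active: no bootstrap
        # The last synced instant is missing from the data timeline. Assume
        # it was archived and re-bootstrap, which first deletes the table.
        self.exists = False
        self.last_synced = None
        if timeline.pending():
            return False  # bootstrap refused: pending operations exist
        self.exists = True
        return True


def restore(timeline, metadata, commits_to_undo):
    next_time = 100
    for commit_time in commits_to_undo:
        # Scheduling publishes rollback.requested; the executor depends on it.
        timeline.add(str(next_time), "rollback", "requested")
        # Executing undoes the commit but publishes no completed rollback.
        timeline.remove_commit(commit_time)
        next_time += 1
    # Metadata is updated before the restore is committed to the data table.
    if not metadata.sync_restore(timeline):
        return False
    timeline.add(str(next_time), "restore", "completed")
    return True


timeline = DataTimeline()
timeline.add("001", "commit", "completed")  # the single bootstrap commit
metadata = MetadataTable(exists=True, last_synced="001")

ok = restore(timeline, metadata, ["001"])
print(ok, metadata.exists, [f"{i.action}.{i.state}" for i in timeline.pending()])
# prints: False False ['rollback.requested']
```

In this model the restore would succeed if the pending rollback instants were either completed or ignored by the bootstrap check, which corresponds to the open question at the end of the description about whether the rollback instant should stay in the timeline.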
[jira] [Assigned] (HUDI-2477) Restore fails after adding rollback plan and rollback.requested instant w/ metadata enabled
[ https://issues.apache.org/jira/browse/HUDI-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinoth Chandar reassigned HUDI-2477:

    Assignee: Manoj Govindassamy  (was: Vinoth Chandar)

> Restore fails after adding rollback plan and rollback.requested instant w/ metadata enabled
> ---
>
>                 Key: HUDI-2477
>                 URL: https://issues.apache.org/jira/browse/HUDI-2477
>             Project: Apache Hudi
>          Issue Type: Sub-task
>          Components: Writer Core
>            Reporter: sivabalan narayanan
>            Assignee: Manoj Govindassamy
>            Priority: Blocker
>             Fix For: 0.11.0
[jira] [Assigned] (HUDI-2477) Restore fails after adding rollback plan and rollback.requested instant w/ metadata enabled
[ https://issues.apache.org/jira/browse/HUDI-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinoth Chandar reassigned HUDI-2477:

    Assignee: Vinoth Chandar  (was: sivabalan narayanan)

> Restore fails after adding rollback plan and rollback.requested instant w/ metadata enabled
> ---
>
>                 Key: HUDI-2477
>                 URL: https://issues.apache.org/jira/browse/HUDI-2477
>             Project: Apache Hudi
>          Issue Type: Sub-task
>          Components: Writer Core
>            Reporter: sivabalan narayanan
>            Assignee: Vinoth Chandar
>            Priority: Blocker
>             Fix For: 0.10.0

--
This message was sent by Atlassian Jira
(v8.3.4#803005)