[ https://issues.apache.org/jira/browse/HUDI-2712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
sivabalan narayanan resolved HUDI-2712.
---------------------------------------

> rollback of a partially failed commit which has new partitions fails with
> metadata table
> -------------------------------------------------------------------------
>
>                 Key: HUDI-2712
>                 URL: https://issues.apache.org/jira/browse/HUDI-2712
>             Project: Apache Hudi
>          Issue Type: Sub-task
>    Affects Versions: 0.10.0
>            Reporter: sivabalan narayanan
>            Assignee: sivabalan narayanan
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 0.10.0
>
>
> When a commit is being rolled back and that commit added partitions which
> were not present in the table before, files in those new partitions may
> not be part of the rollback plan, so they end up dangling without being
> cleaned up.
>
> Eg:
> commit1: p1 (5 files), p2 (5 files)
> commit2: p1 (3 files), p2 (3 files), p3 (2 files); partially failed write.
>
> When commit3 is triggered, it rolls back commit2.
> When generating the rollback plan, we first fetch all partitions from
> TableFileSystemView, which hits the metadata table when it is enabled.
> This may return only p1 and p2 and not p3 (since commit2 is not completed).
> We then do fs.list on the returned partitions and keep only the files that
> match commit2.
> So, in this case, we might miss rolling back the files added to p3.

--
This message was sent by Atlassian Jira
(v8.20.1#820001)
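
The scenario in the description can be reproduced with a small toy model. This is only an illustrative sketch in Python, not Hudi code: `storage`, `metadata_partitions`, and `plan_rollback` are hypothetical names, and the file names are made up. It assumes, as the description does, that the partition listing only knows about partitions written by completed commits.

```python
# Toy model of the scenario above; not Hudi code. All names are hypothetical.

# Files on storage, keyed by partition: (file name, commit that wrote it).
storage = {
    "p1": [(f"p1-c1-{i}", "commit1") for i in range(5)]
    + [(f"p1-c2-{i}", "commit2") for i in range(3)],
    "p2": [(f"p2-c1-{i}", "commit1") for i in range(5)]
    + [(f"p2-c2-{i}", "commit2") for i in range(3)],
    # p3 was created by the partially failed commit2.
    "p3": [(f"p3-c2-{i}", "commit2") for i in range(2)],
}

# Assumption: the metadata-backed listing only returns partitions that
# belong to completed commits, so p3 is missing.
metadata_partitions = ["p1", "p2"]


def plan_rollback(partitions, commit):
    """List each given partition and keep the files written by `commit`."""
    return sorted(
        name
        for partition in partitions
        for name, written_by in storage.get(partition, [])
        if written_by == commit
    )


# Plan built from the metadata-backed partition list.
plan = plan_rollback(metadata_partitions, "commit2")

# What a full listing of storage would have found for commit2.
all_commit2_files = plan_rollback(storage.keys(), "commit2")

# Files written by commit2 that the plan never sees.
dangling = sorted(set(all_commit2_files) - set(plan))
print(dangling)
```

In this model the plan covers the six commit2 files in p1 and p2, while the two files in p3 are left behind, which matches the dangling-file behaviour described in the issue.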