yihua opened a new pull request, #10913:
URL: https://github.com/apache/hudi/pull/10913

   ### Change Logs
   
   When there are repeated deletes of the same file in the partition file list of the `files` partition of the metadata table (MDT), the current `HoodieMetadataPayload` merging logic drops such deletions. As a result, a file that has been deleted from the file system, and is supposed to be removed from the MDT file listing as well, is still left in the MDT, because the merging logic for file system metadata does not account for this case. Other MDT partitions already account for repeated deletes when merging payloads.
   
   This PR fixes the logic. New tests are added around repeated deletes (the tests fail before the fix and pass after it).
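   
   To illustrate, here is a minimal, hypothetical sketch of the faulty vs. intended merge semantics in Java (the `FileListPayload` class and its methods are illustrative only, not the actual `HoodieMetadataPayload` implementation):
   ```java
   // Simplified, hypothetical sketch -- not the actual HoodieMetadataPayload
   // code; class, field, and method names are illustrative.
   import java.util.HashSet;
   import java.util.Set;

   final class FileListPayload {
     final Set<String> creations;
     final Set<String> deletions;

     FileListPayload(Set<String> creations, Set<String> deletions) {
       this.creations = creations;
       this.deletions = deletions;
     }

     // Buggy merge: a delete already present in the older payload is dropped
     // from the combined result, so two identical deletes cancel out.
     FileListPayload mergeBuggy(FileListPayload newer) {
       Set<String> mergedCreations = new HashSet<>(creations);
       mergedCreations.addAll(newer.creations);
       mergedCreations.removeAll(newer.deletions);
       Set<String> mergedDeletions = new HashSet<>(newer.deletions);
       mergedDeletions.removeAll(deletions); // repeated delete vanishes here
       return new FileListPayload(mergedCreations, mergedDeletions);
     }

     // Fixed merge: deletes are idempotent, so the union of both deletion
     // sets is kept and the file stays marked as deleted.
     FileListPayload mergeFixed(FileListPayload newer) {
       Set<String> mergedCreations = new HashSet<>(creations);
       mergedCreations.addAll(newer.creations);
       mergedCreations.removeAll(newer.deletions);
       Set<String> mergedDeletions = new HashSet<>(deletions);
       mergedDeletions.addAll(newer.deletions); // union keeps repeated deletes
       return new FileListPayload(mergedCreations, mergedDeletions);
     }
   }
   ```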
   
   Here's a concrete example of how this bug causes the ingestion to fail:
   
   (1) A data file and its file group are replaced by clustering. The data file is still on the file system and in the MDT file listing.
   
   (2) A cleaner plan is generated to delete the data file.
   
   (3) The cleaner plan is executed for the first time and fails before committing, due to a Spark job shutdown.
   
   (4) The ingestion continues and succeeds, and another cleaner plan is generated that contains the same data file/file group to delete.
   
   (5) The first cleaner plan is successfully executed, recording the deletion in the file list with a metadata payload, which is added to one log file in the MDT, e.g.,
   ```
   HoodieMetadataPayload {key=partition, type=2, Files: {creations=[], deletions=[7f6b146e-cd43-4fd3-9ce0-118232562569-0_63-29223-5579389_20240303214408245.parquet], }}
   ```
   (6) The second cleaner plan is also successfully executed, recording a deletion of the same data file with another metadata payload, which is added to a subsequent log file in the same file slice in the MDT, e.g.,
   ```
   HoodieMetadataPayload {key=partition, type=2, Files: {creations=[], deletions=[7f6b146e-cd43-4fd3-9ce0-118232562569-0_63-29223-5579389_20240303214408245.parquet], }}
   ```
   (7) The replacecommit corresponding to the clustering is archived, since the cleaner has deleted the replaced file groups.
   
   (8) When the MDT is read or MDT compaction runs, merging these two metadata payloads with identical deletes produces an empty deletion list, so the data file is not deleted from the partition file list in the MDT. The expected behavior is to keep the data file in the "deletions" field, but the merged payload instead becomes:
   ```
   HoodieMetadataPayload {key=partition, type=2, Files: {creations=[], deletions=[], }}
   ```
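   The correct merged payload should keep the data file in the deletion list:
   ```
   HoodieMetadataPayload {key=partition, type=2, Files: {creations=[], deletions=[7f6b146e-cd43-4fd3-9ce0-118232562569-0_63-29223-5579389_20240303214408245.parquet], }}
   ```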
   (9) The next time an upsert or indexing operation runs, the deleted data file (e.g., `7f6b146e-cd43-4fd3-9ce0-118232562569-0_63-29223-5579389_20240303214408245.parquet`) is served by the file system view based on the MDT, but the data file cannot be found on the file system, causing the ingestion to fail.
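   
   As a quick check of the semantics, merging the two identical cleaner payloads from steps (5) and (6) with the hypothetical `FileListPayload` sketch above shows the repeated delete being dropped vs. kept:
   ```java
   // Hypothetical usage of the FileListPayload sketch above: two cleaner
   // payloads that both delete the same data file, as in steps (5) and (6).
   import java.util.HashSet;
   import java.util.Set;

   public class RepeatedDeleteDemo {
     public static void main(String[] args) {
       Set<String> dataFile = new HashSet<>(Set.of(
           "7f6b146e-cd43-4fd3-9ce0-118232562569-0_63-29223-5579389_20240303214408245.parquet"));

       FileListPayload first = new FileListPayload(new HashSet<>(), dataFile);
       FileListPayload second = new FileListPayload(new HashSet<>(), new HashSet<>(dataFile));

       // Buggy merge prints [] -- the repeated delete cancels out (step 8).
       System.out.println(first.mergeBuggy(second).deletions);
       // Fixed merge prints the data file -- the deletion is preserved.
       System.out.println(first.mergeFixed(second).deletions);
     }
   }
   ```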
   
   ### Impact
   
   MDT bug fix.
   
   ### Risk level
   
   Low
   
   ### Documentation Update
   
   N/A
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   

