uicosp opened a new pull request, #6173:
URL: https://github.com/apache/paimon/pull/6173

   [core] Fix checkpoint recovery failure for compacted changelog files
   
     ### Purpose
   
     Fixes checkpoint recovery failures that occur when the precommit-compact 
feature is enabled. That feature was introduced by the commit "[flink] add 
coordinate and worker operator for small changelog files compaction" (#4380).
   
     **Root Cause:**
     Compacted changelog files have two types of file names:
     1. Real files: `compacted-changelog-xxx$bid-len.cc-format`
     2. Fake files: `compacted-changelog-xxx$bid-len-off-len2.cc-format`
   
     Fake file names point to segments of real files and do not exist in the 
filesystem themselves. The `checkFilesExistence` method checked these fake 
file paths directly, so the existence check failed during recovery.
   
     **Solution:**
     - Created `CompactedChangelogPathResolver` utility class to resolve fake 
file paths to real file paths
     - Modified `TableCommitImpl.checkFilesExistence()` to resolve all paths 
before checking existence
     - Added deduplication logic since multiple fake files may resolve to the 
same real file
     - Path resolution rules:
       - Real files (`xxx$bid-len.cc-format`): return original path
       - Fake files (`xxx$bid-len-off-len2.cc-format`): resolve to 
`bucket-bid/xxx$bid-len.cc-format`
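
The resolution rules and the deduplication step above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the class `ChangelogPathSketch` and its methods are hypothetical, paths are plain strings rather than Paimon `Path` objects, and it assumes that a fake file's parent directory is a bucket directory whose sibling `bucket-<bid>` holds the real file.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch; not the PR's actual CompactedChangelogPathResolver. */
final class ChangelogPathSketch {

    // Fake name layout taken from the description: <prefix>$<bid>-<len>-<off>-<len2>.cc-<format>
    private static final Pattern FAKE_NAME =
            Pattern.compile("^(.+)\\$(\\d+)-(\\d+)-(\\d+)-(\\d+)\\.cc-(.+)$");

    /** Returns real paths unchanged and maps fake paths to the real file they point into. */
    static String resolve(String path) {
        int slash = path.lastIndexOf('/');
        String name = path.substring(slash + 1);
        Matcher m = FAKE_NAME.matcher(name);
        if (!m.matches()) {
            return path; // real file, or not a compacted changelog name at all
        }
        // Assumption: the fake path lives in some bucket directory and the real
        // file lives in the sibling directory "bucket-<bid>".
        String bucketDir = slash < 0 ? "" : path.substring(0, slash);
        int parentSlash = bucketDir.lastIndexOf('/');
        String parentDir = parentSlash < 0 ? "" : bucketDir.substring(0, parentSlash + 1);
        // Drop the "-<off>-<len2>" part to obtain the real file name.
        String realName = m.group(1) + "$" + m.group(2) + "-" + m.group(3) + ".cc-" + m.group(6);
        return parentDir + "bucket-" + m.group(2) + "/" + realName;
    }

    /** Resolves every path and drops duplicates, keeping first-seen order. */
    static Set<String> resolveAll(List<String> paths) {
        Set<String> resolved = new LinkedHashSet<>();
        for (String p : paths) {
            resolved.add(resolve(p));
        }
        return resolved;
    }
}
```

With this sketch, two fake names that refer to different segments of the same real file collapse to a single entry, so an existence check would touch the real file only once.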
   
     ### Tests
   
     - Added unit tests in `CompactedChangelogPathResolverTest` to verify path 
resolution logic
     - Existing checkpoint recovery tests should now pass with compacted 
changelog files
   
     ### API and Format
   
     No changes to public API or storage format. This is an internal fix for 
file path resolution.
   
     ### Documentation
   
     No new features introduced. This is a bug fix for existing 
precommit-compact functionality.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.