krisnaru commented on PR #15975:
URL: https://github.com/apache/iceberg/pull/15975#issuecomment-4332559522

     * The incremental case is still broken *                                   
                                                                                
                                                        
                                                                                
                                                                                
                                                    
     Scenario: Incremental replication after append + RewriteManifests + 
snapshot expiration:                                                            
                                                           
                                                                                
                                                                                
                                                    
     1. T1: Incremental replication done. Source has {S1, S2}. Target has all 
files.                                                                          
                                                      
     2. Between T1 and T2: Append → S3 (file C), RewriteManifests → S4 (merges 
all manifests), Expire → S3 removed from metadata.
     3. T2: Next incremental replication (start=T1):                            
                                                                                
                                                    
       - endMetadata.snapshots() = {S1, S2, S4} (S3 expired)                    
                                                                                
                                                    
       - deltaSnapshotIds = {S4} (lines 370-372 in RewriteTablePathSparkAction) 
                                                                                
                                                    
       - S4's rewritten manifests are selected for processing 
(ManifestFile.snapshotId = S4) ✓                                                
                                                                      
       - But file C has entry.snapshotId() = S3, and {S4}.contains(S3) == false 
                                                                                
                                                    
       - File C is excluded from copyPlan — live data silently lost             
                                                                                
                                                    
                                                                                
                                                                                
                                                    
     The root cause: deltaSnapshotIds is derived from endMetadata.snapshots(), 
which excludes expired snapshots. When a file's adding snapshot is expired 
between replications, the entry-level filter drops it even though it's live and 
unreplicated.                                                                   
                                                                                
                         
                                                                                
                                                                                
                                                    
     Why snapshotId-based entry filtering is fundamentally fragile?             
                                                                                
                                                     
      
     entry.snapshotId() reflects the snapshot that originally added the file — 
which can be expired at any time by table maintenance. Using it as a 
correctness-critical filter for determining which files to copy means the copy 
plan's correctness depends on snapshot retention policy, which is outside the 
replication system's control.
                                                                                
                                                                                
                                                    
     A more robust approach (which we've tested extensively in our 
implementation) is:                                                             
                                         
     1. Include all live entries in the copy plan — filter by entry.isLive() 
only, no snapshotId check                                                       
                                                       
     2. Deduplicate for incremental copies via anti-join — compare collected 
files against files already present at the target (from startVersion), using a 
distributed anti-join rather than a snapshot ID set
     3. Recover expired snapshot IDs from version history — if snapshot ID 
filtering is needed at the manifest level, iterate through intermediate version 
files (not just endMetadata) to collect all snapshot IDs  including expired 
ones                                                                            
                                                                                
                             
                                                                                
                                                                                
                                                    
   This design is immune to any combination of RewriteManifests, 
RewriteDataFiles, and snapshot expiration between replications. We have 15+ 
integration tests covering every combination (append/overwrite/delete + rewrite 
manifests/data files + expire) and they all pass.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to