adp2201 commented on PR #14797:
URL: https://github.com/apache/iceberg/pull/14797#issuecomment-4049987731

   Thanks for continuing this work — the implementation and discussion here are 
very helpful.
   
   Given the mixed reports (works for some setups, occasional duplicates for 
others), could we agree on a clear correctness contract before this is 
merged?
   
   Specifically, it would help to have:
   1. A documented behavior matrix for CDC/upsert mode: deletion-vector (DV) 
path vs equality-delete fallback, merge-on-read vs copy-on-write (MOR/COW) 
expectations, partitioned vs unpartitioned,
   2. A deterministic integration test (or test matrix) that reproduces rapid 
consecutive updates to the same key across commit boundaries and validates no 
duplicate live rows,
   3. Explicit operational requirements in docs (required table props, 
compaction cadence, non-null identifier constraints, and known limitations).
   
   That would make it much easier for users to adopt safely and for maintainers 
to evaluate long-term support risk.
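
To make item 2 concrete, here is a rough sketch of the invariant I have in mind. It is an in-memory model only, not code from this PR and not Iceberg API: all names (`rapid_update_commits`, `expected_live_rows`, `duplicate_live_keys`, `check_no_duplicates`) and the `(id, value)` row shape are hypothetical, and `read_live_rows` stands in for whatever the real test would use to scan the table after the last commit.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, List, Tuple

Row = Tuple[int, str]   # (id, value); `id` plays the role of the identifier field
Commit = List[Row]      # upserts flushed together in one commit


def rapid_update_commits(key: int, updates: int, per_commit: int) -> List[Commit]:
    """Deterministic workload: `updates` upserts to one key, split across commits."""
    rows = [(key, f"v{i}") for i in range(updates)]
    return [rows[i:i + per_commit] for i in range(0, updates, per_commit)]


def expected_live_rows(commits: Iterable[Commit]) -> Dict[int, str]:
    """Reference model: the last upsert per key wins."""
    state: Dict[int, str] = {}
    for commit in commits:
        for key, value in commit:
            state[key] = value
    return state


def duplicate_live_keys(live_rows: Iterable[Row]) -> List[int]:
    """Keys that appear more than once among the rows a reader sees."""
    counts = Counter(key for key, _ in live_rows)
    return sorted(k for k, n in counts.items() if n > 1)


def check_no_duplicates(commits: List[Commit],
                        read_live_rows: Callable[[], List[Row]]) -> None:
    """Fail if the table has duplicate live rows or diverges from the model."""
    live = read_live_rows()  # hypothetical hook: scan the table under test
    dups = duplicate_live_keys(live)
    assert not dups, f"duplicate live rows for keys: {dups}"
    assert dict(live) == expected_live_rows(commits)
```

A real test would write each `Commit` through the sink under test, force a commit between batches, and run the check once per cell of the matrix from item 1. Because the workload is generated without timing or randomness, a failure should reproduce on every run.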


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

