xuyangzhong commented on PR #26879:
URL: https://github.com/apache/flink/pull/26879#issuecomment-3173232075

   > @xuyangzhong what do you think about this change? Is acceptable to simply 
update the affected test or do we need changes in `DuplicateChangesInferRule`? 
The current Flink behavior is incorrect: if a sink requests retract, the engine 
needs to guarantee that -U will be send, independent of the primary key 
declaration.
   
   Regarding this change for delta join, I've thought about it, and it seems 
necessary for the DuplicateChangesInferRule to return false when dealing with 
retract sinks (in 
https://github.com/apache/flink/blob/cae5fb4d3b6d9e0c10c3539ea4994fc1ad463b70/flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/plan/rules/physical/stream/DuplicateChangesInferRule.java#L169).
 This is because delta joins might generate redundant data, while retract sinks 
require data to appear in pairs as both '+' and '-' forms.
   
   This issue stands on its own, so I can go ahead and create a separate jira 
to resolve it if you prefer. It's entirely your decision.
   
   In summary, here are the situations regarding the failed cases below:
   
   - DuplicateChangesInferRuleTest.testSinkWithMaterialize: the 
`duplicateChanges` in upstream operator for sink should be `DISALLOW `.
   -  DeltaJoinTest.testCdcSource: just apply new changes.
   - DeltaJoinTest.testWithAggregatingAfterJoin: just apply new changes.
   - DeltaJoinTest.testWithAggregatingSourceTableBeforeJoin: just apply new 
changes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to