aIbrahiim commented on PR #37950:
URL: https://github.com/apache/beam/pull/37950#issuecomment-4132399063

   > Nice, the per-line split correctly removes the dependency on 
non-deterministic flush boundaries, and should indeed fix the flakes.
   > 
   > One question: the original assertion also implicitly verified 
intra-transaction ordering (e.g., that the DELETE for SingerId 1 always 
followed the INSERT for SingerId 1+2 within the same commit). With 
`containsInAnyOrder` on individual lines, that ordering property is no longer 
asserted.
   > 
   > Is that intentional, or would it be worth adding a lightweight ordering 
check per commit timestamp / transaction ID on top of this?
   
   @bvolpato Great point and you’re absolutely right. that relaxation is 
intentional for this test as the flake came from timer flush grouping, so I 
changed the assertion to validate stable semantics only (presence/count of 
records) and removed dependence on grouping order across flushes
   
   also this is right that this drops an implicit ordering signal. I dont want 
to lose that coverage entirely, so I can add a small follow up assertion that 
checks ordering using the existing sort key (commitTimestamp, transactionId) in 
a deterministic way, without relying on flush chunk boundaries


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to