featzhang commented on PR #27351: URL: https://github.com/apache/flink/pull/27351#issuecomment-4174913018
The CI failure in `DataGeneratorSourceITCase#testGatedRateLimiter` is **unrelated to this PR's changes**.

**Root Cause Analysis:**

This PR only modifies files under `docs/`, `flink-python/`, and `flink-table/` for the URL decoding functionality. The failing test is located in `flink-connectors/flink-connector-datagen-test/`, so there is **zero overlap** between the changed modules and the failing test module.

The failure is caused by a pre-existing race condition in `FirstCheckpointFilter`: the checkpoint barrier can arrive before all upstream elements emitted in the same checkpoint cycle by `GatedRateLimiter` have been processed, causing the filter to prematurely discard elements and making the assertion `assertThat(results).hasSize(capacityPerCheckpoint)` fail non-deterministically.

**Fix:**

This flaky test has been fixed in a separate PR: https://github.com/featzhang/flink/tree/fix-datagen-flaky-test (tracking issue: [FLINK-39388](https://issues.apache.org/jira/browse/FLINK-39388)). The fix refactors `FirstCheckpointFilter` to implement `CheckpointListener` in addition to `CheckpointedFunction`, moving the element cutoff logic from `snapshotState()` to `notifyCheckpointComplete()`, so that collection stops only after the first checkpoint has fully completed and all in-flight elements have been processed downstream.

This PR (#27351) itself is not affected by the flaky test and should be mergeable once the CI re-run passes.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
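For reference, the cutoff change described under **Fix** can be sketched roughly as below. This is a simplified, plain-Java model and not the code from the fix branch: `SnapshotHook` and `CompletionHook` are hypothetical stand-ins for Flink's `CheckpointedFunction` and `CheckpointListener` (the real `snapshotState` takes a context argument, omitted here), and the `cutOffOnCompletion` flag and the `run` helper exist only to compare the two behaviours side by side. The sketch assumes that an element from the first checkpoint cycle can reach the filter after its snapshot has been taken but before the checkpoint is reported complete.

```java
import java.util.ArrayList;
import java.util.List;

public class FirstCheckpointFilterSketch {

    /** Hypothetical stand-in for the snapshot hook of Flink's CheckpointedFunction. */
    interface SnapshotHook {
        void snapshotState();
    }

    /** Hypothetical stand-in for the completion hook of Flink's CheckpointListener. */
    interface CompletionHook {
        void notifyCheckpointComplete(long checkpointId);
    }

    /** Keeps elements until the first checkpoint is considered done, then drops the rest. */
    static final class Filter implements SnapshotHook, CompletionHook {
        private final boolean cutOffOnCompletion; // illustration-only switch
        private final List<Integer> collected = new ArrayList<>();
        private boolean firstCheckpointDone = false;

        Filter(boolean cutOffOnCompletion) {
            this.cutOffOnCompletion = cutOffOnCompletion;
        }

        void processElement(int value) {
            if (!firstCheckpointDone) {
                collected.add(value);
            }
        }

        @Override
        public void snapshotState() {
            // Behaviour before the fix: stop collecting as soon as the snapshot is taken.
            if (!cutOffOnCompletion) {
                firstCheckpointDone = true;
            }
        }

        @Override
        public void notifyCheckpointComplete(long checkpointId) {
            // Behaviour after the fix: stop collecting only once the checkpoint has completed.
            if (cutOffOnCompletion) {
                firstCheckpointDone = true;
            }
        }

        List<Integer> results() {
            return collected;
        }
    }

    /** Replays one assumed interleaving and returns how many elements were kept. */
    static int run(boolean cutOffOnCompletion) {
        Filter filter = new Filter(cutOffOnCompletion);
        filter.processElement(1);
        filter.processElement(2);
        filter.snapshotState();              // snapshot for the first checkpoint
        filter.processElement(3);            // straggler from the same checkpoint cycle
        filter.notifyCheckpointComplete(1L); // first checkpoint fully completed
        filter.processElement(4);            // next cycle, should always be dropped
        return filter.results().size();
    }

    public static void main(String[] args) {
        System.out.println("cutoff in snapshotState(): " + run(false));            // prints 2
        System.out.println("cutoff in notifyCheckpointComplete(): " + run(true));  // prints 3
    }
}
```

With three elements expected per checkpoint in this model, the snapshot-time cutoff keeps only two, which mirrors the failing `hasSize(capacityPerCheckpoint)` assertion, while the completion-time cutoff keeps all three and still discards the element from the next cycle.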
