lostluck commented on PR #29439: URL: https://github.com/apache/beam/pull/29439#issuecomment-1820251049
I've bundled fixes for two other flaky tests into this PR as well, since their failures were getting conflated across runs. Moving to a proper errgroup for limiting parallelism, with automatic cancellation of the other in-progress bundles when one bundle fails, seems to have resolved the remaining flakiness (locally, at least).

The issue was that execution and error handling weren't handled uniformly. That left a gap between bundle completion (which lets the element manager exit successfully with all work completed) and the actual handling of an error, so an error could be missed and a failing run reported as successful (a false positive).

This got bigger than anticipated, so I'm happy to split it into component parts at the reviewers' discretion.
