lhotari commented on pull request #10148: URL: https://github.com/apache/pulsar/pull/10148#issuecomment-817138225
> I think the current issue is also that we're grouping too many flaky tests together. They take a lot of time to execute, and if some of them time out, the remaining tests don't even get a chance to be executed or retried. The normal flow usually has 3 retries, but the flaky group can barely finish a single run.

That is true before this PR goes in. :)

The root cause of the CI problems has been that a lot of resources weren't properly released in tests, which caused memory and thread leaks. The shutdown sequence of the broker wasn't synchronous, so some tests used a lot of resources: the shutdown of the previous broker(s) could still be executing when the next test started. This, together with quite a few PulsarClient and ExecutorService leaks, has been the root cause of many CI and test problems.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
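
To illustrate the point about synchronous shutdown, here is a minimal sketch using only JDK classes. `FakeService` and `runOneTest` are hypothetical names made up for this example; they are not Pulsar classes and this is not the code from the PR. The idea is only that teardown should block until the resource's threads have actually stopped, so the next test doesn't start while the previous one is still shutting down.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class SynchronousTeardownSketch {

    /** Hypothetical stand-in for a test resource that owns threads (not a Pulsar class). */
    static final class FakeService implements AutoCloseable {
        private final ExecutorService executor = Executors.newFixedThreadPool(2);

        void submitWork(Runnable task) {
            executor.submit(task);
        }

        boolean isTerminated() {
            return executor.isTerminated();
        }

        /** Blocks until the worker threads have stopped, instead of returning immediately. */
        @Override
        public void close() {
            executor.shutdown(); // stop accepting new tasks, let queued ones finish
            try {
                // Assumed timeout of 10 seconds; force shutdown if it is exceeded.
                if (!executor.awaitTermination(10, TimeUnit.SECONDS)) {
                    executor.shutdownNow();
                }
            } catch (InterruptedException e) {
                executor.shutdownNow();
                Thread.currentThread().interrupt(); // preserve the interrupt flag
            }
        }
    }

    /** Mimics one test: use the resource, then always release it in a finally block. */
    static boolean runOneTest() {
        FakeService service = new FakeService();
        try {
            service.submitWork(() -> {
                try {
                    Thread.sleep(50); // simulated background work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        } finally {
            service.close(); // returns only after the executor has terminated
        }
        return service.isTerminated();
    }

    public static void main(String[] args) {
        System.out.println("terminated after teardown: " + runOneTest());
    }
}
```

If `close()` only called `executor.shutdown()` and returned, the background task could still be running when the next test began, which is the kind of overlap described above.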
