pvary commented on PR #15312: URL: https://github.com/apache/iceberg/pull/15312#issuecomment-4352158562
@rdblue: Thanks for the detailed reply. This helps clarify where our views differ. > First, I'm assuming that we are only worrying about the exiting pools. Agreed. > The other pools should have their own lifecycle management. Fully agree. > The non-exiting pools are what I would expect FileIO instances to use, unless the pools are shared. In practice, all `FileIO` implementations currently rely on static exiting worker pools. The pattern is that these pools are created on first use and then shared across instances. That approach has two advantages: - The threads are used only occasionally, so sharing them keeps the overall thread footprint small. - The pool size naturally bounds the level of outgoing parallelism. > This PR proposes that we add: > - A way to remove shutdown hooks from all thread pools > - A global shutdown for all thread pools created by ThreadPools The naming may indeed need improvement, but the proposed shutdown only targets the **exiting** pools created by `newExitingWorkerPool` and `newExitingScheduledPool`. These are exactly the pools where we currently lack direct control over shutdown. > the problem you're saying we need to solve is that the pools created for FileIO have inconsistently managed lifecycles. From my perspective, the pattern across `FileIO` implementations is actually quite consistent: they require a global shared pool per `FileIO` type, reused across instances and normally closed only on exit. The issue is that “exit” is not always equivalent to a JVM shutdown. In cases like hot replacement, we need a way to release those resources even while the JVM keeps running. Based on this discussion, I see a few possible directions: 1. Adopt the previously mentioned (but somewhat invasive) change to the `FileIO API`. 2. Introduce a `ThreadPoolManager` outside of `ThreadPools` and migrate `FileIO` implementations to use managed pools. 3. Keep the `ThreadPoolManager` within `ThreadPools`, but don’t let it manage the standard pools (`WORKER_POOL`, `DELETE_WORKER_POOL`, `AuthRefreshPoolHolder.INSTANCE`). Independently of which path we take, I think we should ensure that standard pools are lazily initialized, and that every place using them (using the methods: `getWorkerPool`, `getDeleteWorkerPool`, `authRefreshPool`) allows injecting an external pool as a substitute. I would prefer 2 or 3. Do you think it would be reasonable? Do you have another suggestion? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
