pvary commented on PR #15312:
URL: https://github.com/apache/iceberg/pull/15312#issuecomment-4342379909
@rdblue
> Allowing them to be closed allows one class to break functionality for
another.
Technically, nothing prevents users from shutting down the static pools
already, for example:
```
ThreadPools.getWorkerPool().shutdown();
```
Today we mostly rely on reviews to prevent this, and in some cases it is not
necessarily obvious. If the goal is to guarantee that users cannot shut these
pools down, that could be addressed more directly (e.g., by wrapping the
returned executor), but that safeguard does not exist today.
Changing all usages of `newExitingWorkerPool` / `newExitingScheduledPool`
would also be fairly invasive. These thread pools are used across several
`FileIO` implementations (e.g., `HadoopFileIO` → `newExitingWorkerPool`,
`S3FileIO` / `GCSFileIO` → `newExitingScheduledPool`, `ADLSFileIO` →
`getWorkerPool`).
Do you think a better long‑term direction would be to change
`FileIO.initialize(Map<String, String> properties)` to something like:
```
FileIO.initialize(
Map<String, String> properties,
Function<Class<? extends FileIO>, ScheduledExecutorService>
executorFactory)
```
and propagate this through the `FileIO` implementations?
In practice, `FileIO` instances are often created by catalogs, which suggest
that we change the `Catalog.initialize(String name, Map<String, String>
properties)` to something like:
```
Catalog.initialize(
String name,
Map<String, String> properties,
Function<Class<? extends FileIO>, ScheduledExecutorService>
executorFactory)
```
and propagate this through the `Catalog` implementations and change the
`CatalogUtil.loadCatalog(String impl, String catalogName, Map<String, String>
properties, Object hadoopConf)` to allow setting the factory.
The FileIO implementations can't manage the lifecycle for the pools as they
don't know if they should keep the shared resources open or they should close
them too.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]