Re: [I] Implement `Closable` interface for class `HiveCatalog` and `HiveClientPool` [iceberg]
yuqi1129 commented on issue #10100: URL: https://github.com/apache/iceberg/issues/10100#issuecomment-2056806262 > @yuqi1129: Caffeine suggests using Java9 system scheduler if possible. [Wiki: Cleanup (ben-manes/caffeine)](https://github.com/ben-manes/caffeine/wiki/Cleanup) > > Would it work for you, or the specific classpath handling you are using, would cause issues with it? Unfortunately, we plan to support Java8 , what's more, the `Cache` instance is internal field of `CachedClientPool`, I'm afraid we can't change it except for the reflection mechanism. Let me explain our user cases. The Gravitino supports creating multiple catalogs, for example, assuming we have created 3 Iceberg catalogs ,each Iceberg catalog will use a separate classloader to load iceberg-related jars (include `iceberg-hive-metastore.jar`) to provide service. After some time, we do not need a specific catalog and we plan to drop it. When we plan to close the corresponding classloader, it cannot be GC by the JVM as there is still a daemon thread mentioned before in the issue, so we need a mechanism to explicitly close the daemon thread. can you give me some advice, thanks. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org
Re: [I] Implement `Closable` interface for class `HiveCatalog` and `HiveClientPool` [iceberg]
pvary commented on issue #10100: URL: https://github.com/apache/iceberg/issues/10100#issuecomment-2048947358 @yuqi1129: Caffeine suggests using Java9 system scheduler if possible. https://github.com/ben-manes/caffeine/wiki/Cleanup Would it work for you, or the specific classpath handling you are using, would cause issues with it? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org
Re: [I] Implement `Closable` interface for class `HiveCatalog` and `HiveClientPool` [iceberg]
yuqi1129 commented on issue #10100: URL: https://github.com/apache/iceberg/issues/10100#issuecomment-2044152196 > @yuqi1129: Do I understand correctly, that your main issue is that the cleaner threads are remaining, and those are the only ones causing the issue? Yeah, I want a mechanism to explicitly close them. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org
Re: [I] Implement `Closable` interface for class `HiveCatalog` and `HiveClientPool` [iceberg]
yuqi1129 commented on issue #10100: URL: https://github.com/apache/iceberg/issues/10100#issuecomment-2044151328 > Can you use the daemon thread from the jdk (via systemScheduler)? Then how can I use the `systemScheduler` here? I can't change the following code from an external project as it's part of the Iceberg inner class. ``` // Code block from CachedClientPool#init clientPoolCache = Caffeine.newBuilder() .expireAfterAccess(evictionInterval, TimeUnit.MILLISECONDS) .removalListener((ignored, value, cause) -> ((HiveClientPool) value).close()) .scheduler( // This daemon thread can't be explicitly closed. Scheduler.forScheduledExecutorService( ThreadPools.newScheduledPool("hive-metastore-cleaner", 1))) .build(); ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org
Re: [I] Implement `Closable` interface for class `HiveCatalog` and `HiveClientPool` [iceberg]
pvary commented on issue #10100: URL: https://github.com/apache/iceberg/issues/10100#issuecomment-2044147391 @yuqi1129: Do I understand correctly, that your main issue is that the cleaner threads are remaining, and those are the only ones causing the issue? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org
Re: [I] Implement `Closable` interface for class `HiveCatalog` and `HiveClientPool` [iceberg]
ben-manes commented on issue #10100: URL: https://github.com/apache/iceberg/issues/10100#issuecomment-2044074963 Can you use the daemon thread from the jdk (via systemScheduler)? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org
Re: [I] Implement `Closable` interface for class `HiveCatalog` and `HiveClientPool` [iceberg]
yuqi1129 commented on issue #10100: URL: https://github.com/apache/iceberg/issues/10100#issuecomment-2042946236 > @yuqi1129: Could setting the `client.pool.cache.eviction-interval-ms`, and decreasing the `clients` size help your case? I'm afraid it won't work for me, decreasing `client.pool.cache.eviction-interval-ms` will not affect the fact that thread pool `hive-metastore-cleaner` is a daemon thread, only when the JVM exit can it be closed automatically. We can't close it in an isolated class loader as we won't be able to get its reference. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org
Re: [I] Implement `Closable` interface for class `HiveCatalog` and `HiveClientPool` [iceberg]
pvary commented on issue #10100: URL: https://github.com/apache/iceberg/issues/10100#issuecomment-2042811946 @yuqi1129: Could setting the `client.pool.cache.eviction-interval-ms`, and decreasing the `clients` size help your case? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org
Re: [I] Implement `Closable` interface for class `HiveCatalog` and `HiveClientPool` [iceberg]
yuqi1129 commented on issue #10100: URL: https://github.com/apache/iceberg/issues/10100#issuecomment-2041959506 @pvary Thank you for your kind reply. The reason we need to close the `HiveCatalog` and `HiveClientPool` is that we are working on a metadata management system https://github.com/datastrato/gravitino and Gravitino will manage several catalogs. Each Iceberg catalog will have the following operations: 1. Create an Iceberg catalog. In this step, Gravitino will use a `IsolatedClassloader` to load classes that belongs to Icebergs catalog 2. `LoadTable`, `LoadSchema`, and other operations under the classloader 3. Finally, we are going to drop the Iceberg catalog if it's no longer needed, so we need to close the catalog, close the classloader, and release associated resources. However, due to a daemon thread introduced by `CachedClientPool` (Implementation class of CachedClientPool), the classloader can't be GC and thus will leads to OOM eventually. ```java // Code block from CachedClientPool#init clientPoolCache = Caffeine.newBuilder() .expireAfterAccess(evictionInterval, TimeUnit.MILLISECONDS) .removalListener((ignored, value, cause) -> ((HiveClientPool) value).close()) .scheduler( // This daemon thread can't be explicitly closed. Scheduler.forScheduledExecutorService( ThreadPools.newScheduledPool("hive-metastore-cleaner", 1))) .build(); ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org
Re: [I] Implement `Closable` interface for class `HiveCatalog` and `HiveClientPool` [iceberg]
pvary commented on issue #10100: URL: https://github.com/apache/iceberg/issues/10100#issuecomment-2041879370 This comes up from time-to-time. Like: https://github.com/apache/iceberg/issues/1502#issuecomment-698635234 The original intention was to keep the HMS client pool open between catalog creations, since opening the connection has a significant overhead, and a typical user uses only a single pool. What is your use case which needs this behavior change? Thanks, Peter -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org