[jira] [Commented] (FLINK-20614) Registered sql drivers not deregistered after task finished in session cluster
[ https://issues.apache.org/jira/browse/FLINK-20614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17335978#comment-17335978 ]

Flink Jira Bot commented on FLINK-20614:

This issue was labeled "stale-major" 7 days ago and has not received any updates, so it is being deprioritized. If this ticket is actually Major, please raise the priority and ask a committer to assign you the issue or revive the public discussion.

> Registered sql drivers not deregistered after task finished in session cluster
> --
>
> Key: FLINK-20614
> URL: https://issues.apache.org/jira/browse/FLINK-20614
> Project: Flink
> Issue Type: Bug
> Components: Connectors / JDBC, Runtime / Task
> Affects Versions: 1.12.0, 1.13.0
> Reporter: Kezhu Wang
> Priority: Major
> Labels: stale-major
>
> {{DriverManager}} keeps registered drivers in its internal data structures, which prevents them from being garbage-collected after the task has finished. I confirmed this in a standalone session cluster by observing that {{ChildFirstClassLoader}} could not be reclaimed after several {{GC.run}} invocations; the problem should exist in all session clusters.
>
> Tomcat documents [this|https://ci.apache.org/projects/tomcat/tomcat85/docs/jndi-datasource-examples-howto.html#DriverManager,_the_service_provider_mechanism_and_memory_leaks] and fixes/circumvents it with [JdbcLeakPrevention|https://github.com/apache/tomcat/blob/master/java/org/apache/catalina/loader/JdbcLeakPrevention.java#L30].
>
> Should we solve this in the runtime? Or should we treat it as the responsibility of connectors and clients to solve it using {{RuntimeContext.registerUserCodeClassLoaderReleaseHookIfAbsent}} or similar?
>
> Personally, I think it would be nice to solve this in the runtime as a catch-all, to avoid memory leaks and to provide consistent behavior to clients across per-job and session mode.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
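A minimal sketch of the kind of cleanup the description asks about, using only the JDK. It is not how Flink or Tomcat implement it; the names JdbcDriverCleanup, deregisterDriversLoadedBy, NoOpDriver and runDemo are hypothetical and exist only for illustration. One assumption worth stating loudly: {{DriverManager.getDrivers}} and {{DriverManager.deregisterDriver}} check the class loader of their caller, so a helper like this most likely has to be loaded by the user-code class loader itself (for example, shipped in the job JAR) to see the drivers it is meant to remove.

```java
import java.sql.Connection;
import java.sql.Driver;
import java.sql.DriverManager;
import java.sql.DriverPropertyInfo;
import java.sql.SQLException;
import java.sql.SQLFeatureNotSupportedException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Properties;
import java.util.logging.Logger;

// Hypothetical helper for illustration; not part of Flink or Tomcat.
final class JdbcDriverCleanup {

    private JdbcDriverCleanup() {}

    /**
     * Deregisters every driver visible to this class whose class was loaded by
     * the given class loader. Returns the class names of the removed drivers.
     */
    static List<String> deregisterDriversLoadedBy(ClassLoader loader) {
        List<String> removed = new ArrayList<>();
        // Copy the enumeration first so deregistering cannot disturb iteration.
        for (Driver driver : Collections.list(DriverManager.getDrivers())) {
            if (driver.getClass().getClassLoader() != loader) {
                continue; // owned by another class loader, leave it registered
            }
            try {
                DriverManager.deregisterDriver(driver);
                removed.add(driver.getClass().getName());
            } catch (SQLException | SecurityException e) {
                // Best effort: report the failure and keep going.
                System.err.println("Could not deregister " + driver + ": " + e);
            }
        }
        return removed;
    }

    /** Do-nothing driver that stands in for a real JDBC driver in the demo. */
    static final class NoOpDriver implements Driver {
        @Override public Connection connect(String url, Properties info) { return null; }
        @Override public boolean acceptsURL(String url) { return false; }
        @Override public DriverPropertyInfo[] getPropertyInfo(String url, Properties info) {
            return new DriverPropertyInfo[0];
        }
        @Override public int getMajorVersion() { return 1; }
        @Override public int getMinorVersion() { return 0; }
        @Override public boolean jdbcCompliant() { return false; }
        @Override public Logger getParentLogger() throws SQLFeatureNotSupportedException {
            throw new SQLFeatureNotSupportedException();
        }
    }

    /** Registers the demo driver, then removes everything this class loader owns. */
    static List<String> runDemo() {
        try {
            DriverManager.registerDriver(new NoOpDriver());
        } catch (SQLException e) {
            throw new IllegalStateException(e);
        }
        return deregisterDriversLoadedBy(JdbcDriverCleanup.class.getClassLoader());
    }

    public static void main(String[] args) {
        System.out.println("Deregistered: " + runDemo());
    }
}
```

In a job, a call such as deregisterDriversLoadedBy(getClass().getClassLoader()) could presumably be wrapped in a Runnable and handed to the {{RuntimeContext.registerUserCodeClassLoaderReleaseHookIfAbsent}} hook named in the description; the exact signature of that hook should be checked against the Flink version in use.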
[jira] [Commented] (FLINK-20614) Registered sql drivers not deregistered after task finished in session cluster
[ https://issues.apache.org/jira/browse/FLINK-20614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17327460#comment-17327460 ]

Flink Jira Bot commented on FLINK-20614:

This major issue is unassigned, and neither it nor any of its Sub-Tasks has been updated for 30 days, so it has been labeled "stale-major". If this ticket is indeed "major", please either assign yourself or give an update. Afterwards, please remove the label. In 7 days the issue will be deprioritized.
[jira] [Commented] (FLINK-20614) Registered sql drivers not deregistered after task finished in session cluster
[ https://issues.apache.org/jira/browse/FLINK-20614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17252707#comment-17252707 ]

Flavio Pompermaier commented on FLINK-20614:

Thank you [~kezhuw] for all your insights. I understand that running Flink in per-job mode is the easiest way to "solve" this issue, but not all customers can afford to manage a YARN or Kubernetes cluster, especially when you're trying to sell a solution. To me it's very important to have a fix that properly unloads JDBC drivers, and I agree with you that this "dirty work" should be done transparently for users.
[jira] [Commented] (FLINK-20614) Registered sql drivers not deregistered after task finished in session cluster
[ https://issues.apache.org/jira/browse/FLINK-20614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17252423#comment-17252423 ]

Kezhu Wang commented on FLINK-20614:

[~chesnay] Sorry to ping you again. I thought about the fix suggested in [debugging_classloading|https://ci.apache.org/projects/flink/flink-docs-release-1.12/ops/debugging/debugging_classloading.html#unloading-of-dynamically-loaded-classes-in-user-code]. I think it is unfriendly to end users, as it requires them to change their deployment and treats JDBC drivers differently from other dependencies. Compared to the other two cases, "Lingering Threads" and "Interners", users are largely innocent here: they are not writing bad-practice or buggy code. The issue is actually introduced by the JDK, but as a framework, it would be good for users if the dirty work were done on the Flink side rather than on theirs. It is a once-and-for-all solution: we do it once in Flink, and all users benefit from then on.

Aside from this discussion, I will dig into the JDK to find out why [these issues|https://ci.apache.org/projects/tomcat/tomcat85/docs/jndi-datasource-examples-howto.html#DriverManager,_the_service_provider_mechanism_and_memory_leaks] were not fixed and whether there is any possibility of fixing them in a future JDK version. It may take a long time; I will post what I find here.
[jira] [Commented] (FLINK-20614) Registered sql drivers not deregistered after task finished in session cluster
[ https://issues.apache.org/jira/browse/FLINK-20614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17250391#comment-17250391 ]

Kezhu Wang commented on FLINK-20614:

[~chesnay] Sorry, I was not aware of FLINK-19005 before. Yeah, it is dirty. But if we don't do the dirty work, users may run into FLINK-19005 or [debugging_classloading|https://ci.apache.org/projects/flink/flink-docs-release-1.12/ops/debugging/debugging_classloading.html#unloading-of-dynamically-loaded-classes-in-user-code] if they use JDBC in session mode, which is probably not a good experience.

[~f.pompermaier] "jdbc is a mess", I agree with [~chesnay]. Tomcat documents two issues in their [docs|https://ci.apache.org/projects/tomcat/tomcat85/docs/jndi-datasource-examples-howto.html#DriverManager,_the_service_provider_mechanism_and_memory_leaks]; the other one is that new drivers will not be auto-discovered by {{DriverManager.getConnection}}.

Probably the ultimate solution for clients is to use per-job mode with a dedicated resource management cluster, especially in production environments. This should also avoid the other class loader issues documented in [debugging_classloading|https://ci.apache.org/projects/flink/flink-docs-release-1.12/ops/debugging/debugging_classloading.html#unloading-of-dynamically-loaded-classes-in-user-code].
[jira] [Commented] (FLINK-20614) Registered sql drivers not deregistered after task finished in session cluster
[ https://issues.apache.org/jira/browse/FLINK-20614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17250199#comment-17250199 ]

Flavio Pompermaier commented on FLINK-20614:

I think that the Flink documentation is not very clear about where to put the JDBC dependencies: within the fat JAR or in the lib folder (for example, Fabian at [1] says that there should be no difference). In my opinion, the way to solve the JDBC driver leak deserves a dedicated blog post.

[1] https://stackoverflow.com/questions/40528002/jdbc-driver-cannot-be-found-when-reading-a-dataset-from-an-sql-database-in-apach
[jira] [Commented] (FLINK-20614) Registered sql drivers not deregistered after task finished in session cluster
[ https://issues.apache.org/jira/browse/FLINK-20614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17250164#comment-17250164 ]

Flavio Pompermaier commented on FLINK-20614:

Could this problem also be related to Java tasks not being closed properly, as in FLINK-20333?
[jira] [Commented] (FLINK-20614) Registered sql drivers not deregistered after task finished in session cluster
[ https://issues.apache.org/jira/browse/FLINK-20614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17250028#comment-17250028 ]

Chesnay Schepler commented on FLINK-20614:

This issue was documented in FLINK-19005 (https://ci.apache.org/projects/flink/flink-docs-release-1.12/ops/debugging/debugging_classloading.html#unloading-of-dynamically-loaded-classes-in-user-code).

The Tomcat approach pretty much does what I suggested in FLINK-19005. Looking at what they [actually have to do to make it work|https://github.com/apache/tomcat/blob/efc6af6778ff3c1605d8b053f6fd2a4d9fd8e0d3/java/org/apache/catalina/loader/WebappClassLoaderBase.java#L1673], I'd rather not actually go down that route.