ashishdeole opened a new issue, #25510: URL: https://github.com/apache/beam/issues/25510
### What happened?

### Pre-requisites

An independent Flink cluster must be running (either standalone or using the Kubernetes resource provider). We used Flink 1.14.5. The requirement is to use Flink's session mode for job submission.

### Steps to reproduce

1. Use the WordCount example from Beam (reproduced with the latest stable release and also with 2.38.0).
2. Run the WordCount pipeline on the Flink cluster.
3. Submit the same job multiple times. Depending on the JVM configuration, after some number of jobs we get `OutOfMemoryError: Metaspace`.
4. Even without waiting for the OutOfMemoryError, a heap dump shows that `ChildFirstClassLoader` is not garbage collected after each job finishes, which leaks memory.

### Analysis

1. When jobs are submitted to Flink in session mode, Flink runs each submitted job on a `jobmanager-io-thread` taken from a thread pool.
2. Beam's `PipelineOptionsFactory` holds a `DefaultDeserializationContext.Impl` in a `ThreadLocal`.
3. `DefaultDeserializationContext.Impl` is loaded by `ChildFirstClassLoader`. After the job completes, the job manager thread returns to the pool, but this `ThreadLocal` is not removed, which leads to the memory leak.

Please find attached the heap dump produced from the OutOfMemoryError (Metaspace) after submitting the WordCount job 4-5 times (`jobmanager.memory.jvm-metaspace.size: 100mb` was deliberately set to reproduce the issue).

### References

1. https://cwiki.apache.org/confluence/display/FLINK/Debugging+ClassLoader+leaks
2. https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PipelineOptionsFactory.java
3. The deserialization context was made thread-local in this [PR](https://github.com/apache/beam/pull/16680) for bug BEAM-13782.
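The mechanism described in the Analysis section can be sketched in isolation. This is a minimal illustration, not Beam or Flink code: `CONTEXT`, `valueSurvivesJob`, and the class name are hypothetical, and a plain `Object` stands in for an instance whose class was loaded by the job's class loader. It shows that a value set in a `ThreadLocal` by one task on a pooled thread is still reachable when the next task runs on that thread, unless the first task calls `remove()`.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadLocalLeakSketch {
    // Hypothetical stand-in for a per-thread deserialization context.
    private static final ThreadLocal<Object> CONTEXT = new ThreadLocal<>();

    /**
     * Runs one "job" on a single-thread pool, then asks a second task on the
     * same pooled thread whether the first job's value is still present.
     */
    static boolean valueSurvivesJob(boolean cleanUp) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // First task: the "job" that populates the thread-local.
            pool.submit(() -> {
                try {
                    CONTEXT.set(new Object());
                } finally {
                    if (cleanUp) {
                        // Drop the reference before the thread returns to the pool.
                        CONTEXT.remove();
                    }
                }
            }).get();
            // Second task: runs on the same worker thread after the job is done.
            return pool.submit(() -> CONTEXT.get() != null).get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("without remove(): " + valueSurvivesJob(false)); // true
        System.out.println("with remove():    " + valueSurvivesJob(true));  // false
    }
}
```

In the scenario reported here, the surviving value would keep its class, and therefore the job's `ChildFirstClassLoader`, reachable from the pooled thread. Whether calling `remove()` is the appropriate fix inside Beam is an assumption of this sketch, not something established above.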
### Issue Priority

Priority: 2 (default / most bugs should be filed as P2)

### Issue Components

- [ ] Component: Python SDK
- [X] Component: Java SDK
- [ ] Component: Go SDK
- [ ] Component: Typescript SDK
- [ ] Component: IO connector
- [ ] Component: Beam examples
- [ ] Component: Beam playground
- [ ] Component: Beam katas
- [ ] Component: Website
- [ ] Component: Spark Runner
- [ ] Component: Flink Runner
- [ ] Component: Samza Runner
- [ ] Component: Twister2 Runner
- [ ] Component: Hazelcast Jet Runner
- [ ] Component: Google Cloud Dataflow Runner

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org.apache.org

For queries about this service, please contact Infrastructure at: us...@infra.apache.org