Micah Wylde created BEAM-5640:
---------------------------------

    Summary: Portable python sdk worker leaks memory when PyOpenSSL package is present
    Key: BEAM-5640
    URL: https://issues.apache.org/jira/browse/BEAM-5640
    Project: Beam
    Issue Type: Bug
    Components: sdk-py-harness
    Reporter: Micah Wylde
    Assignee: Robert Bradshaw

When the PyOpenSSL package is installed on a system (e.g., in a virtualenv), the Python sdk_worker process leaks memory. I've validated this with the Flink portable runner in streaming mode, but it may occur in other configurations as well. The leak is significant, amounting to tens of MB/sec.

I've put together a reproduction for the issue [here|https://github.com/mwylde/beam/tree/micah_memory_leak]. That branch includes a Flink streaming data source that generates data, as well as a Python pipeline that demonstrates the issue.

To reproduce:
{code:java}
# check out the branch
$ git clone g...@github.com:mwylde/beam.git
$ cd beam
$ git checkout micah_memory_leak

# build the python docker container with pyopenssl installed
$ ./gradlew :beam-sdks-python-container:docker

# start the job server with embedded flink cluster
$ ./gradlew runShadow

# run the pipeline
$ ./gradlew :beam-sdks-python:streamingLeak
{code}
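
For anyone trying to confirm the leak in their own environment, the following may help. It checks whether PyOpenSSL is importable and estimates how fast a process's memory grows. This is only a sketch and not part of the reproduction branch: the function names are hypothetical, RSS is read from /proc (so Linux only), and the pid of the sdk_worker process has to be looked up by hand.

```python
import importlib.util
import time


def pyopenssl_installed():
    """Return True if PyOpenSSL (import name ``OpenSSL``) is importable here."""
    return importlib.util.find_spec("OpenSSL") is not None


def read_rss_bytes(pid):
    """Read a process's resident set size from /proc (Linux only)."""
    with open(f"/proc/{pid}/status") as status:
        for line in status:
            if line.startswith("VmRSS:"):
                # Line format: "VmRSS:     12345 kB"
                return int(line.split()[1]) * 1024
    raise RuntimeError(f"VmRSS not found for pid {pid}")


def growth_rate_mb_per_sec(samples):
    """Least-squares slope of (seconds, rss_bytes) samples, in MiB per second."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_r = sum(r for _, r in samples) / n
    cov = sum((t - mean_t) * (r - mean_r) for t, r in samples)
    var = sum((t - mean_t) ** 2 for t, _ in samples)
    if var == 0:
        raise ValueError("samples must span more than one timestamp")
    return cov / var / (1024 * 1024)


def watch(pid, interval=1.0, count=30):
    """Sample the RSS of `pid` `count` times and return its growth in MiB/s."""
    start = time.monotonic()
    samples = []
    for _ in range(count):
        samples.append((time.monotonic() - start, read_rss_bytes(pid)))
        time.sleep(interval)
    return growth_rate_mb_per_sec(samples)
```

Running watch(pid) against the sdk_worker pid while the pipeline is running would be expected to show steady growth if the leak is present. Comparing the result with and without PyOpenSSL installed should show whether the package is the trigger.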