Jaydeep Karia created FLINK-38161:
-------------------------------------
Summary: PyFlink: Unable to find Pemja library in thread mode when
Python dependencies are installed in a separate layer
Key: FLINK-38161
URL: https://issues.apache.org/jira/browse/FLINK-38161
Project: Flink
Issue Type: Bug
Components: API / Python
Affects Versions: 1.20.2, 1.19.3
Reporter: Jaydeep Karia
*Problem Summary*
PyFlink applications running in *thread mode* fail at startup with a
{{RuntimeException: Failed to find PemJa Library}} when Python dependencies are
installed into a non-standard {{site-packages}} directory. This is a common
scenario in modern containerized environments, particularly when using tools
like Cloud Native Buildpacks, which create separate layers for dependencies.
h3. Environment & Steps to Reproduce
This issue occurs in environments where {{pip}} installs packages into a
directory that is different from the system's primary Python {{site-packages}}
folder.
# *Set up an environment* with layered {{site-packages}} directories. For
example, using buildpacks might result in a file structure like this:
** *System {{site-packages}}:*
{{/layers/<python-buildpack>/runtime-depends/opt/bb/lib/python3.11/site-packages}}
** *Pip-installed {{site-packages}}:*
{{/layers/<pip-buildpack>/requirements/lib/python3.11/site-packages/}}
# *Install {{apache-flink}} and {{pemja}}* into the pip layer. The current
{{flink-python}} distributions pin the {{pemja}} dependency to version
{{0.4.1}}.
# *Execute a PyFlink job* that is configured to run in {{THREAD_MODE}}.
# *Observe the failure*. The job fails immediately on startup with a
{{RuntimeException}}, as Flink cannot locate the Pemja native library.
h3. Root Cause Analysis 🔍
The version of {{pemja}} currently pinned by {{flink-python}}
({{pemja==0.4.1}}) attempts to locate its native library by executing the
following Python code to find the {{site-packages}} path:
{{import sysconfig; print(sysconfig.get_paths()["purelib"])}}
This implementation can be seen in the [pemja 0.4.1 source code on
GitHub|https://github.com/alibaba/pemja/blob/release-0.4/src/main/java/pemja/utils/CommonUtils.java#L39].
The core problem is that {{sysconfig.get_paths()["purelib"]}} *returns only a
single directory: the {{site-packages}} path of the interpreter's default
installation scheme*. It does not account for other directories on
{{sys.path}} where pip might have installed packages, such as a separate
buildpack layer. As a result, when {{pemja}} is installed in any location other
than that default {{purelib}} directory, Flink is unable to find its library
files and fails.
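The mismatch can be checked directly in an affected environment. The following is a minimal sketch, not pemja's actual lookup code: it compares the directory reported by {{sysconfig}} with the directory that the import system resolves for a package. The helper names ({{purelib_dir}}, {{package_dir}}, {{simulate_layer}}) and the throwaway package {{fake_layer_pkg}} are hypothetical, and the temporary directory only stands in for a separate pip layer.

```python
import importlib.util
import sys
import sysconfig
import tempfile
from pathlib import Path


def purelib_dir() -> Path:
    """Return the single directory that the quoted sysconfig lookup reports."""
    return Path(sysconfig.get_paths()["purelib"]).resolve()


def package_dir(name: str):
    """Locate a package through the import machinery, or return None."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return None
    # spec.origin points at the package's __init__.py; its parent is the package dir.
    return Path(spec.origin).resolve().parent


def simulate_layer(name: str = "fake_layer_pkg") -> Path:
    """Create an empty package in a temp dir and expose it via sys.path.

    The temp dir is a stand-in for a separate pip layer (an assumption
    made for illustration only).
    """
    layer = Path(tempfile.mkdtemp(prefix="pip-layer-"))
    pkg = layer / name
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")
    sys.path.insert(0, str(layer))
    importlib.invalidate_caches()
    return layer


if __name__ == "__main__":
    simulate_layer()
    found = package_dir("fake_layer_pkg")
    print("purelib:", purelib_dir())
    print("package:", found)
    # The package is importable, yet it does not live under purelib.
    print("under purelib:", purelib_dir() in found.parents)
```

In a real deployment, comparing {{package_dir("pemja")}} with {{purelib_dir()}} should indicate whether pemja sits outside the directory that version 0.4.1 inspects.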
h3. Proposed Solution ✅
This path discovery issue is fixed in *{{pemja}} version {{0.6.0}} and newer*;
see [https://github.com/alibaba/pemja/pull/65].
The {{pemja}} dependency version specified in {{flink-python}} needs to be
upgraded to allow for a more recent version that incorporates this fix.
*Required Action:*
Upgrade the {{install_requires}} for {{pemja}} within the
{{flink-python/setup.py}} file.
* *Current Line:* {{pemja==0.4.1}}
* *Proposed Change:* {{pemja>=0.6.0}}
* *File Location:* [{{flink-python/setup.py}} on
GitHub|https://github.com/apache/flink/blob/release-1.20.2/flink-python/setup.py#L327]
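Until the pin is changed, it may help to confirm which {{pemja}} version an environment actually has. This is a rough sketch that assumes only the {{0.6.0}} threshold stated above. The helper names are hypothetical, and the version parsing is deliberately simplified: it ignores pre-release suffixes, which a proper parser such as {{packaging.version}} would handle.

```python
from importlib import metadata
from itertools import takewhile

# Threshold taken from the issue text: the fix is in pemja 0.6.0 and newer.
MIN_FIXED = (0, 6, 0)


def parse_release(version: str) -> tuple:
    """Turn '0.6.0' into (0, 6, 0), keeping only leading digits per part."""
    nums = []
    for piece in version.split(".")[:3]:
        digits = "".join(takewhile(str.isdigit, piece))
        nums.append(int(digits) if digits else 0)
    while len(nums) < 3:
        nums.append(0)  # pad short versions such as '0.6'
    return tuple(nums)


def has_path_fix(version: str) -> bool:
    """True if the given version is at or above the assumed threshold."""
    return parse_release(version) >= MIN_FIXED


def installed_pemja_version():
    """Return the installed pemja version string, or None if absent."""
    try:
        return metadata.version("pemja")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    current = installed_pemja_version()
    if current is None:
        print("pemja is not installed")
    else:
        print(f"pemja {current}: path fix present = {has_path_fix(current)}")
```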