Jaydeep Karia created FLINK-38161:
-------------------------------------

             Summary: PyFlink: Unable to find Pemja library in thread mode when 
Python dependencies are installed in a separate layer
                 Key: FLINK-38161
                 URL: https://issues.apache.org/jira/browse/FLINK-38161
             Project: Flink
          Issue Type: Bug
          Components: API / Python
    Affects Versions: 1.20.2, 1.19.3
            Reporter: Jaydeep Karia


*Problem Summary*

PyFlink applications running in *thread mode* fail at startup with a 
{{RuntimeException: Failed to find PemJa Library}} when Python dependencies are 
installed into a non-standard {{site-packages}} directory. This is a common 
scenario in modern containerized environments, particularly when using tools 
like Cloud Native Buildpacks, which create separate layers for dependencies.
h3. Environment & Steps to Reproduce

This issue occurs in environments where {{pip}} installs packages into a 
directory that is different from the system's primary Python {{site-packages}} 
folder.
 # *Set up an environment* with layered {{site-packages}} directories. For 
example, using buildpacks might result in a file structure like this:

 ** *System {{site-packages}}:* 
{{/layers/<python-buildpack>/runtime-depends/opt/bb/lib/python3.11/site-packages}}

 ** *Pip-installed {{site-packages}}:* 
{{/layers/<pip-buildpack>/requirements/lib/python3.11/site-packages/}}

 # *Install {{apache-flink}} and {{pemja}}* into the pip layer. The current 
{{flink-python}} distributions pin the {{pemja}} dependency to version 
{{0.4.1}}.

 # *Execute a PyFlink job* that is configured to run in {{THREAD_MODE}}.

 # *Observe the failure.* The job fails immediately on startup with a 
{{RuntimeException}}, because Flink cannot locate the Pemja native library.
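
For step 3, a minimal sketch of how a job could be switched to thread mode is shown below. It assumes {{apache-flink}} is importable in the job's Python environment; the helper name {{build_thread_mode_env}} is hypothetical and only illustrates the {{python.execution-mode}} option.

```python
# Option key and value that select PyFlink's thread execution mode.
THREAD_MODE_OPTIONS = {"python.execution-mode": "thread"}


def build_thread_mode_env(options=THREAD_MODE_OPTIONS):
    """Create a StreamExecutionEnvironment with thread mode enabled.

    Hypothetical helper for illustration; not part of PyFlink.
    """
    # Imported lazily so this module still loads where apache-flink is absent.
    from pyflink.common import Configuration
    from pyflink.datastream import StreamExecutionEnvironment

    config = Configuration()
    for key, value in options.items():
        config.set_string(key, value)
    return StreamExecutionEnvironment.get_execution_environment(config)
```

Running any Python function on an environment built this way, inside the layered image described above, would be expected to produce the failure in step 4.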

h3. Root Cause Analysis 🔍

The version of {{pemja}} currently pinned by {{flink-python}} 
({{{}pemja==0.4.1{}}}) attempts to locate its native library by executing the 
following Python code to find the {{site-packages}} path:

 

{{import sysconfig; print(sysconfig.get_paths()["purelib"])}}

This implementation can be seen in the [pemja 0.4.1 source code on 
GitHub|https://github.com/alibaba/pemja/blob/release-0.4/src/main/java/pemja/utils/CommonUtils.java#L39].

The core problem is that {{sysconfig.get_paths()["purelib"]}} *returns only a 
single directory*: the {{site-packages}} path of the interpreter's default 
installation scheme. It does not account for other directories on {{sys.path}} 
where pip may have installed packages, such as a separate buildpack layer. As a 
result, when {{pemja}} is installed anywhere other than that default 
{{purelib}} directory, Flink cannot find its library files and fails.
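
The mismatch can be checked from inside the affected image with a short standard-library script. This is a diagnostic sketch, not code from {{pemja}} or Flink; the function names are hypothetical. It compares the directory reported by {{sysconfig}} with the directory a package would actually be imported from.

```python
import importlib.util
import os
import sysconfig


def purelib_dir():
    """Directory reported by sysconfig for the default install scheme."""
    return os.path.realpath(sysconfig.get_paths()["purelib"])


def package_dir(name):
    """Directory that `name` would actually be imported from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    if spec.submodule_search_locations:
        # Regular or namespace package: take its first search location.
        return os.path.realpath(list(spec.submodule_search_locations)[0])
    if spec.origin and os.path.isabs(spec.origin):
        # Single-file module: use the directory that contains the file.
        return os.path.realpath(os.path.dirname(spec.origin))
    return None  # built-in or frozen module with no location on disk


def installed_under_purelib(name):
    """True if `name` is importable and lives below the purelib directory."""
    location = package_dir(name)
    if location is None:
        return False
    base = purelib_dir()
    try:
        return os.path.commonpath([location, base]) == base
    except ValueError:
        # Raised on Windows when the two paths are on different drives.
        return False
```

In the layered setup described in this issue, {{installed_under_purelib("pemja")}} would be expected to return {{False}} even though {{import pemja}} succeeds.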
h3. Proposed Solution ✅

This path discovery issue is fixed in *{{pemja}} version {{0.6.0}} and newer* 
(see [https://github.com/alibaba/pemja/pull/65]).

The {{pemja}} dependency version specified in {{flink-python}} needs to be 
upgraded to allow for a more recent version that incorporates this fix.

*Required Action:*

Update the {{pemja}} entry in {{install_requires}} in the 
{{flink-python/setup.py}} file.
 * *Current Line:* {{pemja==0.4.1}}

 * *Proposed Change:* {{pemja>=0.6.0}}

 * *File Location:* [{{flink-python/setup.py}} on 
GitHub|https://github.com/apache/flink/blob/release-1.20.2/flink-python/setup.py#L327]
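
Until the pin is changed, an environment can be checked for an affected {{pemja}} version with a small script such as the one below. It is a sketch using only the standard library; the helper names are hypothetical, and the {{0.6.0}} threshold is taken from this issue rather than verified independently.

```python
from importlib import metadata

# First pemja release that, per this issue, contains the path-discovery fix.
MIN_FIXED_VERSION = (0, 6, 0)


def version_tuple(version):
    """Turn '0.4.1' into (0, 4, 1), ignoring any non-numeric suffix."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def has_path_fix(version):
    """True if `version` is at least MIN_FIXED_VERSION."""
    return version_tuple(version) >= MIN_FIXED_VERSION


def installed_pemja_has_fix():
    """None if pemja is not installed, else whether its version has the fix."""
    try:
        return has_path_fix(metadata.version("pemja"))
    except metadata.PackageNotFoundError:
        return None
```

With the current pin, {{has_path_fix("0.4.1")}} evaluates to {{False}}, which is the condition the proposed change is meant to remove.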

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
