[
https://issues.apache.org/jira/browse/FLINK-38161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jaydeep Karia updated FLINK-38161:
----------------------------------
Description:
*Problem Summary*
PyFlink applications running in *thread mode* fail at startup with a
{{RuntimeException: Failed to find PemJa Library}} when Python dependencies are
installed into a non-standard {{site-packages}} directory. This is a common
scenario in modern containerized environments, particularly when using tools
like Cloud Native Buildpacks, which create separate layers for dependencies.
h3. Environment & Steps to Reproduce
This issue occurs in environments where {{pip}} installs packages into a
directory that is different from the system's primary Python {{site-packages}}
folder.
# *Set up an environment* with layered {{site-packages}} directories. For
example, using buildpacks might result in a file structure like this:
** *System {{site-packages}}:*
{{/layers/<python-buildpack>/runtime-depends/opt/bb/lib/python3.11/site-packages}}
** *Pip-installed {{site-packages}}:*
{{/layers/<pip-buildpack>/requirements/lib/python3.11/site-packages/}}
# *Install {{apache-flink}} and {{pemja}}* into the pip layer. The current
{{flink-python}} distributions pin the {{pemja}} dependency to version
{{0.4.1}}.
# *Execute a PyFlink job* that is configured to run in {{THREAD_MODE}}.
# *Observe the failure*. The job fails immediately on startup with a
{{RuntimeException}}, as Flink cannot locate the Pemja native library.
h3. Root Cause Analysis 🔍
The version of {{pemja}} currently pinned by {{flink-python}}
({{{}pemja==0.4.1{}}}) attempts to locate its native library by executing the
following Python code to find the {{site-packages}} path:
{{"import sysconfig; print(sysconfig.get_paths()[\"purelib\"])";}}
This implementation can be seen in the [pemja 0.4.1 source code on
GitHub|https://github.com/alibaba/pemja/blob/release-0.4/src/main/java/pemja/utils/CommonUtils.java#L39].
The core problem is that {{sysconfig.get_paths()["purelib"]}} *returns a single
directory: the {{site-packages}} path of the interpreter's default installation
scheme*. It does not search {{sys.path}} or otherwise account for other
locations where pip might install packages, such as a separate buildpack layer.
As a result, when {{pemja}} is installed anywhere other than that default
{{purelib}} directory, Flink cannot find its library files and fails.
h3. Proposed Solution ✅
This path discovery issue is fixed in *{{pemja}} version {{0.6.0}} and newer*:
[https://github.com/alibaba/pemja/pull/65]
The {{pemja}} dependency version specified in {{flink-python}} needs to be
upgraded to allow for a more recent version that incorporates this fix.
*Required Action:*
Upgrade the {{install_requires}} for {{pemja}} within the
{{flink-python/setup.py}} file.
* *Current Line:* {{pemja==0.4.1}}
* *Proposed Change:* {{pemja>=0.6.0}}
* *File Location:* [{{flink-python/setup.py}} on
GitHub|https://github.com/apache/flink/blob/release-1.20.2/flink-python/setup.py#L327]
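To confirm whether a given environment already has a {{pemja}} release that satisfies the proposed {{pemja>=0.6.0}} constraint, a check along these lines may help. It is a simplified sketch: the helper names are hypothetical, and the version parsing only looks at leading numeric components, so it does not handle pre-release ordering the way pip does.

```python
import re
from importlib import metadata


def parse_release(version, width=3):
    """Parse leading numeric components, e.g. '0.6.0.post1' -> (0, 6, 0)."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    # Pad with zeros so that '0.6' compares equal to '0.6.0'.
    while len(parts) < width:
        parts.append(0)
    return tuple(parts)


def meets_minimum(version, minimum=(0, 6, 0)):
    """True if `version` is at least `minimum` (simplified comparison)."""
    return parse_release(version) >= minimum


def installed_version(name):
    """Return the installed distribution version string, or None."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("pemja")
    if found is None:
        print("pemja is not installed")
    else:
        print("pemja", found, "meets >=0.6.0:", meets_minimum(found))
```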
> PyFlink: Unable to find Pemja library in thread mode when Python dependencies
> are installed in a separate layer
> ---------------------------------------------------------------------------------------------------------------
>
> Key: FLINK-38161
> URL: https://issues.apache.org/jira/browse/FLINK-38161
> Project: Flink
> Issue Type: Bug
> Components: API / Python
> Affects Versions: 1.19.3, 1.20.2
> Reporter: Jaydeep Karia
> Priority: Major
>