[ 
https://issues.apache.org/jira/browse/FLINK-32758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Pohl updated FLINK-32758:
----------------------------------
    Affects Version/s: 1.18.0

> PyFlink bounds are overly restrictive and outdated
> --------------------------------------------------
>
>                 Key: FLINK-32758
>                 URL: https://issues.apache.org/jira/browse/FLINK-32758
>             Project: Flink
>          Issue Type: Improvement
>          Components: API / Python
>    Affects Versions: 1.18.0, 1.17.1, 1.19.0
>            Reporter: Deepyaman Datta
>            Assignee: Deepyaman Datta
>            Priority: Blocker
>              Labels: pull-request-available, test-stability
>         Attachments: image-2023-08-29-10-19-37-977.png
>
>
> Hi! I am part of a team building the Flink backend for Ibis 
> ([https://github.com/ibis-project/ibis]). We would like to leverage PyFlink 
> under the hood for execution; however, PyFlink's requirements are 
> incompatible with several other Ibis requirements. Beyond Ibis, PyFlink's 
> outdated and restrictive requirements prevent it from being used alongside 
> most recent releases of Python data libraries.
> Some of the major libraries we (and likely others in the Python community 
> interested in using PyFlink alongside other libraries) need compatibility 
> with:
>  * PyArrow (at least >=10.0.0, but there's no reason not to also be 
> compatible with the latest release)
>  * pandas (should be compatible with 2.x series, but also probably with 
> 1.4.x, released January 2022, and 1.5.x)
>  * numpy (1.22 was released in December 2021)
>  * Newer releases of Apache Beam
>  * Newer releases of cython
> Furthermore, uncapped dependencies could be more generally preferable, as 
> they avoid the need for frequent PyFlink releases as newer versions of 
> libraries are released. A common (and great) argument for not upper-bounding 
> dependencies, especially for libraries, is laid out in 
> [https://iscinumpy.dev/post/bound-version-constraints/].
> I am currently testing removing upper bounds in 
> [https://github.com/apache/flink/pull/23141]; so far, builds pass without 
> issue in 
> [b65c072|https://github.com/apache/flink/pull/23141/commits/b65c0723ed66e01e83d718f770aa916f41f34581],
>  and I'm currently waiting on 
> [c8eb15c|https://github.com/apache/flink/pull/23141/commits/c8eb15cbc371dc259fb4fda5395f0f55e08ea9c6]
>  to see if I can get PyArrow to resolve >=10.0.0. Solving the proposed 
> dependencies results in:
> {{#}}
> {{# This file is autogenerated by pip-compile with Python 3.8}}
> {{# by the following command:}}
> {{#}}
> {{#    pip-compile --config=pyproject.toml --output-file=dev/compiled-requirements.txt dev/dev-requirements.txt}}
> {{#}}
> {{apache-beam==2.49.0}}
> {{    # via -r dev/dev-requirements.txt}}
> {{avro-python3==1.10.2}}
> {{    # via -r dev/dev-requirements.txt}}
> {{certifi==2023.7.22}}
> {{    # via requests}}
> {{charset-normalizer==3.2.0}}
> {{    # via requests}}
> {{cloudpickle==2.2.1}}
> {{    # via}}
> {{    #   -r dev/dev-requirements.txt}}
> {{    #   apache-beam}}
> {{crcmod==1.7}}
> {{    # via apache-beam}}
> {{cython==3.0.0}}
> {{    # via -r dev/dev-requirements.txt}}
> {{dill==0.3.1.1}}
> {{    # via apache-beam}}
> {{dnspython==2.4.1}}
> {{    # via pymongo}}
> {{docopt==0.6.2}}
> {{    # via hdfs}}
> {{exceptiongroup==1.1.2}}
> {{    # via pytest}}
> {{fastavro==1.8.2}}
> {{    # via}}
> {{    #   -r dev/dev-requirements.txt}}
> {{    #   apache-beam}}
> {{fasteners==0.18}}
> {{    # via apache-beam}}
> {{find-libpython==0.3.0}}
> {{    # via pemja}}
> {{grpcio==1.56.2}}
> {{    # via}}
> {{    #   -r dev/dev-requirements.txt}}
> {{    #   apache-beam}}
> {{    #   grpcio-tools}}
> {{grpcio-tools==1.56.2}}
> {{    # via -r dev/dev-requirements.txt}}
> {{hdfs==2.7.0}}
> {{    # via apache-beam}}
> {{httplib2==0.22.0}}
> {{    # via}}
> {{    #   -r dev/dev-requirements.txt}}
> {{    #   apache-beam}}
> {{idna==3.4}}
> {{    # via requests}}
> {{iniconfig==2.0.0}}
> {{    # via pytest}}
> {{numpy==1.24.4}}
> {{    # via}}
> {{    #   -r dev/dev-requirements.txt}}
> {{    #   apache-beam}}
> {{    #   pandas}}
> {{    #   pyarrow}}
> {{objsize==0.6.1}}
> {{    # via apache-beam}}
> {{orjson==3.9.2}}
> {{    # via apache-beam}}
> {{packaging==23.1}}
> {{    # via pytest}}
> {{pandas==2.0.3}}
> {{    # via -r dev/dev-requirements.txt}}
> {{pemja==0.3.0 ; platform_system != "Windows"}}
> {{    # via -r dev/dev-requirements.txt}}
> {{pluggy==1.2.0}}
> {{    # via pytest}}
> {{proto-plus==1.22.3}}
> {{    # via apache-beam}}
> {{protobuf==4.23.4}}
> {{    # via}}
> {{    #   -r dev/dev-requirements.txt}}
> {{    #   apache-beam}}
> {{    #   grpcio-tools}}
> {{    #   proto-plus}}
> {{py4j==0.10.9.7}}
> {{    # via -r dev/dev-requirements.txt}}
> {{pyarrow==11.0.0}}
> {{    # via}}
> {{    #   -r dev/dev-requirements.txt}}
> {{    #   apache-beam}}
> {{pydot==1.4.2}}
> {{    # via apache-beam}}
> {{pymongo==4.4.1}}
> {{    # via apache-beam}}
> {{pyparsing==3.1.1}}
> {{    # via}}
> {{    #   httplib2}}
> {{    #   pydot}}
> {{pytest==7.4.0}}
> {{    # via -r dev/dev-requirements.txt}}
> {{python-dateutil==2.8.2}}
> {{    # via}}
> {{    #   -r dev/dev-requirements.txt}}
> {{    #   apache-beam}}
> {{    #   pandas}}
> {{pytz==2023.3}}
> {{    # via}}
> {{    #   -r dev/dev-requirements.txt}}
> {{    #   apache-beam}}
> {{    #   pandas}}
> {{regex==2023.6.3}}
> {{    # via apache-beam}}
> {{requests==2.31.0}}
> {{    # via}}
> {{    #   apache-beam}}
> {{    #   hdfs}}
> {{six==1.16.0}}
> {{    # via}}
> {{    #   hdfs}}
> {{    #   python-dateutil}}
> {{tomli==2.0.1}}
> {{    # via pytest}}
> {{typing-extensions==4.7.1}}
> {{    # via apache-beam}}
> {{tzdata==2023.3}}
> {{    # via pandas}}
> {{urllib3==2.0.4}}
> {{    # via requests}}
> {{wheel==0.41.0}}
> {{    # via -r dev/dev-requirements.txt}}
> {{zstandard==0.21.0}}
> {{    # via apache-beam}}
> {{# The following packages are considered to be unsafe in a requirements file:}}
> {{# pip}}
> {{# setuptools}}
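
As a rough illustration of the point made in the description, the resolved pins quoted above can be checked against the lower bounds the reporter names (PyArrow >=10.0.0, pandas 2.x). The sketch below is not part of the issue or of PyFlink: the helper names `parse_pins` and `satisfies` are made up for this example, the capped range in the last check is hypothetical, and versions are compared as plain integer tuples rather than with full PEP 440 rules.

```python
import re

# A few pins copied from the pip-compile output quoted above.
COMPILED = """\
apache-beam==2.49.0
    # via -r dev/dev-requirements.txt
cython==3.0.0
numpy==1.24.4
pandas==2.0.3
pemja==0.3.0 ; platform_system != "Windows"
pyarrow==11.0.0
"""

# Matches "name==1.2.3" at the start of a line; comment lines are skipped.
PIN = re.compile(r"^([A-Za-z0-9._-]+)==(\d+(?:\.\d+)*)")


def parse_pins(text):
    """Return {package: version tuple} for every 'name==version' line."""
    pins = {}
    for line in text.splitlines():
        match = PIN.match(line)
        if match:
            name, version = match.groups()
            pins[name.lower()] = tuple(int(part) for part in version.split("."))
    return pins


def satisfies(version, lower, upper=None):
    """True if lower <= version, and version < upper when a cap is given.

    Simplified: plain tuple comparison, not a PEP 440 implementation.
    """
    if version < lower:
        return False
    return upper is None or version < upper


pins = parse_pins(COMPILED)

# Lower bounds named in the issue description.
print(satisfies(pins["pyarrow"], lower=(10, 0, 0)))  # True
print(satisfies(pins["pandas"], lower=(2, 0, 0)))    # True

# Hypothetical capped range, for illustration only: the same pandas pin
# is rejected as soon as an upper bound below 2.0.0 is declared.
print(satisfies(pins["pandas"], lower=(1, 0, 0), upper=(2, 0, 0)))  # False
```

For real dependency checks, a proper version-specifier parser is the safer choice, since pre-releases and other PEP 440 forms are not handled here.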



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
