[jira] [Updated] (FLINK-32758) PyFlink bounds are overly restrictive and outdated

2023-08-29 Thread Matthias Pohl (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Pohl updated FLINK-32758:
--
Affects Version/s: (was: 1.18.0)

> PyFlink bounds are overly restrictive and outdated
> --
>
> Key: FLINK-32758
> URL: https://issues.apache.org/jira/browse/FLINK-32758
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Python
>Affects Versions: 1.17.1, 1.19.0
>Reporter: Deepyaman Datta
>Assignee: Deepyaman Datta
>Priority: Blocker
>  Labels: pull-request-available, test-stability
> Attachments: image-2023-08-29-10-19-37-977.png
>
>
> Hi! I am part of a team building the Flink backend for Ibis 
> ([https://github.com/ibis-project/ibis]). We would like to leverage PyFlink 
> under the hood for execution; however, PyFlink's requirements are 
> incompatible with several other Ibis requirements. Beyond Ibis, PyFlink's 
> outdated and restrictive requirements prevent it from being used alongside 
> most recent releases of Python data libraries.
> Some of the major libraries we (and likely others in the Python community 
> interested in using PyFlink alongside other libraries) need compatibility 
> with:
>  * PyArrow (at least >=10.0.0, but there's no reason not to also be 
> compatible with the latest release)
>  * pandas (should be compatible with 2.x series, but also probably with 
> 1.4.x, released January 2022, and 1.5.x)
>  * numpy (1.22 was released in December 2021)
>  * Newer releases of Apache Beam
>  * Newer releases of cython
> Furthermore, uncapped dependencies could be more generally preferable, as 
> they avoid the need for frequent PyFlink releases as newer versions of 
> libraries are released. A common (and great) argument for not upper-bounding 
> dependencies, especially for libraries: 
> [https://iscinumpy.dev/post/bound-version-constraints/]
> I am currently testing removing upper bounds in 
> [https://github.com/apache/flink/pull/23141]; so far, builds pass without 
> issue in 
> [b65c072|https://github.com/apache/flink/pull/23141/commits/b65c0723ed66e01e83d718f770aa916f41f34581],
>  and I'm currently waiting on 
> [c8eb15c|https://github.com/apache/flink/pull/23141/commits/c8eb15cbc371dc259fb4fda5395f0f55e08ea9c6]
>  to see if I can get PyArrow to resolve >=10.0.0. Solving the proposed 
> dependencies results in:
> {{#}}
> {{# This file is autogenerated by pip-compile with Python 3.8}}
> {{# by the following command:}}
> {{#}}
> {{#    pip-compile --config=pyproject.toml 
> --output-file=dev/compiled-requirements.txt dev/dev-requirements.txt}}
> {{#}}
> {{apache-beam==2.49.0}}
> {{    # via -r dev/dev-requirements.txt}}
> {{avro-python3==1.10.2}}
> {{    # via -r dev/dev-requirements.txt}}
> {{certifi==2023.7.22}}
> {{    # via requests}}
> {{charset-normalizer==3.2.0}}
> {{    # via requests}}
> {{cloudpickle==2.2.1}}
> {{    # via}}
> {{    #   -r dev/dev-requirements.txt}}
> {{    #   apache-beam}}
> {{crcmod==1.7}}
> {{    # via apache-beam}}
> {{cython==3.0.0}}
> {{    # via -r dev/dev-requirements.txt}}
> {{dill==0.3.1.1}}
> {{    # via apache-beam}}
> {{dnspython==2.4.1}}
> {{    # via pymongo}}
> {{docopt==0.6.2}}
> {{    # via hdfs}}
> {{exceptiongroup==1.1.2}}
> {{    # via pytest}}
> {{fastavro==1.8.2}}
> {{    # via}}
> {{    #   -r dev/dev-requirements.txt}}
> {{    #   apache-beam}}
> {{fasteners==0.18}}
> {{    # via apache-beam}}
> {{find-libpython==0.3.0}}
> {{    # via pemja}}
> {{grpcio==1.56.2}}
> {{    # via}}
> {{    #   -r dev/dev-requirements.txt}}
> {{    #   apache-beam}}
> {{    #   grpcio-tools}}
> {{grpcio-tools==1.56.2}}
> {{    # via -r dev/dev-requirements.txt}}
> {{hdfs==2.7.0}}
> {{    # via apache-beam}}
> {{httplib2==0.22.0}}
> {{    # via}}
> {{    #   -r dev/dev-requirements.txt}}
> {{    #   apache-beam}}
> {{idna==3.4}}
> {{    # via requests}}
> {{iniconfig==2.0.0}}
> {{    # via pytest}}
> {{numpy==1.24.4}}
> {{    # via}}
> {{    #   -r dev/dev-requirements.txt}}
> {{    #   apache-beam}}
> {{    #   pandas}}
> {{    #   pyarrow}}
> {{objsize==0.6.1}}
> {{    # via apache-beam}}
> {{orjson==3.9.2}}
> {{    # via apache-beam}}
> {{packaging==23.1}}
> {{    # via pytest}}
> {{pandas==2.0.3}}
> {{    # via -r dev/dev-requirements.txt}}
> {{pemja==0.3.0 ; platform_system != "Windows"}}
> {{    # via -r dev/dev-requirements.txt}}
> {{pluggy==1.2.0}}
> {{    # via pytest}}
> {{proto-plus==1.22.3}}
> {{    # via apache-beam}}
> {{protobuf==4.23.4}}
> {{    # via}}
> {{    #   -r dev/dev-requirements.txt}}
> {{    #   apache-beam}}
> {{    #   grpcio-tools}}
> {{    #   proto-plus}}
> {{py4j==0.10.9.7}}
> {{    # via -r dev/dev-requirements.txt}}
> {{pyarrow==11.0.0}}
> {{    # via}}
> {{    #   -r dev/dev-requirements.txt}}
> {{    #   apache-beam}}
> {{pydot==1.4.2}}
> {{    # 
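
The argument in the description above, that an upper bound locks out newer releases until the package itself ships a new release, can be sketched in a few lines of Python. This is a simplified stand-in for real version-specifier handling: it only understands purely numeric dotted versions, the helper names `parse` and `allowed` are hypothetical, and the bounds used are illustrative values, not PyFlink's actual requirements.

```python
from typing import Optional, Tuple


def parse(version: str) -> Tuple[int, ...]:
    """Parse a purely numeric dotted version such as '11.0.0' into a tuple."""
    return tuple(int(part) for part in version.split("."))


def allowed(version: str, lower: str, upper: Optional[str] = None) -> bool:
    """Return True if lower <= version, and version < upper when a cap is given."""
    v = parse(version)
    if v < parse(lower):
        return False
    return upper is None or v < parse(upper)


# Hypothetical bounds, chosen only for illustration.
capped = ("5.0.0", "9.0.0")   # in the spirit of "pkg>=5.0.0,<9.0.0"
uncapped = ("5.0.0", None)    # in the spirit of "pkg>=5.0.0"

for candidate in ("8.0.0", "11.0.0"):
    # Print whether each candidate fits the capped and the uncapped range.
    print(candidate, allowed(candidate, *capped), allowed(candidate, *uncapped))
```

With the capped range, a downstream project that needs version 11.0.0 cannot be installed alongside the package at all; with the uncapped range, the resolver is free to pick it.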

[jira] [Updated] (FLINK-32758) PyFlink bounds are overly restrictive and outdated

2023-08-29 Thread Matthias Pohl (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Pohl updated FLINK-32758:
--
Affects Version/s: 1.18.0

> PyFlink bounds are overly restrictive and outdated
> --
>
> Key: FLINK-32758
> URL: https://issues.apache.org/jira/browse/FLINK-32758
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Python
>Affects Versions: 1.18.0, 1.17.1, 1.19.0
>Reporter: Deepyaman Datta
>Assignee: Deepyaman Datta
>Priority: Blocker
>  Labels: pull-request-available, test-stability
> Attachments: image-2023-08-29-10-19-37-977.png
>
>

[jira] [Updated] (FLINK-32758) PyFlink bounds are overly restrictive and outdated

2023-08-28 Thread Dian Fu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dian Fu updated FLINK-32758:

Attachment: image-2023-08-29-10-19-37-977.png

> PyFlink bounds are overly restrictive and outdated
> --
>
> Key: FLINK-32758
> URL: https://issues.apache.org/jira/browse/FLINK-32758
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Python
>Affects Versions: 1.17.1, 1.19.0
>Reporter: Deepyaman Datta
>Assignee: Deepyaman Datta
>Priority: Blocker
>  Labels: pull-request-available, test-stability
> Attachments: image-2023-08-29-10-19-37-977.png
>
>

[jira] [Updated] (FLINK-32758) PyFlink bounds are overly restrictive and outdated

2023-08-28 Thread Sergey Nuyanzin (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Nuyanzin updated FLINK-32758:

Labels: pull-request-available test-stability  (was: pull-request-available)

> PyFlink bounds are overly restrictive and outdated
> --
>
> Key: FLINK-32758
> URL: https://issues.apache.org/jira/browse/FLINK-32758
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Python
>Affects Versions: 1.17.1
>Reporter: Deepyaman Datta
>Assignee: Deepyaman Datta
>Priority: Blocker
>  Labels: pull-request-available, test-stability
>

[jira] [Updated] (FLINK-32758) PyFlink bounds are overly restrictive and outdated

2023-08-28 Thread Sergey Nuyanzin (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Nuyanzin updated FLINK-32758:

Affects Version/s: 1.19.0

> PyFlink bounds are overly restrictive and outdated
> --
>
> Key: FLINK-32758
> URL: https://issues.apache.org/jira/browse/FLINK-32758
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Python
>Affects Versions: 1.17.1, 1.19.0
>Reporter: Deepyaman Datta
>Assignee: Deepyaman Datta
>Priority: Blocker
>  Labels: pull-request-available, test-stability
>

[jira] [Updated] (FLINK-32758) PyFlink bounds are overly restrictive and outdated

2023-08-28 Thread Sergey Nuyanzin (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Nuyanzin updated FLINK-32758:

Priority: Blocker  (was: Major)

> PyFlink bounds are overly restrictive and outdated
> --
>
> Key: FLINK-32758
> URL: https://issues.apache.org/jira/browse/FLINK-32758
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Python
>Affects Versions: 1.17.1
>Reporter: Deepyaman Datta
>Assignee: Deepyaman Datta
>Priority: Blocker
>  Labels: pull-request-available
>

[jira] [Updated] (FLINK-32758) PyFlink bounds are overly restrictive and outdated

2023-08-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-32758:
---
Labels: pull-request-available  (was: )

> PyFlink bounds are overly restrictive and outdated
> --
>
> Key: FLINK-32758
> URL: https://issues.apache.org/jira/browse/FLINK-32758
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Python
>Affects Versions: 1.17.1
>Reporter: Deepyaman Datta
>Priority: Major
>  Labels: pull-request-available
>
> Hi! I am part of a team building the Flink backend for Ibis 
> ([https://github.com/ibis-project/ibis]). We would like to leverage PyFlink 
> under the hood for execution; however, PyFlink's requirements are 
> incompatible with several other Ibis requirements. Beyond Ibis, PyFlink's 
> outdated and restrictive requirements prevent it from being used alongside 
> most recent releases of Python data libraries.
> Some of the major libraries we (and likely others in the Python community 
> interested in using PyFlink alongside other libraries) need compatibility 
> with:
>  * PyArrow (at least >=10.0.0, but there's no reason not to also be 
> compatible with the latest release)
>  * pandas (should be compatible with 2.x series, but also probably with 
> 1.4.x, released January 2022, and 1.5.x)
>  * numpy (1.22 was released in December 2021)
>  * Newer releases of Apache Beam
>  * Newer releases of cython
> Furthermore, uncapped dependencies could be more generally preferable, as 
> they avoid the need for frequent PyFlink releases as newer versions of 
> libraries are released. A common (and great) argument for not upper-bounding 
> dependencies, especially for libraries: 
> [https://iscinumpy.dev/post/bound-version-constraints/]
> I am currently testing removing upper bounds in 
> [https://github.com/apache/flink/pull/23141]; so far, builds pass without 
> issue in 
> [b65c072|https://github.com/apache/flink/pull/23141/commits/b65c0723ed66e01e83d718f770aa916f41f34581],
>  and I'm currently waiting on 
> [c8eb15c|https://github.com/apache/flink/pull/23141/commits/c8eb15cbc371dc259fb4fda5395f0f55e08ea9c6]
>  to see if I can get PyArrow to resolve >=10.0.0. Solving the proposed 
> dependencies results in:
> {{#}}
> {{# This file is autogenerated by pip-compile with Python 3.8}}
> {{# by the following command:}}
> {{#}}
> {{#    pip-compile --config=pyproject.toml 
> --output-file=dev/compiled-requirements.txt dev/dev-requirements.txt}}
> {{#}}
> {{apache-beam==2.49.0}}
> {{    # via -r dev/dev-requirements.txt}}
> {{avro-python3==1.10.2}}
> {{    # via -r dev/dev-requirements.txt}}
> {{certifi==2023.7.22}}
> {{    # via requests}}
> {{charset-normalizer==3.2.0}}
> {{    # via requests}}
> {{cloudpickle==2.2.1}}
> {{    # via}}
> {{    #   -r dev/dev-requirements.txt}}
> {{    #   apache-beam}}
> {{crcmod==1.7}}
> {{    # via apache-beam}}
> {{cython==3.0.0}}
> {{    # via -r dev/dev-requirements.txt}}
> {{dill==0.3.1.1}}
> {{    # via apache-beam}}
> {{dnspython==2.4.1}}
> {{    # via pymongo}}
> {{docopt==0.6.2}}
> {{    # via hdfs}}
> {{exceptiongroup==1.1.2}}
> {{    # via pytest}}
> {{fastavro==1.8.2}}
> {{    # via}}
> {{    #   -r dev/dev-requirements.txt}}
> {{    #   apache-beam}}
> {{fasteners==0.18}}
> {{    # via apache-beam}}
> {{find-libpython==0.3.0}}
> {{    # via pemja}}
> {{grpcio==1.56.2}}
> {{    # via}}
> {{    #   -r dev/dev-requirements.txt}}
> {{    #   apache-beam}}
> {{    #   grpcio-tools}}
> {{grpcio-tools==1.56.2}}
> {{    # via -r dev/dev-requirements.txt}}
> {{hdfs==2.7.0}}
> {{    # via apache-beam}}
> {{httplib2==0.22.0}}
> {{    # via}}
> {{    #   -r dev/dev-requirements.txt}}
> {{    #   apache-beam}}
> {{idna==3.4}}
> {{    # via requests}}
> {{iniconfig==2.0.0}}
> {{    # via pytest}}
> {{numpy==1.24.4}}
> {{    # via}}
> {{    #   -r dev/dev-requirements.txt}}
> {{    #   apache-beam}}
> {{    #   pandas}}
> {{    #   pyarrow}}
> {{objsize==0.6.1}}
> {{    # via apache-beam}}
> {{orjson==3.9.2}}
> {{    # via apache-beam}}
> {{packaging==23.1}}
> {{    # via pytest}}
> {{pandas==2.0.3}}
> {{    # via -r dev/dev-requirements.txt}}
> {{pemja==0.3.0 ; platform_system != "Windows"}}
> {{    # via -r dev/dev-requirements.txt}}
> {{pluggy==1.2.0}}
> {{    # via pytest}}
> {{proto-plus==1.22.3}}
> {{    # via apache-beam}}
> {{protobuf==4.23.4}}
> {{    # via}}
> {{    #   -r dev/dev-requirements.txt}}
> {{    #   apache-beam}}
> {{    #   grpcio-tools}}
> {{    #   proto-plus}}
> {{py4j==0.10.9.7}}
> {{    # via -r dev/dev-requirements.txt}}
> {{pyarrow==11.0.0}}
> {{    # via}}
> {{    #   -r dev/dev-requirements.txt}}
> {{    #   apache-beam}}
> {{pydot==1.4.2}}
> {{    # via apache-beam}}
> {{pymongo==4.4.1}}
> {{    # via apache-beam}}
> {{pyparsing==3.1.1}}
> {{    # via}}
> {{    #