[jira] [Commented] (FLINK-32758) PyFlink bounds are overly restrictive and outdated

2023-08-29 Thread Dian Fu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17760240#comment-17760240
 ] 

Dian Fu commented on FLINK-32758:
-

Merged to:
- release-1.18 via 1f7796ee50cfbea4fb633692e6be01070ed45c6f and 
8551a39ee46054d3ec05f3d31758f7ad39b69a39
- release-1.17 via 30eeb91c3d2048b88e0a9903d9c973085df2c2ea and 
870ac98dcdb92774fed783254a3bf4d8ddc317aa

> PyFlink bounds are overly restrictive and outdated
> --
>
> Key: FLINK-32758
> URL: https://issues.apache.org/jira/browse/FLINK-32758
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Python
>Affects Versions: 1.17.1, 1.19.0
>Reporter: Deepyaman Datta
>Assignee: Deepyaman Datta
>Priority: Blocker
>  Labels: pull-request-available, test-stability
> Attachments: image-2023-08-29-10-19-37-977.png
>
>
> Hi! I am part of a team building the Flink backend for Ibis 
> ([https://github.com/ibis-project/ibis]). We would like to leverage PyFlink 
> under the hood for execution; however, PyFlink's requirements are 
> incompatible with several other Ibis requirements. Beyond Ibis, PyFlink's 
> outdated and restrictive requirements prevent it from being used alongside 
> most recent releases of Python data libraries.
> Some of the major libraries we (and likely others in the Python community 
> interested in using PyFlink alongside other libraries) need compatibility 
> with:
>  * PyArrow (at least >=10.0.0, but there's no reason not to also be 
> compatible with the latest)
>  * pandas (should be compatible with 2.x series, but also probably with 
> 1.4.x, released January 2022, and 1.5.x)
>  * numpy (1.22 was released in December 2022)
>  * Newer releases of Apache Beam
>  * Newer releases of cython
> Furthermore, uncapped dependencies could be more generally preferable, as 
> they avoid the need for frequent PyFlink releases as newer versions of 
> libraries are released. A common (and great) argument for not upper-bounding 
> dependencies, especially for libraries: 
> [https://iscinumpy.dev/post/bound-version-constraints/]
> I am currently testing removing upper bounds in 
> [https://github.com/apache/flink/pull/23141]; so far, builds pass without 
> issue in 
> [b65c072|https://github.com/apache/flink/pull/23141/commits/b65c0723ed66e01e83d718f770aa916f41f34581],
>  and I'm currently waiting on 
> [c8eb15c|https://github.com/apache/flink/pull/23141/commits/c8eb15cbc371dc259fb4fda5395f0f55e08ea9c6]
>  to see if I can get PyArrow to resolve >=10.0.0. Solving the proposed 
> dependencies results in:
> {{#}}
> {{# This file is autogenerated by pip-compile with Python 3.8}}
> {{# by the following command:}}
> {{#}}
> {{#    pip-compile --config=pyproject.toml 
> --output-file=dev/compiled-requirements.txt dev/dev-requirements.txt}}
> {{#}}
> {{apache-beam==2.49.0}}
> {{    # via -r dev/dev-requirements.txt}}
> {{avro-python3==1.10.2}}
> {{    # via -r dev/dev-requirements.txt}}
> {{certifi==2023.7.22}}
> {{    # via requests}}
> {{charset-normalizer==3.2.0}}
> {{    # via requests}}
> {{cloudpickle==2.2.1}}
> {{    # via}}
> {{    #   -r dev/dev-requirements.txt}}
> {{    #   apache-beam}}
> {{crcmod==1.7}}
> {{    # via apache-beam}}
> {{cython==3.0.0}}
> {{    # via -r dev/dev-requirements.txt}}
> {{dill==0.3.1.1}}
> {{    # via apache-beam}}
> {{dnspython==2.4.1}}
> {{    # via pymongo}}
> {{docopt==0.6.2}}
> {{    # via hdfs}}
> {{exceptiongroup==1.1.2}}
> {{    # via pytest}}
> {{fastavro==1.8.2}}
> {{    # via}}
> {{    #   -r dev/dev-requirements.txt}}
> {{    #   apache-beam}}
> {{fasteners==0.18}}
> {{    # via apache-beam}}
> {{find-libpython==0.3.0}}
> {{    # via pemja}}
> {{grpcio==1.56.2}}
> {{    # via}}
> {{    #   -r dev/dev-requirements.txt}}
> {{    #   apache-beam}}
> {{    #   grpcio-tools}}
> {{grpcio-tools==1.56.2}}
> {{    # via -r dev/dev-requirements.txt}}
> {{hdfs==2.7.0}}
> {{    # via apache-beam}}
> {{httplib2==0.22.0}}
> {{    # via}}
> {{    #   -r dev/dev-requirements.txt}}
> {{    #   apache-beam}}
> {{idna==3.4}}
> {{    # via requests}}
> {{iniconfig==2.0.0}}
> {{    # via pytest}}
> {{numpy==1.24.4}}
> {{    # via}}
> {{    #   -r dev/dev-requirements.txt}}
> {{    #   apache-beam}}
> {{    #   pandas}}
> {{    #   pyarrow}}
> {{objsize==0.6.1}}
> {{    # via apache-beam}}
> {{orjson==3.9.2}}
> {{    # via apache-beam}}
> {{packaging==23.1}}
> {{    # via pytest}}
> {{pandas==2.0.3}}
> {{    # via -r dev/dev-requirements.txt}}
> {{pemja==0.3.0 ; platform_system != "Windows"}}
> {{    # via -r dev/dev-requirements.txt}}
> {{pluggy==1.2.0}}
> {{    # via pytest}}
> {{proto-plus==1.22.3}}
> {{    # via apache-beam}}
> {{protobuf==4.23.4}}
> {{    # via}}
> {{    #   -r dev/dev-requirements.txt}}
> {{    #   apache-beam}}
> {{    #   grpcio-tools}}
> {{    #   proto-plus}}
> {{py4j==0.10.9.7}}
> {{    # via -r dev/dev-requirements.txt}}
> {{pyarrow==11.0.0}}
> {{    # via}}
> {{    #   -r dev/dev-requirements.txt}}
> {{    #   apache-beam}}
> {{pydot==1.4.2}}
> {{    # via apache-beam}}
> {{pymongo==4.4.1}}
> {{    # via apache-beam}}

[jira] [Commented] (FLINK-32758) PyFlink bounds are overly restrictive and outdated

2023-08-29 Thread Dian Fu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17760238#comment-17760238
 ] 

Dian Fu commented on FLINK-32758:
-

Fixed in master via 5b5a0af15d57ed4424cf8dd744808433e397ebc4


[jira] [Commented] (FLINK-32758) PyFlink bounds are overly restrictive and outdated

2023-08-29 Thread Deepyaman Datta (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17759987#comment-17759987
 ] 

Deepyaman Datta commented on FLINK-32758:
-

[~dianfu] I'm happy with the `!=1.8.0` constraint!


[jira] [Commented] (FLINK-32758) PyFlink bounds are overly restrictive and outdated

2023-08-29 Thread Dian Fu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17759969#comment-17759969
 ] 

Dian Fu commented on FLINK-32758:
-

Have verified that upgrading *cibuildwheel* doesn't work and that excluding 
fastavro 1.8.0 works: fastavro>=1.1.0,!=1.8.0.

[~deepyaman] What's your thought?
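
For reference, a quick local check of that exclusion specifier might look like the following. This is only a sketch, not the command used in CI, and the version pip resolves will drift over time:

{noformat}
# pip skips the excluded 1.8.0 and picks another release satisfying the range
python -m pip install "fastavro>=1.1.0,!=1.8.0"
python -c "import fastavro; print(fastavro.__version__)"
{noformat}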


[jira] [Commented] (FLINK-32758) PyFlink bounds are overly restrictive and outdated

2023-08-29 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17759847#comment-17759847
 ] 

Matthias Pohl commented on FLINK-32758:
---

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=52740&view=logs&j=f73b5736-8355-5390-ec71-4dfdec0ce6c5&t=90f7230e-bf5a-531b-8566-ad48d3e03bbb&l=108

[This 
one|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=52740&view=logs&j=d15e2b2e-10cd-5f59-7734-42d57dc5564d&t=4a86776f-e6e1-598a-f75a-c43d8b819662&l=880]
 failed in this same run but in the {{build_wheels_on_linux}} stage.


[jira] [Commented] (FLINK-32758) PyFlink bounds are overly restrictive and outdated

2023-08-28 Thread Dian Fu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17759777#comment-17759777
 ] 

Dian Fu commented on FLINK-32758:
-

[~deepyaman] Actually the tests have passed on the CI:
!image-2023-08-29-10-19-37-977.png!

It's failing when building the wheel package for macOS.

It uses the third-party tool *cibuildwheel* to build wheel packages; see 
[https://github.com/apache/flink/blob/master/tools/azure-pipelines/build-python-wheels.yml#L41]
 for more details. Currently it uses version 2.8.0. I will check whether the latest 
version (2.15.0) works on my CI.
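
For anyone who wants to reproduce this locally, a sketch of trying a newer cibuildwheel against the Python package might look like the following. The pinned version lives in the build-python-wheels.yml linked above; the local invocation below is an assumption for illustration, not the exact CI command:

{noformat}
# install a newer cibuildwheel and build the PyFlink wheels for macOS
python -m pip install cibuildwheel==2.15.0
cd flink-python
python -m cibuildwheel --platform macos --output-dir dist
{noformat}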

 


[jira] [Commented] (FLINK-32758) PyFlink bounds are overly restrictive and outdated

2023-08-28 Thread Dian Fu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17759774#comment-17759774
 ] 

Dian Fu commented on FLINK-32758:
-

[~Sergey Nuyanzin] [~deepyaman] I have submitted a hotfix temporarily 
limiting fastavro < 1.8 to make the CI green (verified it on my CI): 
[https://github.com/apache/flink/commit/345dece9a8fd58d6ea1c829052fb2f3c68516b48]
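
As a shape reference only (the authoritative change is in the commit linked above, not reproduced here), such a temporary cap is just an upper bound added to the existing specifier, for example:

{noformat}
# illustrative requirements line; the actual bounds in the repo may differ
fastavro>=1.1.0,<1.8
{noformat}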


[jira] [Commented] (FLINK-32758) PyFlink bounds are overly restrictive and outdated

2023-08-28 Thread Sergey Nuyanzin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17759693#comment-17759693
 ] 

Sergey Nuyanzin commented on FLINK-32758:
-

> I'm not sure how I can verify that a potential fix would work, if I try? Can 
> I trigger these tests manually?
If you have set up your own CI, this could help (it just moves the job from the 
nightly pipeline to the normal CI): 
https://github.com/apache/flink/pull/23045/commits/16b65720306ca820dfd83f1ccaacb4b0aed850ac

If you haven't set up your own CI, renaming the job is also required. Or, once a 
fix is ready, I can help schedule it on my own CI.


[jira] [Commented] (FLINK-32758) PyFlink bounds are overly restrictive and outdated

2023-08-28 Thread Deepyaman Datta (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17759585#comment-17759585
 ] 

Deepyaman Datta commented on FLINK-32758:
-

[~Sergey Nuyanzin] This looks to be related to 
[https://github.com/fastavro/fastavro/issues/701]; while we pin `cython<3` for 
PyFlink, `fastavro` is getting built separately with Cython 3. One possible 
solution is to do something like 
[https://stackoverflow.com/a/76837035/1093967], where `cython<3` is installed 
globally in the environment and used for building all of the libraries (I 
think). I'm not sure how you all feel about that, but I can try to raise a PR with 
that, if helpful. It seems the failing test is in the nightly build, which runs a lot 
more checks; I'm not sure how I can verify that a potential fix would work, if 
I try? Can I trigger these tests manually?

The other possibility is to check why `fastavro>=1.8.1` isn't getting picked 
and it's instead using `fastavro==1.8.0`. The newer versions have the Cython pin in 
their build requirements, so we wouldn't need to do a `pip wheel 
--no-build-isolation`. I can try to check this later today.
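
A minimal sketch of that first workaround, assuming the goal is simply to build the affected wheel against a pre-3.0 Cython (the exact pins are illustrative, not what the PR would do):

{noformat}
# install Cython < 3 into the build environment, then build the problematic
# wheel without build isolation so the pre-installed Cython is used
python -m pip install "cython<3" wheel
python -m pip wheel --no-build-isolation fastavro==1.8.0
{noformat}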


[jira] [Commented] (FLINK-32758) PyFlink bounds are overly restrictive and outdated

2023-08-28 Thread Sergey Nuyanzin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17759559#comment-17759559
 ] 

Sergey Nuyanzin commented on FLINK-32758:
-

[~deepyaman], [~dianfu] could you please have a look?

It seems to be the reason for the nightly build failure on macOS.

Now, after merging to master, it is failing every night:
26.08.2023: 
[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=52666&view=logs&j=f73b5736-8355-5390-ec71-4dfdec0ce6c5&t=90f7230e-bf5a-531b-8566-ad48d3e03bbb&l=104]
 
27.08.2023: 
[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=52680&view=logs&j=f73b5736-8355-5390-ec71-4dfdec0ce6c5&t=90f7230e-bf5a-531b-8566-ad48d3e03bbb&l=104]
 
28.08.2023: 
[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=52692&view=logs&j=f73b5736-8355-5390-ec71-4dfdec0ce6c5&t=90f7230e-bf5a-531b-8566-ad48d3e03bbb&l=102]

In the logs:
{noformat}
2023-08-28T00:18:56.3513860Z   Building wheels for collected packages: 
fastavro, crcmod, dill, hdfs, pymongo, docopt
2023-08-28T00:18:56.3567790Z Building wheel for fastavro 
(pyproject.toml): started
2023-08-28T00:18:56.3600990Z Building wheel for fastavro 
(pyproject.toml): finished with status 'error'
2023-08-28T00:18:56.3649060Z error: subprocess-exited-with-error
2023-08-28T00:18:56.3678250Z   
2023-08-28T00:18:56.3715590Z × Building wheel for fastavro 
(pyproject.toml) did not run successfully.
2023-08-28T00:18:56.3743070Z │ exit code: 1
2023-08-28T00:18:56.3776690Z ╰─> [150 lines of output]
2023-08-28T00:18:56.3804780Z running bdist_wheel
2023-08-28T00:18:56.3853290Z running build
2023-08-28T00:18:56.3878030Z running build_py
2023-08-28T00:18:56.3898470Z creating build
2023-08-28T00:18:56.3925210Z creating 
build/lib.macosx-10.9-x86_64-cpython-37
2023-08-28T00:18:56.3963550Z creating 
build/lib.macosx-10.9-x86_64-cpython-37/fastavro
{noformat}


[jira] [Commented] (FLINK-32758) PyFlink bounds are overly restrictive and outdated

2023-08-28 Thread Sergey Nuyanzin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17759560#comment-17759560
 ] 

Sergey Nuyanzin commented on FLINK-32758:
-

Also switching this to Blocker, since every nightly build is failing with this.


[jira] [Commented] (FLINK-32758) PyFlink bounds are overly restrictive and outdated

2023-08-24 Thread Dian Fu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17758790#comment-17758790
 ] 

Dian Fu commented on FLINK-32758:
-

Will backport to the release-1.17 and release-1.18 branches after one official nightly 
test of the master branch.


[jira] [Commented] (FLINK-32758) PyFlink bounds are overly restrictive and outdated

2023-08-24 Thread Dian Fu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17758788#comment-17758788
 ] 

Dian Fu commented on FLINK-32758:
-

Merged to master via 0dd6f9745b8df005e3aef286ae73092696ca2799
