[ https://issues.apache.org/jira/browse/BEAM-6765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Barry Hart reopened BEAM-6765:
------------------------------

Reopening after others have mentioned they have the same issue. I initially closed it because on DataFlow, there _is_ a workaround: when submitting the job, use a different {{requirements.txt}} with {{apache-beam}} and {{pyarrow}} removed.

I do believe there is an issue or two here worth addressing, though:

1. It seems that Beam lists the libraries used for _testing_ it in its {{setup.py}}. These libraries probably don't belong in {{setup.py}}. Only libraries _used_ by Beam itself should be listed there.
2. Worse, some of the libraries listed (a) have very narrow allowed version ranges, and (b) in the case of {{google-cloud-core}}, the dependency was newly added in 2.11 _and_ appears to be a really old version. My application uses {{google-cloud-logging}}, and I had to downgrade by two releases (from 1.10 to 1.8, IIRC) simply to get it to co-exist, version-wise, with the version of {{google-cloud-core}} that Beam wants.

> Beam 2.10.0 for Python requires pyarrow 0.11.1, which is not installable in
> Google Cloud DataFlow
> -------------------------------------------------------------------------------------------------
>
>                 Key: BEAM-6765
>                 URL: https://issues.apache.org/jira/browse/BEAM-6765
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-py-core
>    Affects Versions: 2.10.0
>            Reporter: Barry Hart
>            Priority: Major
>             Fix For: 2.10.0
>
>
> When trying to run a Beam 2.10.0 job in Google Cloud DataFlow, I get the
> following error:
> {noformat}
> Collecting pyarrow==0.11.1 (from -r requirements.txt (line 51))
> Could not find a version that satisfies the requirement pyarrow==0.11.1 (from
> -r requirements.txt (line 51)) (from versions: 0.9.0, 0.10.0, 0.11.0, 0.12.1)
> No matching distribution found for pyarrow==0.11.1 (from -r requirements.txt
> (line 51))
> {noformat}
> This version, while it exists, cannot be installed in Google Cloud DataFlow,
> because it is only available on PyPI as a wheel, and DataFlow does not allow
> installing binary packages, only source packages.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)