FYI, Avro has published a new version, 1.9.2.1, that fixes the issue: https://issues.apache.org/jira/browse/AVRO-2737
I just submitted a PR to make the dependency consistent with Avro versioning and to verify that everything works as intended with the upgraded dependency on the Python SDK. Can you PTAL?
https://github.com/apache/beam/pull/10851

On Thu, Feb 13, 2020 at 9:39 AM Ismaël Mejía <ieme...@gmail.com> wrote:

> > I can argue for not pinning and bounding with major version ranges. This
> > gives flexibility to users to mix other third party libraries that share
> > common dependencies with Beam. Our expectation is that dependencies follow
> > semantic versioning and do not introduce breaking changes unless there is a
> > major version change. A good example of this is Beam's dependency on
> > "pytz>=2018.3". It is a simple wrapper around a time zone file. Latest
> > version of the dependency is 2019.3, it is updated a few times a year. Beam
> > users do not have to update Beam just to be able to use a later version of
> > it since Beam does not pin it.
>
> Avro does not follow semantic versioning (the first number corresponds to
> the version of the Avro binary format the release is compatible with, the
> second corresponds to the MAJOR and the third to the MINOR in semver), so we
> should then fix the upper bound to 1.10.0 instead of 2.0.0, considering that
> 1.10.x is expected before the summer and may contain breaking changes.
>
> > There is also a middle ground, where we can pin certain dependencies if
> > we are not confident about their releases, and allow ranges for the rest of
> > the dependencies. In general, we are currently following this practice.
>
> I see your point. Like many things in software it is all about tradeoffs,
> and it is good to find a middle ground: do we have a robust reproducible
> release experience, or do we deal with the annoyance of doing manual minor
> version upgrades? Choices, choices...
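
To make the range discussion above concrete, here is a minimal Python sketch of how the specifiers mentioned in this thread treat a few avro-python3 versions. It is an illustration only, not code from Beam or pip: the helper names (`parse_release`, `pad`, `satisfies`) are made up for this example, it handles only dotted numeric versions, and it ignores the `python_version` environment marker. Real PEP 440 matching covers many more cases.

```python
import operator

# Simplified stand-in for PEP 440 matching: only dotted numeric release
# segments such as "1.9.2.1" are supported (no pre/post/dev releases).
OPERATORS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    ">": operator.gt,
}


def parse_release(version):
    """Turn '1.9.2.1' into the tuple (1, 9, 2, 1)."""
    return tuple(int(part) for part in version.strip().split("."))


def pad(left, right):
    """Zero-pad both tuples to the same length, so 1.9.2 compares as 1.9.2.0."""
    width = max(len(left), len(right))
    return (left + (0,) * (width - len(left)),
            right + (0,) * (width - len(right)))


def satisfies(version, specifier):
    """Return True if `version` meets every comma-separated clause."""
    candidate = parse_release(version)
    for clause in specifier.split(","):
        clause = clause.strip()
        # Two-character operators are tried before one-character ones.
        for symbol in (">=", "<=", "==", "!=", "<", ">"):
            if clause.startswith(symbol):
                bound = parse_release(clause[len(symbol):])
                left, right = pad(candidate, bound)
                if not OPERATORS[symbol](left, right):
                    return False
                break
        else:
            raise ValueError("unsupported clause: %r" % clause)
    return True


# The range quoted in this thread, and the tighter upper bound suggested above.
CURRENT_SPEC = ">=1.8.1,!=1.9.2,<2.0.0"
TIGHTER_SPEC = ">=1.8.1,!=1.9.2,<1.10.0"

for candidate in ("1.9.1", "1.9.2", "1.9.2.1", "1.10.0"):
    print(candidate,
          satisfies(candidate, CURRENT_SPEC),
          satisfies(candidate, TIGHTER_SPEC))
```

Under these simplified rules both ranges skip the broken 1.9.2 release while still admitting 1.9.1 and the fixed 1.9.2.1. Only the tighter bound keeps a future 1.10.0 out.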
>
> On Thu, Feb 13, 2020 at 2:26 AM Ahmet Altay <al...@google.com> wrote:
>
>> On Wed, Feb 12, 2020 at 12:54 PM Ismaël Mejía <ieme...@gmail.com> wrote:
>>
>>> Independently of the bug in the dependency release, the fact that the
>>> Beam Python SDK does not have pinned fixed dependency numbers is
>>> error-prone. We may continue to have this kind of problem until we fix
>>> this (with other dependencies too). In the Java SDK we do not accept this
>>> type of dynamic dependency numbers and python should probably follow this
>>> practice to avoid issues like the present one.
>>>
>>> Why don't we just do:
>>>
>>> 'avro-python3==1.9.1',
>>>
>>> instead of the current:
>>>
>>> 'avro-python3>=1.8.1,!=1.9.2,<2.0.0; python_version >= "3.0"',
>>>
>>
>> I agree this is error prone. Your argument for pinning makes sense and I
>> agree with it.
>>
>> I can argue for not pinning and bounding with major version ranges. This
>> gives flexibility to users to mix other third party libraries that share
>> common dependencies with Beam. Our expectation is that dependencies follow
>> semantic versioning and do not introduce breaking changes unless there is a
>> major version change. A good example of this is Beam's dependency on
>> "pytz>=2018.3". It is a simple wrapper around a time zone file. Latest
>> version of the dependency is 2019.3, it is updated a few times a year. Beam
>> users do not have to update Beam just to be able to use a later version of
>> it since Beam does not pin it.
>>
>> There is also a middle ground, where we can pin certain dependencies if
>> we are not confident about their releases, and allow ranges for the rest of
>> the dependencies. In general, we are currently following this practice.
>>
>>> On Wed, Feb 12, 2020 at 9:14 PM Ahmet Altay <al...@google.com> wrote:
>>>
>>>> Related: we have dependencies on avro, avro-python3, and fastavro.
>>>> fastavro supports both python 2 and 3.
>>>> Could we reduce this dependency list and depend only on fastavro? If we
>>>> need avro and avro-python3 for the purposes of testing only, we can move
>>>> them to test only dependencies.
>>>>
>>>> +Chamikara Jayalath <chamik...@google.com>, because I vaguely remember
>>>> him working on this.
>>>>
>>>> The reason I am calling for this is that the impact of bad dependency
>>>> releases is high. All previously released Beam versions will be impacted.
>>>> Reducing the dependency list will reduce the risk.
>>>>
>>>> Ahmet
>>>>
>>>> On Wed, Feb 12, 2020 at 12:02 PM Ahmet Altay <al...@google.com> wrote:
>>>>
>>>>> Thank you Valentyn!
>>>>>
>>>>> On Wed, Feb 12, 2020 at 11:32 AM Valentyn Tymofieiev <valen...@google.com> wrote:
>>>>>
>>>>>> Yes, otherwise all Python tests will continue to fail until Avro
>>>>>> comes up with a new release. Sent:
>>>>>> https://github.com/apache/beam/pull/10844
>>>>>>
>>>>>> On Wed, Feb 12, 2020 at 11:08 AM Ahmet Altay <al...@google.com> wrote:
>>>>>>
>>>>>>> Should we update Beam's setup.py to skip this avro-python3 version?
>>>>>>>
>>>>>>> On Wed, Feb 12, 2020 at 10:57 AM Alan Krumholz <alan.krumh...@betterup.co> wrote:
>>>>>>>
>>>>>>>> Makes sense. I'll add this workaround for now.
>>>>>>>> Thanks so much for your help!
>>>>>>>>
>>>>>>>> On Wed, Feb 12, 2020 at 10:33 AM Valentyn Tymofieiev <valen...@google.com> wrote:
>>>>>>>>
>>>>>>>>> Alan, Dataflow workers preinstall Beam SDK dependencies, including
>>>>>>>>> a working version of avro-python3. So after reading your email once
>>>>>>>>> again, I think in your case you were not able to install Beam SDK
>>>>>>>>> locally. So a workaround for you would be to
>>>>>>>>> `pip install avro-python3==1.9.1` or `pip install pycodestyle`
>>>>>>>>> before installing Beam, until AVRO-2737 is resolved.
>>>>>>>>>
>>>>>>>>> On Wed, Feb 12, 2020 at 10:21 AM Valentyn Tymofieiev <valen...@google.com> wrote:
>>>>>>>>>
>>>>>>>>>> Ah, there's already
>>>>>>>>>> https://issues.apache.org/jira/browse/AVRO-2737 and it received
>>>>>>>>>> attention.
>>>>>>>>>>
>>>>>>>>>> On Wed, Feb 12, 2020 at 10:19 AM Valentyn Tymofieiev <valen...@google.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Opened https://issues.apache.org/jira/browse/AVRO-2738
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Feb 12, 2020 at 10:14 AM Valentyn Tymofieiev <valen...@google.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Here's a short repro:
>>>>>>>>>>>>
>>>>>>>>>>>> :~$ docker run -it --entrypoint=/bin/bash python:3.7-stretch
>>>>>>>>>>>> root@04b45a100d16:/# pip install avro-python3
>>>>>>>>>>>> Collecting avro-python3
>>>>>>>>>>>>   Downloading avro-python3-1.9.2.tar.gz (37 kB)
>>>>>>>>>>>>     ERROR: Command errored out with exit status 1:
>>>>>>>>>>>>      command: /usr/local/bin/python -c 'import sys, setuptools,
>>>>>>>>>>>> tokenize; sys.argv[0] =
>>>>>>>>>>>> '"'"'/tmp/pip-install-mmy4vspt/avro-python3/setup.py'"'"';
>>>>>>>>>>>> __file__='"'"'/tmp/pip-install-mmy4vspt/avro-python3/setup.py'"'"';f=getattr(tokenize,
>>>>>>>>>>>> '"'"'open'"'"',
>>>>>>>>>>>> open)(__file__);code=f.read().replace('"'"'\r\n'"'"',
>>>>>>>>>>>> '"'"'\n'"'"');f.close();exec(compile(code, __file__,
>>>>>>>>>>>> '"'"'exec'"'"'))'
>>>>>>>>>>>> egg_info --egg-base
>>>>>>>>>>>> /tmp/pip-install-mmy4vspt/avro-python3/pip-egg-info
>>>>>>>>>>>>          cwd: /tmp/pip-install-mmy4vspt/avro-python3/
>>>>>>>>>>>>     Complete output (5 lines):
>>>>>>>>>>>>     Traceback (most recent call last):
>>>>>>>>>>>>       File "<string>", line 1, in <module>
>>>>>>>>>>>>       File "/tmp/pip-install-mmy4vspt/avro-python3/setup.py", line 41, in <module>
>>>>>>>>>>>>         import pycodestyle
>>>>>>>>>>>>     ModuleNotFoundError: No module named 'pycodestyle'
>>>>>>>>>>>>     ----------------------------------------
>>>>>>>>>>>> ERROR: Command errored out with exit status 1: python setup.py
>>>>>>>>>>>> egg_info Check the logs for full command output.
>>>>>>>>>>>> root@04b45a100d16:/#
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Feb 12, 2020 at 10:14 AM Valentyn Tymofieiev <valen...@google.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Yes, it is a bug in the recent Avro release. We should report
>>>>>>>>>>>>> it to the Avro maintainers. The workaround is to downgrade
>>>>>>>>>>>>> avro-python3 to 1.9.1, for example via requirements.txt.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Feb 12, 2020 at 10:06 AM Steve Niemitz <sniem...@apache.org> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> avro-python3 1.9.2 was released on pypi 4 hours ago, and
>>>>>>>>>>>>>> added pycodestyle as a dependency, probably related?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Feb 12, 2020 at 1:03 PM Luke Cwik <lc...@google.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> +dev <dev@beam.apache.org>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> There was recently an update to add autoformatting to the
>>>>>>>>>>>>>>> Python SDK[1].
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I'm seeing this during testing of a PR as well.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 1:
>>>>>>>>>>>>>>> https://lists.apache.org/thread.html/448bb5c2d73fbd74eec7aacb5f28fa2f9d791784c2e53a2e3325627a%40%3Cdev.beam.apache.org%3E
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Feb 12, 2020 at 9:57 AM Alan Krumholz <alan.krumh...@betterup.co> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Some more information for this as I still can't get to fix
>>>>>>>>>>>>>>>> it....
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> This job is triggered using the beam[gcp] python sdk from a
>>>>>>>>>>>>>>>> KubeFlow Pipelines component which runs on top of docker image:
>>>>>>>>>>>>>>>> tensorflow/tensorflow:1.13.1-py3
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I just checked and that image hasn't been updated recently.
>>>>>>>>>>>>>>>> I also redeployed my pipeline to another (older) deployment of
>>>>>>>>>>>>>>>> KFP and it gives me the same error (which tells me this isn't
>>>>>>>>>>>>>>>> an internal KFP problem).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The exact same pipeline/code running on the exact same
>>>>>>>>>>>>>>>> image has been running fine for days. Did anything change on
>>>>>>>>>>>>>>>> the beam/dataflow side since yesterday morning?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks for your help! This is a production pipeline that is
>>>>>>>>>>>>>>>> not running for us :(
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, Feb 12, 2020 at 7:21 AM Alan Krumholz <alan.krumh...@betterup.co> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi, I have a scheduled daily job that I have been running
>>>>>>>>>>>>>>>>> fine in dataflow for days now.
>>>>>>>>>>>>>>>>> We haven't changed anything in this code but this morning's
>>>>>>>>>>>>>>>>> run failed (it couldn't even spin up the job).
>>>>>>>>>>>>>>>>> The job submits a setup.py file (that also hasn't changed)
>>>>>>>>>>>>>>>>> but maybe it is causing the problem? (based on the error I'm
>>>>>>>>>>>>>>>>> getting)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Anyone else having the same issue? Or know how to fix it?
>>>>>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> ERROR: Complete output from command python setup.py egg_info:
>>>>>>>>>>>>>>>>> ERROR: Traceback (most recent call last):
>>>>>>>>>>>>>>>>>   File "<string>", line 1, in <module>
>>>>>>>>>>>>>>>>>   File "/tmp/pip-install-42zyi89t/avro-python3/setup.py", line 41, in <module>
>>>>>>>>>>>>>>>>>     import pycodestyle
>>>>>>>>>>>>>>>>> ImportError: No module named 'pycodestyle'
>>>>>>>>>>>>>>>>> ----------------------------------------
>>>>>>>>>>>>>>>>> ERROR: Command "python setup.py egg_info" failed with
>>>>>>>>>>>>>>>>> error code 1 in /tmp/pip-install-42zyi89t/avro-python3/
>>>>>>>>>>>>>>>>> ERROR: Complete output from command python setup.py egg_info:
>>>>>>>>>>>>>>>>> ERROR: Traceback (most recent call last):
>>>>>>>>>>>>>>>>>   File "<string>", line 1, in <module>
>>>>>>>>>>>>>>>>>   File "/tmp/pip-install-wrqytf9a/avro-python3/setup.py", line 41, in <module>
>>>>>>>>>>>>>>>>>     import pycodestyle
>>>>>>>>>>>>>>>>> ImportError: No module named 'pycodestyle'
>>>>>>>>>>>>>>>>> ----------------------------------------
>>>>>>>>>>>>>>>>> ERROR: Command "python setup.py egg_info" failed with
>>>>>>>>>>>>>>>>> error code 1 in /tmp/pip-install-wrqytf9a/avro-python3/
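
The tracebacks above show the install failing on a module-level `import pycodestyle` at line 41 of avro-python3's setup.py. As a rough sketch of one way a setup script could avoid that class of failure, a lint-only import can be deferred and made optional. This is not the actual fix from AVRO-2737. The helper names `optional_import` and `run_style_check` are hypothetical, and the sketch assumes the style check is a developer convenience rather than something installation needs.

```python
import importlib


def optional_import(module_name):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:  # ModuleNotFoundError is a subclass of ImportError
        return None


def run_style_check(paths):
    """Hypothetical lint step that is skipped when pycodestyle is missing."""
    pycodestyle = optional_import("pycodestyle")
    if pycodestyle is None:
        # Assumption: linting is optional, so a missing linter must not
        # break `pip install`.
        return None
    style = pycodestyle.StyleGuide(quiet=True)
    return style.check_files(paths).total_errors
```

With the import deferred like this, `python setup.py egg_info` would no longer need pycodestyle just to compute package metadata. Only an explicit lint step would.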