Another input here: if you opened a Python PR in the last few days, you probably noticed that our test suites were broken by a transitive dependency of Beam that dropped Python 2 support but did not declare python_requires='>=3' in its setup.py [1]. This temporarily broke a subset of Beam Py2 users (those who did not explicitly pin the 'rsa' dependency), and it still affects Beam development [2].
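
For anyone who wants to see the two sides of this kind of breakage, here is a minimal sketch of the metadata involved. It is not taken from Beam's or rsa's actual setup.py: the package name, the version, and the "<4.1" bound are assumptions made purely for illustration, and the real last Py2-compatible release should be checked against the upstream issue.

```python
# Sketch only: metadata a setup.py might pass to setuptools.setup().
# All names and version bounds below are illustrative assumptions.

# Upstream side: a package that drops Python 2 declares python_requires,
# so installers that honor Requires-Python metadata skip the incompatible
# release when running on Python 2.
UPSTREAM_SETUP_KWARGS = {
    "name": "example-upstream-package",  # hypothetical package name
    "version": "2.0.0",                  # hypothetical version
    "python_requires": ">=3",
}

# Downstream side: a workaround while upstream metadata is missing.
# An environment marker restricts the dependency only on Python 2.
ASSUMED_PY2_BOUND = "<4.1"  # assumption, not a verified version number

DOWNSTREAM_INSTALL_REQUIRES = [
    'rsa{}; python_version < "3"'.format(ASSUMED_PY2_BOUND),
    'rsa; python_version >= "3"',
]
```

In a real setup.py these would be passed along the lines of setup(**UPSTREAM_SETUP_KWARGS) or setup(..., install_requires=DOWNSTREAM_INSTALL_REQUIRES). The downstream constraint is only a stopgap, since it has to be maintained for every dependency that makes the same omission.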

This is the second time [3] Beam has been affected by an issue of this kind, so supporting Python 2 is starting to slow down our development and add toil for maintainers of the packages we depend on (both directly and transitively).

[1] https://github.com/sybrenstuvel/python-rsa/issues/152
[2] https://lists.apache.org/thread.html/r9993b40b0c1cb8682ce56013165d4b80fdde0ee469a73bcb9466ddfb%40%3Cdev.beam.apache.org%3E
[3] https://github.com/hamcrest/PyHamcrest/issues/131

On Tue, Jun 9, 2020 at 4:06 PM Ahmet Altay <al...@google.com> wrote:

> Thank you for re-opening this Valentyn. I am in favor of EOLing py2
> support sooner than later. The reality is that we will not be effectively
> supporting beam python 2 for a long time while the ecosystem already EOLed
> python 2. That said, a significant chunk (but no longer a majority) of our
> users are still using python 2. Upgrades are painful, it might be
> especially painful nowadays. It would be good to hear counter view points,
> user voices related to this.
>
> On Thu, Jun 4, 2020 at 4:53 PM Valentyn Tymofieiev <valen...@google.com>
> wrote:
>
>> Back at the end of February we decided to revisit this conversation in 3
>> months. Do folks on this thread have any new input or perspective regarding
>> us balancing "user pain/contributor pain/our ability to continuously test
>> with python 2 in a shifting environment"?
>>
>> Some new information on my end is that we have been seeing steady
>> adoption of Python 3 among Beam Python users in Dataflow, particularly
>> strong adoption among streaming users, and Dataflow is sunsetting Python 2
>> support for all released Beam SDKs later this year [1]. We will have to
>> remove Python 2 Beam test suites that use Dataflow when Dataflow runner
>> disables Py2 support if this happens before Beam Py2 EOL (when we have to
>> remove all Py2 suites), including performance tests that still use Dataflow
>> on Python 3.
>>
>> I am curious how much motivation there is in the community at this moment
>> to continue Py2 support in Beam, whether any previous Py3 migration
>> blockers were resolved or any new blockers discovered among Beam users.
>>
>> [1] https://cloud.google.com/python/docs/python2-sunset/#dataflow
>>
>> On Fri, May 8, 2020 at 3:52 PM Valentyn Tymofieiev <valen...@google.com>
>> wrote:
>>
>>> That's good news! Thanks for sharing.
>>>
>>> Another datapoint, here are a few of Beam's dependencies that no longer
>>> release new py2 artifacts (I looked at REQUIRED_PACKAGES + aws, gcp, and
>>> interactive extras):
>>>
>>> hdfs
>>> numpy
>>> pyarrow
>>> ipython
>>>
>>> There are more if we include transitive dependencies and test-only
>>> packages. I also remember encountering one issue last month that was broken
>>> only on Py2, which we had to go back and fix.
>>>
>>> If others have noticed frictions related to ongoing Py2 support or have
>>> updates on previously mentioned Py3 migration blockers, feel free to post
>>> them.
>>>
>>> On Fri, May 8, 2020 at 9:19 AM Robert Bradshaw <rober...@google.com>
>>> wrote:
>>>
>>>> It hasn't been 3 months yet, but I wanted to call out a milestone that
>>>> Python 3 downloads crossed the 50% threshold on pypi, if just briefly.
>>>>
>>>> On Thu, Feb 13, 2020 at 12:40 AM Ismaël Mejía <ieme...@gmail.com>
>>>> wrote:
>>>> >
>>>> > > I would suggest re-evaluating this within the next 3 months again.
>>>> > > We need to balance between user pain/contributor pain/our ability to
>>>> > > continuously test with python 2 in a shifting environment.
>>>> >
>>>> > Good idea for the in 3 months evaluation, at that point also
>>>> > distributions will probably be phasing out python2 by default which
>>>> > definitely help in this direction.
>>>> > Thanks for updating the roadmap Ahmet
>>>> >
>>>> > On Thu, Feb 13, 2020 at 2:49 AM Ahmet Altay <al...@google.com> wrote:
>>>> >>
>>>> >> On Wed, Feb 12, 2020 at 1:29 AM Ismaël Mejía <ieme...@gmail.com> wrote:
>>>> >>>
>>>> >>> I am with Chad on this, we should probably extend it a bit more, even if it
>>>> >>> makes us struggle a bit at least we have some workarounds as Robert suggests,
>>>> >>> and as Chad said there are still many people playing the python 3 catchup game,
>>>> >>> so worth to support those users.
>>>> >>>
>>>> >>> But maybe it is worth to evaluate the current state later in the year.
>>>> >>
>>>> >> I would suggest re-evaluating this within the next 3 months again.
>>>> >> We need to balance between user pain/contributor pain/our ability to
>>>> >> continuously test with python 2 in a shifting environment.
>>>> >>
>>>> >>> In the
>>>> >>> meantime can someone please update our Roadmap in the website with this info and
>>>> >>> where we are with Python 3 support (it looks not up to date).
>>>> >>> https://beam.apache.org/roadmap/
>>>> >>
>>>> >> I made a minor change to update that page
>>>> >> (https://github.com/apache/beam/pull/10848). A more comprehensive
>>>> >> update to that page and linked
>>>> >> (https://beam.apache.org/roadmap/python-sdk/#python-3-support) would
>>>> >> still be welcome.
>>>> >>
>>>> >>> - Ismaël
>>>> >>>
>>>> >>> On Tue, Feb 4, 2020 at 10:49 PM Robert Bradshaw <rober...@google.com> wrote:
>>>> >>>>
>>>> >>>> On Tue, Feb 4, 2020 at 12:12 PM Chad Dombrova <chad...@gmail.com> wrote:
>>>> >>>> >>
>>>> >>>> >> Not to mention that all the nice work for the type hints will
>>>> >>>> >> have to be redone in the for 3.x.
>>>> >>>> >
>>>> >>>> > Note that there's a tool for automatically converting type
>>>> >>>> > comments to annotations: https://github.com/ilevkivskyi/com2ann
>>>> >>>> >
>>>> >>>> > So don't let that part bother you.
>>>> >>>>
>>>> >>>> +1, I wouldn't worry about what can be easily automated.
>>>> >>>>
>>>> >>>> > I'm curious what other features you'd like to be using in the
>>>> >>>> > Beam source that you cannot now.
>>>> >>>>
>>>> >>>> I hit things occasionally, e.g. I just ran into wanting keyword-only
>>>> >>>> arguments the other day.
>>>> >>>>
>>>> >>>> >> It seems the faster we drop support the better.
>>>> >>>> >
>>>> >>>> > I've already gone over my position on this, but a refresher for
>>>> >>>> > those who care: some of the key vendors that support my industry will not
>>>> >>>> > offer python3-compatible versions of their software until the 4th quarter
>>>> >>>> > of 2020. If Beam switches to python3-only before that point we may be
>>>> >>>> > forced to stop contributing features (note: I'm the guy who added the type
>>>> >>>> > hints :). Every month you can give us would be greatly appreciated.
>>>> >>>>
>>>> >>>> As another data point, we're still 80/20 on Py2/Py3 for downloads at
>>>> >>>> PyPi [1] (which I've heard should be taken with a grain of salt, but
>>>> >>>> likely isn't totally off). IMHO that ratio needs to be way higher for
>>>> >>>> Python 3 to consider dropping Python 2. It's pretty noisy, but say it
>>>> >>>> doubles every 3 months that would put us at least mid-year before we
>>>> >>>> hit a cross-over point. On the other hand Q4 2020 is probably a
>>>> >>>> stretch.
>>>> >>>>
>>>> >>>> We could consider whether it needs to be an all-or-nothing thing as
>>>> >>>> well. E.g. perhaps some features could be Python 3 only sooner than
>>>> >>>> the whole codebase. (This would have to be well justified.)
>>>> >>>> Another mitigation is that it is possible to mix Python 2 and Python 3 in the
>>>> >>>> same pipeline with portability, so if there's a library that you need
>>>> >>>> for one DoFn it doesn't mean you have to hold back your whole
>>>> >>>> pipeline.
>>>> >>>>
>>>> >>>> - Robert
>>>> >>>>
>>>> >>>> [1] https://pypistats.org/packages/apache-beam , and that 20% may just
>>>> >>>> be a spike.
>>>> >>>