Another input here:

If you opened a Python PR in the last few days, you probably noticed that
our test suites were broken by a transitive dependency of Beam that dropped
Python 2 support but did not declare python_requires>=3 in its setup.py
[1]. This temporarily broke a subset of Beam Py2 users (those who did not
explicitly pin the 'rsa' dependency), and still affects Beam
development [2].
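
For anyone unfamiliar with the mechanism, here is a rough sketch of the two
sides of this problem. The package name, version, and the "<4.1" bound are
assumptions for illustration only, not the exact metadata of 'rsa' or the
exact fix applied in Beam.

```python
# Upstream side: keyword arguments a Python 3-only package would pass to
# setuptools.setup() in its setup.py. With python_requires declared, pip 9.0
# and later skips the release on unsupported interpreters instead of
# installing a package that fails at import time.
SETUP_KWARGS = dict(
    name="example-package",   # hypothetical package name
    version="2.0.0",          # hypothetical first Python 3-only release
    python_requires=">=3",
)

# Downstream side: when python_requires is missing, a consumer can cap the
# dependency on Python 2 only, using PEP 508 environment markers.
# The "<4.1" upper bound is an assumption for illustration.
REQUIRED_PACKAGES = [
    'rsa<4.1; python_version < "3"',
    'rsa; python_version >= "3"',
]

# In a real setup.py this would be: setuptools.setup(**SETUP_KWARGS)
```

The first part is what we would like upstream packages to declare; the second
is the kind of workaround that Py2 users or Beam itself would need otherwise.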

This is the second time [3] Beam has been affected by an issue of this kind.
Supporting Python 2 is starting to slow down our development and adds toil
for maintainers of packages we depend on (both directly and transitively).

[1] https://github.com/sybrenstuvel/python-rsa/issues/152
[2] https://lists.apache.org/thread.html/r9993b40b0c1cb8682ce56013165d4b80fdde0ee469a73bcb9466ddfb%40%3Cdev.beam.apache.org%3E
[3] https://github.com/hamcrest/PyHamcrest/issues/131

On Tue, Jun 9, 2020 at 4:06 PM Ahmet Altay <al...@google.com> wrote:

> Thank you for re-opening this, Valentyn. I am in favor of EOLing py2
> support sooner rather than later. The reality is that we will not be able
> to support Beam Python 2 effectively for long when the ecosystem has
> already EOLed Python 2. That said, a significant chunk (but no longer a
> majority) of our users are still using Python 2. Upgrades are painful, and
> might be especially painful nowadays. It would be good to hear
> counterpoints and user voices related to this.
>
> On Thu, Jun 4, 2020 at 4:53 PM Valentyn Tymofieiev <valen...@google.com>
> wrote:
>
>> Back at the end of February we decided to revisit this conversation in 3
>> months. Do folks on this thread have any new input or perspective regarding
>> us balancing "user pain/contributor pain/our ability to continuously test
>> with python 2 in a shifting environment"?
>>
>> Some new information on my end is that we have been seeing steady
>> adoption of Python 3 among Beam Python users in Dataflow, particularly
>> strong adoption among streaming users, and Dataflow is sunsetting Python 2
>> support for all released Beam SDKs later this year [1]. We will have to
>> remove the Python 2 Beam test suites that use Dataflow when the Dataflow
>> runner disables Py2 support, if this happens before Beam Py2 EOL (when we
>> have to remove all Py2 suites), including performance tests that still use
>> Dataflow on Python 3.
>>
>> I am curious how much motivation there is in the community at this moment
>> to continue Py2 support in Beam, and whether any previous Py3 migration
>> blockers were resolved or any new blockers discovered among Beam users.
>>
>> [1] https://cloud.google.com/python/docs/python2-sunset/#dataflow
>>
>> On Fri, May 8, 2020 at 3:52 PM Valentyn Tymofieiev <valen...@google.com>
>> wrote:
>>
>>> That's good news! Thanks for sharing.
>>>
>>> Another datapoint, here are a few of Beam's dependencies that no longer
>>> release new py2 artifacts (I looked at REQUIRED_PACKAGES +  aws, gcp, and
>>> interactive extras):
>>>
>>> hdfs
>>> numpy
>>> pyarrow
>>> ipython
>>>
>>> There are more if we include transitive dependencies and test-only
>>> packages. I also remember encountering one issue last month that was broken
>>> only on Py2, which we had to go back and fix.
>>>
>>> If others have noticed frictions related to ongoing Py2 support or have
>>> updates on previously mentioned Py3 migration blockers, feel free to post
>>> them.
>>>
>>> On Fri, May 8, 2020 at 9:19 AM Robert Bradshaw <rober...@google.com>
>>> wrote:
>>>
>>>> It hasn't been 3 months yet, but I wanted to call out a milestone that
>>>> Python 3 downloads crossed the 50% threshold on pypi, if just briefly.
>>>>
>>>> On Thu, Feb 13, 2020 at 12:40 AM Ismaël Mejía <ieme...@gmail.com>
>>>> wrote:
>>>> >
>>>> > > I would suggest re-evaluating this within the next 3 months again.
>>>> > > We need to balance between user pain/contributor pain/our ability to
>>>> > > continuously test with python 2 in a shifting environment.
>>>> >
>>>> > Re-evaluating in 3 months is a good idea. By then distributions will
>>>> > probably also be phasing out Python 2 by default, which will
>>>> > definitely help in this direction.
>>>> > Thanks for updating the roadmap, Ahmet.
>>>> >
>>>> >
>>>> > On Thu, Feb 13, 2020 at 2:49 AM Ahmet Altay <al...@google.com> wrote:
>>>> >>
>>>> >>
>>>> >>
>>>> >> On Wed, Feb 12, 2020 at 1:29 AM Ismaël Mejía <ieme...@gmail.com>
>>>> wrote:
>>>> >>>
>>>> >>> I am with Chad on this; we should probably extend it a bit more. Even
>>>> >>> if it makes us struggle a bit, at least we have some workarounds, as
>>>> >>> Robert suggests, and as Chad said there are still many people playing
>>>> >>> the Python 3 catch-up game, so it is worth supporting those users.
>>>> >>>
>>>> >>>
>>>> >>> But maybe it is worth evaluating the current state later in the year.
>>>> >>
>>>> >>
>>>> >> I would suggest re-evaluating this within the next 3 months again.
>>>> We need to balance between user pain/contributor pain/our ability to
>>>> continuously test with python 2 in a shifting environment.
>>>> >>
>>>> >>>
>>>> >>> In the meantime, can someone please update our Roadmap on the website
>>>> >>> with this info and where we are with Python 3 support (it looks out of
>>>> >>> date)?
>>>> >>> https://beam.apache.org/roadmap/
>>>> >>
>>>> >>
>>>> >> I made a minor change to update that page
>>>> >> (https://github.com/apache/beam/pull/10848). A more comprehensive
>>>> >> update to that page and the linked page
>>>> >> (https://beam.apache.org/roadmap/python-sdk/#python-3-support) would
>>>> >> still be welcome.
>>>> >>
>>>> >>>
>>>> >>>
>>>> >>> - Ismaël
>>>> >>>
>>>> >>>
>>>> >>> On Tue, Feb 4, 2020 at 10:49 PM Robert Bradshaw <
>>>> rober...@google.com> wrote:
>>>> >>>>
>>>> >>>>  On Tue, Feb 4, 2020 at 12:12 PM Chad Dombrova <chad...@gmail.com>
>>>> wrote:
>>>> >>>> >>
>>>> >>>> >>  Not to mention that all the nice work for the type hints will
>>>> >>>> >>  have to be redone for 3.x.
>>>> >>>> >
>>>> >>>> > Note that there's a tool for automatically converting type
>>>> comments to annotations: https://github.com/ilevkivskyi/com2ann
>>>> >>>> >
>>>> >>>> > So don't let that part bother you.
>>>> >>>>
>>>> >>>> +1, I wouldn't worry about what can be easily automated.
>>>> >>>>
>>>> >>>> > I'm curious what other features you'd like to be using in the
>>>> Beam source that you cannot now.
>>>> >>>>
>>>> >>>> I hit things occasionally, e.g. I just ran into wanting
>>>> keyword-only
>>>> >>>> arguments the other day.
>>>> >>>>
>>>> >>>> >> It seems the faster we drop support the better.
>>>> >>>> >
>>>> >>>> >
>>>> >>>> > I've already gone over my position on this, but a refresher for
>>>> those who care:  some of the key vendors that support my industry will not
>>>> offer python3-compatible versions of their software until the 4th quarter
>>>> of 2020.  If Beam switches to python3-only before that point we may be
>>>> forced to stop contributing features (note: I'm the guy who added the type
>>>> hints :).   Every month you can give us would be greatly appreciated.
>>>> >>>>
>>>> >>>> As another data point, we're still 80/20 on Py2/Py3 for downloads
>>>> at
>>>> >>>> PyPi [1] (which I've heard should be taken with a grain of salt,
>>>> but
>>>> >>>> likely isn't totally off). IMHO that ratio needs to be way higher
>>>> for
>>>> >>>> Python 3 to consider dropping Python 2. It's pretty noisy, but say
>>>> it
>>>> >>>> doubles every 3 months that would put us at least mid-year before
>>>> we
>>>> >>>> hit a cross-over point. On the other hand Q4 2020 is probably a
>>>> >>>> stretch.
>>>> >>>>
>>>> >>>> We could consider whether it needs to be an all-or-nothing thing as
>>>> >>>> well. E.g. perhaps some features could be Python 3 only sooner than
>>>> >>>> the whole codebase. (This would have to be well justified.) Another
>>>> >>>> mitigation is that it is possible to mix Python 2 and Python 3 in
>>>> the
>>>> >>>> same pipeline with portability, so if there's a library that you
>>>> need
>>>> >>>> for one DoFn it doesn't mean you have to hold back your whole
>>>> >>>> pipeline.
>>>> >>>>
>>>> >>>> - Robert
>>>> >>>>
>>>> >>>> [1] https://pypistats.org/packages/apache-beam , and that 20% may
>>>> just
>>>> >>>> be a spike.
>>>>
>>>
