Re: [DISCUSS] How many Python 3.x minor versions should Beam Python SDK aim to support concurrently?

Ismaël Mejía Fri, 28 Feb 2020 01:26:40 -0800

One interesting variable that has not been mentioned is which versions of
Python 3 are available to users via their distribution channels (the Linux
distributions they use to develop and run their pipelines).


- RHEL 8 users have Python 3.6 available
- RHEL 7 users have Python 3.6 available
- Debian 10/Ubuntu 18.04 users have Python 3.7/3.6 available, respectively
- Debian 9/Ubuntu 16.04 users have Python 3.5 available

We should consider this when we evaluate future support removals.
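
To make the check concrete, here is a minimal sketch of how the list above
could be queried when a removal is being evaluated. The table simply restates
the versions listed above (reading "3.7/3.6" as Debian 10 and Ubuntu 18.04
respectively); the names DISTRO_PYTHON3 and distros_left_behind are
hypothetical and not part of Beam.

```python
# Default Python 3 version per distribution, restated from the list above.
# The dictionary and function names are illustrative only, not Beam APIs.
DISTRO_PYTHON3 = {
    "RHEL 8": (3, 6),
    "RHEL 7": (3, 6),
    "Debian 10": (3, 7),
    "Ubuntu 18.04": (3, 6),
    "Debian 9": (3, 5),
    "Ubuntu 16.04": (3, 5),
}


def distros_left_behind(min_supported):
    """Return distros whose packaged Python 3 is older than min_supported."""
    return sorted(
        name
        for name, version in DISTRO_PYTHON3.items()
        if version < min_supported
    )


# Which distros would lose out if the minimum became Python 3.6?
print(distros_left_behind((3, 6)))
```

With 3.6 as the minimum, only the Debian 9 and Ubuntu 16.04 entries are
affected, which matches the reasoning below.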

Given that the distros that ship Python 3.5 are about 4 years old, and since
Python 3.5 is also losing LTS support soon, it is probably OK to stop
supporting it in Beam, as Robert suggests.
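
For the advance warning discussed further down the thread, something along
these lines might be enough. This is only a sketch: MIN_SUPPORTED, the
function name, and the message text are assumptions, and this is not how the
Beam SDK actually implements its version checks.

```python
import sys
import warnings

# Hypothetical cutoff: the oldest minor version that would remain supported.
MIN_SUPPORTED = (3, 6)


def warn_if_unsupported(version_info=None):
    """Warn when the interpreter is older than MIN_SUPPORTED.

    Returns True if a warning was issued, False otherwise.
    """
    if version_info is None:
        version_info = sys.version_info
    version = tuple(version_info[:2])
    if version < MIN_SUPPORTED:
        warnings.warn(
            "Python %d.%d is approaching end of life; support for it may be "
            "removed in a future release." % version,
            UserWarning,
        )
        return True
    return False


# Check the running interpreter once, e.g. at import time.
warn_if_unsupported()
```

UserWarning is used here because Python hides DeprecationWarning by default
in most contexts, so end users would likely never see it.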


On Thu, Feb 27, 2020 at 3:57 AM Valentyn Tymofieiev <valen...@google.com>
wrote:

> Thanks everyone for sharing your perspectives so far. It sounds like we
> can mitigate the cost of test infrastructure by having:
> - a selection of (fast) tests that we will want to run against all Python
> versions we support.
> - high priority Python versions, which we will test extensively.
> - infrequent postcommit tests that exercise low-priority versions.
> We will need test infrastructure improvements to have the flexibility of
> designating versions as high-pri/low-pri and to minimize the effort required
> to adopt a new version.
>
> There is still a question of how long we want to support old Py3.x
> versions. As mentioned above, I think we should not support them beyond EOL
> (5 years after a release). I wonder if that is still too long. The cost of
> supporting a version may include:
>  - Developing against an older Python version
>  - Release overhead (building & storing containers, wheels, doing release
> validation)
>  - Complexity / development cost to support the quirks of the minor
> versions.
>
> We can decide to drop support, after, say, 4 years, or after usage drops
> below a threshold, or decide on a case-by-case basis. Thoughts? Also asked
> for feedback on user@ [1]
>
> [1]
> https://lists.apache.org/thread.html/r630a3b55aa8e75c68c8252ea6f824c3ab231ad56e18d916dfb84d9e8%40%3Cuser.beam.apache.org%3E
>
> On Wed, Feb 26, 2020 at 5:27 PM Robert Bradshaw <rober...@google.com>
> wrote:
>
>> On Wed, Feb 26, 2020 at 5:21 PM Valentyn Tymofieiev <valen...@google.com>
>> wrote:
>> >
>> > > +1 to consulting users.
>> > I will message user@ as well and point to this thread.
>> >
>> > > I would propose getting in warnings about 3.5 EoL well ahead of time.
>> > I think we should document on our website, and  in the code (warnings)
>> that users should not expect SDKs to be supported in Beam beyond the EOL.
>> If we want to have flexibility to drop support earlier than EOL, we need to
>> be more careful with messaging because users might otherwise expect that
>> support will last until EOL, if we mention EOL date.
>>
>> +1
>>
>> > I am hoping that we can establish a consensus for when we will be
>> dropping support for a version, so that we don't have to discuss it on a
>> case by case basis in the future.
>> >
>> > > I think it would make sense to add support for 3.8 right away (or at
>> least get a good sense of what work needs to be done and what our
>> dependency situation is like)
>> > https://issues.apache.org/jira/browse/BEAM-8494 is a starting point. I
>> tried 3.8 a while ago and some dependencies failed to install; I checked
>> again just now, and the SDK is "installable" after minor changes. Some tests don't
>> pass. BEAM-8494 does not have an owner atm, and if anyone is interested I'm
>> happy to give further pointers and help get started.
>> >
>> > > For the 3.x series, I think we will get the most signal out of the
>> lowest and highest version, and can get by with smoke tests +
>> > infrequent post-commits for the ones between.
>> >
>> > > I agree with having low-frequency tests for low-priority versions.
>> Low-priority versions could be determined according to least usage.
>> >
>> > These are good ideas. Do you think we will want to have the ability to
>> run some (inexpensive) tests for all versions frequently (on presubmits),
>> or is this extra complexity that can be avoided? I am thinking about type
>> inference, for example. AFAIK the inference logic is very sensitive to the
>> version. Would it be acceptable to catch errors there in infrequent
>> postcommits, or would an early signal be preferred?
>>
>> This is a good example--the type inference tests are sensitive to
>> version (due to using internal details and relying on the
>> still-evolving typing module) but also run in ~15 seconds. I think
>> these should be in precommits. We just don't need to run every test
>> for every version.
>>
>> > On Wed, Feb 26, 2020 at 5:17 PM Kyle Weaver <kcwea...@google.com>
>> wrote:
>> >>
>> >> Oh, I didn't see Robert's earlier email:
>> >>
>> >> > Currently 3.5 downloads sit at 3.7%, or about
>> >> > 20% of all Python 3 downloads.
>> >>
>> >> Where did these numbers come from?
>> >>
>> >> On Wed, Feb 26, 2020 at 5:15 PM Kyle Weaver <kcwea...@google.com>
>> wrote:
>> >>>
>> >>> > I agree with having low-frequency tests for low-priority versions.
>> >>> > Low-priority versions could be determined according to least usage.
>> >>>
>> >>> +1. While the difference may not be as great between, say, 3.6 and
>> 3.7, I think that if we had to choose, it would be more useful to test the
>> versions folks are actually using the most. 3.5 only has about a third of
>> the Docker pulls of 3.6 or 3.7 [1]. Does anyone have other usage statistics
>> we can consult?
>> >>>
>> >>> [1] https://hub.docker.com/search?q=apachebeam%2Fpython&type=image
>> >>>
>> >>> On Wed, Feb 26, 2020 at 5:00 PM Ruoyun Huang <ruo...@google.com>
>> wrote:
>> >>>>
>> >>>> I feel 4+ versions take too long to run anything.
>> >>>>
>> >>>> I would vote for lowest + highest, 2 versions.
>> >>>>
>> >>>> On Wed, Feb 26, 2020 at 4:52 PM Udi Meiri <eh...@google.com> wrote:
>> >>>>>
>> >>>>> I agree with having low-frequency tests for low-priority versions.
>> >>>>> Low-priority versions could be determined according to least usage.
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>> On Wed, Feb 26, 2020 at 4:06 PM Robert Bradshaw <
>> rober...@google.com> wrote:
>> >>>>>>
>> >>>>>> On Wed, Feb 26, 2020 at 3:29 PM Kenneth Knowles <k...@apache.org>
>> wrote:
>> >>>>>> >
>> >>>>>> > Are these divergent enough that they all need to consume testing
>> resources? For example can lower priority versions be daily runs or some
>> such?
>> >>>>>>
>> >>>>>> For the 3.x series, I think we will get the most signal out of the
>> >>>>>> lowest and highest version, and can get by with smoke tests +
>> >>>>>> infrequent post-commits for the ones between.
>> >>>>>>
>> >>>>>> > Kenn
>> >>>>>> >
>> >>>>>> > On Wed, Feb 26, 2020 at 3:25 PM Robert Bradshaw <
>> rober...@google.com> wrote:
>> >>>>>> >>
>> >>>>>> >> +1 to consulting users. Currently 3.5 downloads sit at 3.7%, or
>> about
>> >>>>>> >> 20% of all Python 3 downloads.
>> >>>>>> >>
>> >>>>>> >> I would propose getting in warnings about 3.5 EoL well ahead of
>> time,
>> >>>>>> >> at the very least as part of the 2.7 warning.
>> >>>>>> >>
>> >>>>>> >> Fortunately, supporting multiple 3.x versions is significantly
>> easier
>> >>>>>> >> than spanning 2.7 and 3.x. I would rather not impose an
>> ordering on
>> >>>>>> >> dropping 3.5 and adding 3.8 but consider their merits
>> independently.
>> >>>>>> >>
>> >>>>>> >>
>> >>>>>> >> On Wed, Feb 26, 2020 at 3:16 PM Kyle Weaver <
>> kcwea...@google.com> wrote:
>> >>>>>> >> >
>> >>>>>> >> > 5 versions is too many IMO. We've had issues with Python
>> precommit resource usage in the past, and adding another version would
>> surely exacerbate those issues. And we have also already had to leave out
>> certain features on 3.5 [1]. Therefore, I am in favor of dropping 3.5
>> before adding 3.8. After dropping Python 2 and adding 3.8, that will leave
>> us with the latest three minor versions (3.6, 3.7, 3.8), which I think is
>> closer to the "sweet spot." Though I would be interested in hearing if
>> there are any users who would prefer we continue supporting 3.5.
>> >>>>>> >> >
>> >>>>>> >> > [1]
>> https://github.com/apache/beam/blob/8658b95545352e51f35959f38334f3c7df8b48eb/sdks/python/apache_beam/runners/portability/flink_runner.py#L55
>> >>>>>> >> >
>> >>>>>> >> > On Wed, Feb 26, 2020 at 3:00 PM Valentyn Tymofieiev <
>> valen...@google.com> wrote:
>> >>>>>> >> >>
>> >>>>>> >> >> I would like to start a discussion about identifying a
>> guideline for answering questions like:
>> >>>>>> >> >>
>> >>>>>> >> >> 1. When will Beam support a new Python version (say, Python
>> 3.8)?
>> >>>>>> >> >> 2. When will Beam drop support for an old Python version
>> (say, Python 3.5)?
>> >>>>>> >> >> 3. How many Python versions should we aim to support
>> concurrently (investigate issues, have continuous integration tests)?
>> >>>>>> >> >> 4. What comes first: adding support for a new version (3.8)
>> or deprecating older one (3.5)? This may affect the max load our test
>> infrastructure needs to sustain.
>> >>>>>> >> >>
>> >>>>>> >> >> We are already getting requests for supporting Python 3.8
>> and there were some good reasons[1] to drop support for Python 3.5 (at
>> least, early versions of 3.5). Answering these questions would help set
>> expectations in Beam user community, Beam dev community, and  may help us
>> establish resource requirements for test infrastructure and plan efforts.
>> >>>>>> >> >>
>> >>>>>> >> >> PEP-0602 [2] establishes a yearly release cycle for Python
>> versions starting from 3.9. Each release is a long-term support release and
>> is supported for 5 years: first 1.5 years allow for general bug fix
>> support, remaining 3.5 years have security fix support.
>> >>>>>> >> >>
>> >>>>>> >> >> At every point, there may be up to 5 Python minor versions
>> that did not yet reach EOL, see "Release overlap with 12 month diagram"
>> [3]. We can try to support all of them, but that may come at a cost of
>> velocity: we will have more tests to maintain, and we will have to develop
>> Beam against a lower version for a longer period. Supporting fewer versions
>> will have implications for user experience. It also may be difficult to
>> ensure support of the most recent version early, since our  dependencies
>> (e.g. picklers) may not be supporting them yet.
>> >>>>>> >> >>
>> >>>>>> >> >> Currently we support 4 Python versions (2.7, 3.5, 3.6, 3.7).
>> >>>>>> >> >>
>> >>>>>> >> >> Is 4 versions a sweet spot? Too much? Too little? What do
>> you think?
>> >>>>>> >> >>
>> >>>>>> >> >> [1]
>> https://github.com/apache/beam/pull/10821#issuecomment-590167711
>> >>>>>> >> >> [2] https://www.python.org/dev/peps/pep-0602/
>> >>>>>> >> >> [3] https://www.python.org/dev/peps/pep-0602/#id17
>>
>
