On Mon, Aug 26, 2024 at 11:57 AM Robert Bradshaw <rober...@google.com>
wrote:

> On Mon, Aug 26, 2024 at 11:22 AM Valentyn Tymofieiev via dev
> <dev@beam.apache.org> wrote:
> >
> > Interesting findings. When researching Dataflow Python usage with
> internal telemetry, I see that Python 3.11 has slightly more usage than
> Python 3.8. When I exclude Dev SDKs (this might also exclude some
> Google-internal users who use bleeding-edge SDKs), Python 3.8 rises to
> the top. If I exclude Google Dynamic "FLEX" templates, the following become
> top 3:
> >
> > Apache Beam Python 3.9 SDK
> > 24.40%
> > Apache Beam Python 3.7 SDK
> > 23.34%
> > Apache Beam Python 3.8 SDK
> > 21.63%
>
> Interesting. I'm assuming this is across all Beam versions, right?
>
Yes, across all Beam Python versions.
>
>
> > This might be explained by the fact that the default "Python3" flex
> template image referenced in the docs (at
> https://cloud.google.com/dataflow/docs/guides/templates/using-flex-templates#python_3)
> is Python 3.8.
>
> We should definitely fix that.
>
> > > On the other hand, I do like the idea of letting the Python EoL cycle
> drive our own supported versions.
> >
> > +1. As much as I don't like force upgrades, it won't be sustainable long
> term to keep versions indefinitely. I don't anticipate any blockers for
> switching Python 3.8 to Python 3.9.
> >
> > > For many workflows like our unit test suites this is not a large
> change; the Python version matrix simply omits 3.8 and runs on the
> remaining python versions as expected. This is more complicated for a
> number of workflows that currently only run on 3.8 or both 3.8 and 3.12, as
> GitHub will not run the updated actions in the main repository until the PR
> updating them is submitted.
> >
> > Yes, that's a known inconvenience. I believe this can be worked around
> by pushing the changes to a branch on main repo, and then manually
> triggering a GHA workflow from that branch, if you want to be really
> careful. I think we have this documented somewhere, but I couldn't quickly
> find it. @Danny McCormick might have a link.
> >
> > Merging and iterating sounds good to me if we can quickly roll back/fix
> forward changes to not make PRs blocked due to tests not passing.
>
> The risk is accepting changes that are incompatible with Python 3.8.
> Once we drop it (even in the dev repo) we should drop it for good.
>
> > We also set the default Python version in
> https://github.com/apache/beam/blob/9c0a9503ebd59778d488dcfff7fb9417a808152b/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy#L2960
> that might affect some workflows.
> >
> > > To Robert Bradshaw's point, I wouldn't necessarily be opposed to
> pushing out this process to 2.61.0.
> > As long as we don't add a new version before removing an existing one,
> probably no significant difference for us.
>
> Sounds like a reasonable plan then.
>
> > Our dependencies (like numpy, pandas, etc) are definitely dropping
> Python 3.8 support, usually ahead of us. Some Google Cloud Python Client
> libraries are planning to drop Python 3.8 support after EOL as well.
> >
> > On Mon, Aug 26, 2024 at 11:17 AM Jack McCluskey via dev <
> dev@beam.apache.org> wrote:
> >>
> >> To Robert Bradshaw's point, I wouldn't necessarily be opposed to
> pushing out this process to 2.61.0. That does give more time to validate
> some of the actions changes and let us warn users about the drop in 3.8
> support in a release. Admittedly a major motivator for moving off of 3.8 at
> EoL is so I can do some overhauling of the type hinting code, as 3.8 is the
> last version where PEP-585 type hints are not supported by default (some
> context for this is available on my Current State of Beam Python Type
> Hinting doc from last November). But that isn't necessarily urgent work as
> far as users are concerned.
> >>
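For context on the PEP 585 point above, here is a minimal sketch of the difference (not taken from Beam's type hinting code). On Python 3.9+ built-in containers such as list and dict can be subscripted directly in annotations. On 3.8 the same expression raises TypeError at runtime, so the typing.List / typing.Dict aliases are needed. The function names are illustrative only.

```python
import sys
from typing import Dict, List, get_args, get_origin


def mean_legacy(values: List[float]) -> float:
    """Annotated with typing.List, which works on 3.8 and later."""
    return sum(values) / len(values)


def supports_pep585() -> bool:
    """Return True if built-in containers can be subscripted (PEP 585)."""
    try:
        list[int]  # raises TypeError on Python 3.8
    except TypeError:
        return False
    return True


if supports_pep585():
    # On 3.9+ the built-in generic carries the same origin/args
    # information as the corresponding typing alias.
    assert get_origin(list[int]) is get_origin(List[int]) is list
    assert get_args(dict[str, int]) == get_args(Dict[str, int]) == (str, int)
```

Code that must still import on 3.8 has to keep the typing aliases (or defer annotation evaluation), which is why dropping 3.8 simplifies this kind of overhaul.
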
> >> There's an argument for trying to keep our documentation and tutorials
> pointing at relatively recent versions of Beam, but that's probably best
> left as a best-effort type thing for now.
> >>
> >> On Mon, Aug 26, 2024 at 1:41 PM Robert Burke <lostl...@apache.org>
> wrote:
> >>>
> >>> A minor point but often when onboarding, folks will try things
> verbatim from the website and documentation:
> >>>
> >>>
> https://github.com/search?q=repo%3Aapache%2Fbeam+python3.7+lang%3AMarkdown+&type=code
> >>>
> >>> Granted, the most popular combo there was not present in this search,
> so it's probably not terribly significant, compared to the reason Robert is
> guessing.
> >>>
> >>> Dunno what we can do about that without going all out in specifying
> templated versions to use in our various docs. (That has the different
> problem of ensuring everything being described actually works as typed out,
> and we are not set up to efficiently validate that for every release.)
> >>>
> >>> On 2024/08/26 17:30:23 Robert Bradshaw via dev wrote:
> >>> > So, 3.8 remains the most popular python version per pypi:
> >>> > https://pypistats.org/packages/apache-beam
> >>> >
> >>> > Breaking down by Beam version over the last 90 days we get
> >>> >
> >>> >
> https://docs.google.com/spreadsheets/d/1-PPxZHs17aXvXgdl439tF7IqIs0XUxtDbDxGYcBg92I
> >>> >
> >>> > Which shows that this remains true even for the latest Beam releases.
> >>> > (Interestingly, one of the most popular combinations is the Python
> 3.7 +
> >>> > Beam 2.48. I wonder if people are holding off upgrading Beam due to
> Python
> >>> > 3.7 being dropped...)
> >>> >
> >>> > Of course, the relationship between pypi downloads and actual
> customer
> >>> > usage is not 1:1, but is likely directional at least.
> >>> >
> >>> > On the other hand, I do like the idea of letting the Python EoL
> cycle drive
> >>> > our own supported versions. Given that 3.8 EoL is in October and our
> >>> > release is (hopefully) also in October, what if instead we planned on
> >>> > making 2.60 (tentatively) the last officially supported 3.8 release
> instead
> >>> > of the release in which we drop 3.8 and then see what the stats say
> once
> >>> > Python is officially EoL. Yes, we could just drop it if that's the
> >>> > consensus, but given these usage numbers I don't think the case is
> so clear
> >>> > cut.
> >>> >
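One way a "last officially supported 3.8 release" could be surfaced to users is a runtime warning. This is a minimal sketch only: the helper name and the minimum-version constant are hypothetical and are not taken from the Beam codebase.

```python
import sys
import warnings

# Hypothetical constant: the lowest Python version the *next* release keeps.
NEXT_MINIMUM_PYTHON = (3, 9)


def warn_if_python_being_dropped(version_info=sys.version_info) -> bool:
    """Warn when running on a Python version slated for removal.

    Returns True if a warning was emitted, False otherwise.
    """
    if tuple(version_info[:2]) < NEXT_MINIMUM_PYTHON:
        warnings.warn(
            "Python %d.%d support will be removed in a future release; "
            "please upgrade to Python %d.%d or newer."
            % (version_info[0], version_info[1], *NEXT_MINIMUM_PYTHON),
            FutureWarning,
            stacklevel=2,
        )
        return True
    return False
```

Calling such a helper at import time would give users one release of notice before the version is actually removed.
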
> >>> > We could also look at what our dependencies are doing. And if
> supporting
> >>> > 3.8 becomes difficult (e.g. is it being removed from github actions?)
> >>> > that's another reason to do so.
> >>> >
> >>> >
> >>> > [image: Skärmavbild 2024-08-26 kl. 10.08.09 fm.png]
> >>> >
> >>> >
> >>> >
> >>> > On Mon, Aug 26, 2024 at 9:42 AM Robert Burke <rob...@frantil.com>
> wrote:
> >>> >
> >>> > > I'd be careful about relying only on the most recent release (as much as
> it
> >>> > > supports the consensus point). The most recent beam version is
> inherently
> >>> > > going to have smaller and less consistent numbers, vs N-1 or N-2,
> since
> >>> > > only the most keen or in need update immediately.
> >>> > >
> >>> > > On Mon, Aug 26, 2024, 9:27 AM Danny McCormick via dev <
> dev@beam.apache.org>
> >>> > > wrote:
> >>> > >
> >>> > >> Was about to respond, Rebo you beat me to it! I agree DockerHub
> is the
> >>> > >> right thing to look at since Pypi reporting isn't awesome. I
> think we
> >>> > >> should only look at the most recent versions though, since 3.8
> will work
> >>> > >> for old versions forever.
> >>> > >>
> >>> > >> For 2.58.0 last month (partial month results), I see:
> >>> > >>
> >>> > >> "Repo","Unique IPs","Pull by tag","Pull by digest","Version check"
> >>> > >> "beam_python312_sdk",151,70,0,410
> >>> > >> "beam_python311_sdk",151,64,0,360
> >>> > >> "beam_python310_sdk",40,97,0,13
> >>> > >> "beam_python3.9_sdk",18,388,0,14
> >>> > >> "beam_python3.8_sdk",36,97,0,2
> >>> > >>
> >>> > >> So it was <10% of pulls (including our automation as Rebo pointed
> out)
> >>> > >>
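To make the 3.8 share explicit, here is a small sketch that recomputes it per column from the figures quoted above. Which column the "<10%" remark is based on is not stated, so the sketch reports each column separately. The helper name is illustrative.

```python
import csv
import io

# Figures copied verbatim from the table above (2.58.0, partial month).
RAW = '''"Repo","Unique IPs","Pull by tag","Pull by digest","Version check"
"beam_python312_sdk",151,70,0,410
"beam_python311_sdk",151,64,0,360
"beam_python310_sdk",40,97,0,13
"beam_python3.9_sdk",18,388,0,14
"beam_python3.8_sdk",36,97,0,2
'''


def column_shares(raw: str, repo: str) -> dict:
    """Return the given repo's fraction of each numeric column.

    A column whose total is zero maps to None instead of a fraction.
    """
    rows = list(csv.DictReader(io.StringIO(raw)))
    shares = {}
    for column in rows[0]:
        if column == "Repo":
            continue
        total = sum(int(row[column]) for row in rows)
        target = sum(int(row[column]) for row in rows if row["Repo"] == repo)
        shares[column] = target / total if total else None
    return shares


shares = column_shares(RAW, "beam_python3.8_sdk")
for column, share in shares.items():
    print(column, "n/a" if share is None else f"{share:.1%}")
```

Reporting every column avoids guessing at the intended metric and makes it easy to rerun the comparison once numbers for later releases are available.
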
> >>> > >> I'll join Jack, Kenn, and Rebo and agree dropping support is the
> right
> >>> > >> thing here. The plan SGTM as well.
> >>> > >>
> >>> > >> Thanks,
> >>> > >> Danny
> >>> > >>
> >>> > >> On Mon, Aug 26, 2024 at 5:21 PM Robert Burke <rob...@frantil.com>
> wrote:
> >>> > >>
> >>> > >>> As an approximation we can use the docker container pulls at
> least.
> >>> > >>>
> >>> > >>>
> >>> > >>> Py version : Pulls last week
> >>> > >>>
> >>> > >>> 3.8:  7476
> >>> > >>> 3.9:  1259
> >>> > >>> 3.10: 6169
> >>> > >>> 3.11: 2999
> >>> > >>> 3.12: 241
> >>> > >>>
> >>> > >>> 3.7: 395
> >>> > >>> 3.6: 241
> >>> > >>> 3.4: 156
> >>> > >>> 2.7: 188
> >>> > >>>
> >>> > >>> But note that any of our automation for 3.8 that pulls
> containers would
> >>> > >>> impact these results too.
> >>> > >>>
> >>> > >>> I will note that Beam dropping 3.8 support shouldn't be a
> problem given
> >>> > >>> the general end of support of 3.8.
> >>> > >>>
> >>> > >>> Users can always upgrade their python version separately from
> the Beam
> >>> > >>> version, and then update the Beam version. Ultimately, the cost
> of the
> >>> > >>> latest and greatest version is staying up to date.
> >>> > >>>
> >>> > >>>
> >>> > >>> On Mon, Aug 26, 2024, 8:24 AM Kenneth Knowles <k...@apache.org>
> wrote:
> >>> > >>>
> >>> > >>>> SGTM
> >>> > >>>>
> >>> > >>>> Incidentally I poked around on pypi for a minute but didn't
> find even
> >>> > >>>> basic download analytics. Do we have data about usage of Python
> versions?
> >>> > >>>> (this is not pushback - I'm all for turning things down on a
> natural pace
> >>> > >>>> (or faster!); I'm just even *more* for having data around it)
> >>> > >>>>
> >>> > >>>> Kenn
> >>> > >>>>
> >>> > >>>> On Mon, Aug 26, 2024 at 10:59 AM Jack McCluskey via dev <
> >>> > >>>> dev@beam.apache.org> wrote:
> >>> > >>>>
> >>> > >>>>> Hey everyone,
> >>> > >>>>>
> >>> > >>>>> With Python 3.8 reaching end-of-life in October, I've started
> the work
> >>> > >>>>> of removing support in the Beam repository. The aim is to
> target Beam
> >>> > >>>>> release 2.60.0 for this, since the expected release cut date
> is on
> >>> > >>>>> October 2nd, 2024. The start of this effort is at
> >>> > >>>>> https://github.com/apache/beam/pull/32283/, updating our
> GitHub
> >>> > >>>>> Actions workflows. For many workflows like our unit test
> suites this is not
> >>> > >>>>> a large change; the Python version matrix simply omits 3.8 and
> runs on the
> >>> > >>>>> remaining python versions as expected. This is more
> complicated for a
> >>> > >>>>> number of workflows that currently only run on 3.8 or both 3.8
> and 3.12, as
> >>> > >>>>> GitHub will not run the updated actions in the main repository
> until the PR
> >>> > >>>>> updating them is submitted. This can already be seen in some
> workflow runs
> >>> > >>>>> on the PR where Python 3.8 is no longer being installed in the
> runner
> >>> > >>>>> environment, leading to failures.
> >>> > >>>>>
> >>> > >>>>> The current plan is to do as much validation of the new
> workflow files
> >>> > >>>>> as I can before the above PR is submitted (hopefully the week
> after Beam
> >>> > >>>>> Summit), then focus on getting any potential workflow
> breakages resolved
> >>> > >>>>> before removing the core Python 3.8 support from the package.
> There may be
> >>> > >>>>> some instability with our workflows, and I will try my best to
> resolve
> >>> > >>>>> things as they pop up. This is the first Python version to
> have support
> >>> > >>>>> dropped since we migrated to GitHub Actions, so there's going
> to be a
> >>> > >>>>> decent amount of trial and error as we navigate this. That
> said, if you
> >>> > >>>>> notice problems please let me know! Either file a standalone
> issue and tag
> >>> > >>>>> me on it (@jrmccluskey) or leave a comment on
> >>> > >>>>> https://github.com/apache/beam/issues/31192 so I can take a
> look.
> >>> > >>>>>
> >>> > >>>>> Thanks,
> >>> > >>>>>
> >>> > >>>>> Jack McCluskey
> >>> > >>>>>
> >>> > >>>>> --
> >>> > >>>>>
> >>> > >>>>>
> >>> > >>>>> Jack McCluskey
> >>> > >>>>> SWE - DataPLS PLAT/ Dataflow ML
> >>> > >>>>> RDU
> >>> > >>>>> jrmcclus...@google.com
> >>> > >>>>>
> >>> > >>>>>
> >>> > >>>>>
> >>> >
>
