The current draft of the exit blog post is
https://github.com/apache/beam/pull/15894
Comments are very welcome. I'm going to continue looking for known issues
(which will be linked to their respective JIRAs) tomorrow.

Since RC1 is getting cycled, I can also go back to the original plan of
v2.33.0, if we'd like to get it out this week.
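
For anyone following along, the prefixed tag naming discussed further down
this thread can be sketched in Go roughly as follows. This is only an
illustration: `moduleTag` is a hypothetical helper of mine (it is not part of
the go command or of our release scripts), and it assumes the module
subdirectory is passed without the major version suffix, so "sdks" rather
than "sdks/v2".

```go
package main

import (
	"fmt"
	"strings"
)

// moduleTag is a hypothetical helper (not a go tool or Beam API).
// It returns the git tag name expected for `version` of a module that
// lives in `subdir` of its repository. Assumption: subdir excludes any
// major version suffix such as "/v2". An empty subdir means the module
// sits at the repository root, so the tag is the bare version.
func moduleTag(subdir, version string) string {
	subdir = strings.Trim(subdir, "/")
	if subdir == "" {
		return version
	}
	return subdir + "/" + version
}

func main() {
	// Versions mentioned in this thread for the module under sdks/.
	for _, v := range []string{"v2.33.0", "v2.34.0-RC1"} {
		fmt.Println(moduleTag("sdks", v))
	}
}
```

Running it prints sdks/v2.33.0 and sdks/v2.34.0-RC1, which are the tag names
quoted below.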


On Wed, 3 Nov 2021 at 10:17, Robert Burke <[email protected]> wrote:

> Investigation yielded that there's no way around the prefixed tags. The
> JIRA has been commented with the explanation.
>
> https://github.com/apache/beam/pull/15881 has the release script updates.
>
> I'm working on the Exit blogpost and the updated Go SDK roadmap. The draft
> PR will be linked here.
>
> Since 2.34.0 is almost out (assuming RC1 verification goes well) I'm
> inclined to wait for that release to finish before publishing the blogpost.
> I'll link the draft PR here as soon as it's ready.
>
> Once 2.34.0 is released, I'm still inclined to have 2.33.0 prefix tagged
> as well, so there isn't a gap in versions between the unmoduled code and
> moduled code.
>
> Once published, that'll be the end of this thread.
>
> Thank you very much everyone.
>
> Robert Burke
> Beam Go Busybody
>
> On Tue, Oct 26, 2021, 5:36 PM Kyle Weaver <[email protected]> wrote:
>
>> +1 to extra tags. They'll be trivial to add to our release process, and
>> git tags are lightweight by design so I don't foresee any problems.
>>
>> On Tue, Oct 26, 2021 at 5:27 PM Robert Bradshaw <[email protected]>
>> wrote:
>>
>>> Glad you were able to figure it out. The extra tags are certainly
>>> worth making this work if it's what we have to do, and shouldn't be
>>> too much of a problem (until, hopefully, it's fixed on the Go side).
>>>
>>> On Tue, Oct 26, 2021 at 4:53 PM Robert Burke <[email protected]>
>>> wrote:
>>> >
>>> > With Kyle's help with the additional tagging of the next RC, we have
>>> validated that this is the currently correct approach.
>>> >
>>> >
>>> https://pkg.go.dev/github.com/apache/beam/sdks/[email protected]/go/pkg/beam?tab=versions
>>> >
>>> https://pkg.go.dev/github.com/apache/beam/sdks/[email protected]/go/pkg/beam
>>> >
>>> > Or even:
>>> > https://pkg.go.dev/github.com/apache/beam/sdks/v2/go/pkg/beam  (links
>>> to latest tagged version)
>>> >
>>> > The main cost to this approach is doubling the number of tags in the
>>> tags list: https://github.com/apache/beam/tags which is not ideal, but
>>> overall a small cost. There's no need for "full publish" of these
>>> additional tags, so we won't be doubling our "releases" (see
>>> https://github.com/apache/beam/releases).
>>> >
>>> > I'll still be filing a bug against the Go commands since the mandatory
>>> prefixing is unintuitive, and seems unnecessary. If it later proves unnecessary, we can
>>> always delete the tags from the affected branches, and cease the behavior
>>> going forward. I'll search through the existing Go issues first however to
>>> see if this has been previously discussed, and report my findings here
>>> either way.
>>> >
>>> > This does require 2 small changes to the release process: the RC tagging
>>> script, and the final tagging step in the release guide:
>>> >
>>> https://github.com/apache/beam/blob/243128a8fc52798e1b58b0cf1a271d95ee7aa241/release/src/main/scripts/choose_rc_commit.sh
>>> >
>>> >
>>> https://github.com/apache/beam/blob/f8660d343fb218cb7acce81ddcc49de0710a0d14/website/www/site/content/en/contribute/release-guide.md#git-tag
>>> >
>>> > I'll make this change later this week (or early next) assuming there
>>> are no objections.
>>> >
>>> > Thank you all very much for your patience,
>>> > Robert Burke
>>> > Beam Go Busybody
>>> >
>>> >
>>> > On 2021/10/26 23:01:00, Robert Burke <[email protected]> wrote:
>>> > > After much research reading the Go Modules documentation, I have
>>> confirmed what the issue is.
>>> > >
>>> > > We added the go.mod file to sdks/ under the repo root because it's a
>>> cleaner spot for the change, captures the Java and Python container boot
>>> code (written in Go) into the module and avoids conflicts in
>>> interpretations of the vendor directory that lives at the root level.
>>> > >
>>> > > However, we missed that when doing so, the standard version tags
>>> would only apply to modules at the root level, not to modules in
>>> subdirectories. See https://golang.org/ref/mod#vcs-version, but quoting
>>> the important paragraph:
>>> > >
>>> > > > If a module is defined in a subdirectory within the repository,
>>> that is, the module subdirectory portion of
>>> > > > the module path is not empty, then each tag name must be prefixed
>>> with the module subdirectory,
>>> > > > followed by a slash. For example, the module
>>> golang.org/x/tools/gopls is defined in the gopls
>>> > > > subdirectory of the repository with root path golang.org/x/tools.
>>> The version v0.4.0 of that module must have the tag named gopls/v0.4.0 in
>>> that repository.
>>> > >
>>> > > Specifically, for the Go SDK to be able to be fetched at the right
>>> version, we need to have prefixed tags like "sdks/v2.33.0" or
>>> "sdks/v2.34.0-RC1"
>>> > >
>>> > > So, the fix for the Go versioning issue is to amend our Release
>>> process (including generating Release Candidate builds) to also add a
>>> prefixed version tag with the same version.
>>> > >
>>> > > I can work with Kyle to validate this for 2.34.0 RC1, and if there
>>> are no objections we can back update the 2.33.0 release branch with such a
>>> prefixed tag. At which point I can also write the Official Experimental
>>> Exit Blog post.
>>> > >
>>> > > Thank you all for your patience.
>>> > > Robert Burke
>>> > >
>>> > > On 2021/10/14 00:00:53, Ahmet Altay <[email protected]> wrote:
>>> > > > Thank you for the detailed update! Let us know if we can help.
>>> > > >
>>> > > > On Wed, Oct 13, 2021 at 2:42 PM Robert Burke <[email protected]>
>>> wrote:
>>> > > >
>>> > > > > This is a status update.
>>> > > > >
>>> > > > > At this point 2.33.0 is released, but there are difficulties with
>>> > > > > accessing the tagged versions using the standard go tools. It's
>>> currently
>>> > > > > under investigation.
>>> > > > >
>>> > > > > Using the v2 path in a Go program then running `go mod tidy`
>>> will populate
>>> > > > > the go.mod file with a pseudo-version rather than the latest tag
>>> (v2.33.0) (e.g.
>>> > > > > the line looks like
>>> > > > > require github.com/apache/beam/sdks/v2
>>> v2.0.0-20211013181004-a9120e083008
>>> > > > > )
>>> > > > >
>>> > > > > While this will work, it's not the desired experience for users
>>> at this
>>> > > > > point. The current downside is that the releases are not meaningful
>>> targets for
>>> > > > > some reason. However, we retain the other benefits of Go Modules
>>> (actual
>>> > > > > dependency versioning, management by go tools).
>>> > > > >
>>> > > > > The issue is some combination of the go tooling [A] , that we
>>> added a go
>>> > > > > mod file outside of the repo root [B], and that we did not
>>> increment the
>>> > > > > major version (v2 -> v3) when adding the go mod file [C].
>>> > > > >
>>> > > > > [B] From the go documentation, this should be legal and fine,
>>> even if it's
>>> > > > > not recommended. This is fortunate because the root of the repo
>>> would have
>>> > > > > played poorly with root vendor directory, which the go tools
>>> have opinions
>>> > > > > on.
>>> > > > >
>>> > > > > [C] Incrementing the major version is recommended, in the Go
>>> Modules
>>> > > > > documentation, when transitioning to Go Modules. However, it
>>> never said it
>>> > > > > was required, nor did it indicate this current failure mode. If
>>> anything
>>> > > > > this should be documented in those docs, if it's not another
>>> bug. We would
>>> > > > > not necessarily want to declare a global v3 for beam at this
>>> time, for just
>>> > > > > the Go SDK, it would become confusing rather quickly. Notionally
>>> there are
>>> > > > > some larger breaking changes the Java and Python SDKs would want
>>> to make in
>>> > > > > such an event, and thus it's a larger conversation, that is out
>>> of scope at
>>> > > > > this time.
>>> > > > >
>>> > > > > This leaves [A] where some misunderstanding of the documented
>>> semantics
>>> > > > > occurred. I certainly expected the tagged version of the
>>> non-root go-module
>>> > > > > to be inherited from the parent, not wholesale ignored. As a
>>> result, I'll
>>> > > > > be filing a bug against the go tools to determine this, and see
>>> what paths
>>> > > > > forward exist.
>>> > > > >
>>> > > > > It's my hope to resolve this before we write a proper
>>> Experimental Exit
>>> > > > > blog post for the Go SDK.
>>> > > > >
>>> > > > > Thank you for your patience, and time.
>>> > > > > Robert Burke
>>> > > > > Beam Go Busybody
>>> > > > >
>>> > > > >
>>> > > > >
>>> > > > >
>>> > > > > On 2021/08/23 18:12:00, Robert Burke <[email protected]>
>>> wrote:
>>> > > > > > With 2.32 the LICENSE issue has been fixed [1], and the SDK
>>> now uses Go
>>> > > > > Modules for dependency management, simplifying Go SDK
>>> contributions. [2]
>>> > > > > >
>>> > > > > > The Module file lives in the sdks/ directory so there's a
>>> single Go
>>> > > > > Module for the whole SDK, tests, examples, and any support code
>>> for the
>>> > > > > container boot builds. This excludes the Go SDK Code katas [3]
>>> go modules
>>> > > > > which can be updated once 2.33.0 has been released.
>>> > > > > >
>>> > > > > > PR 15365 [4] adds the SDK containers back to the release
>>> builds, and
>>> > > > > by default uses the release specific container for docker execution
>>> jobs. For
>>> > > > > at least the 2.33.0 release this does mean that manual
>>> validation will
>>> > > > > need to explicitly specify RC versions of containers. However,
>>> given that
>>> > > > > the Go SDK container and worker boot process rarely changes,
>>> this is
>>> > > > > unlikely to be an issue.
>>> > > > > >
>>> > > > > > At present I'm cleaning up some of the references to
>>> experimental, and
>>> > > > > making it clear that 2.33.0 is the first non-experimental
>>> release (even
>>> > > > > though that's 4-6 weeks out from actual release.) CHANGES.md
>>> will be
>>> > > > > updated to note the event, but a larger blogpost will happen
>>> after the
>>> > > > > release goes public.
>>> > > > > >
>>> > > > > > Cheers,
>>> > > > > > Robert Burke
>>> > > > > > De facto Beam Go TL.
>>> > > > > >
>>> > > > > > [1]
>>> > > > >
>>> https://pkg.go.dev/github.com/apache/[email protected]+incompatible/sdks/go/pkg/beam
>>> > > > > > [2] https://github.com/apache/beam/pull/15323
>>> > > > > > [3]
>>> https://github.com/apache/beam/tree/master/learning/katas/go
>>> > > > > > [4] https://github.com/apache/beam/pull/15365
>>> > > > > >
>>> > > > > > On 2021/06/28 23:12:19, Ahmet Altay <[email protected]> wrote:
>>> > > > > > > +1, congratulations & thank you!
>>> > > > > > >
>>> > > > > > > On Tue, Jun 22, 2021 at 3:15 PM Robert Burke <
>>> [email protected]>
>>> > > > > wrote:
>>> > > > > > >
>>> > > > > > > > Regarding documentation update: Initial PR is
>>> > > > > > > > https://github.com/apache/beam/pull/15057 which goes up
>>> to section
>>> > > > > ~4.3.
>>> > > > > > > > JIRA link for Programming Guide changes:
>>> > > > > > > > https://issues.apache.org/jira/browse/BEAM-12513
>>> > > > > > > >
>>> > > > > > > >
>>> > > > > > > > On 2021/06/17 14:58:54, Robert Burke <[email protected]>
>>> wrote:
>>> > > > > > > > > Yup!
>>> > > > > > > > >
>>> > > > > > > > > My immediate plan is to work on incorporating the Go SDK
>>> fully
>>> > > > > into the
>>> > > > > > > > > Beam Programming Guide. I've audited the guide, and
>>> > > > > > > > > am beginning to add missing content and fill in the
>>> Go specific
>>> > > > > gaps.
>>> > > > > > > > > This will be tied to improving the Go Doc with more Go
>>> > > > > > > > > specific user documentation that isn't appropriate for
>>> the BPG.
>>> > > > > > > > >
>>> > > > > > > > > My audit of the guide is here:
>>> > > > > > > > >
>>> > > > > > > >
>>> > > > >
>>> https://docs.google.com/spreadsheets/d/1DrBFjxPBmMMmPfeFr6jr_JndxGOes8qDqKZ2Uxwvvds/edit?resourcekey=0-tVFwcLrQ2v2jpZkHk6QOpQ#gid=2072310090
>>> > > > > > > > >
>>> > > > > > > > > The other sheets focus on features and tests. The
>>> feature page
>>> > > > > looks
>>> > > > > > > > worse
>>> > > > > > > > > than it is, as it was more productive to focus on what
>>> isn't
>>> > > > > available
>>> > > > > > > > than
>>> > > > > > > > > what is. That's a snapshot of my actual working sheet
>>> but I'll be
>>> > > > > > > > updating
>>> > > > > > > > > it as needed.
>>> > > > > > > > >
>>> > > > > > > > > On Thu, Jun 17, 2021, 6:23 AM Ismaël Mejía <
>>> [email protected]>
>>> > > > > wrote:
>>> > > > > > > > >
>>> > > > > > > > > > Oups forgot to write one question. Will this come with
>>> revamped
>>> > > > > > > > > > website instructions/doc for golang too?
>>> > > > > > > > > >
>>> > > > > > > > > > On Thu, Jun 17, 2021 at 3:21 PM Ismaël Mejía <
>>> [email protected]>
>>> > > > > > > > wrote:
>>> > > > > > > > > > >
>>> > > > > > > > > > > Huge +1
>>> > > > > > > > > > >
>>> > > > > > > > > > > This is definitely something many people have asked
>>> about, so
>>> > > > > it is
>>> > > > > > > > > > > great to see it finally happening.
>>> > > > > > > > > > >
>>> > > > > > > > > > > On Wed, Jun 16, 2021 at 7:56 PM Kenneth Knowles <
>>> > > > > [email protected]>
>>> > > > > > > > wrote:
>>> > > > > > > > > > > >
>>> > > > > > > > > > > > +1 awesome
>>> > > > > > > > > > > >
>>> > > > > > > > > > > > On Wed, Jun 16, 2021 at 10:33 AM Robert Burke <
>>> > > > > [email protected]
>>> > > > > > > > >
>>> > > > > > > > > > wrote:
>>> > > > > > > > > > > >>
>>> > > > > > > > > > > >> Sounds reasonable to me. I agree. We'll aim to
>>> get those (Go
>>> > > > > > > > modules
>>> > > > > > > > > > and LICENSE issue) done before the 2.32 cut, and
>>> certainly
>>> > > > > before the
>>> > > > > > > > 2.33
>>> > > > > > > > > > cut if release images aren't added to the 2.32 process.
>>> > > > > > > > > > > >>
>>> > > > > > > > > > > >> Regarding Go Generics: at some point in the
>>> future, we may
>>> > > > > want a
>>> > > > > > > > > > harder break between a newer Generic first API and
>>> the
>>> > > > > current
>>> > > > > > > > version,
>>> > > > > > > > > > but there's no rush. Generics/TypeParameters in Go
>>> aren't
>>> > > > > identical to
>>> > > > > > > > the
>>> > > > > > > > > > feature referred to by that term in Java, C++, Rust,
>>> etc, so
>>> > > > > it'll
>>> > > > > > > > take a
>>> > > > > > > > > > bit of time for that expertise to develop.
>>> > > > > > > > > > > >>
>>> > > > > > > > > > > >> However, by the current nature of Go, we had to
>>> have pretty
>>> > > > > > > > > > sophisticated reflective analysis to handle DoFns and
>>> map them
>>> > > > > to their
>>> > > > > > > > > > graph inputs. So, adding new helpers like a KV,
>>> emitter, and
>>> > > > > Iterator
>>> > > > > > > > > > types, shouldn't be too difficult. Changing Go SDK
>>> internals to
>>> > > > > use
>>> > > > > > > > > > generics (like the implementation of Stats DoFns like
>>> Min, Max,
>>> > > > > etc)
>>> > > > > > > > could
>>> > > > > > > > > > also be made transparently to most users,
>>> and
>>> > > > > certainly any
>>> > > > > > > > of
>>> > > > > > > > > > the framework for execution time handling (the
>>> "worker's SDK
>>> > > > > harness")
>>> > > > > > > > > > would be able to be cleaned up if need be. Finally,
>>> adding more
>>> > > > > > > > > > sophisticated DoFn registration and code generation
>>> would be
>>> > > > > able to
>>> > > > > > > > > > replace the optional code generator entirely, saving
>>> some users
>>> > > > > a `go
>>> > > > > > > > > > generate` step, simplifying getting improved execution
>>> > > > > performance.
>>> > > > > > > > > > > >>
>>> > > > > > > > > > > >> Changing things like making a Type Parameterized
>>> > > > > PCollection,
>>> > > > > > > > would
>>> > > > > > > > > > be far more involved, as would trying to use some kind
>>> of Apply
>>> > > > > > > > format. The
>>> > > > > > > > > > lack of Method Overrides prevents the apply chaining
>>> approach.
>>> > > > > Or at
>>> > > > > > > > least
>>> > > > > > > > > > prevents it from working simply.
>>> > > > > > > > > > > >>
>>> > > > > > > > > > > >> Finally, Go Generics won't be available until Go
>>> 1.18,
>>> > > > > which isn't
>>> > > > > > > > > > until next year. See
>>> https://blog.golang.org/generics-proposal
>>> > > > > for
>>> > > > > > > > > > details.
>>> > > > > > > > > > > >>
>>> > > > > > > > > > > >> Go 1.17 https://tip.golang.org/doc/go1.17 does
>>> include a
>>> > > > > Register
>>> > > > > > > > > > calling convention, leading to a modest performance
>>> improvement
>>> > > > > across
>>> > > > > > > > the
>>> > > > > > > > > > board.
>>> > > > > > > > > > > >>
>>> > > > > > > > > > > >> Cheers,
>>> > > > > > > > > > > >> Robert Burke
>>> > > > > > > > > > > >>
>>> > > > > > > > > > > >> On 2021/06/15 18:10:46, Robert Bradshaw <
>>> > > > > [email protected]>
>>> > > > > > > > wrote:
>>> > > > > > > > > > > >> > +1 to declaring Golang support out of
>>> experimental once
>>> > > > > the Go
>>> > > > > > > > > > Modules
>>> > > > > > > > > > > >> > issues are solved. I don't think an SDK needs
>>> to support
>>> > > > > every
>>> > > > > > > > > > feature
>>> > > > > > > > > > > >> > to be accepted, especially now that we can do
>>> > > > > cross-language
>>> > > > > > > > > > > >> > transforms, and Go definitely supports enough
>>> to be quite
>>> > > > > > > > useful.
>>> > > > > > > > > > (WRT
>>> > > > > > > > > > > >> > streaming, my understanding is that Go supports
>>> the
>>> > > > > streaming
>>> > > > > > > > model
>>> > > > > > > > > > > >> > with windows and timestamps, and runs fine on a
>>> streaming
>>> > > > > > > > runner,
>>> > > > > > > > > > even
>>> > > > > > > > > > > >> > if more advanced features like state and timers
>>> aren't yet
>>> > > > > > > > > > available.)
>>> > > > > > > > > > > >> >
>>> > > > > > > > > > > >> > This is a great milestone.
>>> > > > > > > > > > > >> >
>>> > > > > > > > > > > >> > On Tue, Jun 15, 2021 at 10:12 AM Tyson Hamilton
>>> <
>>> > > > > > > > [email protected]>
>>> > > > > > > > > > wrote:
>>> > > > > > > > > > > >> > >
>>> > > > > > > > > > > >> > > WOW! Big news.
>>> > > > > > > > > > > >> > >
>>> > > > > > > > > > > >> > > I'm supportive of leaving experimental status
>>> after Go
>>> > > > > Modules
>>> > > > > > > > > > are completed and the LICENSE issue is resolved. I
>>> don't think
>>> > > > > that
>>> > > > > > > > lacking
>>> > > > > > > > > > streaming support is a blocker. The other thing I
>>> checked to see
>>> > > > > was if
>>> > > > > > > > > > there were metrics available on
>>> metrics.beam.apache.org,
>>> > > > > specifically
>>> > > > > > > > for
>>> > > > > > > > > > measuring code health via post-commit over time, which
>>> there are
>>> > > > > and
>>> > > > > > > > the
>>> > > > > > > > > > passing test rate is high (Huzzah!). The one thing that
>>> > > > > surprised me
>>> > > > > > > > from
>>> > > > > > > > > > your summary is that when Go introduces generics it
>>> won't result
>>> > > > > in any
>>> > > > > > > > > > backwards incompatible changes in Apache Beam. That's
>>> great
>>> > > > > news, but
>>> > > > > > > > does
>>> > > > > > > > > > it mean there will be a need to support both
>>> non-generic and
>>> > > > > generic
>>> > > > > > > > APIs
>>> > > > > > > > > > moving forward? It seems like generics will be
>>> introduced in the
>>> > > > > Go
>>> > > > > > > > 1.17
>>> > > > > > > > > > release (optimistically) in August this year.
>>> > > > > > > > > > > >> > >
>>> > > > > > > > > > > >> > >
>>> > > > > > > > > > > >> > >
>>> > > > > > > > > > > >> > > On Thu, Jun 10, 2021 at 5:04 PM Robert Burke <
>>> > > > > > > > [email protected]>
>>> > > > > > > > > > wrote:
>>> > > > > > > > > > > >> > >>
>>> > > > > > > > > > > >> > >> Hello Beam Community!
>>> > > > > > > > > > > >> > >>
>>> > > > > > > > > > > >> > >> I propose we stop calling the Apache Beam Go
>>> SDK
>>> > > > > > > > experimental.
>>> > > > > > > > > > > >> > >>
>>> > > > > > > > > > > >> > >> This thread is to discuss it as a community,
>>> and any
>>> > > > > > > > conditions
>>> > > > > > > > > > that remain that would prevent the exit.
>>> > > > > > > > > > > >> > >>
>>> > > > > > > > > > > >> > >> tl;dr;
>>> > > > > > > > > > > >> > >> Ask Questions for answers and links! I have
>>> both.
>>> > > > > > > > > > > >> > >> This entails including it officially in the
>>> Release
>>> > > > > process,
>>> > > > > > > > > > removing the various "experimental" text throughout
>>> the repo etc,
>>> > > > > > > > > > > >> > >> and otherwise treating it like Python and
>>> Java. Some Go
>>> > > > > > > > specific
>>> > > > > > > > > > tasks around dep versioning.
>>> > > > > > > > > > > >> > >>
>>> > > > > > > > > > > >> > >> The Go SDK implements the beam model
>>> efficiently for
>>> > > > > most
>>> > > > > > > > batch
>>> > > > > > > > > > tasks, including basic windowing.
>>> > > > > > > > > > > >> > >> Apache Beam Go jobs can execute, and are
>>> tested on all
>>> > > > > > > > Portable
>>> > > > > > > > > > runners.
>>> > > > > > > > > > > >> > >> The core APIs are not going to change in
>>> incompatible
>>> > > > > ways
>>> > > > > > > > going
>>> > > > > > > > > > forward.
>>> > > > > > > > > > > >> > >> Scalable transforms can be written through
>>> > > > > SplittableDoFns or
>>> > > > > > > > > > via Cross Language transforms.
>>> > > > > > > > > > > >> > >>
>>> > > > > > > > > > > >> > >> The SDK isn't 100% feature complete, but
>>> keeping it
>>> > > > > > > > experimental
>>> > > > > > > > > > doesn't help with that any further.
>>> > > > > > > > > > > >> > >> Communities grow through contributions and
>>> use, and
>>> > > > > > > > experimental
>>> > > > > > > > > > markers dissuade users.
>>> > > > > > > > > > > >> > >> There's plenty to do in order to expand what
>>> can be done
>>> > > > > with
>>> > > > > > > > the
>>> > > > > > > > > > SDK. (Contributions welcome)
>>> > > > > > > > > > > >> > >>
>>> > > > > > > > > > > >> > >> Why Exit Experimental now?
>>> > > > > > > > > > > >> > >>
>>> > > > > > > > > > > >> > >> Typically when we call an SDK or API
>>> Experimental, it's
>>> > > > > > > > because
>>> > > > > > > > > > there's a risk that API or behaviors may change
>>> significantly.
>>> > > > > > > > > > > >> > >> This in turn, leads to additional work for
>>> users of
>>> > > > > the SDK
>>> > > > > > > > on
>>> > > > > > > > > > every release which leads to sticking to older
>>> versions or
>>> > > > > forking
>>> > > > > > > > > > > >> > >> to preserve behavior. Version updates should
>>> be looked
>>> > > > > > > > forward
>>> > > > > > > > > > to, and viewed as having little risk. Further while
>>> there's been
>>> > > > > > > > > > > >> > >> previous discussion about what the "low bar"
>>> is for a
>>> > > > > new
>>> > > > > > > > SDK, it
>>> > > > > > > > > > hasn't been summarily applied to the Go SDK. I feel
>>> this has
>>> > > > > > > > > > > >> > >> hurt development and contribution of new SDK
>>> languages
>>> > > > > > > > (inherent
>>> > > > > > > > > > difficulty of SDK development notwithstanding).
>>> > > > > > > > > > > >> > >>
>>> > > > > > > > > > > >> > >> When the SDK was designed, it wasn't
>>> entirely clear
>>> > > > > what the
>>> > > > > > > > > > Beam Model should look like in an opinionated language
>>> like Go.
>>> > > > > > > > > > > >> > >> Their initial take (see
>>> > > > > > > > > > https://s.apache.org/beam-go-sdk-design-rfc [0]) goes
>>> into
>>> > > > > detail on
>>> > > > > > > > what it
>>> > > > > > > > > > means for a language without
>>> > > > > > > > > > > >> > >> Generics, or overloading, or inheritance to
>>> implement
>>> > > > > the
>>> > > > > > > > beam
>>> > > > > > > > > > model. One could largely throw away static types (like
>>> Python),
>>> > > > > > > > > > > >> > >> but this approach rings hollow for Go. It
>>> would not do
>>> > > > > if the
>>> > > > > > > > > > approach couldn't grow and scale to the Beam Model.
>>> It's also
>>> > > > > hard
>>> > > > > > > > > > > >> > >> to tell if an API is any good before there
>>> are users.
>>> > > > > > > > > > > >> > >>
>>> > > > > > > > > > > >> > >> Further, in the early days of Portability,
>>> there
>>> > > > > wasn't a
>>> > > > > > > > way to
>>> > > > > > > > > > write scalable DoFns, dynamically or otherwise. It's an
>>> > > > > incredible
>>> > > > > > > > > > > >> > >> bottleneck to need to do all initial fanout
>>> of work on
>>> > > > > a
>>> > > > > > > > single
>>> > > > > > > > > > machine, write everything to a Reshuffle, just in
>>> order to scale
>>> > > > > up.
>>> > > > > > > > > > > >> > >> Without being able to scale, Beam is little
>>> more than
>>> > > > > > > > overhead.
>>> > > > > > > > > > > >> > >>
>>> > > > > > > > > > > >> > >> At this point, both of these needs are met
>>> within the
>>> > > > > Go SDK
>>> > > > > > > > for
>>> > > > > > > > > > open source.
>>> > > > > > > > > > > >> > >>
>>> > > > > > > > > > > >> > >> Background
>>> > > > > > > > > > > >> > >>
>>> > > > > > > > > > > >> > >> The Go SDK has been a part of the beam repo
>>> for a few
>>> > > > > years
>>> > > > > > > > now,
>>> > > > > > > > > > since it was accidentally merged into master.
>>> > > > > > > > > > > >> > >> Since then it's been called experimental,
>>> and not
>>> > > > > officially
>>> > > > > > > > > > part of the releases.
>>> > > > > > > > > > > >> > >>
>>> > > > > > > > > > > >> > >> Of the SDKs, it was always designed around
>>> Beam
>>> > > > > Portability
>>> > > > > > > > > > first. It never had any "Legacy" (SDK x Runner
>>> specific )
>>> > > > > workers.
>>> > > > > > > > > > > >> > >> It's always used the Beam Pipeline protos
>>> and FnAPI to
>>> > > > > > > > execute
>>> > > > > > > > > > jobs, first with some very experimental code on
>>> Dataflow, but now
>>> > > > > > > > > > > >> > >> on all portable supported runners, like
>>> Flink, Spark,
>>> > > > > the
>>> > > > > > > > Python
>>> > > > > > > > > > Portable runner, and Dataflow.
>>> > > > > > > > > > > >> > >>
>>> > > > > > > > > > > >> > >> API Stability
>>> > > > > > > > > > > >> > >>
>>> > > > > > > > > > > >> > >> The Go SDK hasn't meaningfully changed its
>>> user API
>>> > > > > for DoFn
>>> > > > > > > > > > and pipeline construction since it was first merged
>>> in, and
>>> > > > > there are
>>> > > > > > > > no
>>> > > > > > > > > > > >> > >> changes to that on the horizon that can't be
>>> made in a
>>> > > > > > > > backwards
>>> > > > > > > > > > compatible manner. Largely these are related to New
>>> Features, or
>>> > > > > > > > > > > >> > >> usability improvements enabled by the advent
>>> of Go
>>> > > > > Generics
>>> > > > > > > > > > (think of "real" KV, emitter, and iterator types).
>>> > > > > > > > > > > >> > >>
>>> > > > > > > > > > > >> > >> It's an open secret that the Go SDK has
>>> largely been
>>> > > > > in
>>> > > > > > > > development
>>> > > > > > > > > > for use within Google. Its use there is called FlumeGo,
>>> representing
>>> > > > > > > > > > > >> > >> the Apache Beam Go SDK, running on top of
>>> Flume,
>>> > > > > Google's
>>> > > > > > > > batch
>>> > > > > > > > > > pipeline processing engine. Thus most of the focus has been on
>>> improving
>>> > > > > > > > > > > >> > >> batch execution. FlumeGo sees ample use
>>> today, and
>>> > > > > there
>>> > > > > > > > hasn't
>>> > > > > > > > > > been a call for fundamental changes to the API for
>>> ergonomic or
>>> > > > > > > > > > > >> > >> usability concerns.
>>> > > > > > > > > > > >> > >>
>>> > > > > > > > > > > >> > >> Scalability
>>> > > > > > > > > > > >> > >>
>>> > > > > > > > > > > >> > >> Google could get away without the Go SDK
>>> having an SDK
>>> > > > > side
>>> > > > > > > > > > scalability solution as a result of its integration
>>> with Flume.
>>> > > > > > > > > > > >> > >> However, those days are now past.
>>> > > > > > > > > > > >> > >>
>>> > > > > > > > > > > >> > >> The Go SDK now supports SplittableDoFns
>>> along with
>>> > > > > Dynamic
>>> > > > > > > > > > Splitting, which supports writing scalable batch
>>> transforms
>>> > > > > natively
>>> > > > > > > > > > > >> > >> in the Go SDK.
>>> > > > > > > > > > > >> > >> The SDK also supports Cross Language
>>> Transforms, with
>>> > > > > Beam
>>> > > > > > > > > > Schema encodings. With it, production hardened
>>> transforms
>>> > > > > > > > > > > >> > >> from Java and Python are a wrapper away.
>>> > > > > > > > > > > >> > >>
>>> > > > > > > > > > > >> > >> Presently, Daniel Oliveira (who implemented
>>> the SDF
>>> > > > > side
>>> > > > > > > > work,
>>> > > > > > > > > > and completed the Xlang work) is adding a wrapper for
>>> the
>>> > > > > > > > > > > >> > >> Java Kafka IO using Cross Language
>>> Transforms, which
>>> has often
>>> > > > > > > > > > been requested. This will also enable use of the Beam
>>> SQL
>>> > > > > > > > > > > >> > >> transforms that Java enables.
>>> > > > > > > > > > > >> > >>
>>> > > > > > > > > > > >> > >> Features
>>> > > > > > > > > > > >> > >>
>>> > > > > > > > > > > >> > >> The Go SDK implements the Beam core. The
>>> Go SDK
>>> > > > > implements
>>> > > > > > > > > > standard coders, allows for user DoFns and CombineFns,
>>> and access
>>> > > > > > > > > > > >> > >> to core transforms like Flatten, GroupByKey,
>>> and
>>> > > > > features
>>> > > > > > > > like
>>> > > > > > > > > > Side Inputs, Windowing, and User Metrics.
>>> > > > > > > > > > > >> > >> Basic windowing will be fully supported for
>>> batch even
>>> > > > > > > > through
>>> > > > > > > > > > lifted combines in the 2.32.0 release.
>>> > > > > > > > > > > >> > >>
>>> > > > > > > > > > > >> > >> All of the above enables Beam Go to be
>>> versatile for
>>> > > > > batch
>>> > > > > > > > > > execution on portable runners, and for simple streaming
>>> > > > > pipelines.
>>> > > > > > > > > > > >> > >>
>>> > > > > > > > > > > >> > >> Repo Testing
>>> > > > > > > > > > > >> > >>
>>> > > > > > > > > > > >> > >> On precommit the Go SDK runs all its unit
>>> tests. On
>>> > > > > top of
>>> > > > > > > > > > that, it runs all its integration tests against the
>>> Python
>>> > > > > Portable
>>> > > > > > > > runner,
>>> > > > > > > > > > > >> > >> making it quick and robust to detect
>>> breaking changes
>>> > > > > without
>>> > > > > > > > > > overspending community resources. Those same tests are
>>> also
>>> > > > > > > > > > > >> > >> run against Dataflow, Flink, and Spark.
>>> > > > > > > > > > > >> > >>
>>> > > > > > > > > > > >> > >> The tests are executable against all runners
>>> via the
>>> > > > > > > > appropriate
>>> > > > > > > > > > Go commands (if you've stood up your own job
>>> management server),
>>> > > > > > > > > > > >> > >> or Gradle commands (which will spin up runner
>>> > > > > instances for
>>> > > > > > > > > > you). Documentation for executing tests and adding new
>>> ones
>>> > > > > > > > > > > >> > >> is on the wiki. [2] They are accessible to Go
>>> > > > > developers as
>>> > > > > > > > > > they're implemented with the standard Go testing tools.
>>> > > > > > > > > > > >> > >>
>>> > > > > > > > > > > >> > >> Shortcomings
>>> > > > > > > > > > > >> > >> That said, there's still much to do. Let me
>>> briefly
>>> > > > > tell you
>>> > > > > > > > > > what doesn't work, and it's up to you to weigh whether
>>> they block
>>> > > > > > > > > > > >> > >> being out of experimental.
>>> > > > > > > > > > > >> > >>
>>> > > > > > > > > > > >> > >> At present, only textio has been
>>> implemented as a
>>> > > > > Splittable
>>> > > > > > > > > > DoFn.
>>> > > > > > > > > > > >> > >> Once the Kafka wrapper is merged in, it will
>>> serve as
>>> > > > > the
>>> > > > > > > > > > first example for future contributions for
>>> > > > > > > > > > > >> > >> new transform wrappers for the Go SDK.
>>> > > > > > > > > > > >> > >> Transforms and IOs are lacking, but at this
>>> point
>>> > > > > users are
>>> > > > > > > > > > empowered to write their own DoFns or wrap existing
>>> transforms
>>> > > > > for
>>> > > > > > > > Cross
>>> > > > > > > > > > Language use.
>>> > > > > > > > > > > >> > >>
>>> > > > > > > > > > > > >> > >> In the core SDK, more streaming-focused
>>> features have
>>> > > > > yet to
>>> > > > > > > > be
>>> > > > > > > > > > implemented, but they're largely additions to what
>>> exists already
>>> > > > > > > > > > > >> > >> rather than total rebuilds. Much of the work
>>> is
>>> > > > > defining
>>> > > > > > > > how a
>>> > > > > > > > > > user specifies their desires, and turning those into
>>> the
>>> > > > > appropriate
>>> > > > > > > > > > > >> > >> FnAPI requests at execution time. Back in
>>> October I
>>> > > > > wrote at
>>> > > > > > > > > > length on the wiki [1] what's missing for additional
>>> streaming
>>> > > > > > > > features.
>>> > > > > > > > > > > >> > >>
>>> > > > > > > > > > > >> > >> While we have bolstered our testing
>>> recently, there's
>>> > > > > likely
>>> > > > > > > > > > still more we could test to improve our confidence in
>>> the SDK,
>>> > > > > > > > > > > >> > >> in particular regarding the included
>>> transforms
>>> > > > > libraries and
>>> > > > > > > > > > examples.
>>> > > > > > > > > > > >> > >>
>>> > > > > > > > > > > >> > >> Moving Forward
>>> > > > > > > > > > > >> > >>
>>> > > > > > > > > > > >> > >> My immediate plan is to work on
>>> incorporating the Go
>>> > > > > SDK
>>> > > > > > > > fully
>>> > > > > > > > > > into the Beam Programming Guide. I've audited the
>>> guide [3], and
>>> > > > > > > > > > > >> > >> am beginning to add missing content and
>>> fill in the
>>> > > > > Go
>>> > > > > > > > > > specific gaps. This will be tied to improving the Go
>>> Doc with
>>> > > > > more Go
>>> > > > > > > > > > > >> > >> specific user documentation that isn't
>>> appropriate for
>>> > > > > the
>>> > > > > > > > BPG.
>>> > > > > > > > > > > >> > >> And resolving the LICENSE issue around the
>>> public
>>> > > > > display of
>>> > > > > > > > > > that GoDoc.
>>> > > > > > > > > > > >> > >>
>>> > > > > > > > > > > >> > >> If this proposal is accepted by a binding
>>> vote, I will
>>> > > > > > > > > > incorporate the SDK into the release process, and
>>> remove the
>>> > > > > > > > "experimental"
>>> > > > > > > > > > > >> > >> language around the SDK. This largely
>>> entails updating
>>> > > > > the
>>> > > > > > > > > > release scripts to also build and publish the Go SDK
>>> Docker
>>> > > > > containers.
>>> > > > > > > > > > > >> > >> As for releasing the code, we're technically
>>> already
>>> > > > > doing so
>>> > > > > > > > > > whenever we tag a release branch [4].
>>> > > > > > > > > > > >> > >>
>>> > > > > > > > > > > > >> > >> The clearest signal to the Go community,
>>> however, will be
>>> > > > > > > > > > migrating the SDK to use Go Modules for dependency
>>> version
>>> > > > > control,
>>> > > > > > > > > > > >> > >> which Daniel is planning on working on after
>>> his Kafka
>>> > > > > task.
>>> > > > > > > > > > This will put our repo infrastructure, SDK
>>> contributors, and
>>> > > > > users
>>> > > > > > > > > > > >> > >> on the same footing when it comes to
>>> dependency
>>> > > > > management.
>>> > > > > > > > It
>>> > > > > > > > > > will remove the "+incompatible" tags one sees on the
>>> > > > > > > > > > > >> > >> pkg.go.dev list at [4].
>>> > > > > > > > > > > >> > >>
>>> > > > > > > > > > > >> > >> I'm very happy to answer any questions you
>>> might have
>>> > > > > about
>>> > > > > > > > the
>>> > > > > > > > > > SDK, and provide additional links as needed. I
>>> intentionally
>>> > > > > avoided
>>> > > > > > > > > > > >> > >> a link barrage in this email, as they can
>>> distract
>>> > > > > from the
>>> > > > > > > > > > point: The SDK is ready for folks to use it, we need
>>> to tell
>>> > > > > them that
>>> > > > > > > > they
>>> > > > > > > > > > can
>>> > > > > > > > > > > >> > >> rather than they shouldn't.
>>> > > > > > > > > > > >> > >>
>>> > > > > > > > > > > >> > >> Robert Burke
>>> > > > > > > > > > > > >> > >> De facto Beam Go TL
>>> > > > > > > > > > > >> > >>
>>> > > > > > > > > > > >> > >> [0]
>>> https://s.apache.org/beam-go-sdk-design-rfc
>>> > > > > > > > > > > >> > >> [1]
>>> > > > > > > > > >
>>> > > > > > > >
>>> > > > >
>>> https://cwiki.apache.org/confluence/display/BEAM/Supporting+Streaming+in+the+Go+SDK
>>> > > > > > > > > > > >> > >> [2]
>>> > > > > https://cwiki.apache.org/confluence/display/BEAM/Go+Tips
>>> > > > > > > > > > > >> > >> [3]
>>> > > > > > > > > >
>>> > > > > > > >
>>> > > > >
>>> https://docs.google.com/spreadsheets/d/1DrBFjxPBmMMmPfeFr6jr_JndxGOes8qDqKZ2Uxwvvds/edit?resourcekey=0-tVFwcLrQ2v2jpZkHk6QOpQ#gid=2072310090
>>> > > > > > > > > > (SDK Audit sheet)
>>> > > > > > > > > > > >> > >> [4]
>>> > > > > > > > > >
>>> > > > > > > >
>>> > > > >
>>> https://pkg.go.dev/github.com/apache/beam/sdks/go/pkg/beam?tab=versions
>>> > > > > > > > > > > >> >
>>> > > > > > > > > >
>>> > > > > > > > >
>>> > > > > > > >
>>> > > > > > >
>>> > > > > >
>>> > > > >
>>> > > >
>>> > >
>>>
>>
