It's my great pleasure to announce that the Apache Beam Go SDK is no longer
experimental. https://beam.apache.org/blog/go-sdk-release/

Thank you everyone.
Robert Burke
Beam Go Busybody

On Thu, Nov 4, 2021, 6:29 PM Robert Burke <[email protected]> wrote:

> At this point I just need an LGTM on the blog post PR, as the draft is
> finalized.
>
> Udi added the sdks/v2.33.0 tag which works as expected. I've also verified
> that the appropriate container is used by default when not specified, which
> is the last unknown in this process.
>
> Who's ready to release a new SDK? I am!
>
>  https://github.com/apache/beam/pull/15894 (or join the exciting reaction
> emoji on the top post).
>
>
>
> On Wed, Nov 3, 2021, 8:37 PM Robert Burke <[email protected]> wrote:
>
>> The current draft of the exit blog post is
>> https://github.com/apache/beam/pull/15894
>> Comments are very welcome. I'm going to continue looking for Known issues
>> (which will be linked to their respective JIRAs) tomorrow.
>>
>> Since RC1 is getting cycled, I can also go back to the original plan of
>> v2.33.0, if we'd like to get it out this week.
>>
>>
>> On Wed, 3 Nov 2021 at 10:17, Robert Burke <[email protected]> wrote:
>>
>>> Investigation yielded that there's no way around the prefixed tags. The
>>> JIRA has been commented with the explanation.
>>>
>>> https://github.com/apache/beam/pull/15881 has the release script
>>> updates.
>>>
>>> I'm working on the Exit blogpost and the updated Go SDK roadmap. The
>>> draft PR will be linked here.
>>>
>>> Since 2.34.0 is almost out (assuming RC1 verification goes well) I'm
>>> inclined to wait for that release to finish before publishing the blogpost.
>>> I'll link the draft PR here as soon as it's ready.
>>>
>>> Once 2.34.0 is released, I'm inclined to still have 2.33.0 be also
>>> prefix tagged so there isn't a gap in versions between the unmoduled code
>>> and moduled code.
>>>
>>> Once published, that'll be the end of this thread.
>>>
>>> Thank you very much everyone.
>>>
>>> Robert Burke
>>> Beam Go Busybody
>>>
>>> On Tue, Oct 26, 2021, 5:36 PM Kyle Weaver <[email protected]> wrote:
>>>
>>>> +1 to extra tags. They'll be trivial to add to our release process, and
>>>> git tags are lightweight by design so I don't foresee any problems.
>>>>
>>>> On Tue, Oct 26, 2021 at 5:27 PM Robert Bradshaw <[email protected]>
>>>> wrote:
>>>>
>>>>> Glad you were able to figure it out. The extra tags are certainly
>>>>> worth making this work if it's what we have to do, and shouldn't be
>>>>> too much of a problem (until, hopefully, it's fixed on the go side).
>>>>>
>>>>> On Tue, Oct 26, 2021 at 4:53 PM Robert Burke <[email protected]>
>>>>> wrote:
>>>>> >
>>>>> > With Kyle's help with the additional tagging of the next RC, we have
>>>>> validated that this is the currently correct approach.
>>>>> >
>>>>> >
>>>>> https://pkg.go.dev/github.com/apache/beam/sdks/[email protected]/go/pkg/beam?tab=versions
>>>>> >
>>>>> https://pkg.go.dev/github.com/apache/beam/sdks/[email protected]/go/pkg/beam
>>>>> >
>>>>> > Or even:
>>>>> > https://pkg.go.dev/github.com/apache/beam/sdks/v2/go/pkg/beam
>>>>> (links to latest tagged version)
>>>>> >
>>>>> > The main cost to this approach is doubling the number of tags in the
>>>>> tags list: https://github.com/apache/beam/tags which is not ideal,
>>>>> but overall a small cost. There's no need for "full publish" of these
>>>>> additional tags, so we won't be doubling our "releases" (see
>>>>> https://github.com/apache/beam/releases).
>>>>> >
>>>>> > I'll still be filing a bug against the Go commands since the
>>>>> mandatory prefixing is unintuitive and seems unnecessary. If it does become
>>>>> unnecessary, we can always delete the tags from the affected branches and
>>>>> cease the behavior going forward. I'll first search through the existing Go
>>>>> issues, however, to see if this has been previously discussed, and report my
>>>>> findings here either way.
>>>>> >
>>>>> > This does require 2 small changes to the release guide: the RC tagging
>>>>> script, and the final tagging:
>>>>> >
>>>>> https://github.com/apache/beam/blob/243128a8fc52798e1b58b0cf1a271d95ee7aa241/release/src/main/scripts/choose_rc_commit.sh
>>>>> >
>>>>> >
>>>>> https://github.com/apache/beam/blob/f8660d343fb218cb7acce81ddcc49de0710a0d14/website/www/site/content/en/contribute/release-guide.md#git-tag
>>>>> >
>>>>> > I'll make this change later this week (or early next) assuming there
>>>>> are no objections.
>>>>> >
>>>>> > Thank you all very much for your patience,
>>>>> > Robert Burke
>>>>> > Beam Go Busybody
>>>>> >
>>>>> >
>>>>> > On 2021/10/26 23:01:00, Robert Burke <[email protected]> wrote:
>>>>> > > With much research in reading the Go Modules documentation, I have
>>>>> confirmed what the issue is.
>>>>> > >
>>>>> > > We added the go.mod file to sdks/ under the repo root because it's
>>>>> a cleaner spot for the change, captures the Java and Python container boot
>>>>> code (written in Go) into the module and avoids conflicts in
>>>>> interpretations of the vendor directory that lives at the root level.
>>>>> > >
>>>>> > > However, we missed that when doing so, the standard version tags
>>>>> would only apply to modules at the root level, not at modules in
>>>>> subdirectories. See https://golang.org/ref/mod#vcs-version, but
>>>>> quoting the important paragraph:
>>>>> > >
>>>>> > > > If a module is defined in a subdirectory within the repository,
>>>>> that is, the module subdirectory portion of
>>>>> > > > the module path is not empty, then each tag name must be
>>>>> prefixed with the module subdirectory,
>>>>> > > > followed by a slash. For example, the module
>>>>> golang.org/x/tools/gopls is defined in the gopls
>>>>> > > > subdirectory of the repository with root path golang.org/x/tools.
>>>>> The version v0.4.0 of that module must have the tag named gopls/v0.4.0
>>>>> in that repository.
>>>>> > >
>>>>> > > Specifically, for the Go SDK to be able to be fetched at the right
>>>>> version, we need to have prefixed tags like "sdks/v2.33.0" or
>>>>> "sdks/v2.34.0-RC1"
>>>>> > >
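The rule quoted above can be sketched in a few lines of Go. This is only an illustration, assuming a hypothetical helper named prefixedTag that takes the module subdirectory (without any major version suffix such as /v2) and a version string; it is not part of the Go tooling or the Beam release scripts.

```go
package main

import (
	"fmt"
	"strings"
)

// prefixedTag is a hypothetical helper (not a Go tooling or Beam API).
// It returns the git tag name the go command expects for a module whose
// go.mod lives in moduleSubdir, relative to the repository root.
// An empty subdirectory means the module is at the repo root, so the
// plain version tag is used.
func prefixedTag(moduleSubdir, version string) string {
	subdir := strings.Trim(moduleSubdir, "/")
	if subdir == "" {
		return version
	}
	return subdir + "/" + version
}

func main() {
	// The Beam Go module's go.mod lives in sdks/, per this thread.
	fmt.Println(prefixedTag("sdks", "v2.33.0"))
	// The gopls example from the Go Modules Reference.
	fmt.Println(prefixedTag("gopls", "v0.4.0"))
	// A module at the repository root needs no prefix.
	fmt.Println(prefixedTag("", "v2.33.0"))
}
```

Under these assumptions, the Beam release process would push both the usual tag and the tag this helper returns for the sdks subdirectory.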
>>>>> > > So, the fix for the Go versioning issue is to amend our Release
>>>>> process (including generating Release Candidate builds) to also add a
>>>>> prefixed version tag with the same version.
>>>>> > >
>>>>> > > I can work with Kyle to validate this for 2.34.0 RC1, and if there
>>>>> are no objections we can back update the 2.33.0 release branch with such a
>>>>> prefixed tag. At which point I can also write the Official Experimental
>>>>> Exit Blog post.
>>>>> > >
>>>>> > > Thank you all for your patience.
>>>>> > > Robert Burke
>>>>> > >
>>>>> > > On 2021/10/14 00:00:53, Ahmet Altay <[email protected]> wrote:
>>>>> > > > Thank you for the detailed update! Let us know if we can help.
>>>>> > > >
>>>>> > > > On Wed, Oct 13, 2021 at 2:42 PM Robert Burke <
>>>>> [email protected]> wrote:
>>>>> > > >
>>>>> > > > > This is a status update.
>>>>> > > > >
>>>>> > > > > At this point 2.33.0 is released, but there are difficulties
>>>>> with
>>>>> > > > > accessing the tagged versions using the standard go tools.
>>>>> It's currently
>>>>> > > > > under investigation.
>>>>> > > > >
>>>>> > > > > Using the v2 path in a go program then running `go mod tidy`
>>>>> will populate
>>>>> > > > > the file with  a pseudo-version rather than the latest tag
>>>>> (v2.33.0)  (eg
>>>>> > > > > the line looks like
>>>>> > > > > require github.com/apache/beam/sdks/v2
>>>>> v2.0.0-20211013181004-a9120e083008
>>>>> > > > > )
>>>>> > > > >
>>>>> > > > > While this will work, it's not the desired experience for
>>>>> users at this
>>>>> > > > > point. Current downside is that the releases are not
>>>>> meaningful targets for
>>>>> > > > > some reason. However, we retain the other benefits of Go
>>>>> Modules (actual
>>>>> > > > > dependency versioning, management by go tools).
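One way to spot the symptom described above is to check whether the version recorded in go.mod is a pseudo-version rather than a release tag. The sketch below is a rough illustration only: the regular expression is a simplified assumption modeled on the pseudo-version forms in the Go Modules Reference, and isPseudoVersion is a hypothetical helper, not the go command's own check.

```go
package main

import (
	"fmt"
	"regexp"
)

// pseudoVersionRE is a simplified, assumed pattern for Go pseudo-versions:
// a base version, a 14-digit UTC timestamp, and a commit hash prefix.
// It is an approximation for illustration, not the go command's own logic.
var pseudoVersionRE = regexp.MustCompile(
	`^v[0-9]+\.(0\.0-|\d+\.\d+-([^+]*\.)?0\.)\d{14}-[A-Za-z0-9]+$`)

// isPseudoVersion reports whether v looks like a pseudo-version.
func isPseudoVersion(v string) bool {
	return pseudoVersionRE.MatchString(v)
}

func main() {
	versions := []string{
		"v2.0.0-20211013181004-a9120e083008", // what `go mod tidy` produced
		"v2.33.0",                            // the tag we wanted resolved
	}
	for _, v := range versions {
		fmt.Printf("%s pseudo=%v\n", v, isPseudoVersion(v))
	}
}
```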
>>>>> > > > >
>>>>> > > > > The issue is some combination of the go tooling [A] , that we
>>>>> added a go
>>>>> > > > > mod file outside of the repo root [B], and that we did not
>>>>> increment the
>>>>> > > > > major version (v2 -> v3) when adding the go mod file [C].
>>>>> > > > >
>>>>> > > > > [B] From the go documentation, this should be legal and fine,
>>>>> even if it's
>>>>> > > > > not recommended. This is fortunate because the root of the
>>>>> repo would have
>>>>> > > > > played poorly with root vendor directory, which the go tools
>>>>> have opinions
>>>>> > > > > on.
>>>>> > > > >
>>>>> > > > > [C] Incrementing the major version is recommended, in the Go
>>>>> Modules
>>>>> > > > > documentation, when transitioning to Go Modules. However, it
>>>>> never said it
>>>>> > > > > was required, nor did it indicate this current failure mode.
>>>>> If anything
>>>>> > > > > this should be documented in those docs, if it's not another
>>>>> bug. We would
>>>>> > > > > not necessarily want to declare a global v3 for beam at this
>>>>> time for just
>>>>> > > > > the Go SDK; it would become confusing rather quickly.
>>>>> Notionally there are
>>>>> > > > > some larger breaking changes the Java and Python SDKs would
>>>>> want to make in
>>>>> > > > > such an event, and thus it's a larger conversation, that is
>>>>> out of scope at
>>>>> > > > > this time.
>>>>> > > > >
>>>>> > > > > This leaves [A] where some misunderstanding of the documented
>>>>> semantics
>>>>> > > > > occurred. I certainly expected the tagged version of the
>>>>> non-root go-module
>>>>> > > > > to be inherited from the parent, not wholesale ignored. As a
>>>>> result, I'll
>>>>> > > > > be filing a bug against the go tools to determine this, and
>>>>> see what paths
>>>>> > > > > forward exist.
>>>>> > > > >
>>>>> > > > > It's my hope to resolve this before we write a proper
>>>>> Experimental Exit
>>>>> > > > > blog post for the Go SDK.
>>>>> > > > >
>>>>> > > > > Thank you for your patience, and time.
>>>>> > > > > Robert Burke
>>>>> > > > > Beam Go Busybody
>>>>> > > > >
>>>>> > > > >
>>>>> > > > >
>>>>> > > > >
>>>>> > > > > On 2021/08/23 18:12:00, Robert Burke <[email protected]>
>>>>> wrote:
>>>>> > > > > > With 2.32 the LICENSE issue has been fixed [1], and the SDK
>>>>> now uses Go
>>>>> > > > > Modules for dependency management, simplifying Go SDK
>>>>> contributions. [2]
>>>>> > > > > >
>>>>> > > > > > The Module file lives in the sdks/ directory so there's a
>>>>> single Go
>>>>> > > > > Module for the whole SDK, tests, examples, and any support
>>>>> code for the
>>>>> > > > > container boot builds. This excludes the Go SDK Code katas [3]
>>>>> go modules
>>>>> > > > > which can be updated once 2.33.0 has been released.
>>>>> > > > > >
>>>>> > > > > > PR 15365 [4] adds the SDK containers back to the release
>>>>> builds, and
>>>>> > > > > default uses the release specific container for docker
>>>>> execution jobs. For
>>>>> > > > > at least the 2.33.0 release this does mean that  manual
>>>>> validation will
>>>>> > > > > need to explicitly specify RC versions of containers. However,
>>>>> given that
>>>>> > > > > the Go SDK container and worker boot process rarely changes,
>>>>> this is
>>>>> > > > > unlikely to be an issue.
>>>>> > > > > >
>>>>> > > > > > At present I'm cleaning up some of the references to
>>>>> experimental, and
>>>>> > > > > making it clear that 2.33.0 is the first non-experimental
>>>>> release (even
>>>>> > > > > though that's 4-6 weeks out from actual release.) CHANGES.md
>>>>> will be
>>>>> > > > > updated to note the event, but a larger blogpost will happen
>>>>> after the
>>>>> > > > > release goes public.
>>>>> > > > > >
>>>>> > > > > > Cheers,
>>>>> > > > > > Robert Burke
>>>>> > > > > > De facto Beam Go TL.
>>>>> > > > > >
>>>>> > > > > > [1]
>>>>> > > > >
>>>>> https://pkg.go.dev/github.com/apache/[email protected]+incompatible/sdks/go/pkg/beam
>>>>> > > > > > [2] https://github.com/apache/beam/pull/15323
>>>>> > > > > > [3]
>>>>> https://github.com/apache/beam/tree/master/learning/katas/go
>>>>> > > > > > [4] https://github.com/apache/beam/pull/15365
>>>>> > > > > >
>>>>> > > > > > On 2021/06/28 23:12:19, Ahmet Altay <[email protected]>
>>>>> wrote:
>>>>> > > > > > > +1, congratulations & thank you!
>>>>> > > > > > >
>>>>> > > > > > > On Tue, Jun 22, 2021 at 3:15 PM Robert Burke <
>>>>> [email protected]>
>>>>> > > > > wrote:
>>>>> > > > > > >
>>>>> > > > > > > > Regarding documentation update: Initial PR is
>>>>> > > > > > > > https://github.com/apache/beam/pull/15057 which goes up
>>>>> to section
>>>>> > > > > ~4.3.
>>>>> > > > > > > > JIRA link for Programming Guide changes:
>>>>> > > > > > > > https://issues.apache.org/jira/browse/BEAM-12513
>>>>> > > > > > > >
>>>>> > > > > > > >
>>>>> > > > > > > > On 2021/06/17 14:58:54, Robert Burke <[email protected]>
>>>>> wrote:
>>>>> > > > > > > > > Yup!
>>>>> > > > > > > > >
>>>>> > > > > > > > > My immediate plan is to work on incorporating the Go
>>>>> SDK fully
>>>>> > > > > into the
>>>>> > > > > > > > > Beam Programming Guide. I've audited the guide, and
>>>>> > > > > > > > > am beginning to add missing content and filling in the
>>>>> Go specific
>>>>> > > > > gaps.
>>>>> > > > > > > > > This will be tied to improving the Go Doc with more Go
>>>>> > > > > > > > > specific user documentation that isn't appropriate for
>>>>> the BPG.
>>>>> > > > > > > > >
>>>>> > > > > > > > > My audit of the guide is here:
>>>>> > > > > > > > >
>>>>> > > > > > > >
>>>>> > > > >
>>>>> https://docs.google.com/spreadsheets/d/1DrBFjxPBmMMmPfeFr6jr_JndxGOes8qDqKZ2Uxwvvds/edit?resourcekey=0-tVFwcLrQ2v2jpZkHk6QOpQ#gid=2072310090
>>>>> > > > > > > > >
>>>>> > > > > > > > > The other sheets focus on features and tests. The
>>>>> feature page
>>>>> > > > > looks
>>>>> > > > > > > > worse
>>>>> > > > > > > > > than it is, as it was more productive to focus on what
>>>>> isn't
>>>>> > > > > available
>>>>> > > > > > > > than
>>>>> > > > > > > > > what is. That's a snapshot of my actual working sheet
>>>>> but I'll be
>>>>> > > > > > > > updating
>>>>> > > > > > > > > it as needed.
>>>>> > > > > > > > >
>>>>> > > > > > > > > On Thu, Jun 17, 2021, 6:23 AM Ismaël Mejía <
>>>>> [email protected]>
>>>>> > > > > wrote:
>>>>> > > > > > > > >
>>>>> > > > > > > > > > Oups forgot to write one question. Will this come
>>>>> with revamped
>>>>> > > > > > > > > > website instructions/doc for golang too?
>>>>> > > > > > > > > >
>>>>> > > > > > > > > > On Thu, Jun 17, 2021 at 3:21 PM Ismaël Mejía <
>>>>> [email protected]>
>>>>> > > > > > > > wrote:
>>>>> > > > > > > > > > >
>>>>> > > > > > > > > > > Huge +1
>>>>> > > > > > > > > > >
>>>>> > > > > > > > > > > This is definitely something many people have
>>>>> asked about, so
>>>>> > > > > it is
>>>>> > > > > > > > > > > great to see it finally happening.
>>>>> > > > > > > > > > >
>>>>> > > > > > > > > > > On Wed, Jun 16, 2021 at 7:56 PM Kenneth Knowles <
>>>>> > > > > [email protected]>
>>>>> > > > > > > > wrote:
>>>>> > > > > > > > > > > >
>>>>> > > > > > > > > > > > +1 awesome
>>>>> > > > > > > > > > > >
>>>>> > > > > > > > > > > > On Wed, Jun 16, 2021 at 10:33 AM Robert Burke <
>>>>> > > > > [email protected]
>>>>> > > > > > > > >
>>>>> > > > > > > > > > wrote:
>>>>> > > > > > > > > > > >>
>>>>> > > > > > > > > > > >> Sounds reasonable to me. I agree. We'll aim to
>>>>> get those (Go
>>>>> > > > > > > > modules
>>>>> > > > > > > > > > and LICENSE issue) done before the 2.32 cut, and
>>>>> certainly
>>>>> > > > > before the
>>>>> > > > > > > > 2.33
>>>>> > > > > > > > > > cut if release images aren't added to the 2.32
>>>>> process.
>>>>> > > > > > > > > > > >>
>>>>> > > > > > > > > > > >> Regarding Go Generics: at some point in the
>>>>> future, we may
>>>>> > > > > want a
>>>>> > > > > > > > > > harder break between a newer Generic first API and
>>>>> the
>>>>> > > > > current
>>>>> > > > > > > > version,
>>>>> > > > > > > > > > but there's no rush. Generics/TypeParameters in Go
>>>>> aren't
>>>>> > > > > identical to
>>>>> > > > > > > > the
>>>>> > > > > > > > > > feature referred to by that term in Java, C++, Rust,
>>>>> etc, so
>>>>> > > > > it'll
>>>>> > > > > > > > take a
>>>>> > > > > > > > > > bit of time for that expertise to develop.
>>>>> > > > > > > > > > > >>
>>>>> > > > > > > > > > > >> However, by the current nature of Go, we had to
>>>>> have pretty
>>>>> > > > > > > > > > sophisticated reflective analysis to handle DoFns
>>>>> and map them
>>>>> > > > > to their
>>>>> > > > > > > > > > graph inputs. So, adding new helpers like a KV,
>>>>> emitter, and
>>>>> > > > > Iterator
>>>>> > > > > > > > > > types, shouldn't be too difficult. Changing Go SDK
>>>>> internals to
>>>>> > > > > use
>>>>> > > > > > > > > > generics (like the implementation of Stats DoFns
>>>>> like Min, Max,
>>>>> > > > > etc)
>>>>> > > > > > > > would
>>>>> > > > > > > > > > also be able to be made transparently to most users,
>>>>> and
>>>>> > > > > certainly any
>>>>> > > > > > > > of
>>>>> > > > > > > > > > the framework for execution time handling (the
>>>>> "worker's SDK
>>>>> > > > > harness")
>>>>> > > > > > > > > > would be able to be cleaned up if need be. Finally,
>>>>> adding more
>>>>> > > > > > > > > > sophisticated DoFn registration and code generation
>>>>> would be
>>>>> > > > > able to
>>>>> > > > > > > > > > replace the optional code generator entirely, saving
>>>>> some users
>>>>> > > > > a `go
>>>>> > > > > > > > > > generate` step, simplifying getting improved
>>>>> execution
>>>>> > > > > performance.
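To make the "new helpers like a KV, emitter, and Iterator types" idea concrete, here is a minimal sketch using Go type parameters (Go 1.18 or later). All names here (KV, Emitter, pairWithOne, sumPerKey) are hypothetical illustrations and are not types or functions from the Beam Go SDK.

```go
package main

import "fmt"

// KV is a hypothetical generic key-value pair (not a Beam SDK type).
type KV[K comparable, V any] struct {
	Key   K
	Value V
}

// Emitter is a hypothetical typed emit function (not a Beam SDK type).
type Emitter[T any] func(T)

// pairWithOne emits (word, 1) for every input word, in the style of a
// word-count DoFn, but with a statically typed emitter.
func pairWithOne(words []string, emit Emitter[KV[string, int]]) {
	for _, w := range words {
		emit(KV[string, int]{Key: w, Value: 1})
	}
}

// sumPerKey adds up the values for each key, standing in for a combine step.
func sumPerKey[K comparable](kvs []KV[K, int]) map[K]int {
	sums := make(map[K]int)
	for _, kv := range kvs {
		sums[kv.Key] += kv.Value
	}
	return sums
}

func main() {
	var pairs []KV[string, int]
	pairWithOne([]string{"a", "b", "a"}, func(kv KV[string, int]) {
		pairs = append(pairs, kv)
	})
	fmt.Println(sumPerKey(pairs))
}
```

The point of the sketch is only that the compiler checks element types here, where the current SDK relies on reflection.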
>>>>> > > > > > > > > > > >>
>>>>> > > > > > > > > > > >> Changing things like making a Type Parameterized
>>>>> > > > > PCollection,
>>>>> > > > > > > > would
>>>>> > > > > > > > > > be far more involved, as would trying to use some
>>>>> kind of Apply
>>>>> > > > > > > > format. The
>>>>> > > > > > > > > > lack of Method Overrides prevents the apply chaining
>>>>> approach.
>>>>> > > > > Or at
>>>>> > > > > > > > least
>>>>> > > > > > > > > > prevents it from working simply.
>>>>> > > > > > > > > > > >>
>>>>> > > > > > > > > > > >> Finally, Go Generics won't be available until
>>>>> Go 1.18,
>>>>> > > > > which isn't
>>>>> > > > > > > > > > until next year. See
>>>>> https://blog.golang.org/generics-proposal
>>>>> > > > > for
>>>>> > > > > > > > > > details.
>>>>> > > > > > > > > > > >>
>>>>> > > > > > > > > > > >> Go 1.17 https://tip.golang.org/doc/go1.17 does
>>>>> include a
>>>>> > > > > Register
>>>>> > > > > > > > > > calling convention, leading to a modest performance
>>>>> improvement
>>>>> > > > > across
>>>>> > > > > > > > the
>>>>> > > > > > > > > > board.
>>>>> > > > > > > > > > > >>
>>>>> > > > > > > > > > > >> Cheers,
>>>>> > > > > > > > > > > >> Robert Burke
>>>>> > > > > > > > > > > >>
>>>>> > > > > > > > > > > >> On 2021/06/15 18:10:46, Robert Bradshaw <
>>>>> > > > > [email protected]>
>>>>> > > > > > > > wrote:
>>>>> > > > > > > > > > > >> > +1 to declaring Golang support out of
>>>>> experimental once
>>>>> > > > > the Go
>>>>> > > > > > > > > > Modules
>>>>> > > > > > > > > > > >> > issues are solved. I don't think an SDK needs
>>>>> to support
>>>>> > > > > every
>>>>> > > > > > > > > > feature
>>>>> > > > > > > > > > > >> > to be accepted, especially now that we can do
>>>>> > > > > cross-language
>>>>> > > > > > > > > > > >> > transforms, and Go definitely supports enough
>>>>> to be quite
>>>>> > > > > > > > useful.
>>>>> > > > > > > > > > (WRT
>>>>> > > > > > > > > > > >> > streaming, my understanding is that Go
>>>>> supports the
>>>>> > > > > streaming
>>>>> > > > > > > > model
>>>>> > > > > > > > > > > >> > with windows and timestamps, and runs fine on
>>>>> a streaming
>>>>> > > > > > > > runner,
>>>>> > > > > > > > > > even
>>>>> > > > > > > > > > > >> > if more advanced features like state and
>>>>> timers aren't yet
>>>>> > > > > > > > > > available.)
>>>>> > > > > > > > > > > >> >
>>>>> > > > > > > > > > > >> > This is a great milestone.
>>>>> > > > > > > > > > > >> >
>>>>> > > > > > > > > > > >> > On Tue, Jun 15, 2021 at 10:12 AM Tyson
>>>>> Hamilton <
>>>>> > > > > > > > [email protected]>
>>>>> > > > > > > > > > wrote:
>>>>> > > > > > > > > > > >> > >
>>>>> > > > > > > > > > > >> > > WOW! Big news.
>>>>> > > > > > > > > > > >> > >
>>>>> > > > > > > > > > > >> > > I'm supportive of leaving experimental
>>>>> status after Go
>>>>> > > > > Modules
>>>>> > > > > > > > > > are completed and the LICENSE issue is resolved. I
>>>>> don't think
>>>>> > > > > that
>>>>> > > > > > > > lacking
>>>>> > > > > > > > > > streaming support is a blocker. The other thing I
>>>>> checked to see
>>>>> > > > > was if
>>>>> > > > > > > > > > there were metrics available on
>>>>> metrics.beam.apache.org,
>>>>> > > > > specifically
>>>>> > > > > > > > for
>>>>> > > > > > > > > > measuring code health via post-commit over time,
>>>>> which there are
>>>>> > > > > and
>>>>> > > > > > > > the
>>>>> > > > > > > > > > passing test rate is high (Huzzah!). The one thing
>>>>> that
>>>>> > > > > surprised me
>>>>> > > > > > > > from
>>>>> > > > > > > > > > your summary is that when Go introduces generics it
>>>>> won't result
>>>>> > > > > in any
>>>>> > > > > > > > > > backwards incompatible changes in Apache Beam.
>>>>> That's great
>>>>> > > > > news, but
>>>>> > > > > > > > does
>>>>> > > > > > > > > > it mean there will be a need to support both
>>>>> non-generic and
>>>>> > > > > generic
>>>>> > > > > > > > APIs
>>>>> > > > > > > > > > moving forward? It seems like generics will be
>>>>> introduced in the
>>>>> > > > > Go
>>>>> > > > > > > > 1.17
>>>>> > > > > > > > > > release (optimistically) in August this year.
>>>>> > > > > > > > > > > >> > >
>>>>> > > > > > > > > > > >> > >
>>>>> > > > > > > > > > > >> > >
>>>>> > > > > > > > > > > >> > > On Thu, Jun 10, 2021 at 5:04 PM Robert
>>>>> Burke <
>>>>> > > > > > > > [email protected]>
>>>>> > > > > > > > > > wrote:
>>>>> > > > > > > > > > > >> > >>
>>>>> > > > > > > > > > > >> > >> Hello Beam Community!
>>>>> > > > > > > > > > > >> > >>
>>>>> > > > > > > > > > > >> > >> I propose we stop calling the Apache Beam
>>>>> Go SDK
>>>>> > > > > > > > experimental.
>>>>> > > > > > > > > > > >> > >>
>>>>> > > > > > > > > > > >> > >> This thread is to discuss it as a
>>>>> community, and any
>>>>> > > > > > > > conditions
>>>>> > > > > > > > > > that remain that would prevent the exit.
>>>>> > > > > > > > > > > >> > >>
>>>>> > > > > > > > > > > >> > >> tl;dr;
>>>>> > > > > > > > > > > >> > >> Ask Questions for answers and links! I
>>>>> have both.
>>>>> > > > > > > > > > > >> > >> This entails including it officially in
>>>>> the Release
>>>>> > > > > process,
>>>>> > > > > > > > > > removing the various "experimental" text throughout
>>>>> the repo etc,
>>>>> > > > > > > > > > > >> > >> and otherwise treating it like Python and
>>>>> Java. Some Go
>>>>> > > > > > > > specific
>>>>> > > > > > > > > > tasks around dep versioning.
>>>>> > > > > > > > > > > >> > >>
>>>>> > > > > > > > > > > >> > >> The Go SDK implements the beam model
>>>>> efficiently for
>>>>> > > > > most
>>>>> > > > > > > > batch
>>>>> > > > > > > > > > tasks, including basic windowing.
>>>>> > > > > > > > > > > >> > >> Apache Beam Go jobs can execute, and are
>>>>> tested on all
>>>>> > > > > > > > Portable
>>>>> > > > > > > > > > runners.
>>>>> > > > > > > > > > > >> > >> The core APIs are not going to change in
>>>>> incompatible
>>>>> > > > > ways
>>>>> > > > > > > > going
>>>>> > > > > > > > > > forward.
>>>>> > > > > > > > > > > >> > >> Scalable transforms can be written through
>>>>> > > > > SplittableDoFns or
>>>>> > > > > > > > > > via Cross Language transforms.
>>>>> > > > > > > > > > > >> > >>
>>>>> > > > > > > > > > > >> > >> The SDK isn't 100% feature complete, but
>>>>> keeping it
>>>>> > > > > > > > experimental
>>>>> > > > > > > > > > doesn't help with that any further.
>>>>> > > > > > > > > > > >> > >> Communities grow through contributions and
>>>>> use, and
>>>>> > > > > > > > experimental
>>>>> > > > > > > > > > markers dissuade users.
>>>>> > > > > > > > > > > >> > >> There's plenty to do in order to expand what
>>>>> can be done
>>>>> > > > > with
>>>>> > > > > > > > the
>>>>> > > > > > > > > > SDK. (Contributions welcome)
>>>>> > > > > > > > > > > >> > >>
>>>>> > > > > > > > > > > >> > >> Why Exit Experimental now?
>>>>> > > > > > > > > > > >> > >>
>>>>> > > > > > > > > > > >> > >> Typically when we call an SDK or API
>>>>> Experimental, it's
>>>>> > > > > > > > because
>>>>> > > > > > > > > > there's a risk that API or behaviors may change
>>>>> significantly.
>>>>> > > > > > > > > > > >> > >> This in turn, leads to additional work for
>>>>> users of
>>>>> > > > > the SDK
>>>>> > > > > > > > on
>>>>> > > > > > > > > > every release which leads to sticking to older
>>>>> versions or
>>>>> > > > > forking
>>>>> > > > > > > > > > > >> > >> to preserve behavior. Version updates
>>>>> should be looked
>>>>> > > > > > > > forward
>>>>> > > > > > > > > > to, and viewed as having little risk. Further while
>>>>> there's been
>>>>> > > > > > > > > > > >> > >> previous discussion about what the "low
>>>>> bar" is for a
>>>>> > > > > new
>>>>> > > > > > > > SDK, it
>>>>> > > > > > > > > > hasn't been summarily applied to the Go SDK. I feel
>>>>> this has
>>>>> > > > > > > > > > > >> > >> hurt development and contribution of new
>>>>> SDK languages
>>>>> > > > > > > > (inherent
>>>>> > > > > > > > > > difficulty of SDK development notwithstanding).
>>>>> > > > > > > > > > > >> > >>
>>>>> > > > > > > > > > > >> > >> When the SDK was designed, it wasn't
>>>>> entirely clear
>>>>> > > > > what the
>>>>> > > > > > > > > > Beam Model should look like in an opinionated
>>>>> language like Go.
>>>>> > > > > > > > > > > >> > >> Their initial take (see
>>>>> > > > > > > > > > https://s.apache.org/beam-go-sdk-design-rfc [0])
>>>>> goes into
>>>>> > > > > detail
>>>>> > > > > > > > what it
>>>>> > > > > > > > > > means for a language without
>>>>> > > > > > > > > > > >> > >> Generics, or overloading, or inheritance
>>>>> to implement
>>>>> > > > > the
>>>>> > > > > > > > beam
>>>>> > > > > > > > > > model. One could largely throw away static types
>>>>> (like Python),
>>>>> > > > > > > > > > > >> > >> but this approach rings hollow for Go. It
>>>>> would not do
>>>>> > > > > if the
>>>>> > > > > > > > > > approach couldn't grow and scale to the Beam Model.
>>>>> It's also
>>>>> > > > > hard
>>>>> > > > > > > > > > > >> > >> to tell if an API is any good before there
>>>>> are users.
>>>>> > > > > > > > > > > >> > >>
>>>>> > > > > > > > > > > >> > >> Further, in the early days of Portability,
>>>>> there
>>>>> > > > > wasn't a
>>>>> > > > > > > > way to
>>>>> > > > > > > > > > write scalable DoFns, dynamically or otherwise. It's
>>>>> an
>>>>> > > > > incredible
>>>>> > > > > > > > > > > >> > >> bottleneck to need to do all initial
>>>>> fanout of work on
>>>>> > > > > a
>>>>> > > > > > > > single
>>>>> > > > > > > > > > machine, write everything to a Reshuffle, just in
>>>>> order to scale
>>>>> > > > > up.
>>>>> > > > > > > > > > > >> > >> Without being able to scale, Beam is
>>>>> little more than
>>>>> > > > > > > > overhead.
>>>>> > > > > > > > > > > >> > >>
>>>>> > > > > > > > > > > >> > >> At this point, both of these needs are met
>>>>> within the
>>>>> > > > > Go SDK
>>>>> > > > > > > > for
>>>>> > > > > > > > > > open source.
>>>>> > > > > > > > > > > >> > >>
>>>>> > > > > > > > > > > >> > >> Background
>>>>> > > > > > > > > > > >> > >>
>>>>> > > > > > > > > > > >> > >> The Go SDK has been a part of the beam
>>>>> repo for a few
>>>>> > > > > years
>>>>> > > > > > > > now,
>>>>> > > > > > > > > > since it was accidentally merged into master.
>>>>> > > > > > > > > > > >> > >> Since then it's been called experimental,
>>>>> and not
>>>>> > > > > officially
>>>>> > > > > > > > > > part of the releases.
>>>>> > > > > > > > > > > >> > >>
>>>>> > > > > > > > > > > >> > >> Of the SDKs, it was always designed
>>>>> around Beam
>>>>> > > > > Portability
>>>>> > > > > > > > > > first. It never had any "Legacy" (SDK x Runner
>>>>> specific )
>>>>> > > > > workers.
>>>>> > > > > > > > > > > >> > >> It's always used the Beam Pipeline protos
>>>>> and FnAPI to
>>>>> > > > > > > > execute
>>>>> > > > > > > > > > jobs, first with some very experimental code on
>>>>> Dataflow, but now
>>>>> > > > > > > > > > > >> > >> on all portable supported runners, like
>>>>> Flink, Spark,
>>>>> > > > > the
>>>>> > > > > > > > Python
>>>>> > > > > > > > > > Portable runner, and Dataflow.
>>>>> > > > > > > > > > > >> > >>
>>>>> > > > > > > > > > > >> > >> API Stability
>>>>> > > > > > > > > > > >> > >>
>>>>> > > > > > > > > > > >> > >> The Go SDK hasn't meaningfully changed
>>>>> its user API
>>>>> > > > > for DoFn
>>>>> > > > > > > > > > and pipeline construction since it was first merged
>>>>> in, and
>>>>> > > > > there are
>>>>> > > > > > > > no
>>>>> > > > > > > > > > > >> > >> changes to that on the horizon that can't
>>>>> be made in a
>>>>> > > > > > > > backwards
>>>>> > > > > > > > > > compatible manner. Largely these are related to New
>>>>> Features, or
>>>>> > > > > > > > > > > >> > >> usability improvements enabled by the
>>>>> advent of Go
>>>>> > > > > Generics
>>>>> > > > > > > > > > (think of "real" KV, emitter, and iterator types).
>>>>> > > > > > > > > > > >> > >>
>>>>> > > > > > > > > > > >> > >> It's an open secret that the Go SDK has
>>>>> largely been
>>>>> > > > > under
>>>>> > > > > > > > work
>>>>> > > > > > > > > > for use within Google. Its use is called FlumeGo,
>>>>> representing
>>>>> > > > > > > > > > > >> > >> the Apache Beam Go SDK, running on top of
>>>>> Flume,
>>>>> > > > > Google's
>>>>> > > > > > > > batch
>>>>> > > > > > > > > > pipeline processing engine. Thus most of the focus has been
>>>>> on improving
>>>>> > > > > > > > > > > >> > >> batch execution. FlumeGo sees ample use
>>>>> today, and
>>>>> > > > > there
>>>>> > > > > > > > hasn't
>>>>> > > > > > > > > > been a call for fundamental changes to the API for
>>>>> ergonomic or
>>>>> > > > > > > > > > > >> > >> usability concerns.
>>>>> > > > > > > > > > > >> > >>
>>>>> > > > > > > > > > > >> > >> Scalability
>>>>> > > > > > > > > > > >> > >>
>>>>> > > > > > > > > > > >> > >> Google could get away without the Go SDK
>>>>> having an SDK
>>>>> > > > > side
>>>>> > > > > > > > > > scalability solution as a result of its integration
>>>>> with Flume.
>>>>> > > > > > > > > > > >> > >> However, those days are now past.
>>>>> > > > > > > > > > > >> > >>
>>>>> > > > > > > > > > > >> > >> The Go SDK now supports SplittableDoFns
>>>>> along with
>>>>> > > > > Dynamic
>>>>> > > > > > > > > > Splitting, which supports writing scalable batch
>>>>> transforms
>>>>> > > > > natively
>>>>> > > > > > > > > > > >> > >> in the Go SDK.
>>>>> > > > > > > > > > > >> > >> The SDK also supports Cross Language
>>>>> Transforms, with
>>>>> > > > > Beam
>>>>> > > > > > > > > > Schema encodings. With it, production hardened
>>>>> transforms
>>>>> > > > > > > > > > > >> > >> from Java and Python are a wrapper away.
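As a rough picture of what dynamic splitting means for a SplittableDoFn, the sketch below splits the unprocessed part of an offset range at a given fraction. It is an assumption-laden illustration: offsetRange and trySplit are hypothetical names, not the Beam Go SDK's restriction or tracker types.

```go
package main

import "fmt"

// offsetRange is a hypothetical restriction covering [Start, End).
// It is not the Beam Go SDK's restriction type.
type offsetRange struct {
	Start, End int64
}

// trySplit divides the work not yet claimed. next is the first unclaimed
// offset, and fraction (clamped to [0, 1]) says how much of the remaining
// work the primary keeps. It returns ok=false when nothing can be handed off.
func trySplit(r offsetRange, next int64, fraction float64) (primary, residual offsetRange, ok bool) {
	if fraction < 0 {
		fraction = 0
	}
	if fraction > 1 {
		fraction = 1
	}
	if next < r.Start {
		next = r.Start
	}
	remaining := r.End - next
	if remaining <= 0 {
		return r, offsetRange{}, false
	}
	splitAt := next + int64(fraction*float64(remaining))
	if splitAt >= r.End {
		return r, offsetRange{}, false
	}
	primary = offsetRange{Start: r.Start, End: splitAt}
	residual = offsetRange{Start: splitAt, End: r.End}
	return primary, residual, true
}

func main() {
	// A worker has claimed offsets [0, 20) of [0, 100) when the runner
	// asks it to give away half of what remains.
	p, res, ok := trySplit(offsetRange{Start: 0, End: 100}, 20, 0.5)
	fmt.Println(p, res, ok)
}
```

The runner can then schedule the residual on another worker, which is what lets a single large element scale out instead of bottlenecking on one machine.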
>>>>> > > > > > > > > > > >> > >>
>>>>> > > > > > > > > > > >> > >> Presently, Daniel Oliveira (who
>>>>> implemented the SDF
>>>>> > > > > side
>>>>> > > > > > > > work,
>>>>> > > > > > > > > > and completed the Xlang work,) is adding a wrapper
>>>>> for the
>>>>> > > > > > > > > > > >> > >> Java Kafka IO using Cross Language
>>>>> Transforms, which
>>>>> has often
>>>>> > > > > > > > > > been requested. This will also enable use of the
>>>>> Beam SQL
>>>>> > > > > > > > > > > >> > >> transforms that java enables.
>>>>> > > > > > > > > > > >> > >>
>>>>> > > > > > > > > > > >> > >> Features
>>>>> > > > > > > > > > > >> > >>
>>>>> > > > > > > > > > > >> > >> The Go SDK implements the Beam core. The
>>>>> Go SDK
>>>>> > > > > implements
>>>>> > > > > > > > > > standard coders, allows for user DoFns, and
>>>>> CombineFns and access
>>>>> > > > > > > > > > > >> > >> to core transforms like Flatten,
>>>>> GroupByKey, and
>>>>> > > > > features
>>>>> > > > > > > > like
>>>>> > > > > > > > > > Side Inputs, Windowing, and User Metrics.
>>>>> > > > > > > > > > > >> > >> Basic windowing will be fully supported
>>>>> for batch even
>>>>> > > > > > > > through
>>>>> > > > > > > > > > lifted combines in the 2.32.0 release.
>>>>> > > > > > > > > > > >> > >>
>>>>> > > > > > > > > > > >> > >> All of the above enables Beam Go to be
>>>>> versatile for
>>>>> > > > > batch
>>>>> > > > > > > > > > execution on portable runners, and for simple
>>>>> streaming
>>>>> > > > > pipelines.
>>>>> > > > > > > > > > > >> > >>
>>>>> > > > > > > > > > > >> > >> Repo Testing
>>>>> > > > > > > > > > > >> > >>
>>>>> > > > > > > > > > > >> > >> On precommit the Go SDK runs all its unit
>>>>> tests. On
>>>>> > > > > top of
>>>>> > > > > > > > > > > that, it runs all its integration tests against the
>>>>> Python
>>>>> > > > > Portable
>>>>> > > > > > > > runner,
>>>>> > > > > > > > > > > >> > >> making it quick and robust to detect
>>>>> breaking changes
>>>>> > > > > without
>>>>> > > > > > > > > > overspending community resources. Those same tests
>>>>> are also
>>>>> > > > > > > > > > > >> > >> run against Dataflow, Flink, and Spark.
>>>>> > > > > > > > > > > >> > >>
>>>>> > > > > > > > > > > >> > >> The tests are executable against all
>>>>> runners via the
>>>>> > > > > > > > appropriate
>>>>> > > > > > > > > > Go commands (if you've stood up your own job
>>>>> management server),
>>>>> > > > > > > > > > > >> > >> or Gradle commands (which will spin up
>>>>> runner
>>>>> > > > > instances for
>>>>> > > > > > > > > > you). Documentation for executing tests and adding
>>>>> new ones
>>>>> > > > > > > > > > > >> > >> is on the wiki. [2] They are accessible to
>>>>> Go
>>>>> > > > > developers as
>>>>> > > > > > > > > > they're implemented with the standard Go testing
>>>>> tools.
>>>>> > > > > > > > > > > >> > >>
>>>>> > > > > > > > > > > >> > >> Shortcomings
>>>>> > > > > > > > > > > >> > >> That said, there's still much to do. Let
>>>>> me briefly
>>>>> > > > > tell you
>>>>> > > > > > > > > > what doesn't work, and it's up to you to weigh
>>>>> whether they block
>>>>> > > > > > > > > > > >> > >> being out of experimental.
>>>>> > > > > > > > > > > >> > >>
>>>>> > > > > > > > > > > >> > >> At present, only textio has been
>>>>> implemented as a
>>>>> > > > > Splittable
>>>>> > > > > > > > > > DoFn.
>>>>> > > > > > > > > > > >> > >> Once the Kafka wrapper is merged in, it
>>>>> will serve as
>>>>> > > > > the
>>>>> > > > > > > > > > first example for future contributions for
>>>>> > > > > > > > > > > >> > >> new transform wrappers for the Go SDK.
>>>>> > > > > > > > > > > >> > >> Transforms and IOs are lacking, but at
>>>>> this point
>>>>> > > > > users are
>>>>> > > > > > > > > > empowered to write their own DoFns or wrap existing
>>>>> transforms
>>>>> > > > > for
>>>>> > > > > > > > Cross
>>>>> > > > > > > > > > Language use.
>>>>> > > > > > > > > > > >> > >>
>>>>> > > > > > > > > > > >> > >> In the core SDK, more streaming focused
>>>>> features have
>>>>> > > > > yet to
>>>>> > > > > > > > be
>>>>> > > > > > > > > > implemented, but they're largely additions to what
>>>>> exists already
>>>>> > > > > > > > > > > >> > >> rather than total rebuilds. Much of the
>>>>> work is
>>>>> > > > > defining
>>>>> > > > > > > > how a
>>>>> > > > > > > > > > user specifies their desires, and turning those into
>>>>> the
>>>>> > > > > appropriate
>>>>> > > > > > > > > > > >> > >> FnAPI requests at execution time. Back in
>>>>> October I
>>>>> > > > > wrote at
>>>>> > > > > > > > > > length on the wiki [1] what's missing for additional
>>>>> streaming
>>>>> > > > > > > > features.
>>>>> > > > > > > > > > > >> > >>
>>>>> > > > > > > > > > > >> > >> While we have bolstered our testing
>>>>> recently, there's
>>>>> > > > > likely
>>>>> > > > > > > > > > still more we could test to improve our confidence
>>>>> in the SDK,
>>>>> > > > > > > > > > > >> > >> in particular regarding the included
>>>>> transforms
>>>>> > > > > libraries and
>>>>> > > > > > > > > > examples.
>>>>> > > > > > > > > > > >> > >>
>>>>> > > > > > > > > > > >> > >> Moving Forward
>>>>> > > > > > > > > > > >> > >>
>>>>> > > > > > > > > > > >> > >> My immediate plan is to work on
>>>>> incorporating the Go
>>>>> > > > > SDK
>>>>> > > > > > > > fully
>>>>> > > > > > > > > > into the Beam Programming Guide. I've audited the
>>>>> guide [3], and
>>>>> > > > > > > > > > > >> > >> am beginning to add missing content and
>>>>> fill in the
>>>>> > > > > Go
>>>>> > > > > > > > > > specific gaps. This will be tied to improving the Go
>>>>> Doc with
>>>>> > > > > more Go
>>>>> > > > > > > > > > > >> > >> specific user documentation that isn't
>>>>> appropriate for
>>>>> > > > > the
>>>>> > > > > > > > BPG.
>>>>> > > > > > > > > > > >> > >> And resolving the LICENSE issue around the
>>>>> public
>>>>> > > > > display of
>>>>> > > > > > > > > > that GoDoc.
>>>>> > > > > > > > > > > >> > >>
>>>>> > > > > > > > > > > >> > >> If this proposal is accepted by a binding
>>>>> vote, I will
>>>>> > > > > > > > > > incorporate the SDK into the release process, and
>>>>> remove the
>>>>> > > > > > > > "experimental"
>>>>> > > > > > > > > > > >> > >> language around the SDK. This largely
>>>>> entails updating
>>>>> > > > > the
>>>>> > > > > > > > > > release scripts to also build and publish the Go SDK
>>>>> Docker
>>>>> > > > > containers.
>>>>> > > > > > > > > > > >> > >> As for releasing the code, we're
>>>>> technically already
>>>>> > > > > doing so
>>>>> > > > > > > > > > whenever we tag a release branch [4].
>>>>> > > > > > > > > > > >> > >>
>>>>> > > > > > > > > > > >> > >> The clearest signal to the Go community
>>>>> however will be
>>>>> > > > > > > > > > migrating the SDK to use Go Modules for dependency
>>>>> version
>>>>> > > > > control,
>>>>> > > > > > > > > > > >> > >> which Daniel is planning on working on
>>>>> after his Kafka
>>>>> > > > > task.
>>>>> > > > > > > > > > This will put our repo infrastructure, SDK
>>>>> contributors, and
>>>>> > > > > users
>>>>> > > > > > > > > > > >> > >> on the same footing when it comes to
>>>>> dependency
>>>>> > > > > management.
>>>>> > > > > > > > It
>>>>> > > > > > > > > > will remove the "+incompatible" tags one sees on the
>>>>> > > > > > > > > > > >> > >> pkg.go.dev list at [4].
>>>>> > > > > > > > > > > >> > >>
>>>>> > > > > > > > > > > >> > >> I'm very happy to answer any questions you
>>>>> might have
>>>>> > > > > about
>>>>> > > > > > > > the
>>>>> > > > > > > > > > SDK, and provide additional links as needed. I
>>>>> intentionally
>>>>> > > > > avoided
>>>>> > > > > > > > > > > >> > >> a link barrage in this email, as they can
>>>>> distract
>>>>> > > > > from the
>>>>> > > > > > > > > > > point: The SDK is ready for folks to use; we need
>>>>> to tell
>>>>> > > > > them that
>>>>> > > > > > > > they
>>>>> > > > > > > > > > can
>>>>> > > > > > > > > > > >> > >> rather than they shouldn't.
>>>>> > > > > > > > > > > >> > >>
>>>>> > > > > > > > > > > >> > >> Robert Burke
>>>>> > > > > > > > > > > >> > >> De facto Beam Go TL
>>>>> > > > > > > > > > > >> > >>
>>>>> > > > > > > > > > > >> > >> [0]
>>>>> https://s.apache.org/beam-go-sdk-design-rfc
>>>>> > > > > > > > > > > >> > >> [1]
>>>>> > > > > > > > > >
>>>>> > > > > > > >
>>>>> > > > >
>>>>> https://cwiki.apache.org/confluence/display/BEAM/Supporting+Streaming+in+the+Go+SDK
>>>>> > > > > > > > > > > >> > >> [2]
>>>>> > > > > https://cwiki.apache.org/confluence/display/BEAM/Go+Tips
>>>>> > > > > > > > > > > >> > >> [3]
>>>>> > > > > > > > > >
>>>>> > > > > > > >
>>>>> > > > >
>>>>> https://docs.google.com/spreadsheets/d/1DrBFjxPBmMMmPfeFr6jr_JndxGOes8qDqKZ2Uxwvvds/edit?resourcekey=0-tVFwcLrQ2v2jpZkHk6QOpQ#gid=2072310090
>>>>> > > > > > > > > > (SDK Audit sheet)
>>>>> > > > > > > > > > > >> > >> [4]
>>>>> > > > > > > > > >
>>>>> > > > > > > >
>>>>> > > > >
>>>>> https://pkg.go.dev/github.com/apache/beam/sdks/go/pkg/beam?tab=versions
>>>>> > > > > > > > > > > >> >
>>>>> > > > > > > > > >
>>>>> > > > > > > > >
>>>>> > > > > > > >
>>>>> > > > > > >
>>>>> > > > > >
>>>>> > > > >
>>>>> > > >
>>>>> > >
>>>>>
>>>>
