Investigation confirmed that there's no way around the prefixed tags. I've commented on the JIRA with the explanation.
https://github.com/apache/beam/pull/15881 has the release script updates.

I'm working on the Exit blogpost and the updated Go SDK roadmap, and I'll link the draft PR here as soon as it's ready. Since 2.34.0 is almost out (assuming RC1 verification goes well), I'm inclined to wait for that release to finish before publishing the blogpost.

Once 2.34.0 is released, I'm still inclined to add a prefixed tag for 2.33.0 as well, so there isn't a gap in versions between the unmoduled code and moduled code.

Once the blogpost is published, that'll be the end of this thread.

Thank you very much everyone.

Robert Burke
Beam Go Busybody

On Tue, Oct 26, 2021, 5:36 PM Kyle Weaver <[email protected]> wrote:

> +1 to extra tags. They'll be trivial to add to our release process, and git tags are lightweight by design so I don't foresee any problems.
>
> On Tue, Oct 26, 2021 at 5:27 PM Robert Bradshaw <[email protected]> wrote:
>
>> Glad you were able to figure it out. The extra tags are certainly worth making this work if it's what we have to do, and shouldn't be too much of a problem (until, hopefully, it's fixed on the go side).
>>
>> On Tue, Oct 26, 2021 at 4:53 PM Robert Burke <[email protected]> wrote:
>> >
>> > With Kyle's help with the additional tagging of the next RC, we have validated that this is the currently correct approach.
>> >
>> > https://pkg.go.dev/github.com/apache/beam/sdks/[email protected]/go/pkg/beam?tab=versions
>> > https://pkg.go.dev/github.com/apache/beam/sdks/[email protected]/go/pkg/beam
>> >
>> > Or even:
>> > https://pkg.go.dev/github.com/apache/beam/sdks/v2/go/pkg/beam (links to latest tagged version)
>> >
>> > The main cost to this approach is doubling the number of tags in the tags list: https://github.com/apache/beam/tags which is not ideal, but overall a small cost.
>> > There's no need for a "full publish" of these additional tags, so we won't be doubling our "releases" (see https://github.com/apache/beam/releases).
>> >
>> > I'll still be filing a bug against the Go commands since the mandatory prefixing is unintuitive, and seems unnecessary. If it does become unnecessary, we can always delete the tags from the affected branches, and cease the behavior going forward. I'll search through the existing Go issues first however to see if this has been previously discussed, and report my findings here either way.
>> >
>> > This does require 2 small changes to the release guide: the RC tagging script, and the final tagging:
>> > https://github.com/apache/beam/blob/243128a8fc52798e1b58b0cf1a271d95ee7aa241/release/src/main/scripts/choose_rc_commit.sh
>> >
>> > https://github.com/apache/beam/blob/f8660d343fb218cb7acce81ddcc49de0710a0d14/website/www/site/content/en/contribute/release-guide.md#git-tag
>> >
>> > I'll make this change later this week (or early next) assuming there are no objections.
>> >
>> > Thank you all very much for your patience,
>> > Robert Burke
>> > Beam Go Busybody
>> >
>> >
>> > On 2021/10/26 23:01:00, Robert Burke <[email protected]> wrote:
>> > > With much research in reading the Go Modules documentation, I have confirmed what the issue is.
>> > >
>> > > We added the go.mod file to sdks/ under the repo root because it's a cleaner spot for the change, captures the Java and Python container boot code (written in Go) into the module and avoids conflicts in interpretations of the vendor directory that lives at the root level.
>> > >
>> > > However, we missed that when doing so, the standard version tags would only apply to modules at the root level, not at modules in subdirectories.
>> > > See https://golang.org/ref/mod#vcs-version, but quoting the important paragraph:
>> > >
>> > > > If a module is defined in a subdirectory within the repository, that is, the module subdirectory portion of the module path is not empty, then each tag name must be prefixed with the module subdirectory, followed by a slash. For example, the module golang.org/x/tools/gopls is defined in the gopls subdirectory of the repository with root path golang.org/x/tools. The version v0.4.0 of that module must have the tag named gopls/v0.4.0 in that repository.
>> > >
>> > > Specifically, for the Go SDK to be able to be fetched at the right version, we need to have prefixed tags like "sdks/v2.33.0" or "sdks/v2.34.0-RC1".
>> > >
>> > > So, the fix for the Go versioning issue is to amend our Release process (including generating Release Candidate builds) to also add a prefixed version tag with the same version.
>> > >
>> > > I can work with Kyle to validate this for 2.34.0 RC1, and if there are no objections we can back update the 2.33.0 release branch with such a prefixed tag. At which point I can also write the Official Experimental Exit Blog post.
>> > >
>> > > Thank you all for your patience.
>> > > Robert Burke
>> > >
>> > > On 2021/10/14 00:00:53, Ahmet Altay <[email protected]> wrote:
>> > > > Thank you for the detailed update! Let us know if we can help.
>> > > >
>> > > > On Wed, Oct 13, 2021 at 2:42 PM Robert Burke <[email protected]> wrote:
>> > > >
>> > > > > This is a status update.
>> > > > >
>> > > > > At this point 2.33.0 is released, but there are difficulties with accessing the tagged versions using the standard go tools. It's currently under investigation.
>> > > > >
>> > > > > Using the v2 path in a go program and then running `go mod tidy` will populate the go.mod file with a pseudo-version rather than the latest tag (v2.33.0), e.g. the line looks like
>> > > > > require github.com/apache/beam/sdks/v2 v2.0.0-20211013181004-a9120e083008
>> > > > >
>> > > > > While this will work, it's not the desired experience for users at this point. Current downside is that the releases are not meaningful targets for some reason. However, we retain the other benefits of Go Modules (actual dependency versioning, management by go tools).
>> > > > >
>> > > > > The issue is some combination of the go tooling [A], that we added a go mod file outside of the repo root [B], and that we did not increment the major version (v2 -> v3) when adding the go mod file [C].
>> > > > >
>> > > > > [B] From the go documentation, this should be legal and fine, even if it's not recommended. This is fortunate because the root of the repo would have played poorly with root vendor directory, which the go tools have opinions on.
>> > > > >
>> > > > > [C] Incrementing the major version is recommended, in the Go Modules documentation, when transitioning to Go Modules. However, it never said it was required, nor did it indicate this current failure mode. If anything this should be documented in those docs, if it's not another bug. We would not necessarily want to declare a global v3 for beam at this time, for just the Go SDK, it would become confusing rather quickly. Notionally there are some larger breaking changes the Java and Python SDKs would want to make in such an event, and thus it's a larger conversation, that is out of scope at this time.
>> > > > >
>> > > > > This leaves [A] where some misunderstanding of the documented semantics occurred. I certainly expected the tagged version of the non-root go-module to be inherited from the parent, not wholesale ignored. As a result, I'll be filing a bug against the go tools to determine this, and see what paths forward exist.
>> > > > >
>> > > > > It's my hope to resolve this before we write a proper Experimental Exit blog post for the Go SDK.
>> > > > >
>> > > > > Thank you for your patience, and time.
>> > > > > Robert Burke
>> > > > > Beam Go Busybody
>> > > > >
>> > > > > On 2021/08/23 18:12:00, Robert Burke <[email protected]> wrote:
>> > > > > > With 2.32 the LICENSE issue has been fixed [1], and the SDK now uses Go Modules for dependency management, simplifying Go SDK contributions. [2]
>> > > > > >
>> > > > > > The Module file lives in the sdks/ directory so there's a single Go Module for the whole SDK, tests, examples, and any support code for the container boot builds. This excludes the Go SDK Code katas [3] go modules which can be updated once 2.33.0 has been released.
>> > > > > >
>> > > > > > PR 15365 [4] adds the SDK containers back to the release builds, and by default uses the release specific container for docker execution jobs. For at least the 2.33.0 release this does mean that manual validation will need to explicitly specify RC versions of containers. However, given that the Go SDK container and worker boot process rarely changes, this is unlikely to be an issue.
>> > > > > >
>> > > > > > At present I'm cleaning up some of the references to experimental, and making it clear that 2.33.0 is the first non-experimental release (even though that's 4-6 weeks out from actual release.)
CHANGES.md >> will be >> > > > > updated to note the event, but a larger blogpost will happen >> after the >> > > > > release goes public. >> > > > > > >> > > > > > Cheers, >> > > > > > Robert Burke >> > > > > > Defacto Beam Go TL. >> > > > > > >> > > > > > [1] >> > > > > >> https://pkg.go.dev/github.com/apache/[email protected]+incompatible/sdks/go/pkg/beam >> > > > > > [2] https://github.com/apache/beam/pull/15323 >> > > > > > [3] >> https://github.com/apache/beam/tree/master/learning/katas/go >> > > > > > [4] https://github.com/apache/beam/pull/15365 >> > > > > > >> > > > > > On 2021/06/28 23:12:19, Ahmet Altay <[email protected]> wrote: >> > > > > > > +1, congratulations & thank you! >> > > > > > > >> > > > > > > On Tue, Jun 22, 2021 at 3:15 PM Robert Burke < >> [email protected]> >> > > > > wrote: >> > > > > > > >> > > > > > > > Regarding documentation update: Initial PR is >> > > > > > > > https://github.com/apache/beam/pull/15057 which goes up to >> section >> > > > > ~4.3. >> > > > > > > > JIRA link for Programing Guide changes: >> > > > > > > > https://issues.apache.org/jira/browse/BEAM-12513 >> > > > > > > > >> > > > > > > > >> > > > > > > > On 2021/06/17 14:58:54, Robert Burke <[email protected]> >> wrote: >> > > > > > > > > Yup! >> > > > > > > > > >> > > > > > > > > My immediate plan is to work on incorporating the Go SDK >> fully >> > > > > into the >> > > > > > > > > Beam Programming Guide. I've audited the guide, and >> > > > > > > > > am beginning to add missing content and filling in the Go >> specific >> > > > > gaps. >> > > > > > > > > This will be tied to improving the Go Doc with more Go >> > > > > > > > > specific user documentation that isn't appropriate for >> the BPG. 
>> > > > > > > > > >> > > > > > > > > My audit of the guide is here: >> > > > > > > > > >> > > > > > > > >> > > > > >> https://docs.google.com/spreadsheets/d/1DrBFjxPBmMMmPfeFr6jr_JndxGOes8qDqKZ2Uxwvvds/edit?resourcekey=0-tVFwcLrQ2v2jpZkHk6QOpQ#gid=2072310090 >> > > > > > > > > >> > > > > > > > > The other sheets focus on features and tests. The feature >> page >> > > > > looks >> > > > > > > > worse >> > > > > > > > > than it is, as it was more productive to focus on what >> isn't >> > > > > available >> > > > > > > > than >> > > > > > > > > what is. That's a snapshot of my actual working sheet but >> I'll be >> > > > > > > > updating >> > > > > > > > > it as needed. >> > > > > > > > > >> > > > > > > > > On Thu, Jun 17, 2021, 6:23 AM Ismaël Mejía < >> [email protected]> >> > > > > wrote: >> > > > > > > > > >> > > > > > > > > > Oups forgot to write one question. Will this come with >> revamped >> > > > > > > > > > website instructions/doc for golang too? >> > > > > > > > > > >> > > > > > > > > > On Thu, Jun 17, 2021 at 3:21 PM Ismaël Mejía < >> [email protected]> >> > > > > > > > wrote: >> > > > > > > > > > > >> > > > > > > > > > > Huge +1 >> > > > > > > > > > > >> > > > > > > > > > > This is definitely something many people have asked >> about, so >> > > > > it is >> > > > > > > > > > > great to see it finally happening. >> > > > > > > > > > > >> > > > > > > > > > > On Wed, Jun 16, 2021 at 7:56 PM Kenneth Knowles < >> > > > > [email protected]> >> > > > > > > > wrote: >> > > > > > > > > > > > >> > > > > > > > > > > > +1 awesome >> > > > > > > > > > > > >> > > > > > > > > > > > On Wed, Jun 16, 2021 at 10:33 AM Robert Burke < >> > > > > [email protected] >> > > > > > > > > >> > > > > > > > > > wrote: >> > > > > > > > > > > >> >> > > > > > > > > > > >> Sounds reasonable to me. I agree. 
We'll aim to get >> those (Go >> > > > > > > > modules >> > > > > > > > > > and LICENSE issue) done before the 2.32 cut, and >> certainly >> > > > > before the >> > > > > > > > 2.33 >> > > > > > > > > > cut if release images aren't added to the 2.32 process. >> > > > > > > > > > > >> >> > > > > > > > > > > >> Regarding Go Generics: at some point in the >> future, we may >> > > > > want a >> > > > > > > > > > harder break between a newer Generic first API and and >> the >> > > > > current >> > > > > > > > version, >> > > > > > > > > > but there's no rush. Generics/TypeParameters in Go >> aren't >> > > > > identical to >> > > > > > > > the >> > > > > > > > > > feature referred to by that term in Java, C++, Rust, >> etc, so >> > > > > it'll >> > > > > > > > take a >> > > > > > > > > > bit of time for that expertise to develop. >> > > > > > > > > > > >> >> > > > > > > > > > > >> However, by the current nature of Go, we had to >> have pretty >> > > > > > > > > > sophisticated reflective analysis to handle DoFns and >> map them >> > > > > to their >> > > > > > > > > > graph inputs. So, adding new helpers like a KV, >> emitter, and >> > > > > Iterator >> > > > > > > > > > types, shouldn't be too difficult. Changing Go SDK >> internals to >> > > > > use >> > > > > > > > > > generics (like the implementation of Stats DoFns like >> Min, Max, >> > > > > etc) >> > > > > > > > would >> > > > > > > > > > also be able to be made transparently to most users, and >> > > > > certainly any >> > > > > > > > of >> > > > > > > > > > the framework for execution time handling (the >> "worker's SDK >> > > > > harness") >> > > > > > > > > > would be able to be cleaned up if need be. 
Finally, >> adding more >> > > > > > > > > > sophisticated DoFn registration and code generation >> would be >> > > > > able to >> > > > > > > > > > replace the optional code generator entirely, saving >> some users >> > > > > a `go >> > > > > > > > > > generate` step, simplifying getting improved execution >> > > > > performance. >> > > > > > > > > > > >> >> > > > > > > > > > > >> Changing things like making a Type Parameterized >> > > > > PCollection, >> > > > > > > > would >> > > > > > > > > > be far more involved, as would trying to use some kind >> of Apply >> > > > > > > > format. The >> > > > > > > > > > lack of Method Overrides prevents the apply chaining >> approach. >> > > > > Or at >> > > > > > > > least >> > > > > > > > > > prevents it from working simply. >> > > > > > > > > > > >> >> > > > > > > > > > > >> Finally, Go Generics won't be available until Go >> 1.18, >> > > > > which isn't >> > > > > > > > > > until next year. See >> https://blog.golang.org/generics-proposal >> > > > > for >> > > > > > > > > > details. >> > > > > > > > > > > >> >> > > > > > > > > > > >> Go 1.17 https://tip.golang.org/doc/go1.17 does >> include a >> > > > > Register >> > > > > > > > > > calling convention, leading to a modest performance >> improvement >> > > > > across >> > > > > > > > the >> > > > > > > > > > board. >> > > > > > > > > > > >> >> > > > > > > > > > > >> Cheers, >> > > > > > > > > > > >> Robert Burke >> > > > > > > > > > > >> >> > > > > > > > > > > >> On 2021/06/15 18:10:46, Robert Bradshaw < >> > > > > [email protected]> >> > > > > > > > wrote: >> > > > > > > > > > > >> > +1 to declaring Golang support out of >> experimental once >> > > > > the Go >> > > > > > > > > > Modules >> > > > > > > > > > > >> > issues are solved. 
I don't think an SDK needs to >> support >> > > > > every >> > > > > > > > > > feature >> > > > > > > > > > > >> > to be accepted, especially now that we can do >> > > > > cross-language >> > > > > > > > > > > >> > transforms, and Go definitely supports enough to >> be quite >> > > > > > > > useful. >> > > > > > > > > > (WRT >> > > > > > > > > > > >> > streaming, my understanding is that Go supports >> the >> > > > > streaming >> > > > > > > > model >> > > > > > > > > > > >> > with windows and timestamps, and runs fine on a >> streaming >> > > > > > > > runner, >> > > > > > > > > > even >> > > > > > > > > > > >> > if more advanced features like state and timers >> aren't yet >> > > > > > > > > > available.) >> > > > > > > > > > > >> > >> > > > > > > > > > > >> > This is a great milestone. >> > > > > > > > > > > >> > >> > > > > > > > > > > >> > On Tue, Jun 15, 2021 at 10:12 AM Tyson Hamilton < >> > > > > > > > [email protected]> >> > > > > > > > > > wrote: >> > > > > > > > > > > >> > > >> > > > > > > > > > > >> > > WOW! Big news. >> > > > > > > > > > > >> > > >> > > > > > > > > > > >> > > I'm supportive of leaving experimental status >> after Go >> > > > > Modules >> > > > > > > > > > are completed and the LICENSE issue is resolved. I >> don't think >> > > > > that >> > > > > > > > lacking >> > > > > > > > > > streaming support is a blocker. The other thing I >> checked to see >> > > > > was if >> > > > > > > > > > there were metrics available on metrics.beam.apache.org >> , >> > > > > specifically >> > > > > > > > for >> > > > > > > > > > measuring code health via post-commit over time, which >> there are >> > > > > and >> > > > > > > > the >> > > > > > > > > > passing test rate is high (Huzzah!). The one thing that >> > > > > surprised me >> > > > > > > > from >> > > > > > > > > > your summary is that when Go introduces generics it >> won't result >> > > > > in any >> > > > > > > > > > backwards incompatible changes in Apache Beam. 
That's >> great >> > > > > news, but >> > > > > > > > does >> > > > > > > > > > it mean there will be a need to support both >> non-generic and >> > > > > generic >> > > > > > > > APIs >> > > > > > > > > > moving forward? It seems like generics will be >> introduced in the >> > > > > Go >> > > > > > > > 1.17 >> > > > > > > > > > release (optimistically) in August this year. >> > > > > > > > > > > >> > > >> > > > > > > > > > > >> > > >> > > > > > > > > > > >> > > >> > > > > > > > > > > >> > > On Thu, Jun 10, 2021 at 5:04 PM Robert Burke < >> > > > > > > > [email protected]> >> > > > > > > > > > wrote: >> > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> Hello Beam Community! >> > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> I propose we stop calling the Apache Beam Go >> SDK >> > > > > > > > experimental. >> > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> This thread is to discuss it as a community, >> and any >> > > > > > > > conditions >> > > > > > > > > > that remain that would prevent the exit. >> > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> tl;dr; >> > > > > > > > > > > >> > >> Ask Questions for answers and links! I have >> both. >> > > > > > > > > > > >> > >> This entails including it officially in the >> Release >> > > > > process, >> > > > > > > > > > removing the various "experimental" text throughout the >> repo etc, >> > > > > > > > > > > >> > >> and otherwise treating it like Python and >> Java. Some Go >> > > > > > > > specific >> > > > > > > > > > tasks around dep versioning. >> > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> The Go SDK implements the beam model >> efficiently for >> > > > > most >> > > > > > > > batch >> > > > > > > > > > tasks, including basic windowing. >> > > > > > > > > > > >> > >> Apache Beam Go jobs can execute, and are >> tested on all >> > > > > > > > Portable >> > > > > > > > > > runners. 
>> > > > > > > > > > > >> > >> The core APIs are not going to change in >> incompatible >> > > > > ways >> > > > > > > > going >> > > > > > > > > > forward. >> > > > > > > > > > > >> > >> Scalable transforms can be written through >> > > > > SplittableDoFns or >> > > > > > > > > > via Cross Language transforms. >> > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> The SDK isn't 100% feature complete, but >> keeping it >> > > > > > > > experimental >> > > > > > > > > > doesn't help with that any further. >> > > > > > > > > > > >> > >> Communities grow through contributions and >> use, and >> > > > > > > > experimental >> > > > > > > > > > markers dissuade users. >> > > > > > > > > > > >> > >> There's plenty to do in order expand what can >> be done >> > > > > with >> > > > > > > > the >> > > > > > > > > > SDK. (Contributions welcome) >> > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> Why Exit Experimental now? >> > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> Typically when we call an SDK or API >> Experimental, it's >> > > > > > > > because >> > > > > > > > > > there's a risk that API or behaviors may change >> significantly. >> > > > > > > > > > > >> > >> This in turn, leads to additional work for >> users of >> > > > > the SDK >> > > > > > > > on >> > > > > > > > > > every release which leads to sticking to older versions >> or >> > > > > forking >> > > > > > > > > > > >> > >> to preserve behavior. Version updates should >> be looked >> > > > > > > > forward >> > > > > > > > > > to, and viewed as having little risk. Further while >> there's been >> > > > > > > > > > > >> > >> previous dicussion about what the "low bar" >> is for a >> > > > > new >> > > > > > > > SDK, it >> > > > > > > > > > hasn't been summarily applied to the Go SDK. 
I feel >> this has >> > > > > > > > > > > >> > >> hurt development and contribution of new SDK >> languages >> > > > > > > > (inherent >> > > > > > > > > > difficulty of SDK development notwithstanding). >> > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> When the SDK was designed, it wasn't entirely >> clear >> > > > > what the >> > > > > > > > > > Beam Model should look like in an opinionated language >> like Go. >> > > > > > > > > > > >> > >> Their initial take (see >> > > > > > > > > > https://s.apache.org/beam-go-sdk-design-rfc [0]) goes >> into >> > > > > detail >> > > > > > > > what it >> > > > > > > > > > means for a language without >> > > > > > > > > > > >> > >> Generics, or overloading, or inheritance to >> implement >> > > > > the >> > > > > > > > beam >> > > > > > > > > > model. One could largely throw away static types (like >> Python), >> > > > > > > > > > > >> > >> but this approach rings hollow for Go. It >> would not do >> > > > > if the >> > > > > > > > > > approach couldn't grow and scale to the Beam Model. >> It's also >> > > > > hard >> > > > > > > > > > > >> > >> to tell if an API is any good before there >> are users. >> > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> Further, in the early days of Portability, >> there >> > > > > wasn't a >> > > > > > > > way to >> > > > > > > > > > write scalable DoFns, dynamically or otherwise. It's an >> > > > > incredible >> > > > > > > > > > > >> > >> bottleneck to need to do all initial fanout >> of work on >> > > > > a >> > > > > > > > single >> > > > > > > > > > machine, write everything to a Reshuffle, just in order >> to scale >> > > > > up. >> > > > > > > > > > > >> > >> Without being able to scale, Beam is little >> more than >> > > > > > > > overhead. >> > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> At this point, both of these needs are met >> within the >> > > > > Go SDK >> > > > > > > > for >> > > > > > > > > > open source. 
>> > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> Background >> > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> The Go SDK has been a part of the beam repo >> for a few >> > > > > years >> > > > > > > > now, >> > > > > > > > > > since it was accidentally merged into master. >> > > > > > > > > > > >> > >> Since then it's been called experimental, and >> not >> > > > > officially >> > > > > > > > > > part of the releases. >> > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> Of the SDKs, it's was always designed around >> Beam >> > > > > Portability >> > > > > > > > > > first. It never had any "Legacy" (SDK x Runner specific >> ) >> > > > > workers. >> > > > > > > > > > > >> > >> It's always used the Beam Pipeline protos and >> FnAPI to >> > > > > > > > execute >> > > > > > > > > > jobs, first with some very experimental code on >> Dataflow, but now >> > > > > > > > > > > >> > >> on all portable supported runners, like >> Flink, Spark, >> > > > > the >> > > > > > > > Python >> > > > > > > > > > Portable runner, and Dataflow. >> > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> API Stability >> > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> The Go SDK hasn't meaningfully changed it's >> user API >> > > > > for DoFn >> > > > > > > > > > and pipeline construction since it was first merged in, >> and >> > > > > there are >> > > > > > > > no >> > > > > > > > > > > >> > >> changes to that on the horizon that can't be >> made in a >> > > > > > > > backwards >> > > > > > > > > > compatible manner. Largely these are related to New >> Features, or >> > > > > > > > > > > >> > >> usability improvements enabled by the advent >> of Go >> > > > > Generics >> > > > > > > > > > (think of "real" KV, emitter, and iterator types). >> > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> It's an open secret that the Go SDK has >> largely been >> > > > > under >> > > > > > > > work >> > > > > > > > > > for use within Google. 
It's use is called FlumeGo, >> representing >> > > > > > > > > > > >> > >> the Apache Beam Go SDK, running on top of >> Flume, >> > > > > Google's >> > > > > > > > batch >> > > > > > > > > > pipeline processing engine. Thus most of the focus on >> improving >> > > > > > > > > > > >> > >> batch execution. FlumeGo sees ample use >> today, and >> > > > > there >> > > > > > > > hasn't >> > > > > > > > > > been a call for fundamental changes to the API for >> ergonomic or >> > > > > > > > > > > >> > >> usability concerns. >> > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> Scalability >> > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> Google could get away without the Go SDK >> having an SDK >> > > > > side >> > > > > > > > > > scalability solution as a result of it's integration >> with Flume. >> > > > > > > > > > > >> > >> However, those days are now past. >> > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> The Go SDK now supports SplittableDoFns along >> with >> > > > > Dynamic >> > > > > > > > > > Splitting, which supports writing scalable batch >> transforms >> > > > > natively >> > > > > > > > > > > >> > >> in the Go SDK. >> > > > > > > > > > > >> > >> The SDK also supports Cross Language >> Transforms, with >> > > > > Beam >> > > > > > > > > > Schema encodings. With it, production hardened >> transforms >> > > > > > > > > > > >> > >> from Java and Python are a wrapper away. >> > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> Presently, Daniel Oliveira (who implemented >> the SDF >> > > > > side >> > > > > > > > work, >> > > > > > > > > > and completed the Xlang work,) is adding a wrapper for >> the >> > > > > > > > > > > >> > >> Java Kafka IO using Cross Language >> Transforms, which >> > > > > is often >> > > > > > > > > > been requested. This will also enable use of the Beam >> SQL >> > > > > > > > > > > >> > >> transforms that java enables. 
>> > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> Features >> > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> The Go SDK implements the Beam C=core. The Go >> SDK >> > > > > implements >> > > > > > > > > > standard coders, allows for user DoFns, and CombineFns >> and access >> > > > > > > > > > > >> > >> to core transforms like Flatten, GroupByKey, >> and >> > > > > features >> > > > > > > > like >> > > > > > > > > > Side Inputs, Windowing, and User Metrics. >> > > > > > > > > > > >> > >> Basic windowing will be fully supported for >> batch even >> > > > > > > > through >> > > > > > > > > > lifted combines in the 2.32.0 release. >> > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> All of the above enables Beam Go to be >> versatile for >> > > > > batch >> > > > > > > > > > execution on portable runners, and for simple streaming >> > > > > pipelines. >> > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> Repo Testing >> > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> On precommit the Go SDK runs all it's unit >> tests. On >> > > > > top of >> > > > > > > > > > that, it runs all it's integration tests against the >> Python >> > > > > Portable >> > > > > > > > runner, >> > > > > > > > > > > >> > >> making it quick and robust to detect breaking >> changes >> > > > > without >> > > > > > > > > > overspending community resources. Those same tests are >> also >> > > > > > > > > > > >> > >> run against Dataflow, Flink, and Spark. >> > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> The tests are executable against all runners >> via the >> > > > > > > > appropriate >> > > > > > > > > > Go commands (if you've stood up your own job management >> server), >> > > > > > > > > > > >> > >> or Gradle commands (which will spin up runner >> > > > > instances for >> > > > > > > > > > you). Documentation for executing tests and adding new >> ones >> > > > > > > > > > > >> > >> is on the wiki. 
[2] They are accessible to Go >> > > > > developers as >> > > > > > > > > > they're implemented with the standard Go testing tools. >> > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> Shortcomings >> > > > > > > > > > > >> > >> That said, there's still much to do. Let me >> briefly >> > > > > tell you >> > > > > > > > > > what doesn't work, and it's up to you to weigh whether >> they block >> > > > > > > > > > > >> > >> being out of experimental. >> > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> At present, only a textio has been >> implemented as >> > > > > Splittable >> > > > > > > > > > DoFn. >> > > > > > > > > > > >> > >> Once the Kafka wrapper is merged in, it will >> serve as >> > > > > a the >> > > > > > > > > > first example for future contributions for >> > > > > > > > > > > >> > >> new transform wrappers for the Go SDK. >> > > > > > > > > > > >> > >> Transforms and IOs are lacking, but at this >> point >> > > > > users are >> > > > > > > > > > empowered to write their own DoFns or wrap existing >> transforms >> > > > > for >> > > > > > > > Cross >> > > > > > > > > > Language use. >> > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> In the core SDK, more streaming focused >> features have >> > > > > yet to >> > > > > > > > be >> > > > > > > > > > implemented, but they're largely additions to what >> exists already >> > > > > > > > > > > >> > >> rather than total rebuilds. Much of the work >> is >> > > > > definining >> > > > > > > > how a >> > > > > > > > > > user specifies their desires, and turning those into the >> > > > > appropriate >> > > > > > > > > > > >> > >> FnAPI requests at execution time. Back in >> October I >> > > > > wrote at >> > > > > > > > > > length on the wiki [1] what's missing for additional >> streaming >> > > > > > > > features. 
>> > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> While we have bolstered our testing recently, >> there's >> > > > > likely >> > > > > > > > > > still more we could test to improve our confidence in >> the SDK, >> > > > > > > > > > > >> > >> in particular regarding the included >> transforms >> > > > > libraries and >> > > > > > > > > > examples. >> > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> Moving Forward >> > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> My immediate plan is to work on incorporating >> the Go >> > > > > SDK >> > > > > > > > fully >> > > > > > > > > > into the Beam Programming Guide. I've audited the guide >> [3], and >> > > > > > > > > > > >> > >> am beginning to add missing content and >> filling in the >> > > > > Go >> > > > > > > > > > specific gaps. This will be tied to improving the Go >> Doc with >> > > > > more Go >> > > > > > > > > > > >> > >> specific user documentation that isn't >> appropriate for >> > > > > the >> > > > > > > > BPG. >> > > > > > > > > > > >> > >> And resolving the LICENSE issue around the >> public >> > > > > display of >> > > > > > > > > > that GoDoc. >> > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> If this proposal is accepted by a binding >> vote, I will >> > > > > > > > > > incorporate the SDK into the release process, and >> remove the >> > > > > > > > "experimental" >> > > > > > > > > > > >> > >> language around the SDK. This largely entails >> updating >> > > > > the >> > > > > > > > > > release scripts to also build and publish the Go SDK >> Docker >> > > > > containers. >> > > > > > > > > > > >> > >> As for releasing the code, we're technically >> already >> > > > > doing so >> > > > > > > > > > whenever we tag a release branch [4]. 
>> > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> The clearest signal to the Go community >> however will be >> > > > > > > > > > migrating the SDK to use Go Modules for dependency >> version >> > > > > control, >> > > > > > > > > > > >> > >> which Daniel is planning on working on after >> his Kafka >> > > > > task. >> > > > > > > > > > This will put our repo infrastructure, SDK >> contributors, and >> > > > > users >> > > > > > > > > > > >> > >> on the same footing when it comes to >> dependency >> > > > > management. >> > > > > > > > It >> > > > > > > > > > will remove the "+incompatible" tags one sees on the >> > > > > > > > > > > >> > >> pkg.go.dev list at [4]. >> > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> I'm very happy to answer any questions you >> might have >> > > > > about >> > > > > > > > the >> > > > > > > > > > SDK, and provide additional links as needed. I >> intentionally >> > > > > avoided >> > > > > > > > > > > >> > >> a link barrage in this email, as they can >> distract >> > > > > from the >> > > > > > > > > > point: The SDK is ready for folks to use it, we need to >> tell >> > > > > them that >> > > > > > > > they >> > > > > > > > > > can >> > > > > > > > > > > >> > >> rather than they shouldn't. 
>> > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> Robert Burke >> > > > > > > > > > > >> > >> Defacto Beam Go TL >> > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> [0] >> https://s.apache.org/beam-go-sdk-design-rfc >> > > > > > > > > > > >> > >> [1] >> > > > > > > > > > >> > > > > > > > >> > > > > >> https://cwiki.apache.org/confluence/display/BEAM/Supporting+Streaming+in+the+Go+SDK >> > > > > > > > > > > >> > >> [2] >> > > > > https://cwiki.apache.org/confluence/display/BEAM/Go+Tips >> > > > > > > > > > > >> > >> [3] >> > > > > > > > > > >> > > > > > > > >> > > > > >> https://docs.google.com/spreadsheets/d/1DrBFjxPBmMMmPfeFr6jr_JndxGOes8qDqKZ2Uxwvvds/edit?resourcekey=0-tVFwcLrQ2v2jpZkHk6QOpQ#gid=2072310090 >> > > > > > > > > > (SDK Audit sheet) >> > > > > > > > > > > >> > >> [4] >> > > > > > > > > > >> > > > > > > > >> > > > > >> https://pkg.go.dev/github.com/apache/beam/sdks/go/pkg/beam?tab=versions >> > > > > > > > > > > >> > >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> >
