+1 very well summarized Danny! Robert B Beam Go Busybody
PS. Apologies to the list for my emphatic but overly abrasive reply up thread. I wrote brashly, and misunderstood Kerry's earnest question. On Mon, Mar 28, 2022, 8:47 AM Ahmet Altay <al...@google.com> wrote: > Thank you Danny for sharing a summary for the rest of us on the mailing > list. Thank you all for the due diligence and the discussion. I appreciate > it :) > > On Mon, Mar 28, 2022 at 6:49 AM Danny McCormick <dannymccorm...@google.com> > wrote: > >> To close the loop here, Robert and I dug in offline, and Robert >> ultimately convinced me that we can relax the completeness requirement from >> my doc and not require coverage of all structural DoFn methods in order to >> replace the code generator. Ultimately, the most important things that >> changed my mind were: >> >> 1) Better understanding the current uses and challenges with the code >> generator (this includes both the current uses and probably more >> importantly understanding more of the challenges around things like naming >> collisions). Robert summarized these challenges nicely in his 3rd point >> earlier in the thread. Ultimately, if the generator is really hard to use >> and nobody (or few people) in the OSS community are benefiting from it as a >> result, then completeness matters less than making it usable. >> 2) Robert's suggestion of interface composition in the doc helps reduce >> the complexity for *most *cases, meaning that even if we can't hit all >> structural DoFn methods, we can get closer than I initially thought. >> 3) Our up-thread conversation about the frequency of ProcessElement calls >> vs other calls. While I'm still worried about watermark estimation and some >> other future sdf methods, we can get most of the benefits of a generics >> approach without worrying about those methods. >> >> From the beginning we've agreed that if we relax the completeness >> requirement then most of the technical challenges outlined in my doc go >> away and this work becomes feasible. 
I've updated the doc to lead with a >> disclaimer that we've agreed to relax this requirement, but I would mostly >> like to keep it as is other than that - it is still technically solid, it >> provides a useful outline of why we can't guarantee complete coverage of >> all structural DoFn methods, and it gives us a good starting point to reach >> full coverage if the Go team eventually decides to take on >> https://github.com/golang/go/issues/41176. >> >> I'm still planning on taking on watermark estimation before this work >> since it could have an impact on how we think about the final design. After >> that, I will probably come back to this and put together another small doc >> outlining the updated approach with the relaxed requirement (assuming >> nobody gets to it first 🙂). >> >> Thanks, >> Danny >> >> On Thu, Mar 17, 2022 at 11:38 AM Robert Burke <rob...@frantil.com> wrote: >> >>> Apologies to Danny, i haven't read your considered reply yet, as i need >>> to address Kerry's question on priorities immediately. >>> >>> >>> >>> I feel it's entirely critical to have before native streaming, because >>> it serves users more in the long term. >>> >>> The cost of waiting is that we can't establish the convention of >>> registering things with the SDK to get performance boosts and instead ask >>> people to fiddle with their code again a few months down the road. This >>> leads to proliferation of code that will be unable to set the precedent of >>> efficiency for organizations who aren't able to keep up with the latest and >>> greatest in Beam. >>> >>> The cost of waiting is having users burn through CPU unnecessarily >>> contributing to the ongoing boiling of the planet. >>> >>> The cost of waiting is making users spend more on said CPU and memory, >>> when they could have faster completing jobs more cheaply now, rather than >>> later. That does more for all Go SDK users now than streaming would. 
>>> >>> Setting conventions, and having a good experience with the SDK is >>> something we should be doing sooner rather than later, so we aren't >>> constantly changing things on our users with New Ways To Do Old Things, vs >>> unlocking abilities to do things in another SDK, that have Options in the >>> other SDKs already. >>> >>> >>> On Thu, Mar 17, 2022, 8:01 AM Kerry Donny-Clark <kerr...@google.com> >>> wrote: >>> >>>> My thoughts: >>>> What are the benefits of doing this now/the costs of waiting? >>>> Are generics something that gives us more value than streaming >>>> features? It makes sense to me to punt this work until the SDK is more >>>> developed, so that we can observe the consequences of streaming instead of >>>> imagining them. >>>> I'm the least expert Go developer on this thread, so please let me know >>>> if I'm missing something obvious. I really appreciate the clear back and >>>> forth :) >>>> >>>> On Thu, Mar 17, 2022 at 8:36 AM Danny McCormick < >>>> dannymccorm...@google.com> wrote: >>>> >>>>> Robert and I have been catching up offline, I did still want to >>>>> respond here for the broader audience, both to clarify some of the >>>>> questions Robert brought up and to share some of the progress we've made >>>>> towards getting on the same page. Structurally, I'll walk through the >>>>> questions/assertions brought up above to establish common ground and >>>>> where we have seen the world differently. Some of that has been covered in >>>>> our offline conversation, some has not. At the end I'll try to give a >>>>> brief >>>>> summary of what we've talked about offline. Robert, please feel free to >>>>> correct me if I've misrepresented any of that conversation! >>>>> >>>>> *Robert's tl;dr* >>>>> >>>>> > I'd also like to emphasize that we gain 99.99999...% of the code >>>>> generator benefit by covering the ProcessElement calls. >>>>> > No other method is called similarly often by the system (on every >>>>> element). 
Perfect is the enemy of Good Enough. >>>>> >>>>> I think this is actually fundamentally where we diverged (more on that >>>>> below) - I've been treating coverage of all structural DoFn methods as a >>>>> requirement for a code generator replacement, Robert has not. More on that >>>>> below. >>>>> >>>>> *1. Problem: Users have to determine which Registration function to >>>>> use.* >>>>> >>>>> *1.a)* I think that perhaps there has been a breakdown of terminology >>>>> here. When I say on the order of 1000s of DoFn variants, I'm not just >>>>> talking about DoFn1x0 to DoFn7x4. If that is all we need to worry about I >>>>> agree that's not a problem. I'm referring to the section of my doc where I >>>>> discuss the need for a proliferation of interfaces to support each >>>>> structural DoFn method such that a user can pass in a DoFn of the exact >>>>> right interface. See >>>>> https://docs.google.com/document/d/1imYbBeu2FNJkwPNm6E9GEJkjpHnHscvFoKAE6AISvFA/edit#heading=h.e24tsm6ev6f0 >>>>> where I discuss the problem - specifically, it culminates in "To use >>>>> this function, the user would be required to know which of the many >>>>> DoFn variants they should using...". >>>>> >>>>> It's fine if you disagree with that (and I think you have brought up >>>>> some valid points about ways to reduce the complexity by composing >>>>> interfaces), but I'm heavily relying on shared understanding of that doc >>>>> for contextualizing this conversation. >>>>> >>>>> *1.b) *Sorry that was confusing, you are correct that I've been >>>>> abusing the term cast when really what I've meant in most cases is a user >>>>> forcing type inference (e.g. `foo.(Bar)`) >>>>> >>>>> *1.c) *Agreed if there are ~30 types of DoFn reasonably named, I >>>>> think this becomes exceedingly inconvenient if there is indeed an >>>>> explosion >>>>> of types into the hundreds or thousands. I don't think we disagree here, >>>>> this point mostly reduces to (1.a) - please correct me if I'm wrong. 
>>>>> >>>>> *1.d) *That's true, we could also have it print out arbitrary >>>>> registration code though - I don't think that's a particularly compelling >>>>> user experience either way. >>>>> >>>>> *2. Feasibility.* >>>>> >>>>> *2.a) *To be honest, I just flat out disagree here - like I said >>>>> above, your example doesn't touch the actual user facing function which is >>>>> the hard part of the problem IMO. I think it starts to show a path towards >>>>> simplifying the code generator (which I don't think we have any >>>>> disagreement on the feasibility of), but I'd appreciate some justification >>>>> for your statement that it demonstrates feasibility in most cases. >>>>> >>>>> As an aside, I would say that if you think you had a basically working >>>>> example (or could come up with one very quickly), I would've appreciated >>>>> you sharing that as a resource associated with your 1 pager (not >>>>> necessarily presented as *the solution*, but as a possible approach). >>>>> >>>>> *2.b) and 2.c) *I won't touch this too much here because I think this >>>>> has been explored a lot more in our offline conversation (summarized >>>>> below). At a high level, I agree, it's just a matter of which tradeoffs we >>>>> want to make. As mentioned above, I've been treating coverage of all >>>>> structural DoFn methods as a requirement for a code generator replacement >>>>> (largely in service of "It should be performance neutral" from the >>>>> original one-pager, but also because I generally think that's the right >>>>> approach). >>>>> >>>>> *3. Justifications for the work* >>>>> >>>>> I don't have much to add here, other than asking that this kind of >>>>> context be included in future one-pagers. Context like this is both really >>>>> valuable to me as a newer contributor and really hard to find on my own if >>>>> it isn't written down. Thanks for sharing it now! 
>>>>> >>>>> With this context, I'm generally on board with at least using generics >>>>> to clean up some of these issues with the generator. I don't foresee the >>>>> same set of problems I've described in my doc keeping us from simplifying >>>>> the generator. >>>>> >>>>> >>>>> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ >>>>> >>>>> *Summary of offline conversation* >>>>> >>>>> As alluded to above, I believe the fundamental difference in how >>>>> Robert and I have been seeing this problem is one of requirements: I've >>>>> been treating coverage of all structural DoFn methods as a requirement for >>>>> a code generator replacement, Robert has not. If you remove that >>>>> requirement and just focus on the easier ones to optimize (including >>>>> ProcessElement, all the other non-sdf functions, and a subset of the sdf >>>>> functions), the main problems presented in my doc go away. >>>>> >>>>> A huge piece of Robert's reasoning is looking at the frequency with >>>>> which these functions will be invoked. ProcessElement gets invoked way >>>>> more >>>>> often than the rest of the structural DoFn methods, so we should focus on >>>>> optimizing for that and not worry as much about the rest. >>>>> >>>>> I currently think that it is a good idea to continue to leave the door >>>>> open for optimizing all the structural DoFn methods. The one I'm most >>>>> worried about is watermark estimation, which doesn't actually exist yet in >>>>> the Go Sdk - I think its likely that it will require invocation of at >>>>> least >>>>> 1 structural DoFn function at the element or bundle level, and I don't >>>>> want >>>>> to box ourselves out of being able to support performant watermark >>>>> estimation (and thus fully performant full streaming support). 
Generally, >>>>> the questions I'm interested in are: >>>>> >>>>> 1) Can we answer definitively at what level we will need structural >>>>> DoFn invocation for watermark estimation? 2) If it's at the bundle/element >>>>> level, does that preclude us from a generics approach that doesn't fully >>>>> optimize for that? 3) If it's unclear, should we prioritize watermark >>>>> estimation design (and maybe implementation) ahead of generics/code >>>>> generator? >>>>> >>>>> While Robert has been framing it as: >>>>> >>>>> 4) Does a feature, like Watermark estimation, require us to do >>>>> something different with the generic code generator such that we need to >>>>> put off working on it now? >>>>> >>>>> I'm also still not 100% convinced that optimizing only at the >>>>> bundle/element level is the right approach, though I'm probably moving in >>>>> that direction. >>>>> >>>>> I would love input from others here if anyone has thoughts or context >>>>> that is missing from this conversation! Otherwise, I'm sure Robert and I >>>>> will continue our offline conversations and drive towards a solution. >>>>> >>>>> Thanks, >>>>> Danny >>>>> >>>>> On Wed, Mar 16, 2022 at 2:18 PM Robert Burke <rob...@frantil.com> >>>>> wrote: >>>>> >>>>>> Great questions! >>>>>> >>>>>> I think, ultimately, I was too coy in my one pager, and that's >>>>>> leading to the disconnect we're having. I go into much more detail here. >>>>>> >>>>>> *tl;dr;* >>>>>> >>>>>> I didn't emphasize the issues with the current generator in my doc. >>>>>> Naming is a *huge* problem, and a generic implementation of the >>>>>> generated code avoids most of it, and a total replacement avoids all of >>>>>> it. >>>>>> >>>>>> I'd also like to emphasize that we gain 99.99999...% of the code >>>>>> generator benefit by covering the ProcessElement calls. >>>>>> No other method is called similarly often by the system (on every >>>>>> element). Perfect is the enemy of Good Enough. 
>>>>>> >>>>>> It's certainly good enough if more users are able to benefit from it, >>>>>> rather than those with infrastructure, or the rigor to constantly update >>>>>> their generated code. Users need to register their DoFns to make things >>>>>> work anyway, they might as well get a performance boost out of it. >>>>>> >>>>>> *Longer version* >>>>>> >>>>>> I find endlessly inlined points and counterpoints to be very hard to >>>>>> read, so I'm enumerating them so they can be referred to. >>>>>> >>>>>> *0.* Lets avoid discussing using Generics for KVs, Iterators, >>>>>> Emitters. etc. That convenience is orthogonal to the code generator >>>>>> concerns. They are related only by generics, and the code generator is a >>>>>> big enough topic by itself. We can discuss it elsewhere. We gain nothing >>>>>> by conflating the two and scope creeping to include them at this time. >>>>>> * 0.a )* Unless there's a compelling argument to address them now, >>>>>> that I'm missing, of course. >>>>>> >>>>>> *1. Problem: Users have to determine which Registration function to >>>>>> use.* >>>>>> >>>>>> *1.a)* I think "thousands" is overstating the issue. For the >>>>>> "supported" variants for which we do call time optimization, it's from >>>>>> 1x0 >>>>>> to 7x4. So about 28-30 Registrations for structural DoFns, and an >>>>>> equivalent set for functional DoFns. A big number, but certainly not >>>>>> thousands. >>>>>> >>>>>> Recall the generics largely are unaware of which specific types they >>>>>> actually need to deal with. It's not clear to me where the explosion is >>>>>> coming from, considering the other life cycle methods can be in related, >>>>>> but independent interfaces the user doesn't need to be aware of. >>>>>> >>>>>> The code generator (and most of the rest of the invoker framework) is >>>>>> unaware of whether a parameter is a context, or a window, or a value, or >>>>>> an >>>>>> iterator, or an emitter. 
It just type asserts, and >>>>>> >>>>>> *1.b)* You keep mentioning "casting". Per my example, there's no >>>>>> "casting", just selecting the right registration function, and specifying >>>>>> types. Please clarify if that's what you're referring to, as Go doesn't >>>>>> have "casting" as a concept. There are type conversions[0] for most >>>>>> values >>>>>> of identical memory layout, and type assertions[1] for interfaces. >>>>>> >>>>>> *1.c) *I think users are capable of picking the right registration >>>>>> method and filling in the types, if they are provided with clear examples >>>>>> in >>>>>> the SDK, and documentation. Users will go through a fair amount in order >>>>>> to >>>>>> squeeze out some performance. >>>>>> >>>>>> *1.d) *If all else fails, with a generic method, we can have the SDK >>>>>> literally print out the registration code for the users. >>>>>> >>>>>> *2. Feasibility.* >>>>>> >>>>>> *2.a)* I feel my example demonstrates the feasibility for *most >>>>>> cases*. >>>>>> >>>>>> *2.b)* There are certainly areas that are currently lacking. This is >>>>>> also true of the code generator as is. It doesn't support map side inputs >>>>>> in any form right now. Either we move to generics and stop asking the >>>>>> question, or we need to fix it for every complex function type we accept >>>>>> as >>>>>> a parameter. >>>>>> >>>>>> This is a naming problem that is avoided entirely with a generic >>>>>> implementation. >>>>>> >>>>>> *2.c)* Even if we can't or don't initially support SplittableDoFns >>>>>> (which are not currently supported properly anyway), or CombineFns (which >>>>>> are), there will always be a "just one more" problem. That's OK. Most >>>>>> DoFns >>>>>> are much simpler, and can and should get the performance benefits. >>>>>> Perfect >>>>>> is the enemy of Good Enough. >>>>>> >>>>>> *3. 
Justifications for the work* >>>>>> >>>>>> *3.a) *I'm only aware of one user for the code generator at present, >>>>>> and that's presently for Google internal use. >>>>>> >>>>>> It relies on Bazel rules that users need to specify instead of the >>>>>> usual "go_library" or "go_binary" rules by default. Already this is an >>>>>> encumbrance. Static analysis is never straightforward, so users must >>>>>> still >>>>>> manually list their dofns in those BUILD files, outside of their code. >>>>>> The >>>>>> main benefit of this approach is users can be ignorant of the .shims.go >>>>>> file we're adding to their package. >>>>>> >>>>>> Through this usage, issues with the code generator have been >>>>>> discovered: >>>>>> >>>>>> * Separate from the code, and not type checked -> Their IDEs can't >>>>>> assist with typos and similar, but with an in-code generic solution this >>>>>> is >>>>>> resolved. A work around exists by having the static analysis attempt to >>>>>> figure things out, but writing clear error messages in this case is >>>>>> non-obvious, and generally separated from the code, which would need to >>>>>> be >>>>>> corrected in the separate file. >>>>>> * It's not possible to use the code generator for both package code, >>>>>> and test code, as there are naming collisions around iterators and >>>>>> emitters, as they generate in separate shim files in the same package. >>>>>> * The code generator needs to determine and provide imports for the >>>>>> shims files for the various user types. This is not entirely trivial, and >>>>>> also runs into name collisions for the short package names. >>>>>> * See newer iterator/side input types issue above. >>>>>> >>>>>> These could all be fixed in the code generator, but in the end, we'd >>>>>> largely be replicating what the compiler could be doing for us instead, >>>>>> if >>>>>> the users were calling functions in their own code files. 
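As a rough illustration of what "calling functions in their own code files" could look like, here is a minimal Go sketch (Go 1.18+ generics). It is not the Beam SDK's API and not the playground example linked in this thread: `RegisterDoFn2x1`, `Invoke`, `registry`, `processElement2x1`, and `wordLenFn` are all hypothetical names made up for this example. It assumes a structural DoFn whose ProcessElement takes two inputs and returns one output, and it passes the type arguments explicitly, as the thread describes users doing. The registration call sits in the user's own code, so the compiler type checks it, and the stored invoker uses type assertions rather than reflection.

```go
package main

import "fmt"

// processElement2x1 is a hypothetical interface describing a structural DoFn
// whose ProcessElement takes two inputs and returns one output.
type processElement2x1[I0, I1, R0 any] interface {
	ProcessElement(I0, I1) R0
}

// caller is a hypothetical type-erased invoker, standing in for whatever
// the SDK's invoker framework would look up at execution time.
type caller func(fn any, args ...any) []any

// registry maps a DoFn's dynamic type name to its registered invoker.
var registry = map[string]caller{}

// RegisterDoFn2x1 (hypothetical) stores an invoker for doFn's concrete type.
// The compiler checks that doFn really has a matching ProcessElement method.
func RegisterDoFn2x1[I0, I1, R0 any](doFn processElement2x1[I0, I1, R0]) {
	key := fmt.Sprintf("%T", doFn)
	registry[key] = func(fn any, args ...any) []any {
		// Type assertions only; no reflection on the per-element path.
		typed := fn.(processElement2x1[I0, I1, R0])
		return []any{typed.ProcessElement(args[0].(I0), args[1].(I1))}
	}
}

// Invoke (hypothetical) calls the registered invoker for fn, if any.
// A real framework would fall back to reflection when none is registered.
func Invoke(fn any, args ...any) ([]any, bool) {
	c, ok := registry[fmt.Sprintf("%T", fn)]
	if !ok {
		return nil, false
	}
	return c(fn, args...), true
}

// wordLenFn is an example user DoFn.
type wordLenFn struct{}

func (*wordLenFn) ProcessElement(word string, scale int) int {
	return len(word) * scale
}

func main() {
	// The user names the types explicitly at registration time.
	RegisterDoFn2x1[string, int, int](&wordLenFn{})

	out, ok := Invoke(&wordLenFn{}, "beam", 2)
	fmt.Println(out, ok) // [8] true
}
```

A typo in the type arguments (say, `[string, string, int]`) fails at compile time in the user's own file, which is the IDE-assist and type-checking benefit described above. Lifecycle methods other than ProcessElement are deliberately left out of the sketch.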
>>>>>> >>>>>> These problems are not exclusive to Googles internal approach to >>>>>> using the generator either. >>>>>> >>>>>> *3.b) *The Code generator by itself isn't great to use for open >>>>>> source. >>>>>> >>>>>> * Requires an explicit `go generate` call by the user, which >>>>>> generates a file they should largely not touch, until it causes a problem >>>>>> with compilation. >>>>>> * It's presently loosely versioned, so we can't enforce that users >>>>>> re-generate when they update their code, even if it benefits them. >>>>>> * The errors and code it generates are inscrutable. >>>>>> >>>>>> *3.c)* Benefits of moving to generic generation from static >>>>>> generation >>>>>> >>>>>> Assuming we determine users won't want to write the registration >>>>>> calls themselves, then we still need to fix the above issues. I posit >>>>>> that >>>>>> generating a file that contains generic function calls is much better >>>>>> than >>>>>> what we currently have, which uses arbitrary unicode symbols to avoid >>>>>> collisions, and similar. >>>>>> >>>>>> Should we move to generating calls to generic methods, we avoid all >>>>>> naming problems, except for the import short name issue. >>>>>> >>>>>> This alone is worth it in the avoided edge cases. >>>>>> >>>>>> Robert Burke >>>>>> Beam Go Busybody >>>>>> >>>>>> [0] https://go.dev/doc/effective_go#conversions >>>>>> [1] https://go.dev/doc/effective_go#interface_conversions >>>>>> >>>>>> >>>>>> On Wed, 16 Mar 2022 at 07:00, Danny McCormick < >>>>>> dannymccorm...@google.com> wrote: >>>>>> >>>>>>> Hey Robert, thanks for the thoughts - I replied to your thoughts in >>>>>>> the doc, but will add some thoughts here as well: >>>>>>> >>>>>>> > I agree that it's going to be considered inconvenient for users >>>>>>> to manually specify types at this stage. 
>>>>>>> > I think it's a substantial improvement over the status quo, simply >>>>>>> because it removes a step in the iteration cycle when a change needs to >>>>>>> be >>>>>>> made. >>>>>>> > It has the compiler do the work for us mostly. >>>>>>> >>>>>>> I think this is understating the problem significantly. The issue as >>>>>>> described in the doc isn't just asking users to specify types in their >>>>>>> registration call, it's that they would need to find the correct >>>>>>> interface >>>>>>> to cast their function to from many generated DoFn variants (on the >>>>>>> order >>>>>>> of thousands) - then they would need to specify types, which is the easy >>>>>>> part. If we're able to reduce the problem to users specifying types >>>>>>> then I >>>>>>> wholeheartedly agree we should move forward with that approach. >>>>>>> >>>>>>> > I don't think the doc went far enough. See a working sketch of my >>>>>>> thoughts here: https://go.dev/play/p/Bwf4eSavNxt?v=gotip >>>>>>> >>>>>>> and >>>>>>> >>>>>>> > The doc covers it's explored options very well, but it leaves out >>>>>>> the possibility of getting halfway there: >>>>>>> https://go.dev/play/p/Bwf4eSavNxt?v=gotip >>>>>>> >>>>>>> It's not 100% clear to me what you're getting at here, but I do >>>>>>> think that this falls well short of the target of this investigation and >>>>>>> fails to address what is actually the sticking point explained in the >>>>>>> doc >>>>>>> (the user experience of trying to find the right function from DoFn >>>>>>> variants) - this doesn't touch the actual user facing registration >>>>>>> function >>>>>>> which is the hard part IMO. If your point is that we might be able to >>>>>>> leverage generics to simplify/improve our code generator, I mention that >>>>>>> below - if not, could you help me understand what you're suggesting? 
>>>>>>> >>>>>>> > And finally, even if inference is ultimately the sticking point: >>>>>>> We should still rewrite the code generator to generate generic code for >>>>>>> us, >>>>>>> since it moves the harder parts (de-duplication, name collisions in >>>>>>> types >>>>>>> if used in both package code, and test code) into the Go compiler, >>>>>>> rather >>>>>>> than in our code. >>>>>>> >>>>>>> I don't necessarily disagree - just simplifying/improving the code >>>>>>> generator with generics (along with generic KVs and other user facing >>>>>>> improvements) was out of the scope of the doc, which was targeted >>>>>>> specifically at whether we can fully replace the code generator. I >>>>>>> intentionally targeted that because it was the clearest value add for >>>>>>> users >>>>>>> and definitely seemed to justify the investment - it is not as >>>>>>> immediately >>>>>>> clear to me how important replacing pieces of the code generator with >>>>>>> generics is and whether we should prioritize that work immediately, >>>>>>> save it >>>>>>> for later, or wait and see if the type inference blocker is something >>>>>>> the >>>>>>> Go team decides to pick up. >>>>>>> >>>>>>> I think that's a discussion worth having, but it only matters if >>>>>>> replacing the code generator is not actually feasible, so until we have >>>>>>> consensus on that I'd prefer to hold off on going too far down that >>>>>>> path. >>>>>>> >>>>>>> Thanks, >>>>>>> Danny >>>>>>> >>>>>>> On Tue, Mar 15, 2022 at 8:31 PM Robert Burke <rob...@frantil.com> >>>>>>> wrote: >>>>>>> >>>>>>>> The problem with being on vacation for a week is so many cool >>>>>>>> things come up while you're gone that you can't get ahead of... >>>>>>>> >>>>>>>> Thank you Danny for the initial write up! I hope the exercise >>>>>>>> helped you understand more about the SDK and Go, but I think there's >>>>>>>> more >>>>>>>> to do here. >>>>>>>> >>>>>>>> *tl;dr; * >>>>>>>> 1. I don't think the doc went far enough. 
See a working sketch of >>>>>>>> my thoughts here: https://go.dev/play/p/Bwf4eSavNxt?v=gotip >>>>>>>> 2. Agreed that supporting generic KVs should still be possible, but >>>>>>>> it's an orthogonal investigation to the code generator (and its >>>>>>>> generated >>>>>>>> code) question. >>>>>>>> I won't discuss that further, as I have a different one-pager in >>>>>>>> the works on that topic, and that's a very very different thread. >>>>>>>> >>>>>>>> *On the Code Generator* >>>>>>>> Overall, the goal is to totally do-away with the current approach >>>>>>>> of pre-generating. But I don't think a lack of inference is an >>>>>>>> obstacle to >>>>>>>> that goal. >>>>>>>> >>>>>>>> I agree that it's going to be considered inconvenient for users >>>>>>>> to manually specify types at this stage. >>>>>>>> I think it's a substantial improvement over the status quo, simply >>>>>>>> because it removes a step in the iteration cycle when a change needs >>>>>>>> to be >>>>>>>> made. >>>>>>>> It has the compiler do the work for us mostly. >>>>>>>> >>>>>>>> The doc covers it's explored options very well, but it leaves out >>>>>>>> the possibility of getting halfway there: >>>>>>>> https://go.dev/play/p/Bwf4eSavNxt?v=gotip >>>>>>>> >>>>>>>> To cover the varieties of expected arities, we should generate the >>>>>>>> SDK side registration code in order to avoid missing cases, and ensure >>>>>>>> even >>>>>>>> coverage. >>>>>>>> >>>>>>>> And finally, even if inference is ultimately the sticking point: We >>>>>>>> should still rewrite the code generator to generate generic code for >>>>>>>> us, >>>>>>>> since it moves the harder parts (de-duplication, name collisions in >>>>>>>> types >>>>>>>> if used in both package code, and test code) into the Go compiler, >>>>>>>> rather >>>>>>>> than in our code. >>>>>>>> >>>>>>>> And who knows? Perhaps in Go 1.19 or 1.20, the inferences Danny >>>>>>>> proposed become possible, and we can clean things up further. 
>>>>>>>> >>>>>>>> Robert Burke, >>>>>>>> Beam Go Busybody >>>>>>>> >>>>>>>> On Fri, 11 Mar 2022 at 06:03, Danny McCormick < >>>>>>>> dannymccorm...@google.com> wrote: >>>>>>>> >>>>>>>>> > Can generics be used to at least get rid of the oddities of KV >>>>>>>>> handling? >>>>>>>>> >>>>>>>>> Good question - my honest answer is that I don't know: this >>>>>>>>> doc/investigation was specifically focused on replacing the code >>>>>>>>> generator >>>>>>>>> and generics not being useful in this context does not mean they >>>>>>>>> can't be >>>>>>>>> useful elsewhere. >>>>>>>>> >>>>>>>>> My speculative answer (without having a full familiarity with how >>>>>>>>> we currently handle KVs or the associated problems) is that I don't >>>>>>>>> *think >>>>>>>>> *the core blocker keeping us from replacing the code generator >>>>>>>>> applies here. The big issue we're facing here is that we can't infer >>>>>>>>> generic interface types based on the parameters of method functions >>>>>>>>> of a >>>>>>>>> struct. In the case of KV, we would be inferring the interface types >>>>>>>>> based >>>>>>>>> on the types of member variables of the struct, which Go should be >>>>>>>>> able to >>>>>>>>> do. >>>>>>>>> >>>>>>>>> All that to say, in no way should this investigation preclude us >>>>>>>>> from looking for other ways to apply generics to make our lives easier >>>>>>>>> (whether that be in KV handling or improving other experiences). >>>>>>>>> >>>>>>>>> On Thu, Mar 10, 2022 at 6:55 PM Robert Bradshaw < >>>>>>>>> rober...@google.com> wrote: >>>>>>>>> >>>>>>>>>> Thanks for the writeup. That's too bad. Can generics be used to >>>>>>>>>> at least get rid of the oddities of KV handling? >>>>>>>>>> >>>>>>>>>> On Thu, Mar 10, 2022 at 3:40 PM Luke Cwik <lc...@google.com> >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> Well written doc and unfortunate that Generics seem to not work >>>>>>>>>>> out for us. 
>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Thu, Mar 10, 2022 at 11:03 AM Jack McCluskey < >>>>>>>>>>> jrmcclus...@google.com> wrote: >>>>>>>>>>> >>>>>>>>>>>> Well, looks like Go generics weren't quite the magic bullet we >>>>>>>>>>>> were hoping for in their current state. We'll get that code >>>>>>>>>>>> generator >>>>>>>>>>>> ripped out one day, but not today. >>>>>>>>>>>> >>>>>>>>>>>> On Thu, Mar 10, 2022 at 12:44 PM Danny McCormick < >>>>>>>>>>>> dannymccorm...@google.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Hey folks, >>>>>>>>>>>>> >>>>>>>>>>>>> Right now, the Go Sdk relies on users to generate code as a >>>>>>>>>>>>> pre-compile step to avoid some overhead of excessive runtime >>>>>>>>>>>>> reflection. >>>>>>>>>>>>> This is suboptimal because it relies on users to regenerate code >>>>>>>>>>>>> every time >>>>>>>>>>>>> they create a DoFn, causing friction to users and slowness when >>>>>>>>>>>>> they don't >>>>>>>>>>>>> remember to regenerate (among other reasons). @lostluck wrote a >>>>>>>>>>>>> one pager >>>>>>>>>>>>> around replacing this code generator with generics - after >>>>>>>>>>>>> spending pieces >>>>>>>>>>>>> of the last couple days digging in my takeaway is that it's not >>>>>>>>>>>>> doable in a >>>>>>>>>>>>> way that benefits the Sdk at this time and I wanted to share my >>>>>>>>>>>>> findings. I >>>>>>>>>>>>> wrote them up here - >>>>>>>>>>>>> https://docs.google.com/document/d/1imYbBeu2FNJkwPNm6E9GEJkjpHnHscvFoKAE6AISvFA/edit?usp=sharing >>>>>>>>>>>>> - >>>>>>>>>>>>> please take a look if you're interested, and definitely let me >>>>>>>>>>>>> know if I'm >>>>>>>>>>>>> missing anything! >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> Danny >>>>>>>>>>>>> >>>>>>>>>>>>