On Thu, Oct 29, 2020 at 9:55 AM Ismaël Mejía <ieme...@gmail.com> wrote:
>
> > Could you clarify what you mean by this? We certainly wouldn't want the
> > stringification of all elements, only some of them, often post-hoc.
>
> What I mean by round trip is that I imagined we care mostly about data
> processed by the SDK Harness which is only bytes for the runner, so if we
> need to know the String representation of that data we should do an extra
> call after the data is processed by the Harness.

Yep, this is how it would work. If the runner has some bytes in hand
that it wants to display, it would create a new bundle using this
instruction, and send it the set of elements that it wishes to
stringify. This is similar to the "invoke combiner" or "merge windows"
operations that can be used as one-offs.

> Of course having a function in the SDK harness that receives coder + data
> and gives back its string representation makes total sense and it is more
> generic (I am assuming that the string representation comes from the
> object: toString(), __str__(), etc.)
>
> I was just more curious about the intent so thanks for the clarification
> because it makes more sense now, my initial understanding was that it was
> more to 'debug' SDK Harness processed elements (that's why I mentioned
> Instructions) but it is clearly beyond that.

Yeah, I think this will be useful.

> On Thu, Oct 29, 2020 at 5:38 PM Robert Bradshaw <rober...@google.com> wrote:
> >
> > On Thu, Oct 29, 2020 at 3:18 AM Ismaël Mejía <ieme...@gmail.com> wrote:
> > >
> > > Thanks for sharing,
> > >
> > > I was initially confused by the title/terminology, I thought it was
> > > about an end-user transform but this is a 'protocol' for a runner to
> > > get the string representation of an element encoded by an SDK Harness
> > > (potentially in a different language) if I understood correctly.
> > >
> > > Are there use cases where a runner cares about the String
> > > representation of data encoded by the SDK harness apart from the
> > > debugging case?
> >
> > Yeah, I think this is the intent. E.g. a runner could use this to
> > show, in its UI or logs, particularly expensive elements, or hot keys,
> > or excessive uses of state, or even just a sampling of "typical"
> > elements for a given PCollection.
> >
> > > I ask this because I was imagining that if we care
> > > 'only' about debugging data processed by the harness, we could just
> > > have a new debug-like Instruction that produces the tuple of <encoded
> > > data, string representation> and avoid a round-trip.
> >
> > Could you clarify what you mean by this? We certainly wouldn't want
> > the stringification of all elements, only some of them, often
> > post-hoc.
> >
> > > But well take this with a grain of salt, I am far from an expert on
> > > portability, just curious about finding the simplest approach.
> > >
> > > On Thu, Oct 29, 2020 at 12:02 AM Sam Rohde <sro...@google.com> wrote:
> > > >
> > > > done!
> > > >
> > > > On Wed, Oct 28, 2020 at 3:54 PM Tyson Hamilton <tyso...@google.com>
> > > > wrote:
> > > >>
> > > >> Can you open up comment access please?
> > > >>
> > > >> On Wed, Oct 28, 2020 at 3:40 PM Sam Rohde <sro...@google.com> wrote:
> > > >>>
> > > >>> +Lukasz Cwik
> > > >>>
> > > >>> On Tue, Oct 27, 2020 at 12:04 PM Sam Rohde <sro...@google.com> wrote:
> > > >>>>
> > > >>>> Hi All,
> > > >>>>
> > > >>>> I'm working on a project in Dataflow that requires the runner to
> > > >>>> translate an element to a human-readable form. To do this, I want to
> > > >>>> add a new well-known transform that allows any runner to ask the SDK
> > > >>>> to stringify (human-readable) an element. Let me know what you
> > > >>>> think, you can find the proposed specification and implementation
> > > >>>> details here.
> > > >>>>
> > > >>>> If there are no objections, I want to start implementation as soon
> > > >>>> as I can.
> > > >>>>
> > > >>>> Regards,
> > > >>>> Sam
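
The "coder + data in, string out" function discussed in this thread can be sketched roughly as below. This is only a minimal illustration under stated assumptions: `Coder`, `Utf8Coder`, `Int64Coder` and `stringify_elements` are hypothetical names invented here, not the Beam coder API or the proposed well-known transform, and the real proposal would run inside an SDK harness bundle rather than as a plain function call.

```python
import struct
from typing import Iterable, List, Protocol


class Coder(Protocol):
    """Hypothetical stand-in for an SDK coder (not the Beam Coder API)."""

    def decode(self, encoded: bytes) -> object: ...


class Utf8Coder:
    """Hypothetical coder: elements are UTF-8 encoded strings."""

    def decode(self, encoded: bytes) -> object:
        return encoded.decode("utf-8")


class Int64Coder:
    """Hypothetical coder: elements are big-endian signed 64-bit integers."""

    def decode(self, encoded: bytes) -> object:
        return struct.unpack(">q", encoded)[0]


def stringify_elements(coder: Coder, encoded_elements: Iterable[bytes]) -> List[str]:
    """Decode each element with the given coder and return its str() form.

    The runner only ever holds the bytes; the SDK side owns the coder and
    the language-specific string conversion (str() here, toString() in Java).
    """
    return [str(coder.decode(element)) for element in encoded_elements]


if __name__ == "__main__":
    # The runner picks a few elements post-hoc (e.g. hot keys) and asks
    # for their human-readable form in a one-off call.
    hot_keys = [b"user-17", b"user-42"]
    print(stringify_elements(Utf8Coder(), hot_keys))

    sizes = [struct.pack(">q", 42), struct.pack(">q", -7)]
    print(stringify_elements(Int64Coder(), sizes))
```

The point of the shape is that the runner never needs to understand the encoding: it sends only the handful of elements it wants to display, which matches the "only some of them, often post-hoc" requirement above.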