> Could you clarify what you mean by this? We certainly wouldn't want the
> stringification of all elements, only some of them, often post-hoc.

By round trip I mean the following: I imagined we care mostly about data
processed by the SDK Harness, which is only bytes for the runner, so if we need
the String representation of that data we have to make an extra call after the
Harness has processed it.

Of course, having a function in the SDK harness that receives coder + data and
gives back its string representation makes total sense, and it is more generic
(I am assuming that the string representation comes from the object:
toString(), __str__(), etc.).
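
To make sure we are talking about the same thing, here is a minimal sketch of
how I picture such a function. It assumes a hypothetical registry of decoders
keyed by coder id; the names DECODERS and to_string are made up for
illustration and are not part of any Beam API.

```python
from typing import Any, Callable, Dict

# Hypothetical registry mapping a coder id to its decode function.
# A real SDK harness would resolve coders from the pipeline definition instead.
DECODERS: Dict[str, Callable[[bytes], Any]] = {
    "utf8": lambda b: b.decode("utf-8"),
    "int64_be": lambda b: int.from_bytes(b, byteorder="big", signed=True),
}


def to_string(coder_id: str, encoded: bytes) -> str:
    """Decode `encoded` with the named coder and return the element's string form."""
    element = DECODERS[coder_id](encoded)
    # str() delegates to the element's own __str__(), the Python analogue of toString().
    return str(element)
```

The runner would only ever hold the coder id and the bytes; the decoding and
the choice of string form stay on the SDK side.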

I was mostly curious about the intent, so thanks for the clarification; it
makes more sense now. My initial understanding was that it was more to 'debug'
SDK Harness processed elements (that's why I mentioned Instructions), but it is
clearly beyond that.

On Thu, Oct 29, 2020 at 5:38 PM Robert Bradshaw <rober...@google.com> wrote:
>
> On Thu, Oct 29, 2020 at 3:18 AM Ismaël Mejía <ieme...@gmail.com> wrote:
> >
> > Thanks for sharing,
> >
> > I was initially confused by the title/terminology, I thought it was
> > about an end-user transform but this is a 'protocol' for a runner to
> > get the string representation of an element encoded by a SDK Harness
> > (potentially in a different language) if I understood correctly.
> >
> > Are there use cases where a runner cares about the String
> > representation of data encoded by the SDK harness apart from the
> > debugging case?
>
> Yeah, I think this is the intent. E.g. a runner could use this to
> show, in its UI or logs, particularly expensive elements, or hot keys,
> or excessive uses of state, or even just a sampling of "typical"
> elements for a given PCollection.
>
> > I ask this because I was imagining that if we care
> > 'only' about debugging data processed by the harness, we could just
> > have a new debug-like Instruction that produces the tuple of <encoded
> > data,  string representation> and avoid a round-trip.
>
> Could you clarify what you mean by this? We certainly wouldn't want
> the stringification of all elements, only some of them, often
> post-hoc.
>
> > But well take this with a grain of salt, I am far from an expert on
> > portability, just curious about finding the simplest approach.
> >
> > On Thu, Oct 29, 2020 at 12:02 AM Sam Rohde <sro...@google.com> wrote:
> > >
> > > done!
> > >
> > > On Wed, Oct 28, 2020 at 3:54 PM Tyson Hamilton <tyso...@google.com> wrote:
> > >>
> > >> Can you open up comment access please?
> > >>
> > >> On Wed, Oct 28, 2020 at 3:40 PM Sam Rohde <sro...@google.com> wrote:
> > >>>
> > >>> +Lukasz Cwik
> > >>>
> > >>> On Tue, Oct 27, 2020 at 12:04 PM Sam Rohde <sro...@google.com> wrote:
> > >>>>
> > >>>> Hi All,
> > >>>>
> > >>>> I'm working on a project in Dataflow that requires the runner to 
> > >>>> translate an element to a human-readable form. To do this, I want to 
> > >>>> add a new well-known transform that allows any runner to ask the SDK 
> > >>>> to stringify (human-readable) an element. Let me know what you think, 
> > >>>> you can find the proposed specification and implementation details 
> > >>>> here.
> > >>>>
> > >>>> If there are no objections, I want to start implementation as soon as 
> > >>>> I can.
> > >>>>
> > >>>> Regards,
> > >>>> Sam
