Re: Proposal: ToStringFn

2020-10-29 Thread Robert Bradshaw
On Thu, Oct 29, 2020 at 9:55 AM Ismaël Mejía wrote:
> > Could you clarify what you mean by this? We certainly wouldn't want the stringification of all elements, only some of them, often post-hoc.
> What I mean by round trip is that I imagined we care mostly about data processed by

Re: Proposal: ToStringFn

2020-10-29 Thread Robert Burke
The part I find interesting here is that it allows extension of what runners and SDKs can do without changing or adding a new FnAPI rpc. "Known urns" like these can be toggled by including the appropriate urn along with other restrictions like coders or SDFs. On Thu, Oct 29, 2020, 9:55 AM Ismaël

Re: Proposal: ToStringFn

2020-10-29 Thread Ismaël Mejía
> Could you clarify what you mean by this? We certainly wouldn't want the stringification of all elements, only some of them, often post-hoc.
What I mean by round trip is that I imagined we care mostly about data processed by the SDK Harness, which is only bytes for the runner, so if we need to

Re: Proposal: ToStringFn

2020-10-29 Thread Robert Bradshaw
On Thu, Oct 29, 2020 at 3:18 AM Ismaël Mejía wrote:
> Thanks for sharing,
> I was initially confused by the title/terminology, I thought it was about an end-user transform but this is a 'protocol' for a runner to get the string representation of an element encoded by a SDK Harness

Re: Jenkins Java Version Confusion/Failures

2020-10-29 Thread Ismaël Mejía
You can unsubscribe by sending an email to: dev-unsubscr...@beam.apache.org
On Thu, Oct 29, 2020 at 11:28 AM Manuela Chamda Tchakoute wrote:
> Hi All,
> Please can one remove me from the Apache beam mail platform. I had once signed in as an outreachy intern. But didn't succeed.
> Thank

Re: Possible 80% reduction in overhead for flink runner, input needed

2020-10-29 Thread Teodor Spæren
Thanks Jan, this cleared some things up! Best regards, Teodor Spæren On Thu, Oct 29, 2020 at 02:13:50PM +0100, Jan Lukavský wrote: Hi Teodor, the confusion here may come from the fact that there are two (logical) representations of an element in a PCollection. One representation is the

Re: Possible 80% reduction in overhead for flink runner, input needed

2020-10-29 Thread Jan Lukavský
Hi Teodor, the confusion here may come from the fact that there are two (logical) representations of an element in a PCollection. One representation is the immutable form of a PCollection element (most probably serialized in binary form), where no modifications are possible. Once a

Re: Possible 80% reduction in overhead for flink runner, input needed

2020-10-29 Thread Teodor Spæren
Hey Jan! I fully agree! Best regards, Teodor Spæren On Thu, Oct 29, 2020 at 09:00:33AM +0100, Jan Lukavský wrote: Hi Teodor and Max, I think that there is not 100% need for all runners to behave exactly the same way. The reason for that is that different runners can have different

Re: Possible 80% reduction in overhead for flink runner, input needed

2020-10-29 Thread Teodor Spæren
Hey! Just so I understand this correctly then, what does the following quote from [1], section 3.2.3 mean: A PCollection is immutable. Once created, you cannot add, remove, or change individual elements. A Beam Transform might process each element of a PCollection and generate new pipeline

Re: Jenkins Java Version Confusion/Failures

2020-10-29 Thread Manuela Chamda Tchakoute
Hi All, Please can one remove me from the Apache beam mail platform. I had once signed in as an outreachy intern. But didn't succeed. Thank you.
On Oct 29, 2020 11:22 AM, "Ismaël Mejía" wrote:
> Thanks Tyson for finding/fixing the issue!
> I just double checked with today's SNAPSHOTS and the

Re: Jenkins Java Version Confusion/Failures

2020-10-29 Thread Ismaël Mejía
Thanks Tyson for finding/fixing the issue! I just double checked with today's SNAPSHOTS and the issue is gone.
On Wed, Oct 28, 2020 at 7:39 PM Tyson Hamilton wrote:
> Thanks Udi & Infra team for helping me out here. This was discussed on the builds mailing list [1] and in the INFRA ticket

Re: Proposal: ToStringFn

2020-10-29 Thread Ismaël Mejía
Thanks for sharing, I was initially confused by the title/terminology, I thought it was about an end-user transform but this is a 'protocol' for a runner to get the string representation of an element encoded by a SDK Harness (potentially in a different language) if I understood correctly. Are

Re: Possible 80% reduction in overhead for flink runner, input needed

2020-10-29 Thread Maximilian Michels
Ok, then we are on the same page, but I disagree with your conclusion. The reason Flink has to do the deep copy is that it doesn't state that the inputs are immutable and should not be changed. In Beam, the user is not supposed to modify the input collection and if

Re: Possible 80% reduction in overhead for flink runner, input needed

2020-10-29 Thread Jan Lukavský
Hi Teodor and Max, I think that there is not 100% need for all runners to behave exactly the same way. The reason for that is that different runners can have different purposes. The purpose of DirectRunner is to verify code of the pipeline and (if it succeeds) to validate that it will

Re: Possible 80% reduction in overhead for flink runner, input needed

2020-10-29 Thread Teodor Spæren
Ok, then we are on the same page, but I disagree with your conclusion. The reason Flink has to do the deep copy is that it doesn't state that the inputs are immutable and should not be changed. In Beam, the user is not supposed to modify the input collection and