I'm +1 on John's point 3) for punctuations.

And I think if people are on the same page that a reference equality check
per record is not a huge overhead, I think doing that enforcement is better
than documentations and hand-wavy undefined behaviors.


Guozhang

On Wed, Feb 9, 2022 at 11:27 AM John Roesler <vvcep...@apache.org> wrote:

> Thanks for the KIP Jorge,
>
> I'm in support of your proposal.
>
> 1)
> I do agree with Guozhang's point (1). I think the cleanest
> approach. I think it's cleaner and better to keep the
> enforcement internal to the framework than to introduce a
> public API or context wrapper for processors to use
> explicitly.
>
> 2) I tend to agree with you on this one; I think the
> equality check ought to be fast enough in practice.
>
> 3) I think this is implicit, but should be explicit in the
> KIP: For the `processValues` API, because the framework sets
> the key on the context before calling `process` and then
> unsets it afterwards, there will always be no key set during
> task puctuation. Therefore, while processors may still
> register punctuators, they will not be able to forward
> anything from them.
>
> This is functionally equivalent to the existing
> transformers, by the way, that are also forbidden to forward
> anything during punctuation.
>
> For what it's worth, I think this is the best tradeoff.
>
> The only alternative I see is not to place any restriction
> on forwarded keys at all and just document that if users
> don't maintain proper partitioning, they'll get undefined
> behavior. That might be more powerful, but it's also a
> usability problem.
>
> Thanks,
> -John
>
> On Wed, 2022-02-09 at 11:34 +0000, Jorge Esteban Quilcate
> Otoya wrote:
> > Thanks Guozhang.
> >
> > > Does `ValueProcessorContext` have to be a public API? It seems to me
> > that this can be completely abstracted away from user interfaces as an
> > internal class
> >
> > Totally agree. No intention to add these as public APIs. Will update the
> > KIP to reflect this.
> >
> > > in the past the rationale for enforcing it at the
> > interface layer rather than do runtime checks is that it is more
> efficient.
> > > I'm not sure how much overhead it may incur to check if the key did not
> > change: if it is just a reference equality check maybe it's okay. What's
> > your take on this?
> >
> > Agree, reference equality should cover this validation and the overhead
> > impact should not be meaningful.
> > Will update the KIP to reflect this as well.
> >
> >
> > On Tue, 8 Feb 2022 at 19:05, Guozhang Wang <wangg...@gmail.com> wrote:
> >
> > > Hello Jorge,
> > >
> > > Thanks for bringing this KIP! I think this is a nice idea to consider
> using
> > > a single overloaded function name for #process, just a couple quick
> > > questions after reading the proposal:
> > >
> > > 1) Does `ValueProcessorContext` have to be a public API? It seems to me
> > > that this can be completely abstracted away from user interfaces as an
> > > internal class, and we call the `setKey` before calling
> user-instantiated
> > > `process` function, and then in its overridden `forward` it can just
> check
> > > if the key changes or not.
> > > 2) Related to 1) above, in the past the rationale for enforcing it at
> the
> > > interface layer rather than do runtime checks is that it is more
> efficient.
> > > I'm not sure how much overhead it may incur to check if the key did not
> > > change: if it is just a reference equality check maybe it's okay.
> What's
> > > your take on this?
> > >
> > >
> > > Guozhang
> > >
> > > On Tue, Feb 8, 2022 at 5:17 AM Jorge Esteban Quilcate Otoya <
> > > quilcate.jo...@gmail.com> wrote:
> > >
> > > > Hi Dev team,
> > > >
> > > > I'd like to start a new discussion thread on Kafka Streams KIP-820:
> > > >
> > > >
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-820%3A+Extend+KStream+process+with+new+Processor+API
> > > >
> > > > This KIP is aimed to extend the current `KStream#process` API to
> return
> > > > output values that could be chained across the topology, as well as
> > > > introducing a new `KStream#processValues` to use processor while
> > > validating
> > > > keys haven't change and repartition is not required.
> > > >
> > > > Looking forward to your feedback.
> > > >
> > > > Regards,
> > > > Jorge.
> > > >
> > >
> > >
> > > --
> > > -- Guozhang
> > >
>
>

-- 
-- Guozhang

Reply via email to