I'm +1 on John's point 3) for punctuations. And I think if people are on the same page that a reference equality check per record is not a huge overhead, I think doing that enforcement is better than documentations and hand-wavy undefined behaviors.
Guozhang On Wed, Feb 9, 2022 at 11:27 AM John Roesler <vvcep...@apache.org> wrote: > Thanks for the KIP Jorge, > > I'm in support of your proposal. > > 1) > I do agree with Guozhang's point (1). I think the cleanest > approach. I think it's cleaner and better to keep the > enforcement internal to the framework than to introduce a > public API or context wrapper for processors to use > explicitly. > > 2) I tend to agree with you on this one; I think the > equality check ought to be fast enough in practice. > > 3) I think this is implicit, but should be explicit in the > KIP: For the `processValues` API, because the framework sets > the key on the context before calling `process` and then > unsets it afterwards, there will always be no key set during > task puctuation. Therefore, while processors may still > register punctuators, they will not be able to forward > anything from them. > > This is functionally equivalent to the existing > transformers, by the way, that are also forbidden to forward > anything during punctuation. > > For what it's worth, I think this is the best tradeoff. > > The only alternative I see is not to place any restriction > on forwarded keys at all and just document that if users > don't maintain proper partitioning, they'll get undefined > behavior. That might be more powerful, but it's also a > usability problem. > > Thanks, > -John > > On Wed, 2022-02-09 at 11:34 +0000, Jorge Esteban Quilcate > Otoya wrote: > > Thanks Guozhang. > > > > > Does `ValueProcessorContext` have to be a public API? It seems to me > > that this can be completely abstracted away from user interfaces as an > > internal class > > > > Totally agree. No intention to add these as public APIs. Will update the > > KIP to reflect this. > > > > > in the past the rationale for enforcing it at the > > interface layer rather than do runtime checks is that it is more > efficient. > > > I'm not sure how much overhead it may incur to check if the key did not > > change: if it is just a reference equality check maybe it's okay. What's > > your take on this? > > > > Agree, reference equality should cover this validation and the overhead > > impact should not be meaningful. > > Will update the KIP to reflect this as well. > > > > > > On Tue, 8 Feb 2022 at 19:05, Guozhang Wang <wangg...@gmail.com> wrote: > > > > > Hello Jorge, > > > > > > Thanks for bringing this KIP! I think this is a nice idea to consider > using > > > a single overloaded function name for #process, just a couple quick > > > questions after reading the proposal: > > > > > > 1) Does `ValueProcessorContext` have to be a public API? It seems to me > > > that this can be completely abstracted away from user interfaces as an > > > internal class, and we call the `setKey` before calling > user-instantiated > > > `process` function, and then in its overridden `forward` it can just > check > > > if the key changes or not. > > > 2) Related to 1) above, in the past the rationale for enforcing it at > the > > > interface layer rather than do runtime checks is that it is more > efficient. > > > I'm not sure how much overhead it may incur to check if the key did not > > > change: if it is just a reference equality check maybe it's okay. > What's > > > your take on this? > > > > > > > > > Guozhang > > > > > > On Tue, Feb 8, 2022 at 5:17 AM Jorge Esteban Quilcate Otoya < > > > quilcate.jo...@gmail.com> wrote: > > > > > > > Hi Dev team, > > > > > > > > I'd like to start a new discussion thread on Kafka Streams KIP-820: > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-820%3A+Extend+KStream+process+with+new+Processor+API > > > > > > > > This KIP is aimed to extend the current `KStream#process` API to > return > > > > output values that could be chained across the topology, as well as > > > > introducing a new `KStream#processValues` to use processor while > > > validating > > > > keys haven't change and repartition is not required. > > > > > > > > Looking forward to your feedback. > > > > > > > > Regards, > > > > Jorge. > > > > > > > > > > > > > -- > > > -- Guozhang > > > > > -- -- Guozhang