John Roesler created KAFKA-8396:
-----------------------------------

             Summary: Clean up Transformer API
                 Key: KAFKA-8396
                 URL: https://issues.apache.org/jira/browse/KAFKA-8396
             Project: Kafka
          Issue Type: Improvement
          Components: streams
            Reporter: John Roesler


Currently, KStream operators transformValues and flatTransformValues disable 
context forwarding, and force operators to just return the new values.

The reason is that we wanted to prevent the key from changing, since the whole 
point of a `xValues` transformation is that we _do not_ change the key, and 
hence don't need to repartition.

However, the chosen mechanism has some drawbacks: The Transform concept is 
basically a way to plug in a custom Processor within the Streams DSL, but these 
restrictions make it more like a MapValues with access to the context. For 
example, even though you can still schedule punctuations, there's no way to 
forward values as a result of them. So, as a user, it's hard to build a mental 
model of how to use a TransformValues (because it's not quite a Transformer and 
not quite a Mapper).

Also, logically, a Transformer can call forward as much as it wants, so a 
Transformer and a FlatTransformer are effectively the same thing. Then, we also 
have TransformValues and FlatTransformValues that are also two more versions of 
the same thing, just to implement the key restrictions. Internally, some of 
these can send downstream by returning OR forwarding, and others can only 
return. It's a lot for users to keep in mind.

We can clean up this API significantly by just allowing all transformers to 
call `forward`. In the `Values` case, we can wrap the ProcessorContext in one 
that checks the key is `equal` to the one that got passed in (i.e., saves a 
reference and enforces equality with that reference in any call to `forward`). 
Then, we can actually deprecate the `*ValueTransformer*` interfaces and remove 
the restriction about calling forward.

We can consider a further cleanup (TBD) to deprecate the existing Transformer 
interface entirely, and replace it with one with a `void` return type. Then, 
the Transform and FlatTransform cases collapse together, and we just need 
Transform and TransformValues.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to