Mickael Maison created KAFKA-15912:
--------------------------------------

             Summary: Parallelize conversion and transformation steps in Connect
                 Key: KAFKA-15912
                 URL: https://issues.apache.org/jira/browse/KAFKA-15912
             Project: Kafka
          Issue Type: Improvement
          Components: connect
            Reporter: Mickael Maison


In busy Connect pipelines, the conversion and transformation steps can 
sometimes have a very significant impact on performance. This is especially 
true with large records with complex schemas, for example with CDC connectors.

Today, in order to always preserve ordering, converters and transformations are 
called on one record at a time in a single thread in the Connect worker. As 
Connect usually handles records in batches (up to max.poll.records in sink 
pipelines; for source pipelines it depends on the connector), it could be 
highly beneficial to run the converters and transformation chain in 
parallel on a pool of processing threads.

It should be possible to do some of these steps in parallel and still keep 
exact ordering. I'm even considering whether an option to lose ordering but 
allow even faster processing would make sense.
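
As a rough illustration of how per-record work could be parallelized while 
still keeping exact ordering, here is a minimal sketch using only 
java.util.concurrent. It is not Connect code: the class OrderedParallelBatch, 
the method processInOrder and the pool size of 4 are all hypothetical, and the 
Function argument merely stands in for the converter and transformation chain. 
Each record in a batch is submitted to the pool, and the results are then 
collected in submission order, so the output order matches the input order 
regardless of which task finishes first.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

public class OrderedParallelBatch {

    // Hypothetical helper (not part of the Connect API): applies `step` to
    // every record of a batch on the given pool and returns the results in
    // the same order as the input batch.
    public static <I, O> List<O> processInOrder(List<I> batch,
                                                Function<I, O> step,
                                                ExecutorService pool)
            throws InterruptedException, ExecutionException {
        // Submit one task per record; tasks may complete in any order.
        List<Future<O>> futures = new ArrayList<>(batch.size());
        for (I record : batch) {
            Callable<O> task = () -> step.apply(record);
            futures.add(pool.submit(task));
        }
        // Collect results in submission order, which restores batch order.
        List<O> results = new ArrayList<>(batch.size());
        for (Future<O> future : futures) {
            results.add(future.get());
        }
        return results;
    }

    public static void main(String[] args) throws Exception {
        // Pool size is an arbitrary choice for this sketch.
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<String> batch = List.of("a", "b", "c", "d");
            // String::toUpperCase stands in for conversion + transformation.
            List<String> out = processInOrder(batch, String::toUpperCase, pool);
            System.out.println(out); // prints [A, B, C, D]
        } finally {
            pool.shutdown();
        }
    }
}
```

Note that this sketch assumes each step is stateless and thread-safe, which 
may not hold for every existing converter or transformation. The 
ordering-relaxed variant mentioned above would instead hand results 
downstream as soon as each task completes.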



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
