Mickael Maison created KAFKA-15912:
--------------------------------------

             Summary: Parallelize conversion and transformation steps in Connect
                 Key: KAFKA-15912
                 URL: https://issues.apache.org/jira/browse/KAFKA-15912
             Project: Kafka
          Issue Type: Improvement
          Components: connect
            Reporter: Mickael Maison

In busy Connect pipelines, the conversion and transformation steps can have a significant impact on performance. This is especially true for large records with complex schemas, for example with CDC connectors.

Today, in order to always preserve ordering, converters and transformations are called on one record at a time, in a single thread in the Connect worker. As Connect usually handles records in batches (up to max.poll.records in sink pipelines; for source pipelines it depends on the connector), it could be highly beneficial to run the converters and the transformation chain in parallel on a pool of processing threads.

It should be possible to do some of these steps in parallel and still keep exact ordering. I'm even considering whether an option to give up ordering in exchange for even faster processing would make sense.
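
To illustrate the ordered variant, here is a minimal sketch of one way a batch could be processed in parallel while keeping the original record order. It is not Connect code: the class OrderedParallelBatch and the method processBatch are hypothetical names, and a plain java.util.function.Function stands in for the converter and transformation chain. The idea is only that tasks are submitted in batch order and their results are collected in that same order, whatever order the threads finish in.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical helper, not part of Kafka Connect.
class OrderedParallelBatch {

    // Applies `chain` to every record of `batch` on `pool` and returns the
    // results in the same order as the input batch.
    static <I, O> List<O> processBatch(List<I> batch,
                                       Function<I, O> chain,
                                       ExecutorService pool) {
        // Submit one task per record; futures are stored in batch order.
        List<Future<O>> futures = new ArrayList<>(batch.size());
        for (I record : batch) {
            Callable<O> task = () -> chain.apply(record);
            futures.add(pool.submit(task));
        }

        // Collect results by walking the futures in submission order, so the
        // output order does not depend on which task completes first.
        List<O> results = new ArrayList<>(batch.size());
        for (Future<O> future : futures) {
            try {
                results.add(future.get());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RuntimeException(e);
            } catch (ExecutionException e) {
                // Surface the failure of the stand-in chain to the caller.
                throw new RuntimeException(e.getCause());
            }
        }
        return results;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            // String::toUpperCase stands in for conversion + transformations.
            List<String> out = processBatch(List.of("a", "b", "c"),
                                            String::toUpperCase, pool);
            System.out.println(out); // prints [A, B, C]
        } finally {
            pool.shutdown();
        }
    }
}
```

This sketch assumes each record can be processed independently of the others. A transformation that keeps state across records would need extra care, and the unordered option would simply hand results downstream as they complete instead of walking the futures in order.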