Kumud Kumar Srivatsava Tirupati created KAFKA-13926:
-------------------------------------------------------
Summary: Proposal to have "HasField" predicate for kafka connect
Key: KAFKA-13926
URL: https://issues.apache.org/jira/browse/KAFKA-13926
Project: Kafka
Issue Type: Improvement
Components: KafkaConnect
Reporter: Kumud Kumar Srivatsava Tirupati
Hello,
Today's Connect predicates enable checks on the record metadata. However, this
can be limiting, considering that {*}many built-in and custom transformations
that we (the community) use are more key/value centric{*}.
Some use-cases this can solve:
* Data type conversions of certain pre-identified fields for records coming
across datasets only if those fields exist. [Ex: TimestampConverter can be run
only if the specified date field exists irrespective of the record metadata]
* Skip running a certain transform if a given field does or does not exist. Many
built-in transforms raise exceptions (Ex: the InsertField transform if the field
already exists), thereby breaking the task. Giving users this control enables
them to configure for such cases deliberately.
* Even though some built-in transforms explicitly handle these cases, invoking
them would still be an unnecessary pass over the record.
* Considering that each connector usually deals with multiple datasets (even
hundreds for a database CDC connector), metadata-centric predicate checking is
somewhat limiting when it comes to such pre-identified custom metadata fields
in the records.
I know some of these cases can be handled within the transforms themselves, but
that defeats the purpose of having predicates.
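
To make the idea concrete, here is a minimal standalone sketch of the
field-existence check and of how it would gate a transform. It is only an
illustration under stated assumptions: it deliberately does not use the Kafka
Connect API, it models a schemaless record value as a plain java.util.Map, and
the names HasFieldSketch, hasField and applyIfHasField are hypothetical, not
part of Kafka or of the predicate mentioned in this issue.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical, standalone illustration; not the Kafka Connect Predicate API.
class HasFieldSketch {

    // True only if the value is a Map that contains the given top-level key.
    // Null values and non-Map values are treated as "field absent".
    static boolean hasField(Object value, String fieldName) {
        if (!(value instanceof Map)) {
            return false;
        }
        return ((Map<?, ?>) value).containsKey(fieldName);
    }

    // Runs the transform only when the field exists; otherwise the value
    // is passed through unchanged and the transform is never invoked.
    static Map<String, Object> applyIfHasField(Map<String, Object> value,
                                               String fieldName,
                                               UnaryOperator<Map<String, Object>> transform) {
        return hasField(value, fieldName) ? transform.apply(value) : value;
    }

    public static void main(String[] args) {
        Map<String, Object> withDate = new HashMap<>();
        withDate.put("id", 1);
        withDate.put("created_at", "2022-05-23");

        Map<String, Object> withoutDate = new HashMap<>();
        withoutDate.put("id", 2);

        // Stand-in for a field-specific transform such as a type conversion.
        UnaryOperator<Map<String, Object>> markConverted = v -> {
            Map<String, Object> copy = new HashMap<>(v);
            copy.put("converted", true);
            return copy;
        };

        System.out.println(applyIfHasField(withDate, "created_at", markConverted));
        System.out.println(applyIfHasField(withoutDate, "created_at", markConverted));
    }
}
```

In this sketch the second record is returned untouched, which is the behaviour
a field-based predicate would give: the transform is skipped rather than
failing or doing a no-op pass. A real predicate would also need to handle
records with schemas, which this sketch leaves out.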
We have built this predicate for our own use and have found it extremely helpful.
Please let me know your thoughts so that I can raise a PR.
--
This message was sent by Atlassian Jira
(v8.20.7#820007)