[ https://issues.apache.org/jira/browse/KAFKA-14699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17702155#comment-17702155 ]

Sagar Rao commented on KAFKA-14699:
-----------------------------------

hi [~miguel_costa], thanks for filing this improvement ticket. It certainly 
looks like an interesting idea. A couple of things I want to add here:

1) Since what you are proposing would potentially be a change to the public 
interfaces, such changes need a KIP.

2) Staying on KIPs, error reporting semantics were added to Connect in 
[KIP-610|https://cwiki.apache.org/confluence/display/KAFKA/KIP-610%3A+Error+Reporting+in+Sink+Connectors] 
and 
[KIP-298|https://cwiki.apache.org/confluence/display/KAFKA/KIP-298%3A+Error+Handling+in+Connect].

These KIPs specifically address some of the situations you described above, 
like connectivity or transformation issues. When I look at the KIPs, it 
appears to me that the additions were kept fairly generic, since each 
connector has its own set of requirements as far as error handling goes. Given 
that the errors you described would depend on how each connector wants to 
handle them, I would argue that such error handling is better left to the 
individual connector. Having said that, if you feel the changes you proposed 
are necessary, you might want to create a KIP and start a discussion on the 
dev mailing list. Let me know if that makes sense.
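
For reference, here is a minimal sketch of the knobs those two KIPs already 
give you on a sink connector (the DLQ topic name is a placeholder, not 
something from your setup):

    # tolerate record-level failures instead of failing the task
    errors.tolerance=all
    # route failed records to a DLQ topic (placeholder name)
    errors.deadletterqueue.topic.name=my-connector-dlq
    # attach __connect.errors.* headers (failing stage, exception, etc.)
    errors.deadletterqueue.context.headers.enable=true
    # retry possibly-transient failures before giving up on a record
    errors.retry.timeout=300000
    errors.retry.delay.max.ms=60000
    # also log the error context
    errors.log.enable=true
    errors.log.include.messages=true

Note that with the context headers enabled, every DLQ record carries a 
__connect.errors.stage header, so a DLQ consumer can already tell converter, 
transformation, and put failures apart after the fact, even though the 
skip-vs-fail decision itself is not stage-specific today.

And to illustrate the per-connector handling I mentioned, here is a rough 
sketch of how a sink task can use KIP-610's ErrantRecordReporter to implement 
exactly the split you describe (write() and isDataError() are hypothetical 
helpers, not Connect API):

    import java.util.Collection;
    import org.apache.kafka.connect.errors.ConnectException;
    import org.apache.kafka.connect.sink.ErrantRecordReporter;
    import org.apache.kafka.connect.sink.SinkRecord;
    import org.apache.kafka.connect.sink.SinkTask;

    public abstract class SelectiveSinkTask extends SinkTask {

        @Override
        public void put(Collection<SinkRecord> records) {
            // Null on pre-2.6 runtimes or when no DLQ/error log is configured.
            ErrantRecordReporter reporter = context.errantRecordReporter();
            for (SinkRecord record : records) {
                try {
                    write(record); // hypothetical per-record write
                } catch (Exception e) {
                    if (reporter != null && isDataError(e)) {
                        // Record-level problem (e.g. a constraint violation):
                        // report it to the DLQ and keep going.
                        reporter.report(record, e);
                    } else {
                        // Environmental problem (connectivity, tablespace):
                        // fail the task so the whole batch can be retried.
                        throw new ConnectException("Fatal error in put()", e);
                    }
                }
            }
        }

        protected abstract void write(SinkRecord record);

        protected abstract boolean isDataError(Exception e);
    }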

> Kafka Connect framework errors.tolerance improvement 
> -----------------------------------------------------
>
>                 Key: KAFKA-14699
>                 URL: https://issues.apache.org/jira/browse/KAFKA-14699
>             Project: Kafka
>          Issue Type: Improvement
>          Components: KafkaConnect
>    Affects Versions: 3.3.2
>            Reporter: Miguel Costa
>            Priority: Major
>
> TL;DR: improve errors.tolerance from [none, all] to [none, deserialization, 
> transformation, put, all]
>  
> Hi all, it's my first time requesting an improvement here, so sorry if my 
> request is not clear, has already been considered, or is incomplete.
> I am currently experiencing some issues with Kafka Connect error handling 
> and the DLQ setup. Maybe my setup or my understanding is wrong, but it leads 
> me to believe that the current options provided by Kafka are insufficient.
> I'll start with the current documentation:
> h4. [errors.tolerance|https://kafka.apache.org/documentation/#sourceconnectorconfigs_errors.tolerance]
> Behavior for tolerating errors during connector operation. 'none' is the 
> default value and signals that any error will result in an immediate 
> connector task failure; 'all' changes the behavior to skip over problematic 
> records.
> ||Type:|string|
> ||Default:|none|
> ||Valid Values:|[none, all]|
> ||Importance:|medium|
>  
> My understanding is that the Kafka Connect framework currently lets you 
> either tolerate all errors or none, and leaves any further handling to the 
> individual connector plugin implementations.
> My experience is mainly in the Kafka Sink connectors.
> What I've experienced recently is something that has also been reported as 
> a possible improvement on the individual connectors themselves:
> [https://github.com/confluentinc/kafka-connect-jdbc/issues/721]
> What I think is that the Kafka Connect framework could provide an option to 
> better control the scenarios in which records go to the DLQ and those in 
> which the connector should fail.
> In my opinion, failures in deserialization (key, header, and value 
> converters) or in the transformation chain are good candidates to go to the 
> DLQ.
> Errors at the sink/put stage should never go to the DLQ and should instead 
> make the connector fail, because these errors are not (or may not be) 
> transient.
> To explain better: if I have a connectivity issue or a tablespace issue, it 
> makes no sense to move on to the next records and send them all to the DLQ, 
> because until the target is up and running smoothly there is no way to 
> continue processing data.
> I can imagine that in a JDBC scenario, constraint violations, for example, 
> would only affect some records, and we would still want those in the DLQ 
> instead of failing the whole pipeline; that's why I think a configuration 
> for the "put" stage should also exist (a hypothetical example follows below).
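> For illustration, a hypothetical configuration under this proposal could 
> look like the following (neither the stage-specific values nor combining 
> them exists in Kafka today; the DLQ topic name is a placeholder):
>
>     # hypothetical: skip and DLQ converter/transformation failures,
>     # but still fail the task on put() errors
>     errors.tolerance=deserialization,transformation
>     errors.deadletterqueue.topic.name=my-connector-dlq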
> Let me know if this is clear, and if any of my understanding is completely 
> wrong.
> Best regards,
> Miguel
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
