vamossagar12 commented on PR #14372: URL: https://github.com/apache/kafka/pull/14372#issuecomment-2100630568
Thanks @C0urante, I've updated the ticket description with examples of exceptions that lead to this scenario. I have mainly seen it with exceptions such as `org.apache.kafka.common.errors.SslAuthenticationException: SSL handshake failed`. I did consider the following options:

1) Killing the worker: I decided against it because
   a) Exceptions like these are generally widespread across the cluster. Even if I kill the worker and its tasks move to other workers, all of them eventually hit the same exceptions (I have seen this on a cluster with 14 workers, for example).
   b) There might be other sink connectors consuming from different Kafka brokers that don't have this problem, so we shouldn't make them pay this penalty.
2) Infinite retries (with or without backoff): These are non-transient errors, so even if we retry, the connector would eventually still fail with the same error.

That is my rationale for the approach I took. I'd like to know your thoughts on this. Thanks!
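
To illustrate the point in option 2, here is a minimal, self-contained sketch of a retry loop that gives up immediately on a non-transient error instead of retrying it. It is not the code in this PR and does not use Kafka's classes: `TransientError`, `NonTransientError`, and `RetrySketch` are hypothetical names made up for this example, and backoff between attempts is omitted for brevity.

```java
// Hypothetical stand-ins for illustration only; these are NOT Kafka classes.
class TransientError extends RuntimeException {
    TransientError(String message) { super(message); }
}

class NonTransientError extends RuntimeException {
    NonTransientError(String message) { super(message); }
}

class RetrySketch {
    // Number of attempts made by the most recent call to run().
    int attempts = 0;

    // Runs the action up to maxAttempts times.
    // Returns true on success, false if it gave up.
    boolean run(Runnable action, int maxAttempts) {
        for (attempts = 1; attempts <= maxAttempts; attempts++) {
            try {
                action.run();
                return true;
            } catch (TransientError e) {
                // Worth retrying: the next attempt may succeed.
            } catch (NonTransientError e) {
                // Retrying cannot help (e.g. a failed SSL handshake caused by
                // bad credentials), so stop after this attempt.
                return false;
            }
        }
        attempts = maxAttempts; // the loop overshoots by one when it runs out
        return false;
    }
}
```

With this shape, an action that always throws `NonTransientError` fails after a single attempt regardless of `maxAttempts`, whereas an unconditional retry loop would spend every attempt on the same error before failing anyway.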