C0urante commented on a change in pull request #10503: URL: https://github.com/apache/kafka/pull/10503#discussion_r610330282
########## File path: connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerTask.java ########## @@ -185,8 +185,12 @@ private void doRun() throws InterruptedException { execute(); } catch (Throwable t) { - log.error("{} Task threw an uncaught and unrecoverable exception. Task is being killed and will not recover until manually restarted", this, t); - throw t; + if (!stopping && !cancelled) { Review comment: Hmm... I don't believe "cancelled" is a term we've used in public-facing surfaces in the past. For example, when a task takes too long to shut down now and we have to cancel it, we log the message that "Graceful stop... failed": https://github.com/apache/kafka/blob/5964401bf9aab611bd4a072941bd1c927e044258/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/Worker.java#L866 Personally I think the additional code complexity is worth it; the original ticket mentions a case where these messages confuse users because they're generated for cancelled tasks, so I'd rather err on the side of making things as obvious as possible to them. It might be possible to keep things simple and eliminate branches by tweaking the message to make it clear that newer task instances won't be impacted by this failure, though. A possible downside to this is that it might be confusing if there are no newer instances that will be brought up on the worker (because the connector has been deleted, the number of tasks has been reduced, or the task has been reassigned to another worker). But with some careful wording we might be able to avoid misleading people into thinking that this message implies there's already another instance running. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org