We have stumbled upon an issue on a running cluster with multiple source/sink connectors:
1. One of our connectors was a JDBC sink connector connected to a SQL Server database (using the Oracle JDBC driver).
2. It turned out that the DB instance had a problem that caused all queries to hang indefinitely, which in turn made the start method of the connector hang forever.
3. After some time, the entire Kafka Connect cluster became unavailable and the REST API stopped responding, returning {"error_code":500,"message":"Request timed out"} for most requests.
4. Pausing (just before the deletion of the consumer group) or deleting the problematic connector allowed the cluster to run normally again.

We could reproduce the same issue by adding Thread.sleep(300000) in the start method or in the put method of the ConnectorTask.

Is there any wiki/documentation that describes how to handle this issue? My approach would be to throw a timeout exception after waiting for a fixed period, so that the connector fails fast.

--
Thanks & Regards,
Hemanth
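
P.S. To make the fail-fast idea above concrete, here is a minimal sketch in plain Java. It is not taken from Kafka Connect or from our connector: the class name FailFast and the helper runWithTimeout are hypothetical, and the sketch only assumes that the blocking work (for example, opening the JDBC connection in start) can be wrapped in a Callable.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper class, for illustration only.
public class FailFast {

    // Runs a potentially blocking call on a separate daemon thread and
    // throws TimeoutException if it does not finish within the given time.
    public static <T> T runWithTimeout(Callable<T> call, long timeout, TimeUnit unit)
            throws TimeoutException {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "fail-fast-worker");
            t.setDaemon(true); // a stuck call must not keep the JVM alive
            return t;
        });
        Future<T> future = executor.submit(call);
        try {
            return future.get(timeout, unit);
        } catch (TimeoutException e) {
            // Interrupt the worker; a call blocked on socket I/O may ignore this.
            future.cancel(true);
            throw e;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            future.cancel(true);
            throw new IllegalStateException("Interrupted while waiting for the call", e);
        } catch (ExecutionException e) {
            // The call itself failed; surface its cause.
            throw new IllegalStateException("Call failed", e.getCause());
        } finally {
            executor.shutdownNow();
        }
    }
}
```

Inside a task, the start or put method would catch the TimeoutException and rethrow it as org.apache.kafka.connect.errors.ConnectException so that the task is marked as failed instead of hanging. One caveat: a thread blocked inside a JDBC call may not react to interruption, so the worker thread can linger even after the timeout fires. Driver-level limits such as DriverManager.setLoginTimeout and Statement.setQueryTimeout are probably a better first line of defence where the driver honours them.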