[ https://issues.apache.org/jira/browse/KAFKA-10816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris Egerton reassigned KAFKA-10816:
-------------------------------------

    Assignee: Chris Egerton  (was: Tom Bentley)

> Connect REST API should have a resource that can be used as a readiness probe
> -----------------------------------------------------------------------------
>
>                 Key: KAFKA-10816
>                 URL: https://issues.apache.org/jira/browse/KAFKA-10816
>             Project: Kafka
>          Issue Type: Improvement
>          Components: connect
>            Reporter: Randall Hauch
>            Assignee: Chris Egerton
>            Priority: Major
>
> There are a few ways to accurately detect whether a Connect worker is
> *completely* ready to process all REST requests:
> # Wait for {{Herder started}} in the Connect worker logs
> # Use the REST API to issue a request that will be completed only after the
> herder has started, such as {{GET /connectors/{name}/}} or
> {{GET /connectors/{name}/status}}.
> Other techniques can be used to detect other startup states, though none of
> these will guarantee that the worker has indeed completely started up and can
> process all REST requests:
> * {{GET /}} can be used to know when the REST server has started, but this
> may be before the worker has started completely and successfully.
> * {{GET /connectors}} can be used to know when the REST server has started,
> but this may be before the worker has started completely and successfully.
> And, for the distributed Connect worker, this may actually return an older
> list of connectors if the worker hasn't yet completely read through the
> internal config topic. It's also possible that this request returns even if
> the worker is having trouble reading from the internal config topic.
> * {{GET /connector-plugins}} can be used to know when the REST server has
> started, but this may be before the worker has started completely and
> successfully.
> The Connect REST API should have an endpoint that more obviously and more
> simply can be used as a readiness probe.
> This could be a new resource (e.g.,
> {{GET /status}}), though this would only work on newer Connect runtimes, and
> existing tooling, installations, and examples would have to be modified to
> take advantage of this feature (if it exists).
> Alternatively, we could make sure that the existing resources (e.g., {{GET /}}
> or {{GET /connectors}}) wait for the herder to start completely; this
> wouldn't require a KIP and it would not require clients to use a different
> technique for newer and older Connect runtimes. (Whether or not we backport
> this is another question altogether, since it's debatable whether the
> behavior of the existing REST resources is truly a bug.)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
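
For illustration only, the second technique in the issue description (issuing {{GET /connectors/{name}/status}}, which the description says completes only after the herder has started) might be wrapped in a small probe like the Python sketch below. The function names, the 5-second timeout, the example host and connector name, and the rule that any response below 500 counts as "ready" are assumptions made for this sketch; none of them come from the issue or from Connect itself.

```python
import urllib.error
import urllib.request


def probe_status(url, timeout=5.0):
    """Return the HTTP status code for a GET on url, or None if no response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server completed the request, just not with a 2xx
        # (for example 404 for a connector name that does not exist).
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure: no completed response.
        return None


def worker_is_ready(base_url, connector_name, fetch=probe_status):
    """Hypothetical readiness check built on GET /connectors/{name}/status.

    Assumption: a completed response with a status below 500 means the
    herder has started; a missing response or a 5xx means "not ready yet".
    """
    url = f"{base_url.rstrip('/')}/connectors/{connector_name}/status"
    status = fetch(url)
    return status is not None and status < 500


if __name__ == "__main__":
    # Example host, port and connector name; adjust for the worker being probed.
    ready = worker_is_ready("http://localhost:8083", "readiness-probe")
    raise SystemExit(0 if ready else 1)
```

Exiting with 0 or 1 lets the script serve as an exec-style probe command. The {{fetch}} parameter exists only so the decision logic can be exercised without a running worker.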