Kalpesh Patel created KAFKA-12704:
-------------------------------------

             Summary: Concurrent calls to AbstractHerder::getConnector can potentially create two connector instances
                 Key: KAFKA-12704
                 URL: https://issues.apache.org/jira/browse/KAFKA-12704
             Project: Kafka
          Issue Type: Bug
          Components: KafkaConnect
            Reporter: Kalpesh Patel

Requests to the {{PUT /connector-plugins/\{connectorType}/config/validate}} endpoint are [delegated to the herder|https://github.com/apache/kafka/blob/16ee326755e3f13914a0ed446c34c84e65fc0bc4/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/rest/resources/ConnectorPluginsResource.java#L81], which [caches connector instances|https://github.com/apache/kafka/blob/16ee326755e3f13914a0ed446c34c84e65fc0bc4/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/AbstractHerder.java#L536-L544] that are used [during config validation|https://github.com/apache/kafka/blob/16ee326755e3f13914a0ed446c34c84e65fc0bc4/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/AbstractHerder.java#L310]. As a result, if concurrent requests to that endpoint arrive for the same connector type, the same connector instance may end up [validating those configurations|https://github.com/apache/kafka/blob/16ee326755e3f13914a0ed446c34c84e65fc0bc4/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/AbstractHerder.java#L334] concurrently.

(_May_ rather than _will_, because there is also a race condition in the {{AbstractHerder::getConnector}} method: it can fail to detect that an instance of the connector has already been created and, as a result, create a second instance.)

This is slightly problematic because the [Connector::validate|https://github.com/apache/kafka/blob/16ee326755e3f13914a0ed446c34c84e65fc0bc4/connect/api/src/main/java/org/apache/kafka/connect/connector/Connector.java#L122-L127] method is not documented as thread-safe. However, because many connectors use the default implementation of that method, it's probably not urgent that we patch this immediately.

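To illustrate the race described above, here is a minimal, self-contained sketch of a check-then-act cache lookup next to an atomic alternative. It is not the actual {{AbstractHerder}} code: {{ConnectorCacheSketch}}, {{FakeConnector}}, {{getConnectorRacy}}, {{getConnectorAtomic}} and the connector type string are all hypothetical names made up for this example. The only real API it relies on is {{ConcurrentHashMap.computeIfAbsent}}, which applies the mapping function at most once per key.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

final class ConnectorCacheSketch {
    // Hypothetical stand-in for a connector plugin instance.
    static final class FakeConnector {
        final String type;
        FakeConnector(String type) { this.type = type; }
    }

    private final ConcurrentMap<String, FakeConnector> cache = new ConcurrentHashMap<>();
    private final AtomicInteger created = new AtomicInteger();

    // Counts instantiations so the effect of a race is observable.
    private FakeConnector newConnector(String type) {
        created.incrementAndGet();
        return new FakeConnector(type);
    }

    // Check-then-act: two threads can both see a miss and each create an instance.
    FakeConnector getConnectorRacy(String type) {
        FakeConnector connector = cache.get(type);
        if (connector == null) {
            connector = newConnector(type);
            cache.put(type, connector);
        }
        return connector;
    }

    // Atomic alternative: the lookup and the creation happen as one step per key.
    FakeConnector getConnectorAtomic(String type) {
        return cache.computeIfAbsent(type, this::newConnector);
    }

    int createdCount() { return created.get(); }

    // Releases all threads at once and reports how many instances were created.
    static int instancesCreatedUnderContention(int threadCount) {
        ConnectorCacheSketch sketch = new ConnectorCacheSketch();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> {
                try {
                    start.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                sketch.getConnectorAtomic("hypothetical.FileSource");
            });
            threads[i].start();
        }
        start.countDown();
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return sketch.createdCount();
    }

    public static void main(String[] args) {
        System.out.println("instances created: " + instancesCreatedUnderContention(8));
    }
}
```

One trade-off to keep in mind: {{computeIfAbsent}} can block other updates to the map while the mapping function runs, so this approach may be a poor fit if instantiating a plugin is slow. It also only addresses the duplicate-instance race, not concurrent use of the single cached instance.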
A couple of options are:
# Update the docs for that method to specify that it must be thread-safe.
# Rewrite the connector validation logic in the framework to avoid concurrently invoking {{Connector::validate}} on the same instance.
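
As a rough illustration of the second option, the sketch below serializes validation calls by locking on the connector instance. This is one possible approach under stated assumptions, not a proposed patch: {{SerializedValidationSketch}}, {{FakeConnector}}, {{validateSerialized}} and the config key {{name}} are hypothetical, and the fake {{validate}} only tracks how many calls overlap.

```java
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

final class SerializedValidationSketch {
    // Hypothetical connector whose validate() records the peak number of overlapping calls.
    static final class FakeConnector {
        private final AtomicInteger inFlight = new AtomicInteger();
        private final AtomicInteger maxInFlight = new AtomicInteger();

        String validate(Map<String, String> config) {
            int now = inFlight.incrementAndGet();
            maxInFlight.accumulateAndGet(now, Math::max);
            try {
                return config.containsKey("name") ? "ok" : "missing name";
            } finally {
                inFlight.decrementAndGet();
            }
        }

        int maxConcurrentCalls() { return maxInFlight.get(); }
    }

    // Framework-side guard: at most one validate() call per instance at a time.
    static String validateSerialized(FakeConnector connector, Map<String, String> config) {
        synchronized (connector) {
            return connector.validate(config);
        }
    }

    // Hammers a single instance from several threads and reports the peak overlap.
    static int maxOverlapUnderContention(int threadCount, int callsPerThread) {
        FakeConnector connector = new FakeConnector();
        Map<String, String> config = Collections.singletonMap("name", "demo");
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < callsPerThread; j++) {
                    validateSerialized(connector, config);
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return connector.maxConcurrentCalls();
    }

    public static void main(String[] args) {
        System.out.println("peak overlapping validate() calls: " + maxOverlapUnderContention(8, 1000));
    }
}
```

Locking on the instance keeps the cache but makes validation requests for the same connector type queue behind each other. Creating a fresh instance per validation request would be another way to avoid sharing, at the cost of instantiating the plugin every time.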