This is an automated email from the ASF dual-hosted git repository.
gharris pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/kafka.git
The following commit(s) were added to refs/heads/trunk by this push:
new 821954e5692 KAFKA-15233: Add documentation for plugin.discovery and
connect-plugin-path (KIP-898) (#14068)
821954e5692 is described below
commit 821954e5692f0a2748018939a7af8fe0f765ac24
Author: Greg Harris <[email protected]>
AuthorDate: Thu Aug 10 16:16:11 2023 -0700
KAFKA-15233: Add documentation for plugin.discovery and connect-plugin-path
(KIP-898) (#14068)
Reviewers: Qichao Chu (@ex172000), Chris Egerton <[email protected]>
---
docs/connect.html | 69 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
docs/toc.html | 1 +
2 files changed, 69 insertions(+), 1 deletion(-)
diff --git a/docs/connect.html b/docs/connect.html
index ac580696d36..d324f4384b4 100644
--- a/docs/connect.html
+++ b/docs/connect.html
@@ -543,6 +543,67 @@ errors.tolerance=all</pre>
</tbody>
</table>
+ <h4><a id="connect_plugindiscovery" href="#connect_plugindiscovery">Plugin
Discovery</a></h4>
+
+ <p>Plugin discovery is the name for the strategy that the Connect worker
uses to find plugin classes and make them accessible to be configured and run in
connectors. This is controlled by the <a
href="#connectconfigs_plugin.discovery">plugin.discovery</a> worker
configuration, and has a significant impact on worker startup time.
<code>service_load</code> is the fastest strategy, but care should be taken to
verify that plugins are compatible before setting this configuration to
<code>ser [...]
+
+ <p>Prior to version 3.6, this strategy was not configurable, and the worker
behaved like the <code>only_scan</code> mode, which is compatible with all
plugins. In version 3.6 and later, this configuration defaults to
<code>hybrid_warn</code>, which is
also compatible with all plugins, but logs a warning for plugins which are
incompatible with <code>service_load</code>. The <code>hybrid_fail</code>
strategy stops the worker with an error if a plugin incompatible with
<code>service_load</code> is detected, [...]
+
+ <h5><a id="connect_plugindiscovery_compatibility"
href="#connect_plugindiscovery_compatibility">Verifying Plugin
Compatibility</a></h5>
+
+ <p>To verify whether all of your plugins are compatible with
<code>service_load</code>, first ensure that you are using version 3.6 or later
of Kafka Connect. You can then perform one of the following checks:</p>
+
+ <ul>
+ <li>Start your worker with the default
<code>hybrid_warn</code> strategy, and WARN logs enabled for the
<code>org.apache.kafka.connect</code> package. At least one WARN log message
mentioning the <code>plugin.discovery</code> configuration should be printed.
This log message will explicitly say that all plugins are compatible, or list
the incompatible plugins.</li>
+ <li>Start your worker in a test environment with
<code>hybrid_fail</code>. If all plugins are compatible, startup will succeed.
If at least one plugin is not compatible, the worker will fail to start up, and
all incompatible plugins will be listed in the exception.</li>
+ </ul>
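
The checks above only require setting one worker property. As an illustrative sketch for a test environment (the file name is an example; use your own worker configuration file):

```properties
# Illustrative worker configuration fragment for a test environment.
# hybrid_fail aborts worker startup if any installed plugin is
# incompatible with service_load.
plugin.discovery=hybrid_fail
```
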
+
+ <p>If the verification step succeeds, then your current set of installed
plugins is compatible, and it should be safe to change the
<code>plugin.discovery</code> configuration to <code>service_load</code>. If
the verification fails, you cannot use the <code>service_load</code> strategy
and should take note of the list of incompatible plugins. All incompatible
plugins must be addressed before using the <code>service_load</code> strategy.
It is
recommended to perform this verification after installing [...]
+
+ <h5><a id="connect_plugindiscovery_migrateartifact"
href="#connect_plugindiscovery_migrateartifact">Operators: Artifact
Migration</a></h5>
+
+ <p>As an operator of Connect, if you discover incompatible plugins, there
are multiple ways to resolve the incompatibility. They are listed below from
most to least preferable.</p>
+
+ <ol>
+ <li>Check the latest release from your plugin provider, and if it is
compatible, upgrade.</li>
+ <li>Contact your plugin provider and request that they migrate the
plugin to be compatible, following the <a
href="#connect_plugindiscovery_migratesource">source migration
instructions</a>, and then upgrade to the compatible version.</li>
+ <li>Migrate the plugin artifacts yourself using the included migration
script.</li>
+ </ol>
+
+ <p>The migration script is located in
<code>bin/connect-plugin-path.sh</code> and
<code>bin\windows\connect-plugin-path.bat</code> of your Kafka installation.
The script can migrate incompatible plugin artifacts already installed on your
Connect worker's <code>plugin.path</code> by adding or modifying JAR or
resource files. This is not suitable for environments using code-signing, as
this can change artifacts such that they will fail signature verification. View
the built-in help wit [...]
+
+ <p>To perform a migration, first use the <code>list</code> subcommand to
get an overview of the plugins available to the script. You must tell the
script where to find plugins, which can be done with the repeatable
<code>--worker-config</code>, <code>--plugin-path</code>, and
<code>--plugin-location</code> arguments. The script will ignore plugins on the
classpath, so any custom plugins on your classpath should be moved to the
plugin path in order to be used with this migration scrip [...]
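
For example, a <code>list</code> invocation might look like the following sketch (the paths are illustrative; consult the script's built-in help for the full set of arguments):

```shell
# Illustrative: list the plugins visible to the migration script.
# Substitute your worker config and plugin locations for the paths below.
bin/connect-plugin-path.sh list \
  --worker-config config/connect-distributed.properties \
  --plugin-path /opt/connect/plugins
```
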
+
+ <p>Once you see that all incompatible plugins are included in the listing,
you can proceed to dry-run the migration with <code>sync-manifests
--dry-run</code>. This will perform all parts of the migration, except for
writing the results of the migration to disk. Note that the
<code>sync-manifests</code> command requires all specified paths to be
writable, and may alter the contents of the directories. Make a backup of your
plugins in the specified paths, or copy them to a writable di [...]
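
A dry run over the same locations might then look like this sketch (illustrative paths; make the backup first, since the real run rewrites files in place):

```shell
# Illustrative: rehearse the migration without writing anything to disk.
bin/connect-plugin-path.sh sync-manifests --dry-run \
  --plugin-path /opt/connect/plugins
# After the dry run succeeds and a backup exists, drop --dry-run to apply.
```
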
+
+ <p>Ensure that you have a backup of your plugins and the dry-run succeeds
before removing the <code>--dry-run</code> flag and actually running the
migration. If the migration fails without the <code>--dry-run</code> flag, then
the partially migrated artifacts should be discarded. The migration is
idempotent, so running it multiple times and on already-migrated plugins is
safe. After the script finishes, you should <a
href="#connect_plugindiscovery_compatibility">verify the migration [...]
+
+ <h5><a id="connect_plugindiscovery_migratesource"
href="#connect_plugindiscovery_migratesource">Developers: Source
Migration</a></h5>
+
+ <p>To make plugins compatible with <code>service_load</code>, it is
necessary to add <a
href="https://docs.oracle.com/javase/8/docs/api/java/util/ServiceLoader.html">ServiceLoader</a>
manifests to your source code, which should then be packaged in the release
artifact. Manifests are resource files in <code>META-INF/services/</code> named
after their superclass type, and contain a list of fully-qualified subclass
names, one on each line.</p>
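
Conceptually, the mechanism behind <code>service_load</code> is the standard JDK <code>ServiceLoader</code>. The following standalone sketch illustrates it using a built-in JDK service interface (<code>CharsetProvider</code>), since an example here cannot depend on Connect's own classes; it is not the worker's actual code:

```java
import java.util.ServiceLoader;
import java.nio.charset.spi.CharsetProvider;

// Illustrative sketch of the JDK mechanism behind service_load: ServiceLoader
// reads each META-INF/services/<interface> manifest on the classpath and
// instantiates the listed providers, instead of scanning every class.
public class ServiceLoadSketch {
    public static void main(String[] args) {
        ServiceLoader<CharsetProvider> providers =
                ServiceLoader.load(CharsetProvider.class);
        for (CharsetProvider p : providers) {
            System.out.println("discovered: " + p.getClass().getName());
        }
        System.out.println("scan complete");
    }
}
```

Because only the manifests are read, discovery cost is proportional to the number of declared providers rather than the number of classes on the plugin path.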
+
+ <p>In order for a plugin to be compatible, it must appear as a line in a
manifest corresponding to the plugin superclass it extends. If a single plugin
implements multiple plugin interfaces, then it should appear in a manifest for
each interface it implements. If you have no classes for a certain type of
plugin, you do not need to include a manifest file for that type. If you have
classes which should not be visible as plugins, they should be marked abstract.
The following types are [...]
+
+ <ul>
+ <li><code>org.apache.kafka.connect.sink.SinkConnector</code></li>
+ <li><code>org.apache.kafka.connect.source.SourceConnector</code></li>
+ <li><code>org.apache.kafka.connect.storage.Converter</code></li>
+ <li><code>org.apache.kafka.connect.storage.HeaderConverter</code></li>
+
<li><code>org.apache.kafka.connect.transforms.Transformation</code></li>
+
<li><code>org.apache.kafka.connect.transforms.predicates.Predicate</code></li>
+
<li><code>org.apache.kafka.common.config.provider.ConfigProvider</code></li>
+
<li><code>org.apache.kafka.connect.rest.ConnectRestExtension</code></li>
+
<li><code>org.apache.kafka.connect.connector.policy.ConnectorClientConfigOverridePolicy</code></li>
+ </ul>
+
+ <p>For example, if you only have one connector with the fully-qualified
name <code>com.example.MySinkConnector</code>, then only one manifest file must
be added to resources in
<code>META-INF/services/org.apache.kafka.connect.sink.SinkConnector</code>, and
the contents should be similar to the following:</p>
+
+ <pre class="brush: resource;">
+# license header or comment
+com.example.MySinkConnector</pre>
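
As a sketch, assuming a conventional Maven/Gradle source layout and the hypothetical class name from the example above, the manifest can be created like this:

```shell
# Illustrative: create the ServiceLoader manifest under a conventional
# Maven/Gradle resources directory, using the hypothetical example class name.
mkdir -p src/main/resources/META-INF/services
printf 'com.example.MySinkConnector\n' \
  > src/main/resources/META-INF/services/org.apache.kafka.connect.sink.SinkConnector
# Sanity-check the manifest contents.
cat src/main/resources/META-INF/services/org.apache.kafka.connect.sink.SinkConnector
# prints: com.example.MySinkConnector
```
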
+
+ <p>You should then verify that your manifests are correct by using the <a
href="#connect_plugindiscovery_compatibility">verification steps</a> with a
pre-release artifact. If the verification succeeds, you can then release the
plugin normally, and operators can upgrade to the compatible version.</p>
+
<h3><a id="connect_development" href="#connect_development">8.3 Connector
Development Guide</a></h3>
<p>This guide describes how developers can write new connectors for Kafka
Connect to move data between Kafka and other systems. It briefly reviews a few
key concepts and then describes how to create a simple connector.</p>
@@ -577,9 +638,15 @@ errors.tolerance=all</pre>
<h5><a id="connect_connectorexample"
href="#connect_connectorexample">Connector Example</a></h5>
- <p>We'll cover the <code>SourceConnector</code> as a simple example.
<code>SinkConnector</code> implementations are very similar. Start by creating
the class that inherits from <code>SourceConnector</code> and add a field that
will store the configuration information to be propagated to the task(s) (the
topic to send data to, and optionally - the filename to read from and the
maximum batch size):</p>
+ <p>We'll cover the <code>SourceConnector</code> as a simple example.
<code>SinkConnector</code> implementations are very similar. Pick a package and
class name; these examples will use <code>FileStreamSourceConnector</code>,
but substitute your own class name where appropriate. In order to <a
href="#connect_plugindiscovery">make the plugin discoverable at runtime</a>,
add a ServiceLoader manifest to your resources in
<code>META-INF/services/org.apache.kafka.connect.source.SourceCo [...]
+ <pre class="brush: resource;">
+com.example.FileStreamSourceConnector</pre>
+
+ <p>Create a class that inherits from <code>SourceConnector</code> and add
a field that will store the configuration information to be propagated to the
task(s) (the topic to send data to and, optionally, the filename to read from
and the maximum batch size):</p>
<pre class="brush: java;">
+package com.example;
+
public class FileStreamSourceConnector extends SourceConnector {
private Map<String, String> props;</pre>
diff --git a/docs/toc.html b/docs/toc.html
index a50a81a5c41..d8979b1a904 100644
--- a/docs/toc.html
+++ b/docs/toc.html
@@ -202,6 +202,7 @@
<li><a href="#connect_rest">REST API</a></li>
<li><a href="#connect_errorreporting">Error Reporting in
Connect</a></li>
<li><a href="#connect_exactlyonce">Exactly-once
support</a></li>
+ <li><a href="#connect_plugindiscovery">Plugin
Discovery</a></li>
</ul>
<li><a href="#connect_development">8.3 Connector Development
Guide</a></li>
</ul>