[jira] [Commented] (KAFKA-15473) Connect connector-plugins endpoint shows duplicate plugins

2023-09-18 Thread Greg Harris (Jira)


[ https://issues.apache.org/jira/browse/KAFKA-15473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17766578#comment-17766578 ]

Greg Harris commented on KAFKA-15473:
-------------------------------------

It appears that the bug which prompted the fix in KAFKA-15244 (wrong PluginType 
being inferred) could also cause duplicates. For example:
{noformat}
  {
    "class": "org.apache.kafka.connect.storage.StringConverter",
    "type": "converter"
  },
  {
    "class": "org.apache.kafka.connect.storage.StringConverter",
    "type": "converter"
  },{noformat}
Here, the second entry should have been "header_converter". So while there are 
more duplicates in 3.6.0-rc0 than there were in <3.6.0-rc0, the presence of 
duplicates is not new. We wouldn't be breaking any third-party clients, because 
they would already have to handle these duplicates.
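
To illustrate how an inferred type can turn two listings into apparent duplicates, here is a minimal Java sketch. It assumes the type was inferred from the class alone by returning the first matching plugin interface, which is only one plausible reading of the bug. `Converter`, `HeaderConverter`, `DualConverter`, `Kind`, and `TypeInferenceSketch` are hypothetical stand-ins defined in the sketch itself, not the Connect classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical marker interfaces standing in for the two converter plugin APIs.
interface Converter {}
interface HeaderConverter {}

// Like StringConverter in the listing above, this class implements both interfaces.
final class DualConverter implements Converter, HeaderConverter {}

enum Kind {
    CONVERTER("converter", Converter.class),
    HEADER_CONVERTER("header_converter", HeaderConverter.class);

    final String label;
    final Class<?> superType;

    Kind(String label, Class<?> superType) {
        this.label = label;
        this.superType = superType;
    }

    // Assumed failure mode: look only at the class and return the first match.
    static Kind inferFromClass(Class<?> pluginClass) {
        for (Kind kind : values()) {
            if (kind.superType.isAssignableFrom(pluginClass)) {
                return kind;
            }
        }
        throw new IllegalArgumentException("not a plugin: " + pluginClass);
    }
}

final class TypeInferenceSketch {
    // Produces one listing row per kind the class was discovered under.
    static List<String> listing(Class<?> pluginClass, boolean inferFromClass) {
        List<String> rows = new ArrayList<>();
        for (Kind scannedAs : Kind.values()) {
            if (!scannedAs.superType.isAssignableFrom(pluginClass)) {
                continue;
            }
            // Either re-infer the type from the class, or keep the kind it was scanned as.
            Kind reported = inferFromClass ? Kind.inferFromClass(pluginClass) : scannedAs;
            rows.add(pluginClass.getSimpleName() + ":" + reported.label);
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(listing(DualConverter.class, true));
        System.out.println(listing(DualConverter.class, false));
    }
}
```

With inference enabled, both rows report "converter" and look like duplicates; when the scanned kind is carried through, the second row reports "header_converter" instead.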

> Connect connector-plugins endpoint shows duplicate plugins
> ----------------------------------------------------------
>
> Key: KAFKA-15473
> URL: https://issues.apache.org/jira/browse/KAFKA-15473
> Project: Kafka
> Issue Type: Bug
> Components: KafkaConnect
> Affects Versions: 3.6.0
> Reporter: Greg Harris
> Assignee: Greg Harris
> Priority: Blocker
> Fix For: 3.6.0
>
>
> In <3.6.0-rc0, only one copy of each plugin would be shown. For example:
> {noformat}
>   {
>     "class": "org.apache.kafka.connect.storage.StringConverter",
>     "type": "converter"
>   },{noformat}
> In 3.6.0-rc0, there are multiple listings for the same plugin. For example:
>  
> {noformat}
>   {
>     "class": "org.apache.kafka.connect.storage.StringConverter",
>     "type": "converter"
>   },
>   {
>     "class": "org.apache.kafka.connect.storage.StringConverter",
>     "type": "converter"
>   },
>   {
>     "class": "org.apache.kafka.connect.storage.StringConverter",
>     "type": "converter"
>   },
>   {
>     "class": "org.apache.kafka.connect.storage.StringConverter",
>     "type": "converter"
>   },
>   {
>     "class": "org.apache.kafka.connect.storage.StringConverter",
>     "type": "converter"
>   },
>   {
>     "class": "org.apache.kafka.connect.storage.StringConverter",
>     "type": "converter",
>     "version": "3.6.0"
>   },{noformat}
> These duplicates appear to happen when a plugin with the same class name 
> appears in multiple locations/classloaders.
> When interpreting a connector configuration, only one of these plugins will 
> be chosen, so only one is relevant to show to users. The REST API should only 
> display the plugins which are eligible to be loaded, and hide the duplicates.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-15473) Connect connector-plugins endpoint shows duplicate plugins

2023-09-18 Thread Greg Harris (Jira)


[ https://issues.apache.org/jira/browse/KAFKA-15473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17766592#comment-17766592 ]

Greg Harris commented on KAFKA-15473:
-------------------------------------

Plugins could also appear multiple times in <3.6.0-rc0 if multiple versions 
were on the plugin path concurrently. The DelegatingClassLoader would prefer 
the one with the latest version, but all of the different versions would be 
visible in the REST API.
It also treated the undefined version as distinct from defined versions. Since 
3.6.0 is also the first version with KAFKA-15291 and there are some public 
plugins which package the AK transforms, there are now going to be both 
un-versioned and versioned copies of the transforms classes.

I think there are multiple ways to address this:
1. Show the source location of the plugins in the REST API to distinguish 
between apparently equivalent entries
2. Use the DelegatingClassLoader logic for choosing among multiple 
similarly-named plugins to only show the entry which the DelegatingClassLoader 
would use
3. Deduplicate the PluginInfos to avoid obvious repetition, but allow multiple 
versions to still be shown.

(1) would require a KIP, and might be scope creep for the REST API. 
Theoretically the API client shouldn't care about the on-disk layout of plugins.
(2) hides the duplicates introduced by KAFKA-15244 and KAFKA-15291, but also 
hides all but the latest version of each plugin. If someone has multiple 
versions of a plugin installed, they can currently diagnose that via the REST 
API. After implementing solution (2), they would need to look at the worker logs.
(3) hides the duplicates introduced by KAFKA-15244, but leaves the duplicates 
introduced by KAFKA-15291, and allows users to diagnose when multiple versions 
are installed.

I'm not sure which of these to implement. In <3.6.0-rc0 it was not possible to 
diagnose installations with multiple copies of the same plugin when they had 
the same (or undefined) version, and in 3.6.0-rc0 that is now possible. If we 
implement solution (2) or (3), we would take away that capability.
We might just be able to leave this as-is, and try to implement some form of 
solution (1) to make these duplicates more descriptive.
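
The difference between solutions (2) and (3) can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the Connect implementation: `Entry` and `ListingStrategies` are hypothetical names, a null version stands for an undefined version, and the version ordering in `rank` is a placeholder (undefined sorts lowest, otherwise plain string order) rather than the ordering the DelegatingClassLoader actually applies.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for one listing entry; a null version means "undefined".
final class Entry {
    final String className;
    final String type;
    final String version;

    Entry(String className, String type, String version) {
        this.className = className;
        this.type = type;
        this.version = version;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Entry)) {
            return false;
        }
        Entry other = (Entry) o;
        return className.equals(other.className)
                && type.equals(other.type)
                && Objects.equals(version, other.version);
    }

    @Override
    public int hashCode() {
        return Objects.hash(className, type, version);
    }
}

final class ListingStrategies {
    // Solution (3): drop exact repeats, but keep every distinct version,
    // including the undefined one.
    static List<Entry> dropExactRepeats(List<Entry> entries) {
        return new ArrayList<>(new LinkedHashSet<>(entries));
    }

    // Solution (2): keep a single preferred entry per class name and type.
    static List<Entry> keepPreferred(List<Entry> entries) {
        Map<String, Entry> preferred = new LinkedHashMap<>();
        for (Entry e : entries) {
            preferred.merge(e.className + "|" + e.type, e,
                    (a, b) -> rank(b.version).compareTo(rank(a.version)) > 0 ? b : a);
        }
        return new ArrayList<>(preferred.values());
    }

    // Placeholder ordering (assumption): undefined sorts lowest, then string order.
    private static String rank(String version) {
        return version == null ? "" : version;
    }
}
```

Applied to the six StringConverter entries from the issue description (five without a version, one with "3.6.0"), `dropExactRepeats` leaves two entries and `keepPreferred` leaves one, which mirrors the trade-off described above.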

> Connect connector-plugins endpoint shows duplicate plugins
> ----------------------------------------------------------
>
> Key: KAFKA-15473
> URL: https://issues.apache.org/jira/browse/KAFKA-15473
> Project: Kafka
> Issue Type: Bug
> Components: KafkaConnect
> Affects Versions: 3.6.0
> Reporter: Greg Harris
> Assignee: Greg Harris
> Priority: Major
> Fix For: 3.6.0


[jira] [Commented] (KAFKA-15473) Connect connector-plugins endpoint shows duplicate plugins

2023-09-18 Thread Greg Harris (Jira)


[ https://issues.apache.org/jira/browse/KAFKA-15473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17766595#comment-17766595 ]

Greg Harris commented on KAFKA-15473:
-------------------------------------

I've opened [https://github.com/apache/kafka/pull/14398] with strategy (3) from 
above. We can always implement (1) in the future and change the 
PluginInfo::equals implementation to show these duplicates, so we can hide them 
for now. I think (2) removes functionality from the API and would count as a 
regression itself.
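
The role that equality plays in hiding or showing these entries can be sketched in Java as below. This is not the code from the pull request: `LocatedEntry` and `EqualitySketch` are hypothetical names, and the `location` field is an assumed addition that the REST API does not expose today. The sketch only shows that whichever fields take part in the identity of an entry decide whether two copies collapse into one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical listing entry; "location" is an assumed field, not part of the current API.
final class LocatedEntry {
    final String className;
    final String type;
    final String version;   // null when the plugin reports no version
    final String location;

    LocatedEntry(String className, String type, String version, String location) {
        this.className = className;
        this.type = type;
        this.version = version;
        this.location = location;
    }

    // The fields that decide whether two entries count as the same entry.
    // Arrays.asList is used because it tolerates a null version.
    List<String> identity(boolean includeLocation) {
        return includeLocation
                ? Arrays.asList(className, type, version, location)
                : Arrays.asList(className, type, version);
    }
}

final class EqualitySketch {
    // Keeps the first entry seen for each identity, preserving listing order.
    static List<LocatedEntry> distinct(List<LocatedEntry> entries, boolean includeLocation) {
        Map<List<String>, LocatedEntry> seen = new LinkedHashMap<>();
        for (LocatedEntry e : entries) {
            seen.putIfAbsent(e.identity(includeLocation), e);
        }
        return new ArrayList<>(seen.values());
    }
}
```

Two copies of the same class with the same (or undefined) version collapse into one entry when the location is left out of the identity, and become distinguishable again once it is included.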



[jira] [Commented] (KAFKA-15473) Connect connector-plugins endpoint shows duplicate plugins

2023-09-19 Thread Satish Duggana (Jira)


[ https://issues.apache.org/jira/browse/KAFKA-15473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17766718#comment-17766718 ]

Satish Duggana commented on KAFKA-15473:


[~gharris1727]
Is this API documented as not returning duplicate entries?

Can we also get an opinion from PMC/Committers who have KafkaConnect
expertise on whether this issue is a release blocker?

If we agree that it is not a release blocker then we can have a
release note clarifying this behaviour and add a reference to the JIRA
that follows up on the possible solutions.



[jira] [Commented] (KAFKA-15473) Connect connector-plugins endpoint shows duplicate plugins

2023-09-19 Thread Sagar Rao (Jira)


[ https://issues.apache.org/jira/browse/KAFKA-15473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17766738#comment-17766738 ]

Sagar Rao commented on KAFKA-15473:
-----------------------------------

[~satish.duggana], no, the API documentation doesn't mention anything about the 
presence or absence of duplicate entries. This is what it says:

??GET /connector-plugins - return a list of connector plugins installed in the 
Kafka Connect cluster. Note that the API only checks for connectors on the 
worker that handles the request, which means you may see inconsistent results, 
especially during a rolling upgrade if you add new connector jars??

I think the implicit assumption is that this endpoint would always return 
unique values, but as Greg pointed out above, even pre-3.6 there could be cases 
in which it returns duplicate entries. Keeping that in mind, IMO this needn't 
be a release blocker, and we can document it as you suggested.
