Re: [VOTE] KIP-891: Running multiple versions of Connector plugins

2024-10-08 Thread Snehashis
Thanks Greg, Chris and Ashwin

For the individual queries,

1. The example is incorrect and both should be treated as exact versions.
Thanks for pointing that out.
2. I can see a converter/transformation having a version property, but it's
probably unlikely. That said, I don't see why we should not use
plugin.version to make this more specific and avoid the collision in
the first place.

Will make the appropriate changes to the KIP to include these two points.
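
As an illustration of point 2, here is a minimal Python sketch of how a
namespaced key could keep the runtime's version requirement apart from a
plugin's own "version" property. The key name plugin.version is only the
suggestion floated in this thread, and split_plugin_config is a hypothetical
helper, not Connect runtime code.

```python
# Hypothetical reserved key, following the "plugin.version" suggestion in this thread.
RESERVED_KEY = "plugin.version"


def split_plugin_config(config, prefix):
    """Separate the runtime's version requirement from the plugin's own properties.

    Returns (requirement, plugin_props). Keys outside `prefix` are ignored.
    """
    requirement = None
    plugin_props = {}
    for key, value in config.items():
        if not key.startswith(prefix):
            continue
        name = key[len(prefix):]
        if name == RESERVED_KEY:
            # Consumed by the runtime, never passed to the plugin.
            requirement = value
        else:
            # Everything else, including a plain "version", reaches the plugin untouched.
            plugin_props[name] = value
    return requirement, plugin_props
```

With this split, a transform that already accepts its own "version" property
keeps receiving it, while the runtime reads only the reserved key.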
Thanks
Snehashis

On Tue, Oct 8, 2024 at 10:33 AM Ashwin wrote:

> Hi Snehashis,
>
> Thanks for the KIP - +1 (non-binding)
>
> I just had a question regarding the sample config in
>
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=235834793#KIP891:RunningmultipleversionsofConnectorplugins-Configuration
>
> >  "connector.version": "3.8"
> > "transforms.flatten-old.version": "3.8.0"
>
> How will the feature be implemented so that the first config is treated as an
> exact version requirement, while the second one is a
> "use-only-if-you-have-it" version?
>
> Thanks,
> Ashwin
>
>
> On Tue, Oct 8, 2024 at 12:19 AM Chris Egerton 
> wrote:
>
> > Hi Snehashis,
> >
> > Thanks for the KIP. I'm +1 (binding) but have one more non-blocking thought
> > I wanted to share.
> >
> > On the off chance that an existing plugin is designed to accept a "version"
> > property, could we either 1) keep passing that property to plugins instead
> > of stripping it, or 2) rename our new property to something like
> > "plugin.version"?
> >
> > Feel free to close the vote (if/when it gets a third binding +1) without
> > addressing this if you believe the tradeoffs with the existing design are
> > superior.
> >
> > Cheers,
> >
> > Chris
> >
> > On Mon, Oct 7, 2024, 11:22 Greg Harris 
> > wrote:
> >
> > > Hey Snehashis,
> > >
> > > Thanks for the KIP! +1 (binding)
> > >
> > > Greg
> > >
> > > On Fri, Oct 4, 2024 at 10:14 PM Snehashis 
> > > wrote:
> > >
> > > > Hi everyone
> > > >
> > > > I would like to call a vote for KIP-891. Please take a moment to review the
> > > > proposal and submit your vote. Special thanks to Greg who helped to expand
> > > > this to make it much more broadly useful, and everyone else who
> > > > participated in both discussion threads.
> > > >
> > > > KIP
> > > > KIP-891: Running multiple versions of Connector plugins - Apache Kafka -
> > > > Apache Software Foundation
> > > > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-891%3A+Running+multiple+versions+of+Connector+plugins>
> > > >
> > > > Thanks
> > > > Snehashis
> > > >
> > >
> >
>


Re: [DISCUSS] KIP-891: Running multiple versions of Connector plugins

2024-10-04 Thread Snehashis
Hi Greg,

I have started a vote thread and added it to the doc. Thanks for all your
help on this. Looking forward to this implementation.

Regards
Snehashis


On Wed, Sep 25, 2024 at 10:38 PM Greg Harris 
wrote:

> Hey Snehashis,
>
> I updated the KIP to remove some stale mentions of the soft version
> requirements, and the crashing workers on startup. I also added more detail
> to the REST API.
>
> IMHO we're ready to move to voting, so please open the thread if you also
> believe it is ready.
>
> Thanks!
>
> Greg
>
> On Fri, Aug 23, 2024 at 7:21 AM Snehashis 
> wrote:
>
> > Hi Greg,
> >
> > Thanks for the clarification!
> >
> > I agree with not breaking compatibility and opting to not fail worker
> > startup by validating converter versions.
> >
> > Let's also go with the unclosed variation of the requirements as a hard
> > requirement i.e. 3.8 instead of [3.8]. I have updated the KIP and added a
> > line there highlighting this.
> >
> > Regards
> > Snehashis
> >
> > On Wed, Aug 21, 2024 at 9:24 PM Greg Harris wrote:
> >
> > > Hey Snehashis,
> > >
> > > Thanks for your reply!
> > >
> > > > Deviating
> > > > it from the spec seems unnecessary if we document it accordingly, however
> > > > It's probably less intuitive and can lead to confusion. I would just keep
> > > > it as is but making it simpler is also fine.
> > >
> > > It sounds like you don't have a strong opinion on this, similar to me.
> > > Chris had a more firm stance on this. I think that you're right that we
> > > could document this, but it would still be the single biggest foot-gun of
> > > the feature, and I think we would regret it later.
> > >
> > > > Also for converter versions
> > > > specified as part of the worker configs I believe we concluded that this
> > > > step need not be fatal during worker startup if the required version is not
> > > > found but LMK if otherwise.
> > >
> > > Re-reading our earlier discussion, I think Chris had a strong opinion that
> > > we shouldn't fail on startup because it would be inconsistent. I made an
> > > offhand comment that if this was released in 4.0, we could change it so
> > > both invalid classes and invalid versions cause the worker to fail, which
> > > would be consistent but backwards incompatible.
> > > I think in the interest of not breaking compatibility needlessly and
> > > keeping consistent behavior, we should ignore invalid versions in the
> > > worker config.
> > >
> > > Thanks,
> > > Greg
> > >
> > > On Wed, Aug 21, 2024 at 1:58 AM Snehashis 
> > > wrote:
> > >
> > > > Hi Greg,
> > > >
> > > > No issues, I have been caught up in a few things myself.
> > > >
> > > > I have added the points we discussed. In addition, I have added config
> > > > providers as part of the set of plugins which will not support versioning,
> > > > for the same reason as to why it is not supported in the other plugins that
> > > > are initiated on startup.
> > > >
> > > > On whether to deviate from Maven versioning for hard requirements, as
> > > > discussed between you and Chris, i.e. whether to simply specify
> > > > connector.version=1.1.1 as a hard requirement instead of [1.1.1]: deviating
> > > > from the spec seems unnecessary if we document it accordingly; however,
> > > > it's probably less intuitive and can lead to confusion. I would just keep
> > > > it as is, but making it simpler is also fine. Also, for converter versions
> > > > specified as part of the worker configs, I believe we concluded that this
> > > > step need not be fatal during worker startup if the required version is not
> > > > found, but LMK if otherwise.
> > > >
> > > > Regards
> > > > Snehashis
> > > >
> > > >
> > > >
> > > > On Mon, Aug 19, 2024 at 9:24 PM Greg Harris wrote:
> > > >
> > > > > Hi Snehashis,
> > > > >
> > > > > Sorry for the late reply.
> > > > >
> > > > > > Heterogeneo

[VOTE] KIP-891: Running multiple versions of Connector plugins

2024-10-04 Thread Snehashis
Hi everyone

I would like to call a vote for KIP-891. Please take a moment to review the
proposal and submit your vote. Special thanks to Greg who helped to expand
this to make it much more broadly useful, and everyone else who
participated in both discussion threads.

KIP
KIP-891: Running multiple versions of Connector plugins - Apache Kafka -
Apache Software Foundation
<https://cwiki.apache.org/confluence/display/KAFKA/KIP-891%3A+Running+multiple+versions+of+Connector+plugins>

Thanks
Snehashis


Re: [DISCUSS] KIP-891: Running multiple versions of Connector plugins

2024-08-23 Thread Snehashis
Hi Greg,

Thanks for the clarification!

I agree with not breaking compatibility and opting to not fail worker
startup by validating converter versions.

Let's also go with the unclosed variation of the requirements as a hard
requirement i.e. 3.8 instead of [3.8]. I have updated the KIP and added a
line there highlighting this.
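
To make these semantics concrete, here is a rough Python sketch: a bare
version such as 3.8 is an exact (hard) requirement, [3.8] is the equivalent
Maven-style spelling, and bracketed pairs are ranges. The function names are
hypothetical, only purely numeric dotted versions are handled, and real Maven
version ordering has more rules than this.

```python
def parse_version(text):
    """Turn '3.8.0' into a comparable tuple; trailing zeros are dropped so 3.8 == 3.8.0."""
    parts = [int(part) for part in text.strip().split(".")]
    while parts and parts[-1] == 0:
        parts.pop()
    return tuple(parts)


def matches(requirement, installed):
    """Return True if the installed version satisfies the requirement string."""
    requirement = requirement.strip()
    have = parse_version(installed)
    if requirement[0] not in "[(":
        # Bare version, e.g. "3.8": treated as a hard, exact requirement.
        return have == parse_version(requirement)
    body = requirement[1:-1]
    if "," not in body:
        # "[3.8]": the Maven-style way to write an exact requirement.
        return have == parse_version(body)
    low, high = (side.strip() for side in body.split(",", 1))
    if low:
        bound = parse_version(low)
        # "(" excludes the lower bound, "[" includes it.
        if have < bound or (requirement[0] == "(" and have == bound):
            return False
    if high:
        bound = parse_version(high)
        # ")" excludes the upper bound, "]" includes it.
        if have > bound or (requirement[-1] == ")" and have == bound):
            return False
    return True
```

Under these assumptions, matches("3.8", "3.8.0") and matches("[3.8]", "3.8")
both hold, while matches("3.8", "3.9.0") does not.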

Regards
Snehashis

On Wed, Aug 21, 2024 at 9:24 PM Greg Harris 
wrote:

> Hey Snehashis,
>
> Thanks for your reply!
>
> > Deviating
> > it from the spec seems unnecessary if we document it accordingly, however
> > It's probably less intuitive and can lead to confusion. I would just keep
> > it as is but making it simpler is also fine.
>
> It sounds like you don't have a strong opinion on this, similar to me.
> Chris had a more firm stance on this. I think that you're right that we
> could document this, but it would still be the single biggest foot-gun of
> the feature, and I think we would regret it later.
>
> > Also for converter versions
> > specified as part of the worker configs I believe we concluded that this
> > step need not be fatal during worker startup if the required version is not
> > found but LMK if otherwise.
>
> Re-reading our earlier discussion, I think Chris had a strong opinion that
> we shouldn't fail on startup because it would be inconsistent. I made an
> offhand comment that if this was released in 4.0, we could change it so
> both invalid classes and invalid versions cause the worker to fail, which
> would be consistent but backwards incompatible.
> I think in the interest of not breaking compatibility needlessly and
> keeping consistent behavior, we should ignore invalid versions in the
> worker config.
>
> Thanks,
> Greg
>
> On Wed, Aug 21, 2024 at 1:58 AM Snehashis 
> wrote:
>
> > Hi Greg,
> >
> > No issues, I have been caught up in a few things myself.
> >
> > I have added the points we discussed. In addition, I have added config
> > providers as part of the set of plugins which will not support versioning,
> > for the same reason as to why it is not supported in the other plugins that
> > are initiated on startup.
> >
> > On whether to deviate from Maven versioning for hard requirements, as
> > discussed between you and Chris, i.e. whether to simply specify
> > connector.version=1.1.1 as a hard requirement instead of [1.1.1]: deviating
> > from the spec seems unnecessary if we document it accordingly; however,
> > it's probably less intuitive and can lead to confusion. I would just keep
> > it as is, but making it simpler is also fine. Also, for converter versions
> > specified as part of the worker configs, I believe we concluded that this
> > step need not be fatal during worker startup if the required version is not
> > found, but LMK if otherwise.
> >
> > Regards
> > Snehashis
> >
> >
> >
> > On Mon, Aug 19, 2024 at 9:24 PM Greg Harris wrote:
> >
> > > Hi Snehashis,
> > >
> > > Sorry for the late reply.
> > >
> > > > Heterogeneous dependencies in a multi cluster deployment is highly
> > > > discouraged
> > >
> > > Right, this remains unchanged in this KIP.
> > >
> > > > Let's add the version information for both connector and tasks in the
> > > > connector status itself
> > > > once we add these two additions to the KIP (LMK if you want me to take
> > > > that up).
> > >
> > > Could you make these additions?
> > >
> > > I'm interested to see if we can include this in 4.0.
> > >
> > > Thanks,
> > > Greg
> > >
> > > On Tue, Jul 2, 2024 at 2:52 AM Snehashis wrote:
> > >
> > > > Hi Greg,
> > > >
> > > > Thanks for taking a look at this, to conclude on the two points above.
> > > >
> > > > 1. I'm okay with the status quo of leaving the dependency management of
> > > > plugins to systems outside of the connect runtime as it is now. Given that
> > > > the dependencies are homogeneous across a connect cluster, it should ensure
> > > > that task and connector versions are uniform across a deployment, after it
> > > > has stabilised post an upgrade operation. Heterogeneous dependencies in a
> > > > multi cluster deployment are highly discouraged and we should point out that
> > > > this can lead to inconsistent behaviour with connectors and even
> > vali

Re: [DISCUSS] KIP-891: Running multiple versions of Connector plugins

2024-08-21 Thread Snehashis
Hi Greg,

No issues, I have been caught up in a few things myself.

I have added the points we discussed. In addition, I have added config
providers as part of the set of plugins which will not support versioning,
for the same reason as to why it is not supported in the other plugins that
are initiated on startup.

On whether to deviate from Maven versioning for hard requirements, as
discussed between you and Chris, i.e. whether to simply specify
connector.version=1.1.1 as a hard requirement instead of [1.1.1]: deviating
from the spec seems unnecessary if we document it accordingly; however,
it's probably less intuitive and can lead to confusion. I would just keep
it as is, but making it simpler is also fine. Also, for converter versions
specified as part of the worker configs, I believe we concluded that this
step need not be fatal during worker startup if the required version is not
found, but LMK if otherwise.

Regards
Snehashis



On Mon, Aug 19, 2024 at 9:24 PM Greg Harris 
wrote:

> Hi Snehashis,
>
> Sorry for the late reply.
>
> > Heterogeneous dependencies in a multi cluster deployment is highly
> > discouraged
>
> Right, this remains unchanged in this KIP.
>
> > Let's add the version information for both connector and tasks in the
> > connector status itself
> > once we add these two additions to the KIP (LMK if you want me to take
> > that up).
>
> Could you make these additions?
>
> I'm interested to see if we can include this in 4.0.
>
> Thanks,
> Greg
>
> On Tue, Jul 2, 2024 at 2:52 AM Snehashis wrote:
>
> > Hi Greg,
> >
> > Thanks for taking a look at this, to conclude on the two points above.
> >
> > 1. I'm okay with the status quo of leaving the dependency management of
> > plugins to systems outside of the connect runtime as it is now. Given that
> > the dependencies are homogeneous across a connect cluster, it should ensure
> > that task and connector versions are uniform across a deployment, after it
> > has stabilised post an upgrade operation. Heterogeneous dependencies in a
> > multi cluster deployment are highly discouraged and we should point out that
> > this can lead to inconsistent behaviour with connectors and even validations.
> >
> > 2. Let's add the version information for both connector and tasks in the
> > connector status itself, I don't see another endpoint being required for
> > this.
> >
> > I don't have any further points, so if you are okay we can put this to a
> > vote, once we add these two additions to the KIP (LMK if you want me to
> > take that up).
> >
> > Regards
> > Snehashis
> >
> > On Tue, Jul 2, 2024 at 12:08 AM Greg Harris wrote:
> >
> > > Hey Snehashis,
> > >
> > > Sorry for the late reply, and thanks for helping close out the discussion.
> > >
> > > > Note that if my assumptions are correct then
> > > > this can happen with the existing framework as well, or is there some
> > > > safeguard from this happening?
> > >
> > > Currently, if a cluster has a heterogeneous plugin installation, each
> > > worker may have a different definition of "latest". If you call to
> > > validate a configuration, it will validate against whatever latest version
> > > is present locally.
> > > With this KIP, that behavior will continue when the version isn't
> > > specified, when the version is a soft requirement, or when the version is a
> > > range. For users that do not want to tolerate their tasks having
> > > heterogeneous versions, the single-version hard requirements are there, but
> > > come at the cost of micromanagement of upgrades.
> > > Perhaps there is some room in the future for a connector configured with a
> > > range to "pin" the version that all of its tasks should use. Or
> > > version-aware-scheduling can schedule connectors and tasks to workers that
> > > will resolve the same versions. The important primitive is being able to
> > > enforce which version is being used, everything else looks more like a
> > > convenience feature to me.
> > >
> > > > So far, we could have pointed to the
> > > > misconfigured cluster configuration and somewhat deferred this problem to
> > > > something outside of connect runtime.
> > >
> > > I think this will still be the case. Even with this feature, we will still
> > > recommend homogeneous clusters that have the same set of plugins
>

Re: [DISCUSS] KIP-891: Running multiple versions of Connector plugins

2024-07-02 Thread Snehashis
Hi Greg,

Thanks for taking a look at this, to conclude on the two points above.

1. I'm okay with the status quo of leaving the dependency management of
plugins to systems outside of the connect runtime as it is now. Given that
the dependencies are homogeneous across a connect cluster, it should ensure
that task and connector versions are uniform across a deployment, after it
has stabilised post an upgrade operation. Heterogeneous dependencies in a
multi cluster deployment are highly discouraged and we should point out that
this can lead to inconsistent behaviour with connectors and even validations.

2. Let's add the version information for both connector and tasks in the
connector status itself, I don't see another endpoint being required for
this.
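
As a rough sketch of point 2, the Python below shows how per-connector and
per-task versions in a status response could be checked for consistency. The
"version" field and the overall response shape are assumptions for
illustration; the actual field names are whatever the KIP ends up specifying.

```python
def version_report(status):
    """Group a connector and its tasks by the plugin version each one reports.

    `status` is assumed to look like a connector status response with a
    hypothetical "version" field on the connector and on every task.
    """
    by_version = {}
    connector_version = status["connector"].get("version", "unknown")
    by_version.setdefault(connector_version, []).append("connector")
    for task in status.get("tasks", []):
        version = task.get("version", "unknown")
        by_version.setdefault(version, []).append(f"task-{task['id']}")
    return by_version


def is_consistent(status):
    """True when the connector and all of its tasks report a single version."""
    return len(version_report(status)) == 1
```

During a rolling upgrade the report would briefly show more than one version,
which is the situation such a field is meant to surface.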

I don't have any further points, so if you are okay we can put this to a
vote, once we add these two additions to the KIP (LMK if you want me to
take that up).

Regards
Snehashis

On Tue, Jul 2, 2024 at 12:08 AM Greg Harris 
wrote:

> Hey Snehashis,
>
> Sorry for the late reply, and thanks for helping close out the discussion.
>
> > Note that if my assumptions are correct then
> > this can happen with the existing framework as well, or is there some
> > safeguard from this happening?
>
> Currently, if a cluster has a heterogeneous plugin installation, each
> worker may have a different definition of "latest". If you call to
> validate a configuration, it will validate against whatever latest version
> is present locally.
> With this KIP, that behavior will continue when the version isn't
> specified, when the version is a soft requirement, or when the version is a
> range. For users that do not want to tolerate their tasks having
> heterogeneous versions, the single-version hard requirements are there, but
> come at the cost of micromanagement of upgrades.
> Perhaps there is some room in the future for a connector configured with a
> range to "pin" the version that all of its tasks should use. Or
> version-aware-scheduling can schedule connectors and tasks to workers that
> will resolve the same versions. The important primitive is being able to
> enforce which version is being used, everything else looks more like a
> convenience feature to me.
>
> > So far, we could have pointed to the
> > misconfigured cluster configuration and somewhat deferred this problem to
> > something outside of connect runtime.
>
> I think this will still be the case. Even with this feature, we will still
> recommend homogeneous clusters that have the same set of plugins installed
> on every worker. Version-aware scheduling is a rejected alternative, and
> your example for performing validation requests on an arbitrary worker is a
> great example of a complication for heterogeneous clusters that we're not
> ready to solve.
>
> > We can introduce
> > a new path under this for version (/connector/connector-name/version), but
> > perhaps adding this as part of the status is a valid alternative.
>
> I think either is fine. Adding a new field to the existing endpoints is
> fine, and shouldn't require special compatibility such as a query
> parameter.
>
> > Also, to go further I think
> > version information for tasks could also be available,
>
> I completely agree. The version of each connector and each task should be
> visible individually, since they may be different during a rolling upgrade.
>
> Thanks!
> Greg
>
> On Tue, Jun 18, 2024 at 5:18 AM Snehashis 
> wrote:
>
> > Hi Greg, Chris
> >
> > Thanks for the in-depth discussion, I have a couple of discussion points
> > and would like your thoughts on this.
> >
> > 1) One concern I have with the new addition of 'soft' and 'hard' version
> > requirements is that there could be a mismatch in the plugin version that
> > two different tasks are running, if a soft requirement is provided and the
> > nodes of a multi cluster deployment are not in sync w.r.t. the plugin versions
> > that they are configured with. Note that if my assumptions are correct then
> > this can happen with the existing framework as well, or is there some
> > safeguard from this happening? So far, we could have pointed to the
> > misconfigured cluster configuration and somewhat deferred this problem to
> > something outside of connect runtime. With this feature in place perhaps
> > the expectation is more on connect to not be running with such
> > inconsistency, especially if a connector version is specified. This is also
> > a problem with validation if different clusters have different
> > configurations, as IIRC validations are local to the worker which receives
>

Re: [DISCUSS] KIP-891: Running multiple versions of Connector plugins

2024-06-18 Thread Snehashis
Hi Greg, Chris

Thanks for the in-depth discussion, I have a couple of discussion points
and would like your thoughts on this.

1) One concern I have with the new addition of 'soft' and 'hard' version
requirements is that there could be a mismatch in the plugin version that
two different tasks are running, if a soft requirement is provided and the
nodes of a multi cluster deployment are not in sync w.r.t. the plugin versions
that they are configured with. Note that if my assumptions are correct then
this can happen with the existing framework as well, or is there some
safeguard from this happening? So far, we could have pointed to the
misconfigured cluster configuration and somewhat deferred this problem to
something outside of connect runtime. With this feature in place perhaps
the expectation is more on connect to not be running with such
inconsistency, especially if a connector version is specified. This is also
a problem with validation if different clusters have different
configurations, as IIRC validations are local to the worker which receives
the REST call for validate. So, we might be validating with a certain
version which is different from the one that will be used to create the
connector and tasks. Again, this is likely how the current state is, but
perhaps such inconsistencies warrant a deeper look with the addition of
this feature. The problems associated with them can be somewhat insidious
and hard to diagnose.

2) There was some discussion on the need for a new REST endpoint to provide
information on the versions of running connectors, and I think adding this
information via REST is a valuable addition. The way I see it, the version
is an intrinsic property of an instance of a running connector and hence
this should be part of the set of APIs under /connector/
(the /connectors API should also have this information as it is an
amalgamation of all the individual connector information). We can introduce
a new path under this for version (/connector/connector-name/version), but
perhaps adding this as part of the status is a valid alternative. This is
mentioned as a rejected alternative right now. Also, to go further, I think
version information for tasks could also be available. Especially if we
choose not to address the pitfalls discussed in my point 1), this will
at least provide admins a quick and easy way to determine if such an
inconsistent state exists in any of the connectors.
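
The mismatch described in point 1) can be illustrated with a small Python
sketch. It assumes each worker resolves an unspecified version to its own
local latest, while an exact version either resolves identically or is
unavailable. The worker names and helper functions are hypothetical and do
not reflect the Connect runtime's actual code.

```python
def resolve(installed, requested=None):
    """Version one worker would use: the exact one if requested, else its local latest.

    Returns None when an exact version is requested but not installed locally.
    Only purely numeric dotted versions are handled in this sketch.
    """
    if requested is not None:
        return requested if requested in installed else None
    return max(installed, key=lambda v: tuple(int(p) for p in v.split(".")))


def resolve_on_all(workers, requested=None):
    """Map each worker name to the version it would resolve for the same config."""
    return {name: resolve(installed, requested) for name, installed in workers.items()}


# Hypothetical cluster where only one worker has received the newer plugin.
workers = {"worker-a": ["1.0.0", "1.1.0"], "worker-b": ["1.0.0"]}
```

Here resolve_on_all(workers) yields different versions per worker, so a
validation served by worker-a may not match what a task on worker-b runs,
whereas resolve_on_all(workers, "1.0.0") is uniform across both.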

Thanks again for reviving my original KIP and working to improve it.
Looking forward to your thoughts on the points mentioned above.
Regards
Snehashis


On Wed, May 29, 2024 at 9:59 PM Chris Egerton 
wrote:

> Hi Greg,
>
> First, an apology! I mistakenly assumed that each plugin appeared only once
> in the responses from GET /connector-plugins?connectorsOnly=false. Thank
> you for correcting me and pointing out that all versions of each plugin
> appear in that response, which does indeed satisfy my desire for users to
> discover this information in at most two REST requests (and in fact, does
> it in only one)!
>
> And secondly, with the revelation about recommenders, I agree that it's
> best to leave the "version" property out of the lists of properties
> returned from the GET /connector-plugins//config endpoint.
>
> With those two points settled, I think the only unresolved item is the
> small change to version parsing added to the KIP (where raw version numbers
> are treated as an exact match, instead of a best-effort match with a
> fallback on the default version). If the KIP is updated with that then I'd
> be ready to vote on it.
>
> Cheers,
>
> Chris
>
> On Wed, May 29, 2024 at 12:00 PM Greg Harris wrote:
>
> > Hey Chris,
> >
> > Thanks for your thoughts.
> >
> > > Won't it still only expose the
> > > latest version for each plugin, instead of the range of versions available?
> >
> > Here is a snippet of the current output of the GET
> > /connector-plugins?connectorsOnly=false endpoint, after I installed two
> > versions of the debezium PostgresConnector:
> >
> >   {
> >     "class": "io.debezium.connector.postgresql.PostgresConnector",
> >     "type": "source",
> >     "version": "2.0.1.Final"
> >   },
> >   {
> >     "class": "io.debezium.connector.postgresql.PostgresConnector",
> >     "type": "source",
> >     "version": "2.6.1.Final"
> >   },
> >
> > I think this satisfies your requirement to learn about all plugins and all
> > versions in two or fewer REST calls.
> >
> > I tried to get an example of the output of `/config` by hardcoding the
> > Recommender, and realized that Recom

Re: [DISCUSS] KIP-891: Running multiple versions of Connector plugins

2024-05-13 Thread Snehashis
Hi Greg,

That is much appreciated. No complaints on the additional scope, I will
set aside some time to work on this once we have approval.

Thanks
Snehashis

On Fri, May 10, 2024 at 9:28 PM Greg Harris 
wrote:

> Hey Snehashis,
>
> I'm glad to hear you're still interested in this KIP!
> I'm happy to let you drive this, and I apologize for increasing the
> scope of work so drastically. To make up for that, I'll volunteer to
> be the primary PR reviewer to help get this done quickly once the KIP
> is approved.
>
> Thanks,
> Greg
>
>
> On Fri, May 10, 2024 at 3:51 AM Snehashis 
> wrote:
> >
> > Hi Greg,
> >
> > Thanks for the follow-up to my original KIP. I am in favour of the
> > additions made to expand its scope; the addition of range versions
> > specifically makes a lot of sense.
> >
> > Apologies for not having publicly worked on this KIP for a long time. The
> > original work was done when the move to service loading was in discussion
> > and I wanted to loop back to this only after that work was completed. Post
> > its conclusion, I have not been able to take this up due to other
> > priorities. If it's okay with you, I would still like to get this
> > implemented myself, including the additional scope.
> >
> > Thanks and regards
> > Snehashis
> >
> > On Fri, May 10, 2024 at 12:45 AM Greg Harris wrote:
> >
> > > Hi all,
> > >
> > > I'd like to reboot the discussion on KIP-891:
> > >
> > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-891%3A+Running+multiple+versions+of+Connector+plugins
> > >
> > > I've made some changes, most notably:
> > >
> > > 1. Specifying versions for all plugins in Connector configs
> > > (converters, header converters, transforms, and predicates) not just
> > > connectors & tasks
> > > 2. Specifying a range of versions instead of an exact match
> > > 3. New metrics to observe what versions are in-use
> > >
> > > Thanks to Snehashis for the original KIP idea!
> > >
> > > Thanks,
> > > Greg
> > >
> > > > On Tue, Jan 2, 2024 at 11:49 AM Greg Harris wrote:
> > > >
> > > > Hi Snehashis,
> > > >
> > > > Thank you for the KIP! This is something I've wanted for a long time.
> > > >
> > > > I know the discussion has gone cold, are you still interested in
> > > > pursuing this feature? I'll make time to review the KIP if you are
> > > > still accepting comments.
> > > >
> > > > Thanks,
> > > > Greg
> > > >
> > > > > On Tue, Nov 22, 2022 at 12:29 PM Snehashis wrote:
> > > > > >
> > > > > > Thanks for the points Sagar.
> > > > > >
> > > > > > > 1) Should we update the GET /connectors endpoint to include the version of
> > > > > > > the plugin that is running? It could be useful to figure out the version of
> > > > > > > the plugin or I am assuming it gets returned by the expand=info call?
> > > > > >
> > > > > > I think this is good to have and a possible future enhancement. The version
> > > > > > info will be present in the config of the connector if the user has
> > > > > > specified the version. Otherwise it is the latest version, which the user
> > > > > > can find out from the connector-plugin endpoint. The information can be
> > > > > > introduced to the response of the GET /connectors endpoint itself; however,
> > > > > > the most ideal way of doing this would be to get the currently running
> > > > > > instance of the connector and get the version directly from there. This is
> > > > > > slightly tricky as the connector could be running in a different node.
> > > > > > One way to do this would be to persist the version information in the
> > > > > > status backing store during instantiation of the connector. It requires
> > > > > > some more thought and since the version is part of the configs if provided
> > > > > > and evident otherwise, I have not included it in this KIP.
> > > > > >
> > > > > > > 2) I am not aware of this and hence asking, can 2 connectors with different
> > > > > > versions have the same name? Doe

Re: [DISCUSS] KIP-891: Running multiple versions of Connector plugins

2024-05-10 Thread Snehashis
Hi Greg,

Thanks for the follow-up to my original KIP. I am in favour of the
additions made to expand its scope; the addition of range versions
specifically makes a lot of sense.

Apologies for not having publicly worked on this KIP for a long time. The
original work was done when the move to service loading was in discussion
and I wanted to loop back to this only after that work was completed. Post
its conclusion, I have not been able to take this up due to other
priorities. If it's okay with you, I would still like to get this
implemented myself, including the additional scope.

Thanks and regards
Snehashis

On Fri, May 10, 2024 at 12:45 AM Greg Harris 
wrote:

> Hi all,
>
> I'd like to reboot the discussion on KIP-891:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-891%3A+Running+multiple+versions+of+Connector+plugins
>
> I've made some changes, most notably:
>
> 1. Specifying versions for all plugins in Connector configs
> (converters, header converters, transforms, and predicates) not just
> connectors & tasks
> 2. Specifying a range of versions instead of an exact match
> 3. New metrics to observe what versions are in-use
>
> Thanks to Snehashis for the original KIP idea!
>
> Thanks,
> Greg
>
> On Tue, Jan 2, 2024 at 11:49 AM Greg Harris wrote:
> >
> > Hi Snehashis,
> >
> > Thank you for the KIP! This is something I've wanted for a long time.
> >
> > I know the discussion has gone cold, are you still interested in
> > pursuing this feature? I'll make time to review the KIP if you are
> > still accepting comments.
> >
> > Thanks,
> > Greg
> >
> > On Tue, Nov 22, 2022 at 12:29 PM Snehashis wrote:
> > >
> > > Thanks for the points Sagar.
> > >
> > > > 1) Should we update the GET /connectors endpoint to include the version of
> > > > the plugin that is running? It could be useful to figure out the version of
> > > > the plugin or I am assuming it gets returned by the expand=info call?
> > >
> > > I think this is good to have and a possible future enhancement. The version
> > > info will be present in the config of the connector if the user has
> > > specified the version. Otherwise it is the latest version, which the user
> > > can find out from the connector-plugin endpoint. The information can be
> > > introduced to the response of the GET /connectors endpoint itself; however,
> > > the most ideal way of doing this would be to get the currently running
> > > instance of the connector and get the version directly from there. This is
> > > slightly tricky as the connector could be running in a different node.
> > > One way to do this would be to persist the version information in the
> > > status backing store during instantiation of the connector. It requires
> > > some more thought and since the version is part of the configs if provided
> > > and evident otherwise, I have not included it in this KIP.
> > >
> > > > 2) I am not aware of this and hence asking, can 2 connectors with different
> > > > versions have the same name? Does the plugin isolation allow this? This
> > > > could have a bearing when using the lifecycle endpoints for connectors like
> > > > DELETE etc.
> > >
> > > All connectors in a cluster need to have unique connector names
> > > regardless of what version of the plugin the connector is running
> > > underneath. This is something enforced by the connect runtime itself. All
> > > connect CRUD operations are keyed on the connector name so there will not
> > > be an issue.
> > >
> > > Regards
> > > Snehashis
> > >
> > > On Tue, Nov 22, 2022 at 3:16 PM Sagar 
> wrote:
> > >
> > > > > Hey Snehashis,
> > > >
> > > > Thanks for the KIP. It looks like a very useful feature. Couple of
> > > > small-ish points, let me know what you think:
> > > >
> > > > > 1) Should we update the GET /connectors endpoint to include the
> > > > > version of the plugin that is running? It could be useful to figure
> > > > > out the version of the plugin or I am assuming it gets returned by
> > > > > the expand=info call?
> > > > > 2) I am not aware of this and hence asking, can 2 connectors with
> > > > > different versions have the same name? Does the plugin isolation
> > > > > allow this? This
> > > > could have a bearing when using the lifecycle endpoints 

Re: [DISCUSS] KIP-891: Running multiple versions of a connector.

2022-11-22 Thread Snehashis
Thanks for the points Sagar.

> 1) Should we update the GET /connectors endpoint to include the version of
> the plugin that is running? It could be useful to figure out the version of
> the plugin or I am assuming it gets returned by the expand=info call?

I think this is good to have and a possible future enhancement. The version
info will be present in the config of the connector if the user has
specified the version. Otherwise it is the latest version, which the user
can find from the connector-plugins endpoint. The information could be
added to the response of the GET /connectors endpoint itself; however, the
ideal way of doing this would be to get the currently running instance of
the connector and read the version directly from there. This is slightly
tricky as the connector could be running on a different node. One way to do
this would be to persist the version information in the status backing
store during instantiation of the connector. It requires some more thought,
and since the version is part of the configs if provided and evident
otherwise, I have not included it in this KIP.
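
To make the "part of the configs if provided and evident otherwise" point
concrete, here is a rough Python sketch of how a client could work out the
version a connector is expected to run. This is not Connect runtime code:
the function name is hypothetical, "connector.version" is the property name
being discussed in this thread, and the numeric-only comparison is a
simplifying assumption.

```python
def effective_version(connector_config, installed_versions):
    """Return the pinned version if the config has one, else the newest installed."""
    pinned = connector_config.get("connector.version")
    if pinned is not None:
        return pinned
    # Assumption for this sketch: versions are plain dotted integers ("1.10.0"),
    # so they can be compared as tuples of ints.
    return max(installed_versions,
               key=lambda v: tuple(int(part) for part in v.split(".")))
```

A client would pass the connector's config and the versions listed for the
plugin; the runtime itself would still be the source of truth for what is
actually running.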

> 2) I am not aware of this and hence asking, can 2 connectors with different
> versions have the same name? Does the plugin isolation allow this? This
> could have a bearing when using the lifecycle endpoints for connectors like
> DELETE etc.

All connectors in a cluster need to have unique connector names
regardless of what version of the plugin the connector is running
underneath. This is something enforced by the connect runtime itself. All
connect CRUD operations are keyed on the connector name so there will not
be an issue.
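
As a toy illustration of why the name alone is enough for lifecycle calls,
consider a registry keyed only on the connector name. This is a hypothetical
Python sketch, not how the Connect runtime is implemented; the class and
method names are made up for the example.

```python
class ConnectorRegistry:
    """Toy registry: the connector name is the only key, whatever the plugin version."""

    def __init__(self):
        self._configs = {}

    def create(self, name, config):
        # A second connector with the same name is rejected even if it asks
        # for a different plugin version.
        if name in self._configs:
            raise ValueError(f"connector {name!r} already exists")
        self._configs[name] = dict(config)

    def delete(self, name):
        # DELETE needs only the name; the version is just part of the stored config.
        return self._configs.pop(name)
```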

Regards
Snehashis

On Tue, Nov 22, 2022 at 3:16 PM Sagar  wrote:

> Hey Snehashis,
>
> Thanks for the KIP. It looks like a very useful feature. Couple of
> small-ish points, let me know what you think:
>
> 1) Should we update the GET /connectors endpoint to include the version of
> the plugin that is running? It could be useful to figure out the version of
> the plugin or I am assuming it gets returned by the expand=info call?
> 2) I am not aware of this and hence asking, can 2 connectors with different
> versions have the same name? Does the plugin isolation allow this? This
> could have a bearing when using the lifecycle endpoints for connectors like
> DELETE etc.
>
> Thanks!
> Sagar.
>
>
> On Tue, Nov 22, 2022 at 2:10 PM Ashwin 
> wrote:
>
> > Hi Snehasis,
> >
> > > IIUC (please correct me if I am wrong here), what you highlighted above,
> > > is a versioning scheme for a connector config for the same connector
> > > (and not different versions of a connector plugin).
> >
> > Sorry for not being more precise in my wording - I meant registering
> > versions of schema for connector config.
> >
> > Let's take the example of a fictional connector which uses a fictional AWS
> > service.
> >
> > Fictional Connector Config schema version:2.0
> > ---
> > {
> >   "$schema": "http://json-schema.org/draft-04/schema#",
> >   "type": "object",
> >   "properties": {
> >     "name": {
> >       "type": "string"
> >     },
> >     "schema_version": {
> >       "type": "string"
> >     },
> >     "aws_access_key": {
> >       "type": "string"
> >     },
> >     "aws_secret_key": {
> >       "type": "string"
> >     }
> >   },
> >   "required": [
> >     "name",
> >     "schema_version",
> >     "aws_access_key",
> >     "aws_secret_key"
> >   ]
> > }
> >
> > Fictional Connector config schema version:3.0
> > ---
> > {
> >   "$schema": "http://json-schema.org/draft-04/schema#",
> >   "type": "object",
> >   "properties": {
> >     "name": {
> >       "type": "string"
> >     },
> >     "schema_version": {
> >       "type": "string"
> >     },
> >     "iam_role": {
> >       "type": "string"
> >     }
> >   },
> >   "required": [
> >     "name",
> >     "schema_version",
> >     "iam_role"
> >   ]
> > }
> >
> > The connector which supports Fictional config schema 2.0 will validate
> > the access key and secret key.
> > Whereas a connector which supports config with schema version 3.0 will
> > only

Re: [DISCUSS] KIP-891: Running multiple versions of a connector.

2022-11-22 Thread Snehashis
Thanks for the explanation Ashwin.

This is an interesting notion, and it is something many connectors
implicitly do anyway. Several connectors have different ways of
interpreting the configuration they are given. Often the user has some
control over how the provided configuration is used, through omission of
configs, boolean flags that activate/deactivate certain configs, etc. One
could argue that this makes the configuration more verbose and monolithic.
However, the alternative proposal of having multiple registered schemas
only really seems worthwhile if the runtime has the ability to alter the
functionality of a connector. There would need to be some way of registering
multiple functionalities, one for each configuration type. Otherwise, if the
runtime simply passes the configuration on to the connector, regardless of
which schema version it belongs to, and delegates the responsibility of
picking the functionality to the connector itself, the runtime adds very
little by registering schemas. Multiple connector versions implicitly define
different configs and functionalities, and hence the ability to run
different versions of the connector itself seems like a more elegant
solution to this problem.

I also don't think multiple configurations are the only use case for
running different versions of a connector. There could be internal changes
to a connector that do not involve any config changes. A change that
targets a particular enhancement may be incompatible with the older
behaviour. Right now, to keep such a change backwards compatible we would
have to gate it behind a connector config (or a different schema and
functionality registration). Otherwise the user is forced to keep using the
older connector until they can upgrade. The problem is that if there are
multiple such enhancements (and only one is a breaking change), the user
misses out on all the other enhancements. It is simpler for the user to have
the ability to run both versions of the connector.
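
To illustrate the "delegating to the connector" case, here is a rough Python
sketch using the field names from your fictional schemas. It is only meant
to show where the branching ends up; the names REQUIRED_BY_SCHEMA and
missing_fields are hypothetical and nothing like this exists in the runtime.

```python
# Required fields per schema version, copied from the fictional example.
REQUIRED_BY_SCHEMA = {
    "2.0": {"name", "schema_version", "aws_access_key", "aws_secret_key"},
    "3.0": {"name", "schema_version", "iam_role"},
}

def missing_fields(config):
    """Return the required fields absent from config for its declared schema version."""
    required = REQUIRED_BY_SCHEMA[config["schema_version"]]
    return sorted(required - set(config))
```

Every new schema version adds another branch that a single connector build
has to carry, whereas with versioned plugins each version only needs to
know its own config.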

On Tue, Nov 22, 2022 at 2:11 PM Ashwin  wrote:

> Hi Snehasis,
>
> > IIUC (please correct me if I am wrong here), what you highlighted above,
> > is a versioning scheme for a connector config for the same connector (and
> > not different versions of a connector plugin).
>
> Sorry for not being more precise in my wording -  I meant registering
> versions of schema for connector config.
>
> Let's take the example of a fictional connector which uses a fictional AWS
> service.
>
> Fictional Connector Config schema version:2.0
> ---
> {
>   "$schema": "http://json-schema.org/draft-04/schema#",
>   "type": "object",
>   "properties": {
>     "name": {
>       "type": "string"
>     },
>     "schema_version": {
>       "type": "string"
>     },
>     "aws_access_key": {
>       "type": "string"
>     },
>     "aws_secret_key": {
>       "type": "string"
>     }
>   },
>   "required": [
>     "name",
>     "schema_version",
>     "aws_access_key",
>     "aws_secret_key"
>   ]
> }
>
> Fictional Connector config schema version:3.0
> ---
> {
>   "$schema": "http://json-schema.org/draft-04/schema#",
>   "type": "object",
>   "properties": {
>     "name": {
>       "type": "string"
>     },
>     "schema_version": {
>       "type": "string"
>     },
>     "iam_role": {
>       "type": "string"
>     }
>   },
>   "required": [
>     "name",
>     "schema_version",
>     "iam_role"
>   ]
> }
>
> The connector which supports Fictional config schema 2.0 will validate the
> access key and secret key.
> Whereas a connector which supports config with schema version 3.0 will only
> validate the IAM role.
>
> This is the alternative which I wanted to suggest. Each plugin will
> register the schema versions of connector config which it supports.
>
> The plugin paths may optionally differ, i.e. we don't have to add a new
> plugin path to support a new schema version.
>
> Thanks,
> Ashwin
>
> On Tue, Nov 22, 2022 at 12:47 PM Snehashis 
> wrote:
>
> > Thanks for the input Ashwin.
> >
> > > 1. Can you elaborate on the rejected alternatives ? Suppose connector
> > > config is versioned and has a schema. Then a single plugin (whose
> > > dependencies have not changed) can handle multiple config 

Re: [DISCUSS] KIP-891: Running multiple versions of a connector.

2022-11-21 Thread Snehashis
Thanks for the input Ashwin.

> 1. Can you elaborate on the rejected alternatives ? Suppose connector
> config is versioned and has a schema. Then a single plugin (whose
> dependencies have not changed) can handle multiple config versions for the
> same connector class.

IIUC (please correct me if I am wrong here), what you highlighted above is
a versioning scheme for the config of a single connector (and not
different versions of a connector plugin). That is a somewhat tangential
problem. While it is definitely a useful feature to have, e.g. as a log of
what changes were made to the config over time, which might make rollbacks
easier, it is not the focus here. Here, by version we mean the underlying
version of the plugin that a given connector configuration should use.
Perhaps it is better to rename the parameter from connector.version to
connector.plugin.version or plugin.version if this was confusing. wdyt?

>  2. Any plans to support assisted migration e.g if a user invokes "POST
> connector/config?migrate=latest", the latest version __attempts__ to
> transform the existing config to the newer version. This would require
> adding a method like "boolean migrate(Version fromVersion)" to the
> connector interface.

This is an enhancement we can think of doing in the future. Users can simply
do a PUT call with the updated config which has the updated version number.
The assisted mode could be handy as the user does not need to know the
config, but beyond this it does not seem to justify its existence.

Regards
Snehashis

On Tue, Nov 22, 2022 at 10:50 AM Ashwin 
wrote:

> Hi Snehasis,
>
> This is a really useful feature and thanks for initiating this discussion.
>
> I had the following questions -
>
>
> 1. Can you elaborate on the rejected alternatives ? Suppose connector
> config is versioned and has a schema. Then a single plugin (whose
> dependencies have not changed) can handle multiple config versions for the
> same connector class.
>
> 2. Any plans to support assisted migration e.g if a user invokes "POST
> connector/config?migrate=latest", the latest version __attempts__ to
> transform the existing config to the newer version. This would require
> adding a method like "boolean migrate(Version fromVersion)" to the
> connector interface.
>
> Thanks,
> Ashwin
>
> On Mon, Nov 21, 2022 at 2:27 PM Snehashis 
> wrote:
>
> > Hi all,
> >
> > I'd like to start a discussion thread on KIP-891: Running multiple
> > versions of a connector.
> >
> > The KIP aims to add the ability for the connect runtime to run multiple
> > versions of a connector.
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-891%3A+Running+multiple+versions+of+a+connector
> >
> > Please take a look and let me know what you think.
> >
> > Thank you
> > Snehashis Pal
> >
>


Re: [DISCUSS] KIP-891: Running multiple versions of a connector.

2022-11-21 Thread Snehashis
Hi Mickael. Thanks for your input. Addressing the points you mentioned
below.

> 1) Can you explain how this would work with the GET
> /{pluginName}/config endpoint? How do you specify a version for a
> connector?

This API returns the set of configurations for a given connector. Since the
configurations can change between versions, the endpoint will accept a
user-specified version so that the correct configs are returned. The
version is added as a query parameter, for example:
 /S3SinkConnector/config?version=v1.1.1
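
For anyone scripting against this, a small Python sketch of building the
request path with the optional version parameter. The helper name is
hypothetical, and the path shape simply mirrors the example above rather
than any final API.

```python
from urllib.parse import quote, urlencode

def plugin_config_path(plugin_name, version=None):
    """Build the /{pluginName}/config path, adding ?version=... only when given."""
    path = f"/{quote(plugin_name, safe='')}/config"
    if version is not None:
        path += "?" + urlencode({"version": version})
    return path
```

Leaving the version out keeps the request identical to what clients send
today.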

> 2) Some connectors come bundled with transformations (for example
> Debezium). How would multiple versions of a transformation be handled?

The version of the transformations bundled with a particular connector
version will be used when the connector is run with the corresponding
version number. There will be implicit isolation between the two
transformations as they are part of two separate plugins and will be loaded
using different plugin classloaders during connector creation.

> 3) You mention the latest version will be picked by default if not
> specified. The version() method returns a string and currently
> enforces no semantics on the value it returns. Can you clarify the new
> expected semantics and explain how versions will be compared
> (alphabetical, semantics versioning, something else?)

The plugin loading mechanism already compares connector versions (although
new connectors are currently created only with the latest version). The
comparison between versions will remain the same as it is today and is done
using the Maven artifact versioning library. It is a generic versioning
scheme that supports semantic and alphabetical versions and combinations of
the two, with support for additional modifiers (like alpha, beta, release
and snapshot build versions). Please refer to this javadoc for the full
comparison method:
https://maven.apache.org/ref/3.5.2/maven-artifact/apidocs/org/apache/maven/artifact/versioning/ComparableVersion.html
I do not think it's necessary to enforce new semantics on the version. IMO
the existing versioning scheme is appropriate and flexible enough for all
code versioning methods.
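
To give a feel for the kind of ordering involved, here is a deliberately
simplified Python sketch. It is not the Maven ComparableVersion algorithm
(see the javadoc above for that); it only handles dotted numbers plus a
single optional qualifier, and the qualifier ranking in it is an assumption
of the sketch.

```python
import re

# Assumed ordering for this sketch: pre-release qualifiers sort below a plain release.
QUALIFIER_RANK = {"alpha": 0, "beta": 1, "snapshot": 2, "": 3}

def sort_key(version):
    """Map a version string to a tuple that sorts numerically, then by qualifier."""
    main, _, qualifier = version.lower().partition("-")
    numbers = [int(part) for part in re.findall(r"\d+", main)]
    # Drop trailing zeros so that "1.0" and "1.0.0" compare as equal.
    while numbers and numbers[-1] == 0:
        numbers.pop()
    # Unknown qualifiers are treated like a plain release in this sketch.
    return tuple(numbers), QUALIFIER_RANK.get(qualifier, 3)

def latest(versions):
    """Pick the version that sorts highest under sort_key."""
    return max(versions, key=sort_key)
```

The point is mainly that "1.10.0" sorts above "1.9.0", which a plain string
comparison would get wrong.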


On Mon, Nov 21, 2022 at 8:37 PM Mickael Maison 
wrote:

> Hi,
>
> Thanks for the KIP, this is something that could be really useful!
>
> 1) Can you explain how this would work with the GET
> /{pluginName}/config endpoint? How do you specify a version for a
> connector?
>
> 2) Some connectors come bundled with transformations (for example
> Debezium). How would multiple versions of a transformation be handled?
>
> 3) You mention the latest version will be picked by default if not
> specified. The version() method returns a string and currently
> enforces no semantics on the value it returns. Can you clarify the new
> expected semantics and explain how versions will be compared
> (alphabetical, semantics versioning, something else?)
>
> Thanks,
> Mickael
>
>
> On Mon, Nov 21, 2022 at 9:57 AM Snehashis 
> wrote:
> >
> > Hi all,
> >
> > I'd like to start a discussion thread on KIP-891: Running multiple
> > versions of a connector.
> >
> > The KIP aims to add the ability for the connect runtime to run multiple
> > versions of a connector.
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-891%3A+Running+multiple+versions+of+a+connector
> >
> > Please take a look and let me know what you think.
> >
> > Thank you
> > Snehashis Pal
>


[DISCUSS] KIP-891: Running multiple versions of a connector.

2022-11-21 Thread Snehashis
Hi all,

I'd like to start a discussion thread on KIP-891: Running multiple versions
of a connector.

The KIP aims to add the ability for the connect runtime to run multiple
versions of a connector.

https://cwiki.apache.org/confluence/display/KAFKA/KIP-891%3A+Running+multiple+versions+of+a+connector

Please take a look and let me know what you think.

Thank you
Snehashis Pal


[jira] [Created] (KAFKA-14410) Allow connect runtime to run multiple versions of a connector.

2022-11-21 Thread Snehashis Pal (Jira)
Snehashis Pal created KAFKA-14410:
-------------------------------------

             Summary: Allow connect runtime to run multiple versions of a connector.
                 Key: KAFKA-14410
                 URL: https://issues.apache.org/jira/browse/KAFKA-14410
             Project: Kafka
          Issue Type: Improvement
          Components: KafkaConnect
            Reporter: Snehashis Pal
            Assignee: Snehashis Pal


Connect Runtime should support running multiple versions of the same
connector. Please refer to
[KIP-891|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=235834793]
for more information on the problem and the proposed solution.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Requesting Access to contribute to the wiki.

2022-10-17 Thread Snehashis
Hi team,

I would like to request contributor access to the project, to be able to
write KIPs and assign tickets to me.

Wiki ID: snehashisp
JIRA ID: snehashisp

Thank you
Regards
Snehashis