Hi Greg,

On Tue, Jan 24, 2023 at 2:48 AM Greg Harris
<greg.har...@aiven.io.invalid> wrote:
>
> Federico,
>
> Thanks for taking a look!
>
> > I was just wondering if we should use a better script name like
> > "connect-convert-to-service-provider.sh" or something like this
>
> I agree that the current name is not ideal, while the current name
> describes part of what it is doing, it does not describe all of what it is
> doing, or why that is important to the caller.
> I looked around to see what style the other Kafka commands use, and it
> seems like there's not a standard pattern. Some of the scripts are
> noun-phrases, and some are verb-phrases.
> Some take commands like `create`, `format`, etc, and some take options like
> `--generate`, `--execute`, `--verify`.
> I've updated the command to `bin/connect-plugin-path.sh
> (list|add-manifests|remove-manifests) --worker-config <worker-config>`, let
> me know if that's better or worse than `bin/connect-scan-plugin-path.sh`.
>
> > maybe add a --dry-run option.
>
> Thanks for the suggestion, I've updated the KIP to include a --dry-run
> option with some typical semantics.

This is much better. Thanks!

I think it's better to not deprecate the add-manifests and
remove-manifests script sub-commands. When we will remove the
deprecated plugin discovery modes, one may still have the need to
convert an old connector release, maybe because it's the only version
compatible with a third-party external system.

>
> Chris,
>
> I'm glad you liked the personas! Since this KIP requires others' help to
> improve the ecosystem, I needed a way to make sure everyone knows what role
> they have to play. I hope I've accomplished that.
>
> > 1. Will workers' resiliency to packaging issues be affected when the new
> > scanning behavior is enabled? For example, will a single poorly-packaged
> > plugin prevent other correctly-packaged plugins from being loaded, or
> > worse, cause the worker to fail? And either way, are there any details
> that
> > we can call out to back up any guarantees we want to provide on that
> front?
>
> The current behavior when encountering packaging issues seems to be to log
> errors and continue, and that will certainly continue for the ONLY_SCAN and
> HYBID_WARN modes.
> As written, the HYBRID_FAIL mode breaks from this trend and will throw
> exceptions that crash the worker only when the connectors are missing from
> serviceloader discovery.
> The behavior for SERVICE_LOAD is not currently defined in the KIP. I think
> we can either keep the log-and-continue behavior whenever possible, or
> begin to surface errors in a fail-fast model.
> And if we can't choose between those two, perhaps we can add
> SERVICE_LOAD_FAIL_FAST, or a plugin.path.discovery.fail.fast=true/false to
> allow users to configure the worker to fail to start up when plugin.path
> issues are present.
>
> With regards to incorrectly packaged connectors affecting other correctly
> packaged ones: I think one of the current behaviors of the Reflections
> implementation is that certain exceptions can prevent discovery of other
> unrelated plugins on the path.
> I think that this may be able to be improved in all modes, and doesn't need
> the ServiceLoader mechanism to resolve. I'll explore this some more and
> pursue a fix outside of this KIP.
> The new mechanism should be as good or better than the Reflections
> mechanism in this regard. I am not sure if it is necessary to commit to the
> error handling behavior as part of the public API.
>
> > 2. I think the description of the "--plugin-location" flag may need some
> > updating? In the middle of the list of bullet points there's "The value of
> > this argument will be a single plugin, which can be any of the following:"
> > but the following points don't appear to be related.
>
> Thanks, I think that was a copy-paste error. I have updated the KIP.
>
> > 3. (Nit) Could we rename the migration script something like
> > "connect-migrate-plugin-path" or even just "connect-plugin-path" with a
> > "migrate" subcommand?
>
> Yes, you and Federico had the same idea :)
> Please see my feedback above regarding naming.
>
> > 4. It's noted that "If a plugin class is removed from the path, the
> > corresponding shim manifest should also be removed". Did you consider
> > adding behavior (possibly guarded behind a CLI flag) to find and remove
> > manifests for nonexistent plugins?
>
> When writing that originally, I thought that the script would only remove
> manifests that it had generated, as filtered by some in-file comment or by
> jar filename, which would be an implementation detail.
> If we were to add a CLI flag to also enable/disable the removal behavior,
> we could eliminate that implementation detail and make the removal an
> all-or-nothing part of the script.
>
> I updated the script to accept a `remove-manifests` subcommand which I
> think captures your suggestion. Therefore, a full sync would include two
> invocations: an `add-manifests` and `remove-manifests`.
>
> > 5. What's the concrete plan for updating the plugins that ship OOTB with
> > Connect? Will they be given a service loader manifest, a module info file,
> > or both?
>
> At a minimum, we need to add service loader manifests in this KIP to be
> compliant with the new discovery model.
> This is because Kafka 3.x supports Java 8, and that version of the
> ServiceLoader can only read manifest files, and not module-info files.
> Beyond that, I think it becomes a question of scope, and whether we should
> add module-info files for all of the Connect components.
> I was not planning on doing so within this KIP, and would prefer to leave
> that for a follow-up KIP.
>
> I have clarified this in the KIP.
>
> > 6. In the HYBRID_WARN mode, what precedence will we give to plugins that
> > are found by both the legacy scanning logic and the new service
> > loader/module info loading logic?
>
> In HYBRID_WARN, the control-flow should be the same as SCAN_ONLY, so the
> same classes are discovered in the same order.
> All of the classes that are found via the ServiceLoader mechanism should
> only be used by the runtime to generate warnings.
> I believe that by the nature of the Reflections library, all instances
> which are discoverable by the ServiceLoader should be discoverable by
> Reflections, and so we don't need to merge the two discovery methods or
> handle precedence.
>
> > 7. It's a bit strange to immediately deprecate config property values and
> > the migration script in their first release. IMO as long as we have the
> > migration script in the project, it doesn't make sense to deprecate it,
> and
> > until we've had at least one successful release with the new loading logic
> > and some confidence in it, we should not deprecate the older logic. I
> think
> > a more reasonable pace for the update here could be to deprecate the
> legacy
> > scanning logic in 4.0 (changing the default accordingly) and remove it
> > entirely in 5.0.
>
> I understand that it seems a bit silly to introduce something as
> deprecated, but I think that's because 'deprecated' is not the same as
> 'old', and not mutually exclusive with 'new'.
> The way I understand it, marking something as deprecated and the process of
> deprecation is about communicating the direction in which an API is
> evolving, and which methods can be relied upon for the foreseeable future.
> And if we agree that the old mechanism has some sunset date that we just
> don't know yet, we should still communicate to users that the sunset date
> is somewhere in the future.
>
> I've updated the KIP to describe separate follow-on votes for changing the
> default, and removing the scanning behavior, as that was my original intent
> but I don't think that was made clear.
> I also pushed back the deprecation of the migration sub-commands, since
> we'd like to usher people towards the migration script instead of relying
> on the HYBRID modes.
> I kept the decision for deprecating the non-SERVICE_LOAD modes in the
> current vote for now, but if introducing deprecated configurations is too
> unpalatable we can move the deprecation to a later vote.
>
> Thanks,
> Greg
>
> On Mon, Jan 23, 2023 at 12:49 PM Chris Egerton <chr...@aiven.io.invalid>
> wrote:
>
> > Hi Greg,
> >
> > Thanks for the KIP! This part of Connect has been due for an update for a
> > while, it's nice to see it getting some attention. I especially like how
> > the migration plan is broken down into affected personas; makes review
> > easier and, hopefully, makes the feature easier to understand for affected
> > users.
> >
> > Here are my initial thoughts:
> >
> > 1. Will workers' resiliency to packaging issues be affected when the new
> > scanning behavior is enabled? For example, will a single poorly-packaged
> > plugin prevent other correctly-packaged plugins from being loaded, or
> > worse, cause the worker to fail? And either way, are there any details that
> > we can call out to back up any guarantees we want to provide on that front?
> >
> > 2. I think the description of the "--plugin-location" flag may need some
> > updating? In the middle of the list of bullet points there's "The value of
> > this argument will be a single plugin, which can be any of the following:"
> > but the following points don't appear to be related.
> >
> > 3. (Nit) Could we rename the migration script something like
> > "connect-migrate-plugin-path" or even just "connect-plugin-path" with a
> > "migrate" subcommand?
> >
> > 4. It's noted that "If a plugin class is removed from the path, the
> > corresponding shim manifest should also be removed". Did you consider
> > adding behavior (possibly guarded behind a CLI flag) to find and remove
> > manifests for nonexistent plugins?
> >
> > 5. What's the concrete plan for updating the plugins that ship OOTB with
> > Connect? Will they be given a service loader manifest, a module info file,
> > or both?
> >
> > 6. In the HYBRID_WARN mode, what precedence will we give to plugins that
> > are found by both the legacy scanning logic and the new service
> > loader/module info loading logic?
> >
> > 7. It's a bit strange to immediately deprecate config property values and
> > the migration script in their first release. IMO as long as we have the
> > migration script in the project, it doesn't make sense to deprecate it, and
> > until we've had at least one successful release with the new loading logic
> > and some confidence in it, we should not deprecate the older logic. I think
> > a more reasonable pace for the update here could be to deprecate the legacy
> > scanning logic in 4.0 (changing the default accordingly) and remove it
> > entirely in 5.0.
> >
> > Cheers,
> >
> > Chris
> >
> > On Mon, Jan 23, 2023 at 3:21 AM Federico Valeri <fedeval...@gmail.com>
> > wrote:
> >
> > > Hi Greg, this looks like a useful change to me.
> > >
> > > I was just wondering if we should use a better script name like
> > > "connect-convert-to-service-provider.sh" or something like this, and
> > > maybe add a --dry-run option.
> > >
> > > On Tue, Jan 17, 2023 at 8:45 PM Greg Harris
> > > <greg.har...@aiven.io.invalid> wrote:
> > > >
> > > > Hi all!
> > > >
> > > > I'd like to start a discussion about
> > > >
> > >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-898%3A+Modernize+Connect+plugin+discovery
> > > > .
> > > >
> > > > Thanks!
> > > > Greg Harris
> > >
> >

Reply via email to