Hi Greg,

Thanks for the updates. This is looking great. A few minor questions:

1. The description for the list command in the plugin path CLI states that
it'll show "Whether the class is discoverable via scanning". Are there any
cases where a plugin found by the CLI won't be discoverable via scanning?

2. Why are we treating module info files differently than service loader
manifests (where the latter will be removed for not-found plugins on an
opt-out basis, but the former are left unmodified no matter what)? Basing
this off of this sentence in the KIP: "If a plugin declares an
implementation via a module-info.java which is not loadable, the
module-info.java will not be modified."; let me know if I'm misinterpreting.

Cheers,

Chris

On Tue, Feb 7, 2023 at 4:32 PM Greg Harris <greg.har...@aiven.io.invalid>
wrote:

> Chris,
>
> Thanks for the comments!
>
> 1. I've updated the script to sync-manifests [--keep-not-found] with your
> described semantics.
>
> 2. I've pushed all of the deprecation decisions to a follow-up KIP that can
> take place after an intervening release.
> I hope that this feature has a high enough ROI that the direction of the
> migration is obvious to users even without a formal deprecation notice.
> I do think that without a formal deprecation notice that the migration
> scripts may be in use for a long time, particularly for connector
> implementations which are not receiving active maintenance.
>
> 3. Thanks for looking into the ServiceLoader error handling contract.
> From my inspection of the ServiceLoader implementation, it seems to be able
> to recover from most of the poor packaging scenarios that I'm aware of.
> I think with or without the deprecation, operators are free to select the
> ONLY_SCAN mode to work around any errors we don't discover before release.
>
> Thanks,
> Greg
>
> On Mon, Feb 6, 2023 at 8:44 AM Chris Egerton <chr...@aiven.io.invalid>
> wrote:
>
> > Hi Greg,
> >
> > Thanks for the updates. Unless stated below, I agree with your
> responses. A
> > few more thoughts:
> >
> > 1. IMO it's not worth it to have separate commands for adding/removing
> > manifests, mostly because it adds complexity to the tool and might make
> it
> > harder for users to understand. I think a single "update-manifests" or
> > "sync-manifests" command that both adds manifests for found plugins and
> > removes manifests for unfound plugins by default, with a --keep-unfound
> > flag to disable removal on an opt-in basis, would make more sense.
> >
> > 2. I agree with this point you raise about deprecation: "if we agree that
> > the old mechanism has some sunset date that we just don't know yet, we
> > should still communicate to users that the sunset date is somewhere in
> the
> > future." However, I'm not certain that releasing the features proposed in
> > this KIP alone gives us enough confidence in them to sunset the legacy
> > plugin loading logic, which is why I was hoping we could have at least
> one
> > successful release (preferably even two or three) with this feature in
> > order to identify potential gaps in the design before proceeding. This is
> > similar to the strategy we took with incremental rebalancing in Connect,
> > which introduced the "connect.protocol" property with the proposal that
> it
> > could be marked deprecated in the next major release. That strategy
> served
> > us well, since there were a few kinks in the logic for that feature that
> > needed to be worked out, and many users continued to rely on the legacy
> > rebalancing logic as a workaround for those issues.
> >
> > 3. I was wondering about error handling with the service loading
> mechanism
> > because the Javadocs for the Iterator returned by ServiceLoader::iterator
> > [1] state that "If an error is thrown then subsequent invocations of the
> > iterator will make a best effort to locate and instantiate the next
> > available provider, but in general such recovery cannot be guaranteed."
> > Upon further reflection, I agree that it's fine for now to leave details
> on
> > error handling out of the public contract for this new loading mode, but
> > only if we also agree to hold off on deprecating the old loading mode,
> > since there may be cases where worker startup is crippled by an
> > improperly-packaged plugin when using the service loader mechanism for
> its
> > plugins.
> >
> > [1] -
> >
> >
> https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/ServiceLoader.html#iterator()
> >
> > Cheers,
> >
> > Chris
> >
> > On Tue, Jan 24, 2023 at 10:26 AM Federico Valeri <fedeval...@gmail.com>
> > wrote:
> >
> > > Hi Greg,
> > >
> > > On Tue, Jan 24, 2023 at 2:48 AM Greg Harris
> > > <greg.har...@aiven.io.invalid> wrote:
> > > >
> > > > Federico,
> > > >
> > > > Thanks for taking a look!
> > > >
> > > > > I was just wondering if we should use a better script name like
> > > > > "connect-convert-to-service-provider.sh" or something like this
> > > >
> > > > I agree that the current name is not ideal, while the current name
> > > > describes part of what it is doing, it does not describe all of what
> it
> > > is
> > > > doing, or why that is important to the caller.
> > > > I looked around to see what style the other Kafka commands use, and
> it
> > > > seems like there's not a standard pattern. Some of the scripts are
> > > > noun-phrases, and some are verb-phrases.
> > > > Some take commands like `create`, `format`, etc, and some take
> options
> > > like
> > > > `--generate`, `--execute`, `--verify`.
> > > > I've updated the command to `bin/connect-plugin-path.sh
> > > > (list|add-manifests|remove-manifests) --worker-config
> <worker-config>`,
> > > let
> > > > me know if that's better or worse than
> > `bin/connect-scan-plugin-path.sh`.
> > > >
> > > > > maybe add a --dry-run option.
> > > >
> > > > Thanks for the suggestion, I've updated the KIP to include a
> --dry-run
> > > > option with some typical semantics.
> > >
> > > This is much better. Thanks!
> > >
> > > I think it's better to not deprecate the add-manifests and
> > > remove-manifests script sub-commands. When we will remove the
> > > deprecated plugin discovery modes, one may still have the need to
> > > convert an old connector release, maybe because it's the only version
> > > compatible with a third-party external system.
> > >
> > > >
> > > > Chris,
> > > >
> > > > I'm glad you liked the personas! Since this KIP requires others' help
> > to
> > > > improve the ecosystem, I needed a way to make sure everyone knows
> what
> > > role
> > > > they have to play. I hope I've accomplished that.
> > > >
> > > > > 1. Will workers' resiliency to packaging issues be affected when
> the
> > > new
> > > > > scanning behavior is enabled? For example, will a single
> > > poorly-packaged
> > > > > plugin prevent other correctly-packaged plugins from being loaded,
> or
> > > > > worse, cause the worker to fail? And either way, are there any
> > details
> > > > that
> > > > > we can call out to back up any guarantees we want to provide on
> that
> > > > front?
> > > >
> > > > The current behavior when encountering packaging issues seems to be
> to
> > > log
> > > > errors and continue, and that will certainly continue for the
> ONLY_SCAN
> > > and
> > > > HYBID_WARN modes.
> > > > As written, the HYBRID_FAIL mode breaks from this trend and will
> throw
> > > > exceptions that crash the worker only when the connectors are missing
> > > from
> > > > serviceloader discovery.
> > > > The behavior for SERVICE_LOAD is not currently defined in the KIP. I
> > > think
> > > > we can either keep the log-and-continue behavior whenever possible,
> or
> > > > begin to surface errors in a fail-fast model.
> > > > And if we can't choose between those two, perhaps we can add
> > > > SERVICE_LOAD_FAIL_FAST, or a
> plugin.path.discovery.fail.fast=true/false
> > > to
> > > > allow users to configure the worker to fail to start up when
> > plugin.path
> > > > issues are present.
> > > >
> > > > With regards to incorrectly packaged connectors affecting other
> > correctly
> > > > packaged ones: I think one of the current behaviors of the
> Reflections
> > > > implementation is that certain exceptions can prevent discovery of
> > other
> > > > unrelated plugins on the path.
> > > > I think that this may be able to be improved in all modes, and
> doesn't
> > > need
> > > > the ServiceLoader mechanism to resolve. I'll explore this some more
> and
> > > > pursue a fix outside of this KIP.
> > > > The new mechanism should be as good or better than the Reflections
> > > > mechanism in this regard. I am not sure if it is necessary to commit
> to
> > > the
> > > > error handling behavior as part of the public API.
> > > >
> > > > > 2. I think the description of the "--plugin-location" flag may need
> > > some
> > > > > updating? In the middle of the list of bullet points there's "The
> > > value of
> > > > > this argument will be a single plugin, which can be any of the
> > > following:"
> > > > > but the following points don't appear to be related.
> > > >
> > > > Thanks, I think that was a copy-paste error. I have updated the KIP.
> > > >
> > > > > 3. (Nit) Could we rename the migration script something like
> > > > > "connect-migrate-plugin-path" or even just "connect-plugin-path"
> > with a
> > > > > "migrate" subcommand?
> > > >
> > > > Yes, you and Federico had the same idea :)
> > > > Please see my feedback above regarding naming.
> > > >
> > > > > 4. It's noted that "If a plugin class is removed from the path, the
> > > > > corresponding shim manifest should also be removed". Did you
> consider
> > > > > adding behavior (possibly guarded behind a CLI flag) to find and
> > remove
> > > > > manifests for nonexistent plugins?
> > > >
> > > > When writing that originally, I thought that the script would only
> > remove
> > > > manifests that it had generated, as filtered by some in-file comment
> or
> > > by
> > > > jar filename, which would be an implementation detail.
> > > > If we were to add a CLI flag to also enable/disable the removal
> > behavior,
> > > > we could eliminate that implementation detail and make the removal an
> > > > all-or-nothing part of the script.
> > > >
> > > > I updated the script to accept a `remove-manifests` subcommand which
> I
> > > > think captures your suggestion. Therefore, a full sync would include
> > two
> > > > invocations: an `add-manifests` and `remove-manifests`.
> > > >
> > > > > 5. What's the concrete plan for updating the plugins that ship OOTB
> > > with
> > > > > Connect? Will they be given a service loader manifest, a module
> info
> > > file,
> > > > > or both?
> > > >
> > > > At a minimum, we need to add service loader manifests in this KIP to
> be
> > > > compliant with the new discovery model.
> > > > This is because Kafka 3.x supports Java 8, and that version of the
> > > > ServiceLoader can only read manifest files, and not module-info
> files.
> > > > Beyond that, I think it becomes a question of scope, and whether we
> > > should
> > > > add module-info files for all of the Connect components.
> > > > I was not planning on doing so within this KIP, and would prefer to
> > leave
> > > > that for a follow-up KIP.
> > > >
> > > > I have clarified this in the KIP.
> > > >
> > > > > 6. In the HYBRID_WARN mode, what precedence will we give to plugins
> > > that
> > > > > are found by both the legacy scanning logic and the new service
> > > > > loader/module info loading logic?
> > > >
> > > > In HYBRID_WARN, the control-flow should be the same as SCAN_ONLY, so
> > the
> > > > same classes are discovered in the same order.
> > > > All of the classes that are found via the ServiceLoader mechanism
> > should
> > > > only be used by the runtime to generate warnings.
> > > > I believe that by the nature of the Reflections library, all
> instances
> > > > which are discoverable by the ServiceLoader should be discoverable by
> > > > Reflections, and so we don't need to merge the two discovery methods
> or
> > > > handle precedence.
> > > >
> > > > > 7. It's a bit strange to immediately deprecate config property
> values
> > > and
> > > > > the migration script in their first release. IMO as long as we have
> > the
> > > > > migration script in the project, it doesn't make sense to deprecate
> > it,
> > > > and
> > > > > until we've had at least one successful release with the new
> loading
> > > logic
> > > > > and some confidence in it, we should not deprecate the older
> logic. I
> > > > think
> > > > > a more reasonable pace for the update here could be to deprecate
> the
> > > > legacy
> > > > > scanning logic in 4.0 (changing the default accordingly) and remove
> > it
> > > > > entirely in 5.0.
> > > >
> > > > I understand that it seems a bit silly to introduce something as
> > > > deprecated, but I think that's because 'deprecated' is not the same
> as
> > > > 'old', and not mutually exclusive with 'new'.
> > > > The way I understand it, marking something as deprecated and the
> > process
> > > of
> > > > deprecation is about communicating the direction in which an API is
> > > > evolving, and which methods can be relied upon for the foreseeable
> > > future.
> > > > And if we agree that the old mechanism has some sunset date that we
> > just
> > > > don't know yet, we should still communicate to users that the sunset
> > date
> > > > is somewhere in the future.
> > > >
> > > > I've updated the KIP to describe separate follow-on votes for
> changing
> > > the
> > > > default, and removing the scanning behavior, as that was my original
> > > intent
> > > > but I don't think that was made clear.
> > > > I also pushed back the deprecation of the migration sub-commands,
> since
> > > > we'd like to usher people towards the migration script instead of
> > relying
> > > > on the HYBRID modes.
> > > > I kept the decision for deprecating the non-SERVICE_LOAD modes in the
> > > > current vote for now, but if introducing deprecated configurations is
> > too
> > > > unpalatable we can move the deprecation to a later vote.
> > > >
> > > > Thanks,
> > > > Greg
> > > >
> > > > On Mon, Jan 23, 2023 at 12:49 PM Chris Egerton
> <chr...@aiven.io.invalid
> > >
> > > > wrote:
> > > >
> > > > > Hi Greg,
> > > > >
> > > > > Thanks for the KIP! This part of Connect has been due for an update
> > > for a
> > > > > while, it's nice to see it getting some attention. I especially
> like
> > > how
> > > > > the migration plan is broken down into affected personas; makes
> > review
> > > > > easier and, hopefully, makes the feature easier to understand for
> > > affected
> > > > > users.
> > > > >
> > > > > Here are my initial thoughts:
> > > > >
> > > > > 1. Will workers' resiliency to packaging issues be affected when
> the
> > > new
> > > > > scanning behavior is enabled? For example, will a single
> > > poorly-packaged
> > > > > plugin prevent other correctly-packaged plugins from being loaded,
> or
> > > > > worse, cause the worker to fail? And either way, are there any
> > details
> > > that
> > > > > we can call out to back up any guarantees we want to provide on
> that
> > > front?
> > > > >
> > > > > 2. I think the description of the "--plugin-location" flag may need
> > > some
> > > > > updating? In the middle of the list of bullet points there's "The
> > > value of
> > > > > this argument will be a single plugin, which can be any of the
> > > following:"
> > > > > but the following points don't appear to be related.
> > > > >
> > > > > 3. (Nit) Could we rename the migration script something like
> > > > > "connect-migrate-plugin-path" or even just "connect-plugin-path"
> > with a
> > > > > "migrate" subcommand?
> > > > >
> > > > > 4. It's noted that "If a plugin class is removed from the path, the
> > > > > corresponding shim manifest should also be removed". Did you
> consider
> > > > > adding behavior (possibly guarded behind a CLI flag) to find and
> > remove
> > > > > manifests for nonexistent plugins?
> > > > >
> > > > > 5. What's the concrete plan for updating the plugins that ship OOTB
> > > with
> > > > > Connect? Will they be given a service loader manifest, a module
> info
> > > file,
> > > > > or both?
> > > > >
> > > > > 6. In the HYBRID_WARN mode, what precedence will we give to plugins
> > > that
> > > > > are found by both the legacy scanning logic and the new service
> > > > > loader/module info loading logic?
> > > > >
> > > > > 7. It's a bit strange to immediately deprecate config property
> values
> > > and
> > > > > the migration script in their first release. IMO as long as we have
> > the
> > > > > migration script in the project, it doesn't make sense to deprecate
> > > it, and
> > > > > until we've had at least one successful release with the new
> loading
> > > logic
> > > > > and some confidence in it, we should not deprecate the older
> logic. I
> > > think
> > > > > a more reasonable pace for the update here could be to deprecate
> the
> > > legacy
> > > > > scanning logic in 4.0 (changing the default accordingly) and remove
> > it
> > > > > entirely in 5.0.
> > > > >
> > > > > Cheers,
> > > > >
> > > > > Chris
> > > > >
> > > > > On Mon, Jan 23, 2023 at 3:21 AM Federico Valeri <
> > fedeval...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hi Greg, this looks like a useful change to me.
> > > > > >
> > > > > > I was just wondering if we should use a better script name like
> > > > > > "connect-convert-to-service-provider.sh" or something like this,
> > and
> > > > > > maybe add a --dry-run option.
> > > > > >
> > > > > > On Tue, Jan 17, 2023 at 8:45 PM Greg Harris
> > > > > > <greg.har...@aiven.io.invalid> wrote:
> > > > > > >
> > > > > > > Hi all!
> > > > > > >
> > > > > > > I'd like to start a discussion about
> > > > > > >
> > > > > >
> > > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-898%3A+Modernize+Connect+plugin+discovery
> > > > > > > .
> > > > > > >
> > > > > > > Thanks!
> > > > > > > Greg Harris
> > > > > >
> > > > >
> > >
> >
>

Reply via email to