Very nice work, Konstantine. Conflicting dependencies of connectors is
indeed a big issue that makes it hard to manage installed connectors.

I do like Gwen's idea about removing the 'module.isolation.enabled'
property. However, I would have anticipated always using classpath
isolation for *only* those components registered under the module path and
not really for anything else already on the normal classpath. So, people
could continue to place custom connector JARs onto the classpath, though
this would become deprecated in favor of installing custom connector JARs /
modules via the module path. This keeps configuration simple, gives people
time to migrate, but let's people that need classpath isolation get it to
install a variety of connectors each with their dependencies that
potentially conflict with other components.

The challenge is whether there should be a default for 'module.path'.
Ideally there would be so that users know where they can install their
connectors. However, I suspect that this might be difficult to do unless it
can make use of system properties such as "${kafka.home}" so that relative
directories can be specified.

A few other questions/comments:

1) Does the KIP have to specify how are components / modules installed,
discovered, or recognized by Kafka Connect? Or perhaps the KIP needs to
just specify the semantics of the file system module path (e.g., the
directories below those specified in the module path are to be unique and
identify an installed component).

2) Will the module classloader filtering also have to exclude Kafka Connect
dependencies? The only one that I can think of is the SLF4J API, which
can't be loaded from the module's classloader if the connector is to send
its log messages to the same logging system.

3) Rather than specify filtering, would be it a bit more flexible to simply
say that the implementation will need to ensure that Java, Kafka Connect,
and other third party APIs (e.g., SLF4J API) will not be loaded from the
module classloaders? It'd be better to avoid specifying how it will be
done, just in case the implementation needs to evolve or use a different
technique (e.g., load the Java and public Kafka Connect APIs via one
classloader that is reused and that always appears before the module
classloader, while Kafka Connect implementation JARs appear after the
component's classloader.

4) Perhaps to address #2 and #3 above, perhaps the KIP could explicitly
specify the classloader order for a deployed connector. For example,
'java', 'kafka-connect-apis', 'connector-module', 'smt-module-1', ...,
'kafka-connect-impls', where 'connector-module' is the classloader for the
(first) module where the connector is found, 'smt-module-1' is the
classloader for the (first) module where the first SMT class is found (if
specified and found in a separate module), 'smt-module-2' is the
classloader .... Might also need to say that the KIP does not specify how
the implementation will pick the module if a specified class if found in
more than one module.

Thoughts?

Randall

On Mon, May 1, 2017 at 6:43 AM, Gwen Shapira <g...@confluent.io> wrote:

> Hi Konstantine,
>
> Thank you so much for driving this! The connector classpath mess is driving
> me nuts (or worse, driving me to use Docker).
>
> I like the proposal for micro-benchmarks to test the context switching
> overhead.
>
> I have a difficult time figuring out the module.isolation.enabled.
> Especially with a default to false. I can't think of a reason that anyone
> will not want classpath isolation. "No! I want my connectors to mess up
> each other's dependencies" said no one ever.
>
> So it looks like this is mostly for upgrade purpose? Because the initial
> upgrade will not have the module.path set and therefore classpath isolation
> will simply not work by default?
>
> In that case, why don't we simply use the existence of non-empty
> module.path as an indicator of whether isolation should work or not? seem
> simpler and intuitive to me.
>
> Thanks!
>
> Gwen
>
>
>
>
>
> On Sat, Apr 29, 2017 at 9:16 AM, Konstantine Karantasis <
> konstant...@confluent.io> wrote:
>
> > * Because of KIP number collision, please disregard my previous KIP
> > announcement and use this thread for discussion instead *
> >
> >
> > Hi everyone,
> >
> > we aim to address dependency conflicts in Kafka Connect soon by applying
> > class loading isolation.
> >
> > Feel free to take a look at KIP-146 here:
> >
> > *https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 146+-+Classloading+Isolation+in+Connect
> > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 146+-+Classloading+Isolation+in+Connect>*
> >
> > which describes minimal required changes to public interfaces and the
> > general implementation approach.
> >
> > This is a much wanted feature for Kafka Connect. Your feedback is highly
> > appreciated.
> >
> > -Konstantine
> >
>
>
>
> --
> *Gwen Shapira*
> Product Manager | Confluent
> 650.450.2760 | @gwenshap
> Follow us: Twitter <https://twitter.com/ConfluentInc> | blog
> <http://www.confluent.io/blog>
>

Reply via email to