On Wed, May 3, 2017 at 11:17 AM, Konstantine Karantasis <
konstant...@confluent.io> wrote:

> Thanks Ewen. I'm replying inline as well.
>
> On Tue, May 2, 2017 at 11:24 AM, Ewen Cheslack-Postava <e...@confluent.io>
> wrote:
>
> > Thanks for the KIP.
> >
> > A few responses inline, followed by additional comments.
> >
> > On Mon, May 1, 2017 at 9:50 PM, Konstantine Karantasis <
> > konstant...@confluent.io> wrote:
> >
> > > Gwen, Randall thank you for your very insightful observations. I'm glad
> > you
> > > find this first draft to be an adequate platform for discussion.
> > >
> > > I'll attempt replying to your comments in order.
> > >
> > > Gwen, I also debated exactly the same two options: a) interpreting
> > absence
> > > of module path as a user's intention to turn off isolation and b)
> > > explicitly using an additional boolean property. A few reasons why I
> went
> > > with b) in this first draft are:
> > > 1) As Randall mentions, to leave the option of using a default value
> > open.
> > > If not immediately in the first version of isolation, maybe in the
> > future.
> > > 2) I didn't like the implicit character of the choice of interpreting
> an
> > > empty string as a clear intention to turn isolation off by the user.
> Half
> > > the time could be just that users forget to set a location, although
> > they'd
> > > like to use class loading isolation.
> > > 3) There's a slim possibility that in rare occasions a user might want
> to
> > > avoid even the slightest increase in memory consumption due to class
> > > loading duplication. I admit this should be very rare, but given the
> > other
> > > concerns and that we would really like to keep the isolation
> > implementation
> > > simple, the option to turn off this feature by using only one
> additional
> > > config property might not seem too excessive. At least at the start of
> > this
> > > discussion.
> > > 4) Debugging during development might be simpler in some cases.
> > > 5) Finally, as you mention, this could allow for smoother upgrades.
> > >
> >
> > I'm not sure any of these keep you from removing the extra config. Is
> there
> > any reason you couldn't have clean support for relying on the CLASSPATH
> > while still supporting the classloaders? Then getting people onto the new
> > classloaders does require documentation for how to install connectors,
> but
> > that's pretty minimal. And we don't break existing installations where
> > people are just adding to the CLASSPATH. It seems like this:
> >
> > 1. Allows you to set a default. Isolation is always enabled, but we won't
> > include any paths/directories we already use. Setting a default just
> > requires specifying a new location where we'd hold these directories.
> > 2. It doesn't require the implicit choice -- you actually never turn off
> > isolation, but still support the regular CLASSPATH with an empty list of
> > isolated loaders
> > 3. The user can still use CLASSPATH if they want to minimize classloader
> > overhead
> > 4. Debugging can still use CLASSPATH
> > 5. Upgrades just work.
> >
>
> Falling back to CLASSPATH for non-isolated mode makes sense. The extra
> config property was suggested proactively, as well as for clarity and
> handling of defaults. But it's much better if we can do without it. Will be
> removed.
>
>
Thank you (and I'm sure Confluent's support team also thanks you - one
configuration should be harder to accidentally screw up than two...)



>
> >
> >
> > > Randall, regarding your comments:
> > > 1) To keep its focus narrow, this KIP, as well as the first
> > implementation
> > > of isolation in Connect, assume filesystem based discovery. With
> careful
> > > implementation, transitioning to discovery schemes that support broader
> > > URIs I believe should be easy in the future.
> > >
> >
> > Maybe just mention a couple of quick examples in the KIP. When described
> > inline it might be more obvious that it will extend cleanly.
> >
> >
> There's an example for a filesystem-based structure. I will enhance it.
>
>
>
> > > 2) The example you give makes a good point. However I'm inclined to say
> > > that such cases should be addressed more as exceptions rather than as
> > being
> > > the common case. Therefore, I wouldn't see all dependencies imported by
> > the
> > > framework as required to be filtered out, because in that case we lose
> > the
> > > advantage of isolation between the framework and the connectors (and we
> > are
> > > left only with isolation between connectors).
> >
> > 3) I tried to abstract implementation details in this the KIP, but you
> are
> > > right. Even though filtering here is mainly used semantically rather
> than
> > > literally, it gives an implementation hint that we could avoid.
> > >
> >
> > I think we're missing another option -- don't do filtering and require
> that
> > those dependencies are correctly filtered out of the modules. If we want
> to
> > be nicer about this, we could also detect maybe 2 or 3 classes while
> > scanning for Connectors/Converters/Transformations that indicate the
> > classloader has jars that it shouldn't and warn about it. I can't think
> of
> > that many that would be an issue -- basically connect-api,
> connect-runtime
> > if they really mess it up, and maybe slf4j.
> >
> >
> This sounds a bit more strict, which is not necessarily a bad thing. Still,
> it seems to also keep the subject under the category of implementation
> decisions.
>
> > 4) In the same spirit as in 3) I believe we should reserve enough
> > > flexibility to the implementation to discover and load classes, when
> they
> > > appear in multiple locations under the general module location.
> > >
> > >
> > And a couple of addition comments:
> >
> > - module.path should be module.paths if it accepts multiple paths
> >
>
> The configuration property was named in a way that resembles classpath. I
> felt this might offer some intuition, even if the word "path" is too
> filesystem oriented. Users are used to give a list of paths in classpath,
> so I didn't find singular too confusing. Additionally, it will work when
> there's only one "path", or I should say URI, given. But I don't feel very
> strong about the name of this property. (Alternatives that I tried such as
> module.uris, module.locations didn't seem satisfying)
>

> > - I think the description of the module paths is a bit confusing because
> I
> > think for simpler configuration we actually want to specify the *parent*
> > directory of module paths. I definitely prefer this since it's simpler
> > although I am not certain how it will mix with future extensions to other
> > URL handlers (where a zip file or http URL wouldn't do the same discovery
> > within subdirectories)
> >
>
> My impression is that the 2nd paragraph below the introduction of
> module.path is where the "parent" convention is attempted to be explained.
> I will attempt to improve this text, but I feel that, again, the filesystem
> convention will follow the generic definition of the property (a list of
> locations/paths).
>

I was also confused while reading the description. It clicked at some
point, but it wasn't immediately clear that we load from sub-directories
too.



>
> -Konstantine
>
>
> >
> > -Ewen
> >
> >
> > > Thanks again! Let me know what you think.
> > > Konstantine
> > >
> > >
> > > On Mon, May 1, 2017 at 10:12 AM, Randall Hauch <rha...@gmail.com>
> wrote:
> > >
> > > > Very nice work, Konstantine. Conflicting dependencies of connectors
> is
> > > > indeed a big issue that makes it hard to manage installed connectors.
> > > >
> > > > I do like Gwen's idea about removing the 'module.isolation.enabled'
> > > > property. However, I would have anticipated always using classpath
> > > > isolation for *only* those components registered under the module
> path
> > > and
> > > > not really for anything else already on the normal classpath. So,
> > people
> > > > could continue to place custom connector JARs onto the classpath,
> > though
> > > > this would become deprecated in favor of installing custom connector
> > > JARs /
> > > > modules via the module path. This keeps configuration simple, gives
> > > people
> > > > time to migrate, but let's people that need classpath isolation get
> it
> > to
> > > > install a variety of connectors each with their dependencies that
> > > > potentially conflict with other components.
> > > >
> > > > The challenge is whether there should be a default for 'module.path'.
> > > > Ideally there would be so that users know where they can install
> their
> > > > connectors. However, I suspect that this might be difficult to do
> > unless
> > > it
> > > > can make use of system properties such as "${kafka.home}" so that
> > > relative
> > > > directories can be specified.
> > > >
> > > > A few other questions/comments:
> > > >
> > > > 1) Does the KIP have to specify how are components / modules
> installed,
> > > > discovered, or recognized by Kafka Connect? Or perhaps the KIP needs
> to
> > > > just specify the semantics of the file system module path (e.g., the
> > > > directories below those specified in the module path are to be unique
> > and
> > > > identify an installed component).
> > > >
> > > > 2) Will the module classloader filtering also have to exclude Kafka
> > > Connect
> > > > dependencies? The only one that I can think of is the SLF4J API,
> which
> > > > can't be loaded from the module's classloader if the connector is to
> > send
> > > > its log messages to the same logging system.
> > > >
> > > > 3) Rather than specify filtering, would be it a bit more flexible to
> > > simply
> > > > say that the implementation will need to ensure that Java, Kafka
> > Connect,
> > > > and other third party APIs (e.g., SLF4J API) will not be loaded from
> > the
> > > > module classloaders? It'd be better to avoid specifying how it will
> be
> > > > done, just in case the implementation needs to evolve or use a
> > different
> > > > technique (e.g., load the Java and public Kafka Connect APIs via one
> > > > classloader that is reused and that always appears before the module
> > > > classloader, while Kafka Connect implementation JARs appear after the
> > > > component's classloader.
> > > >
> > > > 4) Perhaps to address #2 and #3 above, perhaps the KIP could
> explicitly
> > > > specify the classloader order for a deployed connector. For example,
> > > > 'java', 'kafka-connect-apis', 'connector-module', 'smt-module-1',
> ...,
> > > > 'kafka-connect-impls', where 'connector-module' is the classloader
> for
> > > the
> > > > (first) module where the connector is found, 'smt-module-1' is the
> > > > classloader for the (first) module where the first SMT class is found
> > (if
> > > > specified and found in a separate module), 'smt-module-2' is the
> > > > classloader .... Might also need to say that the KIP does not specify
> > how
> > > > the implementation will pick the module if a specified class if found
> > in
> > > > more than one module.
> > > >
> > > > Thoughts?
> > > >
> > > > Randall
> > > >
> > > > On Mon, May 1, 2017 at 6:43 AM, Gwen Shapira <g...@confluent.io>
> > wrote:
> > > >
> > > > > Hi Konstantine,
> > > > >
> > > > > Thank you so much for driving this! The connector classpath mess is
> > > > driving
> > > > > me nuts (or worse, driving me to use Docker).
> > > > >
> > > > > I like the proposal for micro-benchmarks to test the context
> > switching
> > > > > overhead.
> > > > >
> > > > > I have a difficult time figuring out the module.isolation.enabled.
> > > > > Especially with a default to false. I can't think of a reason that
> > > anyone
> > > > > will not want classpath isolation. "No! I want my connectors to
> mess
> > up
> > > > > each other's dependencies" said no one ever.
> > > > >
> > > > > So it looks like this is mostly for upgrade purpose? Because the
> > > initial
> > > > > upgrade will not have the module.path set and therefore classpath
> > > > isolation
> > > > > will simply not work by default?
> > > > >
> > > > > In that case, why don't we simply use the existence of non-empty
> > > > > module.path as an indicator of whether isolation should work or
> not?
> > > seem
> > > > > simpler and intuitive to me.
> > > > >
> > > > > Thanks!
> > > > >
> > > > > Gwen
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > On Sat, Apr 29, 2017 at 9:16 AM, Konstantine Karantasis <
> > > > > konstant...@confluent.io> wrote:
> > > > >
> > > > > > * Because of KIP number collision, please disregard my previous
> KIP
> > > > > > announcement and use this thread for discussion instead *
> > > > > >
> > > > > >
> > > > > > Hi everyone,
> > > > > >
> > > > > > we aim to address dependency conflicts in Kafka Connect soon by
> > > > applying
> > > > > > class loading isolation.
> > > > > >
> > > > > > Feel free to take a look at KIP-146 here:
> > > > > >
> > > > > > *https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > > > 146+-+Classloading+Isolation+in+Connect
> > > > > > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > > > 146+-+Classloading+Isolation+in+Connect>*
> > > > > >
> > > > > > which describes minimal required changes to public interfaces and
> > the
> > > > > > general implementation approach.
> > > > > >
> > > > > > This is a much wanted feature for Kafka Connect. Your feedback is
> > > > highly
> > > > > > appreciated.
> > > > > >
> > > > > > -Konstantine
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > *Gwen Shapira*
> > > > > Product Manager | Confluent
> > > > > 650.450.2760 | @gwenshap
> > > > > Follow us: Twitter <https://twitter.com/ConfluentInc> | blog
> > > > > <http://www.confluent.io/blog>
> > > > >
> > > >
> > >
> >
>



-- 
*Gwen Shapira*
Product Manager | Confluent
650.450.2760 | @gwenshap
Follow us: Twitter <https://twitter.com/ConfluentInc> | blog
<http://www.confluent.io/blog>

Reply via email to