Hi Chris,

1. For the REST-server-per-flow setup, it made sense to introduce some
simplified configuration. With a single REST server, it doesn't make sense
anymore, I'm removing it from the KIP.
2. I think that changing the worker ID generation still makes sense,
otherwise there is no way to differentiate between the MM2 instances.

Thanks,
Daniel

On Wed, Aug 31, 2022 at 8:39 PM Chris Egerton <chr...@aiven.io.invalid>
wrote:

> Hi Daniel,
>
> I've taken a look at the KIP in detail. Here are my complete thoughts
> (minus the aforementioned sections that may be affected by changes to an
> internal-only REST API):
>
> 1. Why introduce new mm.host.name and mm.rest.protocol properties instead
> of using the properties that are already used by Kafka Connect: listeners,
> rest.advertised.host.name, rest.advertised.host.port, and
> rest.advertised.listener? We used to have the rest.host.name and rest.port
> properties in Connect but deprecated and eventually removed them in favor
> of the listeners property in KIP-208 [1]; I'm hoping we can keep things as
> similar as possible between MM2 and Connect in order to make it easier for
> users to work with both. I'm also hoping that we can allow users to
> configure the port that their MM2 nodes listen on instead of hardcoding MM2
> to bind to port 0.
>
> 2. Do we still need to change the worker IDs that get used in the status
> topic?
>
> Everything else looks good, or should change once the KIP is updated with
> the internal-only REST API alternative.
>
> Cheers,
>
> Chris
>
> [1] -
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-208%3A+Add+SSL+support+to+Kafka+Connect+REST+interface
>
> On Mon, Aug 29, 2022 at 1:55 PM Chris Egerton <chr...@aiven.io> wrote:
>
> > Hi Daniel,
> >
> > Yeah, I think that's the way to go. Adding multiple servers for each
> > herder seems like it'd be too much of a pain for users to configure, and
> if
> > we keep the API strictly internal for now, we shouldn't be painting
> > ourselves into too much of a corner if/when we decide to expose a
> > public-facing REST API for dedicated MM2 clusters.
> >
> > I plan to take a look at the rest of the KIP and provide a complete
> review
> > sometime this week; I'll hold off on commenting on anything that seems
> like
> > it'll be affected by switching to an internal-only REST API until those
> > changes are published, but should be able to review everything else.
> >
> > Cheers,
> >
> > Chris
> >
> > On Mon, Aug 29, 2022 at 6:57 AM Dániel Urbán <urb.dani...@gmail.com>
> > wrote:
> >
> >> Hi Chris,
> >>
> >> I understand your point, sounds good to me.
> >> So in short, we should opt for an internal-only API, and preferably a
> >> single server solution. Is that right?
> >>
> >> Thanks
> >> Daniel
> >>
> >> Chris Egerton <chr...@aiven.io.invalid> ezt írta (időpont: 2022. aug.
> >> 26.,
> >> P, 17:36):
> >>
> >> > Hi Daniel,
> >> >
> >> > Glad to hear from you!
> >> >
> >> > With regards to the stripped-down REST API alternative, I don't see
> how
> >> > this would prevent us from introducing the fully-fledged Connect REST
> >> API,
> >> > or even an augmented variant of it, at some point down the road. If we
> >> go
> >> > with the internal-only API now, and want to expand later, can't we
> gate
> >> the
> >> > expansion behind a feature flag configuration property that by default
> >> > disables the new feature?
> >> >
> >> > I'm also not sure that we'd ever want to expose the raw Connect REST
> API
> >> > for dedicated MM2 clusters. If people want that capability, they can
> >> > already spin up a vanilla Connect cluster and run as many MM2
> >> connectors as
> >> > they'd like on it, and as of KIP-458 [1], it's even possible to use a
> >> > single Connect cluster to replicate between any two Kafka clusters
> >> instead
> >> > of only targeting the Kafka cluster that the vanilla Connect cluster
> >> > operates on top of. I do agree that it'd be great to be able to
> >> dynamically
> >> > adjust things like topic filters without having to restart a dedicated
> >> MM2
> >> > node; I'm just not sure that the vanilla Connect REST API is the
> >> > appropriate way to do that, especially since the exact mechanisms that
> >> make
> >> > a single Connect cluster viable for replicating across any two Kafka
> >> > clusters could be abused and cause a dedicated MM2 cluster to start
> >> writing
> >> > to a completely different Kafka cluster that's not even defined in its
> >> > config file.
> >> >
> >> > Finally, as far as security goes--since this is essentially a bug fix,
> >> I'm
> >> > inclined to make it as easy as possible for users to adopt it. MTLS
> is a
> >> > great start for securing a REST API, but it's not sufficient on its
> own
> >> > since anyone who could issue an authenticated REST request against the
> >> MM2
> >> > cluster would still be able to make any changes they want (with the
> >> > exception of accessing internal endpoints, which were secured with
> >> > KIP-507). If we were to bring up the fully-fledged Connect REST API,
> >> > cluster administrators would also likely have to add some kind of
> >> > authorization layer to prevent people from using the REST API to mess
> >> with
> >> > the configurations of the connectors that MM2 brought up. One way of
> >> doing
> >> > that is to add a REST extension to your Connect cluster, but
> >> implementing
> >> > and configuring one in order to be able to run a multi-node MM2
> cluster
> >> > without hitting this bug seems like too much work to be worth it.
> >> >
> >> > I think if we had a better picture of what a REST API for dedicated
> MM2
> >> > clusters would/should look like, then it would be easier to go along
> >> with
> >> > this, and we could even just add the feature flag in this KIP right
> now
> >> to
> >> > address any security concerns. My instinct would be to address this
> in a
> >> > follow-up KIP in order to reduce scope creep, though, and keep this
> KIP
> >> > focused on addressing the bug with multi-node dedicated MM2 clusters.
> >> What
> >> > do you think?
> >> >
> >> > Cheers,
> >> >
> >> > Chris
> >> >
> >> > [1] -
> >> >
> >> >
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-458%3A+Connector+Client+Config+Override+Policy
> >> >
> >> > On Thu, Aug 25, 2022 at 3:55 AM Dániel Urbán <urb.dani...@gmail.com>
> >> > wrote:
> >> >
> >> > > Hi Chris,
> >> > >
> >> > > Thanks for bringing this up again :)
> >> > >
> >> > > 1. I think that is reasonable, though I find the current state of
> MM2
> >> to
> >> > be
> >> > > confusing. The current issue with the distributed mode is not
> >> documented
> >> > > properly, but maybe the logging will help a bit.
> >> > >
> >> > > 2. Going for an internal-only Connect REST version would lock MM2
> out
> >> of
> >> > a
> >> > > path where the REST API can be used to dynamically reconfigure
> >> > > replications. For now, I agree, it would be easy to corrupt the
> state
> >> of
> >> > > MM2 if someone wanted to use the properties and the REST at the same
> >> > time,
> >> > > but in the future, we might have a chance to introduce a different
> >> config
> >> > > mechanism, where only the cluster connections have to be specified
> in
> >> the
> >> > > properties file, and everything else can be configured through REST
> >> > > (enabling replications, changing topic filters, etc.). Because of
> >> this,
> >> > I'm
> >> > > leaning towards a full Connect REST API. To avoid issues with
> >> conflicts
> >> > > between the props file and the REST, we could document security best
> >> > > practices (e.g. turn on basic auth or mTLS on the Connect REST to
> >> avoid
> >> > > unwanted interactions).
> >> > >
> >> > > 3. That is a good point, and I agree, a big plus for motivation.
> >> > >
> >> > > I have a working version of this in which all flows spin up a
> >> dedicated
> >> > > Connect REST, but I can give other solutions a try, too.
> >> > >
> >> > > Thanks,
> >> > > Daniel
> >> > >
> >> > > Chris Egerton <chr...@aiven.io.invalid> ezt írta (időpont: 2022.
> aug.
> >> > 24.,
> >> > > Sze, 17:46):
> >> > >
> >> > > > Hi Daniel,
> >> > > >
> >> > > > I'd like to resurface this KIP in case you're still interested in
> >> > > pursuing
> >> > > > it. I know it's been a while since you published it, and it hasn't
> >> > > received
> >> > > > much attention, but I'm hoping we can give it a try now and
> finally
> >> put
> >> > > > this long-standing bug to rest. To that end, I have some thoughts
> >> about
> >> > > the
> >> > > > proposal. This isn't a complete review, but I wanted to give
> enough
> >> to
> >> > > get
> >> > > > the ball rolling:
> >> > > >
> >> > > > 1. Some environments with firewalls or strict security policies
> may
> >> not
> >> > > be
> >> > > > able to bring up a REST server for each MM2 node. If we decide
> that
> >> > we'd
> >> > > > like to use the Connect REST API (or even just parts of it) to
> >> address
> >> > > this
> >> > > > bug with MM2, it does make sense to eventually make the
> >> availability of
> >> > > the
> >> > > > REST API a hard requirement for running MM2, but it might be a bit
> >> too
> >> > > > abrupt to do that all in a single release. What do you think about
> >> > making
> >> > > > the REST API optional for now, but noting that it will become
> >> required
> >> > > in a
> >> > > > later release (probably 4.0.0 or, if that's not enough time,
> >> 5.0.0)? We
> >> > > > could choose not to bring the REST server for any node whose
> >> > > configuration
> >> > > > doesn't explicitly opt into one, and maybe log a warning message
> on
> >> > > startup
> >> > > > if none is configured. In effect, we'd be marking the current mode
> >> (no
> >> > > REST
> >> > > > server) as deprecated.
> >> > > >
> >> > > > 2. I'm not sure that we should count out the "Creating an
> >> internal-only
> >> > > > derivation of the Connect REST API" rejected alternative. Right
> now,
> >> > the
> >> > > > single source of truth for the configuration of a MM2 cluster
> >> (assuming
> >> > > > it's being run in dedicated mode, and not as a connector in a
> >> vanilla
> >> > > > Connect cluster) is the configuration file used for the process.
> By
> >> > > > bringing up the REST API, we'd expose endpoints to modify
> connector
> >> > > > configurations, which would not only add complexity to the
> operation
> >> > of a
> >> > > > MM2 cluster, but even qualify as an attack vector for malicious
> >> > entities.
> >> > > > Thanks to KIP-507 we have some amount of security around the
> >> > > internal-only
> >> > > > endpoints used by the Connect framework, but for any public
> >> endpoints,
> >> > > the
> >> > > > Connect REST API doesn't come with any security out of the box.
> >> > > >
> >> > > > 3. Small point, but with support for exactly-once source
> connectors
> >> > > coming
> >> > > > out in 3.3.0, it's also worth noting that that's another feature
> >> that
> >> > > won't
> >> > > > work properly with multi-node MM2 clusters without adding a REST
> >> server
> >> > > for
> >> > > > each node (or some substitute that accomplishes the same goal). I
> >> don't
> >> > > > think this will affect the direction of the design discussion too
> >> much,
> >> > > but
> >> > > > it does help strengthen the motivation.
> >> > > >
> >> > > > Cheers,
> >> > > >
> >> > > > Chris
> >> > > >
> >> > > > On 2021/02/18 15:57:36 Dániel Urbán wrote:
> >> > > > > Hello everyone,
> >> > > > >
> >> > > > > * Sorry, I meant KIP-710.
> >> > > > >
> >> > > > > Right now the MirrorMaker cluster is somewhat unreliable, and
> not
> >> > > > > supporting running in a cluster properly. I'd say that fixing
> this
> >> > > would
> >> > > > be
> >> > > > > a nice addition.
> >> > > > > Does anyone have some input on this?
> >> > > > >
> >> > > > > Thanks in advance
> >> > > > > Daniel
> >> > > > >
> >> > > > > Dániel Urbán <ur...@gmail.com> ezt írta (időpont: 2021. jan.
> >> 26., K,
> >> > > > > 15:56):
> >> > > > >
> >> > > > > > Hello everyone,
> >> > > > > >
> >> > > > > > I would like to start a discussion on KIP-709, which addresses
> >> some
> >> > > > > > missing features in MM2 dedicated mode.
> >> > > > > >
> >> > > > > >
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-710%3A+Full+support+for+distributed+mode+in+dedicated+MirrorMaker+2.0+clusters
> >> > > > > >
> >> > > > > > Currently, the dedicated mode of MM2 does not fully support
> >> running
> >> > > in
> >> > > > a
> >> > > > > > cluster. The core issue is that the Connect REST Server is not
> >> > > included
> >> > > > in
> >> > > > > > the dedicated mode, which makes follower->leader communication
> >> > > > impossible.
> >> > > > > > In some cases, this results in the cluster not being able to
> >> react
> >> > to
> >> > > > > > dynamic configuration changes (e.g. dynamic topic filter
> >> changes).
> >> > > > > > Another smaller detail is that MM2 dedicated mode eagerly
> >> resolves
> >> > > > config
> >> > > > > > provider references in the Connector configurations, which is
> >> > > > undesirable
> >> > > > > > and a breaking change compared to vanilla Connect. This can
> >> cause
> >> > an
> >> > > > issue
> >> > > > > > for example when there is an environment variable reference,
> >> which
> >> > > > contains
> >> > > > > > some host-specific information, like a file path. The leader
> >> > resolves
> >> > > > the
> >> > > > > > reference eagerly, and the resolved value is propagated to
> other
> >> > MM2
> >> > > > nodes
> >> > > > > > instead of the reference being resolved locally, separately on
> >> each
> >> > > > node.
> >> > > > > >
> >> > > > > > The KIP addresses these by adding the Connect REST Server to
> the
> >> > MM2
> >> > > > > > dedicated mode for each replication flow, and postponing the
> >> config
> >> > > > > > provider reference resolution.
> >> > > > > >
> >> > > > > > Please discuss, I know this is a major change, but also an
> >> > important
> >> > > > > > feature for MM2 users.
> >> > > > > >
> >> > > > > > Daniel
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> >
>

Reply via email to