Hi Chris,

Thank you for your comments! I updated the KIP. I still need to add the
example before/after log lines, will do that soon, but I addressed all the
other points.
1. Added more details on thread renaming under Public Interfaces, removed
pseudo code.
2. Removed the stale header - originally, client.id related changes were
part of the KIP, and I failed removing all leftovers of that version.
3. Threads listed under Public Interfaces with current/proposed names.
4. Added a comment in the connect-log4j.properties, similar to the one
introduced in KIP-449. We don't have a dedicated MM2 log4j config, not sure
if we should introduce it here.
5. Clarified testing section - I think thread names should not be tested
(they never were), but testing will focus on the new MDC context value.

Thanks,
Daniel

Chris Egerton <chr...@aiven.io.invalid> ezt írta (időpont: 2023. jún. 5.,
H, 16:46):

> Hi Daniel,
>
> Thanks for the updates! A few more thoughts:
>
> 1. The "Proposed Changes" section seems a bit like a pseudocode
> implementation of the KIP. We don't really need this level of detail; most
> of the time, we're just looking for implementation details that don't
> directly affect the user-facing changes proposed in the "Public Interfaces"
> section but are worth mentioning because they (1) demonstrate how the
> user-facing changes are made possible, (2) indirectly affect user-facing
> behavior, or (3) go into more detail (like providing examples) about the
> user-facing behavior. For this KIP, I think examples of user-facing
> behavior (like before/after of thread names and log messages) and possibly
> an official description of the scope of the changes (which threads are
> going to be renamed and/or include the new MDC key, and which aren't?) are
> all that we'd really need in this section; everything else is fairly clear
> IMO. FWIW, the reason we want to discourage going into too much detail with
> KIPs is that it can quickly devolve into code review over mailing list,
> which can hold KIPs up for longer than necessary when the core design
> changes they contain are already basically accepted by everyone.
>
> 2. The "MM2 distributed mode client.id and log change" header seems like
> it
> may no longer be accurate; the contents do not mention any changes to
> client IDs. I might be missing something though; please correct me if I am.
>
> 3. Can you provide some before/after examples of what thread names and log
> messages will look like? I'm wondering about the thread that the
> distributed herder runs on, threads for connectors and tasks, and threads
> for polling internal topics (which we currently manage with the
> KafkaBasedLog class). It's fine if some of these are unchanged, I just want
> to better understand the scope of the proposed changes and get an idea of
> how they may appear to users.
>
> 4. There's no mention of changes to the default Log4j config files that we
> ship. Is this intentional? I feel like we need some way for users to easily
> discover this feature; if we're not going to touch our default Log4j config
> files, is there another way that we can expect users to find out about the
> new MDC key?
>
> 5. RE the "Test Plan" section: can you provide a little more detail of
> which cases we'll be covering with the proposed unit tests? Will we be
> verifying that the MDC context is set in various places? If so, which
> places? And the same with thread names? (There doesn't have to be a ton of
> detail, but a little more than "unit tests" would be nice 😄)
>
> Cheers,
>
> Chris
>
> On Mon, Jun 5, 2023 at 9:45 AM Dániel Urbán <urb.dani...@gmail.com> wrote:
>
> > I updated the KIP accordingly. Tried to come up with more generic terms
> in
> > the Connect code instead of referring to "flow" anywhere.
> > Daniel
> >
> > Dániel Urbán <urb.dani...@gmail.com> ezt írta (időpont: 2023. jún. 5.,
> H,
> > 14:49):
> >
> > > Hi Chris,
> > >
> > > Thank you for your comments!
> > >
> > > I agree that the toString based logging is not ideal, and I believe all
> > > occurrences are within a proper logging context, so they can be
> ignored.
> > > If thread names can be changed unconditionally, I agree, using a new
> MDC
> > > key is the ideal solution.
> > >
> > > Will update the KIP accordingly.
> > >
> > > Thanks,
> > > Daniel
> > >
> > > Chris Egerton <chr...@aiven.io.invalid> ezt írta (időpont: 2023. jún.
> > 2.,
> > > P, 22:23):
> > >
> > >> Hi Daniel,
> > >>
> > >> Are there any cases of Object::toString being used by internal classes
> > for
> > >> logging context that can't be covered by MDC contexts? For example,
> > >> anything logged by WorkerSinkTask (or any WorkerTask subclass) already
> > has
> > >> the MDC set for the task [1]. Since the Object::toString-prefixed
> style
> > of
> > >> logging is a bit obsolete after the introduction of MDC contexts it'd
> > feel
> > >> a little strange to continue trying to accommodate it, especially if
> the
> > >> changes from this KIP are going to be opt-in regardless.
> > >>
> > >> As far as thread names go: unlike log statements, I don't believe
> > changing
> > >> them requires a KIP, and would be fine with merging a PR for that
> > without
> > >> worrying about compatibility.
> > >>
> > >> With that in mind, I'm still wondering if we can add a separate MDC
> key
> > >> for
> > >> MM2 replication flows (perhaps "mirror.maker.flow") and
> unconditionally
> > >> add
> > >> that to the MDC contexts for every thread that gets spun up by each
> > >> DistributedHerder instance that MM2 creates. This would be different
> > from
> > >> the draft PR and KIP, which involve altering the content of the
> existing
> > >> "connector.context" MDC key. Users would opt in to the change by
> > altering
> > >> their Log4j config instead of altering their MM2 config, which matches
> > >> precedent set with the introduction of the "connector.context" key.
> > >>
> > >> Thoughts?
> > >>
> > >> [1] -
> > >>
> > >>
> >
> https://github.com/apache/kafka/blob/146a6976aed0d9f90c70b6f21dca8b887cc34e71/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerTask.java#L255
> > >>
> > >> Cheers,
> > >>
> > >> Chris
> > >>
> > >> On Tue, May 23, 2023 at 11:36 AM Dániel Urbán <urb.dani...@gmail.com>
> > >> wrote:
> > >>
> > >> > Hi Chris,
> > >> >
> > >> > Thank you for your comments!
> > >> > 1. This approach is OK for me. I thought that the "sample" configs
> in
> > >> the
> > >> > repo do not fall into the same category as the default of a new
> > config.
> > >> > Adding a commented line instead should be ok, and the future opt-out
> > >> change
> > >> > sounds good to me.
> > >> >
> > >> > 2. Besides the MDC, there are 2 other places where the flow context
> > >> > information could be added:
> > >> > - Some of the Connect internal classes use their .toString methods
> > when
> > >> > logging (e.g. WorkerSinkTask has a line like this: log.info("{}
> > >> Executing
> > >> > sink task", this);). These toString implementations contain the
> > >> connector
> > >> > name, and nothing else, so in MM2 dedicated mode, adding the flow
> > would
> > >> > make these lines unique.
> > >> > - Connect worker thread names. Currently, they contain the connector
> > >> > name/task ID, but to make them unique, the flow should be added in
> MM2
> > >> > distributed mode.
> > >> > In my draft PR, I changed both of these, but for the sake of
> backward
> > >> > compatibility, I made the new toString/thread name dependent on the
> > new,
> > >> > suggested configuration flag.
> > >> > If we go with a new MDC key, but we still want to change the
> toString
> > >> > methods and the thread names, we still might need an extra flag to
> > turn
> > >> on
> > >> > the new behavior.
> > >> > AFAIK the toString calls are all made inside a proper logging
> context,
> > >> so
> > >> > changing the toString methods don't add much value. I think that the
> > >> thread
> > >> > name changes are useful, giving more context for log lines written
> > >> outside
> > >> > of a log context.
> > >> >
> > >> > In short, I think that MDC + thread name changes are necessary to
> make
> > >> MM2
> > >> > dedicated logs useful for diagnostics. If we keep both, then maybe
> > >> having a
> > >> > single config to control both at the same time is better than
> having a
> > >> new
> > >> > MDC key (configured separately in the log pattern) and a config flag
> > >> (set
> > >> > separately in the properties file) for the thread name change.
> > >> >
> > >> > Thanks,
> > >> > Daniel
> > >> >
> > >>
> > >
> >
>

Reply via email to