If there are no further comments, I will kick off a vote soon for the KIP. Dániel Urbán <urb.dani...@gmail.com> ezt írta (időpont: 2023. jún. 12., H, 11:27):
> Updated the KIP with a few example log lines before/after the change. > Daniel > > Dániel Urbán <urb.dani...@gmail.com> ezt írta (időpont: 2023. jún. 7., > Sze, 13:59): > >> Hi Chris, >> >> Thank you for your comments! I updated the KIP. I still need to add the >> example before/after log lines, will do that soon, but I addressed all the >> other points. >> 1. Added more details on thread renaming under Public Interfaces, removed >> pseudo code. >> 2. Removed the stale header - originally, client.id related changes were >> part of the KIP, and I failed removing all leftovers of that version. >> 3. Threads listed under Public Interfaces with current/proposed names. >> 4. Added a comment in the connect-log4j.properties, similar to the one >> introduced in KIP-449. We don't have a dedicated MM2 log4j config, not sure >> if we should introduce it here. >> 5. Clarified testing section - I think thread names should not be tested >> (they never were), but testing will focus on the new MDC context value. >> >> Thanks, >> Daniel >> >> Chris Egerton <chr...@aiven.io.invalid> ezt írta (időpont: 2023. jún. >> 5., H, 16:46): >> >>> Hi Daniel, >>> >>> Thanks for the updates! A few more thoughts: >>> >>> 1. The "Proposed Changes" section seems a bit like a pseudocode >>> implementation of the KIP. We don't really need this level of detail; >>> most >>> of the time, we're just looking for implementation details that don't >>> directly affect the user-facing changes proposed in the "Public >>> Interfaces" >>> section but are worth mentioning because they (1) demonstrate how the >>> user-facing changes are made possible, (2) indirectly affect user-facing >>> behavior, or (3) go into more detail (like providing examples) about the >>> user-facing behavior. For this KIP, I think examples of user-facing >>> behavior (like before/after of thread names and log messages) and >>> possibly >>> an official description of the scope of the changes (which threads are >>> going to be renamed and/or include the new MDC key, and which aren't?) >>> are >>> all that we'd really need in this section; everything else is fairly >>> clear >>> IMO. FWIW, the reason we want to discourage going into too much detail >>> with >>> KIPs is that it can quickly devolve into code review over mailing list, >>> which can hold KIPs up for longer than necessary when the core design >>> changes they contain are already basically accepted by everyone. >>> >>> 2. The "MM2 distributed mode client.id and log change" header seems >>> like it >>> may no longer be accurate; the contents do not mention any changes to >>> client IDs. I might be missing something though; please correct me if I >>> am. >>> >>> 3. Can you provide some before/after examples of what thread names and >>> log >>> messages will look like? I'm wondering about the thread that the >>> distributed herder runs on, threads for connectors and tasks, and threads >>> for polling internal topics (which we currently manage with the >>> KafkaBasedLog class). It's fine if some of these are unchanged, I just >>> want >>> to better understand the scope of the proposed changes and get an idea of >>> how they may appear to users. >>> >>> 4. There's no mention of changes to the default Log4j config files that >>> we >>> ship. Is this intentional? I feel like we need some way for users to >>> easily >>> discover this feature; if we're not going to touch our default Log4j >>> config >>> files, is there another way that we can expect users to find out about >>> the >>> new MDC key? >>> >>> 5. RE the "Test Plan" section: can you provide a little more detail of >>> which cases we'll be covering with the proposed unit tests? Will we be >>> verifying that the MDC context is set in various places? If so, which >>> places? And the same with thread names? (There doesn't have to be a ton >>> of >>> detail, but a little more than "unit tests" would be nice 😄) >>> >>> Cheers, >>> >>> Chris >>> >>> On Mon, Jun 5, 2023 at 9:45 AM Dániel Urbán <urb.dani...@gmail.com> >>> wrote: >>> >>> > I updated the KIP accordingly. Tried to come up with more generic >>> terms in >>> > the Connect code instead of referring to "flow" anywhere. >>> > Daniel >>> > >>> > Dániel Urbán <urb.dani...@gmail.com> ezt írta (időpont: 2023. jún. >>> 5., H, >>> > 14:49): >>> > >>> > > Hi Chris, >>> > > >>> > > Thank you for your comments! >>> > > >>> > > I agree that the toString based logging is not ideal, and I believe >>> all >>> > > occurrences are within a proper logging context, so they can be >>> ignored. >>> > > If thread names can be changed unconditionally, I agree, using a new >>> MDC >>> > > key is the ideal solution. >>> > > >>> > > Will update the KIP accordingly. >>> > > >>> > > Thanks, >>> > > Daniel >>> > > >>> > > Chris Egerton <chr...@aiven.io.invalid> ezt írta (időpont: 2023. >>> jún. >>> > 2., >>> > > P, 22:23): >>> > > >>> > >> Hi Daniel, >>> > >> >>> > >> Are there any cases of Object::toString being used by internal >>> classes >>> > for >>> > >> logging context that can't be covered by MDC contexts? For example, >>> > >> anything logged by WorkerSinkTask (or any WorkerTask subclass) >>> already >>> > has >>> > >> the MDC set for the task [1]. Since the Object::toString-prefixed >>> style >>> > of >>> > >> logging is a bit obsolete after the introduction of MDC contexts >>> it'd >>> > feel >>> > >> a little strange to continue trying to accommodate it, especially >>> if the >>> > >> changes from this KIP are going to be opt-in regardless. >>> > >> >>> > >> As far as thread names go: unlike log statements, I don't believe >>> > changing >>> > >> them requires a KIP, and would be fine with merging a PR for that >>> > without >>> > >> worrying about compatibility. >>> > >> >>> > >> With that in mind, I'm still wondering if we can add a separate MDC >>> key >>> > >> for >>> > >> MM2 replication flows (perhaps "mirror.maker.flow") and >>> unconditionally >>> > >> add >>> > >> that to the MDC contexts for every thread that gets spun up by each >>> > >> DistributedHerder instance that MM2 creates. This would be different >>> > from >>> > >> the draft PR and KIP, which involve altering the content of the >>> existing >>> > >> "connector.context" MDC key. Users would opt in to the change by >>> > altering >>> > >> their Log4j config instead of altering their MM2 config, which >>> matches >>> > >> precedent set with the introduction of the "connector.context" key. >>> > >> >>> > >> Thoughts? >>> > >> >>> > >> [1] - >>> > >> >>> > >> >>> > >>> https://github.com/apache/kafka/blob/146a6976aed0d9f90c70b6f21dca8b887cc34e71/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerTask.java#L255 >>> > >> >>> > >> Cheers, >>> > >> >>> > >> Chris >>> > >> >>> > >> On Tue, May 23, 2023 at 11:36 AM Dániel Urbán < >>> urb.dani...@gmail.com> >>> > >> wrote: >>> > >> >>> > >> > Hi Chris, >>> > >> > >>> > >> > Thank you for your comments! >>> > >> > 1. This approach is OK for me. I thought that the "sample" >>> configs in >>> > >> the >>> > >> > repo do not fall into the same category as the default of a new >>> > config. >>> > >> > Adding a commented line instead should be ok, and the future >>> opt-out >>> > >> change >>> > >> > sounds good to me. >>> > >> > >>> > >> > 2. Besides the MDC, there are 2 other places where the flow >>> context >>> > >> > information could be added: >>> > >> > - Some of the Connect internal classes use their .toString methods >>> > when >>> > >> > logging (e.g. WorkerSinkTask has a line like this: log.info("{} >>> > >> Executing >>> > >> > sink task", this);). These toString implementations contain the >>> > >> connector >>> > >> > name, and nothing else, so in MM2 dedicated mode, adding the flow >>> > would >>> > >> > make these lines unique. >>> > >> > - Connect worker thread names. Currently, they contain the >>> connector >>> > >> > name/task ID, but to make them unique, the flow should be added >>> in MM2 >>> > >> > distributed mode. >>> > >> > In my draft PR, I changed both of these, but for the sake of >>> backward >>> > >> > compatibility, I made the new toString/thread name dependent on >>> the >>> > new, >>> > >> > suggested configuration flag. >>> > >> > If we go with a new MDC key, but we still want to change the >>> toString >>> > >> > methods and the thread names, we still might need an extra flag to >>> > turn >>> > >> on >>> > >> > the new behavior. >>> > >> > AFAIK the toString calls are all made inside a proper logging >>> context, >>> > >> so >>> > >> > changing the toString methods don't add much value. I think that >>> the >>> > >> thread >>> > >> > name changes are useful, giving more context for log lines written >>> > >> outside >>> > >> > of a log context. >>> > >> > >>> > >> > In short, I think that MDC + thread name changes are necessary to >>> make >>> > >> MM2 >>> > >> > dedicated logs useful for diagnostics. If we keep both, then maybe >>> > >> having a >>> > >> > single config to control both at the same time is better than >>> having a >>> > >> new >>> > >> > MDC key (configured separately in the log pattern) and a config >>> flag >>> > >> (set >>> > >> > separately in the properties file) for the thread name change. >>> > >> > >>> > >> > Thanks, >>> > >> > Daniel >>> > >> > >>> > >> >>> > > >>> > >>> >>