Hi Daniel,

Are there any cases of Object::toString being used by internal classes for
logging context that can't be covered by MDC contexts? For example,
anything logged by WorkerSinkTask (or any WorkerTask subclass) already has
the MDC set for the task [1]. Since the Object::toString-prefixed style of
logging is a bit obsolete after the introduction of MDC contexts it'd feel
a little strange to continue trying to accommodate it, especially if the
changes from this KIP are going to be opt-in regardless.

As far as thread names go: unlike log statements, I don't believe changing
them requires a KIP, and would be fine with merging a PR for that without
worrying about compatibility.

With that in mind, I'm still wondering if we can add a separate MDC key for
MM2 replication flows (perhaps "mirror.maker.flow") and unconditionally add
that to the MDC contexts for every thread that gets spun up by each
DistributedHerder instance that MM2 creates. This would be different from
the draft PR and KIP, which involve altering the content of the existing
"connector.context" MDC key. Users would opt in to the change by altering
their Log4j config instead of altering their MM2 config, which matches
precedent set with the introduction of the "connector.context" key.

Thoughts?

[1] -
https://github.com/apache/kafka/blob/146a6976aed0d9f90c70b6f21dca8b887cc34e71/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerTask.java#L255

Cheers,

Chris

On Tue, May 23, 2023 at 11:36 AM Dániel Urbán <urb.dani...@gmail.com> wrote:

> Hi Chris,
>
> Thank you for your comments!
> 1. This approach is OK for me. I thought that the "sample" configs in the
> repo do not fall into the same category as the default of a new config.
> Adding a commented line instead should be ok, and the future opt-out change
> sounds good to me.
>
> 2. Besides the MDC, there are 2 other places where the flow context
> information could be added:
> - Some of the Connect internal classes use their .toString methods when
> logging (e.g. WorkerSinkTask has a line like this: log.info("{} Executing
> sink task", this);). These toString implementations contain the connector
> name, and nothing else, so in MM2 dedicated mode, adding the flow would
> make these lines unique.
> - Connect worker thread names. Currently, they contain the connector
> name/task ID, but to make them unique, the flow should be added in MM2
> distributed mode.
> In my draft PR, I changed both of these, but for the sake of backward
> compatibility, I made the new toString/thread name dependent on the new,
> suggested configuration flag.
> If we go with a new MDC key, but we still want to change the toString
> methods and the thread names, we still might need an extra flag to turn on
> the new behavior.
> AFAIK the toString calls are all made inside a proper logging context, so
> changing the toString methods don't add much value. I think that the thread
> name changes are useful, giving more context for log lines written outside
> of a log context.
>
> In short, I think that MDC + thread name changes are necessary to make MM2
> dedicated logs useful for diagnostics. If we keep both, then maybe having a
> single config to control both at the same time is better than having a new
> MDC key (configured separately in the log pattern) and a config flag (set
> separately in the properties file) for the thread name change.
>
> Thanks,
> Daniel
>

Reply via email to