Hi Kirk, Thanks for your comments. 1) I agree that this KIP is not aiming to create a new first-class construct of a unique, managed, per-client instance ID. I’ll add some clarifying words.
2) I can see what the KIP is trying to say about the Terminating flag, but it doesn’t quite do it for me. Essentially, a terminating client with an active metrics subscription can send a final metrics push without waiting for the interval to elapse. However, there’s a chance that the subscription has changed and the PushTelemetry RPC fails with INVALID_SUBSCRIPTION_ID. Then, the client is supposed to get a new subscription ID and presumably send its terminating metrics with this new ID without waiting for the push interval to elapse. I will update the text. 3) The KIP is not explicit about the regular expression matcher for matching client selectors. I will change it to call out RE2/J in line with KIP-848. This is also a user-provided, server-side regular expression match. 4) I think you’re right about the inclusion of temporality in the GetTelemetrySubscriptions response. A client would be expected to support both cumulative or delta, although initially the broker will only use delta. However, it’s quite an important part of the OTLP metrics specification. I think there is benefit in supporting this in KIP-714 clients to enable temporality to be used by brokers without requiring another round of client version upgrades. 5) The ClientTelemetry interface gives a level of indirection between the MetricsReporter and the ClientTelemetryReceiver. A MetricsReporter could implement ClientTelemetryReceiver directly, but the implementation of the CTR could equally well be a separate object. Thanks for helping to tighten up the KIP. Andrew > On 20 Jun 2023, at 16:47, Kirk True <k...@kirktrue.pro> wrote: > > Hi Andrew, > > > >> On Jun 13, 2023, at 8:06 AM, Andrew Schofield >> <andrew_schofield_j...@outlook.com> wrote: >> >> Hi, >> I would like to start a new discussion thread on KIP-714: Client metrics and >> observability. >> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-714%3A+Client+metrics+and+observability >> >> I have edited the proposal significantly to reduce the scope. The overall >> mechanism for client metric subscriptions is unchanged, but the >> KIP is now based on the existing client metrics, rather than introducing new >> metrics. The purpose remains helping cluster operators >> investigate performance problems experienced by clients without requiring >> changes to the client application code or configuration. >> >> Thanks, >> Andrew > > Thanks for the KIP updates. A few questions: > > 1. The concept of a client instance ID is somewhat similar to the unique > producer ID that is created for transactional producers. Can we augment the > name or somehow clarify that this client instance ID is only for use by > telemetry? The pandora’s box alternative is to make the creation, management, > etc. of a unique, per-client instance ID a first-class construct. I assume > that’s something we don’t want to bite off in this KIP ;) > > 2. I’m having trouble understanding where this provision for the terminating > flag would be useful: > >> The Terminating flag may be reused upon the next expiry of PushIntervalMs. > > In the happy path, the terminating flag is set once at time of application > shutdown by the close() method of a client. A buggy/nefarious client may send > multiple push telemetry requests with the terminating flag set to skirt > throttling. What’s the use case where an application would want to send a > second request with the terminating flag set after PushInteralMs? > > 3. KIP-848 introduces a new flavor of regex for topic subscriptions. Is that > what we plan to adopt for the regex used by the subscription match? > > 4. What’s the benefit of having the broker specify the delta temporality if > it’s (for now) always delta, besides API protocol bumping? > > 5. What is gained by the existence of the ClientTelemetry interface? Why not > let interested parties implement ClientTelemetryReceiver directly? > > Thanks!