Re: [DISCUSS] KIP-714: Client metrics and observability

Andrew Schofield Sun, 25 Jun 2023 23:49:32 -0700

Hi Kirk,
Thanks for your comments.

1) I agree that this KIP is not aiming to create a new first-class construct of 
a unique, managed, per-client instance ID.
I’ll add some clarifying words.

2) I can see what the KIP is trying to say about the Terminating flag, but it 
doesn’t quite do it for me. Essentially,
a terminating client with an active metrics subscription can send a final 
metrics push without waiting for the
interval to elapse. However, there’s a chance that the subscription has changed 
and the PushTelemetry RPC fails
with INVALID_SUBSCRIPTION_ID. Then, the client is supposed to get a new 
subscription ID and presumably
send its terminating metrics with this new ID without waiting for the push 
interval to elapse. I will update the text.

3) The KIP is not explicit about the regular expression matcher for matching 
client selectors. I will change it to
call out RE2/J in line with KIP-848. This is also a user-provided, server-side 
regular expression match.

4) I think you’re right about the inclusion of temporality in the 
GetTelemetrySubscriptions response. A client would
be expected to support both cumulative or delta, although initially the broker 
will only use delta. However, it’s quite
an important part of the OTLP metrics specification. I think there is benefit 
in supporting this in KIP-714 clients
to enable temporality to be used by brokers without requiring another round of 
client version upgrades.

5) The ClientTelemetry interface gives a level of indirection between the 
MetricsReporter and the
ClientTelemetryReceiver. A MetricsReporter could implement 
ClientTelemetryReceiver directly, but
the implementation of the CTR could equally well be a separate object.

Thanks for helping to tighten up the KIP.

Andrew

> On 20 Jun 2023, at 16:47, Kirk True <k...@kirktrue.pro> wrote:
>
> Hi Andrew,
>
>
>
>> On Jun 13, 2023, at 8:06 AM, Andrew Schofield 
>> <andrew_schofield_j...@outlook.com> wrote:
>>
>> Hi,
>> I would like to start a new discussion thread on KIP-714: Client metrics and 
>> observability.
>>
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-714%3A+Client+metrics+and+observability
>>
>> I have edited the proposal significantly to reduce the scope. The overall 
>> mechanism for client metric subscriptions is unchanged, but the
>> KIP is now based on the existing client metrics, rather than introducing new 
>> metrics. The purpose remains helping cluster operators
>> investigate performance problems experienced by clients without requiring 
>> changes to the client application code or configuration.
>>
>> Thanks,
>> Andrew
>
> Thanks for the KIP updates. A few questions:
>
> 1. The concept of a client instance ID is somewhat similar to the unique 
> producer ID that is created for transactional producers. Can we augment the 
> name or somehow clarify that this client instance ID is only for use by 
> telemetry? The pandora’s box alternative is to make the creation, management, 
> etc. of a unique, per-client instance ID a first-class construct. I assume 
> that’s something we don’t want to bite off in this KIP ;)
>
> 2. I’m having trouble understanding where this provision for the terminating 
> flag would be useful:
>
>> The Terminating flag may be reused upon the next expiry of PushIntervalMs.
>
> In the happy path, the terminating flag is set once at time of application 
> shutdown by the close() method of a client. A buggy/nefarious client may send 
> multiple push telemetry requests with the terminating flag set to skirt 
> throttling. What’s the use case where an application would want to send a 
> second request with the terminating flag set after PushInteralMs?
>
> 3. KIP-848 introduces a new flavor of regex for topic subscriptions. Is that 
> what we plan to adopt for the regex used by the subscription match?
>
> 4. What’s the benefit of having the broker specify the delta temporality if 
> it’s (for now) always delta, besides API protocol bumping?
>
> 5. What is gained by the existence of the ClientTelemetry interface? Why not 
> let interested parties implement ClientTelemetryReceiver directly?
>
> Thanks!

Re: [DISCUSS] KIP-714: Client metrics and observability

Reply via email to