Hello,
I did some more digging into the idea of a "logical client instance
id", and I guess how we would want to design it depends on what the
goal is.
My personal focus was on KIP-714/KIP-1076 for sending KS metrics, and on
the currently discussed KIP-1324 for sending configs, including KS
configs, to the broker.
I am not sure what use cases Mickael had in mind. Given a previous
reply, it seems he was thinking of using a common UUID prefix. This
would be something quite different from what I had in mind. Would love
to learn more. If the goal is to identify every network connection /
sent RPC, a common prefix could make sense. As an alternative, we could
add another tagged field, but it might be overkill? However, with regard
to KIP-714/1076 and KIP-1324, it won't really do what we need there
(more about this below).
I was also a little bit concerned about the "convention to align request
header clientInstanceId to request body fields". The KIP only says that
bumping request versions to avoid this question would be a
"protocol-compatibility nightmare". Digging into it a little bit, I
don't see why it would be a huge problem for the KIP-714 RPCs, and I am
wondering if it would still be worth doing the version bump for these
RPCs? Maybe Andrew can elaborate?
However, after some digging I learned that KIP-848/1071 (including
classic group RPCs) actually encode the `memberId` as `string` type at
protocol level. So while it's a UUID client side, it's not really one
broker side. So there is not much to be aligned? It's of course not a
bad idea to generate a single UUID and use it for both, but at the
protocol level it seems that bumping these RPC versions to remove the
`string` fields and replace them with the request header
`clientInstanceId` field might not be the right thing to do to begin
with. -- So while I agree with the "nightmare" assessment, it would
still be good if the KIP could add more context on why it would be a
nightmare.
For KIP-714/1076, the underlying idea of a logical client instance ID
was to know which metrics belong to the physical client vs the logical
client. However, the challenge is that a single PushTelemetryRequest may
contain both "native" client metrics and user-registered metrics.
Thus, adding a new `logicalClientInstanceId` field at the request level
does not really help (neither does the proposed prefix). I don't think
it would be a good idea to send multiple smaller requests (with
different IDs) to keep "native" and "custom" metrics separated. -- So
far it has also not been a problem to have only the single
`clientInstanceId`, so maybe there is no problem to be solved to begin
with? -- As an alternative, I was wondering whether it could make sense
to instead encode a logical client instance id (== KS ProcessId) as a
metric tag (either at KS level directly, or by adding it dynamically
before sending to the broker). Some KS metrics already have the
`ProcessId` as a tag. But it seems we are heading into a quite
independent discussion now that does not have much to do with KIP-1313
any longer, so I think we might want to defer it?
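To make the metric-tag alternative a bit more concrete, here is a minimal sketch of adding the tag dynamically before sending. The class, the method, and the tag key `process-id` are hypothetical names chosen for illustration, not existing KS or client APIs.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: not part of Kafka Streams or the Kafka clients.
final class ProcessIdTagSketch {
    // Hypothetical tag key; the tag name actually used by KS may differ.
    static final String PROCESS_ID_TAG = "process-id";

    // Returns a copy of the metric's tags with the process ID added,
    // leaving the tag untouched if the metric already carries it.
    static Map<String, String> withProcessId(Map<String, String> tags, String processId) {
        Map<String, String> out = new LinkedHashMap<>(tags);
        out.putIfAbsent(PROCESS_ID_TAG, processId);
        return out;
    }
}
```

With something like this, "native" and user-registered metrics could stay in the same PushTelemetryRequest, and only the metrics that belong to the logical client would carry the extra tag.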
The only open question is KIP-1324, for which we indeed cannot use the
same `clientInstanceId` for, e.g., consumer and KS configs. But the KIP
is still under discussion, so maybe we can find some other solution
(there is the idea to hijack the Admin's network connection...).
Side note: KIP-1331 did drop the `UUID` for the newly added RPCs and we
don't need to worry about it any longer.
So, long story short: I am still wondering if bumping the request
versions of `GetTelemetrySubscriptionRequest/Response` and
`PushTelemetryRequest` could make sense to simplify what we try to do
with KIP-1313. The currently proposed protocol for old/new broker/client
compatibility seems to be overly (and unnecessarily) complex? If we
bumped the RPC versions, it would be much simpler:
- if either broker or client is on the older version, nothing changes: v0
won't include the new request header `clientInstanceId` field, and the
broker would still assign a `clientInstanceId` if it receives a ZERO
- only if both broker and client support the new RPC version, the
`clientInstanceId` is added to the request header, and we can drop the
`clientInstanceId` field from both
GetTelemetrySubscriptionRequest/Response body, and maybe also from
PushTelemetryRequest body?
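The two cases above can be sketched roughly as follows. This is only an illustration under stated assumptions: `TelemetryIdSketch`, `negotiate` and `resolve` are hypothetical names, "v1" stands for the not-yet-existing bumped version, and rejecting a v1 request without a header ID is my assumption, not something the KIP specifies.

```java
import java.util.UUID;

// Illustrative only: not broker code and not an existing Kafka API.
final class TelemetryIdSketch {
    static final UUID ZERO = new UUID(0L, 0L);

    // The version used on the wire is the highest one both sides support.
    static short negotiate(short clientMax, short brokerMax) {
        return (short) Math.min(clientMax, brokerMax);
    }

    // Decides which client instance ID the broker uses for telemetry.
    static UUID resolve(short version, UUID headerId, UUID bodyId) {
        if (version >= 1) {
            // Hypothetical v1: the ID lives only in the request header.
            if (headerId == null || ZERO.equals(headerId)) {
                // Assumption: a v1 request without a header ID is invalid.
                throw new IllegalArgumentException("v1 requires header clientInstanceId");
            }
            return headerId;
        }
        // v0: unchanged KIP-714 behavior, the broker assigns an ID on ZERO.
        return ZERO.equals(bodyId) ? UUID.randomUUID() : bodyId;
    }
}
```

The point of the sketch is that the version number alone selects the code path, so there is no need to reason about combinations of header and body fields within a single version.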
I would strongly prefer this simplification. These changes have nothing
to do with the "logical client instance ID", so they should be part of
KIP-1313 (if we agree to do it this way).
-Matthias
On 5/27/26 10:11 AM, Jun Rao via dev wrote:
Hi, Andrew,
Thanks for the reply. One more comment.
JR24. "The alignment of other identifiers is by convention (and the Java
client will follow the convention) rather than mandate." Could you describe
the convention to convert a clientInstanceId (UUID) to a memberId (String)?
Jun
On Tue, May 19, 2026 at 2:36 AM Andrew Schofield <[email protected]>
wrote:
Hi Jun,
Thanks for your response.
JR23: You are absolutely correct. It seems to me that not sending a
clientInstanceId in the header and explicitly sending a zero UUID as the
clientInstanceId in the header can be treated as semantically equivalent.
I've tweaked the words slightly.
Thanks,
Andrew
On 2026/05/19 03:42:16 Jun Rao via dev wrote:
Hi, Andrew,
Thanks for the reply.
JR23. Our message protocol doc says "Any fields in the message object that
are not present in the version that you are deserializing will be reset to
default values. Unless a custom default has been set:". Uuid fields
default to zero uuid.
So if the server gets header.clientInstanceId=0 in the deserialized header,
could it distinguish between the ID not being present (since client is old)
and the ID being explicitly set to 0 by the client?
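The ambiguity in JR23 can be illustrated with a small model of the defaulting rule. This is a sketch, not the generated Kafka message classes: `ZeroUuidDefaultSketch` and `deserializeHeaderId` are hypothetical, and the `Optional` argument merely stands in for "field present on the wire or not".

```java
import java.util.Optional;
import java.util.UUID;

// Illustrative model of the defaulting rule quoted above.
final class ZeroUuidDefaultSketch {
    static final UUID ZERO = new UUID(0L, 0L);

    // An absent field is reset to its default, which for a UUID is zero.
    static UUID deserializeHeaderId(Optional<UUID> onWire) {
        return onWire.orElse(ZERO);
    }
}
```

After deserialization, `deserializeHeaderId(Optional.empty())` and `deserializeHeaderId(Optional.of(ZERO))` yield the same value, so code that only sees the deserialized header cannot tell the two cases apart.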
Jun
On Mon, May 18, 2026 at 7:45 PM Andrew Schofield <[email protected]>
wrote:
Hi Jun,
Thanks for your reply. It's tricky squaring a circle.
JR23: For GetTelemetrySubscriptions, I have changed it so that a client
which omits the ClientInstanceId from the request header is permitted to
specify a zero ClientInstanceId in the request body, following original
KIP-714 precedent. However, a client which specifies a ClientInstanceId in
the request header MUST specify the same ClientInstanceId in the request
body. This ensures that the header and telemetry UUIDs are the same.
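The rule in JR23 could be checked along these lines. This is a sketch only: `HeaderBodyCheckSketch` and `isConsistent` are hypothetical names, and treating a null or zero header ID as "omitted" is an assumption made for the illustration.

```java
import java.util.UUID;

// Illustrative only: not the broker's actual validation code.
final class HeaderBodyCheckSketch {
    static final UUID ZERO = new UUID(0L, 0L);

    // Returns true if the header and body IDs satisfy the stated rule.
    static boolean isConsistent(UUID headerId, UUID bodyId) {
        // Assumption: null or zero in the header means "omitted".
        boolean headerOmitted = headerId == null || ZERO.equals(headerId);
        if (headerOmitted) {
            // Original KIP-714 behavior applies; the body may be zero.
            return true;
        }
        // A header ID was specified, so the body MUST carry the same ID.
        return headerId.equals(bodyId);
    }
}
```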
Thanks,
Andrew
On 2026/05/12 17:48:23 Andrew Schofield wrote:
Hi Jun,
Thanks for the reply and digging into the details.
JR23: Correct. The client telemetry component will use UUID-B as the
client instance ID.
JR23.1: Yes, I agree. It's not ideal. When I was drawing up the tables,
I was thinking that this might be a possibility, but I'm less convinced
now. I think that I should mandate that if a client specifies
header.ClientInstanceId on GetTelemetrySubscriptions request, then
request.ClientInstanceId must either be zero or equal to
header.ClientInstanceId.
JR23.2: This is perhaps the interesting one. From its original intent,
it should be UUID-B (the telemetry UUID), but then that contradicts the
change in signature to remove the timeout. Unless I make the change above,
in which case it will be UUID-H.
Thanks,
Andrew
On 2026/05/12 17:23:58 Jun Rao via dev wrote:
Hi, Andrew,
Thanks for the reply.
JR23. In the new client -> old broker case, we have
header.ClientInstanceId=UUID-H
request.ClientInstanceId=UUID-B
response.ClientInstanceId=0
On the server side, I guess the telemetry component will use UUID-B as the
clientInstanceId? This has a couple of implications.
JR23.1 On the server side, we have two different clientInstanceIds used in
different places, UUID-H for request logging and UUID-B in telemetry. This
seems confusing since we can't uniquely identify a client on the server
side.
JR23.2 On the client side, what uuid does clientInstanceId(Duration
timeout) return? If it returns UUID-H, it will be confusing since it
doesn't match the ID used for telemetry on the server.
Jun
On Tue, May 12, 2026 at 12:58 AM Andrew Schofield <
[email protected]>
wrote:
Hi Jun,
Thanks for your response.
JR20: I have improved (I hope) the wording. The client sends
request.clientInstanceId = 0 and header.clientInstanceId = UUID-H, and the
broker responds response.clientInstanceId=UUID-H. In this way, the broker
will have taken the UUID-H from the header, and told the client to use it
for client telemetry also.
JR21: Done. Look for "henceforth".
JR22: Summary table added.
Thanks,
Andrew
On 2026/05/11 19:18:24 Jun Rao via dev wrote:
Hi, Andrew,
Thanks for the reply.
JR20. "If the client requests a new client instance ID on its initial
GetTelemetrySubscriptions request and it sends a client instance ID in the
request header, the broker will send back that client instance ID rather
than generating a new UUID. This will automatically align the UUID in the
request headers and client telemetry."
This seems inconsistent with what's in the table. In the table, for
example, if the client has the following:
GetTelemetrySubscriptions v0
header.ClientInstanceId = UUID-H
request.ClientInstanceId = UUID-H
or
GetTelemetrySubscriptions v0
header.ClientInstanceId = UUID-H
request.ClientInstanceId = UUID-R
the broker returns
response.ClientInstanceId = 0.
JR21. It will be useful to document what the new client does with the
returned response.ClientInstanceId. Note that return value may or may not
be 0.
JR22. It's probably clearer if we could populate the table with 4
combinations: old/new clients with old/new brokers.
Jun
On Fri, May 8, 2026 at 2:49 AM Andrew Schofield <
[email protected]>
wrote:
Hi Jun and Chia-Ping,
I've overhauled part of the KIP to do with alignment of the request header
client instance ID, client telemetry client instance ID and group protocol
member IDs. The alignment is by convention, not mandate (SHOULD not MUST).
It would be possible to go around the existing RPCs such as
ConsumerGroupHeartbeat and GetTelemetrySubscriptions, and remove the fields
containing the existing identifiers which are intended to be aligned. Doing
so would be a bad idea though, because we would then have RPC versions
which essentially depend upon the presence of a tagged field in the request
header. This is a protocol-compatibility nightmare.
I have removed the new versions of GetTelemetrySubscriptions and
PushTelemetry. I have also explained the behavior of
GetTelemetrySubscriptions in the presence and absence of a client instance
ID in the request header.
Let me know what you think.
Thanks,
Andrew
On 2026/05/07 15:09:31 Andrew Schofield wrote:
Hi Jun and Chia-Ping,
I've been thinking and discussing the changes to the KIP-714 RPCs. There
are too many combinations for my liking at the moment. I want to take
another pass at this area and will make an update in a few days.
I intend to start a new vote once we have consensus because the spec has
changed somewhat since the earliest votes.
Thanks,
Andrew
On 2026/05/06 17:28:27 Chia-Ping Tsai wrote:
hi Andrew
chia_0: If the consensus is to remove the "duplicate" field from the
RPC payloads, the tagged field in the header will essentially become a
required field. This means the broker needs to handle the edge case where
both the header and the request body have no ClientInstanceId, right? If
so, would you mind clarifying the expected broker behavior in the KIP?
Best,
Chia-Ping
On 2026/04/03 16:17:37 Andrew Schofield wrote:
Hi,
I would like to start the discussion on KIP-1313. This adds a unique
client instance ID to the request header of all Kafka protocol requests to
give a unique identifier which can be used to correlate the requests from
each client for the purposes of problem determination.
https://urldefense.com/v3/__https://cwiki.apache.org/confluence/display/KAFKA/KIP-1313*3A*Client*instance*ID*in*all*request*headers__;JSsrKysrKys!!Ayb5sqE7!uqWf0-b_X82WmpmCYImD2W2rht_s_q5vHcqB9ToMV4IaeQbZF42eMJyS5XC5b5qE_qJJUj3KTCXcqEvYbwYS$
Thanks,
Andrew