[ https://issues.apache.org/jira/browse/KAFKA-15190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17744726#comment-17744726 ]
Matthias J. Sax commented on KAFKA-15190:
-----------------------------------------

{quote}but although {{StreamsPartitionAssignor}} sometimes calls it a client ID and sometimes a process ID it's a {{UUID}} so I assume it really is the process ID.
{quote}
Thanks for calling this out. You are right; I missed this point.

Since you mentioned "max recovery lag", I assume you have a stateful app that uses in-memory stores only?

Another thing that comes to mind: `client.id` actually has a different purpose and should not be unique per `KafkaStreams` instance, but should be the _same_ for all instances (the name is a little misleading). For example, if you configure quotas, they are based on `client.id`, and you usually want quotas to be set per application, not per instance.

> Allow configuring a streams process ID
> --------------------------------------
>
>                 Key: KAFKA-15190
>                 URL: https://issues.apache.org/jira/browse/KAFKA-15190
>             Project: Kafka
>          Issue Type: Wish
>          Components: streams
>            Reporter: Joe Wreschnig
>            Priority: Major
>              Labels: needs-kip
>
> We run our Kafka Streams applications in containers with no persistent
> storage, and therefore the mitigation of persisting the process ID in the
> state directory in KAFKA-10716 does not help us avoid shuffling lots of
> tasks during restarts.
> However, we do have a persistent container ID (from a Kubernetes
> StatefulSet). Would it be possible to expose a configuration option to let us
> set the streams process ID ourselves?
> We are already using this ID as our group.instance.id - would it make sense
> to have the process ID be automatically derived from this (plus
> application/client IDs) if it's set? The two IDs seem to have overlapping
> goals of identifying "this consumer" across restarts.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
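
As a rough illustration of the `client.id` point in the comment above, the sketch below builds the identity-related settings for one instance: `client.id` is shared by all instances, while `group.instance.id` is taken from the stable container ID. The class and method names, the `orders-app` value, reusing the application ID as `client.id`, and reading the pod name from the `HOSTNAME` environment variable are all assumptions made for this example. It uses plain string keys so it runs without Kafka on the classpath, and it does not set the process ID, since that is what the issue is asking for.

```java
import java.util.Properties;

public class StreamsIdentityConfig {

    // Hypothetical helper: identity-related settings for a single instance.
    static Properties build(String applicationId, String instanceId) {
        Properties props = new Properties();
        props.put("application.id", applicationId);
        // Shared by every instance, so quotas keyed on client.id apply per application.
        // Reusing the application ID here is an assumption of this sketch.
        props.put("client.id", applicationId);
        // Stable per-instance identity, e.g. the StatefulSet pod name.
        props.put("group.instance.id", instanceId);
        return props;
    }

    public static void main(String[] args) {
        // Assumption: the container exposes its pod name as HOSTNAME.
        String podName = System.getenv().getOrDefault("HOSTNAME", "orders-app-0");
        Properties props = build("orders-app", podName);
        props.forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

In a real application these properties would be passed to the `KafkaStreams` constructor together with the rest of the configuration.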