[jira] [Commented] (KAFKA-6035) Avoid creating changelog topics for state stores that are directly piped to a sink topic

Andy Coates (Jira) Tue, 07 Mar 2023 04:08:06 -0800


    [ 
https://issues.apache.org/jira/browse/KAFKA-6035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17697387#comment-17697387
 ]


Andy Coates commented on KAFKA-6035:
------------------------------------

I've now had a couple of instances where I've had to suffer these "dual 
changelog topics". In some of them the topic in question was busy, and having 
two copies was expensive in terms of cluster load and storage.

Consider a KS based microservice architecture, where each service defines sets 
of static input and output topics, using sensible naming conventions where the 
name of the output topic should be any one of the following:
 # static, i.e. not dependent on something that can be changed in config, e.g. 
application.id
 # data-centric, i.e. based on the data set it contains, not the service that 
happens to be generating it
 # hierarchical, i.e. the topic prefix should conform to some org-wide data 
model
 # etc

Any of the above means a change-log topic name of 
"<app.id>-<store-name>-changelog" is going to be problematic.

Either avoiding the internal change-log (as covered by this issue), or allowing 
full control over internal topic names (as covered by 
https://issues.apache.org/jira/browse/KAFKA-5386), would work as a solution.
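
To make the naming concern concrete, here is a minimal sketch of how the 
change-log name follows from the "<app.id>-<store-name>-changelog" pattern 
quoted above. The helper and the application ids are hypothetical and only for 
illustration; this is plain string handling, not a call into the Streams API.

```java
public class ChangelogTopicNames {

    // Builds a name following the pattern quoted in this comment:
    // "<app.id>-<store-name>-changelog". Hypothetical helper, not Streams API.
    static String changelogTopic(String applicationId, String storeName) {
        return applicationId + "-" + storeName + "-changelog";
    }

    public static void main(String[] args) {
        // Hypothetical application ids and store name.
        String store = "state1";
        System.out.println(changelogTopic("orders-service", store));
        System.out.println(changelogTopic("orders-service-v2", store));
    }
}
```

Under these assumptions, changing only application.id yields a different topic 
name for the same data set, which is why such a name is neither static nor 
data-centric.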

> Avoid creating changelog topics for state stores that are directly piped to a 
> sink topic
> ----------------------------------------------------------------------------------------
>
>                 Key: KAFKA-6035
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6035
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: streams
>            Reporter: Guozhang Wang
>            Assignee: Jeyhun Karimov
>            Priority: Major
>
> Today Streams backs every state store with a changelog topic by default, 
> unless the user overrides this with {{disableLogging}} when creating the 
> state store / materializing the KTable. However, there are a few cases where a 
> separate changelog topic is not required because an existing topic can be 
> re-used instead. This ticket summarizes one specific case that can be optimized:
> Consider the case when a KTable is materialized and then sent directly into a 
> sink topic with the same key, e.g.
> {code}
> table1 = stream.groupBy(...).aggregate("state1").to("topic2");
> {code}
> Then we do not need to create a {{state1-changelog}} but can just use 
> {{topic2}} as its changelog.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
