[jira] [Commented] (KAFKA-15376) Explore options of removing data earlier to the current leader's leader epoch lineage for topics enabled with tiered storage.
[ https://issues.apache.org/jira/browse/KAFKA-15376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17877798#comment-17877798 ]

Colin McCabe commented on KAFKA-15376:
--------------------------------------

Changing target fix version to 4.0 since this is not a blocker and we are past code freeze.

> Explore options of removing data earlier to the current leader's leader epoch
> lineage for topics enabled with tiered storage.
> -----------------------------------------------------------------------------
>
> Key: KAFKA-15376
> URL: https://issues.apache.org/jira/browse/KAFKA-15376
> Project: Kafka
> Issue Type: Task
> Components: core
> Reporter: Satish Duggana
> Priority: Major
> Fix For: 3.9.0
>
> Followup on the discussion thread:
> [https://github.com/apache/kafka/pull/13561#discussion_r1288778006]

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-15376) Explore options of removing data earlier to the current leader's leader epoch lineage for topics enabled with tiered storage.
[ https://issues.apache.org/jira/browse/KAFKA-15376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17800586#comment-17800586 ]

Stanislav Kozlovski commented on KAFKA-15376:
---------------------------------------------

Changing target fix version to 3.8 since this is not a blocker and we are cutting a 3.7 RC.

> Explore options of removing data earlier to the current leader's leader epoch
> lineage for topics enabled with tiered storage.
> -----------------------------------------------------------------------------
>
> Key: KAFKA-15376
> URL: https://issues.apache.org/jira/browse/KAFKA-15376
> Project: Kafka
> Issue Type: Task
> Components: core
> Reporter: Satish Duggana
> Priority: Major
> Fix For: 3.8.0
>
> Followup on the discussion thread:
> [https://github.com/apache/kafka/pull/13561#discussion_r1288778006]
[jira] [Commented] (KAFKA-15376) Explore options of removing data earlier to the current leader's leader epoch lineage for topics enabled with tiered storage.
[ https://issues.apache.org/jira/browse/KAFKA-15376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17785962#comment-17785962 ]

Kamal Chandraprakash commented on KAFKA-15376:
----------------------------------------------

With unclean leader election enabled, logs can diverge, data can be lost, and exactly-once delivery is not applicable. We are trying to extend to remote storage the same contract that already holds for local storage when this feature is enabled. There are pros and cons to this feature:

*Pros*

1. A replica will serve the data that it has seen so far back to the client, even if it never interacts with any other replica.

*Cons*

1. RemoteStorageManager / RemoteLogManager will have additional work to keep track of the unreferenced segments and clean them up.

> Explore options of removing data earlier to the current leader's leader epoch
> lineage for topics enabled with tiered storage.
> -----------------------------------------------------------------------------
>
> Key: KAFKA-15376
> URL: https://issues.apache.org/jira/browse/KAFKA-15376
> Project: Kafka
> Issue Type: Task
> Components: core
> Reporter: Satish Duggana
> Priority: Major
> Fix For: 3.7.0
>
> Followup on the discussion thread:
> [https://github.com/apache/kafka/pull/13561#discussion_r1288778006]
[jira] [Commented] (KAFKA-15376) Explore options of removing data earlier to the current leader's leader epoch lineage for topics enabled with tiered storage.
[ https://issues.apache.org/jira/browse/KAFKA-15376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17785956#comment-17785956 ]

Kamal Chandraprakash commented on KAFKA-15376:
----------------------------------------------

[~divijvaidya]

The [example|https://github.com/apache/kafka/pull/13561#discussion_r1293286722] provided in the discussion is misleading. Let's divide the example into two cases to make it easier to follow.

Assume that there are two replicas, Broker A and Broker B, for partition tp0:

*Case-1*

Both replicas A and B are in sync on startup and both hold leader epoch 0. Then the brokers start to go down in ping-pong fashion. Each broker will hold the following epochs in its leader-epoch-checkpoint file:

A: 0, 2, 4, 6, 8
B: 0, 1, 3, 5, 7

Since this is unclean leader election, the logs of Broker A and Broker B might have diverged. As long as either of them is online, it continues to serve all the records according to its leader-epoch-checkpoint file. Once both brokers are online, the follower truncates itself up to the largest common log prefix offset so that the logs no longer diverge between the leader and the follower.

In this case, we continue to serve the data from the remote storage, because no segments will be removed due to leader-epoch-cache truncation: both brokers hold LE0. Note that the approach taken here is similar to the local log, where each broker serves the log that it has until the brokers sync with each other.

*Case-2*

Both replicas A and B are out of sync on startup and the follower doesn't hold leader epoch 0.

Assume that Broker A is the leader, and that B is the follower and doesn't hold any data about the partition (empty disk). When Broker A goes down, the partition goes offline; B will then be elected as unclean leader, and the log-end-offset of the partition will be reset back to 0.

From the example provided in the discussion:

At T1, Broker A
{code:java}
-----------------------------
leader-epoch | start-offset |
-----------------------------
0              0
1              180
2              400
-----------------------------
{code}

At T2, Broker B, the start-offset will be reset back to 0:
(Note that the leader does not interact with remote storage to find the next offset; this is a trade-off between availability and durability.)
{code:java}
-----------------------------
leader-epoch | start-offset |
-----------------------------
3              0
4              780
6              900
7              990
-----------------------------
{code}

Now, if we hold the data for both lineages and ping-pong the brokers, we will be serving diverged data back to the client for the same fetch offset, depending on which broker is online. Once the replicas start to interact with each other, they truncate the remote data themselves based on the current leader epoch lineage.

The example provided in the discussion is applicable only when the replicas have never interacted with each other, not even once.

> Explore options of removing data earlier to the current leader's leader epoch
> lineage for topics enabled with tiered storage.
> -----------------------------------------------------------------------------
>
> Key: KAFKA-15376
> URL: https://issues.apache.org/jira/browse/KAFKA-15376
> Project: Kafka
> Issue Type: Task
> Components: core
> Reporter: Satish Duggana
> Priority: Major
> Fix For: 3.7.0
>
> Followup on the discussion thread:
> [https://github.com/apache/kafka/pull/13561#discussion_r1288778006]
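Editorial illustration of the Case-2 tables in the comment above: a minimal Java sketch, not Kafka code, showing why the same fetch offset resolves to different leader epochs depending on which broker's lineage is consulted. The class name EpochLineageDemo and its helper methods are hypothetical; only the epoch and start-offset values are taken from the comment.

```java
import java.util.Collections;
import java.util.Map;
import java.util.OptionalInt;
import java.util.TreeMap;

// Hypothetical helper class for illustration only; not part of Kafka.
class EpochLineageDemo {

    // Builds a lineage (leader epoch -> start offset) from alternating epoch/offset pairs.
    static TreeMap<Integer, Long> lineage(long... epochAndOffsetPairs) {
        TreeMap<Integer, Long> m = new TreeMap<>();
        for (int i = 0; i + 1 < epochAndOffsetPairs.length; i += 2) {
            m.put((int) epochAndOffsetPairs[i], epochAndOffsetPairs[i + 1]);
        }
        return m;
    }

    // Broker A at T1 and Broker B at T2, as listed in the comment.
    static TreeMap<Integer, Long> brokerA() { return lineage(0, 0, 1, 180, 2, 400); }
    static TreeMap<Integer, Long> brokerB() { return lineage(3, 0, 4, 780, 6, 900, 7, 990); }

    // Returns the epoch that owns `offset`: the highest epoch whose start offset
    // is <= offset. Empty if the offset precedes the whole lineage.
    // Assumes start offsets do not decrease as epochs increase.
    static OptionalInt epochForOffset(TreeMap<Integer, Long> lineage, long offset) {
        OptionalInt owner = OptionalInt.empty();
        for (Map.Entry<Integer, Long> e : lineage.entrySet()) {
            if (e.getValue() <= offset) {
                owner = OptionalInt.of(e.getKey());
            } else {
                break;
            }
        }
        return owner;
    }

    public static void main(String[] args) {
        long fetchOffset = 200L;
        System.out.println("A resolves offset " + fetchOffset + " to epoch "
                + epochForOffset(brokerA(), fetchOffset).getAsInt());
        System.out.println("B resolves offset " + fetchOffset + " to epoch "
                + epochForOffset(brokerB(), fetchOffset).getAsInt());
        // The two lineages share no epoch, so neither is a prefix of the other.
        System.out.println("Lineages disjoint: "
                + Collections.disjoint(brokerA().keySet(), brokerB().keySet()));
    }
}
```

Because the two lineages have no epoch in common, there is no common log prefix for the brokers to agree on until they interact, which is the situation the comment says the original example is limited to.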
[jira] [Commented] (KAFKA-15376) Explore options of removing data earlier to the current leader's leader epoch lineage for topics enabled with tiered storage.
[ https://issues.apache.org/jira/browse/KAFKA-15376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17785869#comment-17785869 ]

Divij Vaidya commented on KAFKA-15376:
--------------------------------------

Hey [~ckamal]

I think the motivation of this ticket is to determine whether there are alternative options for removing leader epochs. As an example, in the current implementation, if a non-current leader epoch chain becomes current, we will end up losing data in remote storage. With this ticket we wanted to explore whether we can choose to retain all non-current leader epoch chains (for the case when one of them may become the current chain) and leave the garbage collection to RSM. You can find the context of the conversation at [https://github.com/apache/kafka/pull/13561#discussion_r1298029817]

I am re-opening this ticket. Please add your thoughts here if you still think that this should be resolved.

> Explore options of removing data earlier to the current leader's leader epoch
> lineage for topics enabled with tiered storage.
> -----------------------------------------------------------------------------
>
> Key: KAFKA-15376
> URL: https://issues.apache.org/jira/browse/KAFKA-15376
> Project: Kafka
> Issue Type: Task
> Components: core
> Reporter: Satish Duggana
> Assignee: Satish Duggana
> Priority: Major
> Fix For: 3.7.0
>
> Followup on the discussion thread:
> [https://github.com/apache/kafka/pull/13561#discussion_r1288778006]
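Editorial illustration of the trade-off described in the comment above: a minimal Java sketch of how remote segments might be classified against the current leader's epoch lineage. This is a simplified assumption, not the check Kafka actually performs. The Segment class, the segment ids, and the rule used here (every epoch in the segment must exist in the leader's lineage, starting at or after the leader's start offset for that epoch) are all hypothetical, and end-offset checks are deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration; not Kafka's RemoteLogManager logic.
class SegmentLineageCheck {

    // Simplified stand-in for remote segment metadata: an id plus the
    // leader epochs the segment covers, each mapped to its start offset.
    static final class Segment {
        final String id;
        final Map<Integer, Long> epochStartOffsets;

        Segment(String id, Map<Integer, Long> epochStartOffsets) {
            this.id = id;
            this.epochStartOffsets = epochStartOffsets;
        }
    }

    // True if every epoch in the segment is known to the leader and does not
    // start before the leader's start offset for that epoch (simplified rule).
    static boolean isWithinLineage(Segment segment, Map<Integer, Long> leaderLineage) {
        for (Map.Entry<Integer, Long> e : segment.epochStartOffsets.entrySet()) {
            Long leaderStart = leaderLineage.get(e.getKey());
            if (leaderStart == null || e.getValue() < leaderStart) {
                return false;
            }
        }
        return true;
    }

    // Ids of segments that the current lineage does not reference. Under the
    // option discussed in the ticket these would be retained, with cleanup
    // left to the RemoteStorageManager, instead of being deleted.
    static List<String> unreferenced(List<Segment> segments, Map<Integer, Long> leaderLineage) {
        List<String> ids = new ArrayList<>();
        for (Segment s : segments) {
            if (!isWithinLineage(s, leaderLineage)) {
                ids.add(s.id);
            }
        }
        return ids;
    }

    // Current leader lineage (epoch -> start offset); values are illustrative.
    static Map<Integer, Long> currentLeaderLineage() {
        Map<Integer, Long> m = new TreeMap<>();
        m.put(3, 0L);
        m.put(4, 780L);
        m.put(6, 900L);
        m.put(7, 990L);
        return m;
    }

    static List<Segment> exampleSegments() {
        List<Segment> segments = new ArrayList<>();
        // Uploaded under an older chain that the current leader never saw.
        segments.add(new Segment("uploaded-under-A", Map.of(1, 180L, 2, 400L)));
        // Uploaded under the current chain.
        segments.add(new Segment("uploaded-under-B", Map.of(4, 780L)));
        return segments;
    }

    public static void main(String[] args) {
        System.out.println("Unreferenced segments: "
                + unreferenced(exampleSegments(), currentLeaderLineage()));
    }
}
```

The sketch only separates referenced from unreferenced segments; whether the unreferenced ones are deleted immediately or kept in case their chain becomes current again is the design question the ticket leaves open.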
[jira] [Commented] (KAFKA-15376) Explore options of removing data earlier to the current leader's leader epoch lineage for topics enabled with tiered storage.
[ https://issues.apache.org/jira/browse/KAFKA-15376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17780815#comment-17780815 ]

hudeqi commented on KAFKA-15376:
--------------------------------

Can I take over this issue? [~satish.duggana]

> Explore options of removing data earlier to the current leader's leader epoch
> lineage for topics enabled with tiered storage.
> -----------------------------------------------------------------------------
>
> Key: KAFKA-15376
> URL: https://issues.apache.org/jira/browse/KAFKA-15376
> Project: Kafka
> Issue Type: Task
> Components: core
> Reporter: Satish Duggana
> Priority: Major
> Fix For: 3.7.0
>
> Followup on the discussion thread:
> [https://github.com/apache/kafka/pull/13561#discussion_r1288778006]