Re: [DISCUSS] KIP-704: Send a hint to broker if it is an unclean leader

2022-02-07 Thread José Armando García Sancio
You are correct Raman. I updated the KIP to reflect your observations and corrections. Raman wrote: > I think the line #2 here is wrong with regard to AlterPartitionRequest. Yes. Bullet 2. Now reads: 2. The LeaderRecoveryState is changing from RECOVERED to RECOVERING. Raman wrote: > Also, in

Re: [DISCUSS] KIP-704: Send a hint to broker if it is an unclean leader

2022-02-07 Thread Raman Verma
Also, in the response schema, I think the last line is a bit misleading. As we discussed offline, the field can be either Recovering or Recovered, depending on when the leader sent out an AlterPartition request to the controller (i.e. during recovery or after recovery has completed). But the

Re: [DISCUSS] KIP-704: Send a hint to broker if it is an unclean leader

2022-02-07 Thread Raman Verma
Jose, I think the line #2 here is wrong with regard to AlterPartitionRequest. ``` Request Handling The controller will validate the LeaderRecoveryState field and return an INVALID_REQUEST error when: 1. The size of the ISR is greater than 1 and the LeaderRecoveryState is RECOVERING. 2. The

Re: [DISCUSS] KIP-704: Send a hint to broker if it is an unclean leader

2022-02-04 Thread José Armando García Sancio
David Jacot wrote: > The behavior of the leader is clear. However, the part which is > not clear to me is how followers which could get fetch requests > from consumers as well will handle them. Sorry if I was not clear. Got it. I updated the KIP to add more information regarding how the topic

Re: [DISCUSS] KIP-704: Send a hint to broker if it is an unclean leader

2022-02-03 Thread David Jacot
Hi José, Thanks for your reply. The behavior of the leader is clear. However, the part which is not clear to me is how followers which could get fetch requests from consumers as well will handle them. Sorry if I was not clear. Best, David On Thu, Feb 3, 2022 at 5:47 AM José Armando García

Re: [DISCUSS] KIP-704: Send a hint to broker if it is an unclean leader

2022-02-02 Thread José Armando García Sancio
Thanks for the feedback David Jacot David Jacot wrote: > I have one question regarding how fetch from followers will > work when the leader is recovering. My understanding is that > the leader will reject any produce and fetch requests with a > NOT_LEADER_OR_FOLLOWER error while the followers >

Re: [DISCUSS] KIP-704: Send a hint to broker if it is an unclean leader

2022-02-02 Thread David Jacot
Hi José, Thanks for the KIP. I have one question regarding how fetch from followers will work when the leader is recovering. My understanding is that the leader will reject any produce and fetch requests with a NOT_LEADER_OR_FOLLOWER error while the followers will fence any fetch requests based

Re: [DISCUSS] KIP-704: Send a hint to broker if it is an unclean leader

2022-01-28 Thread José Armando García Sancio
Hi all, Jason and I discussed this offline. At a high-level I have made the following changes to the KIP. 1. IBP will be used to enable this feature and to determine which version of LeaderAndIsr and AlterPartition will be used. 2. The LeaderRecoveryState field for LeaderAndIsr and

Re: [DISCUSS] KIP-704: Send a hint to broker if it is an unclean leader

2022-01-27 Thread José Armando García Sancio
Hi Jason, Jason wrote: > Thanks for the updates. I noticed that `LeaderRecoveryState` is marked as > ignorable in the `AlterPartition` request. It would be helpful to > understand the motivation for that. I think it is fine for this property to be marked as ignorable because the property is

Re: [DISCUSS] KIP-704: Send a hint to broker if it is an unclean leader

2022-01-26 Thread Jason Gustafson
Hey Jose, Thanks for the updates. I noticed that `LeaderRecoveryState` is marked as ignorable in the `AlterPartition` request. It would be helpful to understand the motivation for that. Thanks, Jason On Wed, Jan 26, 2022 at 10:25 AM Colin McCabe wrote: > On Wed, Jan 26, 2022, at 09:14, José

Re: [DISCUSS] KIP-704: Send a hint to broker if it is an unclean leader

2022-01-26 Thread Colin McCabe
On Wed, Jan 26, 2022, at 09:14, José Armando García Sancio wrote: > Thanks for the feedback Colin. > > Colin wrote: >> We already have many classes that are called "partition state." For example, >> PartitionStates.java on the client side, PartitionStateMachine.scala and >>

Re: [DISCUSS] KIP-704: Send a hint to broker if it is an unclean leader

2022-01-26 Thread José Armando García Sancio
Thanks for the feedback Colin. Colin wrote: > We already have many classes that are called "partition state." For example, > PartitionStates.java on the client side, PartitionStateMachine.scala and > TopicPartitionStateZNode in the old controller, > RemotePartitionDeleteState.java under

Re: [DISCUSS] KIP-704: Send a hint to broker if it is an unclean leader

2022-01-25 Thread Colin McCabe
On Fri, Jan 21, 2022, at 11:07, José Armando García Sancio wrote: > Hi all, > > The following suggestions are not strictly required to implement this > KIP but what do we think about: > > 1. Changing the name of the AlterIsr RPC to AlterPartition RPC. > > 2. Change the name of the field

Re: [DISCUSS] KIP-704: Send a hint to broker if it is an unclean leader

2022-01-24 Thread José Armando García Sancio
Thanks for the additional context regarding AlterIsrResponse. Jason wrote: > In regard to the naming of `IsLeaderRecovering`, I agree it still seems a > bit awkward. I kind of liked the idea of turning it into a `PartitionState` > field instead. That would also address the inconsistent type in

Re: [DISCUSS] KIP-704: Send a hint to broker if it is an unclean leader

2022-01-21 Thread Jason Gustafson
For a little background on why `AlterIsr` returns the state back in the response, originally the idea was that the partition leader could use the response to reset its own state after a failed request. The tricky thing for ISR changes is always ensuring that the partition leader reflects a

Re: [DISCUSS] KIP-704: Send a hint to broker if it is an unclean leader

2022-01-21 Thread José Armando García Sancio
Hi all, The following suggestions are not strictly required to implement this KIP but what do we think about: 1. Changing the name of the AlterIsr RPC to AlterPartition RPC. 2. Change the name of the field "CurrentIsrVersion" to "PartitionEpoch". This is the name that we use in the KRaft

Re: [DISCUSS] KIP-704: Send a hint to broker if it is an unclean leader

2022-01-21 Thread José Armando García Sancio
Thanks Raman and Colin for your feedback. Raman wrote: > - Could you please explain the following about backward compatibility. > If a leader has been elected unclean. And we decide to roll the > cluster back when the leader is in the middle of recovery, leader will > simply not be able to

Re: [DISCUSS] KIP-704: Send a hint to broker if it is an unclean leader

2022-01-20 Thread Colin McCabe
Hi José, Thanks for the changes. "isLeaderRecovering" sounds pretty awkward. If we want to call this "leader recovery" then maybe the flag could be something like "inLeaderRecovery." Actually, how about "inElectionRecovery" to emphasize the fact that we are recovering from an unclean leader

Re: [DISCUSS] KIP-704: Send a hint to broker if it is an unclean leader

2022-01-20 Thread Raman Verma
Jose, Is there a specific need for adding "isUnclean '' flag to `AlterIsrResponse`. A potential sequence of events will be: 1. Controller sets the flag at `ZK` and informs the leader via the `LeaderAndIsrRequest` 2. Leader undertakes the necessary recovery and sends out a `AlterIsrRequest` to

Re: [DISCUSS] KIP-704: Send a hint to broker if it is an unclean leader

2022-01-19 Thread Raman Verma
Thanks for this KIP, Jose - Could you please explain the following about backward compatibility. If a leader has been elected unclean. And we decide to roll the cluster back when the leader is in the middle of recovery, leader will simply not be able to recover when we roll back because it will

Re: [DISCUSS] KIP-704: Send a hint to broker if it is an unclean leader

2022-01-19 Thread José Armando García Sancio
Hi all, I made the following changes to the KIP: https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=173082256=12=11 Some of the highlights are: 1. Changed the field from IsUnclean to IsLeaderRecovering 2. Added a few more sentences explaining why this KIP is backward

Re: [DISCUSS] KIP-704: Send a hint to broker if it is an unclean leader

2022-01-19 Thread José Armando García Sancio
Thanks for the feedback Colin and Luke. Colin wrote: > The KIP talks a bit about "recovery," which is a new term (as far as I > know). If I understand correctly, this is a state that the partition enters > into after an unclean leader election. I would suggest using a different > term for this,

Re: [DISCUSS] KIP-704: Send a hint to broker if it is an unclean leader

2022-01-19 Thread Luke Chen
Hi José, Thanks for the KIP. It's indeed a problem that we should handle it well! Just a minor comment: In the `PartitionChangeRecord` schema, you set the `isUnclean` field as type boolean, but the value (-1, 0, 1) are all int values. Did you mean int32 type or the value should be updated?

Re: [DISCUSS] KIP-704: Send a hint to broker if it is an unclean leader

2022-01-18 Thread Colin McCabe
Hi José, Thanks for the KIP. The KIP talks a bit about "recovery," which is a new term (as far as I know). If I understand correctly, this is a state that the partition enters into after an unclean leader election. I would suggest using a different term for this, since we already use the term

Re: [DISCUSS] KIP-704: Send a hint to broker if it is an unclean leader

2022-01-17 Thread José Armando García Sancio
Jose wrote: > I'll update the KIP with this information. The leader will return > "NOT_LEADER_OR_FOLLOWER" for any partition that is still recovering > for Fetch, Produce, OffsetsForLeaderEpoch and DeleteRecords requests. > This error type is retriable by the clients. I forgot to include

Re: [DISCUSS] KIP-704: Send a hint to broker if it is an unclean leader

2022-01-17 Thread José Armando García Sancio
Thanks Jason and David for your feedback. See my comments below. David wrote: > 1) Does recovering from an unclean state bump the leader epoch? Looking at the controller code, the leader epoch is only increased if the leader id changes. David wrote: > 2) The name of "NewIsUnclean" field in

Re: [DISCUSS] KIP-704: Send a hint to broker if it is an unclean leader

2022-01-12 Thread David Arthur
Jose, thanks for the KIP! 1) Does recovering from an unclean state bump the leader epoch? 2) The name of "NewIsUnclean" field in AlterIsrRequest is a little strange. >From the description, it sounds like this will be used to by the broker to indicate to the controller that it has recovered from

Re: [DISCUSS] KIP-704: Send a hint to broker if it is an unclean leader

2022-01-11 Thread Jason Gustafson
Hi Jose, Thanks for the KIP. Just a minor question about this: > This means that the leader will not allow followers to join the ISR until it has recovered from the unclean leader election. If I understand correctly, the main reason for this is to avoid the need to propagate the "IsUnclean"

[DISCUSS] KIP-704: Send a hint to broker if it is an unclean leader

2022-01-10 Thread José Armando García Sancio
Hi all, I would like to open the discussion on implementing "KIP-704: Send a hint to broker if it is an unclean leader." See this wiki page for details: https://cwiki.apache.org/confluence/x/kAZRCg Thanks! -- -Jose