Re: [DISCISS] KIP-860: Add client-provided option to guard against unintentional replication factor change during partition reassignments

2022-09-09 Thread Stanislav Kozlovski
Thanks Ismael, I added an extra paragraph in the motivation. We have certainly hit this within our internal Confluent reassignment software and from a quick skim in the popular Cruise Control repository, I notice that similar problems have been hit there too. Hopefully the examples in the KIP are

Re: [DISCISS] KIP-860: Add client-provided option to guard against unintentional replication factor change during partition reassignments

2022-09-07 Thread Ismael Juma
Thanks for the details, Colin. I understand how this can happen. But this API has been out for a long time. Are we saying that we have seen Cruise Control cause this kind of problem? If so, it would be good to mention it in the KIP as evidence that the current approach is brittle. Ismael On Wed,

Re: [DISCISS] KIP-860: Add client-provided option to guard against unintentional replication factor change during partition reassignments

2022-09-07 Thread Colin McCabe
Hi Ismael, I think this issue comes up when people write software that automatically creates partition reassignments to balance the cluster. Cruise Control is one example; Confluent also has some software that does this. If there is already a reassignment that is going on for some partition

Re: [DISCISS] KIP-860: Add client-provided option to guard against unintentional replication factor change during partition reassignments

2022-09-07 Thread Ismael Juma
Thanks for the KIP. Can we explain a bit more why this is an important use case to address? For example, do we have concrete examples of people running into this? The way the KIP is written, it sounds like a potential problem but no information is given on whether it's a real problem in practice.

Re: [DISCISS] KIP-860: Add client-provided option to guard against unintentional replication factor change during partition reassignments

2022-08-22 Thread Stanislav Kozlovski
Thanks David, I do prefer the `disallow-replication-factor-change` flag but only for the CLI. I assume that's what you're proposing instead of "disable-replication-factor-change". The wording is more natural in your suggestion I feel. If we were to modify more (e.g RPC, Admin API), I think it'd

Re: [DISCISS] KIP-860: Add client-provided option to guard against unintentional replication factor change during partition reassignments

2022-08-16 Thread David Jacot
Thanks Stan. Overall, the KIP looks good to me. I have two minor comments: * Could we document the different versions in the AlterPartitionReassignmentsRequest/Request schema? You can look at other requests/responses to see how we have done this so far. * I wonder if

Re: [DISCISS] KIP-860: Add client-provided option to guard against unintentional replication factor change during partition reassignments

2022-08-15 Thread Stanislav Kozlovski
Thanks for the discussion all, I have updated the KIP to mention throwing an UnsupportedVersionException if the server is running an old version that would not honor the configured allowReplicationFactor option. Please take a look: - KIP:

Re: [DISCISS] KIP-860: Add client-provided option to guard against unintentional replication factor change during partition reassignments

2022-08-09 Thread David Jacot
Throwing an UnsupportedVersionException with an appropriate message seems to be the best option when the new API is not supported and AllowReplicationFactorChange is not set to the default value. Cheers, David On Mon, Aug 8, 2022 at 6:25 PM Vikas Singh wrote: > > I personally like the UVE

Re: [DISCISS] KIP-860: Add client-provided option to guard against unintentional replication factor change during partition reassignments

2022-08-08 Thread Vikas Singh
I personally like the UVE option. It provides options for clients to go either way, retry or abort. If we do it in AdminClient, then users have to live with what we have chosen. > Note this can happen during an RF change too. e.g [1,2,3] => [4,5,6,7] (RF > change, intermediate set is

Re: [DISCISS] KIP-860: Add client-provided option to guard against unintentional replication factor change during partition reassignments

2022-08-07 Thread Stanislav Kozlovski
Thank you for the reviews. Vikas, > > In the case of an already-reassigning partition being reassigned again, the validation compares the targetReplicaSet size of the reassignment to the targetReplicaSet size of the new reassignment and throws if those differ. > Can you add more detail to this,

Re: [DISCISS] KIP-860: Add client-provided option to guard against unintentional replication factor change during partition reassignments

2022-08-05 Thread Colin McCabe
Hi Stanislav, Thanks for the KIP. I think this is a nice solution to the problem of not wanting to change the replication factor during reassignments. Just from a writing point of view, it would be nice for the first paragraph to be a bit more explicit about this goal. Maybe lead with "Many

RE: [DISCISS] KIP-860: Add client-provided option to guard against unintentional replication factor change during partition reassignments

2022-08-05 Thread Daniel Gospodinov
Hi Stanislav, Thanks for offering a solution to reduce race conditions in Kafka. I went through your Wiki and your proposal sounds reasonable to me. I like it. Best, Daniel On 2022/07/28 08:59:18 Stanislav Kozlovski wrote: > Hey all, > > I'd like to start a discussion on a proposal to help

Re: [DISCISS] KIP-860: Add client-provided option to guard against unintentional replication factor change during partition reassignments

2022-08-04 Thread Vikas Singh
Thanks Stanislav for the KIP. Seems like a reasonable proposal, preventing users from accidentally altering the replica set under certain conditions. I have couple of comments: > In the case of an already-reassigning partition being reassigned again, the validation compares the targetReplicaSet

[DISCISS] KIP-860: Add client-provided option to guard against unintentional replication factor change during partition reassignments

2022-07-28 Thread Stanislav Kozlovski
Hey all, I'd like to start a discussion on a proposal to help API users from inadvertently increasing the replication factor of a topic through the alter partition reassignments API. The KIP describes two fairly easy-to-hit race conditions in which this can happen. The KIP itself is pretty