Thanks Stan. Overall, the KIP looks good to me. I have two minor comments: * Could we document the different versions in the AlterPartitionReassignmentsRequest/Request schema? You can look at other requests/responses to see how we have done this so far.
* I wonder if --disallow-replication-factor-change would be a better name. I don't feel strong about this so I am happy to go with the quorum here. Best, David On Tue, Aug 16, 2022 at 12:31 AM Stanislav Kozlovski <stanis...@confluent.io.invalid> wrote: > > Thanks for the discussion all, > > I have updated the KIP to mention throwing an UnsupportedVersionException > if the server is running an old version that would not honor the configured > allowReplicationFactor option. > > Please take a look: > - KIP: > https://cwiki.apache.org/confluence/display/KAFKA/KIP-860%3A+Add+client-provided+option+to+guard+against+replication+factor+change+during+partition+reassignments > - changes: > https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=217392873&selectedPageVersions=4&selectedPageVersions=3 > > If there aren't extra comments, I plan on starting a vote thread by the end > of this week. > > Best, > Stanislav > > On Tue, Aug 9, 2022 at 5:06 AM David Jacot <dja...@confluent.io.invalid> > wrote: > > > Throwing an UnsupportedVersionException with an appropriate message > > seems to be the best option when the new API is not supported and > > AllowReplicationFactorChange is not set to the default value. > > > > Cheers, > > David > > > > On Mon, Aug 8, 2022 at 6:25 PM Vikas Singh <vi...@confluent.io.invalid> > > wrote: > > > > > > I personally like the UVE option. It provides options for clients to go > > > either way, retry or abort. If we do it in AdminClient, then users have > > to > > > live with what we have chosen. > > > > > > > Note this can happen during an RF change too. e.g [1,2,3] => [4,5,6,7] > > (RF > > > > change, intermediate set is [1,2,3,4,5,6,7]), and we try to do a > > > > reassignment to [9,10,11], the logic will compare [4,5,6,7] to > > [9,10,11]. > > > > In such a situation where one wants to cancel the RF increase and > > reassign > > > > again, one first needs to cancel the existing reassignment via the API > > (no > > > > special action required despite RF change) > > > > > > Thanks for the explanation. I did realize this nuance and thus requested > > to > > > put that in KIP as it's not mentioned why the choice was made. I am fine > > if > > > you choose to not do it in the interest of brevity. > > > > > > Vikas > > > > > > On Sun, Aug 7, 2022 at 9:02 AM Stanislav Kozlovski > > > <stanis...@confluent.io.invalid> wrote: > > > > > > > Thank you for the reviews. > > > > > > > > Vikas, > > > > > > In the case of an already-reassigning partition being reassigned > > again, > > > > the validation compares the targetReplicaSet size of the reassignment > > to > > > > the targetReplicaSet size of the new reassignment and throws if those > > > > differ. > > > > > Can you add more detail to this, or clarify what is targetReplicaSet > > (for > > > > e.g. why not sourceReplicaSet?) and how the target replica set will be > > > > calculated? > > > > If a reassignment is ongoing, such that [1,2,3] => [4,5,6] (the > > replica set > > > > in Kafka will be [1,2,3,4,5,6] during the reassignment), and you try to > > > > issue a new reassignment (e.g [7,8,9], Kafka should NOT think that the > > RF > > > > of the partition is 6 just because a reassignment is ongoing. Hence, we > > > > compare [4,5,6]'s length to [7,8,9] > > > > The targetReplicaSet is a term we use in KIP-455 > > > > < > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-455%3A+Create+an+Administrative+API+for+Replica+Reassignment > > > > >. > > > > It means the desired replica set that a given reassignment is trying to > > > > achieve. Here we compare said set of the on-going reassignment to the > > new > > > > reassignment. > > > > > > > > Note this can happen during an RF change too. e.g [1,2,3] => [4,5,6,7] > > (RF > > > > change, intermediate set is [1,2,3,4,5,6,7]), and we try to do a > > > > reassignment to [9,10,11], the logic will compare [4,5,6,7] to > > [9,10,11]. > > > > In such a situation where one wants to cancel the RF increase and > > reassign > > > > again, one first needs to cancel the existing reassignment via the API > > (no > > > > special action required despite RF change) > > > > > > > > > > > > > And what about the reassign partitions CLI? Do we want to expose the > > > > option there too? > > > > Yes, this is already present in the KIP if I'm not mistaken. We > > describe it > > > > in "Accordingly, the kafka-reassign-partitions.sh tool will be updated > > to > > > > allow supplying the new option:" > > > > I have edited the KIP to contain two clear paragraphs called Admin API > > and > > > > CLI now. > > > > > > > > Colin, > > > > > > > > > it would be nice for the first paragraph to be a bit more explicit > > about > > > > this goal. > > > > sounds good, updated it with that suggestion. > > > > > > > > > client-side forward compatibility > > > > I was under the assumption that it is not recommended to upgrade > > clients > > > > before brokers, but a quick search cleared it up to me that we're > > pretty > > > > intentional about allowing that > > > > < > > > > > > https://www.confluent.io/blog/upgrading-apache-kafka-clients-just-got-easier/ > > > > > > > > > . > > > > Do you happen to know if we have any policy on client-side forward > > > > compatibility with regard to such things -- extending "write" APIs > > (that > > > > mutate the state) with fields that conditionally limit that > > modification? > > > > It seems like a rare use case to me, hence renaming it to something > > like > > > > tryDisableReplicationFactorChange may unnecessary impair the API. > > > > > > > > Would Admin API documentation that says "this is supported only from > > > > brokers on version X and above. If not supported, the default behavior > > (no > > > > replication factor guards) is applied" be sufficient? > > > > > > > > Best, > > > > Stanislav > > > > > > > > > > > > On Fri, Aug 5, 2022 at 8:32 PM Colin McCabe <cmcc...@apache.org> > > wrote: > > > > > > > > > Hi Stanislav, > > > > > > > > > > Thanks for the KIP. I think this is a nice solution to the problem > > of not > > > > > wanting to change the replication factor during reassignments. > > > > > > > > > > Just from a writing point of view, it would be nice for the first > > > > > paragraph to be a bit more explicit about this goal. Maybe lead with > > > > "Many > > > > > times, we don't want to change the replication factor of a partition > > > > during > > > > > reassignment..." As it is, we're talking about metadata races before > > > > we've > > > > > even explained what the goal is that the metadata races are > > thwarting. :) > > > > > > > > > > I like the RPC and command-line format changes; they are well-done > > and > > > > > well-written. One thing we do need to spell out, those, is what the > > > > > behavior is when the server does not support this new option. The > > > > simplest > > > > > thing to do would be for the client to throw > > UnsupportedVersionException > > > > > with an exception message indicating what the problem is. Then the > > caller > > > > > could catch this and re-try the call without the flag (or give up, as > > > > > appropriate?) > > > > > > > > > > The other option is to continue on but not actually protect > > replication > > > > > factor. If we do this, at minimum we'd need to rename the flag > > something > > > > > like "try to protect replication factor" to make it clear that it's > > > > > best-effort. > > > > > > > > > > It's sort of debatable which way is better. In principle the UVE > > sounds > > > > > nicer, but in practice maybe the other behavior is best? I suspect > > most > > > > > systems would turn around and retry without the flag in the event of > > a > > > > > UVE... > > > > > > > > > > best, > > > > > Colin > > > > > > > > > > > > > > > On Thu, Aug 4, 2022, at 13:37, Vikas Singh wrote: > > > > > > Thanks Stanislav for the KIP. Seems like a reasonable proposal, > > > > > > preventing users from accidentally altering the replica set under > > > > certain > > > > > > conditions. I have couple of comments: > > > > > > > > > > > > > > > > > >> In the case of an already-reassigning partition being reassigned > > > > again, > > > > > > the validation compares the targetReplicaSet size of the > > reassignment > > > > to > > > > > > the targetReplicaSet size of the new reassignment and throws if > > those > > > > > > differ. > > > > > > Can you add more detail to this, or clarify what is > > targetReplicaSet > > > > (for > > > > > > e.g. why not sourceReplicaSet?) and how the target replica set > > will be > > > > > > calculated? > > > > > > > > > > > > And what about the reassign partitions CLI? Do we want to expose > > the > > > > > option > > > > > > there too? > > > > > > > > > > > > Cheers, > > > > > > Vikas > > > > > > > > > > > > On Thu, Jul 28, 2022 at 1:59 AM Stanislav Kozlovski < > > > > > stanis...@confluent.io> > > > > > > wrote: > > > > > > > > > > > >> Hey all, > > > > > >> > > > > > >> I'd like to start a discussion on a proposal to help API users > > from > > > > > >> inadvertently increasing the replication factor of a topic through > > > > > >> the alter partition reassignments API. The KIP describes two > > fairly > > > > > >> easy-to-hit race conditions in which this can happen. > > > > > >> > > > > > >> The KIP itself is pretty simple, yet has a couple of alternatives > > that > > > > > can > > > > > >> help solve the same problem. I would appreciate thoughts from the > > > > > community > > > > > >> on how you think we should proceed, and whether the proposal makes > > > > > sense in > > > > > >> the first place. > > > > > >> > > > > > >> Thanks! > > > > > >> > > > > > >> KIP: > > > > > >> > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-860%3A+Add+client-provided+option+to+guard+against+replication+factor+change+during+partition+reassignments > > > > > >> JIRA: https://issues.apache.org/jira/browse/KAFKA-14121 > > > > > >> > > > > > >> -- > > > > > >> Best, > > > > > >> Stanislav > > > > > >> > > > > > > > > > > > > > > > > > -- > > > > Best, > > > > Stanislav > > > > > > > > > -- > Best, > Stanislav