Thanks for the discussion all, I have updated the KIP to mention throwing an UnsupportedVersionException if the server is running an old version that would not honor the configured allowReplicationFactor option.
Please take a look: - KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-860%3A+Add+client-provided+option+to+guard+against+replication+factor+change+during+partition+reassignments - changes: https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=217392873&selectedPageVersions=4&selectedPageVersions=3 If there aren't extra comments, I plan on starting a vote thread by the end of this week. Best, Stanislav On Tue, Aug 9, 2022 at 5:06 AM David Jacot <dja...@confluent.io.invalid> wrote: > Throwing an UnsupportedVersionException with an appropriate message > seems to be the best option when the new API is not supported and > AllowReplicationFactorChange is not set to the default value. > > Cheers, > David > > On Mon, Aug 8, 2022 at 6:25 PM Vikas Singh <vi...@confluent.io.invalid> > wrote: > > > > I personally like the UVE option. It provides options for clients to go > > either way, retry or abort. If we do it in AdminClient, then users have > to > > live with what we have chosen. > > > > > Note this can happen during an RF change too. e.g [1,2,3] => [4,5,6,7] > (RF > > > change, intermediate set is [1,2,3,4,5,6,7]), and we try to do a > > > reassignment to [9,10,11], the logic will compare [4,5,6,7] to > [9,10,11]. > > > In such a situation where one wants to cancel the RF increase and > reassign > > > again, one first needs to cancel the existing reassignment via the API > (no > > > special action required despite RF change) > > > > Thanks for the explanation. I did realize this nuance and thus requested > to > > put that in KIP as it's not mentioned why the choice was made. I am fine > if > > you choose to not do it in the interest of brevity. > > > > Vikas > > > > On Sun, Aug 7, 2022 at 9:02 AM Stanislav Kozlovski > > <stanis...@confluent.io.invalid> wrote: > > > > > Thank you for the reviews. > > > > > > Vikas, > > > > > In the case of an already-reassigning partition being reassigned > again, > > > the validation compares the targetReplicaSet size of the reassignment > to > > > the targetReplicaSet size of the new reassignment and throws if those > > > differ. > > > > Can you add more detail to this, or clarify what is targetReplicaSet > (for > > > e.g. why not sourceReplicaSet?) and how the target replica set will be > > > calculated? > > > If a reassignment is ongoing, such that [1,2,3] => [4,5,6] (the > replica set > > > in Kafka will be [1,2,3,4,5,6] during the reassignment), and you try to > > > issue a new reassignment (e.g [7,8,9], Kafka should NOT think that the > RF > > > of the partition is 6 just because a reassignment is ongoing. Hence, we > > > compare [4,5,6]'s length to [7,8,9] > > > The targetReplicaSet is a term we use in KIP-455 > > > < > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-455%3A+Create+an+Administrative+API+for+Replica+Reassignment > > > >. > > > It means the desired replica set that a given reassignment is trying to > > > achieve. Here we compare said set of the on-going reassignment to the > new > > > reassignment. > > > > > > Note this can happen during an RF change too. e.g [1,2,3] => [4,5,6,7] > (RF > > > change, intermediate set is [1,2,3,4,5,6,7]), and we try to do a > > > reassignment to [9,10,11], the logic will compare [4,5,6,7] to > [9,10,11]. > > > In such a situation where one wants to cancel the RF increase and > reassign > > > again, one first needs to cancel the existing reassignment via the API > (no > > > special action required despite RF change) > > > > > > > > > > And what about the reassign partitions CLI? Do we want to expose the > > > option there too? > > > Yes, this is already present in the KIP if I'm not mistaken. We > describe it > > > in "Accordingly, the kafka-reassign-partitions.sh tool will be updated > to > > > allow supplying the new option:" > > > I have edited the KIP to contain two clear paragraphs called Admin API > and > > > CLI now. > > > > > > Colin, > > > > > > > it would be nice for the first paragraph to be a bit more explicit > about > > > this goal. > > > sounds good, updated it with that suggestion. > > > > > > > client-side forward compatibility > > > I was under the assumption that it is not recommended to upgrade > clients > > > before brokers, but a quick search cleared it up to me that we're > pretty > > > intentional about allowing that > > > < > > > > https://www.confluent.io/blog/upgrading-apache-kafka-clients-just-got-easier/ > > > > > > > . > > > Do you happen to know if we have any policy on client-side forward > > > compatibility with regard to such things -- extending "write" APIs > (that > > > mutate the state) with fields that conditionally limit that > modification? > > > It seems like a rare use case to me, hence renaming it to something > like > > > tryDisableReplicationFactorChange may unnecessary impair the API. > > > > > > Would Admin API documentation that says "this is supported only from > > > brokers on version X and above. If not supported, the default behavior > (no > > > replication factor guards) is applied" be sufficient? > > > > > > Best, > > > Stanislav > > > > > > > > > On Fri, Aug 5, 2022 at 8:32 PM Colin McCabe <cmcc...@apache.org> > wrote: > > > > > > > Hi Stanislav, > > > > > > > > Thanks for the KIP. I think this is a nice solution to the problem > of not > > > > wanting to change the replication factor during reassignments. > > > > > > > > Just from a writing point of view, it would be nice for the first > > > > paragraph to be a bit more explicit about this goal. Maybe lead with > > > "Many > > > > times, we don't want to change the replication factor of a partition > > > during > > > > reassignment..." As it is, we're talking about metadata races before > > > we've > > > > even explained what the goal is that the metadata races are > thwarting. :) > > > > > > > > I like the RPC and command-line format changes; they are well-done > and > > > > well-written. One thing we do need to spell out, those, is what the > > > > behavior is when the server does not support this new option. The > > > simplest > > > > thing to do would be for the client to throw > UnsupportedVersionException > > > > with an exception message indicating what the problem is. Then the > caller > > > > could catch this and re-try the call without the flag (or give up, as > > > > appropriate?) > > > > > > > > The other option is to continue on but not actually protect > replication > > > > factor. If we do this, at minimum we'd need to rename the flag > something > > > > like "try to protect replication factor" to make it clear that it's > > > > best-effort. > > > > > > > > It's sort of debatable which way is better. In principle the UVE > sounds > > > > nicer, but in practice maybe the other behavior is best? I suspect > most > > > > systems would turn around and retry without the flag in the event of > a > > > > UVE... > > > > > > > > best, > > > > Colin > > > > > > > > > > > > On Thu, Aug 4, 2022, at 13:37, Vikas Singh wrote: > > > > > Thanks Stanislav for the KIP. Seems like a reasonable proposal, > > > > > preventing users from accidentally altering the replica set under > > > certain > > > > > conditions. I have couple of comments: > > > > > > > > > > > > > > >> In the case of an already-reassigning partition being reassigned > > > again, > > > > > the validation compares the targetReplicaSet size of the > reassignment > > > to > > > > > the targetReplicaSet size of the new reassignment and throws if > those > > > > > differ. > > > > > Can you add more detail to this, or clarify what is > targetReplicaSet > > > (for > > > > > e.g. why not sourceReplicaSet?) and how the target replica set > will be > > > > > calculated? > > > > > > > > > > And what about the reassign partitions CLI? Do we want to expose > the > > > > option > > > > > there too? > > > > > > > > > > Cheers, > > > > > Vikas > > > > > > > > > > On Thu, Jul 28, 2022 at 1:59 AM Stanislav Kozlovski < > > > > stanis...@confluent.io> > > > > > wrote: > > > > > > > > > >> Hey all, > > > > >> > > > > >> I'd like to start a discussion on a proposal to help API users > from > > > > >> inadvertently increasing the replication factor of a topic through > > > > >> the alter partition reassignments API. The KIP describes two > fairly > > > > >> easy-to-hit race conditions in which this can happen. > > > > >> > > > > >> The KIP itself is pretty simple, yet has a couple of alternatives > that > > > > can > > > > >> help solve the same problem. I would appreciate thoughts from the > > > > community > > > > >> on how you think we should proceed, and whether the proposal makes > > > > sense in > > > > >> the first place. > > > > >> > > > > >> Thanks! > > > > >> > > > > >> KIP: > > > > >> > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-860%3A+Add+client-provided+option+to+guard+against+replication+factor+change+during+partition+reassignments > > > > >> JIRA: https://issues.apache.org/jira/browse/KAFKA-14121 > > > > >> > > > > >> -- > > > > >> Best, > > > > >> Stanislav > > > > >> > > > > > > > > > > > > > -- > > > Best, > > > Stanislav > > > > -- Best, Stanislav