Hi Justine, Thank you for the comment. Yes, that is what we do today. Let's say there's a record that has two tagged fields, x and y. If the coordinator's schema only supports field x, that will be the only field it reads.
Thanks, Jeff On Wed, Mar 29, 2023 at 12:43 PM Justine Olshan <jols...@confluent.io.invalid> wrote: > Hey Jeff, > > Going off of Alexandre's point -- if we have a tagged field in one version > but not another (or even in the same version we could have one with and one > without) -- are we simply checking if the tagged field exists in the > message before we try to read it? > I think this is the sort of thing we already do, but just wanted to > clarify. We don't need to distinguish versions since we just need to > distinguish if the tagged field is there or not. > > Thanks, > Justine > > > On Tue, Mar 28, 2023 at 3:42 PM Jeff Kim <jeff....@confluent.io.invalid> > wrote: > > > Hi Alexandre, > > > > Thanks for the feedback. > > > > 100. You're correct. I updated the KIP to explicitly state that once we > > introduce a new tagged field in a later version, that version can never > be > > downgraded to the listed versions. > > > > 101. I don't think there is a way to differentiate the two but also not > > sure if there will be a need to. > > > > Best, > > Jeff > > > > On Tue, Mar 28, 2023 at 2:13 PM Alexandre Dupriez < > > alexandre.dupr...@gmail.com> wrote: > > > > > Hi Jeff, > > > > > > Thank you for the fast answer! > > > > > > 100. Got it, I think I am starting to understand based on your example > > > exposing three Kafka versions. Please let me know if the following > > > corresponds to the correct interpretation. > > > > > > Taking OffsetCommitValue as an example, the idea is to bump the schema > > > to version 4 to make the record flexible. The schema version 4 will be > > > back-ported to versions 3.0.3, 3.1.3, 3.2.4, 3.3.3, 3.4.1 and > > > potentially 3.5. These versions will continue to serialise the record > > > with version 3 so that downgrades from there to earlier minor and/or > > > major versions are possible. And, these versions will deserialize the > > > records with version 4 so that they can support flexible records, > > > although they do not make use of tagged fields (if any is present). > > > > > > Then, in a future version, 3.x, x >= 6, a tag field will be added to > > > the record - say, topic id. Downgrade from 3.x to 3.0.3, 3.1.3, 3.2.4, > > > 3.3.3, 3.4.1 and 3.5 will be possible because these versions will > > > perform deserialisation with the record schema version 4, that is, > > > supporting tagged fields. > > > > > > Hence, this approach is both backward and forward looking and allows > > > to extend the scope of compatible versions for downgrades. > > > > > > [N] 3.x, x >= 6 --> downgradable to [I] 3.0.3, 3.1.3, 3.2.4, 3.3.3, > > > 3.4.1, 3.5 --> downgradable to e.g. [O] 3.0.0. > > > > > > One note though is that a transitive downgrade from 3.x, x >= 6, to > > > 3.0.0 via the version domain [I], will not be supported. Should it be > > > explicitly mentioned in the KIP that downgrades from 3.0.3, 3.1.3, > > > 3.2.4, 3.3.3, 3.4.1, 3.5 to earlier versions may not be possible (if > > > offsets or group metadata records were generated by a higher version > > > 3.x, x >= 6)? Or am I still misunderstanding? > > > > > > 101. I will try to provide an example to illustrate what I mean by > > > version. Consider the GroupMetadataValue schema version 4 and the > > > tagged field topicId introduced in 3.6. Let's say a new optional field > > > needs to be added in 3.7. We will have two records version 4, one with > > > topic id, the other with topic id and the new field. The new field is > > > semantically optional (structurally it is always optional since it is > > > a tagged field) but we want to make the distinction between a record > > > generated by 3.6 and one generated by 3.7. How do we resolve the > > > ambiguity? > > > > > > Thanks! > > > Alexandre > > > > > > Le mar. 28 mars 2023 à 16:13, Jeff Kim <jeff....@confluent.io.invalid> > a > > > écrit : > > > > > > > > Hi Alexandre, > > > > > > > > Thank you for taking a look! > > > > > > > > 100. I am not sure I fully understand what you mean by forcefully > > adding > > > > tagged fields. Let's say VX does not have a flexible version, > > > > VY allows deserialization but serializes with a non-flexible version, > > and > > > > VZ introduces a new tagged field. > > > > VX upgrade to VY then downgrade back to VX works because even if > group > > > > metadata changes VY will serialize with > > > > the highest non-flexible version. VZ to VY back to VZ also works > > because > > > > even if VY serializes with a non-flexible field > > > > VZ will be able to deserialize it as it is a supported version. Does > > this > > > > answer your question? > > > > > > > > 101. The future versioning scheme needs to be backward compatible > with > > > > older coordinators. Wouldn't segregating into 2 versions > > > > be incompatible? > > > > > > > > Thanks, > > > > Jeff > > > > > > > > On Tue, Mar 28, 2023 at 5:47 AM Alexandre Dupriez < > > > > alexandre.dupr...@gmail.com> wrote: > > > > > > > > > Hi Jeff, Team, > > > > > > > > > > Thank you for the KIP. This is a very interesting approach. I feel > it > > > > > is simpler than the described alternative although it comes with > > > > > tradeoffs, thanks for highlighting those. If I may, I would like to > > > > > share two naive questions. > > > > > > > > > > 100. The KIP mentions that records will be serialised with the > > highest > > > > > non-flexible version (e.g. 3 for GroupMetadataValue and > > > > > OffsetCommitValue) so that records can be deserialized with earlier > > > > > versions of Kafka. I am not sure I understand correctly: is the > idea > > > > > to forcefully add tagged fields at the end of the records while > > > > > maintaining the existing version (3 for the two record types just > > > > > mentioned) so that they can be deserialized by existing Kafka > > versions > > > > > for which the version of these record types is not known as > flexible, > > > > > while at the same time preserving the new tagged fields to new > Kafka > > > > > versions abreast of the addition of a new flexible version for > these > > > > > record types? If so, is it "bypassing" the protocol convention > which > > > > > prescribes the use of a flexible version to allow the use of tagged > > > > > fields? > > > > > > > > > > 101. After the bump of the records to a new version indicated as > > > > > flexible, the record version is expected to never change while the > > > > > underlying tagged fields could potentially still evolve over time. > > One > > > > > potential downside is that we lose the benefits of the versioning > > > > > scheme enforced by the serde protocol. Could this become a problem > in > > > > > the future if there is ever a need to segregate two distinct > > > > > "versions" of the appended record structure held by the tagged > > fields? > > > > > > > > > > Thanks, > > > > > Alexandre > > > > > > > > > > Le jeu. 23 mars 2023 à 18:15, Jeff Kim > <jeff....@confluent.io.invalid > > > > > > a > > > > > écrit : > > > > > > > > > > > > Hi Yi, > > > > > > > > > > > > > Does it mean with a flexible version, the future > > > > > > version of these value types will stay at the version where the > > > > > flexibility > > > > > > is first introduced? Will there be any need to bump the version > > > again in > > > > > > the future? > > > > > > > > > > > > Yes, there will be no need to bump the version since we will only > > be > > > > > adding > > > > > > tagged fields but in the chance the version is bumped, we will > > > > > deserialize > > > > > > to the highest known (flexible) version which will ignore unknown > > > tagged > > > > > > fields. > > > > > > > > > > > > > To enforce the version not bumping, is it possible to have a > unit > > > test > > > > > to > > > > > > gate? > > > > > > > > > > > > Do you have some tests in mind? I don't think we can tell > whether a > > > > > version > > > > > > was bumped or not. > > > > > > > > > > > > Best, > > > > > > Jeff > > > > > > > > > > > > On Thu, Mar 23, 2023 at 12:07 PM Yi Ding > > <yd...@confluent.io.invalid > > > > > > > > > wrote: > > > > > > > > > > > > > Hi Jeff, > > > > > > > > > > > > > > Thanks for the update. Does it mean with a flexible version, > the > > > future > > > > > > > version of these value types will stay at the version where the > > > > > flexibility > > > > > > > is first introduced? Will there be any need to bump the version > > > again > > > > > in > > > > > > > the future? > > > > > > > To enforce the version not bumping, is it possible to have a > unit > > > test > > > > > to > > > > > > > gate? > > > > > > > > > > > > > > > > > > > > > On Wed, Mar 22, 2023 at 3:19 PM Jeff Kim > > > <jeff....@confluent.io.invalid > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > Hi all, > > > > > > > > > > > > > > > > After discussing with my colleagues, I have repurposed the > KIP > > > for a > > > > > > > > general downgrade solution for both transaction and group > > > > > coordinators. > > > > > > > The > > > > > > > > KIP no longer discusses the downgrade path but instead lays > out > > > the > > > > > > > > foundation for future downgrade solutions. > > > > > > > > > > > > > > > > Link: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-915%3A+Txn+and+Group+Coordinator+Downgrade+Foundation > > > > > > > > > > > > > > > > Thanks, > > > > > > > > Jeff > > > > > > > > > > > > > > > > On Mon, Mar 20, 2023 at 7:37 PM Jeff Kim < > > jeff....@confluent.io> > > > > > wrote: > > > > > > > > > > > > > > > > > Hi David and Justine, > > > > > > > > > > > > > > > > > > Thank you both for the detailed feedback. > > > > > > > > > > > > > > > > > > David, > > > > > > > > > > > > > > > > > > 1. That makes sense. I revised the "Reading new fields" > > section > > > > > with > > > > > > > how > > > > > > > > > we can downgrade to the highest known version and that this > > was > > > > > > > confirmed > > > > > > > > > via unit testing. I also attempted to dive deeper into > using > > > tagged > > > > > > > > fields > > > > > > > > > and the rejected alternative. Please let me know what you > > > think. > > > > > > > > > > > > > > > > > > 2. Under "Restrictions and Guidelines" I updated the > > paragraph > > > to > > > > > > > clearly > > > > > > > > > state that we want to use tagged fields across all record > > types > > > > > > > > introduced > > > > > > > > > in KIP-848 including OffsetCommitValue. > > > > > > > > > > > > > > > > > > 3. Would it be possible to bump the OffsetCommitValue > record > > > > > version to > > > > > > > > > make it flexible along with the changes to parse with the > > > highest > > > > > known > > > > > > > > > version? I'm not sure I understand why we cannot make both > > > changes > > > > > > > > together. > > > > > > > > > > > > > > > > > > 4. I completely missed this. Added some notes at the end of > > > > > > > "Restrictions > > > > > > > > > and Guidelines". Unfortunately I can't think of a solution > at > > > the > > > > > > > moment. > > > > > > > > > Will get back to you. > > > > > > > > > > > > > > > > > > 5. I have a section under "Identifying New Record Types" > that > > > > > discusses > > > > > > > > > this: > > > > > > > > > > We can automate the cleanup by writing tombstones when > the > > > > > > > coordinator > > > > > > > > > reads unrecognizable records. This may add duplicate work > if > > > > > tombstones > > > > > > > > > were already added but not yet pruned by the log cleaner. > > > > > > > > > This is a sure way to delete any unknown record types even > if > > > the > > > > > > > > operator > > > > > > > > > does not follow the steps. > > > > > > > > > > > > > > > > > > 6. Thanks, I have expanded on the section on transactional > > > offset > > > > > > > > commits. > > > > > > > > > As for log compaction, my understanding was that we can > > > control the > > > > > > > > process > > > > > > > > > by forcing compaction. Is my understanding incorrect? > > > > > > > > > > > > > > > > > > 7. We throw an exception if ConfigResource.type is unknown > > > which is > > > > > > > true > > > > > > > > > in our case because KIP-848 introduces a new GROUP resource > > > type. > > > > > We > > > > > > > will > > > > > > > > > need to add some sort of safeguard if the dynamic configs > are > > > not > > > > > > > deleted > > > > > > > > > before the server downgrade. I'll give you an update once I > > > update > > > > > the > > > > > > > > > section. > > > > > > > > > > > > > > > > > > Justine, > > > > > > > > > > > > > > > > > > On the transactional offset commits, I have updated the KIP > > to > > > > > reflect > > > > > > > > > your points after our offline discussion (much > appreciated). > > It > > > > > seems > > > > > > > > that > > > > > > > > > the work required to abort from the server side is fairly > big > > > and > > > > > will > > > > > > > > > require additional investigation if we are to go down this > > > path. > > > > > > > > > > > > > > > > > > As for downgrade limitations, I missed that part so thanks > > for > > > the > > > > > > > catch > > > > > > > > > (along with David). Unfortunately, the proposed design > won't > > > allow > > > > > > > > > downgrades even after the new record types are deleted > > because > > > the > > > > > > > > existing > > > > > > > > > OffsetCommitValue record is not flexible and will not be > able > > > to > > > > > parse > > > > > > > > > the topicId tagged field. I'll think more about this and > get > > > back > > > > > to > > > > > > > you. > > > > > > > > > > > > > > > > > > Best, > > > > > > > > > Jeff > > > > > > > > > > > > > > > > > > On Mon, Mar 20, 2023 at 2:16 PM Justine Olshan > > > > > > > > > <jols...@confluent.io.invalid> wrote: > > > > > > > > > > > > > > > > > >> Hey Jeff and David, > > > > > > > > >> > > > > > > > > >> Thanks for the KIP! I was also looking into this a bit > since > > > I may > > > > > > > want > > > > > > > > to > > > > > > > > >> change the record format for KIP-890 as well (finally > > > > > implementing the > > > > > > > > >> record change from KIP-360 to better support the epoch > bump. > > > This > > > > > will > > > > > > > > >> potentially be helpful for me to implement that work. > > > > > > > > >> > > > > > > > > >> I discussed some aspects with David offline. Namely, the > > > > > alternative > > > > > > > > >> solution to rewrite the records in the old format upon > > > downgrade. > > > > > One > > > > > > > > plus > > > > > > > > >> to this approach is that it should trigger compaction for > > all > > > the > > > > > > > > existing > > > > > > > > >> (new format) records. > > > > > > > > >> The main issue we ran into was how to handle ongoing > > > transactions. > > > > > > > (Ie, > > > > > > > > >> could we abort and rewrite with the old format) I think > > that's > > > > > one of > > > > > > > > the > > > > > > > > >> main downsides. I think it could potentially be possible > to > > do > > > > > this, > > > > > > > but > > > > > > > > >> we'd definitely need a server-side mechanism to abort and > > > rewrite > > > > > the > > > > > > > > >> records. > > > > > > > > >> > > > > > > > > >> I think the trouble I was running into with the current > > > solution > > > > > is > > > > > > > that > > > > > > > > >> we > > > > > > > > >> can only downgrade to a version slightly before the new > > group > > > > > > > > coordinator. > > > > > > > > >> If someone was running 3.2 and they upgrade to 3.6, but > > find a > > > > > bug in > > > > > > > > 3.4 > > > > > > > > >> (or any version between 3.5 and the one they upgraded > from), > > > then > > > > > they > > > > > > > > can > > > > > > > > >> only downgrade to 3.5. This puts us in a difficult spot. I > > > guess > > > > > in > > > > > > > this > > > > > > > > >> scenario, they will need to wait for the new format > records > > > to get > > > > > > > > cleared > > > > > > > > >> away before downgrading again. Is that correct? > > > > > > > > >> > > > > > > > > >> Thanks, > > > > > > > > >> Justine > > > > > > > > >> > > > > > > > > >> On Thu, Mar 16, 2023 at 10:41 AM David Jacot > > > > > > > > <dja...@confluent.io.invalid > > > > > > > > >> > > > > > > > > > >> wrote: > > > > > > > > >> > > > > > > > > >> > Hi Jeff, > > > > > > > > >> > > > > > > > > > >> > Thanks for the KIP! I am really glad that we are finally > > > > > addressing > > > > > > > > this > > > > > > > > >> > gap in KIP-848. I have a few general comments. > > > > > > > > >> > > > > > > > > > >> > 1. Overall, I feel like the important bits are not bold > > > enough > > > > > in > > > > > > > the > > > > > > > > >> KIP. > > > > > > > > >> > I think that it is good to explain the overall > > > upgrade/downgrade > > > > > > > > process > > > > > > > > >> > and to highlight where the issues are but I think that > the > > > core > > > > > bits > > > > > > > > >> should > > > > > > > > >> > give more details. For instance, we should explain why > > > relying > > > > > on > > > > > > > tag > > > > > > > > >> > fields works to ignore fields added in future releases. > My > > > > > > > > >> understanding is > > > > > > > > >> > that it works because the buffer for the tagged fields > is > > > > > serialized > > > > > > > > at > > > > > > > > >> the > > > > > > > > >> > end so reading with the old version, which is a prefix > of > > > the > > > > > new > > > > > > > one, > > > > > > > > >> > works. > > > > > > > > >> > > > > > > > > > >> > 2. Moreover, it would be great if we could make the > > > principle > > > > > more > > > > > > > > >> general. > > > > > > > > >> > My hope is that we can keep reusing the principles > > > introduced > > > > > in the > > > > > > > > >> KIP in > > > > > > > > >> > future releases as well. For instance, let's say that we > > > need to > > > > > > > add a > > > > > > > > >> new > > > > > > > > >> > field to one of the new records introduced by KIP-848 or > > > that we > > > > > > > will > > > > > > > > >> have > > > > > > > > >> > to introduce a new record type as well. Would it work > for > > > those > > > > > > > cases > > > > > > > > as > > > > > > > > >> > well? > > > > > > > > >> > > > > > > > > > >> > 3. Regarding enabling support for tagged fields for the > > > > > > > > >> OffsetCommitValue > > > > > > > > >> > record, it would be great if you could give more details > > on > > > the > > > > > > > steps > > > > > > > > to > > > > > > > > >> > get there in the KIP. My understanding is that we would > > > have to > > > > > do > > > > > > > the > > > > > > > > >> > following: 1) Update the code which reads the records to > > > fail > > > > > back > > > > > > > to > > > > > > > > >> the > > > > > > > > >> > highest known version if the version stored in the log > is > > > > > unknown. > > > > > > > > Let's > > > > > > > > >> > say that we do this in AK 3.5. 2) We need to turn on > > tagged > > > > > fields > > > > > > > for > > > > > > > > >> the > > > > > > > > >> > record. I think that we can only do this in AK 3.6+. > > > > > > > > >> > > > > > > > > > >> > 4. I may have missed this part but we should clearly > > > explain the > > > > > > > > >> drawback > > > > > > > > >> > of the proposed approach as well. Say that we enable > > tagged > > > > > fields > > > > > > > for > > > > > > > > >> > OffsetCommitValue in AK 3.6. This means that it won't be > > > > > possible to > > > > > > > > >> > downgrade a cluster from 3.6 to a version earlier than > > 3.5. > > > > > This is > > > > > > > a > > > > > > > > >> > significant limitation in my opinion because, I think, > > users > > > > > don't > > > > > > > > >> > necessarily upgrade to all versions. > > > > > > > > >> > > > > > > > > > >> > 5. In the proposal, it is not clear about whether the > old > > > > > software > > > > > > > > will > > > > > > > > >> > delete unknown records or not. It is true that new > records > > > will > > > > > be > > > > > > > > >> deleted > > > > > > > > >> > when the group is downgraded but this only works if the > > > operator > > > > > > > > >> respects > > > > > > > > >> > the process. > > > > > > > > >> > > > > > > > > > >> > 6. It would be great if we could extend the rejected > > > > > alternative. > > > > > > > The > > > > > > > > >> > alternative sounds clearly better when you read it so we > > > should > > > > > > > really > > > > > > > > >> > explain the reason to reject it. 1) One issue that you > > > mention > > > > > is > > > > > > > that > > > > > > > > >> the > > > > > > > > >> > log must be compacted before downgrading and we don't > > really > > > > > control > > > > > > > > >> this > > > > > > > > >> > process. 2) Transactions may be difficult to handle. I > > > suppose > > > > > that > > > > > > > it > > > > > > > > >> is > > > > > > > > >> > possible to handle them though. Have you thought about > > this? > > > > > > > > >> > > > > > > > > > >> > 7. For the new dynamic configs, what happens if they are > > > kept > > > > > and > > > > > > > the > > > > > > > > >> > quorum controller is downgraded? > > > > > > > > >> > > > > > > > > > >> > Best, > > > > > > > > >> > David > > > > > > > > >> > > > > > > > > > >> > On Thu, Mar 16, 2023 at 12:56 AM Jeff Kim > > > > > > > > <jeff....@confluent.io.invalid > > > > > > > > >> > > > > > > > > > >> > wrote: > > > > > > > > >> > > > > > > > > > >> > > Hi folks, > > > > > > > > >> > > > > > > > > > > >> > > I would like to start a discussion thread for KIP-915: > > > Next > > > > > Gen > > > > > > > > Group > > > > > > > > >> > > Coordinator Downgrade Path which proposes the > downgrade > > > > > design for > > > > > > > > the > > > > > > > > >> > new > > > > > > > > >> > > group coordinator introduced in KIP-848 > > > > > > > > >> > > < > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-848%3A+The+Next+Generation+of+the+Consumer+Rebalance+Protocol > > > > > > > > >> > > > > > > > > > > > >> > > . > > > > > > > > >> > > > > > > > > > > >> > > KIP: > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-915%3A+Next+Gen+Group+Coordinator+Downgrade+Path > > > > > > > > >> > > > > > > > > > > >> > > Thanks, > > > > > > > > >> > > Jeff > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >