Hi Jeff,

Thanks for the KIP! I am really glad that we are finally addressing this
gap in KIP-848. I have a few general comments.

1. Overall, I feel like the important bits do not stand out enough in the KIP.
I think that it is good to explain the overall upgrade/downgrade process
and to highlight where the issues are but I think that the core bits should
give more details. For instance, we should explain why relying on tagged
fields works to ignore fields added in future releases. My understanding is
that it works because the buffer for the tagged fields is serialized at the
end so reading with the old version, which is a prefix of the new one,
works.
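
As a rough illustration of that understanding, here is a minimal Python
sketch. The layout is a toy one and is not Kafka's actual wire format; the
field names, the one-byte tag/length encoding, and the helpers write_new
and read_old are all assumptions made for the example. It only shows that
a reader which knows the fixed prefix is unaffected by a buffer appended
after it.

```python
import struct

# Toy layout (an assumption, NOT Kafka's real encoding):
# fixed fields first, then a tagged-field buffer at the very end.
FIXED = struct.Struct(">qi")  # offset (int64), leader_epoch (int32)


def write_new(offset, leader_epoch, tagged):
    """Hypothetical newer writer: fixed fields, then (tag, length, bytes) entries."""
    out = FIXED.pack(offset, leader_epoch)
    out += struct.pack(">B", len(tagged))  # number of tagged fields
    for tag, data in sorted(tagged.items()):
        out += struct.pack(">BB", tag, len(data)) + data
    return out


def read_old(buf):
    """Hypothetical older reader: parses only the prefix it knows about."""
    offset, leader_epoch = FIXED.unpack_from(buf, 0)
    return offset, leader_epoch  # trailing bytes are never inspected


record = write_new(42, 7, {0: b"new-field"})
old_view = read_old(record)
```

If the new field were instead inserted between the fixed fields, read_old
would misparse the record, which is why the position of the buffer matters.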

2. Moreover, it would be great if we could make the principle more general.
My hope is that we can keep reusing the principles introduced in the KIP in
future releases. For instance, let's say that we need to add a new field to
one of the new records introduced by KIP-848 or that we have to introduce
a new record type. Would it work for those cases as well?

3. Regarding enabling support for tagged fields for the OffsetCommitValue
record, it would be great if you could give more details on the steps to
get there in the KIP. My understanding is that we would have to do the
following: 1) Update the code which reads the records to fall back to the
highest known version if the version stored in the log is unknown. Let's
say that we do this in AK 3.5. 2) We need to turn on tagged fields for the
record. I think that we can only do this in AK 3.6+.
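
Step 1 could look roughly like the Python sketch below. The constant and
the function name are hypothetical, and the value 3 for the highest known
version is an assumption for illustration only; the real change would live
in the record deserialization code.

```python
# Assumed for illustration: the highest record version this reader understands.
HIGHEST_KNOWN_VERSION = 3


def effective_read_version(stored_version, highest_known=HIGHEST_KNOWN_VERSION):
    """Pick the schema version used to parse a record found in the log.

    Known versions are parsed as-is. An unknown (newer) version falls back
    to the highest known one instead of failing.
    """
    if stored_version < 0:
        raise ValueError(f"invalid record version: {stored_version}")
    return min(stored_version, highest_known)
```

The fallback is only safe if newer versions extend the record at the end,
for instance through tagged fields, which is why step 2 has to come in a
later release than step 1.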

4. I may have missed this part but we should clearly explain the drawback
of the proposed approach as well. Say that we enable tagged fields for
OffsetCommitValue in AK 3.6. This means that it won't be possible to
downgrade a cluster from 3.6 to a version earlier than 3.5. This is a
significant limitation in my opinion because, I think, users don't
necessarily upgrade through every version.

5. In the proposal, it is not clear whether the old software will
delete unknown records or not. It is true that new records will be deleted
when the group is downgraded but this only works if the operator respects
the process.

6. It would be great if we could expand on the rejected alternative. The
alternative sounds clearly better when you read it so we should really
explain the reason to reject it. 1) One issue that you mention is that the
log must be compacted before downgrading and we don't really control this
process. 2) Transactions may be difficult to handle. I suppose that it is
possible to handle them though. Have you thought about this?

7. For the new dynamic configs, what happens if they are kept and the
quorum controller is downgraded?

Best,
David

On Thu, Mar 16, 2023 at 12:56 AM Jeff Kim <jeff....@confluent.io.invalid>
wrote:

> Hi folks,
>
> I would like to start a discussion thread for KIP-915: Next Gen Group
> Coordinator Downgrade Path which proposes the downgrade design for the new
> group coordinator introduced in KIP-848
> <
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-848%3A+The+Next+Generation+of+the+Consumer+Rebalance+Protocol
> >
> .
>
> KIP:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-915%3A+Next+Gen+Group+Coordinator+Downgrade+Path
>
> Thanks,
> Jeff
>
