Hi Justine, Thanks for your comments. Please find my answers below.
- Yes, the new protocol relies on topic IDs with the exception of the topic names based in the ConsumerGroupHeartbeatRequest. I am not sure if using topic names is the right call here. I need to think about it a little more. Obviously, the KIP does not change the fetch/commit offsets RPCs to use topic IDs. This may be something that we should include though as it would give better overall guarantee in the producer. - You're right. I think that I should not have mentioned this flag at all. I will remove it. We can use an internal configuration while developing the feature. - Both cluster types will be supported. The change is orthogonal. The only requirement is that the cluster uses topic IDs. Best, David On Mon, Jul 11, 2022 at 9:53 PM Guozhang Wang <[email protected]> wrote: > > Hi Ismael, > > Thanks for the feedback. Here are some replies inlined below: > > On Sat, Jul 9, 2022 at 2:53 AM Ismael Juma <[email protected]> wrote: > > > Thanks for the KIP. This has the potential to be a great improvement. A few > > initial questions/comments: > > > > 1. I think it's premature to talk about target versions for deprecation and > > removal of the existing group protocol. Unlike KRaft, this affects a core > > client protocol and hence deprecation/removal will be heavily dependent on > > how quickly applications migrate to the new protocol. > > > > Yeah I agree with you. I think we can remove the proposed timeline in the > `Compatibility, Deprecation, and Migration Plan` and instead just state > that we will decide in the future about when we would deprecate old > protocol and behaviors. > > > > 2. The KIP says we intend to release this in 4.x, but it wasn't made clear > > why. If we added that as a way to estimate when we'd deprecate and remove > > the group protocol, I also suggest removing this part. > > > > I think that's not specifically related to the deprecation/removal timeline > plan, but it's more for client upgrades. I.e. the broker-side > implementation may be done first, and then the client side, and we would > only mark it as "released" by the time clients implementations are done. At > that time, to enable the feature the clients need to first swap-in the > bytecode with a rolling bounce and then set the flag with a second rolling > bounce, and hence we feel it's better to be released in a major version. > > > > 3. We need to flesh out the details of the migration story. It sounds like > > we're saying we will support online migrations. Is that correct? We should > > explain this in detail. It could also be done as a separate KIP, if it's > > easier. > > > > Yes I think that's the part we can be more concrete about for sure (and > this is related to your question 2) above). We will work on making it more > explicit in parallel as we solicit more feedback. > > > > 4. I am happy that we are pushing the pattern subscriptions to the server, > > but it seems like there could be some tricky compatibility issues. Will we > > have a mechanism for users to detect that they need to update their regex > > before switching to the new protocol? > > > > Yes I think we need some tooling for non-java client users to sort of > "dry-run" the client before switching to the new protocol. I do not have a > specific idea on top of my head though, maybe others like @Matt Howlett can > chime-in here? > > > > 5. Related to the last question, will the Java client allow the users to > > stick with the current regex engine for compatibility reasons? For example, > > it may be handy to keep using client based regex at first to keep > > migrations simple and then migrate to server based regexes as a second > > step. > > > > Honestly I have not thought about that for java clients, and we can discuss > that. What kind of compatibility issues do you have in mind? > > > > 6. When we say that the group coordinator will be responsible for storing > > the configurations and that the configurations will be deleted when the > > group is deleted. Will a transition to DEAD trigger deletion of > > configurations? > > > > Yes, since the DEAD state is an ending state (we would only transit to that > state when the group is EMPTY and also all of its metadata are gone), once > it's transited to DEAD this group would never be revived. > > > > 7. Will the choice to store the configs in the group coordinator make it > > harder to list all cluster configs and their values? > > > > That's a good question, and our thoughts are that the so-called "group > configurations" are overrides of the cluster-level configurations > customized per group so when an admin list cluster configs it's okay to > list just the cluster-level "defaults", not showing any per-group > customizations. > > > > 8. How would someone configure a group before starting the consumers? Have > > we considered allowing the explicit creation of groups? Alternatively, the > > configs could be decoupled from the group lifecycle. > > > > The configs can be created before the group itself as an independent entity > --- of course, this requires the corresponding request to be routed to the > right coordinator based on the group id --- the only thing that differs is, > when the group itself is gone we also check if there are any configuration > entities related to that group and delete as well. > > Admittedly this indeed introduces an asymmetry on the creation / deletion > lifecycles of the config entities, and we would like to hear everyone's > feelings whether we should aim for symmetry i.e. totally decouple group > configs and hence not delete them at all when the group is gone, but always > require explicit deletion operations by themselves. > > > > 9. Will the Consumer.subscribe method for the Java client still take a > > `java.util.regex.Pattern` of do we have to introduce an overload? > > > > I think we do not need to introduce an overload, but I'm all ears if there > may be some compatibility issues that we may overlook. > > > > 10. I agree with Justine that we should be clearer about the reason to > > switch to IBP/metadata.version from the feature flag. Maybe we mean that we > > can switch the default for the feature flag to true based on the > > metadata.version once we want to make it the default. > > 11. Some of the protocol APIs don't mention the required ACLs, it would be > > good to add that for consistency. > > > > Ack, we can certainly do that. > > > > 12. It is a bit odd that ConsumerGroupHeartbeat requires "Read Group" even > > though it seems to do more than reading. > > > > I had that thought myself as well, but in the end we could not find a > better alternative: adding Write Group seems an overkill here since we do > not have it elsewhere (we only have Read / Delete and Describe on groups so > far). Would like to hear others thoughts. > > > > 13. How is topic recreation handled by the consumer with the new group > > protocol? It would be good to have a section on this. > > > > You mean with regex subscription right? Yes we can add a section about > that, but basically the idea is that consumer would be totally agnostic in > the new protocol as it's handled all by the brokers. > > > > 14. The KIP mentions we will write the new coordinator in Java. Even though > > this is an implementation detail, do we plan to have a new gradle module > > for it? > > > > We have not thought about that. But I think the answer should be yes. > > > > 15. Do we have a scalability goal when it comes to how many members the new > > group protocol can support? > > > > Within a group, I think we should shoot for 1000s of members. But that > scalability goals also depend on the offset management (commit, fetch) > capabilities of the coordinator which we did not cover in this KIP, so it's > hard to give a number that applies universally. > > > > 16. Did we consider having SubscribedTopidIds instead of > > SubscribedTopicNames in ConsumerGroupHeartbeatRequest? Is the idea that > > since we have to resolve the regex on the server, we can do the same for > > the topic name? The difference is that sending the regex is more efficient > > whereas sending the topic names is less efficient. Furthermore, delete and > > recreation is easier to handle if we have topic ids. > > > > The main reason to still let the clients send names is to keep the > reasoning of names -> ids on the broker / admin client only. Note that > although we added topic id in KIP-516, we never implemented the logic on > consumer/producers leveraging the related newer versioned RPCs, instead we > just set the topic id as empty UUID. We want to keep the consumer/producer > to be thin and only delegate the reasoning on broker and potentially admin > clients. > > > > > > Thanks, > > Ismael > > > > On Wed, Jul 6, 2022 at 10:45 AM David Jacot <[email protected]> > > wrote: > > > > > Hi all, > > > > > > I would like to start a discussion thread on KIP-848: The Next > > > Generation of the Consumer Rebalance Protocol. With this KIP, we aim > > > to make the rebalance protocol (for consumers) more reliable, more > > > scalable, easier to implement for clients, and easier to debug for > > > operators. > > > > > > The KIP is here: https://cwiki.apache.org/confluence/x/HhD1D. > > > > > > Please take a look and let me know what you think. > > > > > > Best, > > > David > > > > > > PS: I will be away from July 18th to August 8th. That gives you a bit > > > of time to read and digest this long KIP. > > > > > > > > -- > -- Guozhang
