artemlivshits commented on code in PR #14705: URL: https://github.com/apache/kafka/pull/14705#discussion_r1388499287
########## group-coordinator/src/main/java/org/apache/kafka/coordinator/group/runtime/CoordinatorRuntime.java: ########## @@ -659,9 +660,21 @@ public TopicPartition key() { */ @Override public void run() { + try { + runAsync().get(); Review Comment: > if one client's log append is blocked for additional async check That is correct, it may become a perf problem, we can measure and see if it's worth fixing in practice, we'll have this choice (as well as the choice to postpone the fix, if we have time pressure to release). But it won't be a functional problem. Right now it is a functional problem, which is suboptimal in many ways: - appendRecords has async interface, thus adding async stages under such an interface can be done without inspection and understanding all callers (that's what an interface is -- any compliant implementation is valid), but doing so will break the current logic (so from the proper interface usage perspective it is a bug in the caller, which this proposal fixes) - we cannot release new transaction protocol (or new coordinator) without implementing new logic, which makes hard dependencies and pushes against timelines (now all of a sudden KIP-848 got a new work to do before release, just because there is some independent work is going on in transaction area) - KIP-890 part2 design is still under discussion, the verification protocol is likely to change, so any changes in KIP-890 protocol are going to have ripple effects on KIP-848 - 2 fairly complex components are now tied together -- we cannot just innovate on transaction protocol implementation details (or to be broader -- on the whole IO subsystem implementation details -- e.g. Async IO) without understanding group coordinator implementation detail and we cannot innovate on group coordinator implementation detail without understanding implementation details of transaction protocol - to make the previous point worse, the dependency is not visible at the "point of use" -- someone tasked with improving transaction protocol (or IO in general) would have no indication from the appendRecords interface, that adding an async stage would need to have a corresponding change in group coordinator - the work needs to be duplicated in group coordinator (and the protocol is going slightly different for different client versions) which becomes a likely source of bugs IMO, the fact that transaction verification implementation just doesn't work out-of-box with the new group coordinator (and in fact requires quite non-trivial follow-up work that will block the release) is an architectural issue. We should strive to make the system more decoupled, so that the context an engineer needs to understand to make local changes in a part of system is less. > Each new group coordinator thread handles requests from multiple groups and multiple clients within the same group. I don't think it's bound to a thread, but indeed the concurrency is limited to partition -- we don't let operations on the same partition run concurrently, so all the groups that are mapped to the same partition are contending. This is, however, a specific implementation choice, it should be possible to make a group to be a unit of concurrency, and if that's not enough, we can let offset commits for different partitions go concurrently as well (they just need to make sure that group doesn't change, which is sort of a "read lock" on the group), at which point there probably wouldn't be any contention in the common path. Now, one might ask a question, implementing per-group synchronization adds complexity and handling transaction verification as an explicit state transition in group coordinator adds complexity, what the difference? I'd say the difference is fundamental -- per-group synchronization complexity is encapsulated in one component and keeps the system decoupled: an engineer tasked to improve transaction protocol, doesn't need understand implementation details of group coordinator and vice versa. Changes are smaller, can be made faster, and less bug prone. Win-win-win. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org