vvcephei commented on pull request #9338: URL: https://github.com/apache/kafka/pull/9338#issuecomment-707890589
Thanks for the feedback, @thake. Your perspective makes perfect sense. It would certainly simplify our internal code, and it would also help to organize the API a little more.

There are a few reasons I leaned more toward the "flat" API:

1. Immediate concern: I didn't want the KIP to get lost in the weeds of designing a whole hierarchy of contexts and deciding where each member should reside. As it is, the KIP already took an incredibly long time to converge, and splitting out StateStoreContext dragged us into a new round of proposals about reorganizing the "record context" (which became Record and RecordMetadata). We were down to the wire for 2.7 at that point, and shipping something seemed better than shipping nothing.
2. Long-term concern: I was afraid I'd be creating a situation in which every single update to any context would kick off a whole new round of bikeshedding about which part of the hierarchy the new thing should belong to, or whether we need a new level in the hierarchy, and so on. It seems like having two completely independent kinds of context (for state stores and for processors) would drive people to think only of where something new is _needed_, not where it _belongs_, avoiding the abstract philosophical discussions that programmers are prone to.
3. Related long-term concern: Coupling. The notion of "context" is similar, but not identical, across state stores and processors. It has already happened that we realized something shared needed to be restricted to just one of the contexts. If there were any kind of shared interface (like a super-interface, or a common interface as a member, like `getTaskContext()`), then we would have more trouble "un-sharing" a member, versus just being able to deprecate that member on the one context where it is inappropriate.
4. Usability: The idea of organizing information for users has two sides. On one hand, it might be nice to be able to say "ah, this is a task property, so I know to look for it in the TaskContext."
But on the other hand, if you need (for example) the configured serde, and you don't see it in your Context, and you've forgotten that it's filed away under "task context", then you'll have to go hunting for it. Then, once you find it, you'll have to try to remember for the future where it's kept, occupying precious mental space that you could be using for other stuff.

There's an art to striking a balance between a "wide" API and a "deep" one: a flat interface with just a few members is trivially easy to use, but a flat interface with too many members is a burden. Once an interface becomes too wide, some amount of organization is a benefit, but too many levels of organization, or too many internal nodes in the interface tree, is also a burden. It doesn't feel like we have too many members in either StateStoreContext or ProcessorContext now, so I'm hesitant to organize further at all.

A moderate amount of duplication in the internal code is the price we pay for these trade-offs. If the duplication itself looked too extensive or risky, then that would be another argument to converge the types, but so far, it doesn't seem too bad.

Anyway, time will tell if this was the right tradeoff. We can always shuffle stuff around in the future. Does this all seem to make sense?
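
To make the coupling concern in point 3 a bit more concrete, here is a rough sketch of what "just deprecate the member where it is inappropriate" could look like with two independent interfaces. Everything in it is hypothetical and only for illustration: `SketchProcessorContext`, `SketchStateStoreContext`, `SketchContextImpl`, and the `keySerdeName()` member are made-up stand-ins, not the real Kafka Streams types or signatures, and backing both interfaces with a single class is just an assumption to keep the example short.

```java
public class FlatContextSketch {

    // Hypothetical stand-in for the processor-side context.
    public interface SketchProcessorContext {
        String applicationId();
        String keySerdeName(); // assumed to stay appropriate for processors
    }

    // Hypothetical stand-in for the state-store-side context.
    // It is declared independently: no shared super-interface.
    public interface SketchStateStoreContext {
        String applicationId();

        // Suppose we later decide this does not belong on the store side.
        // Because the interfaces are independent, we can deprecate it here
        // without touching SketchProcessorContext at all.
        @Deprecated
        String keySerdeName();
    }

    // Assumption for brevity: one internal class backs both interfaces.
    public static final class SketchContextImpl
            implements SketchProcessorContext, SketchStateStoreContext {
        private final String applicationId;
        private final String keySerdeName;

        public SketchContextImpl(String applicationId, String keySerdeName) {
            this.applicationId = applicationId;
            this.keySerdeName = keySerdeName;
        }

        @Override
        public String applicationId() {
            return applicationId;
        }

        @Override
        public String keySerdeName() {
            return keySerdeName;
        }
    }

    // Reports whether the named no-arg method is marked @Deprecated on the given type.
    public static boolean isDeprecated(Class<?> type, String methodName) {
        try {
            return type.getMethod(methodName).isAnnotationPresent(Deprecated.class);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("No such method: " + methodName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println("deprecated for processors: "
                + isDeprecated(SketchProcessorContext.class, "keySerdeName"));
        System.out.println("deprecated for state stores: "
                + isDeprecated(SketchStateStoreContext.class, "keySerdeName"));
    }
}
```

With a shared super-interface, the `@Deprecated` annotation would have to go on the shared declaration (affecting both sides) or the member would first have to be moved out of it, which is the extra "un-sharing" work I was hoping to avoid.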