vvcephei commented on pull request #9338:
URL: https://github.com/apache/kafka/pull/9338#issuecomment-707890589


   Thanks for the feedback, @thake . Your perspective makes perfect sense. It 
would certainly simplify our internal code, and it would also help to organize 
the API a little more. There are a few reasons I leaned more toward the "flat" 
API:
   1. Immediate concern: I didn't want the KIP to get lost in the weeds of 
designing a whole hierarchy of contexts and deciding where each member should 
reside. As it is, the KIP already took an incredibly long time to converge, and splitting out 
StateStoreContext dragged us into a new round of proposals about reorganizing 
the "record context" (which became Record and RecordMetadata). We were down to 
the wire for 2.7 at that point, and shipping something seemed better than 
shipping nothing.
   2. Long-term concern: I was afraid I'd be creating a situation in which 
every single update to any context would kick off a whole new round of 
bikeshedding about which part of the hierarchy the new thing should belong to, 
or whether we need a new level in the hierarchy, etc. etc. It seems like having 
two completely independent kinds of context (for state stores and for 
processors) would drive people to think only of where something new is 
_needed_, not where it _belongs_, avoiding the abstract philosophical 
discussions that programmers are prone to.
   3. Related long-term concern: Coupling: the notion of "context" is similar, 
but not identical across state stores and processors. It has already happened 
that we realized that something shared needed to be restricted to just one of 
the contexts. If there were any kind of shared interface (like a super-interface 
or a common interface as a member, like `getTaskContext()`), then we would have 
more challenges in "un-sharing" a member versus just being able to deprecate an 
un-shared member where it is inappropriate.
   4. Usability: The idea of organizing information for users has two sides. On 
one hand, it might be nice to be able to say, "Ah, this is a task property, so I 
know to look for it in the TaskContext." But on the other hand, if you need 
(for example) the configured serde, and you don't see it in your Context, and 
you've forgotten that it's filed away under "task context", then you'll have to 
go hunting for it. Then, once you find it, you'll have to try to remember for 
the future where it's kept, occupying precious mental space that you could be 
using for other stuff. There's an art to striking a balance between a "wide" 
API and a "deep" one: a flat interface with just a few members is trivially 
easy to use, but a flat interface with too many members is a burden. Once an 
interface becomes too wide, then some amount of organization is a benefit, but 
too many levels of organization, or too many internal nodes in the interface 
tree at all is also a burden. It doesn't feel like we have too many members 
in either StateStoreContext or ProcessorContext now, so I'm 
hesitant to organize further at all.
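   
   To make points 3 and 4 concrete, here is a rough sketch of the two shapes. 
Every name in it is hypothetical and chosen only for illustration; none of 
these interfaces is the real Kafka Streams API, and the members are stand-ins.

```java
// All names below are hypothetical; this is not the real Kafka Streams API.

// "Deep" shape: both contexts expose one shared TaskContext member.
// Restricting a member to one kind of context later means changing
// TaskContext, which affects every context that exposes it.
interface TaskContext {
    String applicationId();
    String defaultSerdeName();
}

interface NestedStoreContext {
    TaskContext taskContext();
}

interface NestedProcessorContext {
    TaskContext taskContext();
}

// "Flat" shape: two independent interfaces that merely overlap.
interface FlatStoreContext {
    String applicationId();

    // "Un-sharing" only touches this one interface: deprecate it here
    // and leave FlatProcessorContext alone.
    @Deprecated
    String defaultSerdeName();
}

interface FlatProcessorContext {
    String applicationId();
    String defaultSerdeName();
}

public class ContextShapes {
    // Builds a flat processor context backed by fixed values.
    static FlatProcessorContext flatProcessor(String appId, String serde) {
        return new FlatProcessorContext() {
            @Override public String applicationId() { return appId; }
            @Override public String defaultSerdeName() { return serde; }
        };
    }

    // Builds a nested processor context that hands out a shared TaskContext.
    static NestedProcessorContext nestedProcessor(String appId, String serde) {
        TaskContext task = new TaskContext() {
            @Override public String applicationId() { return appId; }
            @Override public String defaultSerdeName() { return serde; }
        };
        return () -> task;
    }

    public static void main(String[] args) {
        FlatProcessorContext flat = flatProcessor("app-1", "StringSerde");
        NestedProcessorContext nested = nestedProcessor("app-1", "StringSerde");

        // Flat: the member is visible directly on the context you hold.
        System.out.println(flat.defaultSerdeName());
        // Deep: you have to know it is filed under the task context.
        System.out.println(nested.taskContext().defaultSerdeName());
    }
}
```

   In the flat shape, the cost is the duplicated member declarations; in the 
deep shape, the cost is the extra hop for callers and the coupling through the 
shared type.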
   
   A moderate amount of duplication in the internal code is the price we pay 
for these trade-offs. If the duplication itself looked too extensive or risky, 
then that would be another argument to converge the types, but so far, it 
doesn't seem too bad.
   
   Anyway, time will tell if this was the right trade-off. We can always shuffle 
stuff around in the future. Does this all seem to make sense?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

