Re: [DISCUSS] KIP-598: Augment TopologyDescription with store and source / sink serde information

John Roesler Sat, 02 May 2020 08:19:50 -0700

Hi all,

I’ve been sitting on another concern about this proposal. Since Matthias has 
just submitted a few questions, perhaps I can pile on two more this round.


(10) Can we avoid coupling this KIP’s behavior to the choice of ‘build’ method? 
I.e., can we return the improved description even when people just call 
‘build()’?

Clearly, we need a placeholder if no serde is specified. How about “unknown”, 
or the name of the config keys, “default.key.serde”/“default.value.serde”?

I still have some deep reservation about the ‘build(Parameters)’ method itself. 
I don’t really want to side-track this conversation with all my concerns if we 
can avoid it, though. It seems like justification enough that introducing 
dramatically different behavior based in on seemingly minor differences in api 
calls will be a source of mystery and complexity for users.

I.e., I’m characterizing a completely different string format as “dramatically 
different”, as opposed to just having a placeholder string.

(11) Regarding the wrapper serdes, I bet we can capture and print the inner 
types as well. 

Thanks again for the KIP!
-John



On Thu, Apr 30, 2020, at 19:10, Matthias J. Sax wrote:
> Guozhang,
> 
> thanks for the KIP!
> 
> Couple of comments/questions.
> 
> (1) In the new TopologyDescription output, the line for the
> windowed-count processor is:
> 
> >  Processor: myname (stores: [(myname-store, serdes: [SessionWindowedSerde, 
> > FullChangeSerde])])
> 
> For this case, both Serdes are wrappers and user would actually only
> specified wrapped Serdes for the key and value. Can we do anything about
> this? Otherwise, there might still be a runtime `ClassCastException`
> that a user cannot easily debug.
> 
> 
> (2) Nit: The JavaDocs of `Processor#storeSet()` seems to be incorrect
> (it says "The names of all connected stores." -- guess it's c&p error)?
> 
> 
> (3) The KIP mentioned to add `Store#changelogTopic()` method, but the
> output of `TopologyDescription#toString()` does not contain it. I think
> it might be good do add it, too?
> 
> 
> (4) The KIP also list https://issues.apache.org/jira/browse/KAFKA-9913
> but it seems not to address it yet?
> 
> 
> (5) As John, I also noticed that `List<String> Store#sedersNames()` is
> not a great API. I am not sure if I understand your reply thought.
> AFAIK, there is no exiting API
> 
> > List<Serde> StoreBuilder#serdes()
> 
> (cf
> https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/state/StoreBuilder.java)
> 
> 
> (6) Atm, we return `String` type for the Serdes. Do we think it's
> sufficient? Just want to double check.
> 
> 
> 
> -Matthias
> 
> 
> 
> 
> On 4/25/20 1:24 PM, Guozhang Wang wrote:
> > Hi John,
> > 
> > Thanks for the review! Replied inline.
> > 
> > On Fri, Apr 24, 2020 at 8:09 PM John Roesler <vvcep...@apache.org> wrote:
> > 
> >> Hi Guozhang,
> >>
> >> Thanks for the KIP! I took a quick look, and I'm really happy to see this
> >> underway.
> >>
> >> Some quick questions:
> >>
> >> 1.  Can you elaborate on the reason that stores just have a list of
> >> serdes, whereas
> >> other components have an explicit key/value serde?
> >>
> > 
> > This is because of the existing API "List<Serde> StoreBuilder#serdes()".
> > Although both of its implementations would return two serdes (one for key
> > and one for value), the API is more general to return a list. And hence the
> > TopologyDescription#Store which gets them directly from StoreBuilder is
> > exposing the same API.
> > 
> > 1.5. A side-effect of this seems to be that the string-formatted serde
> >> description is
> >> different, depending on whether the serdes are listed on a store or a
> >> topic. Just an
> >> observation.
> >>
> > 
> > Yes I agree. I think we can probably change the "List<Serde>
> > StoreBuilder#serdes()" signature as well (which would be a breaking change
> > though, so we should do that via deprecation), but I'm a bit concerned
> > since it was designed for future store types which may not be of K-V format
> > any more.
> > 
> > 
> >> 2. You mentioned the key compatibility concern in my mind. We do know that
> >> such
> >> use cases exist. Namely, our own tests and
> >> https://zz85.github.io/kafka-streams-viz/
> >> I'm wondering if we can add a forward-compatible machine-readable format
> >> to the
> >> KIP, so that even though we must break the parsers right now, maybe we'll
> >> never
> >> have to break them again. For example, I'm thinking of a "toJson" method
> >> on the
> >> TopologyDescription that formats the entire topology description as a json
> >> string.
> >>
> >>
> > Yes, I also have concerns about that (as described in the compatibility
> > section). One proposal I have is that we ONLY augment the toString result
> > if the TopologyDescription is from a Topology built from
> > `StreamsBuilder#build(Properties)`, which is only recently added and hence
> > most old usage would not get the benefits of it. But after thinking about
> > this a bit more, I'm now more inclined to just always augment the string,
> > and also add a `toJson` method in addition to `toString`.
> > 
> > 
> >> Thanks again!
> >> -John
> >>
> >> On Fri, Apr 24, 2020, at 00:26, Guozhang Wang wrote:
> >>> Hello folks,
> >>>
> >>> I'd like to start the discussion for KIP-598:
> >>>
> >>>
> >> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=148648762
> >>>
> >>> It proposes to augment the topology description's sub-classes with store
> >>> and source / sink serde information. Let me know what you think, thanks!
> >>>
> >>> --
> >>> -- Guozhang
> >>>
> >>
> > 
> > 
> 
> 
> Attachments:
> * signature.asc

Re: [DISCUSS] KIP-598: Augment TopologyDescription with store and source / sink serde information

Reply via email to