[ https://issues.apache.org/jira/browse/KAFKA-7654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16694474#comment-16694474 ]
Bruno Bieth commented on KAFKA-7654:
------------------------------------

bq. It a Java limitation that we loose type safety for default Serdes

I think it's a design issue rather than a language issue. Even with reified generics, a global configuration that is set once and carries a single pair of Serdes (for the whole application!) isn't going to work. `ProcessorContext` could be `ProcessorContext<K,V>`, but then you would still be stuck getting the `Serde<K>` / `Serde<V>` from the global config.

bq. They are useful if for example, all you data is always in AVRO or JSON.

Ok, I see. This rests on the assumption that the user of the API serializes its types with an inherently non-type-safe library. Say Jackson, which does a best-effort runtime serialization job on any type (and fails, at runtime, otherwise). But if you're using a type-safe library like circe (as we do), then you suffer from that assumption.

I guess there's another assumption, which is that the source and sink formats are controlled by the streams. In our case we control neither the source nor the sink format, i.e. we get our JSON from team A and we send it to team B, each with their own format. I thought that would be a common case?

bq. I also don't see, why it "leaks to the API" – cast are only required in internal classed. User facing public DSL API does not require any casts (or do I miss something?).

It leaks in the form of a non-intuitive API, one that, for instance, takes both a Materialized and a Produced (`table`), because the Materialized needs to be overridden so that the defaults (which aren't type-safe) aren't used.

bq. If a KStream knows its Serde or only knows its deserializer seems to be a runtime detail

Again, that holds only if you consider serialization to be something that happens purely at runtime (using reflection), which IMO is a bad thing. Look up circe: it's a great library that is worth moving to Scala for ;)

bq. but nothing a user should see in the API from my point of view.

The user is already seeing this when they have to work around the default Serde, as in `table`: you have 1) a default pair of Serdes in the global config, 2) Serdes in the Materialized, and 3) Serdes as a parameter of the `table` method. With my suggestion you only have one pair of Serdes. From the user's perspective it's a whole lot easier to reason about.

As you said, it's a matter of trade-offs: whether you want to support the "all your types are encoded/decoded by AVRO / Jackson at runtime" use case, or have type-safe serialization and a cleaner API. Personally I would go for the type-safe solution, even if that means that Jackson users end up passing their `objectMapper` a couple more times. This will at least make them think about serialization, and avoid situations like "how come my Car format isn't right? oh, there's a global serde which is made of a global objectMapper that is configured with a CustomCarDeserializer in some remote part of the application".

bq. it has the disadvantage that it introduces two classes that (from a business logic point of view) are the same thing

In this fluent style I don't think the end user cares much about which builder classes are returned, as long as the IDE suggests valid methods. The main point is discoverability.

> Relax requirements on serializing-only methods.
> -----------------------------------------------
>
>                 Key: KAFKA-7654
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7654
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: Bruno Bieth
>            Priority: Major
>
> Methods such as KStream#to shouldn't require a Produced as only the
> serializing part is ever used.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)