Hi Daniyar,

Thanks for that update. I took a look, and I think this is in good shape.

One note, the statement "New method public static <T> Serde<List<T>>
ListSerde() in org.apache.kafka.common.serialization.Serdes class
(infers list implementation and inner serde from config file)" is
still present in the KIP, although I do it is was removed from the PR.

Once you remove that statement from the KIP, then I think this KIP is
ready to go up for a vote! Then, we can really review the PR in
earnest and get this thing merged.

Thanks,
-john

On Tue, Jul 16, 2019 at 2:05 PM Development <d...@yeralin.net> wrote:
>
> Hi,
>
> Pushed new changes under my PR: https://github.com/apache/kafka/pull/6592 
> <https://github.com/apache/kafka/pull/6592>
>
> Feel free to put any comments in there.
>
> Best,
> Daniyar Yeralin
>
> > On Jul 15, 2019, at 1:06 PM, Development <d...@yeralin.net> wrote:
> >
> > Hi John,
> >
> > I knew I was missing something. Yes, that makes sense now, I removed all 
> > `listSerde()` methods, and left empty constructors instead.
> >
> > As per `CommonClientConfigs` I looked at the class, it doesn’t have any 
> > properties related to serdes, and that bothers me a little.
> >
> > All properties like `default.key.serde` `default.windowed.key.serde.*` are 
> > located in StreamsConfig. I don’t want to create a confusion.
> > What also doesn’t make sense to me is that `WindowedSerdes` and its 
> > (de)serializers are not located in org.apache.kafka.common.serialization. I 
> > guess it kind of makes sense since windowed serdes are only available for 
> > kafka streams, not vice versa.
> >
> > If everyone is okay to put list properties in `CommonClientConfigs` class, 
> > I’ll go ahead and do that then.
> >
> > Thank you for your input!
> >
> > Best,
> > Daniyar Yeralin
> >
> >> On Jul 15, 2019, at 11:45 AM, John Roesler <j...@confluent.io> wrote:
> >>
> >> Hi all,
> >>
> >> Regarding the placement, you might as well move the constants to 
> >> `org.apache.kafka.clients.CommonClientConfigs`, so that the constants and 
> >> the configs and the code are in the same module.
> >>
> >> Regarding the constructor... What Matthias said is correct: The serde, 
> >> serializer, and deserializer all need to have zero-arg constructors so 
> >> they can be instantiated reflectively by Kafka. However, the factory 
> >> method you proposed "New method public static <T> Serde<List<T>> 
> >> ListSerde()" is not a constructor, and is not required. It would be used 
> >> purely from the Java interface, but has the drawbacks I listed above. This 
> >> method, not the constructor, is what I proposed to remove from the KIP.
> >>
> >> Thanks,
> >> -John
> >>
> >>
> >> On Mon, Jul 15, 2019 at 10:15 AM Development <d...@yeralin.net 
> >> <mailto:d...@yeralin.net>> wrote:
> >> One problem though.
> >>
> >> Since WindowedSerde (Windowed(De)Serializer) are so similar, I’m trying to 
> >> mimic the implementation of my ListSerde accordingly.
> >>
> >> I created couple constants under StreamsConfig:
> >>
> >>
> >>
> >> And trying to do similar construct:
> >> final String propertyName = isKey ? 
> >> StreamsConfig.DEFAULT_LIST_KEY_SERDE_INNER_CLASS : 
> >> StreamsConfig.DEFAULT_LIST_VALUE_SERDE_INNER_CLASS;
> >> But then found out that StreamsConfig is not accessible from 
> >> org.apache.kafka.common.serialization package while window serde 
> >> (de)serializers are located under org.apache.kafka.streams.kstream package.
> >>
> >> What should I do? Should I move my classes under 
> >> org.apache.kafka.streams.kstream package instead?
> >>
> >>> On Jul 15, 2019, at 10:45 AM, Development <d...@yeralin.net 
> >>> <mailto:d...@yeralin.net>> wrote:
> >>>
> >>> Hi Matthias,
> >>>
> >>> Thank you for your input.
> >>>
> >>> I updated the KIP, made it a little more readable.
> >>>
> >>> I think the configuration parameters strategy is finalized then.
> >>>
> >>> Do you have any other questions/concerns regarding this KIP?
> >>>
> >>> Meanwhile I’ll start doing appropriate code changes, and commit them 
> >>> under my PR.
> >>>
> >>> Best,
> >>> Daniyar Yeralin
> >>>
> >>>> On Jul 11, 2019, at 2:44 PM, Matthias J. Sax <matth...@confluent.io 
> >>>> <mailto:matth...@confluent.io>> wrote:
> >>>>
> >>>> Daniyar,
> >>>>
> >>>> thanks for the update to the KIP. It's in really good shape and well
> >>>> written.
> >>>>
> >>>> About the default constructor question:
> >>>>
> >>>> All Serdes/Serializer/Deserializer classes need a default constructor to
> >>>> create them easily via reflections when specifies in a config. I
> >>>> understand that it is not super user friendly, but all existing code
> >>>> works this way. Hence, it seems best to stick with the established 
> >>>> pattern.
> >>>>
> >>>> We have a similar issue with `TimeWindowedSerde` and
> >>>> `SessionWindowedSerde`, and I just recently did a PR to improve user
> >>>> experience that address the exact issue John raised. (cf
> >>>> https://github.com/apache/kafka/pull/7067 
> >>>> <https://github.com/apache/kafka/pull/7067>)
> >>>>
> >>>> Note, that if a user would instantiate the Serde manually, the user
> >>>> would also need to call `configure()` to setup the inner serdes. Kafka
> >>>> Streams would not setup those automatically and one might most likely
> >>>> end-up with an NPE.
> >>>>
> >>>>
> >>>> Coming back the KIP, and the parameter names. `WindowedSerdes` are
> >>>> similar to `ListSerde` as they wrap another Serde. For `WindowedSerdes`,
> >>>> we use the following parameter names:
> >>>>
> >>>> - default.windowed.key.serde.inner
> >>>> - default.windowed.value.serde.inner
> >>>>
> >>>>
> >>>> It might be good to align the naming pattern. I would also suggest to
> >>>> use `type` instead of `impl`?
> >>>>
> >>>>
> >>>> default.key.list.serde.impl  ->  default.list.key.serde.type
> >>>> default.value.list.serde.impl  ->  default.list.value.serde.type
> >>>> default.key.list.serde.element  ->  default.list.key.serde.inner
> >>>> default.value.list.serde.element  ->  default.list.value.serde.inner
> >>>>
> >>>>
> >>>>
> >>>> -Matthias
> >>>>
> >>>>
> >>>> On 7/10/19 8:52 AM, Development wrote:
> >>>>> Hi John,
> >>>>>
> >>>>> Yes, I do agree. That totally makes sense. The only thing is that it 
> >>>>> goes against what Matthias suggested earlier:
> >>>>> "I think that ... `ListSerde` should have an default constructor and it 
> >>>>> should be possible to pass in the `Class listClass` information via a 
> >>>>> configuration. Otherwise, KafkaStreams cannot use it as default serde.”
> >>>>>
> >>>>> What do you think about that? I hope I’m not confusing anything.
> >>>>>
> >>>>> Best,
> >>>>> Daniyar Yeralin
> >>>>>
> >>>>>> On Jul 9, 2019, at 5:56 PM, John Roesler <j...@confluent.io 
> >>>>>> <mailto:j...@confluent.io>> wrote:
> >>>>>>
> >>>>>> Ah, my apologies, I must have just overlooked it. Thanks for the 
> >>>>>> update, too.
> >>>>>>
> >>>>>> Just one more super-small question, do we need this variant:
> >>>>>>
> >>>>>>> New method public static <T> Serde<List<T>> ListSerde() in 
> >>>>>>> org.apache.kafka.common.serialization.Serdes class (infers list 
> >>>>>>> implementation and inner serde from config file)
> >>>>>>
> >>>>>> It seems like this situation implies my config file is already set up 
> >>>>>> for the list serde, so passing this serde (e.g., in Produced) would 
> >>>>>> have the same effect as not specifying it.
> >>>>>>
> >>>>>> I guess that it could be the case that you have the 
> >>>>>> `default.key/value.serde` set to something else, like StringSerde, but 
> >>>>>> you still have the `default.key/value.list.serde.impl/element` set. 
> >>>>>> This seems like it would result in more confusion than convenience, so 
> >>>>>> my gut instinct is maybe we shouldn't introduce the `ListSerde()` 
> >>>>>> variant until people actually request it later on.
> >>>>>>
> >>>>>> Thus, we'd just stick with fully config-driven or fully 
> >>>>>> source-code-driven, not half/half.
> >>>>>>
> >>>>>> What do you think?
> >>>>>>
> >>>>>> Thanks,
> >>>>>> -John
> >>>>>>
> >>>>>>
> >>>>>> On Tue, Jul 9, 2019 at 9:58 AM Development <d...@yeralin.net 
> >>>>>> <mailto:d...@yeralin.net> <mailto:d...@yeralin.net 
> >>>>>> <mailto:d...@yeralin.net>>> wrote:
> >>>>>>>
> >>>>>>> Hi John,
> >>>>>>>
> >>>>>>> I hope everyone had a great long weekend.
> >>>>>>>
> >>>>>>> Regarding Java interfaces, I may not understand you correctly, but I 
> >>>>>>> think I already listed them:
> >>>>>>>
> >>>>>>> So for Produced, you would use it in the following fashion, for 
> >>>>>>> example: Produced.keySerde(Serdes.ListSerde(ArrayList.class, 
> >>>>>>> Serdes.Integer()))
> >>>>>>>
> >>>>>>> I also updated the KIP, and added a section “Serialization Strategy” 
> >>>>>>> where I describe our logic of conditional serialization based on the 
> >>>>>>> type of an inner serde.
> >>>>>>>
> >>>>>>> Thank you!
> >>>>>>>
> >>>>>>> Best,
> >>>>>>> Daniyar Yeralin
> >>>>>>>
> >>>>>>> On Jun 26, 2019, at 11:44 AM, John Roesler <j...@confluent.io 
> >>>>>>> <mailto:j...@confluent.io> <mailto:j...@confluent.io 
> >>>>>>> <mailto:j...@confluent.io>>> wrote:
> >>>>>>>
> >>>>>>> Thanks for the update, Daniyar!
> >>>>>>>
> >>>>>>> In addition to specifying the config interface, can you also specify
> >>>>>>> the Java interface? Namely, if I need to pass an instance of this
> >>>>>>> serde in to the DSL directly, as in Produced, Materialized, etc., what
> >>>>>>> constructor(s) would I have available? Likewise with the Serializer
> >>>>>>> and Deserailizer. I don't think you need to specify the implementation
> >>>>>>> logic, since we've already discussed it here.
> >>>>>>>
> >>>>>>> If you also want to specify the serialized format of the data records
> >>>>>>> in the KIP, it could be useful documentation, as well as letting us
> >>>>>>> verify the schema for forward/backward compatibility concerns, etc.
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> John
> >>>>>>>
> >>>>>>> On Wed, Jun 26, 2019 at 10:33 AM Development <d...@yeralin.net 
> >>>>>>> <mailto:d...@yeralin.net> <mailto:d...@yeralin.net 
> >>>>>>> <mailto:d...@yeralin.net>>> wrote:
> >>>>>>>
> >>>>>>>
> >>>>>>> Hey,
> >>>>>>>
> >>>>>>> Finally made updates to the KIP: 
> >>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-466%3A+Add+support+for+List%3CT%3E+serialization+and+deserialization
> >>>>>>>  
> >>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466%3A+Add+support+for+List%3CT%3E+serialization+and+deserialization>
> >>>>>>>  
> >>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466%3A+Add+support+for+List%3CT%3E+serialization+and+deserialization
> >>>>>>>  
> >>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466%3A+Add+support+for+List%3CT%3E+serialization+and+deserialization>>
> >>>>>>>  
> >>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466:+Add+support+for+List%3CT%3E+serialization+and+deserialization
> >>>>>>>  
> >>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466:+Add+support+for+List%3CT%3E+serialization+and+deserialization>
> >>>>>>>  
> >>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466:+Add+support+for+List%3CT%3E+serialization+and+deserialization
> >>>>>>>  
> >>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466:+Add+support+for+List%3CT%3E+serialization+and+deserialization>>>
> >>>>>>> Sorry for the delay :)
> >>>>>>>
> >>>>>>> Thank You!
> >>>>>>>
> >>>>>>> Best,
> >>>>>>> Daniyar Yeralin
> >>>>>>>
> >>>>>>> On Jun 22, 2019, at 12:49 AM, Matthias J. Sax <matth...@confluent.io 
> >>>>>>> <mailto:matth...@confluent.io> <mailto:matth...@confluent.io 
> >>>>>>> <mailto:matth...@confluent.io>>> wrote:
> >>>>>>>
> >>>>>>> Yes, something like this. I did not think about good configuration
> >>>>>>> parameter names yet. I am also not sure if I understand all proposed
> >>>>>>> configs atm. But all configs should be listed and explained in the KIP
> >>>>>>> anyway, and we can discuss further after you have updated the KIP (I 
> >>>>>>> can
> >>>>>>> ask more detailed question if I have any).
> >>>>>>>
> >>>>>>>
> >>>>>>> -Matthias
> >>>>>>>
> >>>>>>> On 6/21/19 2:05 PM, Development wrote:
> >>>>>>>
> >>>>>>> Yes, you are right. ByteSerializer is not what I need to have in a 
> >>>>>>> list
> >>>>>>> of primitives.
> >>>>>>>
> >>>>>>> As for the default constructor and configurability, just want to make
> >>>>>>> sure. Is this what you have on your mind?
> >>>>>>>
> >>>>>>> Best,
> >>>>>>> Daniyar Yeralin
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On Jun 21, 2019, at 2:51 PM, Matthias J. Sax <matth...@confluent.io 
> >>>>>>> <mailto:matth...@confluent.io> <mailto:matth...@confluent.io 
> >>>>>>> <mailto:matth...@confluent.io>>
> >>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io> 
> >>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io>>>> wrote:
> >>>>>>>
> >>>>>>> Thanks for the update!
> >>>>>>>
> >>>>>>> I think that `ListDeserializer`, `ListSerializer`, and `ListSerde`
> >>>>>>> should have an default constructor and it should be possible to pass 
> >>>>>>> in
> >>>>>>> the `Class listClass` information via a configuration. Otherwise,
> >>>>>>> KafkaStreams cannot use it as default serde.
> >>>>>>>
> >>>>>>>
> >>>>>>> For the primitive serializers: `BytesSerializer` is not primitive 
> >>>>>>> IMHO,
> >>>>>>> as is it for `byte[]` with variable length -- it's for arrays, not for
> >>>>>>> single `byte` (note, that `Bytes` is a Kafka class wrapping `byte[]`).
> >>>>>>>
> >>>>>>>
> >>>>>>> For tests, we can comment on the PR. No need to do this in the KIP
> >>>>>>> discussion.
> >>>>>>>
> >>>>>>>
> >>>>>>> Can you also update the KIP?
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> -Matthias
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On 6/21/19 11:29 AM, Development wrote:
> >>>>>>>
> >>>>>>> I made and pushed necessary commits, so we could review the final
> >>>>>>> version under PR https://github.com/apache/kafka/pull/6592 
> >>>>>>> <https://github.com/apache/kafka/pull/6592> 
> >>>>>>> <https://github.com/apache/kafka/pull/6592 
> >>>>>>> <https://github.com/apache/kafka/pull/6592>>
> >>>>>>>
> >>>>>>> I also need some advice on writing tests for this new serde. So far I
> >>>>>>> only have two test cases (roundtrip and empty payload), I’m not sure
> >>>>>>> if it is enough.
> >>>>>>>
> >>>>>>> Thank y’all for your help in this KIP :)
> >>>>>>>
> >>>>>>> Best,
> >>>>>>> Daniyar Yeralin
> >>>>>>>
> >>>>>>>
> >>>>>>> On Jun 21, 2019, at 1:44 PM, John Roesler <j...@confluent.io 
> >>>>>>> <mailto:j...@confluent.io> <mailto:j...@confluent.io 
> >>>>>>> <mailto:j...@confluent.io>>
> >>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io> 
> >>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io>>>> wrote:
> >>>>>>>
> >>>>>>> Hey Daniyar,
> >>>>>>>
> >>>>>>> Looks good to me! Thanks for considering it.
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> -John
> >>>>>>>
> >>>>>>> On Fri, Jun 21, 2019 at 9:04 AM Development <d...@yeralin.net 
> >>>>>>> <mailto:d...@yeralin.net> <mailto:d...@yeralin.net 
> >>>>>>> <mailto:d...@yeralin.net>>
> >>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
> >>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>> 
> >>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
> >>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>>> wrote:
> >>>>>>> Hey John and Matthias,
> >>>>>>>
> >>>>>>> Yes, now I see it all. I’m storing lots of redundant information.
> >>>>>>> Here is my final idea. Yes, now a user should pass a list type. I
> >>>>>>> realized that’s the type is not really needed in ListSerializer, but
> >>>>>>> only in ListDeserializer:
> >>>>>>>
> >>>>>>>
> >>>>>>> In ListSerializer we will start storing sizes only if serializer is
> >>>>>>> not a primitive serializer:
> >>>>>>>
> >>>>>>>
> >>>>>>> Then, in deserializer, we persist passed list type, so that during
> >>>>>>> deserialization we could create an instance of it with predefined
> >>>>>>> listSize for better performance.
> >>>>>>> We also try to locate a primitiveSize based on passed deserializer.
> >>>>>>> If it is not there, then primitiveSize will be null. Which means
> >>>>>>> that each entry’s size was encoded individually.
> >>>>>>>
> >>>>>>>
> >>>>>>> This looks much cleaner and more concise.
> >>>>>>>
> >>>>>>> What do you think?
> >>>>>>>
> >>>>>>> Best,
> >>>>>>> Daniyar Yeralin
> >>>>>>>
> >>>>>>> On Jun 20, 2019, at 5:45 PM, Matthias J. Sax <matth...@confluent.io 
> >>>>>>> <mailto:matth...@confluent.io> <mailto:matth...@confluent.io 
> >>>>>>> <mailto:matth...@confluent.io>>
> >>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io> 
> >>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io>>> 
> >>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io> 
> >>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io>>>> wrote:
> >>>>>>>
> >>>>>>> For encoding the list-type: I see John's point about re-encoding the
> >>>>>>> list-type redundantly. However, I also don't like the idea that the
> >>>>>>> Deserializer returns a fixed type...
> >>>>>>>
> >>>>>>> Maybe it's best allow users to specify the target list type on
> >>>>>>> deserialization via config?
> >>>>>>>
> >>>>>>> Similar for the primitive types: I don't think we need to encode the
> >>>>>>> type size, but users could specify the type on the deserializer (via a
> >>>>>>> config again)?
> >>>>>>>
> >>>>>>>
> >>>>>>> About generics: nesting could be arbitrarily deep. Hence, I doubt
> >>>>>>> we can
> >>>>>>> support this and a cast will be necessary at some point in the user
> >>>>>>> code.
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> -Matthias
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On 6/20/19 1:21 PM, John Roesler wrote:
> >>>>>>>
> >>>>>>> Hey Daniyar,
> >>>>>>>
> >>>>>>> Thanks for looking at it!
> >>>>>>>
> >>>>>>> Something like your screenshot is more along the lines of what I was
> >>>>>>> thinking. Sorry, but I didn't follow what you mean, how would that not
> >>>>>>> be "vanilla java"?
> >>>>>>>
> >>>>>>> Unfortunately the deserializer needs more information, though. For
> >>>>>>> example, what if the inner type is a Map<String,String>? The serde
> >>>>>>> could
> >>>>>>> only be used to produce a LinkedList<Map>, thus, we'd still need an
> >>>>>>> inner serde, like you have in the KIP (Serde<T> innerSerde).
> >>>>>>>
> >>>>>>> Something more like Serde<LinkedList<MyRecord>> = Serdes.listSerde(
> >>>>>>> /**list type**/ LinkedList.class,
> >>>>>>> /**inner serde**/ new MyRecordSerde()
> >>>>>>> )
> >>>>>>>
> >>>>>>> And in configuration, it's something like:
> >>>>>>> default.key.serde: org...ListSerde
> >>>>>>> default.key.list.serde.type: java.util.LinkedList
> >>>>>>> default.key.list.serde.inner: com.mycompany.MyRecordSerde
> >>>>>>>
> >>>>>>>
> >>>>>>> What do you think?
> >>>>>>> Thanks,
> >>>>>>> -John
> >>>>>>>
> >>>>>>> On Thu, Jun 20, 2019 at 2:46 PM Development <d...@yeralin.net 
> >>>>>>> <mailto:d...@yeralin.net> <mailto:d...@yeralin.net 
> >>>>>>> <mailto:d...@yeralin.net>>
> >>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
> >>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>> 
> >>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
> >>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>>
> >>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
> >>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>> 
> >>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
> >>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>>>> wrote:
> >>>>>>>
> >>>>>>> Hey John,
> >>>>>>>
> >>>>>>> I gave read about TypeReference. It could work for the list serde.
> >>>>>>> However, it is not directly
> >>>>>>> supported:
> >>>>>>> https://github.com/FasterXML/jackson-databind/issues/1490 
> >>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490> 
> >>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490 
> >>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490>>
> >>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490 
> >>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490> 
> >>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490 
> >>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490>>>
> >>>>>>> The only way is to pass an actual class object into the constructor,
> >>>>>>> something like:
> >>>>>>>
> >>>>>>> It could be an option, but not a pretty one. What do you think of my
> >>>>>>> approach to use vanilla java and canonical class name? (As described
> >>>>>>> previously)
> >>>>>>>
> >>>>>>> Best,
> >>>>>>> Daniyar Yeralin
> >>>>>>>
> >>>>>>> On Jun 20, 2019, at 2:45 PM, Development <d...@yeralin.net 
> >>>>>>> <mailto:d...@yeralin.net> <mailto:d...@yeralin.net 
> >>>>>>> <mailto:d...@yeralin.net>>
> >>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
> >>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>> 
> >>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
> >>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>>
> >>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
> >>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>> 
> >>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
> >>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>>>> wrote:
> >>>>>>>
> >>>>>>> Hi John,
> >>>>>>>
> >>>>>>> Thank you for your input! Yes, my idea looks a little bit over
> >>>>>>> engineered :)
> >>>>>>>
> >>>>>>> I also wanted to see a feedback from Mathias as well since he gave
> >>>>>>> me an idea about storing fixed/variable size entries.
> >>>>>>>
> >>>>>>> Best,
> >>>>>>> Daniyar Yeralin
> >>>>>>>
> >>>>>>> On Jun 18, 2019, at 6:06 PM, John Roesler <j...@confluent.io 
> >>>>>>> <mailto:j...@confluent.io> <mailto:j...@confluent.io 
> >>>>>>> <mailto:j...@confluent.io>>
> >>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io> 
> >>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io>>> 
> >>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io> 
> >>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io>>>
> >>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io> 
> >>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io>> 
> >>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io> 
> >>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io>>>>> wrote:
> >>>>>>>
> >>>>>>> Hi Daniyar,
> >>>>>>>
> >>>>>>> That's a very clever solution!
> >>>>>>>
> >>>>>>> One observation is that, now, this is what we might call a
> >>>>>>> polymorphic
> >>>>>>> serde. That is, you're detecting the actual concrete type and then
> >>>>>>> promising to produce the exact same concrete type on read.
> >>>>>>> There are
> >>>>>>> some inherent problems with this approach, which in general
> >>>>>>> require
> >>>>>>> some kind of  schema registry (not necessarily Schema
> >>>>>>> Registry, just
> >>>>>>> any registry for schemas) to solve.
> >>>>>>>
> >>>>>>> Notice that every serialized record has quite a bit of duplicated
> >>>>>>> information: the concrete type as well as a byte to indicate
> >>>>>>> whether
> >>>>>>> the value type is a fixed size, and, if so, an integer to
> >>>>>>> indicate the
> >>>>>>> actual size. These constitute a schema, of sorts, because they
> >>>>>>> tell us
> >>>>>>> later how exactly to deserialize the data. Unfortunately, this
> >>>>>>> information is completely redundant. In all likelihood, the
> >>>>>>> information will be exactly the same for every record in the
> >>>>>>> topic.
> >>>>>>> This problem is essentially the core motivation for serializations
> >>>>>>> like Avro: to move the schema outside of the serialization
> >>>>>>> itself, so
> >>>>>>> that the records won't contain so much redundant information.
> >>>>>>>
> >>>>>>> In this light, I'm wondering if it makes sense to go back to
> >>>>>>> something
> >>>>>>> like what you had earlier in which you don't support perfectly
> >>>>>>> preserving the concrete type for _this_ serde, but instead just
> >>>>>>> support deserializing to _some_ List. Then, you could defer full,
> >>>>>>> perfect, type preservation to serdes that have an external
> >>>>>>> system in
> >>>>>>> which to register their type information.
> >>>>>>>
> >>>>>>> There does exist an alternative, if we really do want to
> >>>>>>> preserve the
> >>>>>>> concrete type (which does seem kind of nice). You can add a
> >>>>>>> configuration option specifically for the serde to configure
> >>>>>>> what the
> >>>>>>> list type will be, and maybe what the element type is, as well.
> >>>>>>>
> >>>>>>> As far as "related work" goes, you might be interested to take
> >>>>>>> a look
> >>>>>>> at how Jackson can be configured to deserialize into a specific,
> >>>>>>> arbitrarily nested, generically parameterized class structure.
> >>>>>>> Specifically, you might find
> >>>>>>> https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html
> >>>>>>>  
> >>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html>
> >>>>>>>  
> >>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html
> >>>>>>>  
> >>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html>>
> >>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html
> >>>>>>>  
> >>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html>
> >>>>>>>  
> >>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html
> >>>>>>>  
> >>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html>>>
> >>>>>>> interesting.
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> -John
> >>>>>>>
> >>>>>>> On Mon, Jun 17, 2019 at 12:38 PM Development <d...@yeralin.net 
> >>>>>>> <mailto:d...@yeralin.net> <mailto:d...@yeralin.net 
> >>>>>>> <mailto:d...@yeralin.net>>
> >>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
> >>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>> 
> >>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
> >>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>>
> >>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
> >>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>> 
> >>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
> >>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>>>> wrote:
> >>>>>>>
> >>>>>>>
> >>>>>>> bump
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>
> >>>>>
> >>>>
> >>>
> >>
> >
>

Reply via email to