Good point.

I guess the simplest solution is, to actually add

>> default.list.key/value.serde.type
>> default.list.key/value.serde.inner

to both `CommonClientConfigs` and `StreamsConfig`.

It's not super clean, but I think it's the best we can do. Thoughts?


-Matthias

On 7/19/19 1:23 PM, Development wrote:
> Hi Matthias,
> 
> I agree, ConsumerConfig did not seem like a right place for these 
> configurations.
> I’ll put them in ProducerConfig, ConsumerConfig, and StreamsConfig.
> 
> However, I have a question. What should I do in "configure(Map<String, ?> 
> configs, boolean isKey)” methods? Which configurations should I try to 
> locate? I was comparing my (de)serializer implementations with 
> SessionWindows(De)serializer classes, and they use StreamsConfig class to get 
>  either StreamsConfig.DEFAULT_WINDOWED_KEY_SERDE_INNER_CLASS : 
> StreamsConfig.DEFAULT_WINDOWED_VALUE_SERDE_INNER_CLASS
> 
> In my case, as I mentioned earlier, StreamsConfig class is not accessible 
> from org.apache.kafka.common.serialization package. So, I can’t utilize it. 
> Any suggestions here?
> 
> Best,
> Daniyar Yeralin
> 
> 
>> On Jul 18, 2019, at 8:46 PM, Matthias J. Sax <matth...@confluent.io> wrote:
>>
>> Thanks!
>>
>> One minor question about the configs. The KIP adds three classes, a
>> Serializer, a Deserializer, and a Serde.
>>
>> Hence, would it make sense to add the corresponding configs to
>> `ConsumerConfig`, `ProducerConfig`, and `StreamsConfig` using slightly
>> different names each time?
>>
>>
>> Somethin like this:
>>
>> ProducerConfig:
>>
>> list.key/value.serializer.type
>> list.key/value.serializer.inner
>>
>> ConsumerConfig:
>>
>> list.key/value.deserializer.type
>> list.key/value.deserializer.inner
>>
>> StreamsConfig:
>>
>> default.list.key/value.serde.type
>> default.list.key/value.serde.inner
>>
>>
>> Adding `d.l.k/v.serde.t/i` to `CommonClientConfigs does not sound right
>> to me. Also note, that it seems better to avoid the `default.` prefix
>> for consumers and producers because there is only one Serializer or
>> Deserializer anyway. Only for Streams, there are multiple and
>> StreamsConfig specifies the default one of an operator does not
>> overwrite it.
>>
>> Thoughts?
>>
>>
>> Also, the KIP should explicitly mention to what classed certain configs
>> are added. Atm, the KIP only list parameter names, but does not state
>> where those are added.
>>
>>
>> -Matthias
>>
>>
>>
>>
>>
>> On 7/16/19 1:11 PM, Development wrote:
>>> Hi,
>>>
>>> Yes, totally forgot about the statement. KIP-466 is updated.
>>>
>>> Thank you so much John Roesler, Matthias J. Sax, Sophie Blee-Goldman for 
>>> your valuable input!
>>>
>>> I hope I did not cause too much trouble :)
>>>
>>> I’ll start the vote now.
>>>
>>> Best,
>>> Daniyar Yeralin
>>>
>>>> On Jul 16, 2019, at 3:17 PM, John Roesler <j...@confluent.io> wrote:
>>>>
>>>> Hi Daniyar,
>>>>
>>>> Thanks for that update. I took a look, and I think this is in good shape.
>>>>
>>>> One note, the statement "New method public static <T> Serde<List<T>>
>>>> ListSerde() in org.apache.kafka.common.serialization.Serdes class
>>>> (infers list implementation and inner serde from config file)" is
>>>> still present in the KIP, although I do it is was removed from the PR.
>>>>
>>>> Once you remove that statement from the KIP, then I think this KIP is
>>>> ready to go up for a vote! Then, we can really review the PR in
>>>> earnest and get this thing merged.
>>>>
>>>> Thanks,
>>>> -john
>>>>
>>>> On Tue, Jul 16, 2019 at 2:05 PM Development <d...@yeralin.net> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> Pushed new changes under my PR: https://github.com/apache/kafka/pull/6592 
>>>>> <https://github.com/apache/kafka/pull/6592>
>>>>>
>>>>> Feel free to put any comments in there.
>>>>>
>>>>> Best,
>>>>> Daniyar Yeralin
>>>>>
>>>>>> On Jul 15, 2019, at 1:06 PM, Development <d...@yeralin.net> wrote:
>>>>>>
>>>>>> Hi John,
>>>>>>
>>>>>> I knew I was missing something. Yes, that makes sense now, I removed all 
>>>>>> `listSerde()` methods, and left empty constructors instead.
>>>>>>
>>>>>> As per `CommonClientConfigs` I looked at the class, it doesn’t have any 
>>>>>> properties related to serdes, and that bothers me a little.
>>>>>>
>>>>>> All properties like `default.key.serde` `default.windowed.key.serde.*` 
>>>>>> are located in StreamsConfig. I don’t want to create a confusion.
>>>>>> What also doesn’t make sense to me is that `WindowedSerdes` and its 
>>>>>> (de)serializers are not located in 
>>>>>> org.apache.kafka.common.serialization. I guess it kind of makes sense 
>>>>>> since windowed serdes are only available for kafka streams, not vice 
>>>>>> versa.
>>>>>>
>>>>>> If everyone is okay to put list properties in `CommonClientConfigs` 
>>>>>> class, I’ll go ahead and do that then.
>>>>>>
>>>>>> Thank you for your input!
>>>>>>
>>>>>> Best,
>>>>>> Daniyar Yeralin
>>>>>>
>>>>>>> On Jul 15, 2019, at 11:45 AM, John Roesler <j...@confluent.io> wrote:
>>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> Regarding the placement, you might as well move the constants to 
>>>>>>> `org.apache.kafka.clients.CommonClientConfigs`, so that the constants 
>>>>>>> and the configs and the code are in the same module.
>>>>>>>
>>>>>>> Regarding the constructor... What Matthias said is correct: The serde, 
>>>>>>> serializer, and deserializer all need to have zero-arg constructors so 
>>>>>>> they can be instantiated reflectively by Kafka. However, the factory 
>>>>>>> method you proposed "New method public static <T> Serde<List<T>> 
>>>>>>> ListSerde()" is not a constructor, and is not required. It would be 
>>>>>>> used purely from the Java interface, but has the drawbacks I listed 
>>>>>>> above. This method, not the constructor, is what I proposed to remove 
>>>>>>> from the KIP.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> -John
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Jul 15, 2019 at 10:15 AM Development <d...@yeralin.net 
>>>>>>> <mailto:d...@yeralin.net>> wrote:
>>>>>>> One problem though.
>>>>>>>
>>>>>>> Since WindowedSerde (Windowed(De)Serializer) are so similar, I’m trying 
>>>>>>> to mimic the implementation of my ListSerde accordingly.
>>>>>>>
>>>>>>> I created couple constants under StreamsConfig:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> And trying to do similar construct:
>>>>>>> final String propertyName = isKey ? 
>>>>>>> StreamsConfig.DEFAULT_LIST_KEY_SERDE_INNER_CLASS : 
>>>>>>> StreamsConfig.DEFAULT_LIST_VALUE_SERDE_INNER_CLASS;
>>>>>>> But then found out that StreamsConfig is not accessible from 
>>>>>>> org.apache.kafka.common.serialization package while window serde 
>>>>>>> (de)serializers are located under org.apache.kafka.streams.kstream 
>>>>>>> package.
>>>>>>>
>>>>>>> What should I do? Should I move my classes under 
>>>>>>> org.apache.kafka.streams.kstream package instead?
>>>>>>>
>>>>>>>> On Jul 15, 2019, at 10:45 AM, Development <d...@yeralin.net 
>>>>>>>> <mailto:d...@yeralin.net>> wrote:
>>>>>>>>
>>>>>>>> Hi Matthias,
>>>>>>>>
>>>>>>>> Thank you for your input.
>>>>>>>>
>>>>>>>> I updated the KIP, made it a little more readable.
>>>>>>>>
>>>>>>>> I think the configuration parameters strategy is finalized then.
>>>>>>>>
>>>>>>>> Do you have any other questions/concerns regarding this KIP?
>>>>>>>>
>>>>>>>> Meanwhile I’ll start doing appropriate code changes, and commit them 
>>>>>>>> under my PR.
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Daniyar Yeralin
>>>>>>>>
>>>>>>>>> On Jul 11, 2019, at 2:44 PM, Matthias J. Sax <matth...@confluent.io 
>>>>>>>>> <mailto:matth...@confluent.io>> wrote:
>>>>>>>>>
>>>>>>>>> Daniyar,
>>>>>>>>>
>>>>>>>>> thanks for the update to the KIP. It's in really good shape and well
>>>>>>>>> written.
>>>>>>>>>
>>>>>>>>> About the default constructor question:
>>>>>>>>>
>>>>>>>>> All Serdes/Serializer/Deserializer classes need a default constructor 
>>>>>>>>> to
>>>>>>>>> create them easily via reflections when specifies in a config. I
>>>>>>>>> understand that it is not super user friendly, but all existing code
>>>>>>>>> works this way. Hence, it seems best to stick with the established 
>>>>>>>>> pattern.
>>>>>>>>>
>>>>>>>>> We have a similar issue with `TimeWindowedSerde` and
>>>>>>>>> `SessionWindowedSerde`, and I just recently did a PR to improve user
>>>>>>>>> experience that address the exact issue John raised. (cf
>>>>>>>>> https://github.com/apache/kafka/pull/7067 
>>>>>>>>> <https://github.com/apache/kafka/pull/7067>)
>>>>>>>>>
>>>>>>>>> Note, that if a user would instantiate the Serde manually, the user
>>>>>>>>> would also need to call `configure()` to setup the inner serdes. Kafka
>>>>>>>>> Streams would not setup those automatically and one might most likely
>>>>>>>>> end-up with an NPE.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Coming back the KIP, and the parameter names. `WindowedSerdes` are
>>>>>>>>> similar to `ListSerde` as they wrap another Serde. For 
>>>>>>>>> `WindowedSerdes`,
>>>>>>>>> we use the following parameter names:
>>>>>>>>>
>>>>>>>>> - default.windowed.key.serde.inner
>>>>>>>>> - default.windowed.value.serde.inner
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> It might be good to align the naming pattern. I would also suggest to
>>>>>>>>> use `type` instead of `impl`?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> default.key.list.serde.impl  ->  default.list.key.serde.type
>>>>>>>>> default.value.list.serde.impl  ->  default.list.value.serde.type
>>>>>>>>> default.key.list.serde.element  ->  default.list.key.serde.inner
>>>>>>>>> default.value.list.serde.element  ->  default.list.value.serde.inner
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> -Matthias
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 7/10/19 8:52 AM, Development wrote:
>>>>>>>>>> Hi John,
>>>>>>>>>>
>>>>>>>>>> Yes, I do agree. That totally makes sense. The only thing is that it 
>>>>>>>>>> goes against what Matthias suggested earlier:
>>>>>>>>>> "I think that ... `ListSerde` should have an default constructor and 
>>>>>>>>>> it should be possible to pass in the `Class listClass` information 
>>>>>>>>>> via a configuration. Otherwise, KafkaStreams cannot use it as 
>>>>>>>>>> default serde.”
>>>>>>>>>>
>>>>>>>>>> What do you think about that? I hope I’m not confusing anything.
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> Daniyar Yeralin
>>>>>>>>>>
>>>>>>>>>>> On Jul 9, 2019, at 5:56 PM, John Roesler <j...@confluent.io 
>>>>>>>>>>> <mailto:j...@confluent.io>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Ah, my apologies, I must have just overlooked it. Thanks for the 
>>>>>>>>>>> update, too.
>>>>>>>>>>>
>>>>>>>>>>> Just one more super-small question, do we need this variant:
>>>>>>>>>>>
>>>>>>>>>>>> New method public static <T> Serde<List<T>> ListSerde() in 
>>>>>>>>>>>> org.apache.kafka.common.serialization.Serdes class (infers list 
>>>>>>>>>>>> implementation and inner serde from config file)
>>>>>>>>>>>
>>>>>>>>>>> It seems like this situation implies my config file is already set 
>>>>>>>>>>> up for the list serde, so passing this serde (e.g., in Produced) 
>>>>>>>>>>> would have the same effect as not specifying it.
>>>>>>>>>>>
>>>>>>>>>>> I guess that it could be the case that you have the 
>>>>>>>>>>> `default.key/value.serde` set to something else, like StringSerde, 
>>>>>>>>>>> but you still have the `default.key/value.list.serde.impl/element` 
>>>>>>>>>>> set. This seems like it would result in more confusion than 
>>>>>>>>>>> convenience, so my gut instinct is maybe we shouldn't introduce the 
>>>>>>>>>>> `ListSerde()` variant until people actually request it later on.
>>>>>>>>>>>
>>>>>>>>>>> Thus, we'd just stick with fully config-driven or fully 
>>>>>>>>>>> source-code-driven, not half/half.
>>>>>>>>>>>
>>>>>>>>>>> What do you think?
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> -John
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Jul 9, 2019 at 9:58 AM Development <d...@yeralin.net 
>>>>>>>>>>> <mailto:d...@yeralin.net> <mailto:d...@yeralin.net 
>>>>>>>>>>> <mailto:d...@yeralin.net>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hi John,
>>>>>>>>>>>>
>>>>>>>>>>>> I hope everyone had a great long weekend.
>>>>>>>>>>>>
>>>>>>>>>>>> Regarding Java interfaces, I may not understand you correctly, but 
>>>>>>>>>>>> I think I already listed them:
>>>>>>>>>>>>
>>>>>>>>>>>> So for Produced, you would use it in the following fashion, for 
>>>>>>>>>>>> example: Produced.keySerde(Serdes.ListSerde(ArrayList.class, 
>>>>>>>>>>>> Serdes.Integer()))
>>>>>>>>>>>>
>>>>>>>>>>>> I also updated the KIP, and added a section “Serialization 
>>>>>>>>>>>> Strategy” where I describe our logic of conditional serialization 
>>>>>>>>>>>> based on the type of an inner serde.
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you!
>>>>>>>>>>>>
>>>>>>>>>>>> Best,
>>>>>>>>>>>> Daniyar Yeralin
>>>>>>>>>>>>
>>>>>>>>>>>> On Jun 26, 2019, at 11:44 AM, John Roesler <j...@confluent.io 
>>>>>>>>>>>> <mailto:j...@confluent.io> <mailto:j...@confluent.io 
>>>>>>>>>>>> <mailto:j...@confluent.io>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for the update, Daniyar!
>>>>>>>>>>>>
>>>>>>>>>>>> In addition to specifying the config interface, can you also 
>>>>>>>>>>>> specify
>>>>>>>>>>>> the Java interface? Namely, if I need to pass an instance of this
>>>>>>>>>>>> serde in to the DSL directly, as in Produced, Materialized, etc., 
>>>>>>>>>>>> what
>>>>>>>>>>>> constructor(s) would I have available? Likewise with the Serializer
>>>>>>>>>>>> and Deserailizer. I don't think you need to specify the 
>>>>>>>>>>>> implementation
>>>>>>>>>>>> logic, since we've already discussed it here.
>>>>>>>>>>>>
>>>>>>>>>>>> If you also want to specify the serialized format of the data 
>>>>>>>>>>>> records
>>>>>>>>>>>> in the KIP, it could be useful documentation, as well as letting us
>>>>>>>>>>>> verify the schema for forward/backward compatibility concerns, etc.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> John
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Jun 26, 2019 at 10:33 AM Development <d...@yeralin.net 
>>>>>>>>>>>> <mailto:d...@yeralin.net> <mailto:d...@yeralin.net 
>>>>>>>>>>>> <mailto:d...@yeralin.net>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Hey,
>>>>>>>>>>>>
>>>>>>>>>>>> Finally made updates to the KIP: 
>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-466%3A+Add+support+for+List%3CT%3E+serialization+and+deserialization
>>>>>>>>>>>>  
>>>>>>>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466%3A+Add+support+for+List%3CT%3E+serialization+and+deserialization>
>>>>>>>>>>>>  
>>>>>>>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466%3A+Add+support+for+List%3CT%3E+serialization+and+deserialization
>>>>>>>>>>>>  
>>>>>>>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466%3A+Add+support+for+List%3CT%3E+serialization+and+deserialization>>
>>>>>>>>>>>>  
>>>>>>>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466:+Add+support+for+List%3CT%3E+serialization+and+deserialization
>>>>>>>>>>>>  
>>>>>>>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466:+Add+support+for+List%3CT%3E+serialization+and+deserialization>
>>>>>>>>>>>>  
>>>>>>>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466:+Add+support+for+List%3CT%3E+serialization+and+deserialization
>>>>>>>>>>>>  
>>>>>>>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466:+Add+support+for+List%3CT%3E+serialization+and+deserialization>>>
>>>>>>>>>>>> Sorry for the delay :)
>>>>>>>>>>>>
>>>>>>>>>>>> Thank You!
>>>>>>>>>>>>
>>>>>>>>>>>> Best,
>>>>>>>>>>>> Daniyar Yeralin
>>>>>>>>>>>>
>>>>>>>>>>>> On Jun 22, 2019, at 12:49 AM, Matthias J. Sax 
>>>>>>>>>>>> <matth...@confluent.io <mailto:matth...@confluent.io> 
>>>>>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io>>> 
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Yes, something like this. I did not think about good configuration
>>>>>>>>>>>> parameter names yet. I am also not sure if I understand all 
>>>>>>>>>>>> proposed
>>>>>>>>>>>> configs atm. But all configs should be listed and explained in the 
>>>>>>>>>>>> KIP
>>>>>>>>>>>> anyway, and we can discuss further after you have updated the KIP 
>>>>>>>>>>>> (I can
>>>>>>>>>>>> ask more detailed question if I have any).
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> -Matthias
>>>>>>>>>>>>
>>>>>>>>>>>> On 6/21/19 2:05 PM, Development wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Yes, you are right. ByteSerializer is not what I need to have in a 
>>>>>>>>>>>> list
>>>>>>>>>>>> of primitives.
>>>>>>>>>>>>
>>>>>>>>>>>> As for the default constructor and configurability, just want to 
>>>>>>>>>>>> make
>>>>>>>>>>>> sure. Is this what you have on your mind?
>>>>>>>>>>>>
>>>>>>>>>>>> Best,
>>>>>>>>>>>> Daniyar Yeralin
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Jun 21, 2019, at 2:51 PM, Matthias J. Sax 
>>>>>>>>>>>> <matth...@confluent.io <mailto:matth...@confluent.io> 
>>>>>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io>>
>>>>>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io> 
>>>>>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io>>>> 
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for the update!
>>>>>>>>>>>>
>>>>>>>>>>>> I think that `ListDeserializer`, `ListSerializer`, and `ListSerde`
>>>>>>>>>>>> should have an default constructor and it should be possible to 
>>>>>>>>>>>> pass in
>>>>>>>>>>>> the `Class listClass` information via a configuration. Otherwise,
>>>>>>>>>>>> KafkaStreams cannot use it as default serde.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> For the primitive serializers: `BytesSerializer` is not primitive 
>>>>>>>>>>>> IMHO,
>>>>>>>>>>>> as is it for `byte[]` with variable length -- it's for arrays, not 
>>>>>>>>>>>> for
>>>>>>>>>>>> single `byte` (note, that `Bytes` is a Kafka class wrapping 
>>>>>>>>>>>> `byte[]`).
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> For tests, we can comment on the PR. No need to do this in the KIP
>>>>>>>>>>>> discussion.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Can you also update the KIP?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> -Matthias
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 6/21/19 11:29 AM, Development wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> I made and pushed necessary commits, so we could review the final
>>>>>>>>>>>> version under PR https://github.com/apache/kafka/pull/6592 
>>>>>>>>>>>> <https://github.com/apache/kafka/pull/6592> 
>>>>>>>>>>>> <https://github.com/apache/kafka/pull/6592 
>>>>>>>>>>>> <https://github.com/apache/kafka/pull/6592>>
>>>>>>>>>>>>
>>>>>>>>>>>> I also need some advice on writing tests for this new serde. So 
>>>>>>>>>>>> far I
>>>>>>>>>>>> only have two test cases (roundtrip and empty payload), I’m not 
>>>>>>>>>>>> sure
>>>>>>>>>>>> if it is enough.
>>>>>>>>>>>>
>>>>>>>>>>>> Thank y’all for your help in this KIP :)
>>>>>>>>>>>>
>>>>>>>>>>>> Best,
>>>>>>>>>>>> Daniyar Yeralin
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Jun 21, 2019, at 1:44 PM, John Roesler <j...@confluent.io 
>>>>>>>>>>>> <mailto:j...@confluent.io> <mailto:j...@confluent.io 
>>>>>>>>>>>> <mailto:j...@confluent.io>>
>>>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io> 
>>>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hey Daniyar,
>>>>>>>>>>>>
>>>>>>>>>>>> Looks good to me! Thanks for considering it.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> -John
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Jun 21, 2019 at 9:04 AM Development <d...@yeralin.net 
>>>>>>>>>>>> <mailto:d...@yeralin.net> <mailto:d...@yeralin.net 
>>>>>>>>>>>> <mailto:d...@yeralin.net>>
>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>> 
>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>>> wrote:
>>>>>>>>>>>> Hey John and Matthias,
>>>>>>>>>>>>
>>>>>>>>>>>> Yes, now I see it all. I’m storing lots of redundant information.
>>>>>>>>>>>> Here is my final idea. Yes, now a user should pass a list type. I
>>>>>>>>>>>> realized that’s the type is not really needed in ListSerializer, 
>>>>>>>>>>>> but
>>>>>>>>>>>> only in ListDeserializer:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> In ListSerializer we will start storing sizes only if serializer is
>>>>>>>>>>>> not a primitive serializer:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Then, in deserializer, we persist passed list type, so that during
>>>>>>>>>>>> deserialization we could create an instance of it with predefined
>>>>>>>>>>>> listSize for better performance.
>>>>>>>>>>>> We also try to locate a primitiveSize based on passed deserializer.
>>>>>>>>>>>> If it is not there, then primitiveSize will be null. Which means
>>>>>>>>>>>> that each entry’s size was encoded individually.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> This looks much cleaner and more concise.
>>>>>>>>>>>>
>>>>>>>>>>>> What do you think?
>>>>>>>>>>>>
>>>>>>>>>>>> Best,
>>>>>>>>>>>> Daniyar Yeralin
>>>>>>>>>>>>
>>>>>>>>>>>> On Jun 20, 2019, at 5:45 PM, Matthias J. Sax 
>>>>>>>>>>>> <matth...@confluent.io <mailto:matth...@confluent.io> 
>>>>>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io>>
>>>>>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io> 
>>>>>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io>>> 
>>>>>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io> 
>>>>>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io>>>> 
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> For encoding the list-type: I see John's point about re-encoding 
>>>>>>>>>>>> the
>>>>>>>>>>>> list-type redundantly. However, I also don't like the idea that the
>>>>>>>>>>>> Deserializer returns a fixed type...
>>>>>>>>>>>>
>>>>>>>>>>>> Maybe it's best allow users to specify the target list type on
>>>>>>>>>>>> deserialization via config?
>>>>>>>>>>>>
>>>>>>>>>>>> Similar for the primitive types: I don't think we need to encode 
>>>>>>>>>>>> the
>>>>>>>>>>>> type size, but users could specify the type on the deserializer 
>>>>>>>>>>>> (via a
>>>>>>>>>>>> config again)?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> About generics: nesting could be arbitrarily deep. Hence, I doubt
>>>>>>>>>>>> we can
>>>>>>>>>>>> support this and a cast will be necessary at some point in the user
>>>>>>>>>>>> code.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> -Matthias
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 6/20/19 1:21 PM, John Roesler wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hey Daniyar,
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for looking at it!
>>>>>>>>>>>>
>>>>>>>>>>>> Something like your screenshot is more along the lines of what I 
>>>>>>>>>>>> was
>>>>>>>>>>>> thinking. Sorry, but I didn't follow what you mean, how would that 
>>>>>>>>>>>> not
>>>>>>>>>>>> be "vanilla java"?
>>>>>>>>>>>>
>>>>>>>>>>>> Unfortunately the deserializer needs more information, though. For
>>>>>>>>>>>> example, what if the inner type is a Map<String,String>? The serde
>>>>>>>>>>>> could
>>>>>>>>>>>> only be used to produce a LinkedList<Map>, thus, we'd still need an
>>>>>>>>>>>> inner serde, like you have in the KIP (Serde<T> innerSerde).
>>>>>>>>>>>>
>>>>>>>>>>>> Something more like Serde<LinkedList<MyRecord>> = Serdes.listSerde(
>>>>>>>>>>>> /**list type**/ LinkedList.class,
>>>>>>>>>>>> /**inner serde**/ new MyRecordSerde()
>>>>>>>>>>>> )
>>>>>>>>>>>>
>>>>>>>>>>>> And in configuration, it's something like:
>>>>>>>>>>>> default.key.serde: org...ListSerde
>>>>>>>>>>>> default.key.list.serde.type: java.util.LinkedList
>>>>>>>>>>>> default.key.list.serde.inner: com.mycompany.MyRecordSerde
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> What do you think?
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> -John
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Jun 20, 2019 at 2:46 PM Development <d...@yeralin.net 
>>>>>>>>>>>> <mailto:d...@yeralin.net> <mailto:d...@yeralin.net 
>>>>>>>>>>>> <mailto:d...@yeralin.net>>
>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>> 
>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>>
>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>> 
>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hey John,
>>>>>>>>>>>>
>>>>>>>>>>>> I gave read about TypeReference. It could work for the list serde.
>>>>>>>>>>>> However, it is not directly
>>>>>>>>>>>> supported:
>>>>>>>>>>>> https://github.com/FasterXML/jackson-databind/issues/1490 
>>>>>>>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490> 
>>>>>>>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490 
>>>>>>>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490>>
>>>>>>>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490 
>>>>>>>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490> 
>>>>>>>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490 
>>>>>>>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490>>>
>>>>>>>>>>>> The only way is to pass an actual class object into the 
>>>>>>>>>>>> constructor,
>>>>>>>>>>>> something like:
>>>>>>>>>>>>
>>>>>>>>>>>> It could be an option, but not a pretty one. What do you think of 
>>>>>>>>>>>> my
>>>>>>>>>>>> approach to use vanilla java and canonical class name? (As 
>>>>>>>>>>>> described
>>>>>>>>>>>> previously)
>>>>>>>>>>>>
>>>>>>>>>>>> Best,
>>>>>>>>>>>> Daniyar Yeralin
>>>>>>>>>>>>
>>>>>>>>>>>> On Jun 20, 2019, at 2:45 PM, Development <d...@yeralin.net 
>>>>>>>>>>>> <mailto:d...@yeralin.net> <mailto:d...@yeralin.net 
>>>>>>>>>>>> <mailto:d...@yeralin.net>>
>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>> 
>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>>
>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>> 
>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hi John,
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you for your input! Yes, my idea looks a little bit over
>>>>>>>>>>>> engineered :)
>>>>>>>>>>>>
>>>>>>>>>>>> I also wanted to see a feedback from Mathias as well since he gave
>>>>>>>>>>>> me an idea about storing fixed/variable size entries.
>>>>>>>>>>>>
>>>>>>>>>>>> Best,
>>>>>>>>>>>> Daniyar Yeralin
>>>>>>>>>>>>
>>>>>>>>>>>> On Jun 18, 2019, at 6:06 PM, John Roesler <j...@confluent.io 
>>>>>>>>>>>> <mailto:j...@confluent.io> <mailto:j...@confluent.io 
>>>>>>>>>>>> <mailto:j...@confluent.io>>
>>>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io> 
>>>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io>>> 
>>>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io> 
>>>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io>>>
>>>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io> 
>>>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io>> 
>>>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io> 
>>>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hi Daniyar,
>>>>>>>>>>>>
>>>>>>>>>>>> That's a very clever solution!
>>>>>>>>>>>>
>>>>>>>>>>>> One observation is that, now, this is what we might call a
>>>>>>>>>>>> polymorphic
>>>>>>>>>>>> serde. That is, you're detecting the actual concrete type and then
>>>>>>>>>>>> promising to produce the exact same concrete type on read.
>>>>>>>>>>>> There are
>>>>>>>>>>>> some inherent problems with this approach, which in general
>>>>>>>>>>>> require
>>>>>>>>>>>> some kind of  schema registry (not necessarily Schema
>>>>>>>>>>>> Registry, just
>>>>>>>>>>>> any registry for schemas) to solve.
>>>>>>>>>>>>
>>>>>>>>>>>> Notice that every serialized record has quite a bit of duplicated
>>>>>>>>>>>> information: the concrete type as well as a byte to indicate
>>>>>>>>>>>> whether
>>>>>>>>>>>> the value type is a fixed size, and, if so, an integer to
>>>>>>>>>>>> indicate the
>>>>>>>>>>>> actual size. These constitute a schema, of sorts, because they
>>>>>>>>>>>> tell us
>>>>>>>>>>>> later how exactly to deserialize the data. Unfortunately, this
>>>>>>>>>>>> information is completely redundant. In all likelihood, the
>>>>>>>>>>>> information will be exactly the same for every record in the
>>>>>>>>>>>> topic.
>>>>>>>>>>>> This problem is essentially the core motivation for serializations
>>>>>>>>>>>> like Avro: to move the schema outside of the serialization
>>>>>>>>>>>> itself, so
>>>>>>>>>>>> that the records won't contain so much redundant information.
>>>>>>>>>>>>
>>>>>>>>>>>> In this light, I'm wondering if it makes sense to go back to
>>>>>>>>>>>> something
>>>>>>>>>>>> like what you had earlier in which you don't support perfectly
>>>>>>>>>>>> preserving the concrete type for _this_ serde, but instead just
>>>>>>>>>>>> support deserializing to _some_ List. Then, you could defer full,
>>>>>>>>>>>> perfect, type preservation to serdes that have an external
>>>>>>>>>>>> system in
>>>>>>>>>>>> which to register their type information.
>>>>>>>>>>>>
>>>>>>>>>>>> There does exist an alternative, if we really do want to
>>>>>>>>>>>> preserve the
>>>>>>>>>>>> concrete type (which does seem kind of nice). You can add a
>>>>>>>>>>>> configuration option specifically for the serde to configure
>>>>>>>>>>>> what the
>>>>>>>>>>>> list type will be, and maybe what the element type is, as well.
>>>>>>>>>>>>
>>>>>>>>>>>> As far as "related work" goes, you might be interested to take
>>>>>>>>>>>> a look
>>>>>>>>>>>> at how Jackson can be configured to deserialize into a specific,
>>>>>>>>>>>> arbitrarily nested, generically parameterized class structure.
>>>>>>>>>>>> Specifically, you might find
>>>>>>>>>>>> https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html
>>>>>>>>>>>>  
>>>>>>>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html>
>>>>>>>>>>>>  
>>>>>>>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html
>>>>>>>>>>>>  
>>>>>>>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html>>
>>>>>>>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html
>>>>>>>>>>>>  
>>>>>>>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html>
>>>>>>>>>>>>  
>>>>>>>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html
>>>>>>>>>>>>  
>>>>>>>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html>>>
>>>>>>>>>>>> interesting.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> -John
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Jun 17, 2019 at 12:38 PM Development <d...@yeralin.net 
>>>>>>>>>>>> <mailto:d...@yeralin.net> <mailto:d...@yeralin.net 
>>>>>>>>>>>> <mailto:d...@yeralin.net>>
>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>> 
>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>>
>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>> 
>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> bump
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>
>>
> 
> 

Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to