Bump

> On Jul 22, 2019, at 11:26 AM, Development <d...@yeralin.net> wrote:
> 
> Hey Matthias,
> 
> It looks a little confusing, but I don’t have enough expertise to judge on 
> the configuration placement.
> 
> If you think, it is fine I’ll go ahead with this approach.
> 
> Best,
> Daniyar Yeralin
> 
>> On Jul 19, 2019, at 5:49 PM, Matthias J. Sax <matth...@confluent.io> wrote:
>> 
>> Good point.
>> 
>> I guess the simplest solution is, to actually add
>> 
>>>> default.list.key/value.serde.type
>>>> default.list.key/value.serde.inner
>> 
>> to both `CommonClientConfigs` and `StreamsConfig`.
>> 
>> It's not super clean, but I think it's the best we can do. Thoughts?
>> 
>> 
>> -Matthias
>> 
>> On 7/19/19 1:23 PM, Development wrote:
>>> Hi Matthias,
>>> 
>>> I agree, ConsumerConfig did not seem like a right place for these 
>>> configurations.
>>> I’ll put them in ProducerConfig, ConsumerConfig, and StreamsConfig.
>>> 
>>> However, I have a question. What should I do in "configure(Map<String, ?> 
>>> configs, boolean isKey)” methods? Which configurations should I try to 
>>> locate? I was comparing my (de)serializer implementations with 
>>> SessionWindows(De)serializer classes, and they use StreamsConfig class to 
>>> get  either StreamsConfig.DEFAULT_WINDOWED_KEY_SERDE_INNER_CLASS : 
>>> StreamsConfig.DEFAULT_WINDOWED_VALUE_SERDE_INNER_CLASS
>>> 
>>> In my case, as I mentioned earlier, StreamsConfig class is not accessible 
>>> from org.apache.kafka.common.serialization package. So, I can’t utilize it. 
>>> Any suggestions here?
>>> 
>>> Best,
>>> Daniyar Yeralin
>>> 
>>> 
>>>> On Jul 18, 2019, at 8:46 PM, Matthias J. Sax <matth...@confluent.io> wrote:
>>>> 
>>>> Thanks!
>>>> 
>>>> One minor question about the configs. The KIP adds three classes, a
>>>> Serializer, a Deserializer, and a Serde.
>>>> 
>>>> Hence, would it make sense to add the corresponding configs to
>>>> `ConsumerConfig`, `ProducerConfig`, and `StreamsConfig` using slightly
>>>> different names each time?
>>>> 
>>>> 
>>>> Somethin like this:
>>>> 
>>>> ProducerConfig:
>>>> 
>>>> list.key/value.serializer.type
>>>> list.key/value.serializer.inner
>>>> 
>>>> ConsumerConfig:
>>>> 
>>>> list.key/value.deserializer.type
>>>> list.key/value.deserializer.inner
>>>> 
>>>> StreamsConfig:
>>>> 
>>>> default.list.key/value.serde.type
>>>> default.list.key/value.serde.inner
>>>> 
>>>> 
>>>> Adding `d.l.k/v.serde.t/i` to `CommonClientConfigs does not sound right
>>>> to me. Also note, that it seems better to avoid the `default.` prefix
>>>> for consumers and producers because there is only one Serializer or
>>>> Deserializer anyway. Only for Streams, there are multiple and
>>>> StreamsConfig specifies the default one of an operator does not
>>>> overwrite it.
>>>> 
>>>> Thoughts?
>>>> 
>>>> 
>>>> Also, the KIP should explicitly mention to what classed certain configs
>>>> are added. Atm, the KIP only list parameter names, but does not state
>>>> where those are added.
>>>> 
>>>> 
>>>> -Matthias
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> On 7/16/19 1:11 PM, Development wrote:
>>>>> Hi,
>>>>> 
>>>>> Yes, totally forgot about the statement. KIP-466 is updated.
>>>>> 
>>>>> Thank you so much John Roesler, Matthias J. Sax, Sophie Blee-Goldman for 
>>>>> your valuable input!
>>>>> 
>>>>> I hope I did not cause too much trouble :)
>>>>> 
>>>>> I’ll start the vote now.
>>>>> 
>>>>> Best,
>>>>> Daniyar Yeralin
>>>>> 
>>>>>> On Jul 16, 2019, at 3:17 PM, John Roesler <j...@confluent.io> wrote:
>>>>>> 
>>>>>> Hi Daniyar,
>>>>>> 
>>>>>> Thanks for that update. I took a look, and I think this is in good shape.
>>>>>> 
>>>>>> One note, the statement "New method public static <T> Serde<List<T>>
>>>>>> ListSerde() in org.apache.kafka.common.serialization.Serdes class
>>>>>> (infers list implementation and inner serde from config file)" is
>>>>>> still present in the KIP, although I do it is was removed from the PR.
>>>>>> 
>>>>>> Once you remove that statement from the KIP, then I think this KIP is
>>>>>> ready to go up for a vote! Then, we can really review the PR in
>>>>>> earnest and get this thing merged.
>>>>>> 
>>>>>> Thanks,
>>>>>> -john
>>>>>> 
>>>>>> On Tue, Jul 16, 2019 at 2:05 PM Development <d...@yeralin.net> wrote:
>>>>>>> 
>>>>>>> Hi,
>>>>>>> 
>>>>>>> Pushed new changes under my PR: 
>>>>>>> https://github.com/apache/kafka/pull/6592 
>>>>>>> <https://github.com/apache/kafka/pull/6592>
>>>>>>> 
>>>>>>> Feel free to put any comments in there.
>>>>>>> 
>>>>>>> Best,
>>>>>>> Daniyar Yeralin
>>>>>>> 
>>>>>>>> On Jul 15, 2019, at 1:06 PM, Development <d...@yeralin.net> wrote:
>>>>>>>> 
>>>>>>>> Hi John,
>>>>>>>> 
>>>>>>>> I knew I was missing something. Yes, that makes sense now, I removed 
>>>>>>>> all `listSerde()` methods, and left empty constructors instead.
>>>>>>>> 
>>>>>>>> As per `CommonClientConfigs` I looked at the class, it doesn’t have 
>>>>>>>> any properties related to serdes, and that bothers me a little.
>>>>>>>> 
>>>>>>>> All properties like `default.key.serde` `default.windowed.key.serde.*` 
>>>>>>>> are located in StreamsConfig. I don’t want to create a confusion.
>>>>>>>> What also doesn’t make sense to me is that `WindowedSerdes` and its 
>>>>>>>> (de)serializers are not located in 
>>>>>>>> org.apache.kafka.common.serialization. I guess it kind of makes sense 
>>>>>>>> since windowed serdes are only available for kafka streams, not vice 
>>>>>>>> versa.
>>>>>>>> 
>>>>>>>> If everyone is okay to put list properties in `CommonClientConfigs` 
>>>>>>>> class, I’ll go ahead and do that then.
>>>>>>>> 
>>>>>>>> Thank you for your input!
>>>>>>>> 
>>>>>>>> Best,
>>>>>>>> Daniyar Yeralin
>>>>>>>> 
>>>>>>>>> On Jul 15, 2019, at 11:45 AM, John Roesler <j...@confluent.io> wrote:
>>>>>>>>> 
>>>>>>>>> Hi all,
>>>>>>>>> 
>>>>>>>>> Regarding the placement, you might as well move the constants to 
>>>>>>>>> `org.apache.kafka.clients.CommonClientConfigs`, so that the constants 
>>>>>>>>> and the configs and the code are in the same module.
>>>>>>>>> 
>>>>>>>>> Regarding the constructor... What Matthias said is correct: The 
>>>>>>>>> serde, serializer, and deserializer all need to have zero-arg 
>>>>>>>>> constructors so they can be instantiated reflectively by Kafka. 
>>>>>>>>> However, the factory method you proposed "New method public static 
>>>>>>>>> <T> Serde<List<T>> ListSerde()" is not a constructor, and is not 
>>>>>>>>> required. It would be used purely from the Java interface, but has 
>>>>>>>>> the drawbacks I listed above. This method, not the constructor, is 
>>>>>>>>> what I proposed to remove from the KIP.
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> -John
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Mon, Jul 15, 2019 at 10:15 AM Development <d...@yeralin.net 
>>>>>>>>> <mailto:d...@yeralin.net>> wrote:
>>>>>>>>> One problem though.
>>>>>>>>> 
>>>>>>>>> Since WindowedSerde (Windowed(De)Serializer) are so similar, I’m 
>>>>>>>>> trying to mimic the implementation of my ListSerde accordingly.
>>>>>>>>> 
>>>>>>>>> I created couple constants under StreamsConfig:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> And trying to do similar construct:
>>>>>>>>> final String propertyName = isKey ? 
>>>>>>>>> StreamsConfig.DEFAULT_LIST_KEY_SERDE_INNER_CLASS : 
>>>>>>>>> StreamsConfig.DEFAULT_LIST_VALUE_SERDE_INNER_CLASS;
>>>>>>>>> But then found out that StreamsConfig is not accessible from 
>>>>>>>>> org.apache.kafka.common.serialization package while window serde 
>>>>>>>>> (de)serializers are located under org.apache.kafka.streams.kstream 
>>>>>>>>> package.
>>>>>>>>> 
>>>>>>>>> What should I do? Should I move my classes under 
>>>>>>>>> org.apache.kafka.streams.kstream package instead?
>>>>>>>>> 
>>>>>>>>>> On Jul 15, 2019, at 10:45 AM, Development <d...@yeralin.net 
>>>>>>>>>> <mailto:d...@yeralin.net>> wrote:
>>>>>>>>>> 
>>>>>>>>>> Hi Matthias,
>>>>>>>>>> 
>>>>>>>>>> Thank you for your input.
>>>>>>>>>> 
>>>>>>>>>> I updated the KIP, made it a little more readable.
>>>>>>>>>> 
>>>>>>>>>> I think the configuration parameters strategy is finalized then.
>>>>>>>>>> 
>>>>>>>>>> Do you have any other questions/concerns regarding this KIP?
>>>>>>>>>> 
>>>>>>>>>> Meanwhile I’ll start doing appropriate code changes, and commit them 
>>>>>>>>>> under my PR.
>>>>>>>>>> 
>>>>>>>>>> Best,
>>>>>>>>>> Daniyar Yeralin
>>>>>>>>>> 
>>>>>>>>>>> On Jul 11, 2019, at 2:44 PM, Matthias J. Sax <matth...@confluent.io 
>>>>>>>>>>> <mailto:matth...@confluent.io>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> Daniyar,
>>>>>>>>>>> 
>>>>>>>>>>> thanks for the update to the KIP. It's in really good shape and well
>>>>>>>>>>> written.
>>>>>>>>>>> 
>>>>>>>>>>> About the default constructor question:
>>>>>>>>>>> 
>>>>>>>>>>> All Serdes/Serializer/Deserializer classes need a default 
>>>>>>>>>>> constructor to
>>>>>>>>>>> create them easily via reflections when specifies in a config. I
>>>>>>>>>>> understand that it is not super user friendly, but all existing code
>>>>>>>>>>> works this way. Hence, it seems best to stick with the established 
>>>>>>>>>>> pattern.
>>>>>>>>>>> 
>>>>>>>>>>> We have a similar issue with `TimeWindowedSerde` and
>>>>>>>>>>> `SessionWindowedSerde`, and I just recently did a PR to improve user
>>>>>>>>>>> experience that address the exact issue John raised. (cf
>>>>>>>>>>> https://github.com/apache/kafka/pull/7067 
>>>>>>>>>>> <https://github.com/apache/kafka/pull/7067>)
>>>>>>>>>>> 
>>>>>>>>>>> Note, that if a user would instantiate the Serde manually, the user
>>>>>>>>>>> would also need to call `configure()` to setup the inner serdes. 
>>>>>>>>>>> Kafka
>>>>>>>>>>> Streams would not setup those automatically and one might most 
>>>>>>>>>>> likely
>>>>>>>>>>> end-up with an NPE.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> Coming back the KIP, and the parameter names. `WindowedSerdes` are
>>>>>>>>>>> similar to `ListSerde` as they wrap another Serde. For 
>>>>>>>>>>> `WindowedSerdes`,
>>>>>>>>>>> we use the following parameter names:
>>>>>>>>>>> 
>>>>>>>>>>> - default.windowed.key.serde.inner
>>>>>>>>>>> - default.windowed.value.serde.inner
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> It might be good to align the naming pattern. I would also suggest 
>>>>>>>>>>> to
>>>>>>>>>>> use `type` instead of `impl`?
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> default.key.list.serde.impl  ->  default.list.key.serde.type
>>>>>>>>>>> default.value.list.serde.impl  ->  default.list.value.serde.type
>>>>>>>>>>> default.key.list.serde.element  ->  default.list.key.serde.inner
>>>>>>>>>>> default.value.list.serde.element  ->  default.list.value.serde.inner
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> -Matthias
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> On 7/10/19 8:52 AM, Development wrote:
>>>>>>>>>>>> Hi John,
>>>>>>>>>>>> 
>>>>>>>>>>>> Yes, I do agree. That totally makes sense. The only thing is that 
>>>>>>>>>>>> it goes against what Matthias suggested earlier:
>>>>>>>>>>>> "I think that ... `ListSerde` should have an default constructor 
>>>>>>>>>>>> and it should be possible to pass in the `Class listClass` 
>>>>>>>>>>>> information via a configuration. Otherwise, KafkaStreams cannot 
>>>>>>>>>>>> use it as default serde.”
>>>>>>>>>>>> 
>>>>>>>>>>>> What do you think about that? I hope I’m not confusing anything.
>>>>>>>>>>>> 
>>>>>>>>>>>> Best,
>>>>>>>>>>>> Daniyar Yeralin
>>>>>>>>>>>> 
>>>>>>>>>>>>> On Jul 9, 2019, at 5:56 PM, John Roesler <j...@confluent.io 
>>>>>>>>>>>>> <mailto:j...@confluent.io>> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Ah, my apologies, I must have just overlooked it. Thanks for the 
>>>>>>>>>>>>> update, too.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Just one more super-small question, do we need this variant:
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> New method public static <T> Serde<List<T>> ListSerde() in 
>>>>>>>>>>>>>> org.apache.kafka.common.serialization.Serdes class (infers list 
>>>>>>>>>>>>>> implementation and inner serde from config file)
>>>>>>>>>>>>> 
>>>>>>>>>>>>> It seems like this situation implies my config file is already 
>>>>>>>>>>>>> set up for the list serde, so passing this serde (e.g., in 
>>>>>>>>>>>>> Produced) would have the same effect as not specifying it.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I guess that it could be the case that you have the 
>>>>>>>>>>>>> `default.key/value.serde` set to something else, like 
>>>>>>>>>>>>> StringSerde, but you still have the 
>>>>>>>>>>>>> `default.key/value.list.serde.impl/element` set. This seems like 
>>>>>>>>>>>>> it would result in more confusion than convenience, so my gut 
>>>>>>>>>>>>> instinct is maybe we shouldn't introduce the `ListSerde()` 
>>>>>>>>>>>>> variant until people actually request it later on.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Thus, we'd just stick with fully config-driven or fully 
>>>>>>>>>>>>> source-code-driven, not half/half.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> What do you think?
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> -John
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On Tue, Jul 9, 2019 at 9:58 AM Development <d...@yeralin.net 
>>>>>>>>>>>>> <mailto:d...@yeralin.net> <mailto:d...@yeralin.net 
>>>>>>>>>>>>> <mailto:d...@yeralin.net>>> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Hi John,
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I hope everyone had a great long weekend.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Regarding Java interfaces, I may not understand you correctly, 
>>>>>>>>>>>>>> but I think I already listed them:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> So for Produced, you would use it in the following fashion, for 
>>>>>>>>>>>>>> example: Produced.keySerde(Serdes.ListSerde(ArrayList.class, 
>>>>>>>>>>>>>> Serdes.Integer()))
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I also updated the KIP, and added a section “Serialization 
>>>>>>>>>>>>>> Strategy” where I describe our logic of conditional 
>>>>>>>>>>>>>> serialization based on the type of an inner serde.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Thank you!
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>> Daniyar Yeralin
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Jun 26, 2019, at 11:44 AM, John Roesler <j...@confluent.io 
>>>>>>>>>>>>>> <mailto:j...@confluent.io> <mailto:j...@confluent.io 
>>>>>>>>>>>>>> <mailto:j...@confluent.io>>> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Thanks for the update, Daniyar!
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> In addition to specifying the config interface, can you also 
>>>>>>>>>>>>>> specify
>>>>>>>>>>>>>> the Java interface? Namely, if I need to pass an instance of this
>>>>>>>>>>>>>> serde in to the DSL directly, as in Produced, Materialized, 
>>>>>>>>>>>>>> etc., what
>>>>>>>>>>>>>> constructor(s) would I have available? Likewise with the 
>>>>>>>>>>>>>> Serializer
>>>>>>>>>>>>>> and Deserailizer. I don't think you need to specify the 
>>>>>>>>>>>>>> implementation
>>>>>>>>>>>>>> logic, since we've already discussed it here.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> If you also want to specify the serialized format of the data 
>>>>>>>>>>>>>> records
>>>>>>>>>>>>>> in the KIP, it could be useful documentation, as well as letting 
>>>>>>>>>>>>>> us
>>>>>>>>>>>>>> verify the schema for forward/backward compatibility concerns, 
>>>>>>>>>>>>>> etc.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> John
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Wed, Jun 26, 2019 at 10:33 AM Development <d...@yeralin.net 
>>>>>>>>>>>>>> <mailto:d...@yeralin.net> <mailto:d...@yeralin.net 
>>>>>>>>>>>>>> <mailto:d...@yeralin.net>>> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Hey,
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Finally made updates to the KIP: 
>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-466%3A+Add+support+for+List%3CT%3E+serialization+and+deserialization
>>>>>>>>>>>>>>  
>>>>>>>>>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466%3A+Add+support+for+List%3CT%3E+serialization+and+deserialization>
>>>>>>>>>>>>>>  
>>>>>>>>>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466%3A+Add+support+for+List%3CT%3E+serialization+and+deserialization
>>>>>>>>>>>>>>  
>>>>>>>>>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466%3A+Add+support+for+List%3CT%3E+serialization+and+deserialization>>
>>>>>>>>>>>>>>  
>>>>>>>>>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466:+Add+support+for+List%3CT%3E+serialization+and+deserialization
>>>>>>>>>>>>>>  
>>>>>>>>>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466:+Add+support+for+List%3CT%3E+serialization+and+deserialization>
>>>>>>>>>>>>>>  
>>>>>>>>>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466:+Add+support+for+List%3CT%3E+serialization+and+deserialization
>>>>>>>>>>>>>>  
>>>>>>>>>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466:+Add+support+for+List%3CT%3E+serialization+and+deserialization>>>
>>>>>>>>>>>>>> Sorry for the delay :)
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Thank You!
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>> Daniyar Yeralin
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Jun 22, 2019, at 12:49 AM, Matthias J. Sax 
>>>>>>>>>>>>>> <matth...@confluent.io <mailto:matth...@confluent.io> 
>>>>>>>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io>>> 
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Yes, something like this. I did not think about good 
>>>>>>>>>>>>>> configuration
>>>>>>>>>>>>>> parameter names yet. I am also not sure if I understand all 
>>>>>>>>>>>>>> proposed
>>>>>>>>>>>>>> configs atm. But all configs should be listed and explained in 
>>>>>>>>>>>>>> the KIP
>>>>>>>>>>>>>> anyway, and we can discuss further after you have updated the 
>>>>>>>>>>>>>> KIP (I can
>>>>>>>>>>>>>> ask more detailed question if I have any).
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> -Matthias
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On 6/21/19 2:05 PM, Development wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Yes, you are right. ByteSerializer is not what I need to have in 
>>>>>>>>>>>>>> a list
>>>>>>>>>>>>>> of primitives.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> As for the default constructor and configurability, just want to 
>>>>>>>>>>>>>> make
>>>>>>>>>>>>>> sure. Is this what you have on your mind?
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>> Daniyar Yeralin
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Jun 21, 2019, at 2:51 PM, Matthias J. Sax 
>>>>>>>>>>>>>> <matth...@confluent.io <mailto:matth...@confluent.io> 
>>>>>>>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io>>
>>>>>>>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io> 
>>>>>>>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io>>>> 
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Thanks for the update!
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I think that `ListDeserializer`, `ListSerializer`, and 
>>>>>>>>>>>>>> `ListSerde`
>>>>>>>>>>>>>> should have an default constructor and it should be possible to 
>>>>>>>>>>>>>> pass in
>>>>>>>>>>>>>> the `Class listClass` information via a configuration. Otherwise,
>>>>>>>>>>>>>> KafkaStreams cannot use it as default serde.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> For the primitive serializers: `BytesSerializer` is not 
>>>>>>>>>>>>>> primitive IMHO,
>>>>>>>>>>>>>> as is it for `byte[]` with variable length -- it's for arrays, 
>>>>>>>>>>>>>> not for
>>>>>>>>>>>>>> single `byte` (note, that `Bytes` is a Kafka class wrapping 
>>>>>>>>>>>>>> `byte[]`).
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> For tests, we can comment on the PR. No need to do this in the 
>>>>>>>>>>>>>> KIP
>>>>>>>>>>>>>> discussion.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Can you also update the KIP?
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> -Matthias
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On 6/21/19 11:29 AM, Development wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I made and pushed necessary commits, so we could review the final
>>>>>>>>>>>>>> version under PR https://github.com/apache/kafka/pull/6592 
>>>>>>>>>>>>>> <https://github.com/apache/kafka/pull/6592> 
>>>>>>>>>>>>>> <https://github.com/apache/kafka/pull/6592 
>>>>>>>>>>>>>> <https://github.com/apache/kafka/pull/6592>>
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I also need some advice on writing tests for this new serde. So 
>>>>>>>>>>>>>> far I
>>>>>>>>>>>>>> only have two test cases (roundtrip and empty payload), I’m not 
>>>>>>>>>>>>>> sure
>>>>>>>>>>>>>> if it is enough.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Thank y’all for your help in this KIP :)
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>> Daniyar Yeralin
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Jun 21, 2019, at 1:44 PM, John Roesler <j...@confluent.io 
>>>>>>>>>>>>>> <mailto:j...@confluent.io> <mailto:j...@confluent.io 
>>>>>>>>>>>>>> <mailto:j...@confluent.io>>
>>>>>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io> 
>>>>>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io>>>> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Hey Daniyar,
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Looks good to me! Thanks for considering it.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> -John
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Fri, Jun 21, 2019 at 9:04 AM Development <d...@yeralin.net 
>>>>>>>>>>>>>> <mailto:d...@yeralin.net> <mailto:d...@yeralin.net 
>>>>>>>>>>>>>> <mailto:d...@yeralin.net>>
>>>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
>>>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>> 
>>>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
>>>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>>> wrote:
>>>>>>>>>>>>>> Hey John and Matthias,
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Yes, now I see it all. I’m storing lots of redundant information.
>>>>>>>>>>>>>> Here is my final idea. Yes, now a user should pass a list type. I
>>>>>>>>>>>>>> realized that’s the type is not really needed in ListSerializer, 
>>>>>>>>>>>>>> but
>>>>>>>>>>>>>> only in ListDeserializer:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> In ListSerializer we will start storing sizes only if serializer 
>>>>>>>>>>>>>> is
>>>>>>>>>>>>>> not a primitive serializer:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Then, in deserializer, we persist passed list type, so that 
>>>>>>>>>>>>>> during
>>>>>>>>>>>>>> deserialization we could create an instance of it with predefined
>>>>>>>>>>>>>> listSize for better performance.
>>>>>>>>>>>>>> We also try to locate a primitiveSize based on passed 
>>>>>>>>>>>>>> deserializer.
>>>>>>>>>>>>>> If it is not there, then primitiveSize will be null. Which means
>>>>>>>>>>>>>> that each entry’s size was encoded individually.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> This looks much cleaner and more concise.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> What do you think?
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>> Daniyar Yeralin
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Jun 20, 2019, at 5:45 PM, Matthias J. Sax 
>>>>>>>>>>>>>> <matth...@confluent.io <mailto:matth...@confluent.io> 
>>>>>>>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io>>
>>>>>>>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io> 
>>>>>>>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io>>> 
>>>>>>>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io> 
>>>>>>>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io>>>> 
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> For encoding the list-type: I see John's point about re-encoding 
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>> list-type redundantly. However, I also don't like the idea that 
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>> Deserializer returns a fixed type...
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Maybe it's best allow users to specify the target list type on
>>>>>>>>>>>>>> deserialization via config?
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Similar for the primitive types: I don't think we need to encode 
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>> type size, but users could specify the type on the deserializer 
>>>>>>>>>>>>>> (via a
>>>>>>>>>>>>>> config again)?
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> About generics: nesting could be arbitrarily deep. Hence, I doubt
>>>>>>>>>>>>>> we can
>>>>>>>>>>>>>> support this and a cast will be necessary at some point in the 
>>>>>>>>>>>>>> user
>>>>>>>>>>>>>> code.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> -Matthias
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On 6/20/19 1:21 PM, John Roesler wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Hey Daniyar,
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Thanks for looking at it!
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Something like your screenshot is more along the lines of what I 
>>>>>>>>>>>>>> was
>>>>>>>>>>>>>> thinking. Sorry, but I didn't follow what you mean, how would 
>>>>>>>>>>>>>> that not
>>>>>>>>>>>>>> be "vanilla java"?
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Unfortunately the deserializer needs more information, though. 
>>>>>>>>>>>>>> For
>>>>>>>>>>>>>> example, what if the inner type is a Map<String,String>? The 
>>>>>>>>>>>>>> serde
>>>>>>>>>>>>>> could
>>>>>>>>>>>>>> only be used to produce a LinkedList<Map>, thus, we'd still need 
>>>>>>>>>>>>>> an
>>>>>>>>>>>>>> inner serde, like you have in the KIP (Serde<T> innerSerde).
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Something more like Serde<LinkedList<MyRecord>> = 
>>>>>>>>>>>>>> Serdes.listSerde(
>>>>>>>>>>>>>> /**list type**/ LinkedList.class,
>>>>>>>>>>>>>> /**inner serde**/ new MyRecordSerde()
>>>>>>>>>>>>>> )
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> And in configuration, it's something like:
>>>>>>>>>>>>>> default.key.serde: org...ListSerde
>>>>>>>>>>>>>> default.key.list.serde.type: java.util.LinkedList
>>>>>>>>>>>>>> default.key.list.serde.inner: com.mycompany.MyRecordSerde
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> What do you think?
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> -John
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Thu, Jun 20, 2019 at 2:46 PM Development <d...@yeralin.net 
>>>>>>>>>>>>>> <mailto:d...@yeralin.net> <mailto:d...@yeralin.net 
>>>>>>>>>>>>>> <mailto:d...@yeralin.net>>
>>>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
>>>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>> 
>>>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
>>>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>>
>>>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
>>>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>> 
>>>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
>>>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>>>> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Hey John,
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I gave read about TypeReference. It could work for the list 
>>>>>>>>>>>>>> serde.
>>>>>>>>>>>>>> However, it is not directly
>>>>>>>>>>>>>> supported:
>>>>>>>>>>>>>> https://github.com/FasterXML/jackson-databind/issues/1490 
>>>>>>>>>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490> 
>>>>>>>>>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490 
>>>>>>>>>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490>>
>>>>>>>>>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490 
>>>>>>>>>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490> 
>>>>>>>>>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490 
>>>>>>>>>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490>>>
>>>>>>>>>>>>>> The only way is to pass an actual class object into the 
>>>>>>>>>>>>>> constructor,
>>>>>>>>>>>>>> something like:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> It could be an option, but not a pretty one. What do you think 
>>>>>>>>>>>>>> of my
>>>>>>>>>>>>>> approach to use vanilla java and canonical class name? (As 
>>>>>>>>>>>>>> described
>>>>>>>>>>>>>> previously)
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>> Daniyar Yeralin
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Jun 20, 2019, at 2:45 PM, Development <d...@yeralin.net 
>>>>>>>>>>>>>> <mailto:d...@yeralin.net> <mailto:d...@yeralin.net 
>>>>>>>>>>>>>> <mailto:d...@yeralin.net>>
>>>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
>>>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>> 
>>>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
>>>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>>
>>>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
>>>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>> 
>>>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
>>>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>>>> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Hi John,
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Thank you for your input! Yes, my idea looks a little bit over
>>>>>>>>>>>>>> engineered :)
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I also wanted to see a feedback from Mathias as well since he 
>>>>>>>>>>>>>> gave
>>>>>>>>>>>>>> me an idea about storing fixed/variable size entries.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>> Daniyar Yeralin
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Jun 18, 2019, at 6:06 PM, John Roesler <j...@confluent.io 
>>>>>>>>>>>>>> <mailto:j...@confluent.io> <mailto:j...@confluent.io 
>>>>>>>>>>>>>> <mailto:j...@confluent.io>>
>>>>>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io> 
>>>>>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io>>> 
>>>>>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io> 
>>>>>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io>>>
>>>>>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io> 
>>>>>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io>> 
>>>>>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io> 
>>>>>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io>>>>> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Hi Daniyar,
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> That's a very clever solution!
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> One observation is that, now, this is what we might call a
>>>>>>>>>>>>>> polymorphic
>>>>>>>>>>>>>> serde. That is, you're detecting the actual concrete type and 
>>>>>>>>>>>>>> then
>>>>>>>>>>>>>> promising to produce the exact same concrete type on read.
>>>>>>>>>>>>>> There are
>>>>>>>>>>>>>> some inherent problems with this approach, which in general
>>>>>>>>>>>>>> require
>>>>>>>>>>>>>> some kind of  schema registry (not necessarily Schema
>>>>>>>>>>>>>> Registry, just
>>>>>>>>>>>>>> any registry for schemas) to solve.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Notice that every serialized record has quite a bit of duplicated
>>>>>>>>>>>>>> information: the concrete type as well as a byte to indicate
>>>>>>>>>>>>>> whether
>>>>>>>>>>>>>> the value type is a fixed size, and, if so, an integer to
>>>>>>>>>>>>>> indicate the
>>>>>>>>>>>>>> actual size. These constitute a schema, of sorts, because they
>>>>>>>>>>>>>> tell us
>>>>>>>>>>>>>> later how exactly to deserialize the data. Unfortunately, this
>>>>>>>>>>>>>> information is completely redundant. In all likelihood, the
>>>>>>>>>>>>>> information will be exactly the same for every record in the
>>>>>>>>>>>>>> topic.
>>>>>>>>>>>>>> This problem is essentially the core motivation for 
>>>>>>>>>>>>>> serializations
>>>>>>>>>>>>>> like Avro: to move the schema outside of the serialization
>>>>>>>>>>>>>> itself, so
>>>>>>>>>>>>>> that the records won't contain so much redundant information.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> In this light, I'm wondering if it makes sense to go back to
>>>>>>>>>>>>>> something
>>>>>>>>>>>>>> like what you had earlier in which you don't support perfectly
>>>>>>>>>>>>>> preserving the concrete type for _this_ serde, but instead just
>>>>>>>>>>>>>> support deserializing to _some_ List. Then, you could defer full,
>>>>>>>>>>>>>> perfect, type preservation to serdes that have an external
>>>>>>>>>>>>>> system in
>>>>>>>>>>>>>> which to register their type information.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> There does exist an alternative, if we really do want to
>>>>>>>>>>>>>> preserve the
>>>>>>>>>>>>>> concrete type (which does seem kind of nice). You can add a
>>>>>>>>>>>>>> configuration option specifically for the serde to configure
>>>>>>>>>>>>>> what the
>>>>>>>>>>>>>> list type will be, and maybe what the element type is, as well.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> As far as "related work" goes, you might be interested to take
>>>>>>>>>>>>>> a look
>>>>>>>>>>>>>> at how Jackson can be configured to deserialize into a specific,
>>>>>>>>>>>>>> arbitrarily nested, generically parameterized class structure.
>>>>>>>>>>>>>> Specifically, you might find
>>>>>>>>>>>>>> https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html
>>>>>>>>>>>>>>  
>>>>>>>>>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html>
>>>>>>>>>>>>>>  
>>>>>>>>>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html
>>>>>>>>>>>>>>  
>>>>>>>>>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html>>
>>>>>>>>>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html
>>>>>>>>>>>>>>  
>>>>>>>>>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html>
>>>>>>>>>>>>>>  
>>>>>>>>>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html
>>>>>>>>>>>>>>  
>>>>>>>>>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html>>>
>>>>>>>>>>>>>> interesting.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> -John
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Mon, Jun 17, 2019 at 12:38 PM Development <d...@yeralin.net 
>>>>>>>>>>>>>> <mailto:d...@yeralin.net> <mailto:d...@yeralin.net 
>>>>>>>>>>>>>> <mailto:d...@yeralin.net>>
>>>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
>>>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>> 
>>>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
>>>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>>
>>>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
>>>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>> 
>>>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
>>>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>>>> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> bump
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>> 
>>> 
>> 
> 

Reply via email to