Thanks for the update!

I think that `ListDeserializer`, `ListSerializer`, and `ListSerde`
should have an default constructor and it should be possible to pass in
the `Class listClass` information via a configuration. Otherwise,
KafkaStreams cannot use it as default serde.


For the primitive serializers: `BytesSerializer` is not primitive IMHO,
as is it for `byte[]` with variable length -- it's for arrays, not for
single `byte` (note, that `Bytes` is a Kafka class wrapping `byte[]`).


For tests, we can comment on the PR. No need to do this in the KIP
discussion.


Can you also update the KIP?



-Matthias





On 6/21/19 11:29 AM, Development wrote:
> I made and pushed necessary commits, so we could review the final version 
> under PR https://github.com/apache/kafka/pull/6592
> 
> I also need some advice on writing tests for this new serde. So far I only 
> have two test cases (roundtrip and empty payload), I’m not sure if it is 
> enough.
> 
> Thank y’all for your help in this KIP :)
> 
> Best,
> Daniyar Yeralin
> 
> 
>> On Jun 21, 2019, at 1:44 PM, John Roesler <j...@confluent.io> wrote:
>>
>> Hey Daniyar,
>>
>> Looks good to me! Thanks for considering it.
>>
>> Thanks,
>> -John
>>
>> On Fri, Jun 21, 2019 at 9:04 AM Development <d...@yeralin.net 
>> <mailto:d...@yeralin.net>> wrote:
>> Hey John and Matthias,
>>
>> Yes, now I see it all. I’m storing lots of redundant information.
>> Here is my final idea. Yes, now a user should pass a list type. I realized 
>> that’s the type is not really needed in ListSerializer, but only in 
>> ListDeserializer:
>>
>>
>> In ListSerializer we will start storing sizes only if serializer is not a 
>> primitive serializer:
>>
>>
>> Then, in deserializer, we persist passed list type, so that during 
>> deserialization we could create an instance of it with predefined listSize 
>> for better performance.
>> We also try to locate a primitiveSize based on passed deserializer. If it is 
>> not there, then primitiveSize will be null. Which means that each entry’s 
>> size was encoded individually.
>>
>>
>> This looks much cleaner and more concise.
>>
>> What do you think?
>>
>> Best,
>> Daniyar Yeralin 
>>
>>> On Jun 20, 2019, at 5:45 PM, Matthias J. Sax <matth...@confluent.io 
>>> <mailto:matth...@confluent.io>> wrote:
>>>
>>> For encoding the list-type: I see John's point about re-encoding the
>>> list-type redundantly. However, I also don't like the idea that the
>>> Deserializer returns a fixed type...
>>>
>>> Maybe it's best allow users to specify the target list type on
>>> deserialization via config?
>>>
>>> Similar for the primitive types: I don't think we need to encode the
>>> type size, but users could specify the type on the deserializer (via a
>>> config again)?
>>>
>>>
>>> About generics: nesting could be arbitrarily deep. Hence, I doubt we can
>>> support this and a cast will be necessary at some point in the user code.
>>>
>>>
>>>
>>> -Matthias
>>>
>>>
>>>
>>> On 6/20/19 1:21 PM, John Roesler wrote:
>>>> Hey Daniyar,
>>>>
>>>> Thanks for looking at it!
>>>>
>>>> Something like your screenshot is more along the lines of what I was
>>>> thinking. Sorry, but I didn't follow what you mean, how would that not
>>>> be "vanilla java"?
>>>>
>>>> Unfortunately the deserializer needs more information, though. For
>>>> example, what if the inner type is a Map<String,String>? The serde could
>>>> only be used to produce a LinkedList<Map>, thus, we'd still need an
>>>> inner serde, like you have in the KIP (Serde<T> innerSerde).
>>>>
>>>> Something more like Serde<LinkedList<MyRecord>> = Serdes.listSerde(
>>>>   /**list type**/ LinkedList.class,
>>>>   /**inner serde**/ new MyRecordSerde()
>>>> )
>>>>
>>>> And in configuration, it's something like:
>>>> default.key.serde: org...ListSerde
>>>> default.key.list.serde.type: java.util.LinkedList
>>>> default.key.list.serde.inner: com.mycompany.MyRecordSerde
>>>>
>>>>
>>>> What do you think?
>>>> Thanks,
>>>> -John
>>>>
>>>> On Thu, Jun 20, 2019 at 2:46 PM Development <d...@yeralin.net 
>>>> <mailto:d...@yeralin.net>
>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>> wrote:
>>>>
>>>>    Hey John,
>>>>
>>>>    I gave read about TypeReference. It could work for the list serde.
>>>>    However, it is not directly
>>>>    supported: https://github.com/FasterXML/jackson-databind/issues/1490 
>>>> <https://github.com/FasterXML/jackson-databind/issues/1490>
>>>>    The only way is to pass an actual class object into the constructor,
>>>>    something like:
>>>>
>>>>    It could be an option, but not a pretty one. What do you think of my
>>>>    approach to use vanilla java and canonical class name? (As described
>>>>    previously)
>>>>
>>>>    Best,
>>>>    Daniyar Yeralin
>>>>
>>>>>    On Jun 20, 2019, at 2:45 PM, Development <d...@yeralin.net 
>>>>> <mailto:d...@yeralin.net>
>>>>>    <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>> wrote:
>>>>>
>>>>>    Hi John,
>>>>>
>>>>>    Thank you for your input! Yes, my idea looks a little bit over
>>>>>    engineered :)
>>>>>
>>>>>    I also wanted to see a feedback from Mathias as well since he gave
>>>>>    me an idea about storing fixed/variable size entries.
>>>>>
>>>>>    Best,
>>>>>    Daniyar Yeralin
>>>>>
>>>>>>    On Jun 18, 2019, at 6:06 PM, John Roesler <j...@confluent.io 
>>>>>> <mailto:j...@confluent.io>
>>>>>>    <mailto:j...@confluent.io <mailto:j...@confluent.io>>> wrote:
>>>>>>
>>>>>>    Hi Daniyar,
>>>>>>
>>>>>>    That's a very clever solution!
>>>>>>
>>>>>>    One observation is that, now, this is what we might call a
>>>>>>    polymorphic
>>>>>>    serde. That is, you're detecting the actual concrete type and then
>>>>>>    promising to produce the exact same concrete type on read. There are
>>>>>>    some inherent problems with this approach, which in general require
>>>>>>    some kind of  schema registry (not necessarily Schema Registry, just
>>>>>>    any registry for schemas) to solve.
>>>>>>
>>>>>>    Notice that every serialized record has quite a bit of duplicated
>>>>>>    information: the concrete type as well as a byte to indicate whether
>>>>>>    the value type is a fixed size, and, if so, an integer to
>>>>>>    indicate the
>>>>>>    actual size. These constitute a schema, of sorts, because they
>>>>>>    tell us
>>>>>>    later how exactly to deserialize the data. Unfortunately, this
>>>>>>    information is completely redundant. In all likelihood, the
>>>>>>    information will be exactly the same for every record in the topic.
>>>>>>    This problem is essentially the core motivation for serializations
>>>>>>    like Avro: to move the schema outside of the serialization itself, so
>>>>>>    that the records won't contain so much redundant information.
>>>>>>
>>>>>>    In this light, I'm wondering if it makes sense to go back to
>>>>>>    something
>>>>>>    like what you had earlier in which you don't support perfectly
>>>>>>    preserving the concrete type for _this_ serde, but instead just
>>>>>>    support deserializing to _some_ List. Then, you could defer full,
>>>>>>    perfect, type preservation to serdes that have an external system in
>>>>>>    which to register their type information.
>>>>>>
>>>>>>    There does exist an alternative, if we really do want to preserve the
>>>>>>    concrete type (which does seem kind of nice). You can add a
>>>>>>    configuration option specifically for the serde to configure what the
>>>>>>    list type will be, and maybe what the element type is, as well.
>>>>>>
>>>>>>    As far as "related work" goes, you might be interested to take a look
>>>>>>    at how Jackson can be configured to deserialize into a specific,
>>>>>>    arbitrarily nested, generically parameterized class structure.
>>>>>>    Specifically, you might find
>>>>>>    
>>>>>> https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html
>>>>>>  
>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html>
>>>>>>    interesting.
>>>>>>
>>>>>>    Thanks,
>>>>>>    -John
>>>>>>
>>>>>>    On Mon, Jun 17, 2019 at 12:38 PM Development <d...@yeralin.net 
>>>>>> <mailto:d...@yeralin.net>
>>>>>>    <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>> wrote:
>>>>>>>
>>>>>>>    bump
>>
> 
> 

Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to