[ 
https://issues.apache.org/jira/browse/FLINK-6022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16245577#comment-16245577
 ] 

Stephan Ewen commented on FLINK-6022:
-------------------------------------

The Avro Serializer does not serialize the schema with each record. If the 
Avro Serializer is chosen, this issue is already fixed.

I wonder whether this happens when one explicitly uses an Avro "generic 
record" as the exchange data type. In my opinion, that is not a good idea in 
the first place. In that case, isn't it possible that each generic record is 
different, so that you always need a schema anyway?

> Don't serialise Schema when serialising Avro GenericRecord
> ----------------------------------------------------------
>
>                 Key: FLINK-6022
>                 URL: https://issues.apache.org/jira/browse/FLINK-6022
>             Project: Flink
>          Issue Type: Improvement
>          Components: Type Serialization System
>            Reporter: Robert Metzger
>            Assignee: Stephan Ewen
>             Fix For: 1.5.0
>
>
> Currently, Flink is serializing the schema for each Avro GenericRecord in the 
> stream.
> This leads to a lot of overhead over the wire/disk + high serialization costs.
> Therefore, I'm proposing to improve the support for GenericRecord in Flink by 
> shipping the schema to each serializer through the AvroTypeInformation.
> Then, we can only support GenericRecords with the same type per stream, but 
> the performance will be much better.
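
The trade-off described above can be illustrated with a minimal sketch. It 
is not Flink's or Avro's actual code: the schema layout, the field names, and 
the helper functions are all hypothetical, and only the Python standard 
library is used. It compares writing a schema header in front of every record 
with writing only the field values and handing the schema to the serializer 
once.

```python
import json
import struct

# Hypothetical schema (an assumption for illustration): an ordered list of
# (field name, struct format code) pairs. This is not an Avro schema.
SCHEMA = [("user_id", "q"), ("score", "d")]


def encode_fields(record, schema):
    """Pack only the field values, in schema order; no names or types."""
    fmt = "<" + "".join(code for _, code in schema)
    return struct.pack(fmt, *(record[name] for name, _ in schema))


def encode_with_schema(record, schema):
    """Per-record schema: a length-prefixed JSON schema precedes the values."""
    header = json.dumps(schema).encode("utf-8")
    return struct.pack("<I", len(header)) + header + encode_fields(record, schema)


def decode_fields(payload, schema):
    """Decode a schema-less payload; the reader must already know the schema."""
    fmt = "<" + "".join(code for _, code in schema)
    values = struct.unpack(fmt, payload)
    return {name: value for (name, _), value in zip(schema, values)}


records = [{"user_id": i, "score": i * 0.5} for i in range(1000)]

# Total bytes when every record carries its own copy of the schema.
per_record = sum(len(encode_with_schema(r, SCHEMA)) for r in records)

# Total bytes when the schema is shipped once, outside the record stream.
schema_once = sum(len(encode_fields(r, SCHEMA)) for r in records)

print("schema per record:", per_record, "bytes")
print("schema shipped once:", schema_once, "bytes")
```

The second variant is smaller, but `decode_fields` only works if every record 
in the stream has the same schema. That is the restriction the proposal 
accepts, and it is why a stream of arbitrarily different generic records would 
still need a schema per record.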



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
