[ https://issues.apache.org/jira/browse/FLINK-6022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16245577#comment-16245577 ]
Stephan Ewen commented on FLINK-6022:
-------------------------------------

We are not serializing the schema in the Avro Serializer. If the Avro Serializer is chosen, this is fixed.

I am wondering whether the case here is that one explicitly uses a "generic record" from Avro as the exchange data type. In my opinion, that is not a good idea in the first place. In that case, isn't it possible that each generic record is different, so that you always need a schema anyway?

> Don't serialise Schema when serialising Avro GenericRecord
> ----------------------------------------------------------
>
>                 Key: FLINK-6022
>                 URL: https://issues.apache.org/jira/browse/FLINK-6022
>             Project: Flink
>          Issue Type: Improvement
>          Components: Type Serialization System
>            Reporter: Robert Metzger
>            Assignee: Stephan Ewen
>             Fix For: 1.5.0
>
>
> Currently, Flink is serializing the schema for each Avro GenericRecord in the
> stream.
> This leads to a lot of overhead over the wire/disk + high serialization costs.
> Therefore, I'm proposing to improve the support for GenericRecord in Flink by
> shipping the schema to each serializer through the AvroTypeInformation.
> Then, we can only support GenericRecords with the same type per stream, but
> the performance will be much better.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
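
For illustration, the trade-off described in the issue (writing the schema with every record versus shipping it once per stream) can be sketched roughly as follows. This is a toy model using only the Python standard library, not Flink's serializer and not Avro's binary encoding; the schema, the record layout, and all function names are hypothetical and exist only to show where the overhead comes from.

```python
import json
import struct

# Hypothetical schema, written as Avro-style JSON purely for illustration.
SCHEMA = json.dumps({
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "score", "type": "double"},
    ],
}).encode("utf-8")

# Hypothetical records: (id, score) pairs that all share the schema above.
RECORDS = [(i, i * 0.5) for i in range(1000)]


def encode_payload(record):
    """Encode only the field values (fixed-width toy encoding, not Avro binary)."""
    user_id, score = record
    return struct.pack(">qd", user_id, score)


def encode_with_schema(record):
    """Write the length-prefixed schema in front of every single record."""
    return struct.pack(">i", len(SCHEMA)) + SCHEMA + encode_payload(record)


def decode_with_shared_schema(data):
    """Decode a payload when both sides already agree on the schema."""
    return struct.unpack(">qd", data)


# Schema repeated for each record versus schema shipped once up front.
per_record_bytes = sum(len(encode_with_schema(r)) for r in RECORDS)
shared_schema_bytes = len(SCHEMA) + sum(len(encode_payload(r)) for r in RECORDS)

print("schema per record:  ", per_record_bytes, "bytes")
print("schema shipped once:", shared_schema_bytes, "bytes")
```

The second variant only works if every record in the stream has the same schema, which is the restriction the issue description accepts in exchange for better performance.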