EugeneYushin commented on issue #8187: [FLINK-12197] [Formats] Avro row deser for Confluent binary format URL: https://github.com/apache/flink/pull/8187#issuecomment-485708497 Hi @dawidwys. Thanks for a review. Please find my answers below. > Why do we need a whole duplicated `DeserializationSchema`? Can't we just wrap the original `AvroDeserializationSchema` and just introduce a wrapper that converts the output of the schema to `Row`? I've covered this in feature description briefly: > Use the same approach as for AvroRowDeserializationSchema, extending AbstractDeserializationSchema instead of AvroDeserializationSchema cause' the latter works with Generic/SpecificRecord abstractions from avro libs Here's a more detailed explanation. `AvroDeserializationSchema` doesn't provide any methods which can be re-used in case of `Row` passed as an input class: - [constructor](https://github.com/apache/flink/blob/master/flink-formats/flink-avro/src/main/java/org/apache/flink/formats/avro/AvroDeserializationSchema.java#L98) is intended to work with avro SpecificRecord/GenericRecord classes ``` * @param recordClazz class to which deserialize. Should be one of: * {@link org.apache.avro.specific.SpecificRecord}, * {@link org.apache.avro.generic.GenericRecord}. ``` - [deserialize()](https://github.com/apache/flink/blob/master/flink-formats/flink-avro/src/main/java/org/apache/flink/formats/avro/AvroDeserializationSchema.java#L126) method can't be used as a wrapper cause': -- this method is parametrized with return type, so parents' `AvroDeserializationSchema.deserialize` should return Row, while it's intended to return avro SpecificRecord/GenericRecord -- it sets reader schema only -- confluent avro binary format includes schema id in record, which should be used during decoding to set writer schema as well - [getProducedType()](https://github.com/apache/flink/blob/master/flink-formats/flink-avro/src/main/java/org/apache/flink/formats/avro/AvroDeserializationSchema.java#L165) deals only with avro SpecificRecord/GenericRecord classes - the same concerns I have if to choose `RegistryAvroDeserializationSchema` instead of `AvroDeserializationSchema` for extending Please, let me know if this makes sense to you.
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] With regards, Apache Git Services
