mukul-8 commented on PR #27806: URL: https://github.com/apache/flink/pull/27806#issuecomment-4161866251
One thing to consider with this upgrade: Avro 1.12.1 enables `fastReaderEnabled=true` by default in `GenericDatumReader`/`SpecificDatumReader`. The fast reader builds an optimized execution plan for deserialization, and that plan is rebuilt whenever `setSchema()` is called.

In Flink, `AvroDeserializationSchema.deserialize()` calls `datumReader.setSchema(readerSchema)` on every record. With the fast reader enabled, the plan is therefore rebuilt per record, which could negate the performance benefit and potentially cause a regression compared to 1.11.x. This is discussed in detail in #27499 ([FLINK-39005](https://issues.apache.org/jira/browse/FLINK-39005)).

It would be worth studying the impact before merging: either benchmark the current setSchema-per-record path with `fastReaderEnabled=true`, or explicitly set `fastReaderEnabled=false` until the redundant `setSchema()` call is addressed.
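
To make the concern concrete, here is a minimal, library-free sketch of the pattern. It does not use Avro or Flink: `PlanReader` is a hypothetical stand-in for a datum reader, and the rule that every `setSchema()` call discards the cached plan is an assumption modeled on the behavior described above, not a statement about Avro's actual implementation. It only counts how often a plan would be built under each calling pattern.

```java
import java.util.Objects;

public class SetSchemaSketch {

    /** Hypothetical stand-in for a datum reader that derives a plan from its schema. */
    static final class PlanReader {
        private String schema;
        private String plan;   // null until built, or after being invalidated
        int planBuilds = 0;    // how many times the (expensive) plan was built

        void setSchema(String newSchema) {
            this.schema = newSchema;
            this.plan = null;  // assumption: any setSchema() call discards the plan
        }

        String read(String record) {
            if (plan == null) {            // lazily (re)build the plan when missing
                plan = "plan(" + schema + ")";
                planBuilds++;
            }
            return plan + ":" + record;
        }
    }

    /** Mirrors the setSchema-per-record path: the schema is re-set before each read. */
    static int readAllResettingEveryRecord(String schema, String[] records) {
        PlanReader reader = new PlanReader();
        for (String record : records) {
            reader.setSchema(schema);
            reader.read(record);
        }
        return reader.planBuilds;
    }

    /** Guards the call so the schema is only set when it actually changes. */
    static int readAllSettingOnce(String schema, String[] records) {
        PlanReader reader = new PlanReader();
        String current = null;
        for (String record : records) {
            if (!Objects.equals(current, schema)) {
                reader.setSchema(schema);
                current = schema;
            }
            reader.read(record);
        }
        return reader.planBuilds;
    }

    public static void main(String[] args) {
        String[] records = {"a", "b", "c"};
        System.out.println(readAllResettingEveryRecord("user-v1", records)); // prints 3
        System.out.println(readAllSettingOnce("user-v1", records));          // prints 1
    }
}
```

Under these assumptions the unguarded path builds one plan per record, while the guarded path builds a single plan. Whether the real cost is significant is exactly what a benchmark of the actual Avro reader would need to show.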
