[ https://issues.apache.org/jira/browse/FLINK-33058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17766109#comment-17766109 ]
Dale Lane commented on FLINK-33058:
-----------------------------------

hi [~rskraba] - thanks very much, a review would be much appreciated.

As for use cases, I should give some context. I work for IBM - we sell a Kafka distribution with a schema registry that comes with serdes clients offering both binary and JSON-encoded Avro support. As a part of this, I've worked with many customers who use and value JSON-encoding.

As you suggest, sometimes this is a temporary thing, related to the phase of a project - I've seen some customers who will use JSON-encoding during development, and when they're ready to go into test/prod phases they flip the switch to binary-encoding.

However, there have also been times where I've seen customers use JSON-encoding even in production - generally where the topic throughput is low enough that any performance issues are outweighed by the benefits of greater readability and compatibility that JSON-encoding offers.

Don't get me wrong, I don't dispute at all that binary-encoding is the more common choice, and comes with major network and disk usage improvements - so it makes sense that Flink would've started with that. But I would love to enable my customers to use Flink with their JSON-encoded Avro topics in the same way that they're able to use other tools, which is what prompted me to offer the pull request.

> Support for JSON-encoded Avro
> -----------------------------
>
> Key: FLINK-33058
> URL: https://issues.apache.org/jira/browse/FLINK-33058
> Project: Flink
> Issue Type: Improvement
> Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
> Reporter: Dale Lane
> Priority: Minor
> Labels: avro, flink, flink-formats, pull-request-available
>
> Avro supports two serialization encoding methods: binary and JSON
> cf. [https://avro.apache.org/docs/1.11.1/specification/#encodings]
> flink-avro currently has a hard-coded assumption that Avro data is
> binary-encoded (and cannot process Avro data that has been JSON-encoded).
> I propose adding a new optional format option to flink-avro: *avro.encoding*
> It will support two options: 'binary' and 'json'.
> If unset, it will default to 'binary' to maintain compatibility/consistency
> with current behaviour.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
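
For readers unfamiliar with the two encodings the issue description refers to, here is a minimal sketch of how the same record looks in each. It hand-rolls the encoding rules from the Avro specification for one hypothetical two-field record (a string `name` and a long `age`) using only the Python standard library. It is not flink-avro code and does not use the Avro library; the function names and the sample record are assumptions made for illustration.

```python
import json


def encode_long(n: int) -> bytes:
    """Avro binary encoding of a long: zigzag, then base-128 varint."""
    z = (n << 1) ^ (n >> 63)  # zigzag maps small negatives to small positives
    out = bytearray()
    while True:
        low7 = z & 0x7F
        z >>= 7
        if z:
            out.append(low7 | 0x80)  # more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def encode_string(s: str) -> bytes:
    """Avro binary encoding of a string: byte length as a long, then UTF-8."""
    data = s.encode("utf-8")
    return encode_long(len(data)) + data


def encode_record_binary(name: str, age: int) -> bytes:
    """A record is its fields concatenated in schema order, with no field names."""
    return encode_string(name) + encode_long(age)


def encode_record_json(name: str, age: int) -> str:
    """In the JSON encoding a record is a JSON object keyed by field name."""
    return json.dumps({"name": name, "age": age}, separators=(",", ":"))


# Hypothetical sample record, encoded both ways.
binary = encode_record_binary("Ada", 27)
as_json = encode_record_json("Ada", 27)

print(binary.hex())  # 0641646136
print(as_json)       # {"name":"Ada","age":27}
```

The binary form is 5 bytes and unreadable without the schema; the JSON form is 23 bytes but can be inspected with any text tool. That is the size versus readability trade-off discussed in the comment above.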