[ 
https://issues.apache.org/jira/browse/FLINK-33058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17766109#comment-17766109
 ] 

Dale Lane commented on FLINK-33058:
-----------------------------------

Hi [~rskraba] - thanks very much, a review would be much appreciated. 

As for use cases, I should give some context. I work for IBM - we sell a Kafka 
distribution with a schema registry that comes with serdes clients offering 
both binary and JSON-encoded Avro support. As a part of this, I've worked with 
many customers who use and value JSON-encoding.

As you suggest, sometimes this is a temporary thing, related to the phase of a 
project - I've seen some customers who will use JSON-encoding during 
development, and when they're ready to go into test/prod phases they flip the 
switch to binary-encoding.

However, there have also been times where I've seen customers use JSON-encoding 
even in production - generally where the topic throughput is low enough that 
any performance issues are outweighed by the benefits of greater readability 
and compatibility that JSON-encoding offers. 

Don't get me wrong, I don't dispute at all that binary-encoding is the more 
common choice, and comes with major network and disk usage improvements - so it 
makes sense that Flink would've started with that. But I would love to enable 
my customers to use Flink with their JSON-encoded Avro topics in the same way 
that they're able to use other tools, which is what prompted me to offer the 
pull request. 

> Support for JSON-encoded Avro
> -----------------------------
>
>                 Key: FLINK-33058
>                 URL: https://issues.apache.org/jira/browse/FLINK-33058
>             Project: Flink
>          Issue Type: Improvement
>          Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>            Reporter: Dale Lane
>            Priority: Minor
>              Labels: avro, flink, flink-formats, pull-request-available
>
> Avro supports two serialization encodings: binary and JSON
> (cf. [https://avro.apache.org/docs/1.11.1/specification/#encodings]).
> flink-avro currently has a hard-coded assumption that Avro data is 
> binary-encoded (and cannot process Avro data that has been JSON-encoded).
> I propose adding a new optional format option to flink-avro: *avro.encoding*
> It will support two options: 'binary' and 'json'. 
> If unset, it will default to 'binary' to maintain compatibility/consistency 
> with current behaviour. 
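
To make the difference between the two encodings concrete, here is a minimal sketch that hand-encodes one record both ways, following the rules in the Avro specification linked above. It is not flink-avro code and does not use the Avro library. The two-field record (a string "name" and a long "age") and the helper names are hypothetical, chosen only for illustration.

```python
import json


def zigzag_varint(n: int) -> bytes:
    """Encode an integer as Avro's binary encoding does for int/long:
    zigzag-map it to an unsigned value, then emit 7 bits per byte."""
    z = (n << 1) ^ (n >> 63)  # zigzag: 0, -1, 1, -2 ... -> 0, 1, 2, 3 ...
    out = bytearray()
    while z > 0x7F:
        out.append((z & 0x7F) | 0x80)  # high bit set: more bytes follow
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_binary(name: str, age: int) -> bytes:
    """Binary encoding of the hypothetical record: fields are written in
    schema order with no field names. A string is its byte length (as a
    long) followed by its UTF-8 bytes."""
    raw = name.encode("utf-8")
    return zigzag_varint(len(raw)) + raw + zigzag_varint(age)


def encode_json(name: str, age: int) -> bytes:
    """JSON encoding of the same record: a JSON object keyed by field name."""
    return json.dumps({"name": name, "age": age}).encode("utf-8")


binary_bytes = encode_binary("Ada", 36)
json_bytes = encode_json("Ada", 36)
print(len(binary_bytes), binary_bytes)
print(len(json_bytes), json_bytes)
```

The binary form is far smaller but opaque without the schema, while the JSON form is larger but readable as-is. That is the trade-off described in the comment above, and the proposed *avro.encoding* option would let a table choose between them.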



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
